Job Description
Roles & Responsibilities
- 6-8+ years of Data Engineering experience with demonstrable production pipeline ownership
- Expert-level Python and PySpark for large-scale data transformation
- Deep experience with Azure Databricks, Delta Lake, and Unity Catalog
- Hands-on Oracle DB integration experience: JDBC drivers, REST APIs, Oracle GoldenGate or equivalent CDC tooling
- Proficiency with Azure Data Factory, Azure Event Hubs, and Azure Blob / ADLS Gen2
- Strong SQL skills including complex window functions, CTEs, and performance optimisation
- Experience building and publishing data APIs (FastAPI or similar) for downstream consumption
- Knowledge of data modelling patterns: medallion architecture, Kimball dimensional modelling, Data Vault
- Infrastructure-as-Code familiarity: Terraform or Bicep for Azure resource provisioning
- Experience with CI/CD for data pipelines via Azure DevOps or GitHub Actions
Desired Candidate Profile
- Experience with dbt for data transformation and documentation
- Exposure to vector data stores and embedding pipelines for AI applications
- Knowledge of stream processing with Spark Structured Streaming or Flink
- Familiarity with the MERN stack (MongoDB, Express, React, Node.js) to aid integration with application-layer APIs
- Azure certifications (DP-203: Data Engineering on Microsoft Azure; DP-300: Administering Microsoft Azure SQL Solutions)