

Professional with expertise in Azure technologies, including Azure Data Factory, Data Lake Storage Gen2, and Databricks. Demonstrated skills in data analytics and visualization using Power BI and Tableau, with additional experience in Snowflake and AWS S3. Proficient in version control and CI/CD practices with GitHub, GitLab, and Bitbucket, alongside Docker containerization and API development. Experienced in Agile/Scrum project management with JIRA and Confluence, enhancing team collaboration through SharePoint and Teams.
Tech Stack: Azure Data Factory (ADF) | Azure Data Lake Storage Gen2 (ADLS) | Azure Databricks | Azure SQL Database | Postgres | Azure Synapse
Worked on a couple of projects for the Azure Data Engineering Trainer; the company and data are confidential. The projects replicate a real-time project and were built using dummy data.
Tech Stack: Azure Data Factory (ADF) | Azure Data Lake Storage Gen2 (ADLS) | Azure Databricks | Azure SQL Database | SSMS | Power BI
Data Source (Kaggle): Adventure Works / Amazon Prime & Netflix / Mutual Funds
Architecture
Project Overview
1. Data Ingestion (ADF)
· Ingested transactional and customer data from a Dockerized PostgreSQL source using ADF. Designed scalable pipelines to support data loads.
2. Data Lake Design (ADLS Gen2)
· Implemented a Bronze–Silver–Gold architecture for raw, refined, and aggregated data. Standardized data storage using Parquet format.
3. Data Transformation (Azure Databricks)
· Performed data cleansing, enrichment, and normalization using PySpark. Calculated business KPIs.
4. Data Warehousing (Azure SQL DB & Snowflake)
· Published Gold-layer KPI datasets to Azure SQL Database and Snowflake. Enabled downstream analytics and reporting use cases.
ETL Processing Project
Tech Stack: Azure Data Factory (ADF) | Azure Data Lake Storage Gen2 (ADLS) | On-Prem SQL Server | PostgreSQL
Data Source (Kaggle): Customer Churn Analytics / COVID-19 Analytics / E-Commerce Sales Analytics
· Designed and implemented end-to-end ETL pipelines using ADF. Configured Self-Hosted Integration Runtime (SHIR) to securely integrate on-prem SQL Server and PostgreSQL.
· Implemented full and incremental data loads and ingested data into ADLS Gen2 using optimized Parquet/CSV formats.
· Organized data using Raw, Curated, and Presentation layers to support scalable analytics. Performed data transformations and data quality validations using ADF Mapping Data Flows.
· Scheduled, monitored, and optimized pipelines using ADF triggers, logging, alerting, and performance tuning.
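Illustration of full vs. incremental loads: the sketch below shows one common way to separate the two, a high-watermark comparison on a modification timestamp. It is a plain-Python illustration of the idea only, not the pipeline built in the project. The function name incremental_load, the updated_at column, and the sample rows are all hypothetical. In ADF the same logic is usually expressed with a stored watermark value, a Lookup activity, and a filtered Copy activity.

```python
from datetime import datetime


def incremental_load(source_rows, last_watermark, watermark_column="updated_at"):
    """Return rows modified after last_watermark, plus the new watermark.

    Hypothetical helper: 'updated_at' is an assumed column name.
    """
    changed = [r for r in source_rows if r[watermark_column] > last_watermark]
    # If nothing changed, keep the previous watermark unchanged.
    new_watermark = max(
        (r[watermark_column] for r in changed), default=last_watermark
    )
    return changed, new_watermark


# Dummy source table (illustrative data only).
rows = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 5)},
    {"id": 3, "updated_at": datetime(2024, 1, 9)},
]

# Full load: start from the smallest possible watermark, so every row qualifies.
full, watermark = incremental_load(rows, datetime.min)

# A new row arrives in the source; the next run picks up only that row.
rows.append({"id": 4, "updated_at": datetime(2024, 1, 12)})
delta, watermark_after = incremental_load(rows, watermark)
```

A full load is then just the special case where the stored watermark is the minimum value, which lets one code path serve both load types.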