Kaaviya Karunanidhi

Dublin

Summary

Experienced and dedicated professional with over 4 years of expertise in data engineering, analytics, and machine learning. Demonstrated ability to develop on-premise and cloud-based data warehouses, conduct data analysis, design ETL pipelines, and manipulate large-scale data. Proficient in multiple cloud platforms, passionate about utilizing data-driven insights to deliver impactful business solutions.

Key Skills - Python, SQL, R, ETL, Azure Data Factory, Azure Databricks, Snowflake, and Power BI.

Overview

6 years of professional experience
1 Certification

Work History

Data Engineer

GeoDirectory AnPost
10.2022 - Current

Project 1:

  • Built various scalable data pipelines and orchestrated in Azure Data Factory to copy data from Oracle to Data Lake and then to Azure Synapse Data Warehouse.
  • Designed and implemented an ETL pipeline in Azure Data Factory, using PySpark to transform source data into dimension and fact tables and load them into a PostgreSQL database.
  • Collaborated with Ergo and Openapp to manage virtual machines for API-driven products, achieving a 65% reduction in setup costs.
  • Worked closely with stakeholders: presented data designs, promptly addressed inquiries, and implemented improvement requests.


Project 2:

  • Implemented new products for GeoDirectory, which included raw data collection, cleaning, and transformation into structured data using Python (Pandas, NumPy, Shapely) and advanced SQL queries, contributing to a 45% increase in revenue.
  • Analyzed 30 million address aliases using Python libraries such as Fuzzywuzzy and GeoPy, along with NLP techniques in PySpark, to improve geospatial accuracy through data cleaning, geocoding, and coordinate transformation, achieving 95% accuracy in cleaning incorrect addresses.
  • Predicted property prices using machine learning models (LSTM, XGBoost, Decision Tree, Linear, Lasso) with an RMSE of 2.31, improving prediction accuracy by 70% through hyperparameter tuning with Bayesian methods.
  • Developed and implemented interactive Power BI dashboards and reports to visualize the status of residential and commercial buildings, utilizing census data. Conducted in-depth analyses of social cluster categories, identifying trends and preferences to provide customers with actionable insights for improving lifestyle and urban planning.
  • Led a team of seven to deliver a high-profile address matching project for the Department of Transport, generating approximately €50,000 in profit while ensuring client satisfaction and project success.


Project 3:

  • Spearheaded quarterly-cut address data updates into the Oracle database using complex SQL and PL/SQL queries.
  • Optimized the Oracle database by creating indexes and designing procedures, functions, tables, materialized views, CTEs, and sequences.
  • Wrote and fine-tuned multiple functions in Oracle and Python to speed up extraction from various source systems and archive the data for future use.

Data Engineer

Infosys Limited
10.2019 - 07.2021

Project 1:

  • Worked with a banking client to migrate large-scale financial data marts from an on-premise Oracle data warehouse to a cloud-based Snowflake environment.
  • Architected and built ELT workflows on financial datasets using HDFS and Hive, leveraging Hadoop MapReduce techniques for parallel data processing, which reduced data loading time by 35% compared to traditional methods.
  • Engineered recurrent ETL pipelines in Azure Data Factory for seamless historical and incremental data integration from Oracle source systems to Azure Blobs.
  • Knowledge of version control and Azure DevOps CI/CD pipelines for deploying Azure Data Factory pipelines to different environments.
  • Awarded for designing a comprehensive training roadmap for 20 new joiners, leading to a 30% reduction in onboarding time while improving skill acquisition efficiency.

Project 2:

  • Prepared patching prioritization reports, threat analysis reports, and dashboards for financial data risk assessment using Power Query, DAX measures, and parameters in Power BI, reducing team effort by 40%.
  • Implemented predictive modeling and data-driven rules with Python scripts, deriving actionable insights from financial reports, resulting in a 20% reduction in fraudulent risks.
  • Used Power BI’s drill-through filters and custom visualizations to improve fraud detection and enhance decision-making.
  • Collaborated cross-functionally to enhance productivity, improve internal processes, and propose innovative solutions to the team.

Project 3:

  • Developed statistical models in R for Z-score-based anomaly detection on financial data.
  • Used Tableau for exploratory data analysis, elevating reporting transparency from 20% to over 70%.

Data Engineer Trainee

Infosys Limited
05.2019 - 10.2019
  • Utilized JIRA to track project tasks, data engineering issues, and bugs throughout the SDLC
  • Wrote and executed JUnit tests for data transformation logic and ETL pipeline components. Integrated these tests into the CI/CD pipeline to ensure data quality and transformation accuracy, reducing manual testing efforts by 30%.
  • Learned and applied Big Data technologies and Informatica for ETL workflows through case studies.
  • Trained and certified in project management methods (Agile, Waterfall).


Education

Master in Data Science

Trinity College Dublin
01.2022

Bachelor in Electronics & Instrumentation

Kongu Engineering College
01.2019

Skills

  • Technical Skills & Tools: Python, SQL, Power BI, MS Azure, Hadoop, Hive, Tableau, MS Excel, scikit-learn, TensorFlow, Keras, NLP, Snowflake, JUnit testing, Unix shell scripting, Selenium, Informatica, Qlik, Java, C, C
  • Analytical & ML Skills: Business Analysis, Statistical Data Analysis, Natural Language Processing, Artificial Intelligence
  • Soft Skills: Strategic Thinking, Problem Solving, Presentation Skills, Storytelling
  • Language Proficiency: English: Fluent, Tamil: Native Speaker, Hindi: Intermediate

Certification

IBM Data Scientist Professional Certificate, ISTQB Foundation, NLP using Python, KPMG & Accenture Data Analytics Virtual internship, Infosys Agile Developer.

Accomplishments

  • NCC 'A' Certificate Achievement: Awarded the NCC 'A' Certificate for exceptional dedication and top performance in Junior Division training, excelling in drills, fitness tests, and field craft skills while demonstrating leadership, discipline, and teamwork. (Sep 2013)
  • STUDENT AMBASSADOR: Led COVID-19 VACCINE4 ALL campaign on behalf of global health class. (Oct 2021)
  • DISSERTATION – Trinity College Dublin: Achieved Distinction in Final Year Dissertation on the topic 'DNN-based Adaptive Cruise Control model for safe and efficient car-following using CARLA simulation', incorporating Artificial Intelligence (Double Q method) with 70% efficiency. (Sep 2022)

Timeline

Data Engineer

GeoDirectory AnPost
10.2022 - Current

Data Engineer

Infosys Limited
10.2019 - 07.2021

Data Engineer Trainee

Infosys Limited
05.2019 - 10.2019

Bachelor in Electronics & Instrumentation

Kongu Engineering College

Master in Data Science

Trinity College Dublin