Data-driven professional with over 5 years of experience in data science, big data engineering, and research-focused analytics across academic and industrial settings. Proven expertise in process monitoring, machine learning, and statistical modelling, with a strong foundation in data architecture and analytical tools. Currently serving as a Postdoctoral Researcher at the Technological University of the Shannon, delivering industry-focused AI and data analytics solutions while leading cross-functional collaborations and teaching engagements. Experienced in working with pharmaceutical industry partners in alignment with cGMP principles, ensuring data integrity and validation support for regulated environments.
• Built and optimized scalable big data pipelines using Hadoop, Spark, Hive, and Kafka for enterprise data ecosystems.
• Designed ETL workflows and supported structured/unstructured data lakes, improving retrieval and reporting performance.
• Implemented automation tools, improving data ingestion and reducing manual workload by 40%.
• Collaborated with business stakeholders to define architecture and KPIs for real-time analytics dashboards.
• Built a predictive model that increased production efficiency for Eli Lilly.
• Optimized ETL processes that enhanced data quality and consistency.
AI-Driven Production Stage Sensing for Smart Predictive Maintenance Analytics (Forthcoming 2024)
Co-authored a paper currently under review, focusing on stage-wise process classification using deep learning models for predictive maintenance in smart manufacturing environments.
• Banking Data Migration: Successfully migrated data pipelines for a top financial institution, significantly improving data retrieval speed and reducing storage redundancy.
• ETL Optimization: Enhanced data quality and consistency by streamlining ETL pipelines, leading to improved analytics reliability across operational dashboards.
Hadoop, Apache Spark, Hive, Kafka, SQL, NoSQL
Regression Models, Time-Series Forecasting, Hypothesis Testing
Anomaly Detection, Predictive Modeling, Neural Networks
Data Preprocessing, Transformation, Pipeline Development
Python, R
Power BI, Plotly Dash, Apache Superset
Stakeholder Communication, Presenting Actionable Insight