About
I’m a data/ML engineer focused on reliable ETL, scalable analytics platforms, and explainable ML systems. I enjoy turning messy, multi‑source data into decision‑ready products — from streaming ingestion and orchestration to model deployment and BI.
Open to roles across Data Engineering, Analytics Engineering, and ML Engineering.
Experience
Backend Developer Intern (Full‑time) — Inficloud · Virginia, USA (Jan 2024 – Jan 2025)
- Built & maintained AWS‑based ETL pipelines processing 500GB+ monthly with robust data validation.
- Supported Tableau/Power BI dashboards for faster trend identification and accurate reporting.
- Built RESTful API integrations for live streams; documented endpoints and validated outputs.
- Wrote automations and production‑grade Python scripts, saving 200+ engineer hours annually.
Programmer — Aryagami Cloud Services · Hyderabad, India (Jul 2021 – Jul 2022)
- Co‑designed/operated backend for a SaaS analytics platform; optimized MySQL for reliability and speed.
- Containerized services with Docker and simplified CI/CD.
- Improved API latency by ~25% (400ms → 300ms) and added structured logging/monitoring.
Backend Developer Intern — Aryagami Cloud Services · Hyderabad, India (Jun 2020 – Jul 2021)
- Supported RESTful APIs that contributed to 30% faster data retrieval for analytics dashboards.
- Automated ETL transformations; reduced data prep time by 20%.
- Deployed backend services on AWS/Docker; improved scalability and uptime.
Projects
Credit Card Fraud Detection
End‑to‑end fraud system with PySpark + AWS Glue; 92% recall, 90% F1. SHAP explainability, Airflow ETL, Power BI KPIs.
PySpark · Databricks · AWS Glue · Airflow · Power BI
Collision Analysis — Manhattan
Analyzed 20M+ NYC collisions; peak risk Thu–Fri, 14:00–16:00. Tableau dashboards, predictive hotspots.
Python · Pandas · Tableau · Geo
Data Platform Starter
Cookie‑cutter stack for ingestion → warehousing → BI. Airflow/dbt on Postgres + Metabase. IaC scaffolding included.
Airflow · dbt · Postgres · Metabase · Docker
Air Quality Dashboard
Kafka → Spark Structured Streaming → S3 + Athena; CDC and schema evolution handled.
Kafka · Spark · S3 · Athena
Skills
Data & ML
Python, PySpark, Pandas, scikit‑learn, SHAP, XGBoost, SQL, dbt
Platforms & Infra
AWS (Glue, Athena, S3, Lambda, ECS), Databricks, Docker, Airflow
Analytics
Power BI, Tableau, Metabase; metrics design, KPI dashboards
Ops
CI/CD, testing, observability, documentation
Resume
Prefer a quick download? Open the PDF.
Contact
Want to discuss a role or project? Email is best. Phone intentionally omitted.