Data Engineer  ·  Kathmandu, Nepal

Pragati Kumar Chaudhary

Recent CS graduate building scalable ETL/ELT pipelines, real-time streaming architectures, and data warehouses with Apache Spark, Airflow, Kafka & Snowflake.

100K+ records per batch
5+ workflows automated
4 data projects
6 certifications

About

Education
  • Bachelor of Computer Science, IIMS College, Kathmandu (2022 – 2026)
  • +2 Science, United Academy, Kathmandu (2016 – 2018)
  • SEE, Nepal Police Higher Secondary School, Sanga (2014 – 2016)

I'm a recent Computer Science graduate and entry-level Data Engineer based in Lalitpur, Nepal. I specialize in building robust, scalable ETL/ELT pipelines that move raw data from source to analytics-ready environments.

My hands-on work spans the modern data stack — from ingesting data from AWS S3 into Snowflake via bulk COPY INTO, to orchestrating complex workflows in Apache Airflow, to processing live event streams with PySpark and Kafka.

I'm proficient in Apache Spark (PySpark), Airflow, Kafka, and Snowflake, with experience in dimensional modeling, SCD Type 2, and Medallion Architecture (Bronze / Silver / Gold layers).

I care deeply about data quality and pipeline reliability. I containerize everything with Docker, version-control rigorously with Git, and design systems that are both maintainable and scalable. Currently seeking opportunities to grow on a team that values clean architecture and thoughtful engineering.

Skills

Languages
  • Python
  • SQL
  • Java (Basic)
Big Data & Orchestration
  • Apache Spark (PySpark)
  • Apache Kafka
  • Apache Airflow
  • ETL / ELT Pipelines
  • Pandas
Cloud & Databases
  • Snowflake
  • AWS — S3, EC2, EMR, IAM
  • PostgreSQL
  • MongoDB
Data Modeling
  • Star Schema
  • SCD Type 2
  • Bulk Loading (COPY INTO)
  • Medallion Architecture
  • Dimensional Modeling
Tools & DevOps
  • Docker
  • Git / GitHub
  • Streamlit
  • Dash
Soft Skills
  • Analytical Thinking
  • Problem Solving
  • Attention to Detail
  • Team Collaboration
  • Technical Communication

Experience

Mar 2025 – Jul 2025
Techkraft Inc Pvt. Ltd.
Chakupat, Lalitpur
Data Engineer Bootcamp Trainee
  • Built ETL pipelines using Python and PySpark to process 100K+ records per batch run with improved data consistency.
  • Automated 5+ data workflows using Apache Airflow DAGs, reducing manual effort significantly.
  • Implemented data cleaning and transformation workflows using Pandas, improving analytics readiness.
  • Containerized data processing tasks using Docker for consistent execution environments across dev and production.
May 2022 – Dec 2023
Vianet Communications
Jwalakhel, Lalitpur
Technical Support Representative
  • Troubleshot and resolved network connectivity and router configuration issues for end customers.
  • Analyzed customer-reported problems related to latency, service outages, and IPTV buffering.

Projects

Python · Snowflake · Airflow · Docker · AWS S3
COVID-19 ETL Pipeline / Synthetic Healthcare Insights
End-to-end ETL pipeline ingesting 50K+ healthcare records from AWS S3 into Snowflake via COPY INTO. Designed layered staging, ODS, and warehouse schemas in the spirit of Medallion Architecture (Bronze / Silver / Gold). Automated with Airflow DAGs and fully containerized with Docker.
Python · SQL · Snowflake · Airflow
Walmart Retail Data Warehouse
Star Schema data warehouse for sales and inventory OLAP analytics in Snowflake. Implemented SCD Type 2 for historical dimension tracking, automated ETL pipelines with Python and Airflow, and prepared analytics-ready datasets for downstream BI reporting.
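To illustrate the SCD Type 2 idea mentioned above, here is a minimal sketch in plain Python. It is not the project's actual code: the `DimRow` fields, the `apply_scd2` helper, and the sample store data are all hypothetical, and a real warehouse would typically do this in SQL (for example with a MERGE statement) rather than in memory.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class DimRow:
    """One version of a (hypothetical) store dimension record."""
    store_id: int              # business (natural) key
    region: str                # attribute whose history is tracked
    valid_from: date
    valid_to: Optional[date]   # None means the row is still open
    is_current: bool


def apply_scd2(dim: list[DimRow], store_id: int, region: str,
               effective: date) -> list[DimRow]:
    """Return a new dimension list with one SCD Type 2 change applied."""
    current = next(
        (r for r in dim if r.store_id == store_id and r.is_current), None
    )
    # No change in the tracked attribute: keep the history as it is.
    if current is not None and current.region == region:
        return list(dim)

    result = []
    for row in dim:
        if row is current:
            # Close the old version instead of overwriting it.
            result.append(replace(row, valid_to=effective, is_current=False))
        else:
            result.append(row)
    # Open a new current version that starts at the effective date.
    result.append(DimRow(store_id, region, effective, None, True))
    return result


dim = [DimRow(1, "East", date(2024, 1, 1), None, True)]
dim = apply_scd2(dim, 1, "West", date(2024, 6, 1))
for row in dim:
    print(row)
```

The key design point is that an attribute change never updates a row in place: the old version is closed with an end date and a new version is appended, so historical facts can still join to the dimension values that were valid at the time.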
PySpark · Kafka · Dash · Python
Real-Time Weather Analytics Dashboard
Real-time streaming pipeline using Apache Kafka for weather data ingestion and Spark Streaming for live data processing. Built an interactive Dash dashboard to visualize live weather metrics and monitor data quality in real time.
Python · SBERT · Streamlit
AI Resume Screener
AI-powered resume screening tool using Sentence-BERT for semantic similarity matching. Goes beyond keyword matching to rank resumes by contextual relevance to job descriptions. Interactive Streamlit UI for recruiters and HR teams.
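The ranking step described above can be sketched as cosine similarity between embedding vectors. This is an illustration only, not the project's code: the vectors below are small hand-written stand-ins for the embeddings a Sentence-BERT model would produce, and `cosine`, `rank_resumes`, and the resume names are hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_resumes(job_vec: list[float],
                 resume_vecs: dict[str, list[float]]) -> list[tuple[str, float]]:
    """Score every resume against the job vector, highest score first."""
    scored = [(name, cosine(job_vec, vec)) for name, vec in resume_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors (assumption); real sentence embeddings
# have hundreds of dimensions.
job = [0.9, 0.1, 0.3]
resumes = {
    "resume_a": [0.8, 0.2, 0.4],
    "resume_b": [0.1, 0.9, 0.2],
    "resume_c": [0.5, 0.5, 0.5],
}

ranking = rank_resumes(job, resumes)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because the score depends on the direction of the vectors rather than on shared words, two texts that phrase the same skills differently can still rank close together, which is the sense in which this approach goes beyond keyword matching.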

Certifications

  • Techkraft Inc: Data Engineering Bootcamp
  • HL7 International: HL7® FHIR® Fundamentals Course
  • HackerRank: SQL (Basic)
  • HackerRank: SQL (Intermediate)
  • Databricks: Academy Accreditation, Databricks Fundamentals
  • Amazon Web Services: AWS Cloud Quest: Cloud Practitioner 2024

Get in Touch

I'm actively looking for data engineering opportunities. Whether you have a role, a project, or just want to talk data — I'd love to hear from you.