NAWALDEEP SINGH

Data Scientist

LinkedIn | GitHub

About

Data Scientist with 1.5+ years of experience in ML modeling, SQL analytics, dashboarding, and MLOps in fast-paced, cross-functional environments. Improved ETA accuracy by 72%, deployed inference pipelines with 75-90ms latency, and built central reporting dashboards tracking 5+ KPIs to accelerate decision-making.

Work Experience

Data Scientist

BluSmart Mobility

Dec 2025 - Present

  • Built a direct ETA prediction model using XGBoost and S2 cells with feature store; reduced RMSE from 18 to 5 mins and cut driver penalty errors by 12% in production, while maintaining model documentation and audit logs.
  • Simulated, validated, and deployed the ETA model using XGBoost, gRPC APIs, and server-side S3 loading; achieved 45-55% better accuracy than OSRM and 30% over legacy estimates, while reducing latency from 500ms to 75-90ms.
  • Collaborated with business and product teams to translate data findings into operational strategy; delivered insights to non-technical stakeholders for rollout and prioritization decisions.
  • Led a routing model PoC by mapping GPS pings to road segments; planned its integration into the ETA engine to replace the Google ODRD API, with projected savings of ~INR 8 Cr/year, targeting cost-efficiency and real-time decisioning.
  • Built interactive dashboards in SQL, Redash, and Databricks to track 5+ KPIs on supply-demand balance, delays, and cancellations; reduced weekly reporting effort by 50% and enabled real-time ops interventions.
  • Designed home-charging validation logic with geo-fencing, driver-SOC mappings, and payout rules; developed custom ETL scripts to ingest and process raw charging data, keeping variance in monthly partner settlements under 5%.
  • Took the initiative to conduct a literature review on battery analytics using LSTM, BERT, and GAN models, aiming to forecast degradation trends ahead of the summer fleet ramp-up.

Research Assistant

LNM Institute of Technology (Under PhD Mentorship)

Dec 2025 - Present

  • Applied LSTM, GRU, and ARIMA models to forecast Apple stock trends; achieved high predictive accuracy (LSTM: 2.93 RMSE, outperforming ARIMA by 84%), demonstrating capabilities in forecasting and time-series analysis.
  • Analyzed the 15% performance drop from 2-layer to 3-layer LSTM/GRU models; observed that the 1-layer LSTM overpredicted while the GRU underpredicted, motivating a hybrid architecture with dropout to improve generalization.
  • Worked with the research team to integrate FinBERT sentiment analysis into a GAN-based prediction model; reduced RMSE to 1.827, a 38% improvement over traditional architectures.

Education

Bachelor of Engineering in Computer Engineering

Trinity College Dublin

Upper Second-Class Honours | CGPA: 9

Projects

Multi-Source RAG Agent

Implemented a multi-source, NLP-driven retrieval system using LangChain, FAISS, and LLMs; improved factual accuracy by 22-26% as measured by RAGAS metrics (faithfulness, relevance, similarity). Built an interactive RAG-powered app with Streamlit and multimodal tools, enabling business teams to retrieve structured insights from unstructured PDFs; used MLflow for reproducibility ahead of production deployment.

Skills

Languages

  • Python
  • SQL
  • Bash
  • BigQuery
  • C/C++

Libraries

  • Pandas
  • NumPy
  • Scikit-learn
  • XGBoost
  • PySpark
  • TensorFlow
  • Matplotlib
  • Seaborn

Analytics

  • A/B Testing
  • Exploratory Data Analysis (EDA)
  • Forecasting
  • Root Cause Analysis
  • Reporting Automation

BI

  • AWS (S3, QuickSight)
  • Google Cloud (BigQuery)
  • MongoDB
  • Power BI
  • Tableau
  • Looker
  • Redash
  • Excel
  • MySQL

MLOps

  • MLflow
  • Docker
  • Git
  • CI/CD
  • Model Deployment
  • Scheduled Pipelines

Tools

  • Databricks
  • Jupyter
  • VS Code
  • Linux
  • Airflow
  • Notebooks
  • Slack integration