Hello, I'm

Ankit Singh

Data Scientist with a PhD in Physics — applying machine learning, statistical modeling, and large-scale data pipelines to both industry and research problems.

// About Me

I build machine learning models, data pipelines, and deployed applications across domains: classifying 151K road accidents, optimizing traffic signals with reinforcement learning, building RAG chatbots, and developing quantitative finance toolkits. My PhD in Physics and 10+ years of research gave me deep experience with terabyte-scale datasets, statistical modeling, and HPC, skills that transfer directly to data science at scale.

Currently a Research Fellow at the University of Nottingham, I also hold two applied data science certifications from WorldQuant University (ML/CV and Applied DS) and have authored 12 peer-reviewed publications with 100+ citations.

// Skills & Tools

Programming & Data

Python, SQL, C, Cython, Bash, Streamlit, Gradio, Flask, Git, Linux, HPC (MPI/Slurm), Jupyter, Streamlit Cloud, GitHub Pages

Machine Learning & AI

PyTorch, Scikit-learn, LightGBM, Deep RL, CNNs, GANs, YOLOv8, ARIMA, GARCH, Prophet, SMOTE/ADASYN, K-Means, PCA, OpenCV

GenAI & NLP

RAG Pipelines, FAISS, ChromaDB, Sentence-Transformers, OpenAI API, Gemini API, Ollama, LLM Engineering

Astrophysics & Science

Galaxy Evolution, Cosmological Simulations, SED Fitting, Monte Carlo Methods, Radiative Transfer, Large-scale Structure, AGN

// Featured Projects

UK Accident Severity Classification

Dual-strategy ML system trained on 151K UK road accidents: an emergency-response model achieving 92.4% recall on severe accidents and a traffic-management model with 81% macro recall, using SMOTE+Tomek and ADASYN resampling.

LightGBM, Scikit-learn, SMOTE

UK Visa RAG Chatbot

RAG chatbot using FAISS/ChromaDB vector stores and sentence-transformers to answer UK immigration questions from GOV.UK data.

RAG, FAISS, Streamlit, NLP

RL Traffic Signal Control

Deep RL agent trained on real Transport for London API traffic flow data to learn adaptive signal timing policies for a single intersection, replacing fixed-cycle control.

Deep RL, PyTorch, API

Mutual Fund Analyzer

Deployed Streamlit app that uses Morningstar data for portfolio analysis: sector distribution, Sharpe ratio, and company exposure.

Streamlit, Altair, Scikit-learn

// Recent Publications