Hello, I'm

Ankit Singh

Data Scientist with a PhD in Physics. I build ML models, data pipelines, and deployed apps, drawing on a background in statistical modeling and terabyte-scale scientific computing.

// About Me

I’m a Research Fellow at the University of Nottingham with a PhD in Physics and 10+ years working with terabyte-scale simulation data, statistical modeling, and HPC. Recent projects include a dual-model classifier for 151K UK road accidents, a deep RL agent for traffic signal control, RAG chatbots, and quantitative finance tools.

I’m an ISO/IEC 42001 certified practitioner (AI Management Systems) and AI+ Foundation certified, with additional applied data science credentials from WorldQuant University. I have 12 peer-reviewed publications with 100+ citations.

// Skills & Tools

Programming & Data

Python, SQL, C, Cython, Bash, Streamlit, Gradio, Flask, Git, Linux, HPC (MPI/Slurm), Jupyter, Streamlit Cloud, GitHub Pages

Machine Learning & AI

PyTorch, Scikit-learn, LightGBM, Deep RL, CNNs, GANs, YOLOv8, ARIMA, GARCH, Prophet, SMOTE/ADASYN, K-Means, PCA, OpenCV

GenAI & NLP

RAG Pipelines, FAISS, ChromaDB, Sentence-Transformers, OpenAI API, Gemini API, Ollama, LLM Engineering

Astrophysics & Science

Galaxy Evolution, Cosmological Simulations, SED Fitting, Monte Carlo Methods, Radiative Transfer, Large-scale Structure, AGN

// Featured Projects

UK Accident Severity Classification

Dual-strategy ML system trained on 151K UK road accidents: an emergency response model achieving 92.4% recall on severe accidents, and a traffic management model with 81% macro recall, using SMOTE+Tomek and ADASYN resampling.

LightGBM, Scikit-learn, SMOTE
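
For readers unfamiliar with the two metrics quoted above, the sketch below shows how per-class recall and macro recall are computed. It is an illustration only, not the project's code: the function names `per_class_recall` and `macro_recall` and the toy labels are assumptions made up for this example.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return {label: recall} for every label that appears in y_true."""
    totals = Counter(y_true)  # how many true samples each class has
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct predictions per class
    return {label: hits[label] / totals[label] for label in totals}


def macro_recall(y_true, y_pred):
    """Unweighted mean of the per-class recalls, so rare classes count equally."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


# Toy labels invented for illustration; these are not the accident data.
y_true = ["slight", "slight", "slight", "slight", "severe", "severe"]
y_pred = ["slight", "slight", "slight", "severe", "severe", "severe"]

recalls = per_class_recall(y_true, y_pred)
print(recalls)
print(macro_recall(y_true, y_pred))
```

Recall on the severe class rewards catching every serious accident, even at the cost of false alarms, while macro recall averages over all classes so the majority class cannot dominate the score.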

UK Visa RAG Chatbot

RAG chatbot using FAISS/ChromaDB vector stores and sentence-transformers to answer UK immigration questions from GOV.UK data.

RAG, FAISS, Streamlit, NLP

RL Traffic Signal Control

Deep RL agent trained on real Transport for London API traffic flow data to learn adaptive signal timing policies for a single intersection, replacing fixed-cycle control.

Deep RL, PyTorch, API

Mutual Fund Analyzer

Deployed Streamlit app using Morningstar data for portfolio analysis: sectoral distribution, Sharpe ratio, and company exposure.

Streamlit, Altair, Scikit-learn

// Recent Publications