My Projects

A collection of my documented AI/ML projects

Academic Chatbot using Graph RAG

Duke ProfMatch is an AI-powered academic discovery tool that leverages Graph Retrieval-Augmented Generation (RAG) to help Duke students find professors aligned with their research interests. It integrates entity extraction, knowledge graph construction, and vector search using Neo4j to deliver intelligent faculty recommendations. The platform features an interactive graph-based UI, enhancing research exploration and engagement.

Generative AI, LLMs, Graph RAG, Neo4j, API

Interpretable X-Ray Classification

Interpretable X-Ray Classification leverages the Neural Prototype Tree (ProtoTree) to enhance model transparency in chest X-ray diagnostics. Unlike traditional CNN-based models that act as black boxes, ProtoTree integrates decision tree-based interpretability into its deep learning pipeline, enabling clinicians to understand why a model reaches a diagnosis.

PyTorch, CNN, Soft Decision Trees, CUDA

Graph Convolutional Neural Nets for Structured Documents

When extracting information from documents, traditional object detection and NLP approaches fail to develop a semantic understanding of them. This project provides code to convert structured documents to graphs using Optical Character Recognition (OCR), along with a Graph Convolutional Network (GCN) implementation in TensorFlow. Read more on Towards Data Science (a Medium publication).

Graph ConvNets, TensorFlow, Optical Character Recognition, AWS
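
To illustrate the idea behind this project, here is a minimal sketch of a single graph-convolution step over a document graph. It assumes each OCR text box is a node and that neighbouring boxes are linked by undirected edges. It uses NumPy rather than the project's TensorFlow code, and the function names `normalized_adjacency` and `gcn_layer` are hypothetical. The propagation rule is the standard symmetric-normalised form, ReLU(D^-1/2 (A + I) D^-1/2 H W).

```python
import numpy as np


def normalized_adjacency(edges, n_nodes):
    """Build D^-1/2 (A + I) D^-1/2 from an undirected edge list."""
    adjacency = np.zeros((n_nodes, n_nodes))
    for i, j in edges:
        adjacency[i, j] = 1.0
        adjacency[j, i] = 1.0
    # Self-loops let every node keep its own features during aggregation.
    a_hat = adjacency + np.eye(n_nodes)
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_norm, features, weights):
    """One graph-convolution layer: aggregate neighbours, project, apply ReLU."""
    return np.maximum(a_norm @ features @ weights, 0.0)


# Hypothetical example: four OCR boxes laid out in a chain, 3 features per box.
edges = [(0, 1), (1, 2), (2, 3)]
a_norm = normalized_adjacency(edges, n_nodes=4)
rng = np.random.default_rng(0)
node_features = rng.normal(size=(4, 3))
layer_weights = rng.normal(size=(3, 2))
node_embeddings = gcn_layer(a_norm, node_features, layer_weights)
print(node_embeddings.shape)
```

In a real pipeline the node features would come from the OCR output (for example text embeddings and box coordinates), and the weights would be learned rather than sampled at random.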

MS Capstone: Text to Image Generative AI Safety with Meta

This study analyzes public datasets for text-to-image (T2I) model safety, identifying gaps in harm coverage, bias, and ethical risks to improve dataset selection and model robustness. I reduced non-compliant output by 15% for Meta's Emu diffusion models by curating 120K adversarial prompts and training content moderation classifiers. Additionally, I managed a large-scale crowdsourcing study on multilingual T2I safety, designing a full-stack platform to test 8 state-of-the-art models for cultural biases across 10 languages.

Gen AI Safety, Crowdsourcing, Red Teaming

Interpretable Churn Prediction

Interpretable Churn Prediction applies interpretable machine learning techniques to predict customer churn while providing actionable insights for decision-makers. Unlike traditional black-box models, this project uses Globally Optimized Sparse Decision Trees (GOSDT), Generalized Additive Models (GAMs), Sparse Generalized Linear Models (L0Learn), and Explainable Boosting Machines (EBMs) to balance accuracy and transparency.

Interpretable Machine Learning, Globally Optimized Sparse Decision Trees, Generalized Additive Models, Explainable Boosting Machines

Bayesian Multi-Armed Bandits

This project implements Bayesian bandit testing in Python. It is an adaptive alternative to traditional A/B testing that reallocates traffic in real time using Thompson Sampling. The project compares classic hypothesis testing, Bayesian A/B testing, and Bayesian Multi-Armed Bandits, with visualizations of traffic allocation, convergence, and regret. Simulations over 300-day campaigns showed up to a 20% gain in effective conversion rate and 3x faster adaptation to shifting user preferences, demonstrating the advantages of bandits in dynamic, data-driven environments.

Bayesian Statistics, A/B Testing, Multi-Armed Bandits
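
The traffic reallocation described above can be sketched with a small Beta-Bernoulli Thompson Sampling loop. This is an illustrative sketch, not the project's code: the function name `thompson_sampling`, the uniform Beta(1, 1) priors, and the example conversion rates are all assumptions.

```python
import random


def thompson_sampling(true_rates, n_rounds, seed=0):
    """Simulate Beta-Bernoulli Thompson Sampling over variants with fixed conversion rates."""
    rng = random.Random(seed)
    n_arms = len(true_rates)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms
    for _ in range(n_rounds):
        # Draw one plausible conversion rate per arm from its Beta posterior
        # (a Beta(1, 1) prior updated with the observed successes and failures).
        samples = [
            rng.betavariate(1 + successes[arm], 1 + failures[arm])
            for arm in range(n_arms)
        ]
        # Route this visitor to the arm whose sampled rate is highest.
        chosen = max(range(n_arms), key=samples.__getitem__)
        pulls[chosen] += 1
        # Simulate whether the visitor converts, then update that arm's counts.
        if rng.random() < true_rates[chosen]:
            successes[chosen] += 1
        else:
            failures[chosen] += 1
    return pulls, successes, failures


# Hypothetical campaign: three variants with different true conversion rates.
pulls, successes, failures = thompson_sampling([0.05, 0.10, 0.30], n_rounds=5000)
print("traffic per variant:", pulls)
```

Because each arm is chosen in proportion to how likely it is to be the best, traffic shifts toward the stronger variant as evidence accumulates, while weaker variants still receive occasional exploratory visits.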