Intern Project: LLMs for Unit Testing
Jun. 2025 - Aug. 2025
- Performed a literature search on approaches to using LLMs to generate unit tests
- Developed an experimentation framework using Python and AWS Bedrock LLMs to explore how to best prompt LLMs to generate high coverage unit tests for C++ methods
- Conducted six experiments to determine the most effective types and amounts of context (e.g., relevant constructors, methods) needed for LLMs to write accurate unit tests that compile
- Built a Retrieval Augmented Generation (RAG) system that automatically retrieves and appends context (e.g., header file code chunks) to LLM prompts
- Wrote a report summarizing methods, experiment results, and recommendations for future work
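The context-retrieval idea from this project can be sketched in a few lines. This is a minimal illustration, not the actual framework: the chunking rule (split on blank lines) and the overlap-based relevance score are simplifying assumptions, and the real system used AWS Bedrock LLMs rather than returning a prompt string.

```python
import re

def chunk_header(header_src: str) -> list[str]:
    """Split a C++ header into rough chunks at blank lines (simplified)."""
    return [c.strip() for c in header_src.split("\n\n") if c.strip()]

def relevance(chunk: str, method_src: str) -> int:
    """Score a chunk by how many identifiers it shares with the method under test."""
    method_ids = set(re.findall(r"[A-Za-z_]\w*", method_src))
    chunk_ids = set(re.findall(r"[A-Za-z_]\w*", chunk))
    return len(chunk_ids & method_ids)

def build_prompt(method_src: str, header_src: str, k: int = 2) -> str:
    """Retrieve the k most relevant header chunks and append them as prompt context."""
    chunks = sorted(chunk_header(header_src),
                    key=lambda c: relevance(c, method_src), reverse=True)
    context = "\n\n".join(chunks[:k])
    return f"Context:\n{context}\n\nWrite unit tests for this C++ method:\n{method_src}"
```

A production RAG system would typically replace the identifier-overlap score with embedding similarity, but the retrieve-then-append structure is the same.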
Predicting Proficiency
Aug. 2024 - Dec. 2024
- Fall AI Studio Project through Break Through Tech with Level Data
- Collaborated in a team of 3 to predict student proficiency and identify the factors that contribute most to it
- Leveraged Python’s scikit-learn, Pandas, NumPy, Matplotlib, and Seaborn libraries to perform exploratory data analysis, create visualizations, and develop ML models
- Trained linear regression, decision tree, random forest, and gradient-boosted decision tree models to predict proficiency, achieving F1 scores over 0.75
- Results aim to help school administrators support students at risk of falling below proficiency
- Learn more and find our code in our GitHub Repository
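The modeling workflow above follows the standard scikit-learn pattern. The sketch below uses synthetic data as a stand-in, since the real student features are not public; only the train/fit/score structure reflects the project.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the student dataset (real features are private).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # 1 = proficient (toy label rule)

# Hold out a test split, fit a random forest, and evaluate with F1.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
score = f1_score(y_te, model.predict(X_te))
print(f"F1: {score:.2f}")
```

The same loop applies to the other model families (swap in `LinearRegression`, `DecisionTreeClassifier`, or `GradientBoostingClassifier`), with feature importances from the tree models indicating which factors matter most.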
WiDS Datathon 2025
Jan. 2025 - Apr. 2025
- Project context: ADHD in females can be difficult to detect, and evidence suggests many cases go undiagnosed
- Led a project team of 3 to build a multi-output ridge classifier model that predicts sex and ADHD diagnosis from fMRI data
- Managed project timelines by scheduling meetings and assigning responsibilities for each sprint
- Focused on the exploratory data analysis and data preprocessing phases; used KNN imputation to predict and fill null values, laying the foundation for a ridge classifier model that achieved an F1 score of 0.74
- Our team placed in the top 20% of 1,075 teams
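The preprocessing-plus-modeling pipeline described above maps directly onto scikit-learn components. This is a minimal sketch on random stand-in data (the real fMRI features and label distributions are assumptions here), showing KNN imputation feeding a multi-output ridge classifier that predicts both targets at once.

```python
import numpy as np
from sklearn.impute import KNNImputer
from sklearn.linear_model import RidgeClassifier
from sklearn.multioutput import MultiOutputClassifier

# Random stand-in for fMRI-derived features, with ~10% values missing.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
X[rng.random(X.shape) < 0.1] = np.nan

# Two binary targets per subject: sex and ADHD diagnosis (toy labels).
y = np.column_stack([
    (rng.random(200) > 0.5).astype(int),
    (rng.random(200) > 0.5).astype(int),
])

# KNN imputation fills each missing value from the 5 nearest subjects,
# then one RidgeClassifier is fit per target via MultiOutputClassifier.
X_filled = KNNImputer(n_neighbors=5).fit_transform(X)
clf = MultiOutputClassifier(RidgeClassifier()).fit(X_filled, y)
preds = clf.predict(X_filled)  # shape (n_subjects, 2): one column per target
```

In practice the imputer should be fit on training data only and applied to held-out data, so imputed values do not leak test-set information.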