Intern Project: LLMs for Unit Testing
Jun. 2025 - Aug. 2025
- Performed a literature search on approaches to using LLMs to generate unit tests
- Developed an experimentation framework using Python and AWS Bedrock LLMs to explore how to best prompt LLMs to generate high coverage unit tests for C++ methods
- Conducted six experiments to determine the most effective types and amounts of context (e.g., relevant constructors, methods) needed for LLMs to write accurate unit tests that compile
- Built a Retrieval Augmented Generation (RAG) system that automatically retrieves and appends context (e.g., header file code chunks) to LLM prompts
- Wrote a report summarizing methods, experiment results, and recommendations for future work
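The context-retrieval idea from this project can be sketched in a few lines. This is a minimal illustration, not the actual framework: the chunking rule (split on blank lines) and the overlap-based relevance score are simplifying assumptions, and the real system used AWS Bedrock LLMs rather than returning a prompt string.

```python
import re

def chunk_header(header_src: str) -> list[str]:
    """Split a C++ header into rough chunks at blank lines (simplified)."""
    return [c.strip() for c in header_src.split("\n\n") if c.strip()]

def relevance(chunk: str, method_src: str) -> int:
    """Score a chunk by how many identifiers it shares with the method under test."""
    method_ids = set(re.findall(r"[A-Za-z_]\w*", method_src))
    chunk_ids = set(re.findall(r"[A-Za-z_]\w*", chunk))
    return len(chunk_ids & method_ids)

def build_prompt(method_src: str, header_src: str, k: int = 2) -> str:
    """Retrieve the k most relevant header chunks and append them as prompt context."""
    chunks = sorted(chunk_header(header_src),
                    key=lambda c: relevance(c, method_src), reverse=True)
    context = "\n\n".join(chunks[:k])
    return f"Context:\n{context}\n\nWrite unit tests for this C++ method:\n{method_src}"
```

A production RAG system would typically replace the identifier-overlap score with embedding similarity, but the retrieve-then-append structure is the same.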
Predicting Proficiency
Aug. 2024 - Dec. 2024
- Fall AI Studio Project through Break Through Tech with Level Data
- Collaborated in a team of 3 to predict student proficiency and identify the factors that contribute most to it
- Leveraged Python’s scikit-learn, Pandas, NumPy, Matplotlib, and Seaborn libraries to perform exploratory data analysis, create visualizations, and develop ML models
- Trained linear regression, decision tree, random forest, and gradient-boosted decision tree models to predict proficiency, achieving F1 scores over 0.75
- Results aim to help school administrators support students at risk of falling below proficiency
- Learn more and find our code in our GitHub Repository
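The modeling workflow above follows the standard scikit-learn pattern. The sketch below uses synthetic data as a stand-in, since the real student features are not public; only the train/fit/score structure reflects the project.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the student dataset (real features are private).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # 1 = proficient (toy label rule)

# Hold out a test split, fit a random forest, and evaluate with F1.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
score = f1_score(y_te, model.predict(X_te))
print(f"F1: {score:.2f}")
```

The same loop applies to the other model families (swap in `LinearRegression`, `DecisionTreeClassifier`, or `GradientBoostingClassifier`), with feature importances from the tree models indicating which factors matter most.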
WiDS Datathon 2025
Jan. 2025 - Apr. 2025
- Project context: ADHD in females can be difficult to detect, and evidence suggests many cases go undiagnosed
- Led a project team of 3 to build a multi-output ridge classifier model that predicts sex and ADHD diagnosis from fMRI data
- Managed project timelines by scheduling meetings and assigning responsibilities for each sprint
- Focused on the exploratory data analysis and data preprocessing phases; used KNN imputation to predict and fill null values, laying the foundation for a ridge classifier model that achieved an F1 score of 0.74
- Our team placed in the top 20% of 1,075 teams
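The preprocessing-plus-modeling pipeline described above maps directly onto scikit-learn components. This is a minimal sketch on random stand-in data (the real fMRI features and label distributions are assumptions here), showing KNN imputation feeding a multi-output ridge classifier that predicts both targets at once.

```python
import numpy as np
from sklearn.impute import KNNImputer
from sklearn.linear_model import RidgeClassifier
from sklearn.multioutput import MultiOutputClassifier

# Random stand-in for fMRI-derived features, with ~10% values missing.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
X[rng.random(X.shape) < 0.1] = np.nan

# Two binary targets per subject: sex and ADHD diagnosis (toy labels).
y = np.column_stack([
    (rng.random(200) > 0.5).astype(int),
    (rng.random(200) > 0.5).astype(int),
])

# KNN imputation fills each missing value from the 5 nearest subjects,
# then one RidgeClassifier is fit per target via MultiOutputClassifier.
X_filled = KNNImputer(n_neighbors=5).fit_transform(X)
clf = MultiOutputClassifier(RidgeClassifier()).fit(X_filled, y)
preds = clf.predict(X_filled)  # shape (n_subjects, 2): one column per target
```

In practice the imputer should be fit on training data only and applied to held-out data, so imputed values do not leak test-set information.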