I started a fellowship at Hult Prize, where I will learn to evaluate startups and their impact from a data science perspective.
I worked on improving the SportsVector web platform by first mapping the full application structure to understand how components interacted. I fixed several front-end and back-end issues, cleaned and refactored React components for better reusability, and improved user flow for smoother navigation. I redesigned the Supabase database schema for team and player management by adding proper relationships, enums, and constraints to ensure reliable and consistent data. I then integrated these changes into the codebase and tested migrations to keep the production system stable. I primarily worked with Supabase, React, TypeScript, and FastAPI.
I researched molecular sequence generation models that could produce compounds with specific chemical properties. I conducted a literature review of state-of-the-art molecular generation methods based on transformers, graph neural networks, and reinforcement learning. I developed and trained an LSTM model on the ZINC and MOSES datasets, refactored transformer codebases to follow best practices, and implemented reinforcement learning techniques such as policy gradients to optimize molecular properties like the quantitative estimate of drug-likeness (QED).
I worked on classifying student behavior patterns from unlabeled learning management system (LMS) data under the supervision of Dr. Aymen Omri. I engineered features from raw educational interaction data, applied clustering and dimensionality reduction to uncover engagement trends, and conducted a systematic review of existing models to identify research gaps and propose new directions for improving student classification.
I assisted professors in programming, discrete structures, and database management labs, helping students understand course material and debug their code. I also led a peer tutoring program, coordinating a team of eight tutors and managing review sessions for over 50 students across multiple subjects.
I helped students with their writing assignments, guiding them through the writing process and helping them improve their grammar, ideas, research, and APA formatting. I also assisted with presentations, focusing on PowerPoint skills, audience awareness, and style, and ensured students followed APA guidelines for tables and figures. I regularly updated instructors on student progress and shared any concerns or recommendations.
I processed and cleaned data using Python libraries, automating tasks to ensure consistent and accurate input for machine learning models. I then trained a BERT model on medical guidelines and developed a retrieval-augmented generation (RAG) pipeline that achieved a higher similarity score than OpenAI. Additionally, I modified the pipeline to extract JSON and convert it into Excel format for more efficient information extraction.