About Me

I am an aspiring data scientist who enjoys connecting the dots, whether between ideas from different disciplines, people on different teams, or applications in different industries. I have strong technical skills and an academic background in computing, statistics, and machine learning. I am currently pursuing my Master's in Data Science at UC San Diego, where I pair coursework with hands-on experience building machine learning and computer vision solutions.
Education

Master's in Data Science

B.S. in Cognitive Science
Relevant coursework: Data Structures and OO Design, Supervised Machine Learning, AI Algorithms, Neural Networks and Deep Learning, Signal Processing
Experience
Computer Vision Researcher
mlPC Lab, UC San Diego • San Diego • May 2025 - Present
- Researching diffusion models, specifically exploring their applications in generative modeling, under the supervision of Professor Zhuowen Tu
Software/Data Engineering Intern
Ivanti • San Jose • June 2024 - August 2024
- Developed a GenAI-driven automation tester for Neurons AI (C#, JSON, Excel, SQL)
- Built an NLP-based Dynamic Expression Generator using Azure GPT-4, RAG, and ReAct
Instructional Assistant
Cognitive and Data Science Dept, UCSD • San Diego • March 2024 - Present
- Tutored 900+ students in Python and data science under Prof. Voytek and Prof. Ellis
- Led discussions, graded coursework, and held office hours
VP of Finance
CSForEach Club, UCSD • San Diego • September 2022 - June 2024
- Managed club budget and sponsor relationships
- Supported K–12 outreach in CS, data science, and AI
Software Engineering Intern
HCLTech • Remote • June 2023 - September 2023
- Worked with the product engineering team to analyze security and anomaly datasets
- Performed EDA and data visualization to extract insights
Projects
AutoBots: RL-Based Autonomous Driving
Reinforcement Learning, CARLA, DonkeySim • January 2025 - Present
- Trained self-driving agents using Reinforcement Learning (RL) on two platforms: CARLA (urban navigation) and DonkeySim (track-based circuits)
- Used LiDAR as the primary sensor for robust, light-invariant spatial perception; processed point cloud into 18D state vectors combining sector-wise distances and waypoint metadata
- Implemented Proximal Policy Optimization (PPO) and Actor-Critic methods for continuous action control (steering, throttle, brake), using custom neural networks for actor-critic separation
- Designed and iteratively refined reward functions to optimize navigation performance, emphasizing collision avoidance, lane following, waypoint tracking, and speed regulation
- Evaluated results using cumulative reward, episode length, distance traveled, and collision rate
- Explored implications for industrial logistics, real-world deployment, and ethical considerations such as safety, bias, transparency, and sim-to-real transfer
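The LiDAR state encoding described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the split of the 18 dimensions into 16 angular sectors plus 2 waypoint features (distance and heading error), the 50 m range cap, and the function name `lidar_state` are all assumptions made for the example.

```python
import numpy as np

N_SECTORS = 16    # assumption: 16 sectors + 2 waypoint features = 18 dimensions
MAX_RANGE = 50.0  # assumption: clip distances at 50 m before normalizing


def lidar_state(points_xy, waypoint_dist, heading_err,
                n_sectors=N_SECTORS, max_range=MAX_RANGE):
    """Build a state vector from 2D LiDAR points given in the vehicle frame.

    Each sector stores the normalized distance to its nearest point
    (1.0 when the sector is empty); the last two entries are the
    normalized waypoint distance and heading error.
    """
    pts = np.asarray(points_xy, dtype=float).reshape(-1, 2)
    dists = np.minimum(np.hypot(pts[:, 0], pts[:, 1]), max_range)
    angles = np.arctan2(pts[:, 1], pts[:, 0])  # range [-pi, pi]
    sector = ((angles + np.pi) / (2 * np.pi) * n_sectors).astype(int) % n_sectors

    nearest = np.full(n_sectors, max_range)
    np.minimum.at(nearest, sector, dists)  # per-sector minimum distance

    waypoint = [min(waypoint_dist, max_range) / max_range, heading_err / np.pi]
    return np.concatenate([nearest / max_range, waypoint])


# Two obstacle points, a waypoint 10 m ahead, and a 90-degree heading error.
state = lidar_state([(1.0, 0.1), (-0.1, 2.0)], waypoint_dist=10.0,
                    heading_err=np.pi / 2)
```

Keeping every entry in a bounded range is a common choice for PPO-style policies, since it avoids one large-magnitude feature dominating the network input.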
Comparative Analysis of Modern Object Detection Algorithms
Computer Vision, Deep Learning • March 2025
- Compared R-CNN, YOLO-World, and Grounding DINO for object detection on a curated dataset
- Evaluated bounding box accuracy using Intersection over Union (IoU), precision, recall, and F1 score
- Constructed a Robustness Evaluation Dataset to assess model generalization with complex scenes, small/occluded objects, symbolic text, and abstract prompts
- Found that YOLO-World showed strong recall but was highly sensitive to confidence thresholds and prompt phrasing, while Grounding DINO achieved the highest IoU and F1 across conditions and excelled in open-vocabulary tasks
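For reference, the evaluation metrics named above can be computed as in the short sketch below. It assumes boxes are given as `(x1, y1, x2, y2)` corner coordinates and that true/false positive counts have already been obtained by matching predictions to ground truth at some IoU threshold; the function names are illustrative only.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap width and height, clamped at zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 from detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))      # intersection 1, union 7
scores = precision_recall_f1(tp=8, fp=2, fn=2)
```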
sEMG Gesture-Controlled Toy Car
Signal Processing, Machine Learning • March 2024 - Present
- Developed a wearable system that uses surface electromyography (sEMG) to control a toy car with hand gestures, with no remote or screen interface required
- Recorded muscle activity via a MindRove sEMG sensor and processed signals in real time through a five-stage pipeline: acquisition, filtering, feature extraction, classification, and command mapping
- Sampled 4 channels at 500 Hz, then applied a 4th-order Butterworth bandpass filter (10–200 Hz) and a 60 Hz notch filter to remove noise while preserving critical signal features
- Extracted time- and frequency-domain features (e.g., MAV, FFT) and classified gestures using K-Nearest Neighbors (KNN), achieving ~98% accuracy on three gestures (rest, clench, Spiderman)
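The filtering and feature-extraction stages above could look roughly like the sketch below, written with SciPy. The sampling rate, filter order, passband, and notch frequency come from the description; the notch quality factor `Q=30`, the causal (real-time friendly) filtering, and the function names are assumptions for illustration. Note that SciPy's `butter(4, ..., btype="bandpass")` designs the filter from a 4th-order prototype.

```python
import numpy as np
from scipy.signal import butter, iirnotch, lfilter, sosfilt

FS = 500.0  # sampling rate in Hz, per the project description


def clean_emg(raw, fs=FS):
    """Band-pass (10-200 Hz) then notch (60 Hz) a (n_samples, n_channels) array."""
    sos = butter(4, [10.0, 200.0], btype="bandpass", fs=fs, output="sos")
    band = sosfilt(sos, raw, axis=0)
    b, a = iirnotch(60.0, Q=30.0, fs=fs)  # Q=30 is an assumed notch width
    return lfilter(b, a, band, axis=0)


def mav_features(window):
    """Mean absolute value (MAV) per channel for one analysis window."""
    return np.mean(np.abs(window), axis=0)


# Synthetic check: pure 60 Hz power-line hum on 4 channels should be removed.
t = np.arange(0, 4.0, 1.0 / FS)
hum = np.tile(np.sin(2 * np.pi * 60.0 * t)[:, None], (1, 4))
filtered = clean_emg(hum)
features = mav_features(filtered[-100:])  # one 200 ms window, 4 values
```

In a full pipeline, feature vectors like `features` would be passed to the KNN classifier and the predicted gesture mapped to a car command.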
Comparative ML Model Analysis
Machine Learning • October 2024 - January 2025
- Compared ANN, XGBoost, SVM, and Random Forest on multiple datasets
- Applied SMOTE, feature encoding, and hyperparameter tuning to prepare the data and evaluate each model
NLP Sentiment Analysis: BERT vs. SVM
Natural Language Processing • August 2022 - January 2023
- Implemented sentiment classifiers in TensorFlow
- Authored a 5,000-word paper comparing model performance
Loan Default Risk Prediction using Machine Learning
Supervised Classification, Financial Modeling, Fairness in AI • December 2024
- Analyzed over 2.2 million LendingClub loan applications (2007–2018) to predict loan default risk using financial indicators like FICO score, debt-to-income (DTI) ratio, and loan grade
- Performed extensive EDA and feature engineering to extract impactful predictors; found FICO score, DTI, and loan grade to be the most significant indicators of default risk
- Implemented and compared SVM, K-Nearest Neighbors (KNN), and Random Forest models:
  - Random Forest yielded the highest F1-score and recall for default detection
  - KNN achieved the best precision
  - SVM had the highest accuracy (80%) but failed to capture positive (default) cases
- Addressed dataset imbalance and ethical concerns by carefully filtering non-representative labels and excluding sensitive demographic attributes
- Applied fairness-aware methods and advocated for transparent, inclusive financial risk models with privacy-conscious practices
- Tools: pandas, scikit-learn, matplotlib, seaborn, GridSearchCV, StandardScaler, FunctionTransformer