Rachana Kadikar

Senior at the University of Southern California passionate about building creative and data-backed solutions. Check out my projects here, and please reach out via email or LinkedIn!

PROJECTS

MoneyTree

March 2025
Python, TensorFlow, Scikit-learn, Streamlit, Pandas, BeautifulSoup, NumPy, Requests, VADER, Gemini API, YFinance API, Git/GitHub

  • Conceptualized and built an end-to-end fintech product addressing beginner investor pain points, delivering an AI-powered personal finance assistant with personalized investment recommendations and 2-year asset price predictions.
  • Conducted user research and market analysis to identify user needs (risk tolerance, financial goals, investing knowledge), designing a matching algorithm that connects users to their top 3 investment opportunities from 1,600+ curated assets.
  • Led product development from ideation to launch, defining product requirements, user experience flows, and technical specifications while collaborating with the development team to deliver an award-winning solution.
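
The matching step described above can be pictured as a score-and-rank pass over the asset list. The following is a minimal sketch under stated assumptions: the `Asset` fields, the 1 to 5 risk scale, the goal tags, and the scoring rule are all hypothetical illustrations, not the project's actual algorithm.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    ticker: str  # hypothetical identifier
    risk: int    # assumed scale: 1 (conservative) to 5 (aggressive)
    goal: str    # assumed tags such as "growth" or "income"


def match_score(asset: Asset, risk_tolerance: int, goal: str) -> float:
    """Higher is better: reward a matching goal, penalize risk mismatch."""
    goal_bonus = 1.0 if asset.goal == goal else 0.0
    risk_penalty = abs(asset.risk - risk_tolerance) / 4  # ranges 0.0 to 1.0
    return goal_bonus - risk_penalty


def top_matches(assets, risk_tolerance, goal, k=3):
    """Return the k best-scoring assets for one user profile."""
    return heapq.nlargest(
        k, assets, key=lambda a: match_score(a, risk_tolerance, goal)
    )


# Made-up assets for illustration only.
assets = [
    Asset("AAA", 1, "income"),
    Asset("BBB", 3, "growth"),
    Asset("CCC", 5, "growth"),
    Asset("DDD", 2, "growth"),
    Asset("EEE", 4, "income"),
]
picks = top_matches(assets, risk_tolerance=3, goal="growth")
print([a.ticker for a in picks])
```

`heapq.nlargest` avoids sorting the full list, which keeps the ranking cheap even when the pool holds well over a thousand assets.
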
Cinephile

May 2025
Python, SQL (MySQL), NoSQL (MongoDB), NLP (Gemini API), TMDb API, Data Wrangling, Prompt Engineering, Schema Design, Git/GitHub

  • Developed a natural language query engine that translates user requests into SQL and MongoDB queries, enabling structured search across movie and TV datasets.
  • Engineered relational and NoSQL schemas optimized for media data (e.g., titles, genres, ratings, cast, streaming platforms), supporting efficient retrieval of complex queries.
  • Collected and wrangled large-scale datasets from the TMDb API, transforming unstructured JSON responses into structured SQL tables and NoSQL documents.
  • Applied prompt engineering with Gemini LLM to improve accuracy of automatically generated queries, overcoming challenges with joins, aggregations, and nested lookups.
  • Strengthened expertise in database design, data pipelines, and NLP-driven query automation.
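
To illustrate the JSON-to-relational step, here is a minimal sketch that splits one nested record into normalized tables and then runs the kind of join a generated query would need. It rests on several assumptions: the standard-library `sqlite3` module stands in for MySQL, the payload is only loosely modeled on a TMDb movie response, and the table and column names are hypothetical.

```python
import sqlite3

# Assumed payload shape; field names are illustrative, not the project's schema.
sample_response = {
    "id": 101,
    "title": "Example Movie",
    "vote_average": 7.8,
    "genres": [{"id": 18, "name": "Drama"}, {"id": 53, "name": "Thriller"}],
}


def load_movie(conn: sqlite3.Connection, movie: dict) -> None:
    """Split one nested JSON record into normalized relational rows."""
    conn.execute(
        "INSERT INTO movies (id, title, rating) VALUES (?, ?, ?)",
        (movie["id"], movie["title"], movie.get("vote_average")),
    )
    for genre in movie.get("genres", []):
        # A genre may already exist from an earlier movie, so ignore duplicates.
        conn.execute(
            "INSERT OR IGNORE INTO genres (id, name) VALUES (?, ?)",
            (genre["id"], genre["name"]),
        )
        conn.execute(
            "INSERT INTO movie_genres (movie_id, genre_id) VALUES (?, ?)",
            (movie["id"], genre["id"]),
        )


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT, rating REAL);
    CREATE TABLE genres (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE movie_genres (
        movie_id INTEGER REFERENCES movies(id),
        genre_id INTEGER REFERENCES genres(id),
        PRIMARY KEY (movie_id, genre_id)
    );
    """
)
load_movie(conn, sample_response)

# A request like "drama movies rated above 7" needs a two-step join.
rows = conn.execute(
    """
    SELECT m.title
    FROM movies AS m
    JOIN movie_genres AS mg ON mg.movie_id = m.id
    JOIN genres AS g ON g.id = mg.genre_id
    WHERE g.name = ? AND m.rating > ?
    """,
    ("Drama", 7.0),
).fetchall()
print(rows)
```

Keeping genres in a junction table is what makes joins and aggregations necessary in the generated SQL, which is the part the prompt engineering bullet describes as challenging.
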
Housing Price Predictor

May 2025
Python, Pandas, Scikit-learn, Matplotlib, NumPy, Linear Regression, Data Visualization, Feature Engineering

  • Built an interactive ML application that filters housing data by user preferences, predicts prices using linear regression, and reports the model's R-squared score.
  • Implemented data preprocessing pipeline with one-hot encoding for categorical variables and multi-criteria filtering based on square footage, bedrooms, bathrooms, year built, and neighborhood.
  • Developed a predictive model using scikit-learn with a train-test split, evaluating its fit on held-out data with R-squared.
  • Created comprehensive data visualizations displaying actual vs. predicted housing prices with scatter plots and rolling average trend lines for model performance analysis.
  • Designed end-to-end data science workflow from user input validation to automated visualization generation for real estate price analytics.
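
Two of the building blocks named above, one-hot encoding and the R-squared score, are small enough to sketch by hand. This is a dependency-free illustration with made-up listings; the function names are hypothetical, and a project built on Pandas and scikit-learn would normally rely on those libraries' built-in equivalents instead.

```python
def one_hot(values):
    """Map each categorical value to a 0/1 indicator row, one column per category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Made-up neighborhoods and prices for illustration only.
neighborhoods = ["Eastside", "Downtown", "Eastside", "Westside"]
categories, encoded = one_hot(neighborhoods)

actual = [300_000, 450_000, 320_000, 500_000]
predicted = [310_000, 440_000, 330_000, 490_000]
score = r_squared(actual, predicted)
print(categories, encoded, round(score, 4))
```

R-squared measures the share of price variance the model explains, so it is a goodness-of-fit score for regression rather than a classification-style accuracy.
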
Determinants of Adult Income: A Longitudinal Analysis

December 2024
Python, Stata, Econometrics, Multiple Linear Regression, Longitudinal Data Analysis, Statistical Modeling, Data Visualization, Hypothesis Testing

  • Analyzed longitudinal data from 8,984 respondents over 24 years using multiple linear regression to identify childhood predictors of adult income, achieving an adjusted R-squared of 0.218.
  • Applied backward selection to optimize model performance, examining relationships between family factors and adult earnings through systematic variable selection.
  • Conducted statistical analysis revealing significant income disparities by gender ($9,975 gap) and race ($5,120 gap), significant at the p < 0.001 level.
  • Engineered quadratic features for parental education variables to capture diminishing returns effects and improve model explanatory power.
  • Translated complex statistical findings into actionable insights, demonstrating ability to communicate data-driven results for policy and business applications.
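
The quadratic term mentioned above captures diminishing returns because it lets the marginal effect of education shrink as education rises. A minimal sketch of that arithmetic follows; the model form is income = b0 + b1 * edu + b2 * edu^2, and the coefficient values are hypothetical placeholders, not estimates from the analysis.

```python
def predicted_income(years_edu, b0, b1, b2):
    """Fitted value from a regression with linear and squared education terms."""
    return b0 + b1 * years_edu + b2 * years_edu ** 2


def marginal_effect(years_edu, b1, b2):
    """Derivative of predicted income with respect to years of education."""
    return b1 + 2 * b2 * years_edu


# Hypothetical coefficients chosen only to show the shape of the curve.
b0, b1, b2 = 20_000, 3_000, -100

effect_at_10 = marginal_effect(10, b1, b2)
effect_at_14 = marginal_effect(14, b1, b2)
turning_point = -b1 / (2 * b2)  # years of education where the effect reaches zero

print(effect_at_10, effect_at_14, turning_point)
```

A negative squared-term coefficient is what produces the concave shape: each additional year of parental education adds less predicted income than the year before it.
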
Exoplanet Candidate Classification

May 2025
Python, Scikit-learn, Pandas, Matplotlib, PCA, Cross-Validation, Classification Models, Feature Engineering, Model Optimization

  • Designed end-to-end ML workflow including data preprocessing, dimensionality reduction analysis, automated hyperparameter tuning, and model performance visualization for astronomical data classification.
  • Built multi-algorithm classification system comparing Logistic Regression, KNN, Decision Tree, and SVM models to predict exoplanet candidates using NASA Kepler mission data with orbital and stellar parameters.
  • Implemented automated model selection pipeline using RandomizedSearchCV and GridSearchCV for hyperparameter optimization, comparing PCA vs. non-PCA feature sets to maximize classification accuracy.
  • Performed comprehensive model evaluation using confusion matrices, classification reports, and cross-validation techniques to assess precision, recall, and F1-scores across different exoplanet classification categories.
  • Conducted feature correlation analysis to identify the most influential predictors, finding orbital inclination to be the variable most highly correlated with exoplanet detection probability.
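
The evaluation metrics named above all derive from confusion-matrix counts. The sketch below computes precision, recall, and F1 for one class by hand; the labels and predictions are made up for illustration, the function name is hypothetical, and a scikit-learn workflow would typically obtain the same figures from its classification report.

```python
from collections import Counter


def class_metrics(y_true, y_pred, label):
    """Precision, recall, and F1 for one class, from confusion-matrix counts."""
    counts = Counter(zip(y_true, y_pred))  # (true, predicted) -> frequency
    tp = counts[(label, label)]
    fp = sum(n for (t, p), n in counts.items() if p == label and t != label)
    fn = sum(n for (t, p), n in counts.items() if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


# Made-up labels: C = candidate, FP = false positive.
y_true = ["C", "C", "C", "FP", "FP", "FP", "FP", "C"]
y_pred = ["C", "C", "FP", "FP", "FP", "C", "FP", "C"]

precision, recall, f1 = class_metrics(y_true, y_pred, "C")
print(precision, recall, f1)
```

Reporting these per class matters when categories are imbalanced, since overall accuracy can look strong while recall on the rarer class stays low.
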