Avikshith Reddy Yelakonda — Data Science Portfolio

PRODUCTION ML · ANALYTICS · GENAI SYSTEMS

I build production-grade machine learning, analytics, and generative AI systems that turn messy data into decision-ready products: predictive models, RAG pipelines, experimentation platforms, and monitored deployment workflows.

Python · SQL · LLMs · RAG · ML Pipelines · MLOps

Dallas, Texas · MS Computer Science (AI/ML), SMU

This portfolio spans machine learning, analytics, and applied AI systems: predictive modeling, experimentation, recommendation, OCR, time-series forecasting, and retrieval-augmented generation. Each project is framed around a business problem, a technical approach, and an operational outcome.

Reinforcement Learning · Q-Learning · Interactive

Dino RL Sandbox

Built an interactive reinforcement-learning sandbox that makes Q-learning legible to non-specialists through environment controls, policy updates, and live value-map visualization.

Architecture

Grid World → State/Action → Reward Shaping → Q-Learning Update → Value Map
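
For illustration, the Q-Learning Update step in this flow can be sketched as a tabular update. This is a minimal sketch, not the sandbox's actual code: it assumes a hypothetical five-state corridor in place of the grid world, and the reward values and the `ALPHA`, `GAMMA`, and `EPSILON` settings are illustrative.

```python
import random

N_STATES = 5            # hypothetical corridor: states 0..4, goal at state 4
ACTIONS = (-1, +1)      # move left or move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1  # illustrative hyperparameters


def step(state, action):
    """Apply an action; reaching the last state pays 1.0 and ends the episode."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    reward = 1.0 if done else -0.01  # small step penalty as simple reward shaping
    return nxt, reward, done


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy action selection.
            if rng.random() < EPSILON:
                a = rng.randrange(2)
            else:
                a = max(range(2), key=lambda i: q[state][i])
            nxt, reward, done = step(state, ACTIONS[a])
            # Q-learning update: move the estimate toward the bootstrapped target.
            target = reward + (0.0 if done else GAMMA * max(q[nxt]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = nxt
    return q


q_table = train()
# Per-state value (best action value), the quantity a value map would display.
value_map = [max(row) for row in q_table]
```

The goal state itself is never updated, so its entry stays at zero; the values of the other states rise as they get closer to the goal.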

NLP · RAG · Reviews Intelligence

Executive Review Intelligence (LLM + RAG)

Designed a reviews intelligence system that clusters recurring issues, attaches grounded evidence, scores outputs with RAGAS, and routes executive alerts so teams can move from raw feedback to action quickly.

Architecture

Reviews → Cleaning → Embeddings → Topic Clustering → RAG Evidence → RAGAS Eval → Slack Alerts

OCR · CV · Automation

AI Expense Claim Processing (OCR + automation)

Automated a high-friction expense workflow with OCR, field extraction, confidence scoring, and human review, reducing manual handling while improving auditability and downstream analytics readiness.

Architecture

Receipts → OCR → Field Extraction → Validation → API → Review UI
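
For illustration, the Field Extraction and Validation steps might look like the sketch below. It assumes the OCR engine has already returned plain text plus an overall confidence value; the regular expressions, the `Field` type, and the 0.8 review threshold are hypothetical and not taken from the project.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Field:
    name: str
    value: Optional[str]
    confidence: float


# Hypothetical patterns; real receipts need much broader coverage.
# The leading \b keeps "Subtotal" from matching as "total".
TOTAL_RE = re.compile(r"\btotal\s*[:\-]?\s*\$?\s*(\d+\.\d{2})", re.IGNORECASE)
DATE_RE = re.compile(r"\b(\d{4}-\d{2}-\d{2})\b")


def extract_fields(ocr_text: str, ocr_confidence: float) -> List[Field]:
    """Pull known fields out of OCR text; a missing field gets confidence 0."""
    fields = []
    for name, pattern in (("total", TOTAL_RE), ("date", DATE_RE)):
        match = pattern.search(ocr_text)
        if match:
            fields.append(Field(name, match.group(1), ocr_confidence))
        else:
            fields.append(Field(name, None, 0.0))
    return fields


def needs_review(fields: List[Field], threshold: float = 0.8) -> bool:
    """Route the claim to human review if any field falls below the threshold."""
    return any(f.confidence < threshold for f in fields)


sample = "ACME CAFE\n2024-03-15\nSubtotal 11.50\nTOTAL: $12.45\n"
extracted = {f.name: f.value for f in extract_fields(sample, 0.93)}
```

Giving a missing field a confidence of zero means an incomplete receipt always reaches the review queue rather than passing through silently.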

LLM · RAG · Finance

Financial LLM Chatbot

Built a finance-focused LLM assistant that combines fine-tuning with retrieval over SEC filings to deliver cited answers, explain key metrics, and support analyst research workflows.

Architecture

Filings → Chunking → Embeddings → FAISS → LLM → Cited Answer
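
For illustration, the Chunking step can be sketched as overlapping word windows. This is an assumption about how the filings might be split, not the project's confirmed method; the `size` and `overlap` defaults are hypothetical, and embedding and FAISS indexing are left out.

```python
def chunk_words(text, size=200, overlap=40):
    """Split text into windows of `size` words that share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


demo = chunk_words(" ".join(str(i) for i in range(10)), size=4, overlap=1)
```

Overlap keeps a sentence that straddles a boundary retrievable from at least one chunk, at the cost of some duplicated text in the index.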

RAG · Education · Search

Teaching Assistant RAG

Built a multi-course teaching assistant that uses hybrid retrieval and course-isolated knowledge stores to deliver citation-grounded answers with lower hallucination risk and better student self-service.

Architecture

Syllabus/Notes → TF‑IDF + Embeddings → Vector Store → RAG → Streamlit UI
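
For illustration, hybrid retrieval over course-isolated stores can be sketched as below. This is a minimal sketch under stated assumptions: the `CourseStore` class, the sample documents, and the `alpha` blend weight are hypothetical, and the embedding side is represented by caller-supplied `dense_scores` rather than a real embedding model.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


class CourseStore:
    """Holds documents for one course only, so a query never crosses courses."""

    def __init__(self, docs):
        self.docs = docs
        self.tokens = [Counter(tokenize(d)) for d in docs]
        n = len(docs)
        doc_freq = Counter(t for counts in self.tokens for t in counts)
        # Smoothed inverse document frequency.
        self.idf = {t: math.log((1 + n) / (1 + k)) + 1.0 for t, k in doc_freq.items()}

    def tfidf_score(self, query, i):
        q = Counter(tokenize(query))
        return sum(q[t] * self.tokens[i][t] * self.idf.get(t, 0.0) ** 2 for t in q)

    def search(self, query, dense_scores=None, alpha=0.5, k=3):
        lexical = [self.tfidf_score(query, i) for i in range(len(self.docs))]
        top = max(lexical, default=0.0) or 1.0  # avoid dividing by zero
        lexical = [s / top for s in lexical]
        if dense_scores is None:
            dense_scores = [0.0] * len(self.docs)  # lexical-only fallback
        combined = [alpha * l + (1 - alpha) * d for l, d in zip(lexical, dense_scores)]
        order = sorted(range(len(self.docs)), key=lambda i: combined[i], reverse=True)
        return [self.docs[i] for i in order[:k]]


stores = {
    "CS101": CourseStore([
        "Homework is due every Friday at noon",
        "The midterm covers recursion and sorting",
    ]),
    "STAT200": CourseStore([
        "The midterm covers hypothesis testing",
        "Office hours are on Tuesday",
    ]),
}
hits = stores["CS101"].search("when is the midterm", k=1)
```

Selecting the store by course before searching is what keeps a STAT200 passage from being cited in a CS101 answer.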

Experimentation · Causal Inference

A/B Testing & Uplift Modeling

Built an experimentation workflow for incremental-impact measurement, combining power analysis, uplift modeling, and segment targeting to identify the customers most likely to convert under treatment.

Architecture

Experiment Data → Power/Lift → Uplift Model → Segment Targeting
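
For illustration, the Power/Lift step can be sketched with the standard normal-approximation sample size for comparing two conversion rates, plus an observed per-segment lift. The function names and example rates are hypothetical, and the per-segment difference is a plain descriptive lift, not the uplift model itself.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(p_control, p_treatment, alpha=0.05, power=0.8):
    """Approximate users needed per arm for a two-sided two-proportion test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_treatment - p_control
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


def segment_lift(rows):
    """rows: (segment, treated, converted). Returns treated minus control rate."""
    counts = {}
    for segment, treated, converted in rows:
        arms = counts.setdefault(segment, {True: [0, 0], False: [0, 0]})
        arms[treated][0] += int(converted)  # conversions
        arms[treated][1] += 1               # exposures
    lift = {}
    for segment, arms in counts.items():
        (conv_t, n_t), (conv_c, n_c) = arms[True], arms[False]
        if n_t and n_c:  # skip segments missing one of the arms
            lift[segment] = conv_t / n_t - conv_c / n_c
    return lift


n_needed = sample_size_per_arm(0.10, 0.12)
```

Detecting a smaller effect requires a larger sample, which is why the minimum detectable effect is worth fixing before the experiment starts.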

Forecasting · Retail

Walmart Demand Forecasting

Compared classical and deep-learning forecasting approaches for short-horizon retail demand planning, using feature engineering and error diagnostics to support staffing and inventory decisions.

Architecture

Sales Data → Feature Engineering → Model Compare → Forecast → Error Tracking
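
For illustration, the Model Compare and Error Tracking steps usually start from a baseline and shared error metrics. The sketch below uses a seasonal-naive forecast as a stand-in baseline; it is not one of the project's models, and the weekly season length and sample numbers are assumptions.

```python
def seasonal_naive(history, season=7, horizon=7):
    """Forecast each future step with the value observed one season earlier."""
    if len(history) < season:
        raise ValueError("need at least one full season of history")
    last_season = history[-season:]
    return [last_season[h % season] for h in range(horizon)]


def mae(actual, forecast):
    """Mean absolute error, in the units of the series."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error; zero actuals are skipped."""
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


weekly_pattern = [10, 12, 14, 13, 15, 20, 22]  # illustrative daily unit sales
baseline = seasonal_naive(weekly_pattern * 2, season=7, horizon=9)
```

A candidate model earns its added complexity only if it beats this baseline on the same holdout window and the same metrics.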

Classification · Lead Scoring

Lead Conversion Modeling (Consumer Affairs)

Built a lead-scoring pipeline for imbalanced conversion data, using feature engineering, calibrated XGBoost, and SHAP-based interpretation to prioritize outreach by expected business value.

Architecture

Leads → Feature Engineering → Model → SHAP → Ranked Outreach

A/B Testing · Marketing Mix

A/B Testing for Marketing Campaign Optimization

Evaluated marketing-channel performance with statistical testing and time-aware regression analysis to inform budget allocation and choose the higher-return campaign strategy.

Architecture

Channel Spend → Experiment → Statistical Test → ROI Decision
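
For illustration, the Statistical Test step could be a two-proportion z-test on conversion counts from two campaigns. This is a sketch under that assumption; the source does not say which test was used, and the counts below are made up.

```python
import math
from statistics import NormalDist


def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rate (B minus A)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Illustrative counts: campaign A converts 100 of 1000, campaign B 130 of 1000.
z_stat, p_val = two_proportion_ztest(100, 1000, 130, 1000)
```

The p-value only says whether the difference is distinguishable from noise; the ROI decision still has to weigh the size of the lift against channel cost.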

Ranking · Recommenders

Movie Recommendation & Ranking

Built a recommendation and ranking workflow over user-item interactions to generate personalized Top-N suggestions and support experimentation around engagement and discovery quality.

Architecture

User Events → Feature Engineering → Ranking Model → Top‑N Recommendations
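
For illustration, Top-N generation can be sketched with item co-occurrence counts. This is a simple stand-in for the ranking model, which the source does not specify; the users and watch histories are invented.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(user_items):
    """Count how often each pair of items appears in the same user's history."""
    co = defaultdict(Counter)
    for items in user_items.values():
        for a, b in combinations(sorted(set(items)), 2):
            co[a][b] += 1
            co[b][a] += 1
    return co


def recommend(user, user_items, co, n=3):
    """Score unseen items by co-occurrence with the user's seen items."""
    seen = set(user_items[user])
    scores = Counter()
    for item in seen:
        for other, count in co[item].items():
            if other not in seen:
                scores[other] += count
    # Sort by score, then by title so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:n]]


history = {  # illustrative interaction data
    "ana": ["Alien", "Blade Runner", "Arrival"],
    "ben": ["Alien", "Blade Runner"],
    "cy": ["Alien", "Arrival", "Dune"],
    "dee": ["Blade Runner"],
}
co_counts = build_cooccurrence(history)
top_for_ben = recommend("ben", history, co_counts)
```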

Churn · Segmentation

Customer Churn & Segmentation

Combined churn prediction and segmentation to help retention teams identify at-risk customers, prioritize intervention, and tailor actions to higher-value behavioral segments.

Architecture

Customer Data → Feature Engineering → Churn Model → Segmentation → Retention Playbooks

Risk · Fraud Detection

Fraud Detection Modeling

Developed an imbalance-aware fraud detection pipeline that generates review-ready risk scores, helping teams surface suspicious transactions while controlling false positives.

Architecture

Transactions → Feature Engineering → Fraud Model → Risk Scoring → Review Queue

Healthcare · EHR Analytics

EHR Analytics & Predictive Modeling

Standardized noisy EHR data into modeling-ready datasets and built predictive baselines that support clinical risk analysis and more reliable downstream healthcare analytics.

Architecture

EHR Data → Cleaning → Feature Engineering → Predictive Model → Clinical Insights

Time Series · Finance

Stock Market Time‑Series Analysis

Analyzed equity time series with indicator engineering and visual diagnostics to compare trend, volatility, and portfolio behavior across multiple public-market assets.

Architecture

Price Data → Indicators → Diagnostics → Visualization
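
For illustration, two common indicators for the Indicators step are a simple moving average and rolling volatility of log returns. The source does not list which indicators the project computes, so these choices and the window sizes are assumptions.

```python
import math


def simple_moving_average(prices, window):
    """Mean of the last `window` prices at each point where a full window exists."""
    return [
        sum(prices[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(prices))
    ]


def log_returns(prices):
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def rolling_volatility(prices, window):
    """Sample standard deviation of log returns over a rolling window."""
    if window < 2:
        raise ValueError("window must be at least 2")
    returns = log_returns(prices)
    out = []
    for i in range(window - 1, len(returns)):
        chunk = returns[i - window + 1:i + 1]
        mean = sum(chunk) / window
        variance = sum((r - mean) ** 2 for r in chunk) / (window - 1)
        out.append(math.sqrt(variance))
    return out


sma = simple_moving_average([1, 2, 3, 4, 5], window=3)
```

Log returns are used for volatility because they are comparable across assets with very different price levels.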

Clustering · Healthcare

Heart Disease Risk Clustering (Health Care)

Applied dimensionality reduction and clustering to uncover patient-risk groupings in partially labeled clinical data, enabling more targeted downstream analysis.

Architecture

Clinical Data → PCA → Clustering → Risk Groups

Matching · Ranking

Tenant Matcher

Built a preference-based property matching system that scores candidate listings, ranks them for fit, and automates recommendation delivery to renters.

Architecture

Preferences → Scoring → Ranking → Email Delivery
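
For illustration, the Scoring and Ranking steps can be sketched as a weighted fit score with a hard budget constraint. The field names, `WEIGHTS`, and sample listings are hypothetical; email delivery is left out.

```python
WEIGHTS = {"rent": 0.4, "bedrooms": 0.3, "amenities": 0.3}  # hypothetical weights


def score_listing(listing, prefs):
    """Return a fit score in [0, 1], or None if the listing is over budget."""
    if listing["rent"] > prefs["max_rent"]:
        return None  # hard constraint: never recommend over-budget listings
    rent_fit = 1.0 - listing["rent"] / prefs["max_rent"]
    bedroom_fit = 1.0 if listing["bedrooms"] >= prefs["min_bedrooms"] else 0.0
    wanted = set(prefs["amenities"])
    offered = set(listing["amenities"])
    amenity_fit = len(wanted & offered) / len(wanted) if wanted else 1.0
    return (
        WEIGHTS["rent"] * rent_fit
        + WEIGHTS["bedrooms"] * bedroom_fit
        + WEIGHTS["amenities"] * amenity_fit
    )


def rank_listings(listings, prefs, top_n=3):
    """Score every listing, drop over-budget ones, and return the best ids."""
    scored = [(score_listing(item, prefs), item["id"]) for item in listings]
    eligible = [(s, i) for s, i in scored if s is not None]
    return [i for _, i in sorted(eligible, key=lambda t: (-t[0], t[1]))[:top_n]]


prefs = {"max_rent": 2000, "min_bedrooms": 2, "amenities": ["parking", "laundry"]}
listings = [
    {"id": "A", "rent": 1800, "bedrooms": 2, "amenities": ["parking"]},
    {"id": "B", "rent": 1500, "bedrooms": 1, "amenities": ["parking", "laundry"]},
    {"id": "C", "rent": 2200, "bedrooms": 3, "amenities": ["parking", "laundry"]},
    {"id": "D", "rent": 1600, "bedrooms": 2, "amenities": ["parking", "laundry", "gym"]},
]
ranking = rank_listings(listings, prefs)
```

Separating hard constraints from weighted preferences keeps a strong amenity match from outweighing a rent the renter cannot pay.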

Data Collection · Scraping

Job Web Scraping (job_web)

Created reusable scraping pipelines that collect job-market listings, normalize them into structured datasets, and support downstream labor-market analysis.

Architecture

Sites → Scraper → Normalization → Dataset

From Data to Production Decision Systems

I bridge analytics, predictive modeling, and applied AI to ship systems that are measurable, production-aware, and useful to real teams.

Predictive Modeling

Applied AI Systems

Production Delivery

Selected Impact

Causal Impact Measurement

Mortgage Risk Modeling

Lead Scoring Performance

Forecasting & Analytics Delivery

Skills

Predictive Modeling & Decision Science

LLM, RAG & Applied AI Systems

Data Platforms & MLOps

Core Languages

ML / AI Stack

Data & Cloud Platforms

Analytics & Reporting

ML Skill Analysis

Top skill clusters

Signal cloud

Why these rank high

Projects

Dino RL Sandbox

Executive Review Intelligence (LLM + RAG)

AI Expense Claim Processing (OCR + automation)

Financial LLM Chatbot

Teaching Assistant RAG

A/B Testing & Uplift Modeling

Walmart Demand Forecasting

Lead Conversion Modeling (Consumer Affairs)

A/B Testing for Marketing Campaign Optimization

Movie Recommendation & Ranking

Customer Churn & Segmentation

Fraud Detection Modeling

EHR Analytics & Predictive Modeling

Stock Market Time‑Series Analysis

Heart Disease Risk Clustering (Health Care)

Tenant Matcher

Job Web Scraping (job_web)

Experience

Data Scientist — PioneerSoft (Client: Freddie Mac)

Data Analyst — Yogin

Additional Leadership & Operations Experience

Education & Professional Development

M.S. Computer Science — Artificial Intelligence & Machine Learning

B.Tech Computer Engineering

Leadership Roles

Recognition

Certifications

Hackathons & Events

Let's Connect