OPEN TO OPPORTUNITIES · SAN JOSE, CA

Mayuresh Pramod Pandey

// Data Analyst · Data Engineer · Data Scientist

MS Data Analytics @ SJSU · 3+ years building scalable data pipelines, forecasting models, and analytics solutions across Azure, Snowflake, PySpark & Python.

3+ Years Experience
2B+ Records Processed
15–20% Forecast Accuracy ↑
3 Publications
// module_02 · biography

About Me

Data professional passionate about turning raw data into decisions — forecasting models, ETL pipelines, and ML systems at scale.

Core Strengths
📊 Statistical Analysis · 🔄 ETL/ELT Pipelines · 🤖 ML Forecasting · ☁️ Cloud Native · 🎯 Outcome-Focused
"I bridge the gap between raw data and real decisions — at any scale, across any domain."
Mission Log
2024 – PRESENT
MS Data Analytics
SJSU · May 2026
MAY – AUG 2025
Data Analyst / DE / DS Intern
Schneider National · WI, USA
AUG 2021 – JUN 2024
Data Analyst · Engineer · Scientist
Mu Sigma · Bangalore, India
JUN 2021
BE Information Technology
University of Mumbai
Education
MS Data Analytics
San Jose State University
Big Data · ML · Deep Learning · GenAI · Data Warehouse · Distributed Systems
Certifications
🏅 AWS Certified Solutions Architect – Associate
📄 3 Research Publications (IEEE, Elsevier, AI Journal)
// module_03 · missions

My Projects

Engineering Mindset
📐 Architecture-First · 🔁 End-to-End Pipelines · 📊 Insight-Driven · ⚡ Performance Obsessed
"Every pipeline I build is designed to scale, every model to generalize."
🔷
Azure E-Commerce Data Pipeline
End-to-end pipeline on Azure using ADF, Data Lake Gen2, Databricks (PySpark). Medallion Architecture. Synapse Analytics + Tableau KPI dashboards.
PySpark · ADF · Databricks · Synapse · Tableau
🎵
Spotify Analytics Pipeline & Dashboard
Apache Airflow + Snowflake + dbt pipeline integrating Spotify API historical & real-time data. Power BI dashboards for artist & streaming insights.
Airflow · Snowflake · dbt · Power BI · Spotify API
EV Charging Station Analysis
Power BI dashboard on 78K+ US Energy records. Python EDA + DAX visualizations (geospatial, tree maps, decomposition). ML forecasting for EV expansion.
Power BI · Python · DAX · ML Forecasting
🧠
LLM-Based RAG Pipeline
Modular RAG pipeline using LangChain, OpenAI & FAISS for legal document summarization. 35% token reduction via dynamic chunking. Claude, Gemini, Mistral benchmarked.
LangChain · FAISS · OpenAI · RAG · Python
🏦
Precision Banking Prediction
94% accuracy & ROC-AUC 0.94 — Random Forest on Bank Marketing dataset with SMOTE, EDA, hyperparameter tuning. Deployed via Streamlit.
Scikit-Learn · Streamlit · SMOTE · Random Forest
📰
MyNewsMate AI News App
Hybrid NLP recommender (TF-IDF + BART + VADER) with Celery + Redis. 40% CTR boost, 30% engagement up, 25% bounce rate down. Django REST + React + AWS.
Django · React · AWS · NLP · Celery · Redis
// module_04 · systems online

My Skills

Full data stack — from raw ingestion to real-time insights. No percentages, just real experience.

Technical Arsenal
🐍 Python Expert · ☁️ Azure & AWS · ⚙️ PySpark & Spark · 🤖 ML & GenAI · 📊 BI Dashboards
"From raw ingestion to real-time insight — I own the full data stack."
Languages
Python · PySpark · SQL · R · JavaScript · C / C++
Data Engineering
Apache Spark · Airflow · dbt · Delta Lake · Hive · Snowflake · Databricks · Data Factory · Synapse
Databases
MySQL · PostgreSQL · MongoDB · Hadoop · Spark SQL · Neo4j
Cloud & DevOps
Azure · AWS · Docker · Kubernetes · Jenkins · Azure DevOps
Analytics & BI
Power BI · Tableau · Excel (Adv) · DAX · A/B Testing · EDA
ML & Forecasting
Prophet · ARIMA/ARIMAX · Scikit-Learn · TensorFlow · PyTorch · Kalman Filter · NeuralProphet · SHAP
AI & GenAI
LangChain · OpenAI API · FAISS · Hugging Face · Azure OpenAI · RAG Pipelines
Other Tools
Git / GitHub · Streamlit · FastAPI · Great Expectations · Sphinx
// module_05 · flight log

My Experience

Impact-driven roles across data engineering, analytics, and machine learning at scale.

Impact Metrics
📉 40% Data Prep Time ↓ · 📈 15–20% Forecast Accuracy ↑ · 🎯 30% Manual Work ↓ · 💰 8–12% Margin ↑
"Every metric I move is backed by a pipeline I built and a model I shipped."
SN
Data Analyst / Data Engineer / Data Scientist Intern
Schneider National · WI, USA
MAY 2025 – AUG 2025
Python, PySpark, Spark SQL, Prophet, Kalman Filtering, Azure, Oracle, Snowflake, Hive


μΣ
Data Analyst · Data Engineer · Data Scientist
Mu Sigma · Bangalore, India
AUG 2021 – JUN 2024
PySpark, Hive, Azure Data Lake, Azure Databricks, Azure Blob, Azure Key Vaults, Azure DevOps


// module_06 · research transmissions

Publications & Certifications

Peer-reviewed research across ML, IoT, and AI systems — published at IEEE, Elsevier, and international AI journals.

Research Focus
🧠 Machine Learning · 📡 IoT Systems · 🌐 Social Networks · 🏅 AWS Certified
"Published at IEEE, Elsevier, and AI journals — research that bridges theory and production."
📄
Mental Health Prediction for Juveniles using Machine Learning
Elsevier Conference · 2021 · SSRN: 3867291
📡
Smart Emergency Vehicle Detection using IoT & Machine Learning
IEEE INAC-4 · 2019
🌐
Identity Resolution in Social Networks using Recommender Systems
Journal of AI & Systems · 2019 · ISSN: 2642-2859
🏅
AWS Certified Solutions Architect – Associate
Amazon Web Services · Active Certification
// module_07 · comms array

Get In Touch

Open to Data Analyst, Data Engineer, and Data Scientist roles, full-time or internship. I'd love to hear from you.

Availability
💼 Full-time Roles · 🎓 Internships · 🤝 Collaborations · 🌐 Remote Friendly
"Let's turn your data challenges into measurable business outcomes."