Dec 2021 – July 2024
Systems Engineer (Full-Time)
Tata Consultancy Services (TCS) · Hyderabad, India
LLM Integration & RAG: Developed GPT-3.5–powered enterprise applications using Python (FastAPI), Azure OpenAI, and LangChain; designed modular prompt templates, system instructions, and multi-turn conversational flows to improve reliability and accuracy. Built RAG pipelines using Azure Cognitive Search, vector embeddings, chunking strategies, and metadata-based retrieval to deliver grounded, domain-specific answers for document-heavy enterprise use cases.
Statistical Software Development in R & Python: Developed production statistical computing solutions in R and Python for enterprise analytics serving 10K+ users; implemented data processing pipelines, statistical models, and machine learning algorithms in both languages; optimized computational performance through efficient data structures, vectorized operations, and algorithmic improvements, reducing runtime by 40%; designed modular, reusable code libraries following software engineering best practices, including version control (Git), unit testing (testthat in R, pytest in Python), and comprehensive documentation.
Data Structure Optimization & Package Development: Engineered advanced data structures in R and Python optimizing memory usage and performance for large-scale datasets (10M+ records); developed and maintained internal packages and libraries supporting analytical workflows; implemented efficient algorithms for data transformation, aggregation, and statistical computation; applied profiling tools (R profvis, Python cProfile) identifying performance bottlenecks and implementing optimization strategies; created scalable solutions supporting both interactive analysis and production deployments.
Data Engineering & Big Data Pipelines: Engineered large-scale data pipelines using Azure Data Factory and Azure Databricks; developed PySpark ETL jobs on Spark clusters, optimized distributed transformations, and implemented Delta Lake–based architectures on Data Lake Storage (ADLS) to ensure reliable, versioned, and ML-ready datasets.
Research Software Engineering & Repository Development: Built data repository systems managing large-scale datasets with versioning, metadata tracking, and quality validation; developed automated workflows for data ingestion, transformation, and analysis using R and Python; implemented parallel processing frameworks (R parallel, Python multiprocessing) for computationally intensive statistical operations; established software development best practices including code review, continuous integration, and automated testing; maintained production systems with comprehensive monitoring ensuring reliability and correctness of statistical computations. Created data visualizations using ggplot2 and matplotlib for technical and non-technical audiences. Applied statistical methods using R (tidyverse, caret, forecast) and Python (scikit-learn, statsmodels, SciPy) for hypothesis testing, regression modeling, time-series forecasting, and experimental design; developed reproducible statistical analysis workflows producing comprehensive reports and visualizations.
Data Pipeline Architecture & Azure Databricks Engineering: Engineered large-scale data pipelines using Azure Data Factory and Azure Databricks supporting enterprise AI applications; developed PySpark ETL jobs on Spark clusters processing 100K+ records daily, optimized distributed transformations with partitioning strategies, and implemented Delta Lake architectures on Azure Data Lake Storage (ADLS) ensuring reliable, versioned, analytics-ready datasets with 99.9% uptime across production environments.
Data Modeling & Database Optimization: Designed scalable data models supporting business intelligence and analytics workloads using Microsoft SQL Server and Azure cloud platforms; implemented indexing strategies and query optimization best practices reducing data retrieval times by 40%; collaborated cross-functionally with product managers, analysts, and engineers to translate business requirements into technical data architectures meeting governance and performance standards.
Document Intelligence & Cognitive Automation: Integrated Azure Document Intelligence, Cognitive Search, Blob Storage, and Function Apps to automate document ingestion, semantic indexing, and large-scale retrieval workflows. Implemented machine learning and computer vision models using TensorFlow, Keras, and OpenCV for classification.
Advanced Analytics Infrastructure & Semantic Search: Developed semantic search and retrieval systems using Azure Cognitive Search, vector embeddings, and metadata-based indexing to support analytics use cases; implemented chunking strategies and distributed data processing optimizing information retrieval performance; built scalable backend microservices using Python (FastAPI) and SQL supporting cross-database lookups, data aggregation, and structured content extraction for business intelligence applications.
Backend Engineering & Intelligent Retrieval: Developed scalable backend microservices supporting AI inference, vector search, prompt routing, and hybrid retrieval (keyword + embedding) for cross-database lookups, multi-document summarization, and structured content extraction using Python (FastAPI), Azure Cloud, and Microsoft SQL Server.
Production Systems Monitoring & Performance Enhancement: Deployed scalable data systems on Azure Cloud with comprehensive monitoring pipelines tracking data quality metrics, pipeline performance, and system reliability; implemented CI/CD workflows automating deployment processes; displayed strong ownership by iterating on data architectures based on performance analytics and stakeholder feedback; worked in a fast-paced environment with regular releases, continuously enhancing data capabilities to solve business-critical problems.
Oct 2020 – June 2021
Startup Founder and AI/ML Engineer (Full-Time)
Vaaraahi Farms (GST Registered) · Andhra Pradesh, India
Technology Leadership: Founded a registered services business offering technology-driven IoT and ML/data analytics solutions to agriculture-focused clients. Built regression, anomaly-detection, and forecasting models to predict nutrient imbalance, detect system faults, and optimize growing conditions in controlled-environment farms.
Statistical Computing & Predictive Modeling: Implemented time-series forecasting models using R (forecast, zoo, tseries) and Python (statsmodels, Prophet) analyzing 1,000+ IoT sensor streams; developed anomaly detection algorithms, regression models, and classification systems supporting operational decision-making; applied statistical rigor to ensure model validity, appropriate uncertainty quantification, and reproducible results.
Software Optimization & Algorithm Development: Optimized R and Python code for computational efficiency improving algorithm performance by 30%; implemented efficient data structures for time-series data, sensor measurements, and aggregated statistics; applied performance profiling identifying computational bottlenecks and implementing targeted optimizations; developed scalable solutions balancing statistical accuracy with computational efficiency.
Data-Driven Problem Solving & Requirements Gathering: Collaborated with stakeholders to gather business requirements and data needs; designed robust experiment plans combining quantitative data analysis with domain expertise; translated user requirements into technical data solutions achieving clarity from ambiguous problem statements and delivering measurable business value.
Predictive Maintenance: Implemented predictive maintenance algorithms that identified pump degradation, aeration failures, and chemical drift patterns using IoT telemetry, reducing downtime and enhancing system reliability.
Time-Series Data Processing & Predictive Analytics: Built production data pipelines processing IoT telemetry data using Python (scikit-learn, Pandas), SQL databases, and batch processing workflows; implemented LSTM-based time-series forecasting, anomaly detection algorithms, and optimization models analyzing sensor data patterns; ensured data quality and integrity across multiple data sources, proactively resolving discrepancies and improving data accuracy for operational decision-making in production environments.
ML Analytics: Developed ML-driven solutions for practical on-ground problems such as predicting lettuce tip-burn before it became visible, forecasting ammonia spikes in aquaponics to prevent fish mortality, estimating optimal harvest timing using microclimate patterns, and detecting abnormal pH drift caused by microbial activity, resulting in proactive interventions and higher system stability.
Analytics Impact Measurement & Documentation: Measured and tracked impact of data-driven solutions through statistical analysis and controlled experiments; displayed strong ownership by continuously iterating on data models based on results and stakeholder feedback; documented data processes, analytical methodologies, and technical designs; worked closely with end users to understand data needs deeply and deliver analytics solutions improving operational efficiency and system reliability with quantifiable results.
Aug 2024 – Present
Graduate Student Employee (Part-Time)
University of Central Missouri · Warrensburg, Missouri
Data Collection & Validation: Collected, validated, and cleaned institutional library datasets from multiple source systems to support operational reporting and usage analysis.
Exploratory Data Analysis: Conducted exploratory data analysis (EDA) to identify resource demand trends, usage patterns, and data quality gaps across services.
Data Visualization & Reporting: Developed visualizations and analytical summaries to communicate insights and support data-driven planning decisions.
Stakeholder Collaboration: Collaborated with stakeholders to translate reporting requirements into structured, analysis-ready datasets and actionable insights.