I am a Data Scientist at heart and a software developer by practice.
Throughout my career, I have gained over three years of experience in software development and one year in Machine Learning.
Currently, I am working as a Research Assistant at NLP Labs, where my responsibilities include developing an application (using Flask and PostgreSQL)
to create a state-of-the-art database. Additionally, I am collaborating with and learning from researchers at the Development and Cognitive Sciences Lab
to devise statistical analysis methodologies for multilevel and multimodal analysis (in Python and R). I am
also involved in creating Machine Learning pipelines
using cutting-edge transformer models to develop classification models (using PyTorch).
In my free time, I enjoy playing chess, solving Rubik's Cube puzzles, and practicing origami.
I am also cultivating a habit of reading self-development books. Currently, I am reading "Think and Grow
Rich" by Napoleon Hill.
I firmly believe that the fusion of Data Analysis, Machine Learning, and Software Development, coupled
with my people skills,
would create a positive and impactful environment in my workplace.
Languages: | Python |
---|---|
Databases: | SQL, NoSQL, MongoDB |
Frameworks: | Flask, Django, PyTorch, TensorFlow, spaCy, NLTK |
Web Frameworks: | Node.js, React |
ML Algorithms: | Regression, Classification, Clustering, Neural Networks, Decision Trees, Random Forests |
Statistical Analysis: | Descriptive Statistics, Hypothesis Testing, A/B Testing, Probability Theory |
Data Visualization: | Matplotlib, Seaborn, Tableau, Power BI |
Platforms: | Linux, AWS |
Miscellaneous: | Git, Docker, Kubernetes, CI/CD, Hadoop, Spark |
• Working under Professor Alberto Ortega as part of the Faculty Assistance in Data Science (FADS) Program by the Luddy School of Informatics.
• Extracted and processed 5 years' worth of data from the National Directory of Mental Health Treatment Facilities' PDF files into a valuable source of contact information by utilizing Python, Regular Expressions, and web scraping techniques.
• Working under the guidance of Professor Minkyung Koo to review and summarize 40+ research papers, focusing on coupon use, discount, and deal proneness, to identify dependent variables, independent variables, mediators, moderators, and main observations.
• Developing a Flask application for the Ellipsis and Elided Elements in Natural Language: The Hoosier Ellipsis Corpus (https://nlp-lab.org/ellipsis/) to create a state-of-the-art database under the guidance of Professor Damir Caver.
• Contributing to devising statistical analysis methodologies for multilevel and multimodal analysis of physiological signals data and designing ML pipelines to develop a classification model under the guidance of Professor Bennett Bertenthal.
• Worked on a data warehouse application, collecting, transforming, and cleaning raw data from different networking infrastructure systems.
• Utilized Microsoft Azure cloud, Linux servers, and Oracle Database in a Networking Decision and Support Database team.
• Played an instrumental role in creating, converting, and redesigning PL/SQL packages, Perl scripts, and Data Mappings for 13 projects.
• Optimized job run time by 60% through modification of extract, load, and transfer job scripts.
• Suggested and initiated the automation of 3 PVT and 2 monitoring activities as part of the Operational Excellence initiative, which contributed to a 40% boost in the team's ticket resolution rate.
• Developed strong communication and collaboration skills through effective client and team interactions.
Computational Sciences (Data Science)
• Luddy Outstanding Service Award
• Vice-President and Director of Public Relations, Data Science Club at Indiana University
• Secretary, IEEE Indiana University Student Branch
Statistics, Algorithms, Exploratory Data Analysis, Machine Learning, Deep Learning, Computer Vision, Cloud Computing, Natural Language Processing.
Computer Science and Technology
• Co-Founder, Code-Space (Programming Club)
• Head, Student's Training and Placement Committee (2019-2020)
• Member, Student's Council of DOT (2017-2020)
• Anchor, cultural show 'Symphony' (editions 2K18, 2K19, 2K20)
• Member, 'Harit Sena Dal' (2017-2019)
• Player, DOT-CST's Kabaddi and Kho-Kho Team
Computational Mathematics, Data Structures and Algorithms, Operating Systems, Data Communication, Networking Systems, System Programming, Computer Security.
• Leveraged data analysis and visualization tools to investigate trends, patterns, and correlations in the historical sales data, holiday events, and store information data.
• Utilized random forest, XGBoost, and ensemble algorithms to predict weekly sales for 45 Walmart stores.
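The forecasting approach described above can be sketched as follows. This is an illustrative toy example with made-up store features and sales figures, not the original pipeline or data:

```python
# Sketch of random-forest weekly-sales forecasting on toy data.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical features: [store id, week of year, is_holiday]
X = np.array([[1, 1, 0], [1, 2, 0], [1, 3, 1],
              [2, 1, 0], [2, 2, 0], [2, 3, 1]])
# Toy weekly sales; holiday weeks spike.
y = np.array([1000.0, 1100.0, 1500.0, 2000.0, 2100.0, 2600.0])

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
pred = model.predict([[1, 3, 1]])[0]  # forecast a holiday week for store 1
```

Because a random forest averages training targets within leaves, predictions stay inside the observed sales range; in practice XGBoost or an ensemble of both models could be swapped in behind the same fit/predict interface.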
• Conducted customer segmentation project for a grocery store using K-means clustering to segment customers based on their purchase behavior, such as the frequency and amount of their purchases.
• Performed exploratory data analysis and visualized the results of the analysis using R’s visualization packages, such as ggplot2 and plotly, to present insights.
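The K-means segmentation above reduces to clustering customers on behavioral features. A minimal sketch, assuming two toy features (purchase frequency and average amount) and invented customer values:

```python
# Sketch of K-means customer segmentation on toy purchase-behavior data.
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical features per customer: [purchase frequency, avg purchase amount]
X = np.array([[2, 10], [3, 12], [2, 11],          # infrequent, low spend
              [20, 150], [22, 160], [21, 155]])   # frequent, high spend

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
segments = km.labels_  # customers in the same segment share a label
```

Real features would be scaled first (e.g. with StandardScaler), since K-means is sensitive to feature magnitude.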
• Conducted descriptive and predictive analytics to gain insights into the key drivers associated with employee attrition.
• Developed models to assess the likelihood of employee turnover and explored potential interventions to mitigate attrition.
• Designed and implemented a web scraping solution using Python, BeautifulSoup, and Selenium to collect and analyze movie data from IMDb, and performed data cleaning, preprocessing, and visualization using Pandas, Matplotlib, and Seaborn.
• Utilized the Tweepy Python library to scrape tweets related to COVID-19, conducted sentiment analysis on the collected data using natural language processing techniques such as tokenization and stemming, and visualized the results using Matplotlib and Tableau.
• Developed a spam detection system for a 3000-email dataset with an accuracy rate of 97.4% by utilizing natural language processing (NLP) techniques such as feature extraction (bag-of-words, TF-IDF, and n-grams), text preprocessing (stemming and stop-word removal), and model selection (Naive Bayes, Logistic Regression, and Random Forests).
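The spam-detection pipeline above combines the named pieces: n-gram TF-IDF features feeding a Naive Bayes classifier. A minimal sketch on a four-email toy corpus (illustrative only; the original used a 3000-email dataset and compared several models):

```python
# Sketch of a TF-IDF (with n-grams) + Multinomial Naive Bayes spam classifier.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

emails = [
    "win a free prize now", "claim your free money",       # spam examples
    "meeting at noon tomorrow", "project report attached", # ham examples
]
labels = ["spam", "spam", "ham", "ham"]

# Unigrams + bigrams; TF-IDF downweights terms common to both classes.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), MultinomialNB())
model.fit(emails, labels)
verdict = model.predict(["free money prize"])[0]  # → "spam"
```

Stemming and stop-word removal would be applied as a preprocessing step before vectorization; swapping MultinomialNB for LogisticRegression or RandomForestClassifier in the pipeline covers the model-selection comparison.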
• Implemented Named Entity Recognition with NLP techniques, fine-tuned pre-trained BERT models, optimized an LSTM with dropout regularization and gradient clipping, and conducted error analysis and ablation studies for feature and model selection, achieving 87% accuracy on the CoNLL 2003 benchmark dataset.
• Implemented a door with a face-unlock feature.
• Created a record-keeping system using React, Django, Python, and MySQL to store and access patient records easily.
• Designed and implemented features for scheduling appointments and tracking patient health metrics, improving data accuracy and streamlining record-keeping processes.
• Integrated the front-end interface with the back-end functionality using Django templates, and implemented product search, filtering, sorting, and pagination for an enhanced user experience.
• Designed and implemented a collection display feature to showcase the available animals in the pet shop.