gauravpendharkar.dev gauravp.dev

about me

My name is Gaurav Pendharkar, and I am passionate about applying machine learning to real-world problems. I have worked with a range of data types, including relational data, text, images, and documents, across domains such as cybersecurity, wildlife, law, aviation, and medicine.

experience

Research Intern, Generative Artificial Intelligence
University of Technology, Sydney (Sep 2023 - Feb 2024)
Tech Stack: Python, FastAPI, HuggingFace, PyTorch, JavaScript, Git
  • Designed and developed a multilingual rich text editor to study user interaction with AI for assistive writing
  • Integrated the GPT-2 model from HuggingFace to generate four suggestions for each user prompt in English
  • Developed APIs for transliteration using the IndicXlit model and translation with the IndicTrans model
  • Collected user keystrokes from 7 writing sessions and observed overall GPT-2 usage of approximately 37.5%
Research Intern, Natural Language Processing
Vellore Institute of Technology (May 2022 - Feb 2023)
Tech Stack: Python, PdfPlumber, Selenium, spaCy, NLTK, Seaborn, Git
  • Expanded a legal document repository by over 200% through automated web scraping of the Manupatra database
  • Wrangled unstructured Indian legal documents by employing regular expressions and named entity recognition
  • Created a dataset for judgment classification and prediction and benchmarked eight ML and DL models on it
  • Reduced information extraction time from 3 months to 12 hours by designing an automated NLP pipeline

my skills

programming languages
Python
SQL
R
Java
TypeScript
HTML
frameworks
PyTorch
TensorFlow
HuggingFace
scikit-learn
Weights & Biases
Ultralytics
Pandas
Seaborn
Matplotlib
OpenCV
spaCy
FastAPI
Selenium
MongoDB
tools
GitHub
Jupyter
LaTeX
Markdown
GitHub Copilot
Grammarly

educational background

Master of Science in Data Science, Columbia University (present)
Bachelor of Technology in Computer Science with specialization in AI and ML, Vellore Institute of Technology
IITians Spectrum Edutech
Arya Vidya Mandir Juhu