about me
My name is Gaurav Pendharkar, and I am passionate about applying machine learning to real-world problems. I have worked with relational data, text, images, and documents across domains including cybersecurity, wildlife, law, aviation, and medicine.
experience
Research Intern, Generative Artificial Intelligence
University of Technology, Sydney (Sep 2023 - Feb 2024)
Tech Stack: Python, FastAPI, HuggingFace, PyTorch, JavaScript, Git
- Designed and developed a multilingual rich text editor to study user interaction with AI for assistive writing
- Integrated the GPT-2 model from HuggingFace to generate four suggestions for each user prompt in English
- Developed APIs for transliteration using the IndicXlit model and translation with the IndicTrans model
- Collected user keystrokes from 7 writing sessions and observed overall GPT-2 usage of approximately 37.5%
Research Intern, Natural Language Processing
Vellore Institute of Technology (May 2022 - Feb 2023)
Tech Stack: Python, PdfPlumber, Selenium, spaCy, NLTK, Seaborn, Git
- Expanded a legal document repository by over 200% through automated web scraping of the Manupatra database
- Wrangled unstructured Indian legal documents by employing regular expressions and named entity recognition
- Created a dataset for judgment classification and prediction and benchmarked it with eight ML and DL models
- Reduced information extraction time from 3 months to 12 hours by designing an automated NLP pipeline
my skills
programming languages

frameworks
Seaborn
Matplotlib
tools
educational background
Master of Science in Data Science, Columbia University (present)
Bachelor of Technology in Computer Science with specialization in AI and ML, Vellore Institute of Technology
IITians Spectrum Edutech
Arya Vidya Mandir Juhu