Welcome to my website

Hi, I am Pushkar Ambastha

A college student, lifelong learner, and emerging life adventurer at

IIT GUWAHATI

CURRICULUM VITAE

Submitted to ACM Computing Surveys

Pushkar Ambastha, From AlphaFold 2 to AlphaFold 3: A Review on Advancements in Protein Structure Prediction (Submitted to ACM Computing Surveys (Impact Factor: 23.8). Currently under review)

- Investigated recent advances in protein structure prediction, such as AlphaFold 3, which show a trend toward greater model generalization, moving in the direction of Large Language Models (LLMs).
- Reviewed the developments related to current approaches to protein structure prediction and protein design and highlighted a selection of successful applications they have enabled.

Draft

Here's My Story

I am a senior passionate about integrating different scientific disciplines to discover something extraordinary.

I am studying Bio-Engineering at the Indian Institute of Technology Guwahati. I want to learn more about cognitive science, neuroscience, and their intersection with artificial intelligence. I have conducted extensive research and taken rigorous courses spanning computation, medicine, biochemistry, and AI to gain a different perspective on my interests. I believe in creating impact through my work by supporting and bettering daily lives. I'm also a skilled vocalist, guitarist, gamer, and artist.

Research Experiences

Current and Previous Work Experiences

Study of Synthetic Human Memories: AI-Edited Images and Videos which Implant False Memories and Distort Recollection

AT FLUID INTERFACES,

MIT MEDIA LAB

Aug 2024 - Present

Undergraduate Researcher, Advisor: Prof. Pattie Maes

- Developed a multi-modal pipeline that presents original images, AI-enhanced images, image-to-video clips, and AI-edited image-to-video clips in a successive survey on old images, which showed positive sentiment toward potential therapeutic memory reframing. False recollection was 2.05x that of the control.

- Created the architecture of the study that examines the impact of AI-altered visuals on false memories, which are recollections of events that didn't occur or deviate from reality.

Report

Automating Detection of APP Abnormalities in Porcine Brain Histology for Post-Traumatic Epilepsy Analysis

AT

UNIVERSITY of PENNSYLVANIA

Aug 2024 - Present

Undergraduate Researcher, Advisor: Prof. Ragini Verma

- Implemented an automated image annotation pipeline for analyzing Amyloid Precursor Protein (APP) in pig brain histology, significantly reducing manual annotation time and computational resources through efficient tiling and preprocessing techniques.

- Fine-tuned a SegFormer model for detecting injury patterns in the fornix and fimbria regions, incorporating histogram normalization and handling high-resolution histology images (78064 x 65075) with memory optimization.

Report

Calibrating Agent-Based Models for Tumor-Immune Interactions using Spatial Biopsy Data and Multi-Modal Pipelines with AgentTorch

AT CAMERA CULTURE,

MIT MEDIA LAB

June 2023 - Present

Undergraduate Researcher (Guide: Ayush Chopra)
- Developed methods to calibrate clinical Agent-Based Models (ABMs) directly from biopsies, achieving a mean accuracy of 77% under the Spatial Agreement Measure (SAM) metric while minimizing the number of biopsy samples taken.
- Applied gradient-based ABMs to diverse realms like morphogenesis, epidemiology, and opinion dynamics. The image on the right shows the RDF metric for the initial and final grids with cluster-size comparisons; hovering shows the grid of tumor and immune cells.

Report

Optimizing Medical Segmentation with Integrated Med-SAM and Fast-SAM Models for Enhanced Accuracy in Multi-Modal Imaging

AT

HUGGING FACE X HEALTHCARE

May 2023 - Aug 2023

Research Intern (Guide: Katie Link)

- Developed novel models derived from the cumulative performance and extrapolation of the Segment Anything Model (SAM), Medical SAM (Med-SAM), and Fast-SAM. The image on the right shows me discussing results at the Conference Hall, IITG; hovering shows Med-SAM segmentation results on diverse modalities.

- Created a streamlined process that reduced image analysis time by 68% and decreased the model size by 82% compared to the vanilla SAM model.

Report

Adaptive Biomedical Segmentation: Enhancing Model Explainability through Domain Shift Analysis

AT

THE UNIVERSITY of UTAH

Nov 2022 - Feb 2023

Research Intern (Guide: Tushar Kataria)

- Fine-tuned U-Net and DeepLabV3 models on the GlaS (MICCAI 2015), CRAG, CPM15, and CPM17 datasets to observe how strongly the models depend on their training dataset, and created a pipeline to improve the mIoU and Dice score of the predicted image masks. The image on the left shows an mIoU of 0.933 on GlaS.

- Analysed domain shift in biomedical image segmentation models as a critical insight into model explainability. Developed a pipeline for binary segmentation (U-Net and DeepLabV3) for domain adaptation across diverse datasets.

Report

My Projects

Current and Previous Project Descriptions

Fluid Interfaces at Massachusetts Institute of Technology (MIT) Media Lab (Ongoing)

Study of Synthetic Human Memories: AI-Edited Images and Videos which Implant False Memories and Distort Recollection

- Developed a multi-modal pipeline that presents original images, AI-enhanced images, image-to-video clips, and AI-edited image-to-video clips in a successive survey on old images, which showed positive sentiment toward potential therapeutic memory reframing. False recollection was 2.05x that of the control.

- Created the architecture of the study that examines the impact of AI-altered visuals on false memories, which are recollections of events that didn't occur or deviate from reality.

Report

Center for Biomedical Image Computing & Analytics at University of Pennsylvania (Ongoing)

Automating Detection of APP Abnormalities in Porcine Brain Histology for Post-Traumatic Epilepsy Analysis

- Implemented an automated image annotation pipeline for analyzing Amyloid Precursor Protein (APP) in pig brain histology, significantly reducing manual annotation time and computational resources through efficient tiling and preprocessing techniques.

- Fine-tuned a SegFormer model for detecting injury patterns in the fornix and fimbria regions, incorporating histogram normalization and handling high-resolution histology images (78064 x 65075) with memory optimization (12GB RAM).
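
To illustrate the tiling idea mentioned above, here is a minimal sketch in plain Python. It is not the project's actual code: the function name `tile_boxes`, the 1024-pixel tile size, and the reading of 78064 x 65075 as width x height are all assumptions made for the example. It only computes tile coordinates, so a slide of that size can be processed one tile at a time instead of being loaded whole.

```python
def tile_boxes(width, height, tile=1024, overlap=0):
    """Yield (left, top, right, bottom) boxes that cover a width x height image.

    Hypothetical helper for illustration; tile size and overlap are assumptions.
    Edge tiles are clipped to the image bounds, so they may be smaller.
    """
    if tile <= 0 or not 0 <= overlap < tile:
        raise ValueError("tile must be positive and overlap must be in [0, tile)")
    step = tile - overlap
    for top in range(0, height, step):
        for left in range(0, width, step):
            yield left, top, min(left + tile, width), min(top + tile, height)


# Dimensions quoted above, assumed here to be width x height in pixels.
boxes = list(tile_boxes(78064, 65075, tile=1024))
print(len(boxes), "tiles; last tile:", boxes[-1])
```

Each box can then be read, normalized, and passed to the model on its own, which keeps peak memory proportional to the tile size and not to the full image.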

Report

Camera Culture at Massachusetts Institute of Technology (MIT) Media Lab (Ongoing)

Calibrating Agent-Based Models for Tumor-Immune Interactions using Spatial Biopsy Data and Multi-Modal Pipelines with AgentTorch

- Developed methods to calibrate clinical Agent-Based Models (ABMs) directly from biopsies, achieving a mean accuracy of 77% under the Spatial Agreement Measure (SAM) metric while minimizing the number of biopsy samples taken.

- Designed a novel multi-modal calibrated ABM pipeline that applies gradient-based ABMs to simulate tumor-immune cell interactions (for cytotoxic CD8+ T cells in multiple carcinoma and melanoma cases).

Report

Hugging Face X Healthcare

Optimizing Medical Segmentation with Integrated Med-SAM and Fast-SAM Models for Enhanced Accuracy in Multi-Modal Imaging

- Developed novel models derived from the cumulative performance and extrapolation of the Segment Anything Model (SAM), Medical SAM (Med-SAM), and Fast-SAM.

- Across modalities such as pathology, X-ray, CT, and ultrasound, the results showed an average improvement of 0.48 in mean Intersection over Union (mIoU) and 0.42 in Dice Similarity Coefficient (DSC).

Report

The University of Utah

Adaptive Biomedical Segmentation: Enhancing Model Explainability through Domain Shift Analysis

- The hypothesis is that models like U-Net become biased when trained on a specific dataset such as CRAG and then lose accuracy when tested on a similar dataset such as GlaS. The same is hypothesized for other combinations of binary and multi-class segmentation datasets.

- Fine-tuned U-Net and DeepLabV3 models on datasets such as GlaS (MICCAI 2015), CRAG, and CPM15 to observe how strongly the models depend on their training dataset, and created a pipeline to improve the mean Intersection over Union (mIoU) and Dice Similarity Coefficient (DSC) of the predicted image masks.
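
For readers unfamiliar with the two metrics named above, the sketch below shows how IoU and the Dice coefficient are computed for binary masks. It is an illustration only, not the pipeline used in the project: the function names are hypothetical, masks are flattened 0/1 lists, two empty masks are scored as 1.0, and the mean is taken per image (mIoU is also commonly averaged per class).

```python
def iou_and_dice(pred, target):
    """Return (IoU, Dice) for two binary masks given as equal-length 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_area = sum(1 for p in pred if p)
    target_area = sum(1 for t in target if t)
    union = pred_area + target_area - intersection
    # Assumption: two empty masks agree perfectly, so score them as 1.0.
    iou = intersection / union if union else 1.0
    total = pred_area + target_area
    dice = 2 * intersection / total if total else 1.0
    return iou, dice


def mean_iou(pairs):
    """Average IoU over a list of (pred, target) mask pairs (per-image mean)."""
    scores = [iou_and_dice(p, t)[0] for p, t in pairs]
    return sum(scores) / len(scores)


pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
iou, dice = iou_and_dice(pred, target)
print(f"IoU={iou:.3f} Dice={dice:.3f}")  # IoU=0.500 Dice=0.667
```

For a single pair of masks the two scores are linked by Dice = 2 * IoU / (1 + IoU), which is why they tend to rise and fall together.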

Report

Domain-specific Question Answering chatbot

Project on problem statement given by DevRev.ai

We developed pipelines that retrieve a knowledge base article from the database based on the query and answer the query using the retrieved passage. We optimized the pipeline for performance, latency, and resource usage, using techniques like model distillation, sparsification, pruning, and fine-tuning of the DeBERTaV3-Base model to decrease inference time with minimal loss in accuracy.

Github Report

Course: Computational Biology

ProteoSynth - Automated Custom Protein Sequence Generator

Developed a Flask-based app that generated 10,000 custom proteins with random amino acid sequences. Implemented custom options for sequence length, amino acid exclusion, and protein quantity, allowing users to generate up to 100 different protein sequences in a single request.
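
The core generation step described above can be sketched roughly as follows. This is a hypothetical reconstruction, not the ProteoSynth source: the function name `generate_proteins`, its parameters, and the `seed` argument are assumptions, the Flask layer is left out, and uniqueness of the sequences is not enforced.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes
MAX_PER_REQUEST = 100  # per-request cap taken from the description above


def generate_proteins(length, count, exclude="", seed=None):
    """Return `count` random sequences of `length` residues, skipping excluded residues.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    if not 1 <= count <= MAX_PER_REQUEST:
        raise ValueError(f"count must be between 1 and {MAX_PER_REQUEST}")
    excluded = set(exclude.upper())
    alphabet = [aa for aa in AMINO_ACIDS if aa not in excluded]
    if not alphabet:
        raise ValueError("every amino acid was excluded")
    rng = random.Random(seed)  # a fixed seed makes a request reproducible
    return ["".join(rng.choice(alphabet) for _ in range(length)) for _ in range(count)]


# Example request: five 12-residue sequences without cysteine or tryptophan.
for sequence in generate_proteins(12, 5, exclude="CW", seed=0):
    print(sequence)
```

In a web app, a function like this would sit behind a route handler that validates the three user options before calling it.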

GrooveSynth - Protein Active Site Structure Generator

As a continuation of ProteoSynth, developed a Flask-based app to analyze and visualize protein active binding sites, achieving a 20% increase in the accuracy of ligand-binding predictions. Created a novel system to generate simplified Protein Data Bank (PDB) structures, reducing analysis time by 30% and aiding drug design efforts.

GitHub-1 GitHub-2

Text-based Captcha Breaker Project

Project by C&A Club, IITG

- Deployed a computer vision program, built with the Streamlit library, that recognizes text-based Captcha images and converts them into editable text.
- Developed the pipeline in PyTorch with an RCNN model, reaching a CTC loss of 0.03.

Github

Re-colorisation of monochrome images using conditional GANs

Project by Coding Club, IITG

Trained a conditional Generative Adversarial Network (generator and discriminator) built on a U-Net block with a ResNet18 backbone and devised image processing strategies for colorizing monochrome images. Deployed a web app for the model, fine-tuned on the COCO dataset, using the Streamlit library on Hugging Face.

Github HF-Spaces

Cover Generation using OpenAI tools

Project by IITG.ai Club, IITG

- Developed a multi-modal pipeline that converts audio/text input into images using state-of-the-art OpenAI tools. Generated optimal transcripts for podcasts and songs with OpenAI Whisper to use in creating prompts.
- Designed a pipeline with Latent Diffusion Models (DALL-E) to generate aesthetic cover images from prompts created using ChatGPT/GPT-2 models.

Github HF-Spaces

Super Resolution Photographic Mosaic

Project by Coding Club, IITG

- Developed a computer vision pipeline that enhances images through super-resolution and image stitching.

- Designed a multi-model pipeline consisting mainly of a Latent Diffusion Upscaler model for super-resolution and an Image Stitcher for creating a panorama.

Github

My Education

University and Schools

B.Tech in Bio-Engineering 2021 Nov - 2025 May

IIT Guwahati

High School Diploma 2007 April - 2019 May

Delhi Public School Patna

Awards

Curricular and Co-curricular achievements

Secured Gold Medal

INTER IIT TECH MEET 11.0

Dec 2022 - Feb 2023

- Expert Answers in a Flash: Improving Domain-Specific QA

- We developed pipelines to retrieve a knowledge base article from the database based on the query and answer the query using the retrieved passage. We optimized the pipeline for performance, latency, and resource usage.

- The availability of diverse knowledge bases makes this task challenging. We proposed novel methods to handle FAQs, generate synthetic queries, fine-tune models, retrieve candidate sentences for answer matching, and improve runtime efficiency.

Certificate

Hosted many hackathons, workshops, and events with Hugging Face and served as Research Head of the IITG.ai Club at IITG

May 2023 - Jul 2024

IITG.ai Club: The AI Community of IIT Guwahati

- Served as the Research Head of the Club
- The image on the left is from the Machine Learning Research Week workshop at IITG.ai. The speaker is Sayak Paul (ML Engineer at Hugging Face), who discussed diffusion models.
- Hosted the Machine Learning Research Week seminar with Ramit Sawhney (Global Head of Core AI & ML at Tower Research, RA at Georgia Tech & MBZUAI) on YouTube to discuss NLP in finance.

Certificate

Extras

Extra-Curricular prowess

A skilled musician


I'm a skilled guitarist, vocalist, and performer. I am also a member of Octaves (the university music club) at IITG (May 2022 - Present).
I am one of the lead vocalists in the club. I am skilled in classical and Western music.

I have performed at multiple events, tournaments, and concerts as part of the club band.
The image on the left shows me winning the Parx Hunt singing tournament on campus.
The image on the right shows a performance of 'Numb' and 'Treat You Better' in the Auditorium at IIT Guwahati.

Paintings

Imagination is more important than knowledge. Knowledge is limited. Imagination encircles the world.

~ Albert Einstein

Just as mathematics turned out to be the right description language for physics, we think AI will prove to be the right method for understanding Biology.

~ Demis Hassabis