Shamanthak Hegde

I am a final-year Computer Science Master's student at Arizona State University, where I am advised by Yezhou Yang. My current research interests lie in combining information from different sources, such as text, images, and video, to help machines develop more robust, reliable, and safe commonsense reasoning.

I received my Bachelor's in Computer Science Engineering from KLE Technological University in 2023, where I was advised by Shankar Gangisetty on various projects in the visual question answering (VQA) domain.

Email  /  CV  /  Google Scholar  /  X  /  GitHub

News

Research

I'm interested in computer vision, machine learning, generative AI, and natural language processing. Representative papers are highlighted.

ChartQA-X: Generating Explanations for Charts
Shamanthak Hegde, Pooyan Fazli, Hasti Seifi
Under Review

Paper | bibtex
Dual Caption Preference Optimization for Diffusion Models
Amir Saeidi*, Yiran Luo*, Agneet Chatterjee, Shamanthak Hegde, Bimsara Pathiraja, Yezhou Yang, Chitta Baral
Under Review

Paper | Project | Code | bibtex
Evaluating Multimodal Large Language Models Across Distribution Shifts and Augmentations
Aayush Atul Verma*, Amir Saeidi*, Shamanthak Hegde*, Ajay Therala*, Fenil Denish Bardoliya*, Nagaraju Machavarapu*, Shri Ajay Kumar Ravindhiran*, Srija Malyala*, Agneet Chatterjee*, Yezhou Yang, Chitta Baral
CVPR EvGenFM Workshop, 2024

Paper | bibtex
Making the V in Text-VQA Matter
Shamanthak Hegde, Soumya Jahagirdar, Shankar Gangisetty
CVPR O-DRUM Workshop, 2023

Paper | bibtex
Weakly Supervised Visual Question Answer Generation
Charani Alampalle, Shamanthak Hegde, Soumya Jahagirdar, Shankar Gangisetty
CVPR O-DRUM Workshop, 2023

Paper | bibtex

Thank you to Jon Barron for the source code for the website!