|
Shamanthak Hegde
I am a final-year Computer Science Master's student at Arizona State University, where I am advised by Yezhou Yang. My current research interests lie in combining information from different sources, such as text, images, and video, to help machines develop more robust, reliable, and safe commonsense reasoning.
I received my Bachelor's in Computer Science Engineering from KLE Technological University in 2023, where I was advised by Shankar Gangisetty on various projects in the VQA domain.
Email /
Resume /
Google Scholar /
X /
GitHub
|
|
News
- November, 2025 - ChartQA-X: Generating Explanations for Visual Chart Reasoning has been accepted to the Winter Conference on Applications of Computer Vision (WACV), 2026
- October, 2025 - Dual Caption Preference Optimization for Diffusion Models has been accepted to Transactions on Machine Learning Research (TMLR), 2025
- June, 2025 - New preprint ChartQA-X: Generating Explanations for Visual Chart Reasoning is out!
- June, 2025 - Nominated as a reviewer for NeurIPS 2025!
- April, 2025 - Nominated as a reviewer for ACM MM 2025!
- March, 2025 - Nominated as a reviewer for ACL 2025!
- February, 2025 - New preprint Dual Caption Preference Optimization for Diffusion Models is out!
- October, 2024 - Nominated as a reviewer for RBFM Workshop NeurIPS 2024!
- June, 2024 - 1 paper accepted to EvGenFM Workshop CVPR 2024!
- August, 2023 - Started Master's in Computer Science at ASU!
- July, 2023 - Graduated with a Bachelor's degree in Computer Science Engineering from KLE Technological University!
- June, 2023 - 2 papers accepted to O-DRUM Workshop CVPR 2023!
- January, 2023 - Joining Bosch Global Software Technologies as a Software Development Intern.
|
Research
I'm interested in computer vision, machine learning, generative AI, and natural language processing. Representative papers are highlighted.
|
|
ChartQA-X: Generating Explanations for Visual Chart Reasoning
Shamanthak Hegde,
Pooyan Fazli,
Hasti Seifi
Winter Conference on Applications of Computer Vision (WACV), 2026
Paper |
Dataset |
bibtex
@article{hegde2025chartqa,
title={ChartQA-X: Generating Explanations for Charts},
author={Hegde, Shamanthak and Fazli, Pooyan and Seifi, Hasti},
journal={arXiv preprint arXiv:2504.13275},
year={2025}
}
|
|
Dual Caption Preference Optimization for Diffusion Models
Amir Saeidi*,
Yiran Luo*,
Agneet Chatterjee,
Shamanthak Hegde,
Bimsara Pathiraja,
Yezhou Yang,
Chitta Baral
Transactions on Machine Learning Research (TMLR), 2025
Paper |
Project |
Code |
bibtex
@article{saeidi2025dual,
title={Dual Caption Preference Optimization for Diffusion Models},
author={Amir Saeidi and Yiran Lawrence Luo and Agneet Chatterjee and Shamanthak Hegde and Bimsara Pathiraja and Yezhou Yang and Chitta Baral},
journal={Transactions on Machine Learning Research},
issn={2835-8856},
year={2025},
url={https://openreview.net/forum?id=ruZksIJBBd},
note={}
}
|
|
Evaluating Multimodal Large Language Models Across Distribution Shifts and Augmentations
Aayush Atul Verma*,
Amir Saeidi*,
Shamanthak Hegde*,
Ajay Therala*,
Fenil Denish Bardoliya*,
Nagaraju Machavarapu*,
Shri Ajay Kumar Ravindhiran*,
Srija Malyala*,
Agneet Chatterjee*,
Yezhou Yang,
Chitta Baral
CVPR EvGenFM Workshop, 2024
Paper |
bibtex
@InProceedings{Verma_2024_CVPR,
author = {Verma, Aayush Atul and Saeidi, Amir and Hegde, Shamanthak and Therala, Ajay and Bardoliya, Fenil Denish and Machavarapu, Nagaraju and Ravindhiran, Shri Ajay Kumar and Malyala, Srija and Chatterjee, Agneet and Yang, Yezhou and Baral, Chitta},
title = {Evaluating Multimodal Large Language Models Across Distribution Shifts and Augmentations},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2024},
pages = {5314--5324}
}
|
|
Making the V in Text-VQA Matter
Shamanthak Hegde,
Soumya Jahagirdar,
Shankar Gangisetty
CVPR O-DRUM Workshop, 2023
Paper |
bibtex
@inproceedings{hegde2023making,
title={Making the {V} in {Text-VQA} Matter},
author={Hegde, Shamanthak and Jahagirdar, Soumya and Gangisetty, Shankar},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={5580--5588},
year={2023}
}
|
|
Weakly Supervised Visual Question Answer Generation
Charani Alampalle,
Shamanthak Hegde,
Soumya Jahagirdar,
Shankar Gangisetty
CVPR O-DRUM Workshop, 2023
Paper |
bibtex
@inproceedings{alampalle2023weakly,
title={Weakly Supervised Visual Question Answer Generation},
author={Alampalle, Charani and Hegde, Shamanthak and Jahagirdar, Soumya and Gangisetty, Shankar},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={5589--5597},
year={2023}
}
|
|