Sukrut Rao (PhD Student)

Sukrut Sridhar Rao

Address: Max-Planck-Institut für Informatik
Saarland Informatics Campus
Campus E1 4
66123 Saarbrücken
Location: E1 4 - 616
Phone: +49 681 9325 2146
Fax: +49 681 9325 2099

Personal Information

Web Pages: Personal, MPI
Publication Profiles: Google Scholar, DBLP, ORCiD
Social: LinkedIn, Twitter
Code: GitHub

Publications

2025

Paper

D2RG3

Y. Wang, S. Rao, J.-U. Lee, M. Jobanputra, and V. Demberg

“B-cos LM: Efficiently Transforming Pre-trained Language Models for Improved Explainability,” 2025. [Online]. Available: https://arxiv.org/abs/2502.12992.

Abstract

Post-hoc explanation methods for black-box models often struggle with
faithfulness and human interpretability due to the lack of explainability in
current neural models. Meanwhile, B-cos networks have been introduced to
improve model explainability through architectural and computational
adaptations, but their application has so far been limited to computer vision
models and their associated training pipelines. In this work, we introduce
B-cos LMs, i.e., B-cos networks empowered for NLP tasks. Our approach directly
transforms pre-trained language models into B-cos LMs by combining B-cos
conversion and task fine-tuning, improving efficiency compared to previous
B-cos methods. Our automatic and human evaluation results demonstrate that
B-cos LMs produce more faithful and human interpretable explanations than post
hoc methods, while maintaining task performance comparable to conventional
fine-tuning. Our in-depth analysis explores how B-cos LMs differ from
conventionally fine-tuned models in their learning processes and explanation
patterns. Finally, we provide practical guidelines for effectively building
B-cos LMs based on our findings. Our code is available at
anonymous.4open.science/r/bcos_lm.

BibTeX

@online{Wang2502.12992,
TITLE = {B-cos {LM}: Efficiently Transforming Pre-trained Language Models for Improved Explainability},
AUTHOR = {Wang, Yifan and Rao, Sukrut and Lee, Ji-Ung and Jobanputra, Mayank and Demberg, Vera},
LANGUAGE = {eng},
URL = {https://arxiv.org/abs/2502.12992},
EPRINT = {2502.12992},
EPRINTTYPE = {arXiv},
YEAR = {2025},
MARGINALMARK = {$\bullet$},
ABSTRACT = {Post-hoc explanation methods for black-box models often struggle with faithfulness and human interpretability due to the lack of explainability in current neural models. Meanwhile, B-cos networks have been introduced to improve model explainability through architectural and computational adaptations, but their application has so far been limited to computer vision models and their associated training pipelines. In this work, we introduce B-cos LMs, i.e., B-cos networks empowered for NLP tasks. Our approach directly transforms pre-trained language models into B-cos LMs by combining B-cos conversion and task fine-tuning, improving efficiency compared to previous B-cos methods. Our automatic and human evaluation results demonstrate that B-cos LMs produce more faithful and human interpretable explanations than post hoc methods, while maintaining task performance comparable to conventional fine-tuning. Our in-depth analysis explores how B-cos LMs differ from conventionally fine-tuned models in their learning processes and explanation patterns. Finally, we provide practical guidelines for effectively building B-cos LMs based on our findings. Our code is available at https://anonymous.4open.science/r/bcos_lm.},
}

Endnote

%0 Report
%A Wang, Yifan
%A Rao, Sukrut
%A Lee, Ji-Ung
%A Jobanputra, Mayank
%A Demberg, Vera
%+ External Organizations
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
Multimodal Language Processing, MPI for Informatics, Max Planck Society
%T B-cos LM: Efficiently Transforming Pre-trained Language Models for Improved Explainability
%G eng
%U http://hdl.handle.net/21.11116/0000-0010-C156-3
%U https://arxiv.org/abs/2502.12992
%D 2025
%X Post-hoc explanation methods for black-box models often struggle with faithfulness and human interpretability due to the lack of explainability in current neural models. Meanwhile, B-cos networks have been introduced to improve model explainability through architectural and computational adaptations, but their application has so far been limited to computer vision models and their associated training pipelines. In this work, we introduce B-cos LMs, i.e., B-cos networks empowered for NLP tasks. Our approach directly transforms pre-trained language models into B-cos LMs by combining B-cos conversion and task fine-tuning, improving efficiency compared to previous B-cos methods. Our automatic and human evaluation results demonstrate that B-cos LMs produce more faithful and human interpretable explanations than post hoc methods, while maintaining task performance comparable to conventional fine-tuning. Our in-depth analysis explores how B-cos LMs differ from conventionally fine-tuned models in their learning processes and explanation patterns. Finally, we provide practical guidelines for effectively building B-cos LMs based on our findings. Our code is available at https://anonymous.4open.science/r/bcos_lm.
%K Computer Science, Computation and Language, cs.CL, Computer Science, Artificial Intelligence, cs.AI

2024

Conference paper

S. Arya, S. Rao, M. Boehle, and B. Schiele

“B-cosification: Transforming Deep Neural Networks to be Inherently Interpretable,” in Advances in Neural Information Processing Systems 37 (NeurIPS 2024), Vancouver, Canada, 2024.

Abstract

B-cos Networks have been shown to be effective for obtaining highly human
interpretable explanations of model decisions by architecturally enforcing
stronger alignment between inputs and weight. B-cos variants of convolutional
networks (CNNs) and vision transformers (ViTs), which primarily replace linear
layers with B-cos transformations, perform competitively to their respective
standard variants while also yielding explanations that are faithful by design.
However, it has so far been necessary to train these models from scratch, which
is increasingly infeasible in the era of large, pre-trained foundation models.
In this work, inspired by the architectural similarities in standard DNNs and
B-cos networks, we propose 'B-cosification', a novel approach to transform
existing pre-trained models to become inherently interpretable. We perform a
thorough study of design choices to perform this conversion, both for
convolutional neural networks and vision transformers. We find that
B-cosification can yield models that are on par with B-cos models trained from
scratch in terms of interpretability, while often outperforming them in terms
of classification performance at a fraction of the training cost. Subsequently,
we apply B-cosification to a pretrained CLIP model, and show that, even with
limited data and compute cost, we obtain a B-cosified version that is highly
interpretable and competitive on zero shot performance across a variety of
datasets. We release our code and pre-trained model weights at
github.com/shrebox/B-cosification.

BibTeX

@inproceedings{Arya_Neurips24,
TITLE = {B-cosification: {T}ransforming Deep Neural Networks to be Inherently Interpretable},
AUTHOR = {Arya, Shreyash and Rao, Sukrut and Boehle, Moritz and Schiele, Bernt},
LANGUAGE = {eng},
PUBLISHER = {Curran Associates, Inc.},
YEAR = {2024},
MARGINALMARK = {$\bullet$},
ABSTRACT = {B-cos Networks have been shown to be effective for obtaining highly human interpretable explanations of model decisions by architecturally enforcing stronger alignment between inputs and weight. B-cos variants of convolutional networks (CNNs) and vision transformers (ViTs), which primarily replace linear layers with B-cos transformations, perform competitively to their respective standard variants while also yielding explanations that are faithful by design. However, it has so far been necessary to train these models from scratch, which is increasingly infeasible in the era of large, pre-trained foundation models. In this work, inspired by the architectural similarities in standard DNNs and B-cos networks, we propose 'B-cosification', a novel approach to transform existing pre-trained models to become inherently interpretable. We perform a thorough study of design choices to perform this conversion, both for convolutional neural networks and vision transformers. We find that B-cosification can yield models that are on par with B-cos models trained from scratch in terms of interpretability, while often outperforming them in terms of classification performance at a fraction of the training cost. Subsequently, we apply B-cosification to a pretrained CLIP model, and show that, even with limited data and compute cost, we obtain a B-cosified version that is highly interpretable and competitive on zero shot performance across a variety of datasets. We release our code and pre-trained model weights at https://github.com/shrebox/B-cosification.},
BOOKTITLE = {Advances in Neural Information Processing Systems 37 (NeurIPS 2024)},
EDITOR = {Globerson, A. and Mackey, L. and Belgrave, D. and Fan, A. and Paquet, U. and Tomczak, J. and Zhang, C.},
PAGES = {62756--62786},
ADDRESS = {Vancouver, Canada},
}

Endnote

%0 Conference Proceedings
%A Arya, Shreyash
%A Rao, Sukrut
%A Boehle, Moritz
%A Schiele, Bernt
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%T B-cosification: Transforming Deep Neural Networks to be Inherently Interpretable
%G eng
%U http://hdl.handle.net/21.11116/0000-0010-0FBE-9
%D 2024
%B 38th Conference on Neural Information Processing Systems
%Z date of event: 2024-12-10 - 2024-12-15
%C Vancouver, Canada
%X B-cos Networks have been shown to be effective for obtaining highly human interpretable explanations of model decisions by architecturally enforcing stronger alignment between inputs and weight. B-cos variants of convolutional networks (CNNs) and vision transformers (ViTs), which primarily replace linear layers with B-cos transformations, perform competitively to their respective standard variants while also yielding explanations that are faithful by design. However, it has so far been necessary to train these models from scratch, which is increasingly infeasible in the era of large, pre-trained foundation models. In this work, inspired by the architectural similarities in standard DNNs and B-cos networks, we propose 'B-cosification', a novel approach to transform existing pre-trained models to become inherently interpretable. We perform a thorough study of design choices to perform this conversion, both for convolutional neural networks and vision transformers. We find that B-cosification can yield models that are on par with B-cos models trained from scratch in terms of interpretability, while often outperforming them in terms of classification performance at a fraction of the training cost. Subsequently, we apply B-cosification to a pretrained CLIP model, and show that, even with limited data and compute cost, we obtain a B-cosified version that is highly interpretable and competitive on zero shot performance across a variety of datasets. We release our code and pre-trained model weights at https://github.com/shrebox/B-cosification.
%K Computer Science, Computer Vision and Pattern Recognition, cs.CV, Computer Science, Artificial Intelligence, cs.AI, Computer Science, Learning, cs.LG
%B Advances in Neural Information Processing Systems 37
%E Globerson, A.; Mackey, L.; Belgrave, D.; Fan, A.; Paquet, U.; Tomczak, J.; Zhang, C.
%P 62756 - 62786
%I Curran Associates, Inc.
%U https://proceedings.neurips.cc/paper_files/paper/2024/file/72d50a87b218d84c175d16f4557f7e12-Paper-Conference.pdf

Conference paper

A. Parchami-Araghi, M. Böhle, S. S. Rao, and B. Schiele

“Good Teachers Explain: Explanation-Enhanced Knowledge Distillation,” in Computer Vision -- ECCV 2024, Milan, Italy, 2024.

@inproceedings{ParchamiAraghiECCV24,
TITLE = {Good Teachers Explain: Explanation-Enhanced Knowledge Distillation},
AUTHOR = {Parchami-Araghi, Amin and B{\"o}hle, Moritz and Rao, Sukrut Sridhar and Schiele, Bernt},
LANGUAGE = {eng},
ISBN = {978-3-031-73464-9},
DOI = {10.1007/978-3-031-73464-9_18},
PUBLISHER = {Springer},
YEAR = {2024},
MARGINALMARK = {$\bullet$},
DATE = {2024},
BOOKTITLE = {Computer Vision -- ECCV 2024},
EDITOR = {Leonardis, Ale{\v s} and Ricci, Elisa and Roth, Stefan and Russakovsky, Olga and Sattler, Torsten and Varol, G{\"u}l},
PAGES = {293--310},
SERIES = {Lecture Notes in Computer Science},
VOLUME = {15131},
ADDRESS = {Milan, Italy},
}

Endnote

%0 Conference Proceedings
%A Parchami-Araghi, Amin
%A Böhle, Moritz
%A Rao, Sukrut Sridhar
%A Schiele, Bernt
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%T Good Teachers Explain: Explanation-Enhanced Knowledge Distillation
%G eng
%U http://hdl.handle.net/21.11116/0000-000F-5534-7
%R 10.1007/978-3-031-73464-9_18
%D 2024
%B 18th European Conference on Computer Vision 
%Z date of event: 2024-09-29 - 2024-10-04
%C Milan, Italy
%B Computer Vision -- ECCV 2024 
%E Leonardis, Aleš; Ricci, Elisa; Roth, Stefan; Russakovsky, Olga; Sattler, Torsten; Varol, Gül
%P 293 - 310
%I Springer
%@ 978-3-031-73464-9
%B Lecture Notes in Computer Science
%N 15131

Conference paper

S. Rao, S. Mahajan, M. Böhle, and B. Schiele

“Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery,” in Computer Vision -- ECCV 2024, Milan, Italy, 2024.

@inproceedings{RaoECCV24,
TITLE = {Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery},
AUTHOR = {Rao, Sukrut and Mahajan, Sweta and B{\"o}hle, Moritz and Schiele, Bernt},
LANGUAGE = {eng},
ISBN = {978-3-031-72979-9},
DOI = {10.1007/978-3-031-72980-5_26},
PUBLISHER = {Springer},
YEAR = {2024},
MARGINALMARK = {$\bullet$},
DATE = {2024},
BOOKTITLE = {Computer Vision -- ECCV 2024},
EDITOR = {Leonardis, Ale{\v s} and Ricci, Elisa and Roth, Stefan and Russakovsky, Olga and Sattler, Torsten and Varol, G{\"u}l},
PAGES = {444--461},
SERIES = {Lecture Notes in Computer Science},
VOLUME = {15135},
ADDRESS = {Milan, Italy},
}

Endnote

%0 Conference Proceedings
%A Rao, Sukrut
%A Mahajan, Sweta
%A Böhle, Moritz
%A Schiele, Bernt
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%T Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery
%G eng
%U http://hdl.handle.net/21.11116/0000-000F-9A97-9
%R 10.1007/978-3-031-72980-5_26
%D 2024
%B 18th European Conference on Computer Vision 
%Z date of event: 2024-09-29 - 2024-10-04
%C Milan, Italy
%B Computer Vision -- ECCV 2024
%E Leonardis, Aleš; Ricci, Elisa; Roth, Stefan; Russakovsky, Olga; Sattler, Torsten; Varol, Gül
%P 444 - 461
%I Springer
%@ 978-3-031-72979-9
%B Lecture Notes in Computer Science
%N 15135
%U https://rdcu.be/dZWlW

Article

S. Rao, M. Boehle, and B. Schiele

“Better Understanding Differences in Attribution Methods via Systematic Evaluations,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 46, no. 6, 2024.

@article{RaoTPAMI24,
TITLE = {Better Understanding Differences in Attribution Methods via Systematic Evaluations},
AUTHOR = {Rao, Sukrut and Boehle, Moritz and Schiele, Bernt},
LANGUAGE = {eng},
DOI = {10.1109/TPAMI.2024.3353528},
PUBLISHER = {IEEE},
ADDRESS = {Piscataway, NJ},
YEAR = {2024},
MARGINALMARK = {$\bullet$},
DATE = {2024},
JOURNAL = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
VOLUME = {46},
NUMBER = {6},
PAGES = {4090--4101},
}

Endnote

%0 Journal Article
%A Rao, Sukrut
%A Boehle, Moritz
%A Schiele, Bernt
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%T Better Understanding Differences in Attribution Methods via Systematic Evaluations
%G eng
%U http://hdl.handle.net/21.11116/0000-000F-5532-9
%R 10.1109/TPAMI.2024.3353528
%7 2024
%D 2024
%J IEEE Transactions on Pattern Analysis and Machine Intelligence
%V 46
%N 6
%& 4090
%P 4090 - 4101
%I IEEE
%C Piscataway, NJ

2023

Conference paper

S. Rao, M. Böhle, A. Parchami-Araghi, and B. Schiele

“Studying How to Efficiently and Effectively Guide Models with Explanations,” in IEEE/CVF International Conference on Computer Vision (ICCV 2023), Paris, France, 2023.

@inproceedings{Rao_ICCV23,
TITLE = {Studying How to Efficiently and Effectively Guide Models with Explanations},
AUTHOR = {Rao, Sukrut and B{\"o}hle, Moritz and Parchami-Araghi, Amin and Schiele, Bernt},
LANGUAGE = {eng},
ISBN = {979-8-3503-0718-4},
DOI = {10.1109/ICCV51070.2023.00184},
PUBLISHER = {IEEE},
YEAR = {2023},
MARGINALMARK = {$\bullet$},
DATE = {2023},
BOOKTITLE = {IEEE/CVF International Conference on Computer Vision (ICCV 2023)},
PAGES = {1922--1933},
ADDRESS = {Paris, France},
}

Endnote

%0 Conference Proceedings
%A Rao, Sukrut
%A Böhle, Moritz
%A Parchami-Araghi, Amin
%A Schiele, Bernt
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%T Studying How to Efficiently and Effectively Guide Models with Explanations
%G eng
%U http://hdl.handle.net/21.11116/0000-000D-CA7B-6
%R 10.1109/ICCV51070.2023.00184
%D 2023
%B IEEE/CVF International Conference on Computer Vision
%Z date of event: 2023-10-02 - 2023-10-06
%C Paris, France
%B IEEE/CVF International Conference on Computer Vision
%P 1922 - 1933
%I IEEE
%@ 979-8-3503-0718-4

2022

Conference paper

D4D2

S. Rao, M. Böhle, and B. Schiele

“Towards Better Understanding Attribution Methods,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022), New Orleans, LA, USA, 2022.

@inproceedings{Rao_CVPR2022,
TITLE = {Towards Better Understanding Attribution Methods},
AUTHOR = {Rao, Sukrut and B{\"o}hle, Moritz and Schiele, Bernt},
LANGUAGE = {eng},
ISBN = {978-1-6654-6946-3},
DOI = {10.1109/CVPR52688.2022.00998},
PUBLISHER = {IEEE},
YEAR = {2022},
BOOKTITLE = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022)},
PAGES = {10213--10222},
ADDRESS = {New Orleans, LA, USA},
}

Endnote

%0 Conference Proceedings
%A Rao, Sukrut
%A Böhle, Moritz
%A Schiele, Bernt
%+ Computer Graphics, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%T Towards Better Understanding Attribution Methods
%G eng
%U http://hdl.handle.net/21.11116/0000-000A-6F91-6
%R 10.1109/CVPR52688.2022.00998
%D 2022
%B 35th IEEE/CVF Conference on Computer Vision and Pattern Recognition
%Z date of event: 2022-06-19 - 2022-06-24
%C New Orleans, LA, USA
%B IEEE/CVF Conference on Computer Vision and Pattern Recognition
%P 10213 - 10222
%I IEEE
%@ 978-1-6654-6946-3

2020

Conference paper

D4D2

S. Rao, D. Stutz, and B. Schiele

“Adversarial Training Against Location-Optimized Adversarial Patches,” in Computer Vision -- ECCV Workshops 2020, Glasgow, UK, 2021.

@inproceedings{DBLP:conf/eccv/RaoSS20,
TITLE = {Adversarial Training Against Location-Optimized Adversarial Patches},
AUTHOR = {Rao, Sukrut and Stutz, David and Schiele, Bernt},
LANGUAGE = {eng},
ISBN = {978-3-030-68237-8},
DOI = {10.1007/978-3-030-68238-5_32},
PUBLISHER = {Springer},
YEAR = {2020},
DATE = {2021},
BOOKTITLE = {Computer Vision -- ECCV Workshops 2020},
EDITOR = {Bartoli, Adrian and Fusiello, Andrea},
PAGES = {429--448},
SERIES = {Lecture Notes in Computer Science},
VOLUME = {12539},
ADDRESS = {Glasgow, UK},
}

Endnote

%0 Conference Proceedings
%A Rao, Sukrut
%A Stutz, David
%A Schiele, Bernt
%+ Computer Graphics, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%T Adversarial Training Against Location-Optimized Adversarial Patches
%G eng
%U http://hdl.handle.net/21.11116/0000-0008-1662-1
%R 10.1007/978-3-030-68238-5_32
%D 2021
%B 16th European Conference on Computer Vision
%Z date of event: 2020-08-23 - 2020-08-28
%C Glasgow, UK
%B Computer Vision -- ECCV Workshops 2020 
%E Bartoli, Adrian; Fusiello, Andrea
%P 429 - 448
%I Springer
%@ 978-3-030-68237-8
%B Lecture Notes in Computer Science
%N 12539