Moritz Böhle (PhD Student)

Personal Information

Publications

2025

Conference paper

S. Gairola, M. Böhle, F. Locatello, and B. Schiele

“How to Probe: Simple Yet Effective Techniques for Improving Post-hoc Explanations,” in The Thirteenth International Conference on Learning Representations (ICLR 2025 ), Singapore, 2025.

mehr

BibTeX

@inproceedings{Gairola_ICLR25,
TITLE = {How to Probe: Simple Yet Effective Techniques for Improving Post-hoc Explanations},
AUTHOR = {Gairola, Siddhartha and B{\"o}hle, Moritz and Locatello, Francesco and Schiele, Bernt},
LANGUAGE = {eng},
PUBLISHER = {OpenReview.net},
YEAR = {2025},
MARGINALMARK = {$\bullet$},
BOOKTITLE = {The Thirteenth International Conference on Learning Representations (ICLR 2025 )},
ADDRESS = {Singapore},
}

Endnote

%0 Conference Proceedings
%A Gairola, Siddhartha
%A B&#246;hle, Moritz
%A Locatello, Francesco
%A Schiele, Bernt
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%T How to Probe: Simple Yet Effective Techniques for Improving Post-hoc
  Explanations : 
%G eng
%U http://hdl.handle.net/21.11116/0000-0010-DACE-1
%D 2025
%B Thirteenth International Conference on Learning Representations
%Z date of event: 2025-04-24 - 2025-04-28
%C Singapore
%B The Thirteenth International Conference on Learning Representations
%I OpenReview.net
%U https://openreview.net/forum?id=57NfyYxh5f

2024

Conference paper

S. Arya, S. Rao, M. Boehle, and B. Schiele

“B-cosification: Transforming Deep Neural Networks to be Inherently Interpretable,” in Advances in Neural Information Processing Systems 37 (NeurIPS 2024), Vancouver, Canada, 2024.

mehr

Abstract

B-cos Networks have been shown to be effective for obtaining highly human
interpretable explanations of model decisions by architecturally enforcing
stronger alignment between inputs and weight. B-cos variants of convolutional
networks (CNNs) and vision transformers (ViTs), which primarily replace linear
layers with B-cos transformations, perform competitively to their respective
standard variants while also yielding explanations that are faithful by design.
However, it has so far been necessary to train these models from scratch, which
is increasingly infeasible in the era of large, pre-trained foundation models.
In this work, inspired by the architectural similarities in standard DNNs and
B-cos networks, we propose 'B-cosification', a novel approach to transform
existing pre-trained models to become inherently interpretable. We perform a
thorough study of design choices to perform this conversion, both for
convolutional neural networks and vision transformers. We find that
B-cosification can yield models that are on par with B-cos models trained from
scratch in terms of interpretability, while often outperforming them in terms
of classification performance at a fraction of the training cost. Subsequently,
we apply B-cosification to a pretrained CLIP model, and show that, even with
limited data and compute cost, we obtain a B-cosified version that is highly
interpretable and competitive on zero shot performance across a variety of
datasets. We release our code and pre-trained model weights at
github.com/shrebox/B-cosification.

BibTeX

@inproceedings{Arya_Neurips24,
TITLE = {B-cosification: {T}ransforming Deep Neural Networks to be Inherently Interpretable},
AUTHOR = {Arya, Shreyash and Rao, Sukrut and Boehle, Moritz and Schiele, Bernt},
LANGUAGE = {eng},
PUBLISHER = {Curran Associates, Inc.},
YEAR = {2024},
MARGINALMARK = {$\bullet$},
ABSTRACT = {B-cos Networks have been shown to be effective for obtaining highly human<br>interpretable explanations of model decisions by architecturally enforcing<br>stronger alignment between inputs and weight. B-cos variants of convolutional<br>networks (CNNs) and vision transformers (ViTs), which primarily replace linear<br>layers with B-cos transformations, perform competitively to their respective<br>standard variants while also yielding explanations that are faithful by design.<br>However, it has so far been necessary to train these models from scratch, which<br>is increasingly infeasible in the era of large, pre-trained foundation models.<br>In this work, inspired by the architectural similarities in standard DNNs and<br>B-cos networks, we propose 'B-cosification', a novel approach to transform<br>existing pre-trained models to become inherently interpretable. We perform a<br>thorough study of design choices to perform this conversion, both for<br>convolutional neural networks and vision transformers. We find that<br>B-cosification can yield models that are on par with B-cos models trained from<br>scratch in terms of interpretability, while often outperforming them in terms<br>of classification performance at a fraction of the training cost. Subsequently,<br>we apply B-cosification to a pretrained CLIP model, and show that, even with<br>limited data and compute cost, we obtain a B-cosified version that is highly<br>interpretable and competitive on zero shot performance across a variety of<br>datasets. We release our code and pre-trained model weights at<br>https://github.com/shrebox/B-cosification.<br>},
BOOKTITLE = {Advances in Neural Information Processing Systems 37 (NeurIPS 2024)},
EDITOR = {Globerson, A. and Mackey, L. and Belgrave, D. and Fan, A. and Paquet, U. and Tomczak, J. and Zhang, C.},
PAGES = {62756--62786},
ADDRESS = {Vancouver, Canada},
}

Endnote

%0 Conference Proceedings
%A Arya, Shreyash
%A Rao, Sukrut
%A Boehle, Moritz
%A Schiele, Bernt
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%T B-cosification: Transforming Deep Neural Networks to be Inherently
  Interpretable : 
%G eng
%U http://hdl.handle.net/21.11116/0000-0010-0FBE-9
%D 2024
%B 38th Conference on Neural Information Processing Systems
%Z date of event: 2024-12-10 - 2024-12-15
%C Vancouver, Canada
%X   B-cos Networks have been shown to be effective for obtaining highly human<br>interpretable explanations of model decisions by architecturally enforcing<br>stronger alignment between inputs and weight. B-cos variants of convolutional<br>networks (CNNs) and vision transformers (ViTs), which primarily replace linear<br>layers with B-cos transformations, perform competitively to their respective<br>standard variants while also yielding explanations that are faithful by design.<br>However, it has so far been necessary to train these models from scratch, which<br>is increasingly infeasible in the era of large, pre-trained foundation models.<br>In this work, inspired by the architectural similarities in standard DNNs and<br>B-cos networks, we propose 'B-cosification', a novel approach to transform<br>existing pre-trained models to become inherently interpretable. We perform a<br>thorough study of design choices to perform this conversion, both for<br>convolutional neural networks and vision transformers. We find that<br>B-cosification can yield models that are on par with B-cos models trained from<br>scratch in terms of interpretability, while often outperforming them in terms<br>of classification performance at a fraction of the training cost. Subsequently,<br>we apply B-cosification to a pretrained CLIP model, and show that, even with<br>limited data and compute cost, we obtain a B-cosified version that is highly<br>interpretable and competitive on zero shot performance across a variety of<br>datasets. We release our code and pre-trained model weights at<br>https://github.com/shrebox/B-cosification.<br>
%K Computer Science, Computer Vision and Pattern Recognition, cs.CV,Computer Science, Artificial Intelligence, cs.AI,Computer Science, Learning, cs.LG
%B Advances in Neural Information Processing Systems 37
%E Globerson, A.; Mackey, L.; Belgrave, D.; Fan, A.; Paquet, U.; Tomczak, J.; Zhang, C.
%P 62756 - 62786
%I Curran Associates, Inc.
%U https://proceedings.neurips.cc/paper_files/paper/2024/file/72d50a87b218d84c175d16f4557f7e12-Paper-Conference.pdf

Conference paper

A. Parchami-Araghi, M. Böhle, S. S. Rao, and B. Schiele

“Good Teachers Explain: Explanation-Enhanced Knowledge Distillation,” in Computer Vision -- ECCV 2024, Milano, Italy, 2024.

mehr

BibTeX

@inproceedings{ParchamiAraghiECCV24,
TITLE = {Good Teachers Explain: Explanation-Enhanced Knowledge Distillation},
AUTHOR = {Parchami-Araghi, Amin and B{\"o}hle, Moritz and Rao, Sukrut Sridhar and Schiele, Bernt},
LANGUAGE = {eng},
ISBN = {978-3-031-73464-9},
DOI = {10.1007/978-3-031-73464-9_18},
PUBLISHER = {Springer},
YEAR = {2024},
MARGINALMARK = {$\bullet$},
DATE = {2024},
BOOKTITLE = {Computer Vision -- ECCV 2024},
EDITOR = {Leonardis, Ale{\v s} and Ricci, Elisa and Roth, Stefan and Russakovsky, Olga and Sattler, Torsten and Varol, G{\"u}l},
PAGES = {293--310},
SERIES = {Lecture Notes in Computer Science},
VOLUME = {15131},
ADDRESS = {Milano, Italy},
}

Endnote

%0 Conference Proceedings
%A Parchami-Araghi, Amin
%A B&#246;hle, Moritz
%A Rao, Sukrut Sridhar
%A Schiele, Bernt
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%T Good Teachers Explain: Explanation-Enhanced Knowledge Distillation : 
%G eng
%U http://hdl.handle.net/21.11116/0000-000F-5534-7
%R 10.1007/978-3-031-73464-9_18
%D 2024
%B 18th European Conference on Computer Vision 
%Z date of event: 2024-09-29 - 2024-10-04
%C Milano, Italy
%B Computer Vision -- ECCV 2024 
%E Leonardis, Ale&#353;; Ricci, Elisa; Roth, Stefan; Russakovsky, Olga; Sattler, Torsten; Varol, G&#252;l
%P 293 - 310
%I Springer
%@ 978-3-031-73464-9
%B Lecture Notes in Computer Science
%N 15131

Conference paper

S. Rao, S. Mahajan, M. Böhle, and B. Schiele

“Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery,” in Computer Vision -- ECCV 2024, Milan, Italy, 2024.

mehr

BibTeX

@inproceedings{RaoECCV24,
TITLE = {Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery},
AUTHOR = {Rao, Sukrut and Mahajan, Sweta and B{\"o}hle, Moritz and Schiele, Bernt},
LANGUAGE = {eng},
ISBN = {978-3-031-72979-9},
DOI = {10.1007/978-3-031-72980-5_26},
PUBLISHER = {Springer},
YEAR = {2024},
MARGINALMARK = {$\bullet$},
DATE = {2024},
BOOKTITLE = {Computer Vision -- ECCV 2024},
EDITOR = {Leonardis, Ale{\v s} and Ricci, Elisa and Roth, Stefan and Russakovsky, Olga and Sattler, Torsten and Varol, G{\"u}l},
PAGES = {444--461},
SERIES = {Lecture Notes in Computer Science},
VOLUME = {15135},
ADDRESS = {Milan, Italy},
}

Endnote

%0 Conference Proceedings
%A Rao, Sukrut
%A Mahajan, Sweta
%A B&#246;hle, Moritz
%A Schiele, Bernt
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%T Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery : 
%G eng
%U http://hdl.handle.net/21.11116/0000-000F-9A97-9
%R 10.1007/978-3-031-72980-5_26
%D 2024
%B 18th European Conference on Computer Vision 
%Z date of event: 2024-09-29 - 2024-10-04
%C Milan, Italy
%B Computer Vision -- ECCV 2024
%E Leonardis, Ale&#353;; Ricci, Elisa; Roth, Stefan; Russakovsky, Olga; Sattler, Torsten; Varol, G&#252;l
%P 444 - 461
%I Springer
%@ 978-3-031-72979-9
%B Lecture Notes in Computer Science
%N 15135
%U https://rdcu.be/dZWlW

Article

M. Böhle, N. Singh, M. Fritz, and B. Schiele

“B-Cos Alignment for Inherently Interpretable CNNs and Vision Transformers,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 46, no. 6, 2024.

mehr

BibTeX

@article{BoehleTPAMI24,
TITLE = {B-Cos Alignment for Inherently Interpretable {CNNs} and Vision Transformers},
AUTHOR = {B{\"o}hle, Moritz and Singh, Navdeeppal and Fritz, Mario and Schiele, Bernt},
LANGUAGE = {eng},
ISSN = {0162-8828},
DOI = {10.1109/TPAMI.2024.3355155},
PUBLISHER = {IEEE},
ADDRESS = {Piscataway, NJ},
YEAR = {2024},
MARGINALMARK = {$\bullet$},
DATE = {2024},
JOURNAL = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
VOLUME = {46},
NUMBER = {6},
PAGES = {4504--4518},
}

Endnote

%0 Journal Article
%A B&#246;hle, Moritz
%A Singh, Navdeeppal
%A Fritz, Mario
%A Schiele, Bernt
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%T B-Cos Alignment for Inherently Interpretable CNNs and Vision Transformers : 
%G eng
%U http://hdl.handle.net/21.11116/0000-0011-0BCA-E
%R 10.1109/TPAMI.2024.3355155
%7 2024-01-17
%D 2024
%J IEEE Transactions on Pattern Analysis and Machine Intelligence
%O IEEE Trans. Pattern Anal. Mach. Intell. TPAMI
%V 46
%N 6
%& 4504
%P 4504 - 4518
%I IEEE
%C Piscataway, NJ
%@ false

Article

S. Rao, M. Boehle, and B. Schiele

“Better Understanding Differences in Attribution Methods via Systematic Evaluations,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 46, no. 6, 2024.

mehr

BibTeX

@article{RaoTPAMI24,
TITLE = {Better Understanding Differences in Attribution Methods via Systematic Evaluations},
AUTHOR = {Rao, Sukrut and Boehle, Moritz and Schiele, Bernt},
LANGUAGE = {eng},
DOI = {10.1109/TPAMI.2024.3353528},
PUBLISHER = {IEEE},
ADDRESS = {Piscataway, NJ},
YEAR = {2024},
MARGINALMARK = {$\bullet$},
DATE = {2024},
JOURNAL = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
VOLUME = {46},
NUMBER = {6},
PAGES = {4090--4101},
}

Endnote

%0 Journal Article
%A Rao, Sukrut
%A Boehle, Moritz
%A Schiele, Bernt
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%T Better Understanding Differences in Attribution Methods via Systematic Evaluations : 
%G eng
%U http://hdl.handle.net/21.11116/0000-000F-5532-9
%R 10.1109/TPAMI.2024.3353528
%7 2024
%D 2024
%J IEEE Transactions on Pattern Analysis and Machine Intelligence
%V 46
%N 6
%& 4090
%P 4090 - 4101
%I IEEE
%C Piscataway, NJ

Thesis

D2IMPR-CS

M. Böhle

“Towards Designing Inherently Interpretable Deep Neural Networks for Image Classification,” Universität des Saarlandes, Saarbrücken, 2024.

mehr

BibTeX

@phdthesis{BoehlePhD24,
TITLE = {Towards Designing Inherently Interpretable Deep Neural Networks for Image Classification},
AUTHOR = {B{\"o}hle, Moritz},
LANGUAGE = {eng},
URL = {urn:nbn:de:bsz:291--ds-421904},
DOI = {10.22028/D291-42190},
SCHOOL = {Universit{\"a}t des Saarlandes},
ADDRESS = {Saarbr{\"u}cken},
YEAR = {2024},
MARGINALMARK = {$\bullet$},
DATE = {2024},
}

Endnote

%0 Thesis
%A B&#246;hle, Moritz
%Y Schiele, Bernt
%Y Fritz, Mario
%A referee: Akata, Zeynep
%A referee: Brendel, Wieland
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
International Max Planck Research School, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
External Organizations
%T Towards Designing Inherently Interpretable Deep Neural Networks for Image Classification : 
%G eng
%U http://hdl.handle.net/21.11116/0000-000F-76AA-D
%U urn:nbn:de:bsz:291--ds-421904
%R 10.22028/D291-42190
%F OTHER: hdl:20.500.11880/37907
%I Universit&#228;t des Saarlandes
%C Saarbr&#252;cken
%D 2024
%P XIII, 265 p.
%V phd
%9 phd
%U https://publikationen.sulb.uni-saarland.de/handle/20.500.11880/37907

2023

Conference paper

A. Kukleva, M. Boehle, B. Schiele, H. Kuehne, and C. Rupprecht

“Temperature Schedules for Self-Supervised Contrastive Methods on Long-Tail Data,” in Eleventh International Conference on Learning Representations (ICLR 2023), Kigali, Rwanda, 2023.

mehr

BibTeX

@inproceedings{Kukleva_ICLR23,
TITLE = {Temperature Schedules for Self-Supervised Contrastive Methods on Long-Tail Data},
AUTHOR = {Kukleva, Anna and Boehle, Moritz and Schiele, Bernt and Kuehne, Hilde and Rupprecht, Christian},
LANGUAGE = {eng},
URL = {https://openreview.net/group?id=ICLR.cc/2023/Conference#poster},
PUBLISHER = {OpenReview.net},
YEAR = {2023},
MARGINALMARK = {$\bullet$},
BOOKTITLE = {Eleventh International Conference on Learning Representations (ICLR 2023)},
ADDRESS = {Kigali, Rwanda},
}

Endnote

%0 Conference Proceedings
%A Kukleva, Anna
%A Boehle, Moritz
%A Schiele, Bernt
%A Kuehne, Hilde
%A Rupprecht, Christian
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
External Organizations
%T Temperature Schedules for Self-Supervised Contrastive Methods on Long-Tail Data  : 
%G eng
%U http://hdl.handle.net/21.11116/0000-0010-45D5-0
%D 2023
%B Eleventh International Conference on Learning Representations
%Z date of event: 2023-05-01 - 2023-05-05
%C Kigali, Rwanda
%B Eleventh International Conference on Learning Representations
%I OpenReview.net
%U https://openreview.net/group?id=ICLR.cc/2023/Conference#poster
%U https://openreview.net/forum?id=ejHUr4nfHhD

Conference paper

S. Rao, M. Böhle, A. Parchami-Araghi, and B. Schiele

“Studying How to Efficiently and Effectively Guide Models with Explanations,” in IEEE/CVF International Conference on Computer Vision (ICCV 2023), Paris, France, 2023.

mehr

BibTeX

@inproceedings{Rao_ICCV23,
TITLE = {Studying How to Efficiently and Effectively Guide Models with Explanations},
AUTHOR = {Rao, Sukrut and B{\"o}hle, Moritz and Parchami-Araghi, Amin and Schiele, Bernt},
LANGUAGE = {eng},
ISBN = {979-8-3503-0718-4},
DOI = {10.1109/ICCV51070.2023.00184},
PUBLISHER = {IEEE},
YEAR = {2023},
MARGINALMARK = {$\bullet$},
DATE = {2023},
BOOKTITLE = {IEEE/CVF International Conference on Computer Vision (ICCV 2023)},
PAGES = {1922--1933},
ADDRESS = {Paris, France},
}

Endnote

%0 Conference Proceedings
%A Rao, Sukrut
%A B&#246;hle, Moritz
%A Parchami-Araghi, Amin
%A Schiele, Bernt
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%T Studying How to Efficiently and Effectively Guide Models with Explanations : 
%G eng
%U http://hdl.handle.net/21.11116/0000-000D-CA7B-6
%R 10.1109/ICCV51070.2023.00184
%D 2023
%B IEEE/CVF International Conference on Computer Vision
%Z date of event: 2023-10-02 - 2023-10-06
%C Paris, France
%B IEEE/CVF International Conference on Computer Vision
%P 1922 - 1933
%I IEEE
%@ 979-8-3503-0718-4

Article

M. Böhle, M. Fritz, and B. Schiele

“Optimising for Interpretability: Convolutional Dynamic Alignment Networks,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 6, 2023.

mehr

BibTeX

@article{Boehle2109.13004,
TITLE = {Optimising for Interpretability: Convolutional Dynamic Alignment Networks},
AUTHOR = {B{\"o}hle, Moritz and Fritz, Mario and Schiele, Bernt},
LANGUAGE = {eng},
ISSN = {0162-8828},
DOI = {10.1109/TPAMI.2022.3226041},
PUBLISHER = {IEEE},
ADDRESS = {Piscataway, NJ},
YEAR = {2023},
MARGINALMARK = {$\bullet$},
DATE = {2023},
JOURNAL = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
VOLUME = {45},
NUMBER = {6},
PAGES = {7625--7638},
}

Endnote

%0 Journal Article
%A B&#246;hle, Moritz
%A Fritz, Mario
%A Schiele, Bernt
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%T Optimising for Interpretability: Convolutional Dynamic Alignment
  Networks : 
%G eng
%U http://hdl.handle.net/21.11116/0000-0009-8113-F
%R 10.1109/TPAMI.2022.3226041
%7 2022-12-01
%D 2023
%J IEEE Transactions on Pattern Analysis and Machine Intelligence
%O IEEE Trans. Pattern Anal. Mach. Intell.
%V 45
%N 6
%& 7625
%P 7625 - 7638
%I IEEE
%C Piscataway, NJ
%@ false

Paper

M. Böhle, M. Fritz, and B. Schiele

“Holistically Explainable Vision Transformers,” 2023. [Online]. Available: https://arxiv.org/abs/2301.08669.

mehr

Abstract

Transformers increasingly dominate the machine learning landscape across many
tasks and domains, which increases the importance for understanding their
outputs. While their attention modules provide partial insight into their inner
workings, the attention scores have been shown to be insufficient for
explaining the models as a whole. To address this, we propose B-cos
transformers, which inherently provide holistic explanations for their
decisions. Specifically, we formulate each model component - such as the
multi-layer perceptrons, attention layers, and the tokenisation module - to be
dynamic linear, which allows us to faithfully summarise the entire transformer
via a single linear transform. We apply our proposed design to Vision
Transformers (ViTs) and show that the resulting models, dubbed Bcos-ViTs, are
highly interpretable and perform competitively to baseline ViTs on ImageNet.
Code will be made available soon.

BibTeX

@online{Boehle_2301.08669,
TITLE = {Holistically Explainable Vision Transformers},
AUTHOR = {B{\"o}hle, Moritz and Fritz, Mario and Schiele, Bernt},
LANGUAGE = {eng},
URL = {https://arxiv.org/abs/2301.08669},
EPRINT = {2301.08669},
EPRINTTYPE = {arXiv},
YEAR = {2023},
MARGINALMARK = {$\bullet$},
ABSTRACT = {Transformers increasingly dominate the machine learning landscape across many<br>tasks and domains, which increases the importance for understanding their<br>outputs. While their attention modules provide partial insight into their inner<br>workings, the attention scores have been shown to be insufficient for<br>explaining the models as a whole. To address this, we propose B-cos<br>transformers, which inherently provide holistic explanations for their<br>decisions. Specifically, we formulate each model component -- such as the<br>multi-layer perceptrons, attention layers, and the tokenisation module -- to be<br>dynamic linear, which allows us to faithfully summarise the entire transformer<br>via a single linear transform. We apply our proposed design to Vision<br>Transformers (ViTs) and show that the resulting models, dubbed Bcos-ViTs, are<br>highly interpretable and perform competitively to baseline ViTs on ImageNet.<br>Code will be made available soon.<br>},
}

Endnote

%0 Report
%A B&#246;hle, Moritz
%A Fritz, Mario
%A Schiele, Bernt
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%T Holistically Explainable Vision Transformers : 
%G eng
%U http://hdl.handle.net/21.11116/0000-0010-45E3-0
%U https://arxiv.org/abs/2301.08669
%D 2023
%X   Transformers increasingly dominate the machine learning landscape across many<br>tasks and domains, which increases the importance for understanding their<br>outputs. While their attention modules provide partial insight into their inner<br>workings, the attention scores have been shown to be insufficient for<br>explaining the models as a whole. To address this, we propose B-cos<br>transformers, which inherently provide holistic explanations for their<br>decisions. Specifically, we formulate each model component - such as the<br>multi-layer perceptrons, attention layers, and the tokenisation module - to be<br>dynamic linear, which allows us to faithfully summarise the entire transformer<br>via a single linear transform. We apply our proposed design to Vision<br>Transformers (ViTs) and show that the resulting models, dubbed Bcos-ViTs, are<br>highly interpretable and perform competitively to baseline ViTs on ImageNet.<br>Code will be made available soon.<br>
%K Computer Science, Computer Vision and Pattern Recognition, cs.CV,Statistics, Machine Learning, stat.ML

2022

Conference paper

M. Böhle, M. Fritz, and B. Schiele

“B-cos Networks: Alignment is All We Need for Interpretability,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022), New Orleans, LA, USA, 2022.

mehr

BibTeX

@inproceedings{Boehle_CVPR2022,
TITLE = {B-cos Networks: {A}lignment is All We Need for Interpretability},
AUTHOR = {B{\"o}hle, Moritz and Fritz, Mario and Schiele, Bernt},
LANGUAGE = {eng},
ISBN = {978-1-6654-6946-3},
DOI = {10.1109/CVPR52688.2022.01008},
PUBLISHER = {IEEE},
YEAR = {2022},
BOOKTITLE = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022)},
PAGES = {10319--10328},
ADDRESS = {New Orleans, LA, USA},
}

Endnote

%0 Conference Proceedings
%A B&#246;hle, Moritz
%A Fritz, Mario
%A Schiele, Bernt
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%T B-cos Networks: Alignment is All We Need for Interpretability : 
%G eng
%U http://hdl.handle.net/21.11116/0000-000A-6F96-1
%R 10.1109/CVPR52688.2022.01008
%D 2022
%B 35th IEEE/CVF Conference on Computer Vision and Pattern Recognition
%Z date of event: 2022-06-19 - 2022-06-24
%C New Orleans, LA, USA
%B IEEE/CVF Conference on Computer Vision and Pattern Recognition
%P 10319 - 10328
%I IEEE
%@ 978-1-6654-6946-3

Conference paper

D4D2

S. Rao, M. Böhle, and B. Schiele

“Towards Better Understanding Attribution Methods,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022), New Orleans, LA, USA, 2022.

mehr

BibTeX

@inproceedings{Rao_CVPR2022,
TITLE = {Towards Better Understanding Attribution Methods},
AUTHOR = {Rao, Sukrut and B{\"o}hle, Moritz and Schiele, Bernt},
LANGUAGE = {eng},
ISBN = {978-1-6654-6946-3},
DOI = {10.1109/CVPR52688.2022.00998},
PUBLISHER = {IEEE},
YEAR = {2022},
BOOKTITLE = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022)},
PAGES = {10213--10222},
ADDRESS = {New Orleans, LA, USA},
}

Endnote

%0 Conference Proceedings
%A Rao, Sukrut
%A B&#246;hle, Moritz
%A Schiele, Bernt
%+ Computer Graphics, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%T Towards Better Understanding Attribution Methods : 
%G eng
%U http://hdl.handle.net/21.11116/0000-000A-6F91-6
%R 10.1109/CVPR52688.2022.00998
%D 2022
%B 35th IEEE/CVF Conference on Computer Vision and Pattern Recognition
%Z date of event: 2022-06-19 - 2022-06-24
%C New Orleans, LA, USA
%B IEEE/CVF Conference on Computer Vision and Pattern Recognition
%P 10213 - 10222
%I IEEE
%@ 978-1-6654-6946-3

2021

Conference paper

M. Böhle, M. Fritz, and B. Schiele

“Convolutional Dynamic Alignment Networks for Interpretable Classifications,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2021), Nashville, TN, USA (Virtual), 2021.

mehr

BibTeX

@inproceedings{Boehle_CVPR21,
TITLE = {Convolutional Dynamic Alignment Networks for Interpretable Classifications},
AUTHOR = {B{\"o}hle, Moritz and Fritz, Mario and Schiele, Bernt},
LANGUAGE = {eng},
ISBN = {978-1-6654-4509-2},
DOI = {10.1109/CVPR46437.2021.00990},
PUBLISHER = {IEEE},
YEAR = {2021},
BOOKTITLE = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2021)},
PAGES = {10029--10038},
ADDRESS = {Nashville, TN, USA (Virtual)},
}

Endnote

%0 Conference Proceedings
%A B&#246;hle, Moritz
%A Fritz, Mario
%A Schiele, Bernt
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
External Organizations
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%T Convolutional Dynamic Alignment Networks for Interpretable Classifications : 
%G eng
%U http://hdl.handle.net/21.11116/0000-0008-1863-E
%R 10.1109/CVPR46437.2021.00990
%D 2021
%B 34th IEEE Conference on Computer Vision and Pattern Recognition
%Z date of event: 2021-06-19 - 2021-06-25
%C Nashville, TN, USA (Virtual)
%B IEEE/CVF Conference on Computer Vision and Pattern Recognition
%P 10029 - 10038
%I IEEE
%@ 978-1-6654-4509-2