Abstract
With the rise of neural networks, especially in high-stakes applications,
these networks need two properties to ensure their safety: (i) robustness
and (ii) interpretability. Recent advances in classifiers with 3D volumetric
object representations have demonstrated greatly enhanced robustness on
out-of-distribution data. However, these 3D-aware classifiers have not been
studied from the perspective of interpretability. We introduce CAVE (Concept
Aware Volumes for Explanations), a new direction that unifies interpretability
and robustness in image classification. We design an inherently interpretable
and robust classifier by extending existing 3D-aware classifiers with concepts
extracted from their volumetric representations for classification. Across an
array of quantitative interpretability metrics, we compare against
concept-based approaches from the explainable AI literature and show that
CAVE discovers well-grounded concepts that are used consistently across
images, while achieving superior robustness.
BibTeX
@online{Pham2503.13429,
  TITLE      = {Escaping Plato's Cave: Robust Conceptual Reasoning through Interpretable {3D} Neural Object Volumes},
  AUTHOR     = {Pham, Nhi and Schiele, Bernt and Kortylewski, Adam and Fischer, Jonas},
  LANGUAGE   = {eng},
  URL        = {https://arxiv.org/abs/2503.13429},
  EPRINT     = {2503.13429},
  EPRINTTYPE = {arXiv},
  YEAR       = {2025},
}
Endnote
%0 Report
%A Pham, Nhi
%A Schiele, Bernt
%A Kortylewski, Adam
%A Fischer, Jonas
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%+ Visual Computing and Artificial Intelligence, MPI for Informatics, Max Planck Society
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%T Escaping Plato's Cave: Robust Conceptual Reasoning through Interpretable 3D Neural Object Volumes
%G eng
%U http://hdl.handle.net/21.11116/0000-0010-EB66-3
%U https://arxiv.org/abs/2503.13429
%D 2025
%K Computer Science, Computer Vision and Pattern Recognition, cs.CV