Publications - The Year Before Last
2023
- “Class-Incremental Exemplar Compression for Class-Incremental Learning,” in 36th IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023), Vancouver, Canada, 2023.
- “Transitivity Recovering Decompositions: Interpretable and Robust Fine-Grained Relationships,” in Advances in Neural Information Processing Systems 36 (NeurIPS 2023), New Orleans, LA, USA, 2023.
- “LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching,” in Advances in Neural Information Processing Systems 36 (NeurIPS 2023), New Orleans, LA, USA, 2023.
- “Differentiable Architecture Search: a One-Shot Method?,” in AutoML Conference 2023, Potsdam/Berlin, Germany, 2023.
- “A Polyhedral Study of Lifted Multicuts,” Discrete Optimization, vol. 47, 2023.
- “SoftMatch: Addressing the Quantity-Quality Tradeoff in Semi-supervised Learning,” in Eleventh International Conference on Learning Representations (ICLR 2023), Kigali, Rwanda, 2023.
- “Towards Robust Object Detection Invariant to Real-World Domain Shifts,” in Eleventh International Conference on Learning Representations (ICLR 2023), Kigali, Rwanda, 2023.
- “Neural Architecture Design and Robustness: A Dataset,” in Eleventh International Conference on Learning Representations (ICLR 2023), Kigali, Rwanda, 2023.
Abstract
Deep learning models have proven to be successful in a wide
range of machine learning tasks. Yet, they are often highly sensitive to
perturbations on the input data which can lead to incorrect decisions
with high confidence, hampering their deployment for practical
use-cases. Thus, finding architectures that are (more) robust against
perturbations has received much attention in recent years. Just like the
search for well-performing architectures in terms of clean accuracy,
this usually involves a tedious trial-and-error process with one
additional challenge: the evaluation of a network's robustness is
significantly more expensive than its evaluation for clean accuracy.
Thus, the aim of this paper is to facilitate better streamlined research
on architectural design choices with respect to their impact on
robustness as well as, for example, the evaluation of surrogate measures
for robustness. We therefore borrow one of the most commonly considered
search spaces for neural architecture search for image classification,
NAS-Bench-201, which contains a manageable size of 6466 non-isomorphic
network designs. We evaluate all these networks on a range of common
adversarial attacks and corruption types and introduce a database on
neural architecture design and robustness evaluations. We further
present three exemplary use cases of this dataset, in which we (i)
benchmark robustness measurements based on Jacobian and Hessian matrices
for their robustness predictability, (ii) perform neural architecture
search on robust accuracies, and (iii) provide an initial analysis of
how architectural design choices affect robustness. We find that
carefully crafting the topology of a network can have substantial impact
on its robustness, where networks with the same parameter count range in
mean adversarial robust accuracy from 20% to 41%.
- “Temperature Schedules for Self-Supervised Contrastive Methods on Long-Tail Data,” in Eleventh International Conference on Learning Representations (ICLR 2023), Kigali, Rwanda, 2023.
- “FreeMatch: Self-adaptive Thresholding for Semi-supervised Learning,” in Eleventh International Conference on Learning Representations (ICLR 2023), Kigali, Rwanda, 2023.
- “Visual Coherence Loss for Coherent and Visually Grounded Story Generation,” in Findings of the Association for Computational Linguistics (ACL 2023), Toronto, Canada, 2023.
Abstract
Local coherence is essential for long-form text generation models. We identify two important aspects of local coherence within the visual storytelling task: (1) the model needs to represent re-occurrences of characters within the image sequence in order to mention them correctly in the story; (2) character representations should enable us to find instances of the same characters and distinguish different characters. In this paper, we propose a loss function inspired by a linguistic theory of coherence for self-supervised learning for image sequence representations. We further propose combining features from an object and a face detector to construct stronger character features. To evaluate input-output relevance that current reference-based metrics don't measure, we propose a character matching metric to check whether the models generate referring expressions correctly for characters in input image sequences. Experiments on a visual story generation dataset show that our proposed features and loss function are effective for generating more coherent and visually grounded stories.
- “Weakly-Supervised Domain Adaptive Semantic Segmentation With Prototypical Contrastive Learning,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023), Vancouver, Canada, 2023.
- “HGFormer: Hierarchical Grouping Transformer for Domain Generalized Semantic Segmentation,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023), Vancouver, Canada, 2023.
- “Federated Incremental Semantic Segmentation,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023), Vancouver, Canada, 2023.
- “Continuous Pseudo-Label Rectified Domain Adaptive Semantic Segmentation With Implicit Neural Representations,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023), Vancouver, Canada, 2023.
- “Improving Robustness of Vision Transformers by Reducing Sensitivity To Patch Corruptions,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023), Vancouver, Canada, 2023.
- “MIC: Masked Image Consistency for Context-Enhanced Domain Adaptation,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023), Vancouver, Canada, 2023.
- “A Meta-Learning Approach to Predicting Performance and Data Requirements,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023), Vancouver, Canada, 2023.
- “Self-Supervised Pre-Training With Masked Shape Prediction for 3D Scene Understanding,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023), Vancouver, Canada, 2023.
- “Continual Detection Transformer for Incremental Object Detection,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023), Vancouver, Canada, 2023.
- “Object Pop-Up: Can We Infer 3D Objects and their Poses from Human Interactions Alone?,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023), Vancouver, Canada, 2023.
- “DSVT: Dynamic Sparse Voxel Transformer With Rotated Sets,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023), Vancouver, Canada, 2023.
- “Virtual Sparse Convolution for Multimodal 3D Object Detection,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023), Vancouver, Canada, 2023.
- “Visibility Aware Human-Object Interaction Tracking from Single RGB Camera,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023), Vancouver, Canada, 2023.
- “ConQueR: Query Contrast Voxel-DETR for 3D Object Detection,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023), Vancouver, Canada, 2023.
- “TrajectoryFormer: 3D Object Tracking Transformer with Predictive Trajectory Hypotheses,” in IEEE/CVF International Conference on Computer Vision (ICCV 2023), Paris, France, 2023.
- “SSB: Simple but Strong Baseline for Boosting Performance of Open-Set Semi-Supervised Learning,” in IEEE/CVF International Conference on Computer Vision (ICCV 2023), Paris, France, 2023.
- “Robustifying Token Attention for Vision Transformers,” in IEEE/CVF International Conference on Computer Vision (ICCV 2023), Paris, France, 2023.
- “Studying How to Efficiently and Effectively Guide Models with Explanations,” in IEEE/CVF International Conference on Computer Vision (ICCV 2023), Paris, France, 2023.
- “DARTH: Holistic Test-time Adaptation for Multiple Object Tracking,” in IEEE/CVF International Conference on Computer Vision (ICCV 2023), Paris, France, 2023.
- “In-Style: Bridging Text and Uncurated Videos with Style Transfer for Text-Video Retrieval,” in IEEE/CVF International Conference on Computer Vision (ICCV 2023), Paris, France, 2023.
- “Learning by Sorting: Self-supervised Learning with Group Ordering Constraints,” in IEEE/CVF International Conference on Computer Vision (ICCV 2023), Paris, France, 2023.
- “UniTR: A Unified and Efficient Multi-Modal Transformer for Bird’s-Eye-View Representation,” in IEEE/CVF International Conference on Computer Vision (ICCV 2023), Paris, France, 2023.
- “SimNP: Learning Self-Similarity Priors Between Neural Points,” in IEEE/CVF International Conference on Computer Vision (ICCV 2023), Paris, France, 2023.
- “NSF: Neural Surface Fields for Human Modeling from Monocular Depth,” in IEEE/CVF International Conference on Computer Vision (ICCV 2023), Paris, France, 2023.
- “On the Unreasonable Vulnerability of Transformers for Image Restoration – and an Easy Fix,” in IEEE/CVF International Conference on Computer Vision Workshops (ICCVW 2023), Paris, France, 2023.
- “Classification Robustness to Common Optical Aberrations,” in IEEE/CVF International Conference on Computer Vision Workshops (ICCVW 2023), Paris, France, 2023.
- “HRFuser: A Multi-resolution Sensor Fusion Architecture for 2D Object Detection,” in IEEE 26th International Conference on Intelligent Transportation Systems (ITSC), Bilbao, Spain, 2023.
- “Test-time Domain Adaptation for Monocular Depth Estimation,” in IEEE International Conference on Robotics and Automation (ICRA 2023), London, UK, 2023.
- “TrafficBots: Towards World Models for Autonomous Driving Simulation and Motion Prediction,” in IEEE International Conference on Robotics and Automation (ICRA 2023), London, UK, 2023.
- “Optimising for Interpretability: Convolutional Dynamic Alignment Networks,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 6, 2023.
- “LayerNet: High-Resolution Semantic 3D Reconstruction of Clothed People,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 46, no. 2, 2023.
- “Binaural SoundNet: Predicting Semantics, Depth and Motion with Binaural Sounds,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 1, 2023.
- “A Deeper Look into DeepCap,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 4, 2023.
Abstract
Human performance capture is a highly important computer vision problem with
many applications in movie production and virtual/augmented reality. Many
previous performance capture approaches either required expensive multi-view
setups or did not recover dense space-time coherent geometry with
frame-to-frame correspondences. We propose a novel deep learning approach for
monocular dense human performance capture. Our method is trained in a weakly
supervised manner based on multi-view supervision completely removing the need
for training data with 3D ground truth annotations. The network architecture is
based on two separate networks that disentangle the task into a pose estimation
and a non-rigid surface deformation step. Extensive qualitative and
quantitative evaluations show that our approach outperforms the state of the
art in terms of quality and robustness. This work is an extended version of
DeepCap where we provide more detailed explanations, comparisons and results as
well as applications.
- “Higher-Order Multicuts for Geometric Model Fitting and Motion Segmentation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 1, 2023.
Abstract
The minimum cost lifted multicut problem is a generalization of the multicut problem and a means of optimizing a decomposition of a graph w.r.t. both positive and negative edge costs. Its main advantage is that multicut-based formulations do not require the number of components to be given a priori; instead, it is deduced from the solution. However, the standard multicut cost function is limited to pairwise relationships between nodes, while several important applications either require or can benefit from a higher-order cost function, i.e. hyper-edges. In this paper, we propose a pseudo-boolean formulation for a multiple model fitting problem. It is based on a formulation of any-order minimum cost lifted multicuts, which allows partitioning an undirected graph with pairwise connectivity so as to minimize costs defined over any set of hyper-edges. As the proposed formulation is NP-hard and the branch-and-bound algorithm is too slow in practice, we propose an efficient local search algorithm for inference in the resulting problems. We demonstrate the versatility and effectiveness of our approach in several applications: geometric multiple model fitting, homography and motion estimation, and motion segmentation.
- “Random and Adversarial Bit Error Robustness: Energy-Efficient and Secure DNN Accelerators,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 3, 2023.
- “Urban Scene Semantic Segmentation With Low-Cost Coarse Annotation,” in 2023 IEEE Winter Conference on Applications of Computer Vision (WACV 2023), Waikoloa Village, HI, USA, 2023.
- “Control-NeRF: Editable Feature Volumes for Scene Rendering and Manipulation,” in 2023 IEEE Winter Conference on Applications of Computer Vision (WACV 2023), Waikoloa Village, HI, USA, 2023.
- “Jointly Learning Band Selection and Filter Array Design for Hyperspectral Imaging,” in 2023 IEEE Winter Conference on Applications of Computer Vision (WACV 2023), Waikoloa Village, HI, USA, 2023.
- “Intra-Source Style Augmentation for Improved Domain Generalization,” in 2023 IEEE Winter Conference on Applications of Computer Vision (WACV 2023), Waikoloa Village, HI, USA, 2023.
- “Revisiting Consistency Regularization for Semi-supervised Learning,” International Journal of Computer Vision, vol. 131, 2023.
- “Improving Semi-Supervised and Domain-Adaptive Semantic Segmentation with Self-Supervised Depth Estimation,” International Journal of Computer Vision, 2023.
- “3D Object Detection for Autonomous Driving: A Comprehensive Survey,” International Journal of Computer Vision, 2023.
- “Improving Primary-Vertex Reconstruction with a Minimum-Cost Lifted Multicut Graph Partitioning Algorithm,” Journal of Instrumentation, vol. 18, 2023.
- “Towards Understanding Climate Change Perceptions: A Social Media Dataset,” in NeurIPS 2023 Workshop on Tackling Climate Change with Machine Learning, New Orleans, LA, USA, 2023.
- “Learning Comprehensive Global Features in Person Re-identification: Ensuring Discriminativeness of more Local Regions,” Pattern Recognition, vol. 134, 2023.
- “Certified Robust Models with Slack Control and Large Lipschitz Constants,” in Pattern Recognition (DAGM GCPR 2023), Heidelberg, Germany, 2023.
- “An Evaluation of Zero-Cost Proxies - From Neural Architecture Performance Prediction to Model Robustness,” in Pattern Recognition (DAGM GCPR 2023), Heidelberg, Germany, 2023.
- “FullFormer: Generating Shapes Inside Shapes,” in Pattern Recognition (DAGM GCPR 2023), Heidelberg, Germany, 2023.
- “Online Hyperparameter Optimization for Class-Incremental Learning,” in Proceedings of the 37th AAAI Conference on Artificial Intelligence, Washington, DC, USA, 2023.
- “Joint Self-Supervised Image-Volume Representation Learning with Intra-Inter Contrastive Clustering,” in Proceedings of the 37th AAAI Conference on Artificial Intelligence, Washington, DC, USA, 2023.
- “Learning Context-Aware Classifier for Semantic Segmentation,” in Proceedings of the 37th AAAI Conference on Artificial Intelligence, Washington, DC, USA, 2023.
- “ClusterFuG: Clustering Fully connected Graphs by Multicut,” in Proceedings of the 40th International Conference on Machine Learning (ICML 2023), Honolulu, Hawaii, USA, 2023.
- “Discovering Class-Specific GAN Controls for Semantic Image Synthesis,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2023), Vancouver, Canada, 2023.
- “Visual Writing Prompts: Character-Grounded Story Generation with Curated Image Sequences,” Transactions of the Association for Computational Linguistics, vol. 11, 2023.
- “Improving Native CNN Robustness with Filter Frequency Regularization,” Transactions on Machine Learning Research, vol. 2023, 2023.
- “Modelling 3D Humans: Pose, Shape, Clothing and Interactions,” Universität des Saarlandes, Saarbrücken, 2023.
- “Holistically Explainable Vision Transformers,” 2023. [Online]. Available: https://arxiv.org/abs/2301.08669.
Abstract
Transformers increasingly dominate the machine learning landscape across many
tasks and domains, which increases the importance of understanding their
outputs. While their attention modules provide partial insight into their inner
workings, the attention scores have been shown to be insufficient for
explaining the models as a whole. To address this, we propose B-cos
transformers, which inherently provide holistic explanations for their
decisions. Specifically, we formulate each model component - such as the
multi-layer perceptrons, attention layers, and the tokenisation module - to be
dynamic linear, which allows us to faithfully summarise the entire transformer
via a single linear transform. We apply our proposed design to Vision
Transformers (ViTs) and show that the resulting models, dubbed Bcos-ViTs, are
highly interpretable and perform competitively to baseline ViTs on ImageNet.
Code will be made available soon.
- “Learning from Imperfect Data: Incremental Learning and Few-shot Learning,” Universität des Saarlandes, Saarbrücken, 2023.
- “Improving Quality and Controllability in GAN-based Image Synthesis,” Universität des Saarlandes, Saarbrücken, 2023.