MSc Steffen Jung
- Address
- Max-Planck-Institut für Informatik
Saarland Informatics Campus
Campus E1 4
66123 Saarbrücken - Location
- E1 4 - 624
- Phone
- +49 681 9325 2131
- Fax
- +49 681 9325 2099
- Get email via email
Deep learning models have proven to be successful in a wide
range of machine learning tasks. Yet, they are often highly sensitive to
perturbations on the input data which can lead to incorrect decisions
with high confidence, hampering their deployment for practical
use-cases. Thus, finding architectures that are (more) robust against
perturbations has received much attention in recent years. Just like the
search for well-performing architectures in terms of clean accuracy,
this usually involves a tedious trial-and-error process with one
additional challenge: the evaluation of a network's robustness is
significantly more expensive than its evaluation for clean accuracy.
Thus, the aim of this paper is to facilitate better streamlined research
on architectural design choices with respect to their impact on
robustness as well as, for example, the evaluation of surrogate measures
for robustness. We therefore borrow one of the most commonly considered
search spaces for neural architecture search for image classification,
NAS-Bench-201, which contains a manageable size of 6466 non-isomorphic
network designs. We evaluate all these networks on a range of common
adversarial attacks and corruption types and introduce a database on
neural architecture design and robustness evaluations. We further
present three exemplary use cases of this dataset, in which we (i)
benchmark robustness measurements based on Jacobian and Hessian matrices
for their robustness predictability, (ii) perform neural architecture
search on robust accuracies, and (iii) provide an initial analysis of
how architectural design choices affect robustness. We find that
carefully crafting the topology of a network can have substantial impact
on its robustness, where networks with the same parameter count range in
mean adversarial robust accuracy from 20%-41%.
The minimum cost multicut problem is the NP-hard/APX-hard combinatorial
optimization problem of partitioning a real-valued edge-weighted graph such as
to minimize the total cost of the partition. While graph convolutional neural
networks (GNN) have proven to be promising in the context of combinatorial
optimization, most of them are only tailored to or tested on positive-valued
edge weights, i.e. they do not comply to the nature of the multicut problem. We
therefore adapt various GNN architectures including Graph Convolutional
Networks, Signed Graph Convolutional Networks and Graph Isomorphic Networks to
facilitate the efficient encoding of real-valued edge costs. Moreover, we
employ a reformulation of the multicut ILP constraints to a polynomial program
as loss function that allows to learn feasible multicut solutions in a scalable
way. Thus, we provide the first approach towards end-to-end trainable
multicuts. Our findings support that GNN approaches can produce good solutions
in practice while providing lower computation times and largely improved
scalability compared to LP solvers and optimized heuristics, especially when
considering large instances.
The Minimum Cost Multicut Problem (MP) is a popular way for obtaining a graph
decomposition by optimizing binary edge labels over edge costs. While the
formulation of a MP from independently estimated costs per edge is highly
flexible and intuitive, solving the MP is NP-hard and time-expensive. As a
remedy, recent work proposed to predict edge probabilities with awareness to
potential conflicts by incorporating cycle constraints in the prediction
process. We argue that such formulation, while providing a first step towards
end-to-end learnable edge weights, is suboptimal, since it is built upon a
loose relaxation of the MP. We therefore propose an adaptive CRF that allows to
progressively consider more violated constraints and, in consequence, to issue
solutions with higher validity. Experiments on the BSDS500 benchmark for
natural image segmentation as well as on electron microscopic recordings show
that our approach yields more precise edge detection and image segmentation.