Abstract
We consider the well-studied Robust $(k, z)$-Clustering problem, which
generalizes the classic $k$-Median, $k$-Means, and $k$-Center problems. Given a
constant $z\ge 1$, the input to Robust $(k, z)$-Clustering is a set $P$ of $n$
weighted points in a metric space $(M,\delta)$ and a positive integer $k$.
Further, each point belongs to one (or more) of the $m$ many different groups
$S_1,S_2,\ldots,S_m$. Our goal is to find a set $X$ of $k$ centers such that
$\max_{i \in [m]} \sum_{p \in S_i} w(p) \delta(p,X)^z$ is minimized.
This problem arises in the domains of robust optimization [Anthony, Goyal,
Gupta, Nagarajan, Math. Oper. Res. 2010] and in algorithmic fairness. For
polynomial time computation, an approximation factor of $O(\log m/\log\log m)$
is known [Makarychev, Vakilian, COLT $2021$], which is tight under a plausible
complexity assumption even in the line metrics. For FPT time, there is a
$(3^z+\epsilon)$-approximation algorithm, which is tight under GAP-ETH [Goyal,
Jaiswal, Inf. Proc. Letters, 2023].
Motivated by the tight lower bounds for general discrete metrics, we focus on
\emph{geometric} spaces such as the (discrete) high-dimensional Euclidean
setting and metrics of low doubling dimension, which play an important role in
data analysis applications. First, for a universal constant $\eta_0 >0.0006$,
we devise a $3^z(1-\eta_{0})$-factor FPT approximation algorithm for discrete
high-dimensional Euclidean spaces thereby bypassing the lower bound for general
metrics. We complement this result by showing that even the special case of
$k$-Center in dimension $\Theta(\log n)$ is $(\sqrt{3/2}- o(1))$-hard to
approximate for FPT algorithms. Finally, we complete the FPT approximation
landscape by designing an FPT $(1+\epsilon)$-approximation scheme (EPAS) for
the metric of sub-logarithmic doubling dimension.
BibTeX
@online{Abbasi2305.07316, TITLE = {Parameterized Approximation for Robust Clustering in Discrete Geometric Spaces}, AUTHOR = {Abbasi, Fateme and Banerjee, Sandip and Byrka, Jaros{\l}aw and Chalermsook, Parinya and Gadekar, Ameet and Khodamoradi, Kamyar and Marx, D{\'a}niel and Sharma, Roohani and Spoerhase, Joachim}, LANGUAGE = {eng}, URL = {https://arxiv.org/abs/2305.07316}, EPRINT = {2305.07316}, EPRINTTYPE = {arXiv}, YEAR = {2024}, MARGINALMARK = {$\bullet$}, ABSTRACT = {We consider the well-studied Robust $(k, z)$-Clustering problem, which<br>generalizes the classic $k$-Median, $k$-Means, and $k$-Center problems. Given a<br>constant $z\ge 1$, the input to Robust $(k, z)$-Clustering is a set $P$ of $n$<br>weighted points in a metric space $(M,\delta)$ and a positive integer $k$.<br>Further, each point belongs to one (or more) of the $m$ many different groups<br>$S_1,S_2,\ldots,S_m$. Our goal is to find a set $X$ of $k$ centers such that<br>$\max_{i \in [m]} \sum_{p \in S_i} w(p) \delta(p,X)^z$ is minimized.<br> This problem arises in the domains of robust optimization [Anthony, Goyal,<br>Gupta, Nagarajan, Math. Oper. Res. 2010] and in algorithmic fairness. For<br>polynomial time computation, an approximation factor of $O(\log m/\log\log m)$<br>is known [Makarychev, Vakilian, COLT $2021$], which is tight under a plausible<br>complexity assumption even in the line metrics. For FPT time, there is a<br>$(3^z+\epsilon)$-approximation algorithm, which is tight under GAP-ETH [Goyal,<br>Jaiswal, Inf. Proc. Letters, 2023].<br> Motivated by the tight lower bounds for general discrete metrics, we focus on<br>\emph{geometric} spaces such as the (discrete) high-dimensional Euclidean<br>setting and metrics of low doubling dimension, which play an important role in<br>data analysis applications. First, for a universal constant $\eta_0 >0.0006$,<br>we devise a $3^z(1-\eta_{0})$-factor FPT approximation algorithm for discrete<br>high-dimensional Euclidean spaces thereby bypassing the lower bound for general<br>metrics. We complement this result by showing that even the special case of<br>$k$-Center in dimension $\Theta(\log n)$ is $(\sqrt{3/2}- o(1))$-hard to<br>approximate for FPT algorithms. Finally, we complete the FPT approximation<br>landscape by designing an FPT $(1+\epsilon)$-approximation scheme (EPAS) for<br>the metric of sub-logarithmic doubling dimension.<br>}, }
Endnote
%0 Report %A Abbasi, Fateme %A Banerjee, Sandip %A Byrka, Jarosław %A Chalermsook, Parinya %A Gadekar, Ameet %A Khodamoradi, Kamyar %A Marx, Dániel %A Sharma, Roohani %A Spoerhase, Joachim %+ External Organizations External Organizations External Organizations External Organizations External Organizations External Organizations External Organizations Algorithms and Complexity, MPI for Informatics, Max Planck Society External Organizations %T Parameterized Approximation for Robust Clustering in Discrete Geometric Spaces : %G eng %U http://hdl.handle.net/21.11116/0000-0010-8837-7 %U https://arxiv.org/abs/2305.07316 %D 2024 %X We consider the well-studied Robust $(k, z)$-Clustering problem, which<br>generalizes the classic $k$-Median, $k$-Means, and $k$-Center problems. Given a<br>constant $z\ge 1$, the input to Robust $(k, z)$-Clustering is a set $P$ of $n$<br>weighted points in a metric space $(M,\delta)$ and a positive integer $k$.<br>Further, each point belongs to one (or more) of the $m$ many different groups<br>$S_1,S_2,\ldots,S_m$. Our goal is to find a set $X$ of $k$ centers such that<br>$\max_{i \in [m]} \sum_{p \in S_i} w(p) \delta(p,X)^z$ is minimized.<br> This problem arises in the domains of robust optimization [Anthony, Goyal,<br>Gupta, Nagarajan, Math. Oper. Res. 2010] and in algorithmic fairness. For<br>polynomial time computation, an approximation factor of $O(\log m/\log\log m)$<br>is known [Makarychev, Vakilian, COLT $2021$], which is tight under a plausible<br>complexity assumption even in the line metrics. For FPT time, there is a<br>$(3^z+\epsilon)$-approximation algorithm, which is tight under GAP-ETH [Goyal,<br>Jaiswal, Inf. Proc. Letters, 2023].<br> Motivated by the tight lower bounds for general discrete metrics, we focus on<br>\emph{geometric} spaces such as the (discrete) high-dimensional Euclidean<br>setting and metrics of low doubling dimension, which play an important role in<br>data analysis applications. First, for a universal constant $\eta_0 >0.0006$,<br>we devise a $3^z(1-\eta_{0})$-factor FPT approximation algorithm for discrete<br>high-dimensional Euclidean spaces thereby bypassing the lower bound for general<br>metrics. We complement this result by showing that even the special case of<br>$k$-Center in dimension $\Theta(\log n)$ is $(\sqrt{3/2}- o(1))$-hard to<br>approximate for FPT algorithms. Finally, we complete the FPT approximation<br>landscape by designing an FPT $(1+\epsilon)$-approximation scheme (EPAS) for<br>the metric of sub-logarithmic doubling dimension.<br> %K Computer Science, Data Structures and Algorithms, cs.DS,Computer Science, Computational Geometry, cs.CG,Computer Science, Learning, cs.LG