QuTE: Answering Quantity Queries from Web Tables
Quantity queries, with filter conditions on quantitative measures of entities, are so far out of reach of search engines and QA assistants. To enable such queries over web content, this paper develops the first method for automatically extracting quantity facts from ad-hoc web tables. This involves recognizing quantities with normalized values and units, aligning them with the proper entities, and contextualizing these pairs with informative cues to match sophisticated queries with modifiers. Our method performs joint inference on entity linking and on entity-quantity column alignment. The latter was oversimplified in prior work by assuming a single subject column per table, whereas our approach is geared for complex tables and leverages external corpora as evidence. For contextualization, we identify informative cues from the text and structural markup that surround a table. For query-time fact ranking, we devise a new scoring technique that exploits both context similarity and inter-fact consistency. Comparisons of our building blocks against state-of-the-art baselines, and extrinsic experiments with two query benchmarks, demonstrate the benefits of our method.
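To make the notion of a quantity query concrete, the following is a minimal Python sketch of two of the steps named above: recognizing a quantity in a table cell with a normalized value and unit, and applying a filter condition to the resulting entity-quantity pairs. It is an illustration only and not the method of the paper: the unit table, the regular expression, the function names `parse_quantity` and `filter_rows`, and the sample rows are all assumptions made for this example, and entity linking, column alignment, contextualization, and ranking are left out entirely.

```python
import re
from typing import List, NamedTuple, Optional, Tuple

# Hypothetical unit table (an assumption for this sketch):
# surface unit -> (canonical unit, multiplier to the canonical unit).
UNIT_TABLE = {
    "m": ("m", 1.0),
    "km": ("m", 1000.0),
    "cm": ("m", 0.01),
    "kg": ("kg", 1.0),
    "g": ("kg", 0.001),
}


class Quantity(NamedTuple):
    value: float  # value expressed in the canonical unit
    unit: str     # canonical unit


# A number (optionally signed, with thousands separators) followed by a unit.
_CELL = re.compile(r"^\s*([-+]?\d[\d,]*(?:\.\d+)?)\s*([A-Za-z]+)\s*$")


def parse_quantity(cell: str) -> Optional[Quantity]:
    """Parse a cell such as '0.83 km' into a normalized Quantity, or None."""
    m = _CELL.match(cell)
    if m is None:
        return None
    number, unit = m.group(1), m.group(2)
    if unit not in UNIT_TABLE:
        return None
    canonical, factor = UNIT_TABLE[unit]
    return Quantity(float(number.replace(",", "")) * factor, canonical)


def filter_rows(rows: List[Tuple[str, str]], threshold: float, unit: str) -> List[str]:
    """Return entities whose quantity exceeds `threshold` in canonical `unit`."""
    hits = []
    for entity, cell in rows:
        q = parse_quantity(cell)
        if q is not None and q.unit == unit and q.value > threshold:
            hits.append(entity)
    return hits


# Made-up (entity, cell) pairs standing in for an already aligned table.
rows = [
    ("Tower A", "0.83 km"),
    ("Tower B", "452 m"),
    ("Tower C", "n/a"),
    ("Crate", "12 kg"),
]

# Toy query: "towers taller than 500 m".
tall = filter_rows(rows, 500.0, "m")
print(tall)
```

In this toy setting the entity for each quantity is simply the first field of the row. The abstract points out that real web tables do not permit that shortcut, which is why the paper treats entity-quantity column alignment as an inference problem.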
Publications
QuTE: Answering Quantity Queries from Web Tables (Demo paper)
Vinh Thinh Ho, Koninika Pal, and Gerhard Weikum
In Proc. SIGMOD 2021
Extracting Contextualized Quantity Facts from Web Tables
Vinh Thinh Ho, Koninika Pal, Simon Razniewski, Klaus Berberich, and Gerhard Weikum
In Proc. WWW 2021
Links
- Try our demo: https://qsearch.mpi-inf.mpg.de/table/
- 1.8M Wikipedia tables (Mar 2020): download
- TableL dataset (2.6M tables): download
- 618K tables with EL-CA annotation: download
- Supplemental materials: download
- Code and search APIs will be made available at: https://github.com/hovinhthinh/Qsearch
If you have any questions, please contact the author at: hvthinh@mpi-inf.mpg.de or hovinhthinh@gmail.com.