Abstract
This article presents the QUASAR system for question answering over
unstructured text, structured tables, and knowledge graphs, with unified
treatment of all sources. The system adopts a RAG-based architecture, with a
pipeline of evidence retrieval followed by answer generation, with the latter
powered by a moderate-sized language model. Additionally and uniquely, QUASAR
has components for question understanding, to derive crisper input for evidence
retrieval, and for re-ranking and filtering the retrieved evidence before
feeding the most informative pieces into the answer generation. Experiments
with three different benchmarks demonstrate the high answering quality of our
approach, being on par with or better than large GPT models, while keeping the
computational cost and energy consumption orders of magnitude lower.
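The four stages named in the abstract (question understanding, evidence retrieval, re-ranking and filtering, answer generation) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not QUASAR's implementation: the `Evidence` type, all function names, the stopword list, and the keyword-overlap scoring are hypothetical stand-ins, and the last stage only assembles a prompt instead of calling a language model.

```python
from dataclasses import dataclass

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "of", "is", "who", "what", "which", "in", "was", "did"}


@dataclass(frozen=True)
class Evidence:
    source: str   # "text", "table", or "kg"
    content: str  # evidence verbalized as plain text, whatever its origin


def tokens(s):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    return {w.strip(".,?!:").lower() for w in s.split()} - {""}


def understand_question(question):
    """Stage 1 (toy stand-in): reduce the question to a crisper keyword set."""
    return tokens(question) - STOPWORDS


def retrieve(keywords, corpus):
    """Stage 2: keep every piece of evidence sharing a keyword with the question."""
    return [e for e in corpus if keywords & tokens(e.content)]


def rerank_and_filter(keywords, candidates, k=2):
    """Stage 3: rank candidates by keyword overlap and keep the top k."""
    ranked = sorted(
        candidates,
        key=lambda e: len(keywords & tokens(e.content)),
        reverse=True,
    )
    return ranked[:k]


def generate_answer(question, evidence):
    """Stage 4: build the prompt a language model would receive (no model call)."""
    lines = [f"[{e.source}] {e.content}" for e in evidence]
    return "\n".join(lines + [f"Question: {question}", "Answer:"])


def answer(question, corpus, k=2):
    """Run the four stages in order and return the assembled prompt."""
    keywords = understand_question(question)
    top = rerank_and_filter(keywords, retrieve(keywords, corpus), k)
    return generate_answer(question, top)
```

In this sketch, text, table rows, and knowledge-graph facts are all verbalized into the same `Evidence` shape, which is one simple way to give heterogeneous sources a unified treatment. Filtering down to `k` pieces before generation keeps the prompt short, which is where a smaller language model would benefit.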
BibTeX
@online{Christmann_2412.07420,
TITLE = {{RAG}-based Question Answering over Heterogeneous Data and Text},
AUTHOR = {Christmann, Philipp and Weikum, Gerhard},
LANGUAGE = {eng},
URL = {https://arxiv.org/abs/2412.07420},
EPRINT = {2412.07420},
EPRINTTYPE = {arXiv},
YEAR = {2024},
MARGINALMARK = {$\bullet$},
ABSTRACT = {This article presents the QUASAR system for question answering over unstructured text, structured tables, and knowledge graphs, with unified treatment of all sources. The system adopts a RAG-based architecture, with a pipeline of evidence retrieval followed by answer generation, with the latter powered by a moderate-sized language model. Additionally and uniquely, QUASAR has components for question understanding, to derive crisper input for evidence retrieval, and for re-ranking and filtering the retrieved evidence before feeding the most informative pieces into the answer generation. Experiments with three different benchmarks demonstrate the high answering quality of our approach, being on par with or better than large GPT models, while keeping the computational cost and energy consumption orders of magnitude lower.},
}
Endnote
%0 Report
%A Christmann, Philipp
%A Weikum, Gerhard
%+ Databases and Information Systems, MPI for Informatics, Max Planck Society Databases and Information Systems, MPI for Informatics, Max Planck Society
%T RAG-based Question Answering over Heterogeneous Data and Text
%G eng
%U http://hdl.handle.net/21.11116/0000-0010-546F-4
%U https://arxiv.org/abs/2412.07420
%D 2024
%X This article presents the QUASAR system for question answering over unstructured text, structured tables, and knowledge graphs, with unified treatment of all sources. The system adopts a RAG-based architecture, with a pipeline of evidence retrieval followed by answer generation, with the latter powered by a moderate-sized language model. Additionally and uniquely, QUASAR has components for question understanding, to derive crisper input for evidence retrieval, and for re-ranking and filtering the retrieved evidence before feeding the most informative pieces into the answer generation. Experiments with three different benchmarks demonstrate the high answering quality of our approach, being on par with or better than large GPT models, while keeping the computational cost and energy consumption orders of magnitude lower.
%K Computer Science, Computation and Language, cs.CL,Computer Science, Information Retrieval, cs.IR