Query Generation Pipeline with Enhanced Answerability Assessment for Financial Information Retrieval
Published 7 Nov 2025 · arXiv · Hyunkyu Kim
Overview
The paper presents a methodology for building domain-specific information retrieval (IR) benchmarks, focusing on the banking sector. It introduces KoBankIR, a benchmark of 815 queries derived from 204 official banking documents. The benchmark is built with a pipeline that combines LLM-based query generation with an enhanced answerability assessment.
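The generate-then-filter idea can be sketched as follows. This is a minimal illustration, not the paper's implementation: `build_benchmark`, `toy_generate`, `toy_assess`, the 0-to-1 score scale, and the 0.5 threshold are all assumptions made for the example. In a real pipeline both callables would wrap LLM calls; the toy versions here exist only so the sketch runs.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class Candidate:
    doc_id: str
    query: str
    score: float  # answerability score; the [0, 1] scale is an assumption


def build_benchmark(
    docs: Dict[str, str],
    generate: Callable[[str], Iterable[str]],
    assess: Callable[[str, str], float],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> List[Candidate]:
    """Generate candidate queries per document and keep the answerable ones."""
    kept = []
    for doc_id, text in docs.items():
        for query in generate(text):
            score = assess(query, text)
            if score >= threshold:
                kept.append(Candidate(doc_id, query, score))
    return kept


def toy_generate(text: str) -> List[str]:
    # Stand-in for an LLM query generator: one on-topic and one off-topic query.
    topic = " ".join(text.split()[:2]).lower()
    return [f"How are {topic} charged?", "What is the weather tomorrow?"]


def toy_assess(query: str, text: str) -> float:
    # Stand-in for an answerability judge: share of longer query words
    # (more than 3 characters) that also occur in the document.
    doc_words = set(re.findall(r"\w+", text.lower()))
    query_words = [w for w in re.findall(r"\w+", query.lower()) if len(w) > 3]
    if not query_words:
        return 0.0
    return sum(w in doc_words for w in query_words) / len(query_words)


docs = {"d1": "Overdraft fees are charged monthly on savings accounts."}
benchmark = build_benchmark(docs, toy_generate, toy_assess)
```

With the toy stand-ins, only the on-topic query survives the filter. Passing the generator and the assessor as callables keeps the filtering logic independent of whichever model is used for either step.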
Key Insights
- KoBankIR Benchmark: Comprises 815 queries from 204 banking documents, highlighting the complexity of financial information retrieval.
- Enhanced Answerability Assessment: The assessment aligns more closely with human judgments of whether a query is answerable, giving a more accurate filter for generated queries.
- Retrieval Model Challenges: Existing models struggle with KoBankIR's complex queries, underscoring the need for advanced retrieval techniques.
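To make the retrieval finding concrete, here is a small sketch of how a retrieval model could be scored on a benchmark of this kind. The summary does not say which metrics the paper reports, so Recall@k is an assumption here, chosen because it is a common IR measure. The function names and the example query and document IDs are hypothetical.

```python
from typing import Dict, List, Set


def recall_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = len(set(ranked[:k]) & relevant)
    return hits / len(relevant)


def mean_recall_at_k(
    runs: Dict[str, List[str]],
    qrels: Dict[str, Set[str]],
    k: int,
) -> float:
    """Average Recall@k over all queries that have relevance labels."""
    scores = [recall_at_k(runs.get(qid, []), rel, k) for qid, rel in qrels.items()]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical relevance labels and ranked retrieval output for two queries.
qrels = {"q1": {"d1"}, "q2": {"d2", "d3"}}
runs = {"q1": ["d4", "d1", "d5"], "q2": ["d2", "d6", "d7"]}

r_at_1 = mean_recall_at_k(runs, qrels, k=1)
r_at_3 = mean_recall_at_k(runs, qrels, k=3)
```

A query with several relevant documents, such as `q2`, only reaches full recall when all of them are retrieved, which is one way multi-document queries lower a model's score.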
BFSI Relevance
- Why Relevant: Accurate information retrieval is critical for reliable AI services in banking, impacting decision-making and customer service.
- Primary Sector: Banking
- Subsectors: Retail Banking, Corporate Banking
- Actionable Implications: BFSI professionals should invest in developing or adopting advanced retrieval models to handle complex queries effectively.