TeaRAG: A Token-Efficient Agentic Retrieval-Augmented Generation Framework

Overview

TeaRAG is a framework that improves the token efficiency of agentic Retrieval-Augmented Generation (RAG) systems by reducing the tokens consumed during retrieval and reasoning. It does so by compressing retrieved content and cutting the number of reasoning steps.

Key Insights

Token Efficiency: TeaRAG reduces output tokens by 61% with Llama3-8B-Instruct and by 59% with Qwen2.5-14B-Instruct as the base model.
Performance Improvement: The framework improves average Exact Match by 4% and 2% on the same two models, respectively.
Methodology: Combines chunk-based semantic retrieval with graph retrieval and applies Personalized PageRank to compress the retrieved content. Iterative Process-aware Direct Preference Optimization (IP-DPO) is used to reduce the number of reasoning steps.
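
To make the Personalized PageRank step more concrete, the following is a minimal sketch in plain Python, not TeaRAG's actual implementation. It assumes a small undirected, weighted graph whose nodes stand in for retrieved chunks and graph triplets, with the restart distribution concentrated on a query-related seed node. The node names, edge weights, damping factor, and the `top_k` helper are all hypothetical and chosen only for illustration. Keeping only the highest-scoring nodes is one plausible way such scores could be used to shrink the retrieved context.

```python
from typing import Dict, List, Tuple


def personalized_pagerank(
    edges: List[Tuple[str, str, float]],
    seeds: Dict[str, float],
    alpha: float = 0.85,
    iterations: int = 50,
) -> Dict[str, float]:
    """Power-iteration Personalized PageRank on an undirected weighted graph."""
    # Build a symmetric weighted adjacency map.
    adj: Dict[str, Dict[str, float]] = {}
    for u, v, w in edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    for s in seeds:
        adj.setdefault(s, {})

    # Normalise the seed weights into a restart distribution.
    total = sum(seeds.values())
    restart = {n: seeds.get(n, 0.0) / total for n in adj}

    scores = dict(restart)
    for _ in range(iterations):
        # Every node first receives its share of the restart mass.
        nxt = {n: (1.0 - alpha) * restart[n] for n in adj}
        for u, nbrs in adj.items():
            out = sum(nbrs.values())
            if out == 0.0:
                # Dangling node: return its mass to the restart distribution.
                for n in adj:
                    nxt[n] += alpha * scores[u] * restart[n]
                continue
            # Spread the remaining mass to neighbours in proportion to edge weight.
            for v, w in nbrs.items():
                nxt[v] += alpha * scores[u] * w / out
        scores = nxt
    return scores


def top_k(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k highest-scoring node names (hypothetical pruning helper)."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Hypothetical toy graph: one component linked to the query, one unrelated.
edges = [
    ("query_entity", "triplet_a", 1.0),
    ("triplet_a", "chunk_1", 0.8),
    ("chunk_1", "triplet_b", 0.5),
    ("triplet_c", "chunk_2", 0.9),
]
scores = personalized_pagerank(edges, seeds={"query_entity": 1.0})
kept = top_k(scores, 3)
print(kept)
```

In this toy graph, nodes that cannot be reached from the seed receive no score, so they would be the first to be dropped when the context is pruned to the top-ranked items.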

BFSI Relevance

Why Relevant: Efficient data retrieval and processing are critical to decision-making and customer service in the banking, financial services, and insurance (BFSI) sector.
Primary Sector: Financial Services
Subsectors: Asset Management, Claims Processing
Actionable Implications: BFSI teams that deploy RAG systems should evaluate token-efficient frameworks such as TeaRAG as a way to improve data processing while keeping computational costs under control.