BudgetMem: Learning Selective Memory Policies for Cost-Efficient Long-Context Processing in Language Models
Published 7 Nov 2025 · arXiv · Chandra Vamsi Krishna Alla
Overview
BudgetMem is a novel architecture that addresses the computational and memory costs large language models (LLMs) incur when processing long contexts. Rather than storing the full context, it learns selective memory policies that decide which information to retain under a fixed memory budget.
Key Insights
- Memory Efficiency: BudgetMem reduces memory usage by 72.4% compared to baseline retrieval augmented generation (RAG) systems, with only a 1.0% degradation in F1 score.
- Evidence: Evaluated on 700 question-answer pairs across varying document lengths.
- Verifiable: Yes, based on experimental results.
- Selective Memory Policies: Utilizes feature-based salience scoring to determine which information to store, improving efficiency under budget constraints.
- Evidence: Uses entity density, TF-IDF, discourse markers, and position bias.
- Verifiable: Yes, through methodology description.
BFSI Relevance
- Why Relevant: Efficient long-context processing is crucial for sectors requiring analysis of extensive documents, such as financial services.
- Primary Sector: Financial Services
- Subsectors: Asset Management, Corporate Banking
- Actionable Implications:
  - Pilot BudgetMem-style selective memory to process long documents (e.g., filings, contracts) within fixed memory budgets.
  - Exploit the reduced memory footprint to cut inference and data-processing costs at scale.
Entities
- Organizations: None mentioned.
- Vendors: None mentioned.
Author Authority
- The author is a researcher working on efficient long-context processing for language models; no institutional affiliation is noted in the summary.
Is Academic
- Yes, this is an academic paper, though as an arXiv posting it may not yet have undergone peer review.
Citations
- None provided in the summary.
Key Figures
- Memory Usage Reduction: 72.4%
- Context: Compared to baseline RAG systems.
- F1 Score Degradation: 1.0%
- Context: Performance impact of using BudgetMem.