Cognitive Edge Computing: A Comprehensive Survey on Optimizing Large Models and AI Agents for Pervasive Deployment
Published 7 Nov 2025 · arXiv · Xubin Wang
Overview
The paper surveys cognitive edge computing, with a focus on optimizing large AI models for deployment on resource-constrained devices. It covers three areas: model optimization, system architecture, and adaptive intelligence.
Key Insights
- Model Optimization: Techniques like quantization, sparsity, and distillation are crucial for deploying large models on devices with limited resources.
- System Architecture: Strategies such as on-device inference and cloud-edge collaboration help balance latency, energy, and privacy.
- Adaptive Intelligence: Methods like context compression and dynamic routing tailor computation to device constraints.
- Evaluation Protocol: A standardized protocol for measuring latency, throughput, energy, and privacy is proposed.
- Challenges: Open problems remain, including energy reporting and safety evaluation.
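To make the quantization point above concrete, here is a minimal sketch of one common scheme, symmetric per-tensor int8 quantization. It is an illustration only and is not taken from the paper; the function names `quantize_int8` and `dequantize` and the sample weights are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using a single shared scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Assumption: fall back to a scale of 1.0 for an all-zero (or empty) tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step, then clamp to the int8 range.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the quantized integers."""
    return [v * scale for v in q]


# Hypothetical weights, chosen only for illustration.
weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight is stored in 8 bits instead of 32, at the cost of a rounding error of at most half a quantization step per value. Production toolchains typically refine this with per-channel scales and calibration data.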
BFSI Relevance
- Why Relevant: Efficient deployment of AI models on edge devices can improve real-time decision-making in banking, financial services, and insurance (BFSI).
- Primary Sector: Financial Services
- Subsectors: Asset Management, Retail Banking
- Actionable Implications: BFSI professionals should explore edge computing to improve service delivery and operational efficiency.