First is Not Really Better Than Last: Evaluating Layer Choice and Aggregation Strategies in Language Model Data Influence Estimation
Published 6 Nov 2025 · arXiv · Dmytro Vitel
Overview
The paper examines which layers of a language model yield the most reliable data influence estimates, challenging the common practice of treating the first layers as the best choice. It presents theoretical and empirical evidence that middle attention layers are more effective, and introduces new score-aggregation strategies together with a metric for evaluating them.
Key Insights
- Middle Layers Superior: Middle attention layers yield more accurate data influence estimates than first layers.
  - Evidence: Theoretical and empirical analysis.
  - Verifiable: Yes
- New Aggregation Methods: Ranking- and vote-based aggregation of per-layer scores outperforms standard averaging.
  - Evidence: Experimental results.
  - Verifiable: Yes
- Noise Detection Rate (NDR): A new metric for evaluating the efficacy of influence scores.
  - Evidence: Demonstrated strong predictive capability.
  - Verifiable: Yes
BFSI Relevance
- Why Relevant: Tracing which training data drives model behavior supports auditing of AI systems in BFSI, helping institutions demonstrate transparency and regulatory compliance.
- Primary Sector: Financial Services
- Subsectors: Asset Management, Risk Management
- Actionable Implications:
  - Implement the new aggregation methods for better model auditing.
  - Use NDR for evaluating AI model influence in compliance checks.
researcher peer-reviewed-paper cross-bfsi technology-and-data global