[Reading] Weekly Reading Selections 2026-05-18 → 2026-05-24

Posted by Cao Zihang on May 25, 2026

Academic

HippoRAG

One-sentence positioning: A retrieval framework inspired by the hippocampal indexing theory that extracts an open knowledge graph via LLM and combines it with the Personalized PageRank algorithm to achieve cross-passage multi-hop knowledge integration in a single retrieval step.

Key innovation: Uses Personalized PageRank to perform pattern completion on the knowledge graph, compressing traditional iterative multi-hop retrieval into a single-step graph traversal, achieving performance comparable to or better than iterative retrieval while reducing cost by 10-30x and improving speed by 6-13x.


0. Execution Overview

Offline Phase (one-time construction)
  ├─ ① Named entity recognition (extract named entities from each passage)
  ├─ ② OpenIE extraction of KG triples (two-stage: NER → OpenIE extracts noun phrase nodes N and relation edges E)
  ├─ ③ Retrieval encoder supplements synonym edges (add E' between entities with cosine similarity > τ)
  └─ ④ Construct Passage-Node co-occurrence matrix P (|N| × |P|, recording each noun phrase's occurrence count in each passage)
         ↓
Online Phase (per query)
  ├─ ⑤ Query entity extraction (LLM extracts salient named entities from query)
  ├─ ⑥ Query node mapping (entity vectorized encoding, mapped to most similar KG node via cosine similarity)
  ├─ ⑦ Run Personalized PageRank (query nodes as seeds, PPR distributes probability on graph)
  │      - Initialization: query nodes equal probability, rest are 0
  │      - Node-specific weighting: si = |Pi|^{-1} (similar to TF-IDF)
  │      - Transition probability constructed via adjacency matrix
  └─ ⑧ Aggregate PPR node probabilities with co-occurrence matrix P to obtain passage ranking scores
         ↓
  ⑨ Retrieved passages input to LLM for final answer generation

1. High-level Design (Indexing → Retrieval → Generation)

1.1 Indexing

Dimension | Approach
Chunking strategy | Processed by passage (notes do not document chunking parameters)
Index structure | Graph index (open KG + synonym edges + Passage-Node co-occurrence matrix)
Knowledge representation | Entity-relation graph (schemaless OpenIE triples) + synonym relations + co-occurrence statistics
Construction cost | Medium (LLM two-stage extraction + encoder similarity computation)
Core characteristic | Artificial neocortex (LLM) handles extraction, artificial hippocampus (open KG) handles indexing, parahippocampal region (retrieval encoder) handles connection

1.2 Retrieval

Dimension | Approach
Retrieval method | Graph traversal (Personalized PageRank)
Retrieval granularity | Passage
Iteration strategy | Single retrieval (PPR achieves multi-hop effect in a single step)
Query processing | Entity extraction → Query node mapping (cosine similarity)
Core characteristic | PPR uses query nodes as seeds to diffuse probability on the graph, achieving in one step what traditional methods need several iterations for

1.3 Generation

Dimension | Approach
Context injection | Retrieved and ranked passages as context input to LLM
Citation tracing | Based on Passage-Node co-occurrence matrix associating nodes with original passages
Quality control | Notes do not explicitly document special mechanisms in the generation phase
Core characteristic | Retrieval framework positioning; generation phase relies on standard LLM generation

2. Offline Construction: Indexing (Detailed Execution)

Step 2.1 Named Entity Recognition

Item | Description
Input | Raw passage collection P
Operation | Extract named entities from each passage
Key decision | First step of two-stage extraction: extract named entities first, then add them to the OpenIE prompt, balancing generality and named entity bias
Output | Named entity set for each passage

Step 2.2 OpenIE Extraction of KG Triples

Item | Description
Input | Passages + named entities
Operation | LLM performs open information extraction (OpenIE) via 1-shot prompting, extracting noun phrase nodes N and relation edges E
Key decision | Add named entities to the OpenIE prompt to extract final triples containing concepts beyond named entities (noun phrases)
Output | Nodes N and edges E of schemaless open KG
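
The two-stage flow of Steps 2.1 and 2.2 can be sketched as below. This is only an illustration: `extract_triples`, `toy_ner`, and `toy_openie` are hypothetical names, and the two toy functions are hard-coded stand-ins for the LLM calls, not the prompts used by HippoRAG.

```python
def extract_triples(passage, ner, openie):
    """Two-stage extraction: NER first, then OpenIE conditioned on the entities."""
    entities = ner(passage)
    triples = openie(passage, entities)
    # Nodes are every subject and object phrase that appears in a triple.
    nodes = {t[0] for t in triples} | {t[2] for t in triples}
    return entities, triples, nodes


def toy_ner(passage):
    # Hypothetical stand-in for the LLM NER call: capitalized words only.
    return [w.strip(".,") for w in passage.split() if w[0].isupper()]


def toy_openie(passage, entities):
    # Hypothetical stand-in: a real call would place `entities` in the
    # 1-shot OpenIE prompt and parse the triples the LLM returns.
    return [
        ("Thomas", "researches", "memory"),
        ("Thomas", "located at", "Stanford"),
    ]


entities, triples, nodes = extract_triples(
    "Thomas researches memory at Stanford.", toy_ner, toy_openie
)
```

In this toy run `nodes` contains "memory", a noun phrase that is not a named entity, which is the kind of concept the second stage is meant to recover.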

Step 2.3 Supplement Synonym Edges

Item | Description
Input | Node set N + retrieval encoder M
Operation | Compute cosine similarity between node representations; add synonym relation edges E’ when similarity exceeds threshold τ
Key decision | Uses off-the-shelf dense retrieval encoders to establish additional edges between similar but non-identical noun phrases, aiding downstream pattern completion
Output | Extended edge set (E + E’)
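
A minimal sketch of the synonym-edge step, assuming toy 3-dimensional vectors in place of real encoder output. The function names, the vectors, and the value `tau=0.8` are illustrative assumptions, not values taken from the notes.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def synonym_edges(node_vecs, tau):
    """Return pairs of distinct nodes whose cosine similarity exceeds tau."""
    edges = []
    for a, b in combinations(sorted(node_vecs), 2):
        if cosine(node_vecs[a], node_vecs[b]) > tau:
            edges.append((a, b))
    return edges


# Hypothetical toy embeddings; a real system would call the retrieval encoder.
node_vecs = {
    "Alzheimer's": [0.0, 0.8, 0.6],
    "Alzheimer's disease": [0.05, 0.82, 0.57],
    "Stanford": [0.9, 0.1, 0.0],
}
extra_edges = synonym_edges(node_vecs, tau=0.8)
```

The all-pairs loop is quadratic in the number of nodes; it is used here only to keep the sketch short.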

Step 2.4 Construct Passage-Node Co-occurrence Matrix

Item | Description
Input | Final node set N + original passages P
Operation | Count the occurrence of each noun phrase in each original passage
Output | Co-occurrence matrix P with |N| rows and |P| columns (recording each noun phrase’s occurrence count in each passage)
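
As a rough illustration of Step 2.4, the matrix can be built by counting phrase occurrences per passage. The sketch below uses case-insensitive substring counting, which is a simplifying assumption; the notes do not say how occurrences are matched.

```python
def cooccurrence_matrix(nodes, passages):
    """Rows are noun-phrase nodes, columns are passages, entries are counts."""
    return [[p.lower().count(n.lower()) for p in passages] for n in nodes]


# Toy data, invented for illustration.
nodes = ["stanford", "alzheimer's", "thomas"]
passages = [
    "Thomas works at Stanford. Stanford funds his lab.",
    "Thomas studies Alzheimer's.",
]
P = cooccurrence_matrix(nodes, passages)
# P has |N| = 3 rows and |P| = 2 columns.
```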

3. Online Query: Retrieval (Detailed Execution)

3.1 Retrieval Procedure

Step 3.1 Query Entity Extraction
  • Operation: LLM extracts salient named entities (query named entities) from the query
  • Purpose: Transform natural language query into seed nodes on the graph
Step 3.2 Query Node Mapping
  • Operation: Vectorize query named entities using the retrieval encoder, map to most similar nodes in the KG based on cosine similarity
  • Output: Query nodes
Step 3.3 Personalized PageRank Execution
  • Initialize personalized probability distribution: all query nodes receive equal probability; every other node starts at 0
  • Node-specific weighting: si = |Pi|^{-1}, where |Pi| is the number of passages in which node i appears; this plays the same role as inverse document frequency in TF-IDF
  • Transition probability: Constructed via adjacency matrix
  • Core mechanism: PPR distributes probability on the graph only through user-defined source nodes (query nodes), simulating pattern completion in hippocampal neural pathways
Step 3.4 Passage Ranking
  • Operation: Multiply the updated PPR node probability distribution with the co-occurrence matrix P
  • Output: Final ranking score for each passage
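
Steps 3.3 and 3.4 can be sketched with a plain power-iteration PPR. Everything below is a toy under stated assumptions: the graph, the matrix `P`, and the damping factor of 0.5 are invented for illustration, and the node-specific weight is assumed to scale each query node's seed probability before the walk.

```python
def personalized_pagerank(adj, seeds, damping=0.5, iters=100):
    """Power iteration; with probability 1 - damping the walk restarts at a seed."""
    nodes = list(adj)
    total = sum(seeds.values())
    reset = {n: seeds.get(n, 0.0) / total for n in nodes}
    rank = dict(reset)
    for _ in range(iters):
        nxt = {n: (1.0 - damping) * reset[n] for n in nodes}
        for n in nodes:
            if adj[n]:
                share = damping * rank[n] / len(adj[n])
                for m in adj[n]:
                    nxt[m] += share
            else:
                # Dangling node: hand its mass back to the seed distribution.
                for m in nodes:
                    nxt[m] += damping * rank[n] * reset[m]
        rank = nxt
    return rank


# Toy graph (assumed): two connected passages and one unrelated passage.
nodes = ["stanford", "thomas", "alzheimer's", "paris", "louvre"]
adj = {
    "stanford": ["thomas"],
    "thomas": ["stanford", "alzheimer's"],
    "alzheimer's": ["thomas"],
    "paris": ["louvre"],
    "louvre": ["paris"],
}
# Co-occurrence matrix: one row per node, one column per passage.
P = [[1, 0, 0], [1, 1, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1]]

query_nodes = ["stanford", "alzheimer's"]
# Node specificity si = 1 / |Pi|, with |Pi| the number of passages containing node i.
specificity = {n: 1.0 / sum(1 for c in row if c > 0) for n, row in zip(nodes, P)}
seeds = {n: specificity[n] / len(query_nodes) for n in query_nodes}

rank = personalized_pagerank(adj, seeds)
# Passage score = sum over nodes of (PPR probability x occurrence count).
passage_scores = [
    sum(rank[n] * row[j] for n, row in zip(nodes, P)) for j in range(3)
]
ranking = sorted(range(3), key=lambda j: passage_scores[j], reverse=True)
```

Although the query only seeds "stanford" and "alzheimer's", probability also flows to "thomas", the bridge node, so both passages that mention it outrank the unrelated third passage. This is the single-step multi-hop effect described above.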

4. Online Generation: Generation (Detailed Execution)

HippoRAG is positioned as a retrieval framework; the notes do not explicitly document special design in the generation phase. Retrieved and ranked passages serve as context input to the downstream LLM for answer generation.

User query
    │
    ▼
┌─────────────────┐
│ Query entity    │  ← LLM extracts named entities
│ extraction      │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Query node      │  ← Cosine similarity matches KG nodes
│ mapping         │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ PPR probability │  ← Execute Personalized PageRank on graph
│ diffusion       │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Passage ranking │  ← PPR probability × co-occurrence matrix P
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ LLM generates   │  ← Retrieved passages as context
│ answer          │
└─────────────────┘

5. Key Design Decisions

Decision Point | HippoRAG’s Choice | Alternative | Rationale
Index structure | Open KG + synonym edges + co-occurrence matrix | Pure vector index / structured KG / text chunk index | Schemaless OpenIE flexibly adapts to any corpus; co-occurrence matrix associates nodes with original passages
Retrieval algorithm | Personalized PageRank single-step graph traversal | Iterative multi-hop retrieval (e.g., IRCoT) | PPR achieves multi-hop effect in one step, reducing cost by 10-30x and improving speed by 6-13x
Entity extraction | Two-stage: NER → OpenIE | Single-stage OpenIE | Balances generality and named entity bias
Synonym edge construction | Retrieval encoder cosine similarity | Exact string matching / semantic models | Establishes connections between similar but non-identical phrases, aiding pattern completion
Node-specific weighting | si = |Pi|^{-1} (similar to TF-IDF) | Uniform weighting / other weighting strategies | Reduces weight of high-frequency nodes, improving retrieval precision

6. Evaluation

6.1 Evaluation Metrics

Metric | Meaning | System vs. Baseline
recall@2 / recall@5 | Retrieval recall | Superior to traditional RAG methods
EM | Exact match | Single-step retrieval comparable to or better than IRCoT
F1 | F1 score | Up to 20% improvement over SOTA
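
For reference, the three metrics can be computed roughly as follows. This is a simplified sketch: answer normalization here is only lowercasing and whitespace splitting, whereas QA evaluation scripts usually also strip punctuation and articles. The function names and sample values are illustrative.

```python
from collections import Counter


def recall_at_k(ranked_ids, gold_ids, k):
    """Fraction of gold passages that appear among the top-k retrieved ones."""
    top = set(ranked_ids[:k])
    return sum(1 for g in gold_ids if g in top) / len(gold_ids)


def exact_match(prediction, gold):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(prediction.strip().lower() == gold.strip().lower())


def token_f1(prediction, gold):
    """Harmonic mean of token-level precision and recall."""
    pred, ref = prediction.lower().split(), gold.lower().split()
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(pred), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


# Toy values, invented for illustration.
retrieved = ["p3", "p1", "p7", "p2"]
gold_passages = ["p1", "p2"]
r2 = recall_at_k(retrieved, gold_passages, 2)
r5 = recall_at_k(retrieved, gold_passages, 5)
```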

6.2 Comparative Experimental Setup

Condition | Description
HippoRAG | Full system (OpenIE KG + PPR retrieval)
BM25 | Sparse retrieval baseline
Contriever / GTR / ColBERTv2 | Dense retrieval baselines
Propositionizer | Proposition-level retrieval baseline
RAPTOR | Hierarchical summary retrieval baseline
IRCoT | Iterative retrieval baseline

Benchmark datasets: MuSiQue, 2WikiMultiHopQA, HotpotQA

Key findings:

  • Single-step retrieval achieves performance comparable to or better than iterative retrieval IRCoT
  • Cost is 1/10 to 1/30 that of IRCoT, and retrieval is 6-13x faster
  • Integration into IRCoT yields further improvements

7. Limitations and Applicability

Limitation | Specific Manifestation | Mitigation
NER design limitation | Cannot extract sufficient information from queries for retrieval, accounting for approximately half of all errors | Improve NER module design; consider richer query understanding mechanisms
Entity-centric bias | Strong bias toward concepts, many contextual signals not utilized | Introduce context-aware retrieval mechanisms
Missing contextual cues | Ignoring contextual cues accounts for approximately 48% of errors | Incorporate more contextual information in indexing and retrieval

Best Applicable Scenarios

  • Multi-hop QA tasks requiring cross-passage knowledge integration
  • Scenarios sensitive to retrieval latency (single-step PPR is 6-13x faster than iterative retrieval)
  • Cost-constrained environments (10-30x cheaper than iterative retrieval)
  • Scenarios where query information can primarily be expressed through entity relations

Unsuitable Scenarios

  • Scenarios where key information in queries cannot be extracted via NER
  • Queries requiring extensive contextual cues rather than entity relations
  • Scenarios with extremely high demands on fine-grained semantic differences between concepts

8. Quick Reference

What You Want to Know | See Which Section
What is the complete pipeline? | 0. Execution Overview
High-level design comparison? | 1. High-level Design
How is the index constructed? | 2. Offline Construction
How are queries answered? | 3. Online Query + 4. Online Generation
Why is it better than iterative retrieval? | 5. Key Design Decisions
How does it perform? | 6. Evaluation
When should it NOT be used? | 7. Limitations and Applicability

HippoRAG 2

One-sentence positioning: An improved version of HippoRAG that introduces passage nodes into the knowledge graph to achieve dense-sparse fusion of concepts and context, combining Query-to-Triple retrieval with LLM filtering to comprehensively outperform standard RAG on factual memory, sense-making, and associative memory tasks.

Key innovation: Introduces passage nodes on top of PPR to achieve dense-sparse fusion of concepts and context, and through the Query-to-Triple + LLM filtering retrieval strategy, addresses the performance degradation of knowledge graph RAG on factual memory tasks.


0. Execution Overview

Offline Phase (one-time construction)
  ├─ ① Entity extraction
  ├─ ② Extract KG triples (OpenIE)
  ├─ ③ Synonym edge completion (cosine similarity > τ)
  └─ ④ Dense-Sparse fusion (core design)
         - Sparse layer: phrase nodes
         - Dense layer: complete vector representation of each passage abstracted as passage node
         - Bridging: passage nodes connected to corresponding entities via contains edges
         ↓
Online Phase (per query)
  ├─ ⑤ Query embedding
  ├─ ⑥ Vector recall
  │      - top-k passages (embedding similarity)
  │      - top-k KG triples (Query to Triple)
  ├─ ⑦ LLM filters triples (produces T' ⊆ T)
  ├─ ⑧ Seed node selection
  │      - If filtered set is empty → directly return top-k passages
  │      - Otherwise
  │        · Phrase nodes: identified from T', top-k selected by average ranking score, probabilities assigned by normalized ranking scores
  │        · Passage nodes: all recalled passages serve as seeds, probability = embedding similarity × weight factor
  ├─ ⑨ Merge & global normalization
  ├─ ⑩ Execute Personalized PageRank
  └─ ⑪ Sort and output retrieval results
         ↓
  ⑫ Retrieved passages input to LLM for final answer generation

1. High-level Design (Indexing → Retrieval → Generation)

1.1 Indexing

Dimension | Approach
Chunking strategy | Processed by passage
Index structure | Graph index (open KG + passage nodes + synonym edges + contains edges)
Knowledge representation | Entity-relation graph + passage nodes + dense-sparse fusion
Construction cost | Medium (LLM extraction + encoder similarity computation + passage vector encoding)
Core characteristic | Dense-Sparse fusion: phrase nodes as sparse encoding, passage nodes as dense encoding, contains edges bridge the two

1.2 Retrieval

Dimension | Approach
Retrieval method | Hybrid (graph traversal PPR + vector recall of triples/passages)
Retrieval granularity | Passage
Iteration strategy | Single retrieval (PPR single-step achieves multi-hop effect)
Query processing | Query to Triple (embedding recall of top-k triples + LLM filtering)
Core characteristic | Query-to-Triple incorporates richer contextual information from KG; phrase + passage nodes jointly serve as PPR seeds

1.3 Generation

Dimension | Approach
Context injection | Retrieved and ranked passages as context input to LLM
Citation tracing | Based on passage-node associations
Quality control | Notes do not explicitly document special mechanisms in the generation phase
Core characteristic | Retrieval framework positioning; generation phase relies on standard LLM generation

2. Offline Construction: Indexing (Detailed Execution)

Step 2.1 Entity Extraction

Item | Description
Input | Raw passages
Operation | Extract named entities
Output | Named entity set

Step 2.2 Extract KG Triples

Item | Description
Input | Passages + named entities
Operation | Extract noun phrase nodes and relation edges via OpenIE
Output | Nodes and edges of schemaless open KG

Step 2.3 Synonym Edge Completion

Item | Description
Input | Node set + retrieval encoder
Operation | Add synonym edges between entities with cosine similarity above threshold τ
Output | Extended edge set

Step 2.4 Dense-Sparse Fusion (Core Design)

Item | Description
Input | KG nodes + passages
Operation | Represent each passage as a passage node that carries the passage's full vector representation; connect each passage node to the entities it contains via contains edges
Key decision | Inspired by the brain’s dense-sparse integration: phrase nodes act as a sparse encoding of extracted concepts, passage nodes as a dense encoding of the context those concepts come from
Output | Enhanced KG (containing phrase nodes, passage nodes, contains edges)

Why this approach? Concepts are concise and generalizable but lose information; context is semantically rich but adds complexity. Dense-sparse fusion retains the advantages of both, resolving the concept-context trade-off.
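
A minimal sketch of the fused graph, assuming triples have already been extracted per passage. The function name, the `("phrase", ...)` / `("passage", ...)` node keys, and the toy triples are all illustrative; passage embeddings would be stored alongside the passage nodes and are omitted here.

```python
def build_fused_graph(triples_by_passage):
    """Undirected graph with phrase nodes, passage nodes, and contains edges."""
    adj = {}

    def add_edge(a, b):
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    for pid, triples in triples_by_passage.items():
        passage_node = ("passage", pid)
        adj.setdefault(passage_node, set())
        for subj, _rel, obj in triples:
            s, o = ("phrase", subj), ("phrase", obj)
            add_edge(s, o)             # relation edge between phrase nodes
            add_edge(passage_node, s)  # contains edge
            add_edge(passage_node, o)  # contains edge
    return adj


# Toy triples, invented for illustration.
triples_by_passage = {
    "p0": [("thomas", "works at", "stanford")],
    "p1": [("thomas", "studies", "alzheimer's")],
}
graph = build_fused_graph(triples_by_passage)
```

Because passages are now nodes in the same graph, a random walk that reaches a phrase node can continue into the passages that contain it.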


3. Online Query: Retrieval (Detailed Execution)

3.1 Retrieval Mode Overview

HippoRAG 2 supports three query mapping methods, with Query to Triple as the default:

Method | Mechanism | Description
NER to Node | Extract query entities → embedding matches KG nodes | HippoRAG’s original method
Query to Node | Entire query embedding directly matches KG nodes | Does not extract individual entities
Query to Triple | Entire query embedding matches triples in the graph | Default, incorporates richer contextual information
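
Query to Triple can be pictured as nearest-neighbour search over triple embeddings. The sketch below assumes toy 3-dimensional vectors for the query and for each whole triple; in practice both would come from the retrieval encoder. `top_k_triples` is a hypothetical helper name.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k_triples(query_vec, triple_vecs, k):
    """Rank triples by similarity of their embedding to the query embedding."""
    ranked = sorted(
        triple_vecs,
        key=lambda t: cosine(query_vec, triple_vecs[t]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy embeddings of whole triples.
triple_vecs = {
    ("thomas", "works at", "stanford"): [0.9, 0.3, 0.1],
    ("thomas", "studies", "alzheimer's"): [0.2, 0.9, 0.2],
    ("louvre", "located in", "paris"): [0.0, 0.1, 0.95],
}
query_vec = [0.7, 0.6, 0.0]
top2 = top_k_triples(query_vec, triple_vecs, k=2)
```

Matching against whole triples rather than single nodes lets the relation text contribute to the similarity, which is the "richer contextual information" the table refers to.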

3.2 Retrieval Procedure

Step 3.1 Query Embedding
  • Operation: Encode the query as a vector representation
Step 3.2 Vector Recall
  • top-k passages: Recalled via embedding similarity
  • top-k triples: Query to Triple, recall triples in the graph via embedding similarity
Step 3.3 LLM Filters Triples
  • Operation: Use LLM to filter retrieved triples T, producing T’ ⊆ T
  • Purpose: Improve retrieval quality, remove irrelevant triples
Step 3.4 Seed Node Selection
  • If filtered set is empty: Directly return embedding-recalled top-k passages
  • Otherwise:
    • Phrase nodes: Identify phrase nodes from filtered triples T’, select top-k by average ranking score, assign probabilities based on normalized ranking scores
    • Passage nodes: All recalled passage nodes also serve as seed nodes, with initial probability being embedding similarity multiplied by a weight factor (§6.2), balancing the influence of phrase nodes and passage nodes
Step 3.5 Merge and Global Normalization
  • Merge probability distributions of phrase nodes and passage nodes
  • Global normalization
Step 3.6 Personalized PageRank
  • Execute PPR using the merged seed node probability distribution as the personalized vector
Step 3.7 Sort and Output
  • PPR output node probabilities multiplied by the co-occurrence matrix to obtain final passage ranking scores
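
Steps 3.4 and 3.5 can be sketched as a single function that builds the personalized vector for PPR. This is an illustration under assumptions: `triple_scores` stands for the embedding ranking scores of the filtered triples, and the default `passage_weight=0.05` is a placeholder, since the notes only mention "a weight factor".

```python
def build_seed_distribution(filtered_triples, triple_scores, passage_sims,
                            top_k_phrases=5, passage_weight=0.05):
    """Merge phrase-node and passage-node seeds into one normalized distribution.

    Returns None when no triple survives filtering, which signals the caller
    to fall back to the embedding-recalled top-k passages.
    """
    if not filtered_triples:
        return None
    # Average ranking score of each phrase over the triples it appears in.
    totals, counts = {}, {}
    for triple in filtered_triples:
        subj, _rel, obj = triple
        for phrase in (subj, obj):
            totals[phrase] = totals.get(phrase, 0.0) + triple_scores[triple]
            counts[phrase] = counts.get(phrase, 0) + 1
    avg = {p: totals[p] / counts[p] for p in totals}
    top = sorted(avg, key=avg.get, reverse=True)[:top_k_phrases]
    phrase_total = sum(avg[p] for p in top)
    seeds = {("phrase", p): avg[p] / phrase_total for p in top}
    # Every recalled passage is also a seed, scaled down by the weight factor.
    for pid, sim in passage_sims.items():
        seeds[("passage", pid)] = sim * passage_weight
    # Global normalization over phrase and passage seeds together.
    z = sum(seeds.values())
    return {node: prob / z for node, prob in seeds.items()}


# Toy inputs, invented for illustration.
kept = [("thomas", "works at", "stanford")]
scores = {kept[0]: 0.9}
passage_sims = {"p0": 0.8, "p1": 0.4}
seeds = build_seed_distribution(kept, scores, passage_sims)
fallback = build_seed_distribution([], {}, passage_sims)
```

The returned dictionary plays the role of the personalized vector in Step 3.6; a small weight factor keeps passage seeds from drowning out the phrase seeds.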

4. Online Generation: Generation (Detailed Execution)

HippoRAG 2 is positioned as a retrieval framework; the notes do not explicitly document special design in the generation phase. Retrieved and ranked passages serve as context input to the downstream LLM for answer generation.

User query
    │
    ▼
┌─────────────────┐
│ Query embedding │
└────────┬────────┘
         │
    ┌────┴────┐
    ▼         ▼
┌────────┐ ┌─────────────┐
│ top-k  │ │ top-k       │
│passages│ │ triples     │
└───┬────┘ └──────┬──────┘
    │             │
    │             ▼
    │      ┌─────────────┐
    │      │ LLM filters │
    │      └──────┬──────┘
    │             │
    │      ┌──────┴──────┐
    │      ▼             ▼
    │  ┌────────┐   ┌──────────┐
    │  │ T' is  │   │ T' is    │
    │  │ empty  │   │ non-empty│
    │  └───┬────┘   └────┬─────┘
    │      │             │
    ▼      ▼             ▼
┌─────────────────────────────────┐
│ Directly return passages        │   ← T' is empty
│ or                              │
│ Phrase nodes + Passage nodes    │   ← T' is non-empty
│ → merge & normalize → PPR       │
└─────────────────────────────────┘
         │
         ▼
┌─────────────────┐
│ Passage ranking │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ LLM generates   │
│ answer          │
└─────────────────┘

5. Key Design Decisions

Decision Point | HippoRAG 2’s Choice | Alternative | Rationale
Concept-context fusion | Dense-Sparse fusion (passage nodes + contains edges) | Pure concepts (phrase nodes) / pure context | Resolves concept-context trade-off, retaining conceptual conciseness while incorporating contextual richness
Retrieval strategy | Query to Triple + LLM filtering | NER to Node / Query to Node | Incorporates richer contextual information from KG
Seed nodes | Phrase nodes + Passage nodes jointly as PPR seeds | Phrase nodes only (HippoRAG) | Broader activation improves multi-hop reasoning capability
Passage node probability | Embedding similarity × weight factor | Ranking scores / uniform distribution | Balances the influence between phrase nodes and passage nodes

6. Evaluation

6.1 Evaluation Metrics

Metric | Meaning | System vs. Baseline
recall@5 | Retrieval recall | Superior to standard RAG and HippoRAG
F1 | F1 score | Comprehensively superior to standard RAG on factual, sense-making, and associative memory tasks

6.2 Comparative Experimental Setup

Condition | Description
HippoRAG 2 | Full system (Dense-Sparse fusion + Query to Triple + LLM filtering)
BM25 | Sparse retrieval baseline
Contriever / GTR | Dense retrieval baselines
RAPTOR | Hierarchical summary retrieval baseline
GraphRAG / LightRAG | Graph RAG baselines
HippoRAG | Predecessor method baseline

Datasets:

Task Type | Datasets
Simple QA | NaturalQuestions, PopQA
Multi-hop QA | MuSiQue, 2WikiMultiHopQA, HotpotQA, LV-Eval
Discourse Understanding | NarrativeQA

Key findings:

  • Comprehensively superior to standard RAG on factual memory, sense-making, and associative memory tasks
  • 7% improvement over SOTA embedding models on associative memory tasks
  • Resolves HippoRAG’s performance degradation on factual memory tasks

7. Limitations and Applicability

Notes do not explicitly document specific limitations. From a design perspective, while the concept-context trade-off is mitigated through fusion, the dense-sparse weight balancing still relies on hyperparameter tuning.

Best Applicable Scenarios

  • RAG scenarios requiring simultaneous consideration of factual memory and associative reasoning
  • Environments requiring multi-hop QA without bearing the high cost of iterative retrieval
  • Scenarios where query information can be expressed through both entity relations and surrounding context

Unsuitable Scenarios

  • Notes do not explicitly document unsuitable scenarios

8. Quick Reference

What You Want to Know | See Which Section
What is the complete pipeline? | 0. Execution Overview
High-level design comparison? | 1. High-level Design
How is the index constructed? | 2. Offline Construction
How does the retrieval strategy work? | 3. Online Query
How does it differ from HippoRAG? | 5. Key Design Decisions
How does it perform? | 6. Evaluation