<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Cao Zihang - Blog</title>
    <description>Cao Zihang's academic blog | Closing in on the uncharted territory of human knowledge</description>
    <link>https://www.caozihang.com//</link>
    <atom:link href="https://www.caozihang.com//feed.xml" rel="self" type="application/rss+xml" />
    <pubDate>Fri, 05 Jun 2026 02:15:19 +0000</pubDate>
    <lastBuildDate>Fri, 05 Jun 2026 02:15:19 +0000</lastBuildDate>
    <generator>Jekyll v3.10.0</generator>
    
      <item>
        <title>[Reading] Weekly Reading Selections 2026-05-25 → 2026-05-31</title>
        <description>&lt;center style=&quot;margin-bottom: 20px; margin-top: 50px&quot;&gt;&lt;font color=&quot;#3879B1&quot; style=&quot;line-height: 1.4;font-weight: 700;font-size: 36px;box-sizing: border-box; &quot;&gt;Weekly Reading Selections&lt;/font&gt;&lt;/center&gt;

&lt;center style=&quot; margin-bottom: 30px;&quot;&gt;2026-05-25 → 2026-05-31&lt;/center&gt;

&lt;font style=&quot;font-weight: bold;&quot;&gt;Contents&lt;/font&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#学术相关&quot; id=&quot;markdown-toc-学术相关&quot;&gt;Academic&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#intern-atlas&quot; id=&quot;markdown-toc-intern-atlas&quot;&gt;Intern-Atlas&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#0-execution-overview&quot; id=&quot;markdown-toc-0-execution-overview&quot;&gt;0. Execution Overview&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#1-high-level-design-indexing--retrieval--generation&quot; id=&quot;markdown-toc-1-high-level-design-indexing--retrieval--generation&quot;&gt;1. High-level Design (Indexing → Retrieval → Generation)&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#2-offline-construction-indexing-detailed-execution&quot; id=&quot;markdown-toc-2-offline-construction-indexing-detailed-execution&quot;&gt;2. Offline Construction: Indexing (Detailed Execution)&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#3-online-query-retrieval-detailed-execution&quot; id=&quot;markdown-toc-3-online-query-retrieval-detailed-execution&quot;&gt;3. Online Query: Retrieval (Detailed Execution)&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#4-online-generation-generation-detailed-execution&quot; id=&quot;markdown-toc-4-online-generation-generation-detailed-execution&quot;&gt;4. Online Generation: Generation (Detailed Execution)&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#5-key-design-decisions&quot; id=&quot;markdown-toc-5-key-design-decisions&quot;&gt;5. Key Design Decisions&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#6-evaluation&quot; id=&quot;markdown-toc-6-evaluation&quot;&gt;6. Evaluation&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#7-limitations-and-applicability&quot; id=&quot;markdown-toc-7-limitations-and-applicability&quot;&gt;7. Limitations and Applicability&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#8-quick-reference&quot; id=&quot;markdown-toc-8-quick-reference&quot;&gt;8. Quick Reference&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#search-r1&quot; id=&quot;markdown-toc-search-r1&quot;&gt;Search-R1&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#0-execution-overview-1&quot; id=&quot;markdown-toc-0-execution-overview-1&quot;&gt;0. Execution Overview&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#1-high-level-design-indexing--retrieval--generation-1&quot; id=&quot;markdown-toc-1-high-level-design-indexing--retrieval--generation-1&quot;&gt;1. High-level Design (Indexing → Retrieval → Generation)&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#2-offline-construction-rl-training-detailed-execution&quot; id=&quot;markdown-toc-2-offline-construction-rl-training-detailed-execution&quot;&gt;2. Offline Construction: RL Training (Detailed Execution)&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#3-online-query-retrieval-detailed-execution-1&quot; id=&quot;markdown-toc-3-online-query-retrieval-detailed-execution-1&quot;&gt;3. Online Query: Retrieval (Detailed Execution)&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#4-online-generation-generation-detailed-execution-1&quot; id=&quot;markdown-toc-4-online-generation-generation-detailed-execution-1&quot;&gt;4. Online Generation: Generation (Detailed Execution)&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#5-key-design-decisions-1&quot; id=&quot;markdown-toc-5-key-design-decisions-1&quot;&gt;5. Key Design Decisions&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#6-evaluation-1&quot; id=&quot;markdown-toc-6-evaluation-1&quot;&gt;6. Evaluation&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#7-limitations-and-applicability-1&quot; id=&quot;markdown-toc-7-limitations-and-applicability-1&quot;&gt;7. Limitations and Applicability&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#8-quick-reference-1&quot; id=&quot;markdown-toc-8-quick-reference-1&quot;&gt;8. Quick Reference&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;学术相关&quot;&gt;Academic&lt;/h1&gt;

&lt;h2 id=&quot;intern-atlas&quot;&gt;Intern-Atlas&lt;/h2&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;One-sentence positioning&lt;/strong&gt;: Extracts “method entities + typed causal edges + bottleneck/mechanism evidence” from 1M+ AI papers to construct a queryable methodological evolution graph, serving as the underlying knowledge infrastructure for AI research agents.&lt;/p&gt;

  &lt;p&gt;&lt;strong&gt;Key innovation&lt;/strong&gt;: Upgrades flat citation networks into a “method-method causal graph” where each causal edge is accompanied by verbatim citations and structured bottleneck/mechanism annotations; proposes the SGT-MCTS algorithm to reconstruct method evolution lineages on this graph, enabling idea evaluation and generation based on explicit structural evidence rather than LLM parametric memory.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;0-execution-overview&quot;&gt;0. Execution Overview&lt;/h3&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Offline Phase (graph construction)
  ├─ ① Paper corpus processing (1,030,314 papers)
  ├─ ② Method entity extraction + alias resolution
  │     ├─ Seed method curation
  │     ├─ LLM Proposer scanning expansion
  │     └─ Alias registry A: 8,155 canonical / 9,545 aliases
  ├─ ③ Citation edge semantic classification (7 types)
  │     ├─ strong-causal: extends / improves / replaces / adapts
  │     ├─ non-strong: uses_component / compares
  │     └─ non-causal: background
  └─ ④ Causal edge evidence filling
        ├─ 14-category bottleneck taxonomy b_e
        ├─ Mechanism m_e / trade-off t_e
        ├─ LLM confidence c_e ∈ [0,1]
        └─ Verbatim citation grounding
              ↓
Online Phase (retrieval + generation)
  ├─ ⑤ Node matching (keywords + BM25)
  ├─ ⑥ Lineage Reconstruction: SGT-MCTS
  │     ├─ Forward/backward dual trees
  │     ├─ SGT-UCT selection = standard UCT + λ·graph-aware prior
  │     ├─ Temporal coherence TC + edge confidence conf joint prior
  │     └─ Branch discovery + Jaccard deduplication
  ├─ ⑦ Lineage Rerank (length + evidence strength + search consensus)
  └─ ⑧ Generation layer
        ├─ Graph-Grounded Idea Evaluator (5-dimension scoring)
        └─ Graph-Grounded Idea Generator (4 strategy types + evidence certificate)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;1-high-level-design-indexing--retrieval--generation&quot;&gt;1. High-level Design (Indexing → Retrieval → Generation)&lt;/h3&gt;

&lt;h4 id=&quot;11-indexing&quot;&gt;1.1 Indexing&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Dimension&lt;/th&gt;
      &lt;th&gt;Approach&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Chunking strategy&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Method-level entity extraction, replacing traditional paper/paragraph-level extraction&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Index structure&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Method evolution graph: method nodes + typed causal edges + structured evidence attributes&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Knowledge representation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Directed causal network: nodes = methods/papers/stubs, edges = 7 semantic types (extends/improves/replaces/adapts/uses_component/compares/background), each edge carrying bottleneck/mechanism/trade-off/confidence/verbatim citation&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Construction cost&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;High: 1M+ paper corpus processing, LLM extraction (Qwen3.6-35B-A3B), alias resolution, edge classification, evidence filling&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Core characteristic&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Method-level atomic units replace paper-level units; every causal edge must carry verbatim citations and structured evidence&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;12-retrieval&quot;&gt;1.2 Retrieval&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Dimension&lt;/th&gt;
      &lt;th&gt;Approach&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Retrieval method&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Graph traversal (SGT-MCTS searches evolution paths on the strong-causal subgraph)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Retrieval granularity&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Method nodes + Evolution paths (evolution chains)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Iteration strategy&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Multi-hop (forward/backward exploration along strong-causal edges) + Branch discovery restart&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Query processing&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Keyword matching (canonical/alias) + BM25 lexical ranking&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Core characteristic&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;SGT-MCTS dual-tree search with graph-aware and time-aware priors; branch discovery prevents greedy collapse&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;13-generation&quot;&gt;1.3 Generation&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Dimension&lt;/th&gt;
      &lt;th&gt;Approach&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Context injection&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Lineage paths injected into Idea Evaluator / Generator&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Citation tracing&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Verbatim citation grounding + evidence certificate (the specific causal edge, the original bottleneck text, and an explanation of what remains unresolved)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Quality control&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Graph statistics replace LLM free-text judgment; evidence certificate prevents hallucination; verification failure triggers fallback&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Core characteristic&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Graph-grounded idea evaluation/generation; structural evidence replaces parametric memory&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;2-offline-construction-indexing-detailed-execution&quot;&gt;2. Offline Construction: Indexing (Detailed Execution)&lt;/h3&gt;

&lt;h4 id=&quot;step-21-corpus-preparation&quot;&gt;Step 2.1 Corpus Preparation&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;1,030,314 AI papers (covering AI conferences, journals, arXiv preprints)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Collect and preprocess paper corpus, construct raw document repository&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Structured paper corpus&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;step-22-method-entity-extraction-and-alias-resolution&quot;&gt;Step 2.2 Method Entity Extraction and Alias Resolution&lt;/h4&gt;

&lt;h5 id=&quot;step-221-seed-method-construction&quot;&gt;Step 2.2.1 Seed Method Construction&lt;/h5&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Human-curated list of well-known methods&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Manually establish initial method seed set&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Seed method set&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h5 id=&quot;step-222-method-expansion&quot;&gt;Step 2.2.2 Method Expansion&lt;/h5&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Seed method set + paper corpus&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;LLM Proposer scans entire corpus, identifies and supplements additional candidate method entities&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Expanded candidate method set&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h5 id=&quot;step-223-alias-registry-construction&quot;&gt;Step 2.2.3 Alias Registry Construction&lt;/h5&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Expanded method entity set&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Build alias registry A: V_M → 2^Σ*, mapping each canonical method to a set of surface forms&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Matching rules&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Substring matching + case/punctuation normalization + word boundary enforcement (prevents “GPT” matching inside “lgpto”) + longest match priority (“GPT-4 Turbo” &amp;gt; “GPT-4” &amp;gt; “GPT”)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Version merging&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Version and size suffixes such as “-v2” or “-Large” are folded into the parent method unless an independent canonical node already exists&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Ambiguity handling&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Manually maintained negative-surface list (e.g., the state space model “Mamba” vs. the package manager “Mamba”)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Scale&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;8,155 canonical methods / 9,545 aliases&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Alias registry A&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
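
&lt;p&gt;The matching rules above can be sketched with a single regular expression. This is a minimal illustration, not the authors' implementation: the tiny &lt;code&gt;ALIASES&lt;/code&gt; registry and the helper names &lt;code&gt;build_matcher&lt;/code&gt; and &lt;code&gt;find_methods&lt;/code&gt; are hypothetical, and only case normalization is shown (punctuation normalization is omitted).&lt;/p&gt;

```python
import re

# Hypothetical miniature registry: canonical method -> surface forms.
ALIASES = {
    "GPT": ["GPT"],
    "GPT-4": ["GPT-4", "GPT4"],
    "GPT-4 Turbo": ["GPT-4 Turbo"],
}


def build_matcher(aliases):
    """Compile one alternation with the longest surface forms listed first."""
    surface_to_canonical = {}
    for canonical, forms in aliases.items():
        for form in forms:
            surface_to_canonical[form.lower()] = canonical
    # Regex alternation tries alternatives left to right, so sorting by
    # length (descending) gives longest-match priority.
    ordered = sorted(surface_to_canonical, key=len, reverse=True)
    # \b on both sides enforces word boundaries, so "GPT" is not found
    # inside a longer token such as "lgpto".
    body = "|".join(re.escape(surface) for surface in ordered)
    pattern = re.compile(r"\b(?:" + body + r")\b", re.IGNORECASE)
    return pattern, surface_to_canonical


def find_methods(text, pattern, surface_to_canonical):
    """Return the canonical method for every surface form found in text."""
    return [surface_to_canonical[m.group(0).lower()] for m in pattern.finditer(text)]


PATTERN, LOOKUP = build_matcher(ALIASES)
print(find_methods("We compare GPT-4 Turbo with gpt4 and lgpto.", PATTERN, LOOKUP))
```

&lt;p&gt;Sorting the alternation by length is one simple way to get longest-match priority; a trie-based matcher would be a natural alternative at the scale of 9,545 aliases.&lt;/p&gt;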

&lt;h4 id=&quot;step-23-citation-edge-semantic-classification-two-phase-llm-extraction&quot;&gt;Step 2.3 Citation Edge Semantic Classification (Two-Phase LLM Extraction)&lt;/h4&gt;

&lt;h5 id=&quot;step-231-phase-1-edge-type-classification&quot;&gt;Step 2.3.1 Phase 1: Edge Type Classification&lt;/h5&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Citation relationships between papers&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Use Qwen3.6-35B-A3B to classify each citation edge into semantic types&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Classification system&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;7 types: strong-causal (extends / improves / replaces / adapts), non-strong (uses_component / compares), non-causal (background)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Accuracy&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Production model 70.4%; audit model (Claude-Sonnet-4.6) 93.0%&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Semantically typed edges&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
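
&lt;p&gt;As a small illustration of how the seven edge types partition the graph, the sketch below groups them as listed in the table and filters out the strong-causal subgraph that the later lineage search operates on. The triple layout &lt;code&gt;(source, target, edge_type)&lt;/code&gt;, the function name, and the placeholder method names are assumptions made for this example.&lt;/p&gt;

```python
# Edge-type grouping as listed in the classification table above.
EDGE_GROUP = {
    "extends": "strong-causal",
    "improves": "strong-causal",
    "replaces": "strong-causal",
    "adapts": "strong-causal",
    "uses_component": "non-strong",
    "compares": "non-strong",
    "background": "non-causal",
}


def strong_causal_subgraph(edges):
    """Keep only (source, target, edge_type) triples whose type is strong-causal."""
    return [edge for edge in edges if EDGE_GROUP[edge[2]] == "strong-causal"]


# Hypothetical typed edges between placeholder method names.
edges = [
    ("method_a", "method_b", "extends"),
    ("method_a", "method_c", "compares"),
    ("method_b", "method_d", "replaces"),
    ("method_c", "method_d", "background"),
]
print(strong_causal_subgraph(edges))
```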

&lt;h5 id=&quot;step-232-phase-2-structured-record-completion&quot;&gt;Step 2.3.2 Phase 2: Structured Record Completion&lt;/h5&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Non-background causal edges&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Complete structured evidence records for each edge&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Causal edges carrying structured attributes&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;step-24-causal-edge-evidence-filling&quot;&gt;Step 2.4 Causal Edge Evidence Filling&lt;/h4&gt;

&lt;h5 id=&quot;step-241-14-category-bottleneck-taxonomy&quot;&gt;Step 2.4.1 14-Category Bottleneck Taxonomy&lt;/h5&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Bottleneck Dimension&lt;/th&gt;
      &lt;th&gt;Operational Definition&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;computational complexity&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;asymptotic or wall-clock compute at fixed scale&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;memory efficiency&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;peak activation / parameter memory footprint&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;parallelization&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;degree of across-device or across-token parallelism&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;accuracy&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;task-level correctness or quality metric&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;generalization&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;out-of-distribution / cross-domain transfer&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;scalability&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;behavior as model / data / context size grows&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;data efficiency&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;sample complexity at fixed quality target&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;training stability&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;variance / divergence risk during optimization&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;inference speed&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;runtime latency or throughput&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;expressiveness&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;function class or representational capacity&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;simplicity&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;implementation, conceptual, or interface simplicity&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;robustness&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;behavior under perturbation or adversarial input&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;hyperparameter sensitivity&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;outcome variance w.r.t. hyperparameter choice&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;training complexity&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;engineering difficulty of the training recipe&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h5 id=&quot;step-242-structured-evidence-quadruple&quot;&gt;Step 2.4.2 Structured Evidence Quadruple&lt;/h5&gt;

&lt;p&gt;Each causal edge e carries quadruple ρ(e) = (b_e, m_e, t_e, c_e):&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Attribute&lt;/th&gt;
      &lt;th&gt;Meaning&lt;/th&gt;
      &lt;th&gt;Key Design&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;b_e&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Bottleneck addressed&lt;/td&gt;
      &lt;td&gt;14-axis bottleneck taxonomy (fixed at publication time)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;m_e&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Mechanism employed&lt;/td&gt;
      &lt;td&gt;LLM-extracted structured field&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;t_e&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Trade-off&lt;/td&gt;
      &lt;td&gt;Cost/limitation of the mechanism&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;c_e&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;LLM-reported confidence&lt;/td&gt;
      &lt;td&gt;∈ [0,1], used later by SGT-MCTS&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Citation grounding&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Verbatim excerpt&lt;/td&gt;
      &lt;td&gt;Every non-background edge must be paired with a verbatim quote&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
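
&lt;p&gt;One way to picture the evidence record is as a small validated data structure. The sketch below is an assumption about layout, not the paper's schema: the class name &lt;code&gt;EdgeEvidence&lt;/code&gt; and its field names are hypothetical, while the 14 bottleneck axes and the [0,1] confidence range come from the tables above.&lt;/p&gt;

```python
from dataclasses import dataclass

# The 14 bottleneck axes from the taxonomy table.
BOTTLENECK_AXES = frozenset({
    "computational complexity", "memory efficiency", "parallelization",
    "accuracy", "generalization", "scalability", "data efficiency",
    "training stability", "inference speed", "expressiveness",
    "simplicity", "robustness", "hyperparameter sensitivity",
    "training complexity",
})


@dataclass(frozen=True)
class EdgeEvidence:
    """Hypothetical container for the quadruple (b_e, m_e, t_e, c_e) plus its quote."""
    bottleneck: str    # b_e: one of the 14 axes
    mechanism: str     # m_e: mechanism employed
    trade_off: str     # t_e: cost or limitation of the mechanism
    confidence: float  # c_e: LLM-reported confidence in [0, 1]
    quote: str         # verbatim excerpt grounding the edge

    def __post_init__(self):
        if self.bottleneck not in BOTTLENECK_AXES:
            raise ValueError("unknown bottleneck axis: " + self.bottleneck)
        if not (1.0 >= self.confidence >= 0.0):
            raise ValueError("confidence must lie in [0, 1]")
        if not self.quote.strip():
            raise ValueError("a verbatim quote is required")


# Placeholder values for illustration only.
record = EdgeEvidence(
    bottleneck="memory efficiency",
    mechanism="placeholder mechanism text",
    trade_off="placeholder trade-off text",
    confidence=0.9,
    quote="placeholder verbatim excerpt",
)
print(record.bottleneck, record.confidence)
```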

&lt;h5 id=&quot;step-243-verbatim-citation-verification&quot;&gt;Step 2.4.3 Verbatim Citation Verification&lt;/h5&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;LLM-extracted citations + original papers&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Search Match + Symmetry Check&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Purpose&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Ensure citations actually exist in the original text, preventing hallucination&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Verified verbatim citations&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h5 id=&quot;step-244-graph-scale&quot;&gt;Step 2.4.4 Graph Scale&lt;/h5&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Metric&lt;/th&gt;
      &lt;th&gt;Value&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Papers&lt;/td&gt;
      &lt;td&gt;1,030,314&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Method nodes&lt;/td&gt;
      &lt;td&gt;8,155 canonical&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Aliases&lt;/td&gt;
      &lt;td&gt;9,545&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Semantically typed edges&lt;/td&gt;
      &lt;td&gt;9,430,201&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;img src=&quot;attachments/Pasted%20image%2020260502215241.png&quot; alt=&quot;Pasted image 20260502215241.png&quot; /&gt;&lt;/p&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;3-online-query-retrieval-detailed-execution&quot;&gt;3. Online Query: Retrieval (Detailed Execution)&lt;/h3&gt;

&lt;h4 id=&quot;31-retrieval-mode-overview&quot;&gt;3.1 Retrieval Mode Overview&lt;/h4&gt;

&lt;p&gt;Intern-Atlas uses a single retrieval mode, lineage reconstruction, in which SGT-MCTS searches the method evolution graph:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Mode&lt;/th&gt;
      &lt;th&gt;Applicable Scenario&lt;/th&gt;
      &lt;th&gt;Core Mechanism&lt;/th&gt;
      &lt;th&gt;Characteristic&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Lineage Reconstruction&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Querying a method’s evolution history&lt;/td&gt;
      &lt;td&gt;SGT-MCTS forward/backward dual-tree search on strong-causal subgraph&lt;/td&gt;
      &lt;td&gt;Graph-aware + time-aware; branch discovery prevents greedy collapse&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;32-retrieval-procedure&quot;&gt;3.2 Retrieval Procedure&lt;/h4&gt;

&lt;h5 id=&quot;step-31-node-matching&quot;&gt;Step 3.1: Node Matching&lt;/h5&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;User query q&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Parse query, construct seed method set S(q)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Matching method 1&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Vocabulary keyword matching (exact hit on canonical / alias)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Matching method 2&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;BM25 ranking (lexical term matching; handles ambiguous queries that lack an exact hit)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Seed method set S(q) ⊆ C_q&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h5 id=&quot;step-32-sgt-mcts-lineage-reconstruction&quot;&gt;Step 3.2: SGT-MCTS Lineage Reconstruction&lt;/h5&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Seed method set S(q)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;On strong-causal subgraph (V_M, ε_sc), construct directed evolution path set Π_q along publication chronology&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Why not standard MCTS?&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Hub nodes have very high branching factors, so standard UCT easily gets trapped in high-visit branches; it is also blind to graph structure and temporal direction, so it can produce implausible paths in which “ancestors come from descendants”&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h6 id=&quot;step-321-dual-tree-construction&quot;&gt;Step 3.2.1: Dual-tree Construction&lt;/h6&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Each seed simultaneously constructs forward + backward trees&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Forward tree&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Searches for successor developments&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Backward tree&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Searches for predecessor methods&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Key constraint&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;Operates only on the strong-causal edge subgraph&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h6 id=&quot;step-322-sgt-uct-selection&quot;&gt;Step 3.2.2: SGT-UCT Selection&lt;/h6&gt;

\[\text{SGT-UCT}(u, v) = \underbrace{\text{UCT}(u, v)}_{\text{standard exploration-exploitation}} + \lambda \cdot \underbrace{\alpha_G(u, v)}_{\text{graph-aware prior}}\]

\[\alpha_G(u, v) = \underbrace{\text{conf}(e_{u \to v})}_{\text{edge confidence}} \cdot \underbrace{\text{TC}(\Delta\tau_{uv})}_{\text{temporal coherence}}\]

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Component&lt;/th&gt;
      &lt;th&gt;Source&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;conf(e)&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Edge confidence c_e reported by LLM during graph construction&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;TC(Δτ)&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Hand-specified piecewise function that scores the publication-year gap Δτ&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

\[\text{TC}(\Delta\tau) =
\begin{cases}
0.40 &amp;amp; -1 \leq \Delta\tau &amp;lt; 0 \quad \text{(slight temporal overlap, e.g., preprints)} \\
0.85 &amp;amp; \Delta\tau = 0 \quad \text{(same year)} \\
1.00 &amp;amp; 1 \leq \Delta\tau \leq 3 \quad \text{(optimal: 1-3 year natural evolution)} \\
0.80 &amp;amp; 4 \leq \Delta\tau \leq 6 \quad \text{(slightly distant but still reasonable)} \\
\max(0.30,\ 1.00 - 0.08(\Delta\tau - 6)) &amp;amp; \Delta\tau &amp;gt; 6 \\
0.70 &amp;amp; \tau \text{ missing}
\end{cases}\]

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;Calibration range limitation&lt;/strong&gt;: TC is only calibrated on post-2015 AI literature; domains with different research paces require recalibration.&lt;/p&gt;
&lt;/blockquote&gt;
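
&lt;p&gt;The selection score can be sketched directly from the two formulas and the piecewise TC definition. Several details here are assumptions rather than values from the paper: the standard UCT term is written in its usual UCB1 form, the exploration constant &lt;code&gt;c&lt;/code&gt; and the weight &lt;code&gt;lam&lt;/code&gt; (λ) are placeholder values, year gaps are treated as integers, and gaps below -1 (which the TC definition does not cover) are scored 0.&lt;/p&gt;

```python
import math


def temporal_coherence(delta_years):
    """TC(Δτ) as transcribed above; pass None when a publication year is missing."""
    if delta_years is None:
        return 0.70
    if delta_years > 6:
        return max(0.30, 1.00 - 0.08 * (delta_years - 6))
    if delta_years >= 4:
        return 0.80
    if delta_years >= 1:
        return 1.00
    if delta_years == 0:
        return 0.85
    if delta_years >= -1:
        return 0.40
    return 0.0  # assumption: gaps below -1 are not covered by the definition


def sgt_uct(total_reward, visits, parent_visits, edge_conf, delta_years,
            c=1.4, lam=0.5):
    """UCT(u, v) + lam * conf(e) * TC(Δτ); c and lam are placeholder values."""
    if visits == 0:
        return math.inf  # unvisited children are tried first (common MCTS convention)
    exploit = total_reward / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    prior = edge_conf * temporal_coherence(delta_years)
    return exploit + explore + lam * prior


# Same visit statistics, different year gaps: the 2-year edge scores higher.
print(sgt_uct(2.0, 4, 16, edge_conf=0.9, delta_years=2))
print(sgt_uct(2.0, 4, 16, edge_conf=0.9, delta_years=0))
```

&lt;p&gt;The sketch follows the formula as written, so the tail branch evaluates to 0.92 at Δτ = 7, which is above the 0.80 plateau for gaps of 4 to 6 years.&lt;/p&gt;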

&lt;h6 id=&quot;step-323-expansion-confidence-prioritized&quot;&gt;Step 3.2.3: Expansion (Confidence-prioritized)&lt;/h6&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Loop exclusion&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Discard nodes that would create cycles in the path&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Hard temporal filtering&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Discard nodes with reversed temporal direction (prevents “descendants producing ancestors” paradox)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Expansion strategy&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Select the unexplored child node with highest confidence (no random expansion)&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
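
&lt;p&gt;The three expansion rules can be sketched as one filter followed by an argmax. This is a sketch under assumptions: the &lt;code&gt;Edge&lt;/code&gt; record, the forward search direction, and the one-year tolerance in the temporal filter (chosen to agree with the -1 case of TC) are hypothetical and not taken from the paper.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class Edge:
    child: str
    child_year: int
    confidence: float


def pick_expansion(path: List[str], tip_year: int,
                   edges: List[Edge], explored: Set[str]) -> Optional[Edge]:
    """Return the unexplored, admissible child edge with the highest confidence."""
    candidates = [
        e for e in edges
        if e.child not in path               # loop exclusion
        and e.child_year >= tip_year - 1     # hard temporal filter (assumed tolerance)
        and e.child not in explored          # only unexplored children
    ]
    if not candidates:
        return None                          # nothing left to expand
    return max(candidates, key=lambda e: e.confidence)  # no random expansion
```

&lt;p&gt;Here &lt;code&gt;tip_year&lt;/code&gt; is the publication year of the node being expanded; returning &lt;code&gt;None&lt;/code&gt; marks the node as non-expandable.&lt;/p&gt;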

&lt;h6 id=&quot;step-324-rollout-greedy&quot;&gt;Step 3.2.4: Rollout (Greedy)&lt;/h6&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Strategy&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Greedy rollout, no MC random sampling&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Selection&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Always select the child node with the highest confidence, which avoids introducing sampling noise&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Termination conditions&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Leaf node with no outgoing edges / cycle detected / maximum depth reached&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Scoring&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;R(π); the paper does not specify its form&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
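
&lt;p&gt;A greedy rollout with the three termination conditions might look like the sketch below. The graph is assumed to be a dictionary mapping each node to its children and their edge confidences, which is an illustrative representation; scoring is left out because the form of R(π) is not given.&lt;/p&gt;

```python
def greedy_rollout(graph, start, max_depth):
    """Follow the highest-confidence outgoing edge until a termination condition fires."""
    path = [start]
    while max_depth > len(path) - 1:            # stop at maximum depth (counted in edges)
        children = graph.get(path[-1], {})
        if not children:                        # leaf node with no outgoing edges
            break
        best = max(children, key=children.get)  # greedy choice, no Monte Carlo sampling
        if best in path:                        # cycle detected
            break
        path.append(best)
    return path
```

&lt;p&gt;Because the choice is deterministic, repeated rollouts from the same node return the same path, which is the property the notes describe as avoiding sampling noise.&lt;/p&gt;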

&lt;h6 id=&quot;step-325-backpropagation&quot;&gt;Step 3.2.5: Backpropagation&lt;/h6&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Backpropagate to update cumulative reward and visit count&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Special penalty&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Ancestors of non-expandable leaf nodes receive an additional score deduction, which steers the search away from dead ends&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
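
&lt;p&gt;The update and the dead-end penalty can be sketched as a walk from the leaf to the root. The &lt;code&gt;Node&lt;/code&gt; class and the penalty size are hypothetical; the notes state only that ancestors of a non-expandable leaf receive an extra deduction, not how large it is.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    parent: Optional["Node"] = None
    visits: int = 0
    total_reward: float = 0.0


def backpropagate(leaf: Node, reward: float, dead_end: bool, penalty: float = 0.1) -> None:
    """Propagate a rollout reward to the root; ancestors of a dead end lose an extra penalty."""
    node = leaf
    while node is not None:
        node.visits += 1                      # visit count
        node.total_reward += reward           # cumulative reward
        if dead_end and node is not leaf:     # extra deduction applies to ancestors only
            node.total_reward -= penalty      # hypothetical penalty size
        node = node.parent
```

&lt;p&gt;Lowering the cumulative reward of ancestors makes later selection steps less likely to descend toward the same dead end.&lt;/p&gt;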

&lt;h6 id=&quot;step-326-path-stitching-and-deduplication&quot;&gt;Step 3.2.6: Path Stitching and Deduplication&lt;/h6&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Take the top-5 paths by cumulative reward from each of the forward and backward searches, then stitch them together through the seed node&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Deduplication&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Two paths are treated as homogeneous when their Jaccard similarity reaches the 0.8 threshold; among homogeneous paths, only the higher-ranked one is kept&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
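
&lt;p&gt;Stitching and deduplication can be sketched as below. Several details are assumptions rather than facts from the paper: both partial paths are assumed to be listed seed-first, Jaccard similarity is assumed to be computed over node sets, and a similarity of exactly 0.8 is assumed to count as homogeneous.&lt;/p&gt;

```python
def stitch(backward, forward):
    """Join an ancestor-side path and a descendant-side path that share the seed node."""
    # Both inputs start at the seed; reverse the backward one so the result reads oldest first.
    return list(reversed(backward)) + list(forward[1:])


def jaccard(a, b) -> float:
    """Jaccard similarity of the node sets of two paths."""
    sa, sb = set(a), set(b)
    union = sa.union(sb)
    return len(sa.intersection(sb)) / len(union) if union else 1.0


def deduplicate(ranked_paths, threshold=0.8):
    """Keep a path only if it is not homogeneous with any higher-ranked kept path."""
    kept = []
    for path in ranked_paths:                               # best path first
        if all(threshold > jaccard(path, k) for k in kept):
            kept.append(path)
    return kept
```

&lt;p&gt;The input to &lt;code&gt;deduplicate&lt;/code&gt; must already be sorted by rank, since the first path of each homogeneous group is the one that survives.&lt;/p&gt;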

&lt;h6 id=&quot;step-327-branch-discovery&quot;&gt;Step 3.2.7: Branch Discovery&lt;/h6&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Problem&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Greedy collapse prevents the search from discovering branch paths&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Branch node definition&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;A node with at least 2 strong-causal child nodes in the search direction, of which the final lineage path includes only 1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Restart search&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Restart algorithm from the branch node&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Constraint&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Edges already in the main lineage are forcibly blocked; search budget halved&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output merge&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Aggregate main path + branch paths&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
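
&lt;p&gt;Branch-node detection and the edge block list can be sketched as follows. The notes do not define “strong-causal”, so the confidence cutoff &lt;code&gt;strong&lt;/code&gt; used here is a hypothetical stand-in, as is the dictionary graph representation.&lt;/p&gt;

```python
def find_branch_nodes(graph, main_path, strong=0.8):
    """Return main-path nodes with at least 2 strong children of which the path uses exactly 1."""
    on_path = set(main_path)
    branch_nodes = []
    for node in main_path:
        strong_children = [c for c, conf in graph.get(node, {}).items() if conf >= strong]
        covered = [c for c in strong_children if c in on_path]
        if len(strong_children) >= 2 and len(covered) == 1:
            branch_nodes.append(node)
    return branch_nodes


def blocked_edges(main_path):
    """Edges of the main lineage, which a restarted search must not reuse."""
    return set(zip(main_path, main_path[1:]))
```

&lt;p&gt;A restart from each returned node would then run the same search with these edges blocked and half the original budget, as the table describes.&lt;/p&gt;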

&lt;h5 id=&quot;step-33-lineage-rerank&quot;&gt;Step 3.3: Lineage Rerank&lt;/h5&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Candidate evolution path π&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Ranking formula&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;rank(π) = w_ℓ · |π| / L_max + w_c · conf̄(π) + w_m · N̄(π)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;First term&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Rewards long paths (sufficient evolution)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Second term&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Path average evidence strength&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Third term&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Path node average visit count (search consensus)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Re-ranked evolution paths&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
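
&lt;p&gt;The rerank score is a weighted sum of path length, mean edge confidence and mean visit count, and can be sketched as below. The weight values are hypothetical, and the visit counts are assumed to be normalized to [0, 1] beforehand; the notes give neither.&lt;/p&gt;

```python
def rank_score(path_len, confidences, visit_counts, l_max,
               w_len=0.3, w_conf=0.5, w_visit=0.2):
    """rank(pi) = w_len * |pi| / L_max + w_conf * mean confidence + w_visit * mean visits."""
    mean_conf = sum(confidences) / len(confidences)      # path average evidence strength
    mean_visits = sum(visit_counts) / len(visit_counts)  # search consensus (assumed normalized)
    return w_len * path_len / l_max + w_conf * mean_conf + w_visit * mean_visits
```

&lt;p&gt;Sorting candidate paths by this score in descending order yields the re-ranked output.&lt;/p&gt;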

&lt;h4 id=&quot;33-retrieval-algorithm-comparison&quot;&gt;3.3 Retrieval Algorithm Comparison&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Method&lt;/th&gt;
      &lt;th&gt;NR&lt;/th&gt;
      &lt;th&gt;ER&lt;/th&gt;
      &lt;th&gt;CAS&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Beam@1&lt;/td&gt;
      &lt;td&gt;41.0&lt;/td&gt;
      &lt;td&gt;18.6&lt;/td&gt;
      &lt;td&gt;41.0&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Beam@5&lt;/td&gt;
      &lt;td&gt;43.4&lt;/td&gt;
      &lt;td&gt;21.6&lt;/td&gt;
      &lt;td&gt;43.4&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Beam@10&lt;/td&gt;
      &lt;td&gt;44.9&lt;/td&gt;
      &lt;td&gt;23.2&lt;/td&gt;
      &lt;td&gt;44.9&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;RW@5&lt;/td&gt;
      &lt;td&gt;28.1&lt;/td&gt;
      &lt;td&gt;0.7&lt;/td&gt;
      &lt;td&gt;28.1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;SGT-MCTS&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;84.8&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;79.0&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;84.8&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;img src=&quot;attachments/Pasted%20image%2020260503100504.png&quot; alt=&quot;Pasted image 20260503100504.png&quot; /&gt;&lt;/p&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;4-online-generation-generation-detailed-execution&quot;&gt;4. Online Generation: Generation (Detailed Execution)&lt;/h3&gt;

&lt;h4 id=&quot;41-generation-procedure&quot;&gt;4.1 Generation Procedure&lt;/h4&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;table class=&quot;rouge-table&quot;&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td class=&quot;rouge-code&quot;&gt;&lt;pre&gt;Input: query q, retrieved evolution paths Π_q
    │
    ▼
┌─────────────────────────────┐
│ Parse methods mentioned     │
│ in idea d, map to nodes M_d │
│ via lookup table A          │
└─────────────┬───────────────┘
              │
         ┌────┴────┐
         ▼         ▼
   Evaluator   Generator
   (Evaluate)   (Generate)
      │           │
      ▼           ▼
  5-dim scoring  4 strategy types
  +cross penalty +evidence certificate
&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h4 id=&quot;step-41-graph-grounded-idea-evaluator&quot;&gt;Step 4.1: Graph-Grounded Idea Evaluator&lt;/h4&gt;

&lt;h5 id=&quot;step-411-motivation&quot;&gt;Step 4.1.1: Motivation&lt;/h5&gt;

&lt;p&gt;Free-text LLM judges favor ideas that stack popular methods, and their novelty scores are negatively correlated with actual scientific impact. Graph statistics, by contrast, provide structural evidence directly.&lt;/p&gt;

&lt;h5 id=&quot;step-412-per-dimension-scoring&quot;&gt;Step 4.1.2: Per-dimension Scoring&lt;/h5&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Methods mentioned in idea d → resolved to nodes M_d ⊆ V_M via lookup table A&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Independently evaluate 5 dimensions based on graph statistics&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Dimensions&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Novelty (N), Feasibility (F), Significance (S), Validity (V), Clarity (C)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Calculation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Each dimension score is computed directly from graph statistics describing the position and connectivity of M_d within the retrieval context C_d&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;strong&gt;Formula:&lt;/strong&gt;&lt;/p&gt;

\[s_k(d, G) = \text{clip}_{[1,10]}\left(b_k + \sum_j w_j^{(k)} \cdot \phi_j^{(k)}(M_d, C_d)\right), \quad k \in \{N, F, S, V, C\}\]

&lt;h5 id=&quot;step-413-cross-dimensional-aggregation&quot;&gt;Step 4.1.3: Cross-dimensional Aggregation&lt;/h5&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Fixed weight vector w + 4 hand-crafted conjunctive penalties Ω_cross&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Representative penalty&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Strong deduction when Novelty is high and Feasibility is low: “novel but infeasible” typically indicates a flawed core proposal&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;strong&gt;Formula:&lt;/strong&gt;&lt;/p&gt;

\[s^*(d, G) = \text{clip}_{[1,10]}\left(\mathbf{w}^\top \mathbf{s} + \Omega_{\text{cross}}(\mathbf{s})\right), \quad \mathbf{s} = (s_N, s_F, s_S, s_V, s_C)\]
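
&lt;p&gt;The per-dimension and aggregate formulas can be sketched in a few lines. The feature values, weights and the single penalty below are illustrative assumptions: the notes say only that the four penalties are hand-crafted and conjunctive, so &lt;code&gt;novel_but_infeasible&lt;/code&gt; and its thresholds are hypothetical, and penalties are assumed to return values of at most zero.&lt;/p&gt;

```python
def clip(x, lo=1.0, hi=10.0):
    """Clamp a score to the [1, 10] range used by the evaluator."""
    return max(lo, min(hi, x))


def dimension_score(bias, weights, features):
    """s_k = clip(b_k + sum_j w_j * phi_j) for one dimension k."""
    return clip(bias + sum(w * f for w, f in zip(weights, features)))


def overall_score(scores, weights, penalties):
    """s* = clip(w . s + Omega_cross(s)), with Omega_cross as a sum of penalty terms."""
    weighted = sum(weights[k] * scores[k] for k in scores)
    return clip(weighted + sum(p(scores) for p in penalties))


def novel_but_infeasible(scores, high=8.0, low=4.0, deduction=-2.0):
    """Hypothetical conjunctive penalty: fires only when Novelty is high AND Feasibility is low."""
    return deduction if scores["N"] >= high and low >= scores["F"] else 0.0
```

&lt;p&gt;The conjunctive form means a low Feasibility score alone triggers no deduction; both conditions must hold together.&lt;/p&gt;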

&lt;h5 id=&quot;step-414-idea-evaluation-results&quot;&gt;Step 4.1.4: Idea Evaluation Results&lt;/h5&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Publication tier&lt;/th&gt;
      &lt;th&gt;Overall Score&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Top-tier&lt;/td&gt;
      &lt;td&gt;8.48 ± 1 s.d.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Core&lt;/td&gt;
      &lt;td&gt;7.83 ± 1 s.d.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Workshop&lt;/td&gt;
      &lt;td&gt;6.85 ± 1 s.d.&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Rejected&lt;/td&gt;
      &lt;td&gt;5.84 ± 1 s.d.&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;img src=&quot;attachments/Pasted%20image%2020260503095949.png&quot; alt=&quot;Pasted image 20260503095949.png&quot; /&gt;&lt;/p&gt;

&lt;h4 id=&quot;step-42-graph-grounded-idea-generator&quot;&gt;Step 4.2: Graph-Grounded Idea Generator&lt;/h4&gt;

&lt;h5 id=&quot;step-421-structural-gap-pattern--generation-strategy-mapping&quot;&gt;Step 4.2.1: Structural Gap Pattern → Generation Strategy Mapping&lt;/h5&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Gap Pattern&lt;/th&gt;
      &lt;th&gt;Corresponding Generation Strategy&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Open axes&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Bottleneck Resolution&lt;/td&gt;
      &lt;td&gt;Address identified but unresolved bottlenecks in the graph&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Recent improvement direction&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Trend Extrapolation&lt;/td&gt;
      &lt;td&gt;Extrapolate along recent improvement directions&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Sacrifice axes&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Cross-pollination&lt;/td&gt;
      &lt;td&gt;Cross-domain/cross-method combination, leveraging trade-offs between different methods&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Disconnected pairs&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Paradigm Challenge&lt;/td&gt;
      &lt;td&gt;Challenge existing paradigms, connecting disconnected method pairs in the graph&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h5 id=&quot;step-422-evidence-certificate-core-anti-hallucination-mechanism&quot;&gt;Step 4.2.2: Evidence Certificate (Core Anti-hallucination Mechanism)&lt;/h5&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Field&lt;/th&gt;
      &lt;th&gt;Content&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Certificate tuple&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;(specific causal edge, verbatim bottleneck text from the graph, explanation of why this bottleneck remains unresolved)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Verification&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;The verbatim bottleneck text in the certificate must exactly match the actual graph content&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Failure handling&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Discard result, fallback&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Design goal&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Ensure ideas are traceable without over-constraining generation&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
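
&lt;p&gt;Certificate verification reduces to an exact string comparison against the graph, sketched below. The &lt;code&gt;Certificate&lt;/code&gt; class and the dictionary that maps an edge to its stored bottleneck text are hypothetical data structures chosen for illustration; the fallback step is not modeled because the notes do not describe it.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Certificate:
    edge: tuple            # (source method, target method)
    bottleneck_text: str   # verbatim bottleneck text quoted from the graph
    rationale: str         # why the bottleneck is still unresolved


def verify(cert: Certificate, graph_bottlenecks: dict) -> bool:
    """Accept a certificate only if its quoted bottleneck exactly matches the graph."""
    return graph_bottlenecks.get(cert.edge) == cert.bottleneck_text


def filter_ideas(ideas, graph_bottlenecks):
    """Discard generated ideas whose certificate fails exact-match verification."""
    return [idea for idea, cert in ideas if verify(cert, graph_bottlenecks)]
```

&lt;p&gt;Exact matching is strict by design: a paraphrased bottleneck fails verification even if it is semantically correct.&lt;/p&gt;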

&lt;hr /&gt;

&lt;h3 id=&quot;5-key-design-decisions&quot;&gt;5. Key Design Decisions&lt;/h3&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Decision Point&lt;/th&gt;
      &lt;th&gt;Intern-Atlas’s Choice&lt;/th&gt;
      &lt;th&gt;Alternative&lt;/th&gt;
      &lt;th&gt;Rationale&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Graph atomic unit&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Method entities (method-level)&lt;/td&gt;
      &lt;td&gt;Papers (paper-level)&lt;/td&gt;
      &lt;td&gt;Paper-level citations cannot distinguish extends / improves / replaces / compares / background relations&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Causal edge evidence&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Mandatory verbatim citations + structured bottleneck/mechanism&lt;/td&gt;
      &lt;td&gt;Edge type labels only&lt;/td&gt;
      &lt;td&gt;Provides grounded evidence, supports certificate verification for idea generation&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Lineage search algorithm&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;SGT-MCTS (graph-aware + time-aware)&lt;/td&gt;
      &lt;td&gt;Beam search / standard MCTS / Random Walk&lt;/td&gt;
      &lt;td&gt;High branching at central nodes causes beam collapse; standard MCTS is unaware of publication-year direction; Random Walk performs extremely poorly (ER of only 0.7)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Rollout strategy&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Greedy (highest confidence)&lt;/td&gt;
      &lt;td&gt;MC random sampling&lt;/td&gt;
      &lt;td&gt;Random sampling introduces LLM extraction noise&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Dead-end handling&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Backpropagation additional penalty + branch discovery restart&lt;/td&gt;
      &lt;td&gt;Pure reliance on UCT exploration term&lt;/td&gt;
      &lt;td&gt;Explicitly suppresses greedy collapse, proactively recalls ignored branches&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Idea evaluation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Per-dimension graph statistics + hand-crafted conjunctive penalties&lt;/td&gt;
      &lt;td&gt;Free-text LLM judge&lt;/td&gt;
      &lt;td&gt;The latter’s novelty score is negatively correlated with actual impact&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Idea generation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Structural gap patterns + evidence certificate&lt;/td&gt;
      &lt;td&gt;Direct LLM free generation&lt;/td&gt;
      &lt;td&gt;Forces grounding to specific causal edges and bottlenecks, avoids method name stacking&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;6-evaluation&quot;&gt;6. Evaluation&lt;/h3&gt;

&lt;h4 id=&quot;61-graph-construction-quality&quot;&gt;6.1 Graph Construction Quality&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Benchmark&lt;/strong&gt;: 30 high-impact surveys → 30 method evolution graphs, containing 2,268 nodes / 1,462 edges / 133 evolution chains.&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Metric&lt;/th&gt;
      &lt;th&gt;Meaning&lt;/th&gt;
      &lt;th&gt;Result&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;NMR (Node Match Ratio)&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Node match rate&lt;/td&gt;
      &lt;td&gt;91.0%&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;ERR (Edge Reachable Ratio)&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Edge reachability rate&lt;/td&gt;
      &lt;td&gt;89.7%&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;PSC (Path Semantic Correctness)&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Path semantic correctness&lt;/td&gt;
      &lt;td&gt;92.0%&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;NR (Node Recall)&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Node recall&lt;/td&gt;
      &lt;td&gt;84.8% (SGT-MCTS)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;ER (Edge Recall)&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Edge recall&lt;/td&gt;
      &lt;td&gt;79.0% (SGT-MCTS)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;CAS (Chain Alignment Score)&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Lineage chain alignment score&lt;/td&gt;
      &lt;td&gt;84.8% (SGT-MCTS)&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;img src=&quot;attachments/Pasted%20image%2020260503095949.png&quot; alt=&quot;Pasted image 20260503095949.png&quot; /&gt;&lt;/p&gt;

&lt;h4 id=&quot;62-idea-evaluator-strata-dataset&quot;&gt;6.2 Idea Evaluator (Strata Dataset)&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Category&lt;/th&gt;
      &lt;th&gt;Paper Count&lt;/th&gt;
      &lt;th&gt;Overall Score&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Top-tier&lt;/strong&gt; (ICLR 2026, ICML 2025, NeurIPS 2025)&lt;/td&gt;
      &lt;td&gt;300&lt;/td&gt;
      &lt;td&gt;8.48&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Core&lt;/strong&gt; (AAAI 2026, IJCAI 2025)&lt;/td&gt;
      &lt;td&gt;300&lt;/td&gt;
      &lt;td&gt;7.83&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Workshop&lt;/strong&gt; (ICLR 2026)&lt;/td&gt;
      &lt;td&gt;300&lt;/td&gt;
      &lt;td&gt;6.85&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Rejected&lt;/strong&gt; (ICLR 2026)&lt;/td&gt;
      &lt;td&gt;300&lt;/td&gt;
      &lt;td&gt;5.84&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;strong&gt;Evaluation method&lt;/strong&gt;: Publication tier verification (across publication strata) + human evaluation.&lt;/p&gt;

&lt;h4 id=&quot;63-idea-generator-comparative-experiments&quot;&gt;6.3 Idea Generator Comparative Experiments&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Baseline&lt;/th&gt;
      &lt;th&gt;Retrieval/Knowledge Source&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Direct LLM generation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;No external retrieval&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;External search&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;OpenAlex / Semantic Scholar&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Local RAG&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;BM25&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Intern-Atlas&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Method evolution graph + evidence certificate&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;64-sgt-mcts-retrieval-algorithm-comparison&quot;&gt;6.4 SGT-MCTS Retrieval Algorithm Comparison&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Method&lt;/th&gt;
      &lt;th&gt;NR&lt;/th&gt;
      &lt;th&gt;ER&lt;/th&gt;
      &lt;th&gt;CAS&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Beam@1&lt;/td&gt;
      &lt;td&gt;41.0&lt;/td&gt;
      &lt;td&gt;18.6&lt;/td&gt;
      &lt;td&gt;41.0&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Beam@5&lt;/td&gt;
      &lt;td&gt;43.4&lt;/td&gt;
      &lt;td&gt;21.6&lt;/td&gt;
      &lt;td&gt;43.4&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Beam@10&lt;/td&gt;
      &lt;td&gt;44.9&lt;/td&gt;
      &lt;td&gt;23.2&lt;/td&gt;
      &lt;td&gt;44.9&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;RW@5&lt;/td&gt;
      &lt;td&gt;28.1&lt;/td&gt;
      &lt;td&gt;0.7&lt;/td&gt;
      &lt;td&gt;28.1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;SGT-MCTS&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;84.8&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;79.0&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;84.8&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;img src=&quot;attachments/Pasted%20image%2020260503100504.png&quot; alt=&quot;Pasted image 20260503100504.png&quot; /&gt;&lt;/p&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;7-limitations-and-applicability&quot;&gt;7. Limitations and Applicability&lt;/h3&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Limitation&lt;/th&gt;
      &lt;th&gt;Specific Manifestation&lt;/th&gt;
      &lt;th&gt;Mitigation&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Edge type classification accuracy&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Phase-1 production model 70.4% / audit model 93.0%&lt;/td&gt;
      &lt;td&gt;Key decisions can undergo secondary audit; improve production model&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Bottleneck taxonomy rigidity&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;14-axis taxonomy D fixed at publication time; new dimensions mapped to nearest existing axis&lt;/td&gt;
      &lt;td&gt;Await future taxonomy revision&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Alias resolution coverage&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Substring matching favors precision over recall; ambiguity handled via manual negative list&lt;/td&gt;
      &lt;td&gt;Biased toward high-quality nodes&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Temporal coherence calibration range&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;TC calibrated on post-2015 AI literature&lt;/td&gt;
      &lt;td&gt;Domains with a different research pace require recalibration&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Rollout scoring opacity&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;The paper does not specify the form of R(π)&lt;/td&gt;
      &lt;td&gt;Reproduction requires custom design&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;best-applicable-scenarios&quot;&gt;Best Applicable Scenarios&lt;/h4&gt;

&lt;ul&gt;
  &lt;li&gt;AI agents performing &lt;strong&gt;idea generation/evaluation&lt;/strong&gt; that require structured method evolution priors&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Traceable&lt;/strong&gt; scientific idea generation (evidence certificate prevents hallucination)&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Method lineage research / survey automation&lt;/strong&gt; within AI subfields&lt;/li&gt;
  &lt;li&gt;Evaluating ideas while avoiding the “popular method stacking” bias of LLM free-text judges&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id=&quot;unsuitable-scenarios&quot;&gt;Unsuitable Scenarios&lt;/h4&gt;

&lt;ul&gt;
  &lt;li&gt;Disciplines with research paces significantly different from AI (TC calibration fails)&lt;/li&gt;
  &lt;li&gt;Extremely niche subfields without human survey references (limited graph coverage)&lt;/li&gt;
  &lt;li&gt;Research focused on paper-level citation network properties (e.g., PageRank, influence propagation)&lt;/li&gt;
&lt;/ul&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;8-quick-reference&quot;&gt;8. Quick Reference&lt;/h3&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Question&lt;/th&gt;
      &lt;th&gt;Relevant Section&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;What is the complete pipeline?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#0-execution-overview&quot;&gt;0. Execution Overview&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;High-level design comparison?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#1-high-level-design-indexing--retrieval--generation&quot;&gt;1. High-level Design&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;How is the graph constructed?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#2-offline-construction-indexing-detailed-execution&quot;&gt;2. Offline Construction&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;How are evolution paths retrieved from the graph?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#3-online-query-retrieval-detailed-execution&quot;&gt;3. Online Query&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;How is the graph used to evaluate and generate ideas?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#4-online-generation-generation-detailed-execution&quot;&gt;4. Online Generation&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;How does it differ from existing approaches?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#5-key-design-decisions&quot;&gt;5. Key Design Decisions&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;How does it perform?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#6-evaluation&quot;&gt;6. Evaluation&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;When should it NOT be used?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#7-limitations-and-applicability&quot;&gt;7. Limitations and Applicability&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h2 id=&quot;search-r1&quot;&gt;Search-R1&lt;/h2&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;One-sentence positioning&lt;/strong&gt;: A framework that uses reinforcement learning to train LLMs to generate multi-turn search queries autonomously during step-by-step reasoning and to retrieve in real time; it addresses the suboptimal performance of prompt-engineering-driven search through loss masking of retrieved tokens and outcome-based rewards.&lt;/p&gt;

  &lt;p&gt;&lt;strong&gt;Key innovation&lt;/strong&gt;: Unlike prompt-driven search methods, Search-R1 models the search engine as part of the environment and trains the LLM via RL to learn the optimal search interaction strategy on its own; it also masks retrieved tokens so that no gradient updates are applied to retrieved content, which keeps training stable.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;0-execution-overview-1&quot;&gt;0. Execution Overview&lt;/h3&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;table class=&quot;rouge-table&quot;&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td class=&quot;rouge-code&quot;&gt;&lt;pre&gt;Training Phase
  ├─ ① Preparation: search engine as environment, policy model π_θ, reference model π_ref
  ├─ ② Rollout: model generates reasoning sequences, interleaving token generation with search engine retrieval
  │   ├─ Encountering uncertain knowledge → generate &amp;lt;search&amp;gt;... &amp;lt;/search&amp;gt; query
  │   ├─ Search engine returns documents → wrap as &amp;lt;information&amp;gt;... &amp;lt;/information&amp;gt;
  │   ├─ Reasoning process → wrap as &amp;lt;think&amp;gt;... &amp;lt;/think&amp;gt;
  │   └─ Final answer → wrap as &amp;lt;answer&amp;gt;... &amp;lt;/answer&amp;gt;
  ├─ ③ Reward computation: outcome-oriented reward r_φ(x, y) = EM(a_pred, a_gold)
  ├─ ④ Loss Masking: policy gradient computed only on LLM-generated tokens; retrieved content excluded from optimization
  └─ ⑤ Policy update: PPO or GRPO updates the policy model

Inference Phase (per query)
  ├─ ① Input question q
  ├─ ② Iterative generation: text generation ↔ search query alternation
  │   ├─ Generate response tokens
  │   ├─ Detect &amp;lt;/search&amp;gt; / &amp;lt;/answer&amp;gt; / &amp;lt;eos&amp;gt; → determine next action
  │   ├─ Search query → retrieve documents → inject into reasoning chain
  │   └─ Continue generation
  ├─ ③ Termination condition: maximum action count B reached or &amp;lt;answer&amp;gt; generated
  └─ ④ Output final answer
&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
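
&lt;p&gt;Steps ③ and ④ of the training phase can be illustrated with plain Python. This is a sketch under assumptions: the answer normalization (lowercasing, stripping punctuation and articles) is a common convention for exact match rather than something the notes specify, and the rollout is assumed to be available as a list of (source, tokens) segments instead of a tagged string.&lt;/p&gt;

```python
import re
import string


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and articles, collapse whitespace (assumed convention)."""
    text = "".join(ch for ch in text.lower() if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match_reward(predicted: str, gold: str) -> float:
    """Outcome reward: 1.0 if the normalized answers are identical, else 0.0."""
    return 1.0 if normalize(predicted) == normalize(gold) else 0.0


def loss_mask(segments):
    """Return one 0/1 flag per token: 1 for LLM-generated tokens, 0 for retrieved ones."""
    mask = []
    for source, tokens in segments:
        mask.extend([1 if source == "llm" else 0] * len(tokens))
    return mask


def masked_mean(per_token_loss, mask):
    """Average the loss over LLM-generated tokens only."""
    kept = sum(mask)
    return sum(l * m for l, m in zip(per_token_loss, mask)) / kept if kept else 0.0
```

&lt;p&gt;With the mask applied, large loss values on retrieved tokens have no effect on the objective, so the policy is never pushed to reproduce search results.&lt;/p&gt;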

&lt;hr /&gt;

&lt;h3 id=&quot;1-high-level-design-indexing--retrieval--generation-1&quot;&gt;1. High-level Design (Indexing → Retrieval → Generation)&lt;/h3&gt;

&lt;h4 id=&quot;11-indexing-1&quot;&gt;1.1 Indexing&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Dimension&lt;/th&gt;
      &lt;th&gt;Approach&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Chunking strategy&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;—&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Index structure&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;— (relies on external search engine; offline index construction not explicitly recorded in notes)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Knowledge representation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;—&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Construction cost&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;—&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Core characteristic&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Notes do not explicitly record index construction; the offline phase focuses on model training rather than knowledge base construction&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;12-retrieval-1&quot;&gt;1.2 Retrieval&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Dimension&lt;/th&gt;
      &lt;th&gt;Approach&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Retrieval method&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Dynamic search triggering (model learns via RL to autonomously decide when to retrieve)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Retrieval granularity&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Documents (external documents returned by the search engine)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Iteration strategy&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Multi-turn iteration (search may be triggered multiple times during reasoning until termination conditions are met)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Query processing&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Model autonomously generates search queries based on current reasoning state&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Core characteristic&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;RL-driven agentic search: model learns the optimal search interaction strategy rather than relying on manual prompt engineering&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;13-generation-1&quot;&gt;1.3 Generation&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Dimension&lt;/th&gt;
      &lt;th&gt;Approach&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Context injection&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Retrieved documents injected into the reasoning chain via &amp;lt;information&amp;gt; tags&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Citation tracing&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;—&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Quality control&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Outcome-oriented reward (EM matching) drives generation quality; retrieval token masking ensures training stability&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Core characteristic&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Structured token system (&amp;lt;search&amp;gt; / &amp;lt;information&amp;gt; / &amp;lt;think&amp;gt; / &amp;lt;answer&amp;gt;); reasoning and retrieval are interleaved&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;2-offline-construction-rl-training-detailed-execution&quot;&gt;2. Offline Construction: RL Training (Detailed Execution)&lt;/h3&gt;

&lt;p&gt;The core of Search-R1’s offline phase is RL training rather than traditional knowledge base index construction. The system relies on external search engines (e.g., Bing) for document retrieval, and the notes do not explicitly record traditional index construction steps.&lt;/p&gt;

&lt;h4 id=&quot;step-21-search-engine-environment-modeling&quot;&gt;Step 2.1 Search Engine Environment Modeling&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Policy LLM π_θ, search engine R&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Model the search engine as part of the environment; sampled trajectory sequences interleave LLM token generation with search engine retrieval&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Key decision&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Unlike previous methods that rely solely on the policy LLM for rollout generation, explicitly introduces retrieval-interleaved reasoning: π_θ(· | x; R)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Rollout environment with interleaved retrieval and reasoning&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;step-22-structured-token-system-definition&quot;&gt;Step 2.2 Structured Token System Definition&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Define four types of special tokens to structure the interaction&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Key decision&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Uses special tokens rather than free text, facilitating rule-based parsing and training control&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Token system: &amp;lt;search&amp;gt; / &amp;lt;/search&amp;gt;, &amp;lt;information&amp;gt; / &amp;lt;/information&amp;gt;, &amp;lt;think&amp;gt; / &amp;lt;/think&amp;gt;, &amp;lt;answer&amp;gt; / &amp;lt;/answer&amp;gt;&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;step-23-rollout-sequence-generation&quot;&gt;Step 2.3 Rollout Sequence Generation&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Query q, policy model π_θ, search engine R&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Model generates response tokens y_t ~ π_θ(· | x, y_{&amp;lt;t}; R), which are appended to the rollout sequence&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Key decision&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;The sequence contains two types of tokens: LLM-generated tokens and retrieved tokens&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Complete rollout sequence y&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;step-24-retrieval-token-masking-loss-masking&quot;&gt;Step 2.4 Retrieval Token Masking (Loss Masking)&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Problem&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Optimizing retrieved tokens in the same way as generated tokens can lead to unintended learning dynamics&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Introduce a loss mask ensuring the policy gradient objective is computed only on LLM-generated tokens&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Key decision&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Indicator function I(y_t): model-generated token = 1, retrieved content (within &amp;lt;information&amp;gt;) = 0&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Masked policy gradient computation&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
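
&lt;p&gt;A minimal sketch of the masking step above, assuming a hypothetical rollout described as (segment kind, token count) pairs and a plain REINFORCE-style surrogate rather than the full PPO or GRPO objective. The function names, segment labels, and numbers are illustrative assumptions and do not come from the notes.&lt;/p&gt;

```python
def build_loss_mask(segments):
    """Return I(y_t) per token: 1 for model-generated, 0 for retrieved."""
    mask = []
    for kind, n_tokens in segments:
        # Only tokens inside the information segment come from the search engine.
        mask.extend([0 if kind == "information" else 1] * n_tokens)
    return mask


def masked_pg_loss(log_probs, advantages, mask):
    """Negative masked mean of log-prob times advantage (illustrative surrogate)."""
    kept = sum(mask)
    if kept == 0:
        return 0.0
    total = sum(m * lp * adv for m, lp, adv in zip(mask, log_probs, advantages))
    return -total / kept


# Hypothetical rollout: think (2 tokens), search (1), information (3), answer (1).
segments = [("think", 2), ("search", 1), ("information", 3), ("answer", 1)]
mask = build_loss_mask(segments)
log_probs = [-0.5, -0.2, -0.1, -2.0, -2.0, -2.0, -0.4]
advantages = [1.0] * 7
loss = masked_pg_loss(log_probs, advantages, mask)
print(mask, loss)
```

&lt;p&gt;In this toy case the three retrieved tokens contribute nothing to the loss, however unlikely the policy considers them, so the gradient only reflects the tokens the model actually chose.&lt;/p&gt;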

&lt;h4 id=&quot;step-25-reward-computation&quot;&gt;Step 2.5 Reward Computation&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Model’s final answer a_pred, ground-truth answer a_gold&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Apply a rule-based outcome-oriented reward: r_φ(x, y) = EM(a_pred, a_gold)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Key decision&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Does not use format reward (model already demonstrates strong structural compliance); does not train a neural reward model (avoids sensitivity and additional cost)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Reward value&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
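
&lt;p&gt;The rule-based reward can be sketched as below. The notes only state r_φ(x, y) = EM(a_pred, a_gold); the normalization used here (lowercasing, dropping punctuation and English articles, collapsing whitespace) is the common QA evaluation convention and is an assumption, as are the function names.&lt;/p&gt;

```python
import re
import string


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def em_reward(a_pred, gold_answers):
    """Return 1.0 if the prediction exactly matches any gold answer, else 0.0."""
    pred = normalize_answer(a_pred)
    return 1.0 if any(pred == normalize_answer(g) for g in gold_answers) else 0.0


print(em_reward("The Eiffel Tower.", ["Eiffel Tower"]))
print(em_reward("Paris", ["London"]))
```

&lt;p&gt;Because the reward depends only on the final answer string, no format reward or learned reward model is involved in this sketch, which matches the key decision recorded above.&lt;/p&gt;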

&lt;h4 id=&quot;step-26-policy-update-ppo--grpo&quot;&gt;Step 2.6 Policy Update (PPO / GRPO)&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;PPO&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Actor-critic method: the actor generates responses, a critic value network estimates state values, and advantages are computed with GAE&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;GRPO&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Group-relative advantage estimation with no critic network: samples a group of G outputs per query and uses the group rewards as the baseline&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Key decision&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Both are compatible; GRPO converges faster, PPO trains more stably, final rewards are comparable&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Updated policy model parameters&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
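
&lt;p&gt;To make the difference between the two advantage estimators concrete, here is a small sketch under stated assumptions. The GRPO variant standardizes rewards within a group by mean and population standard deviation, which is a common formulation but is not spelled out in the notes; the GAE variant takes critic values as a plain list. Function names and inputs are hypothetical.&lt;/p&gt;

```python
from statistics import mean, pstdev


def grpo_advantages(rewards, eps=1e-8):
    """Group-relative advantage: standardize each reward within its group of G samples."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def gae_advantages(rewards, values, gamma=1.0, lam=1.0):
    """Generalized Advantage Estimation over one trajectory.

    values holds one more entry than rewards: the bootstrap value after the last step.
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


# GRPO: four sampled outputs for one query, EM reward 1 or 0 each.
group_adv = grpo_advantages([1.0, 0.0, 0.0, 1.0])

# PPO with GAE: reward arrives only at the final token; critic predicts 0.5 throughout.
token_adv = gae_advantages([0.0, 0.0, 1.0], [0.5, 0.5, 0.5, 0.0])
print(group_adv, token_adv)
```

&lt;p&gt;The GRPO path needs only the G rewards of a group, while the GAE path needs a value estimate per token, which is why PPO carries the extra critic network.&lt;/p&gt;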

&lt;hr /&gt;

&lt;h3 id=&quot;3-online-query-retrieval-detailed-execution-1&quot;&gt;3. Online Query: Retrieval (Detailed Execution)&lt;/h3&gt;

&lt;h4 id=&quot;31-retrieval-mode-overview-1&quot;&gt;3.1 Retrieval Mode Overview&lt;/h4&gt;

&lt;p&gt;Search-R1 employs a single RL-driven search mode, autonomously triggered by the trained model during inference:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Mode&lt;/th&gt;
      &lt;th&gt;Applicable Scenario&lt;/th&gt;
      &lt;th&gt;Core Mechanism&lt;/th&gt;
      &lt;th&gt;Characteristic&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;RL-driven Search&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Knowledge uncertainty encountered during reasoning&lt;/td&gt;
      &lt;td&gt;Model autonomously generates &amp;lt;search&amp;gt; queries → retrieves → injects&lt;/td&gt;
      &lt;td&gt;Learned, not manually predefined strategy&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;32-retrieval-procedure-1&quot;&gt;3.2 Retrieval Procedure&lt;/h4&gt;

&lt;p&gt;&lt;img src=&quot;attachments/Pasted%20image%2020260513202849.png&quot; alt=&quot;Pasted image 20260513202849.png&quot; /&gt;&lt;/p&gt;

&lt;h5 id=&quot;step-31-reasoning-and-search-alternation&quot;&gt;Step 3.1: Reasoning and Search Alternation&lt;/h5&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Query q, current reasoning chain state&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;LLM alternates between text generation and external search engine queries&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Key decision&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Iterative framework: at each step, generation continues until &amp;lt;search&amp;gt;, &amp;lt;/answer&amp;gt;, or &amp;lt;eos&amp;gt; is detected&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Response token sequence&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h5 id=&quot;step-32-search-query-detection-and-execution&quot;&gt;Step 3.2: Search Query Detection and Execution&lt;/h5&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Query wrapped in &amp;lt;search&amp;gt; tags&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Extract search query from rollout sequence, invoke search engine to retrieve documents D&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Retrieved documents, wrapped as &amp;lt;information&amp;gt; ... &amp;lt;/information&amp;gt; and injected into the sequence&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h5 id=&quot;step-33-iterative-termination-judgment&quot;&gt;Step 3.3: Iterative Termination Judgment&lt;/h5&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Termination condition 1&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Maximum action count B reached&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Termination condition 2&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Model generates final response wrapped in &amp;lt;answer&amp;gt; tags&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Fallback strategy&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;If &amp;lt;answer&amp;gt; content is empty (model outputs &quot;My action is not correct. Let me rethink.&quot;), continue iteration&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;4-online-generation-generation-detailed-execution-1&quot;&gt;4. Online Generation: Generation (Detailed Execution)&lt;/h3&gt;

&lt;h4 id=&quot;41-reasoning-procedure&quot;&gt;4.1 Reasoning Procedure&lt;/h4&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;table class=&quot;rouge-table&quot;&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td class=&quot;rouge-code&quot;&gt;&lt;pre&gt;Input question q
    │
    ▼
┌────────────────────────────────┐
│ Initialize: action count b ← 0 │
│ Initialize: response y ← ∅     │
└───────────┬────────────────────┘
            │
            ▼
    ┌───────┴───────┐
    │  while b &amp;lt; B  │
    └───────┬───────┘
            │
            ▼
    ┌──────────────────────────┐
    │ Generate response tokens │
    │ until termination signal │
    └─────────┬────────────────┘
              │
         ┌────┴────┐
         ▼         ▼
      &amp;lt;search&amp;gt;   &amp;lt;answer&amp;gt;
         │         │
         ▼         ▼
      Retrieve    Return y
      documents
      inject
         │
         ▼
      b ← b + 1
         │
         └──────→ Continue generating
&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
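
&lt;p&gt;The loop in the flowchart can be sketched as follows. This is a simplified illustration, not the released implementation: tag parsing is abstracted away, so the hypothetical generate_step callable returns an already-parsed (kind, text) pair, and both the policy and the search engine are toy stand-ins.&lt;/p&gt;

```python
def rollout(question, generate_step, search, max_actions):
    """Interleave generation and retrieval until an answer appears or the budget B runs out."""
    sequence = [("question", question)]
    for _ in range(max_actions):
        kind, text = generate_step(sequence)  # hypothetical policy call
        sequence.append((kind, text))
        if kind == "answer":
            return text, sequence
        if kind == "search":
            docs = search(text)  # hypothetical search engine call
            # Retrieved text is injected as its own segment so it can be masked later.
            sequence.append(("information", " ".join(docs)))
    return None, sequence  # budget exhausted without an answer


def toy_policy(sequence):
    """Stand-in for the trained LLM: search once, then answer."""
    if any(kind == "information" for kind, _ in sequence):
        return "answer", "Paris"
    return "search", "capital of France"


def toy_search(query):
    """Stand-in for the search engine R."""
    return ["Paris is the capital of France."]


answer, trace = rollout("What is the capital of France?", toy_policy, toy_search, max_actions=4)
print(answer, [kind for kind, _ in trace])
```

&lt;p&gt;Keeping retrieved text in a separate segment mirrors the distinction between LLM-generated and retrieved tokens that the training phase relies on for loss masking.&lt;/p&gt;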

&lt;h4 id=&quot;step-41-online-reasoning-sequence-generation&quot;&gt;Step 4.1: Online Reasoning Sequence Generation&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Query q, trained policy model π_θ, search engine R&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Model autoregressively generates response tokens; pauses upon encountering &amp;lt;search&amp;gt; to trigger retrieval, then continues generation after injecting results&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Key decision&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Inference follows the same structure as the training rollout, but no gradient updates are computed&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Final response sequence y, containing the answer wrapped in &amp;lt;answer&amp;gt; tags&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;5-key-design-decisions-1&quot;&gt;5. Key Design Decisions&lt;/h3&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Decision Point&lt;/th&gt;
      &lt;th&gt;Search-R1’s Choice&lt;/th&gt;
      &lt;th&gt;Alternative&lt;/th&gt;
      &lt;th&gt;Rationale&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Training paradigm&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Reinforcement learning (PPO / GRPO)&lt;/td&gt;
      &lt;td&gt;Prompt engineering / SFT / Rejection sampling&lt;/td&gt;
      &lt;td&gt;Explicitly stated in notes: prompting advanced LLMs to use search engines is often suboptimal; LLMs may not fully possess the ability to optimally interact with search engines&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Retrieval triggering&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Model learns autonomously (&amp;lt;search&amp;gt; token)&lt;/td&gt;
      &lt;td&gt;Fixed retrieval strategy / manually predefined trigger conditions&lt;/td&gt;
      &lt;td&gt;Explicitly stated in notes: RL training enables the model to autonomously learn when search is needed&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Reward design&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Outcome-oriented reward (EM matching)&lt;/td&gt;
      &lt;td&gt;Process reward / neural reward model / format reward&lt;/td&gt;
      &lt;td&gt;Explicitly stated in notes: uses simple outcome reward to avoid complexity; does not train a neural reward model to avoid sensitivity and cost&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Training stability&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Retrieval token masking&lt;/td&gt;
      &lt;td&gt;No special handling of retrieved content&lt;/td&gt;
      &lt;td&gt;Explicitly stated in notes: optimizing retrieved tokens in the same way as generated tokens leads to unintended learning dynamics&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;RL algorithm&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Supports both PPO and GRPO&lt;/td&gt;
      &lt;td&gt;Single RL algorithm&lt;/td&gt;
      &lt;td&gt;Explicitly stated in notes: both algorithms are compatible, providing empirical comparison&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;6-evaluation-1&quot;&gt;6. Evaluation&lt;/h3&gt;

&lt;h4 id=&quot;61-evaluation-metrics&quot;&gt;6.1 Evaluation Metrics&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Metric&lt;/th&gt;
      &lt;th&gt;Meaning&lt;/th&gt;
      &lt;th&gt;This System vs. Baseline&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;EM&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Exact Match&lt;/td&gt;
      &lt;td&gt;Qwen2.5-7B +41% over RAG baselines; Qwen2.5-3B +20% over RAG baselines&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;62-rl-training-comparison&quot;&gt;6.2 RL Training Comparison&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Condition&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;PPO&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Actor-critic RL; higher training stability&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;GRPO&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Group Relative Policy Optimization; faster convergence&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;GRPO converges faster, PPO is more stable, final training rewards are comparable&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;63-experimental-setup&quot;&gt;6.3 Experimental Setup&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Condition&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Search-R1&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Full system (RL training + retrieval masking + outcome reward)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;CoT&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Chain-of-thought baseline&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;vanilla RAG&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Standard retrieval-augmented generation baseline&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;IRCoT&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Iterative retrieval chain-of-thought baseline&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Search-o1&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Inference-time search augmentation baseline&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;R1&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;DeepSeek-R1 baseline&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;SFT&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Supervised fine-tuning baseline&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Rejection Sampling&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Rejection sampling baseline&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;64-datasets&quot;&gt;6.4 Datasets&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Dataset&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Natural Questions (NQ)&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Open-domain QA benchmark&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;TriviaQA&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Open-domain QA benchmark&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;PopQA&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Knowledge-intensive QA benchmark&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;HotpotQA&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Multi-hop QA benchmark&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;2WikiMultiHopQA&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Multi-hop QA benchmark&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;MuSiQue&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Multi-hop QA benchmark&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Bamboogle&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Multi-hop QA benchmark&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;7-limitations-and-applicability-1&quot;&gt;7. Limitations and Applicability&lt;/h3&gt;

&lt;p&gt;The notes do not explicitly record Search-R1’s limitations.&lt;/p&gt;

&lt;h4 id=&quot;best-applicable-scenarios-1&quot;&gt;Best-Suited Scenarios&lt;/h4&gt;

&lt;ul&gt;
  &lt;li&gt;Reasoning tasks requiring efficient acquisition of external knowledge and up-to-date information&lt;/li&gt;
  &lt;li&gt;Open-domain QA (NQ, TriviaQA, etc.)&lt;/li&gt;
  &lt;li&gt;Multi-hop QA (HotpotQA, 2WikiMultiHopQA, MuSiQue, etc.)&lt;/li&gt;
  &lt;li&gt;Knowledge-intensive reasoning tasks&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id=&quot;unsuitable-scenarios-1&quot;&gt;Unsuitable Scenarios&lt;/h4&gt;

&lt;ul&gt;
  &lt;li&gt;—&lt;/li&gt;
&lt;/ul&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;8-quick-reference-1&quot;&gt;8. Quick Reference&lt;/h3&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;What You Want to Know&lt;/th&gt;
      &lt;th&gt;Where to Look&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;What is the complete pipeline?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#0-execution-overview&quot;&gt;0. Execution Overview&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;High-level design comparison?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#1-high-level-design-indexing--retrieval--generation-1&quot;&gt;1. High-level Design&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;How is the model trained?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#2-offline-construction-rl-training-detailed-execution&quot;&gt;2. Offline Construction&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;How is retrieval triggered?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#3-online-query-retrieval-detailed-execution-1&quot;&gt;3. Online Query&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;RL training details?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#2-offline-construction-rl-training-detailed-execution&quot;&gt;2. Offline Construction&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Why this design?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#5-key-design-decisions-1&quot;&gt;5. Key Design Decisions&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;How does it perform?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#6-evaluation-1&quot;&gt;6. Evaluation&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;When should it NOT be used?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#7-limitations-and-applicability-1&quot;&gt;7. Limitations and Applicability&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

</description>
        <pubDate>Mon, 01 Jun 2026 00:00:00 +0000</pubDate>
        <link>https://www.caozihang.com//2026/06/01/week/</link>
        <guid isPermaLink="true">https://www.caozihang.com//2026/06/01/week/</guid>
        
        <category>日常</category>
        
        
      </item>
    
      <item>
        <title>[Reading] Weekly Reading Selections 2026-05-18 → 2026-05-24</title>
        <description>&lt;center style=&quot;margin-bottom: 20px; margin-top: 50px&quot;&gt;&lt;font color=&quot;#3879B1&quot; style=&quot;line-height: 1.4;font-weight: 700;font-size: 36px;box-sizing: border-box; &quot;&gt;Weekly Reading Selections&lt;/font&gt;&lt;/center&gt;

&lt;center style=&quot; margin-bottom: 30px;&quot;&gt;2026-05-18 → 2026-05-24&lt;/center&gt;

&lt;font style=&quot;font-weight: bold;&quot;&gt;Contents&lt;/font&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#学术相关&quot; id=&quot;markdown-toc-学术相关&quot;&gt;Academic&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#hipporag&quot; id=&quot;markdown-toc-hipporag&quot;&gt;HippoRAG&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#0-execution-overview&quot; id=&quot;markdown-toc-0-execution-overview&quot;&gt;0. Execution Overview&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#1-high-level-design-indexing--retrieval--generation&quot; id=&quot;markdown-toc-1-high-level-design-indexing--retrieval--generation&quot;&gt;1. High-level Design (Indexing → Retrieval → Generation)&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#2-offline-construction-indexing-detailed-execution&quot; id=&quot;markdown-toc-2-offline-construction-indexing-detailed-execution&quot;&gt;2. Offline Construction: Indexing (Detailed Execution)&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#3-online-query-retrieval-detailed-execution&quot; id=&quot;markdown-toc-3-online-query-retrieval-detailed-execution&quot;&gt;3. Online Query: Retrieval (Detailed Execution)&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#4-online-generation-generation-detailed-execution&quot; id=&quot;markdown-toc-4-online-generation-generation-detailed-execution&quot;&gt;4. Online Generation: Generation (Detailed Execution)&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#5-key-design-decisions&quot; id=&quot;markdown-toc-5-key-design-decisions&quot;&gt;5. Key Design Decisions&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#6-evaluation&quot; id=&quot;markdown-toc-6-evaluation&quot;&gt;6. Evaluation&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#7-limitations-and-applicability&quot; id=&quot;markdown-toc-7-limitations-and-applicability&quot;&gt;7. Limitations and Applicability&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#8-quick-reference&quot; id=&quot;markdown-toc-8-quick-reference&quot;&gt;8. Quick Reference&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#hipporag2&quot; id=&quot;markdown-toc-hipporag2&quot;&gt;HippoRAG2&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#0-execution-overview-1&quot; id=&quot;markdown-toc-0-execution-overview-1&quot;&gt;0. Execution Overview&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#1-high-level-design-indexing--retrieval--generation-1&quot; id=&quot;markdown-toc-1-high-level-design-indexing--retrieval--generation-1&quot;&gt;1. High-level Design (Indexing → Retrieval → Generation)&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#2-offline-construction-indexing-detailed-execution-1&quot; id=&quot;markdown-toc-2-offline-construction-indexing-detailed-execution-1&quot;&gt;2. Offline Construction: Indexing (Detailed Execution)&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#3-online-query-retrieval-detailed-execution-1&quot; id=&quot;markdown-toc-3-online-query-retrieval-detailed-execution-1&quot;&gt;3. Online Query: Retrieval (Detailed Execution)&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#4-online-generation-generation-detailed-execution-1&quot; id=&quot;markdown-toc-4-online-generation-generation-detailed-execution-1&quot;&gt;4. Online Generation: Generation (Detailed Execution)&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#5-key-design-decisions-1&quot; id=&quot;markdown-toc-5-key-design-decisions-1&quot;&gt;5. Key Design Decisions&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#6-evaluation-1&quot; id=&quot;markdown-toc-6-evaluation-1&quot;&gt;6. Evaluation&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#7-limitations-and-applicability-1&quot; id=&quot;markdown-toc-7-limitations-and-applicability-1&quot;&gt;7. Limitations and Applicability&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#8-quick-reference-1&quot; id=&quot;markdown-toc-8-quick-reference-1&quot;&gt;8. Quick Reference&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;学术相关&quot;&gt;Academic&lt;/h1&gt;

&lt;h2 id=&quot;hipporag&quot;&gt;HippoRAG&lt;/h2&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;One-sentence positioning&lt;/strong&gt;: A retrieval framework inspired by the hippocampal indexing theory that extracts an open knowledge graph via LLM and combines it with the Personalized PageRank algorithm to achieve cross-passage multi-hop knowledge integration in a single retrieval step.&lt;/p&gt;

  &lt;p&gt;&lt;strong&gt;Key innovation&lt;/strong&gt;: Uses Personalized PageRank to perform pattern completion on the knowledge graph, compressing traditional iterative multi-hop retrieval into a single-step graph traversal, achieving performance comparable to or better than iterative retrieval while reducing cost by 10-30x and improving speed by 6-13x.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;0-execution-overview&quot;&gt;0. Execution Overview&lt;/h3&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;table class=&quot;rouge-table&quot;&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td class=&quot;rouge-gutter gl&quot;&gt;&lt;pre class=&quot;lineno&quot;&gt;1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
&lt;/pre&gt;&lt;/td&gt;&lt;td class=&quot;rouge-code&quot;&gt;&lt;pre&gt;Offline Phase (one-time construction)
  ├─ ① Named entity recognition (extract named entities from each passage)
  ├─ ② OpenIE extraction of KG triples (two-stage: NER → OpenIE extracts noun phrase nodes N and relation edges E)
  ├─ ③ Retrieval encoder supplements synonym edges (add E&apos; between entities with cosine similarity &amp;gt; τ)
  └─ ④ Construct Passage-Node co-occurrence matrix P (|N| × |P|, recording each noun phrase&apos;s occurrence count in each passage)
         ↓
Online Phase (per query)
  ├─ ⑤ Query entity extraction (LLM extracts salient named entities from query)
  ├─ ⑥ Query node mapping (encode each entity, then map it to the most similar KG node via cosine similarity)
  ├─ ⑦ Run Personalized PageRank (query nodes as seeds, PPR distributes probability on graph)
  │      - Initialization: query nodes equal probability, rest are 0
  │      - Node-specific weighting: si = |Pi|^{-1} (similar to TF-IDF)
  │      - Transition probability constructed via adjacency matrix
  └─ ⑧ Aggregate PPR node probabilities with co-occurrence matrix P to obtain passage ranking scores
         ↓
  ⑨ Retrieved passages input to LLM for final answer generation
&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;1-high-level-design-indexing--retrieval--generation&quot;&gt;1. High-level Design (Indexing → Retrieval → Generation)&lt;/h3&gt;

&lt;h4 id=&quot;11-indexing&quot;&gt;1.1 Indexing&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Dimension&lt;/th&gt;
      &lt;th&gt;Approach&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Chunking strategy&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Processed by passage (notes do not document chunking parameters)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Index structure&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Graph index (open KG + synonym edges + Passage-Node co-occurrence matrix)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Knowledge representation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Entity-relation graph (schemaless OpenIE triples) + synonym relations + co-occurrence statistics&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Construction cost&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Medium (LLM two-stage extraction + encoder similarity computation)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Core characteristic&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Artificial neocortex (LLM) handles extraction, artificial hippocampus (open KG) handles indexing, parahippocampal region (retrieval encoder) handles connection&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;12-retrieval&quot;&gt;1.2 Retrieval&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Dimension&lt;/th&gt;
      &lt;th&gt;Approach&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Retrieval method&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Graph traversal (Personalized PageRank)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Retrieval granularity&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Passage&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Iteration strategy&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Single retrieval (PPR achieves multi-hop effect in a single step)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Query processing&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Entity extraction → Query node mapping (cosine similarity)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Core characteristic&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;PPR uses query nodes as seeds to diffuse probability on the graph, achieving in one step what traditional methods require iteration for&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;13-generation&quot;&gt;1.3 Generation&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Dimension&lt;/th&gt;
      &lt;th&gt;Approach&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Context injection&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Retrieved and ranked passages as context input to LLM&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Citation tracing&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Based on Passage-Node co-occurrence matrix associating nodes with original passages&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Quality control&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Notes do not explicitly document special mechanisms in the generation phase&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Core characteristic&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Positioned as a retrieval framework; the generation phase relies on standard LLM generation&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;2-offline-construction-indexing-detailed-execution&quot;&gt;2. Offline Construction: Indexing (Detailed Execution)&lt;/h3&gt;

&lt;h4 id=&quot;step-21-named-entity-recognition&quot;&gt;Step 2.1 Named Entity Recognition&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Raw passage collection P&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Extract named entities from each passage&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Key decision&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;First step of two-stage extraction: extract named entities first, then add them to the OpenIE prompt, balancing generality and named entity bias&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Named entity set for each passage&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;step-22-openie-extraction-of-kg-triples&quot;&gt;Step 2.2 OpenIE Extraction of KG Triples&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Passages + named entities&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;LLM performs open information extraction (OpenIE) via 1-shot prompting, extracting noun phrase nodes N and relation edges E&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Key decision&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Add the named entities to the OpenIE prompt so that the final triples also capture concepts beyond named entities (general noun phrases)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Nodes N and edges E of schemaless open KG&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;step-23-supplement-synonym-edges&quot;&gt;Step 2.3 Supplement Synonym Edges&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Node set N + retrieval encoder M&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Compute cosine similarity between node representations; add synonym relation edges E’ when similarity exceeds threshold τ&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Key decision&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Uses off-the-shelf dense retrieval encoders to establish additional edges between similar but non-identical noun phrases, aiding downstream pattern completion&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Extended edge set (E + E’)&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
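
&lt;p&gt;The thresholding rule in Step 2.3 can be sketched as follows. This is a minimal illustration, not the authors&apos; implementation: the helper names &lt;code&gt;cosine&lt;/code&gt; and &lt;code&gt;synonym_edges&lt;/code&gt;, the toy two-dimensional embeddings, and the value &lt;code&gt;tau=0.8&lt;/code&gt; are all assumptions made for the example; a real system would take the vectors from the retrieval encoder M.&lt;/p&gt;

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def synonym_edges(embeddings, tau=0.8):
    """Return node pairs whose cosine similarity is strictly above tau."""
    return [
        (a, b)
        for a, b in combinations(sorted(embeddings), 2)
        if cosine(embeddings[a], embeddings[b]) > tau
    ]


# Hypothetical embeddings; stand-ins for retrieval-encoder output.
toy_embeddings = {
    "Stanford": [1.0, 0.0],
    "Stanford University": [0.9, 0.1],
    "Alzheimer": [0.0, 1.0],
}
edges = synonym_edges(toy_embeddings)
```

&lt;p&gt;The pairwise loop is quadratic in the number of nodes, which is fine for a sketch; a large graph would more plausibly use nearest-neighbour search over the embeddings.&lt;/p&gt;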

&lt;h4 id=&quot;step-24-construct-passage-node-co-occurrence-matrix&quot;&gt;Step 2.4 Construct Passage-Node Co-occurrence Matrix&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Final node set N + original passages P&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Count how many times each noun phrase occurs in each original passage&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;|N| × |P| co-occurrence matrix P (recording each noun phrase’s occurrence count in each passage)&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
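
&lt;p&gt;As a rough sketch of Step 2.4, the matrix can be built by counting phrase occurrences per passage. The function name &lt;code&gt;cooccurrence_matrix&lt;/code&gt; and the case-insensitive substring counting are assumptions for illustration; the notes do not say how occurrences are matched.&lt;/p&gt;

```python
def cooccurrence_matrix(nodes, passages):
    """Return a |N| x |P| list of lists: rows are nodes, columns are passages."""
    return [
        # Assumption: an occurrence is a case-insensitive substring match.
        [passage.lower().count(node.lower()) for passage in passages]
        for node in nodes
    ]


nodes = ["Stanford", "Alzheimer"]
passages = [
    "Thomas is a professor at Stanford.",
    "Thomas studies Alzheimer. Alzheimer research is his main focus.",
    "Alice also teaches at Stanford.",
]
matrix = cooccurrence_matrix(nodes, passages)
```

&lt;p&gt;Each row also gives the set of passages in which a node appears, which is the quantity the node-specific weighting in the retrieval phase depends on.&lt;/p&gt;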

&lt;hr /&gt;

&lt;h3 id=&quot;3-online-query-retrieval-detailed-execution&quot;&gt;3. Online Query: Retrieval (Detailed Execution)&lt;/h3&gt;

&lt;h4 id=&quot;31-retrieval-procedure&quot;&gt;3.1 Retrieval Procedure&lt;/h4&gt;

&lt;h5 id=&quot;step-31-query-entity-extraction&quot;&gt;Step 3.1 Query Entity Extraction&lt;/h5&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Operation&lt;/strong&gt;: LLM extracts salient named entities (query named entities) from the query&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Purpose&lt;/strong&gt;: Transform natural language query into seed nodes on the graph&lt;/li&gt;
&lt;/ul&gt;

&lt;h5 id=&quot;step-32-query-node-mapping&quot;&gt;Step 3.2 Query Node Mapping&lt;/h5&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Operation&lt;/strong&gt;: Encode the query named entities with the retrieval encoder, then map each one to its most similar KG node by cosine similarity&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Output&lt;/strong&gt;: Query nodes&lt;/li&gt;
&lt;/ul&gt;

&lt;h5 id=&quot;step-33-personalized-pagerank-execution&quot;&gt;Step 3.3 Personalized PageRank Execution&lt;/h5&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Initialize personalized probability distribution&lt;/strong&gt;: All query nodes receive equal probability; every other node starts at 0&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Node-specific weighting&lt;/strong&gt;: si = |Pi|^{-1}, where Pi is the set of passages from which node i was extracted; each query node’s probability is multiplied by si before PPR runs, similar to TF-IDF’s inverse document frequency idea&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Transition probability&lt;/strong&gt;: Constructed via adjacency matrix&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Core mechanism&lt;/strong&gt;: PPR distributes probability on the graph only through user-defined source nodes (query nodes), simulating pattern completion in hippocampal neural pathways&lt;/li&gt;
&lt;/ul&gt;

&lt;h5 id=&quot;step-34-passage-ranking&quot;&gt;Step 3.4 Passage Ranking&lt;/h5&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Operation&lt;/strong&gt;: Multiply the updated PPR node probability distribution with the co-occurrence matrix P&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Output&lt;/strong&gt;: Final ranking score for each passage&lt;/li&gt;
&lt;/ul&gt;
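
&lt;p&gt;Steps 3.3 and 3.4 can be sketched end to end on a toy graph. This is a hedged illustration rather than the paper&apos;s code: the function &lt;code&gt;personalized_pagerank&lt;/code&gt;, the damping value 0.5, the fixed 50 iterations, the renormalization of seed weights, and the toy nodes and passage ids are all assumptions chosen for the example.&lt;/p&gt;

```python
def personalized_pagerank(adj, seeds, damping=0.5, iterations=50):
    """Power iteration for PPR on a graph given as {node: [neighbours]}."""
    nodes = sorted(adj)
    total = sum(seeds.values())
    # Restart distribution: non-zero only on the query (seed) nodes.
    reset = {n: seeds.get(n, 0.0) / total for n in nodes}
    rank = dict(reset)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * reset[n] for n in nodes}
        for n in nodes:
            # Assumes every node has at least one neighbour (true here).
            share = damping * rank[n] / len(adj[n])
            for m in adj[n]:
                nxt[m] += share
        rank = nxt
    return rank


# Toy KG: Thomas links to both query entities, Alice only to Stanford.
adj = {
    "Stanford": ["Thomas", "Alice"],
    "Alzheimer": ["Thomas"],
    "Thomas": ["Stanford", "Alzheimer"],
    "Alice": ["Stanford"],
}
# Co-occurrence counts stored sparsely: node -> {passage id: count}.
cooccurrence = {
    "Stanford": {"p1": 1, "p3": 1},
    "Alzheimer": {"p2": 1},
    "Thomas": {"p1": 1, "p2": 1},
    "Alice": {"p3": 1},
}
query_nodes = ["Stanford", "Alzheimer"]

# Equal seed probability, scaled by node specificity s_i = 1 / |P_i|.
seeds = {n: (1.0 / len(query_nodes)) / len(cooccurrence[n]) for n in query_nodes}
rank = personalized_pagerank(adj, seeds)

# Passage score = sum over nodes of PPR probability x occurrence count.
scores = {}
for node, counts in cooccurrence.items():
    for pid, count in counts.items():
        scores[pid] = scores.get(pid, 0.0) + rank[node] * count
ranking = sorted(scores, key=scores.get, reverse=True)
```

&lt;p&gt;On this toy input the ranking is p2, p1, p3: both passages that mention Thomas outrank the passage about Alice, even though Thomas is not a query node, because probability flows to him from both seeds. That is the single-step multi-hop effect the notes describe.&lt;/p&gt;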

&lt;hr /&gt;

&lt;h3 id=&quot;4-online-generation-generation-detailed-execution&quot;&gt;4. Online Generation: Generation (Detailed Execution)&lt;/h3&gt;

&lt;p&gt;HippoRAG is positioned as a retrieval framework; the notes do not document any special design for the generation phase. The retrieved and ranked passages are passed as context to the downstream LLM for answer generation.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;table class=&quot;rouge-table&quot;&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td class=&quot;rouge-gutter gl&quot;&gt;&lt;pre class=&quot;lineno&quot;&gt;1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
&lt;/pre&gt;&lt;/td&gt;&lt;td class=&quot;rouge-code&quot;&gt;&lt;pre&gt;User query
    │
    ▼
┌─────────────────┐
│ Query entity    │  ← LLM extracts named entities
│ extraction      │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Query node      │  ← Cosine similarity matches KG nodes
│ mapping         │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ PPR probability │  ← Execute Personalized PageRank on graph
│ diffusion       │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Passage ranking │  ← PPR probability × co-occurrence matrix P
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ LLM generates   │  ← Retrieved passages as context
│ answer          │
└─────────────────┘
&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;5-key-design-decisions&quot;&gt;5. Key Design Decisions&lt;/h3&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Decision Point&lt;/th&gt;
      &lt;th&gt;HippoRAG’s Choice&lt;/th&gt;
      &lt;th&gt;Alternative&lt;/th&gt;
      &lt;th&gt;Rationale&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Index structure&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Open KG + synonym edges + co-occurrence matrix&lt;/td&gt;
      &lt;td&gt;Pure vector index / structured KG / text chunk index&lt;/td&gt;
      &lt;td&gt;Schemaless OpenIE flexibly adapts to any corpus; co-occurrence matrix associates nodes with original passages&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Retrieval algorithm&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Personalized PageRank single-step graph traversal&lt;/td&gt;
      &lt;td&gt;Iterative multi-hop retrieval (e.g., IRCoT)&lt;/td&gt;
      &lt;td&gt;PPR achieves multi-hop effect in one step, reducing cost by 10-30x and improving speed by 6-13x&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Entity extraction&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Two-stage: NER → OpenIE&lt;/td&gt;
      &lt;td&gt;Single-stage OpenIE&lt;/td&gt;
      &lt;td&gt;Balances generality and named entity bias&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Synonym edge construction&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Retrieval encoder cosine similarity&lt;/td&gt;
      &lt;td&gt;Exact string matching / semantic models&lt;/td&gt;
      &lt;td&gt;Establishes connections between similar but non-identical phrases, aiding pattern completion&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Node-specific weighting&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;si = |Pi|^{-1} (similar to TF-IDF)&lt;/td&gt;
      &lt;td&gt;Uniform weighting / other weighting strategies&lt;/td&gt;
      &lt;td&gt;Reduces weight of high-frequency nodes, improving retrieval precision&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;6-evaluation&quot;&gt;6. Evaluation&lt;/h3&gt;

&lt;h4 id=&quot;61-evaluation-metrics&quot;&gt;6.1 Evaluation Metrics&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Metric&lt;/th&gt;
      &lt;th&gt;Meaning&lt;/th&gt;
      &lt;th&gt;System vs. Baseline&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;recall@2 / recall@5&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Retrieval recall&lt;/td&gt;
      &lt;td&gt;Superior to traditional RAG methods&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;EM&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Exact match&lt;/td&gt;
      &lt;td&gt;Single-step retrieval comparable to or better than IRCoT&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;F1&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;F1 score&lt;/td&gt;
      &lt;td&gt;Up to 20% improvement over SOTA&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;62-comparative-experimental-setup&quot;&gt;6.2 Comparative Experimental Setup&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Condition&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;HippoRAG&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Full system (OpenIE KG + PPR retrieval)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;BM25&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Sparse retrieval baseline&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Contriever / GTR / ColBERTv2&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Dense retrieval baselines&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Propositionizer&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Proposition-level retrieval baseline&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;RAPTOR&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Hierarchical summary retrieval baseline&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;IRCoT&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Iterative retrieval baseline&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;strong&gt;Benchmark datasets&lt;/strong&gt;: MuSiQue, 2WikiMultiHopQA, HotpotQA&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key findings:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Single-step retrieval performs comparably to or better than the iterative retrieval baseline IRCoT&lt;/li&gt;
  &lt;li&gt;Cost is 1/10 to 1/30 that of IRCoT, and retrieval is 6-13x faster&lt;/li&gt;
  &lt;li&gt;Integration into IRCoT yields further improvements&lt;/li&gt;
&lt;/ul&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;7-limitations-and-applicability&quot;&gt;7. Limitations and Applicability&lt;/h3&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Limitation&lt;/th&gt;
      &lt;th&gt;Specific Manifestation&lt;/th&gt;
      &lt;th&gt;Mitigation&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;NER design limitation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Cannot extract sufficient information from queries for retrieval, accounting for approximately half of all errors&lt;/td&gt;
      &lt;td&gt;Improve NER module design; consider richer query understanding mechanisms&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Entity-centric bias&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Strong bias toward concepts; many contextual signals go unused&lt;/td&gt;
      &lt;td&gt;Introduce context-aware retrieval mechanisms&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Missing contextual cues&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Ignoring contextual cues accounts for approximately 48% of errors&lt;/td&gt;
      &lt;td&gt;Incorporate more contextual information in indexing and retrieval&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;best-applicable-scenarios&quot;&gt;Best Applicable Scenarios&lt;/h4&gt;

&lt;ul&gt;
  &lt;li&gt;Multi-hop QA tasks requiring &lt;strong&gt;cross-passage knowledge integration&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;Scenarios sensitive to &lt;strong&gt;retrieval latency&lt;/strong&gt; (single-step PPR is 6-13x faster than iterative retrieval)&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Cost-constrained&lt;/strong&gt; environments (10-30x cheaper than iterative retrieval)&lt;/li&gt;
  &lt;li&gt;Scenarios where query information can primarily be expressed through &lt;strong&gt;entity relations&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id=&quot;unsuitable-scenarios&quot;&gt;Unsuitable Scenarios&lt;/h4&gt;

&lt;ul&gt;
  &lt;li&gt;Scenarios where key information in queries &lt;strong&gt;cannot be extracted via NER&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;Queries requiring &lt;strong&gt;extensive contextual cues&lt;/strong&gt; rather than entity relations&lt;/li&gt;
  &lt;li&gt;Scenarios with extremely high demands on &lt;strong&gt;fine-grained semantic differences between concepts&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;8-quick-reference&quot;&gt;8. Quick Reference&lt;/h3&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;What You Want to Know&lt;/th&gt;
      &lt;th&gt;Relevant Section&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;What is the complete pipeline?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#0-execution-overview&quot;&gt;0. Execution Overview&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;High-level design comparison?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#1-high-level-design-indexing--retrieval--generation&quot;&gt;1. High-level Design&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;How is the index constructed?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#2-offline-construction-indexing-detailed-execution&quot;&gt;2. Offline Construction&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;How are queries answered?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#3-online-query-retrieval-detailed-execution&quot;&gt;3. Online Query&lt;/a&gt; + &lt;a href=&quot;#4-online-generation-generation-detailed-execution&quot;&gt;4. Online Generation&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Why is it better than iterative retrieval?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#5-key-design-decisions&quot;&gt;5. Key Design Decisions&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;How does it perform?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#6-evaluation&quot;&gt;6. Evaluation&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;When should it NOT be used?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#7-limitations-and-applicability&quot;&gt;7. Limitations and Applicability&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h2 id=&quot;hipporag2&quot;&gt;HippoRAG2&lt;/h2&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;One-sentence positioning&lt;/strong&gt;: An improved version of HippoRAG that introduces passage nodes into the knowledge graph to achieve dense-sparse fusion of concepts and context, combining Query-to-Triple retrieval with LLM filtering to comprehensively outperform standard RAG on factual memory, sense-making, and associative memory tasks.&lt;/p&gt;

  &lt;p&gt;&lt;strong&gt;Key innovation&lt;/strong&gt;: Introduces passage nodes on top of PPR to achieve dense-sparse fusion of concepts and context, and through the Query-to-Triple + LLM filtering retrieval strategy, addresses the performance degradation of knowledge graph RAG on factual memory tasks.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;0-execution-overview-1&quot;&gt;0. Execution Overview&lt;/h3&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;table class=&quot;rouge-table&quot;&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td class=&quot;rouge-gutter gl&quot;&gt;&lt;pre class=&quot;lineno&quot;&gt;1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
&lt;/pre&gt;&lt;/td&gt;&lt;td class=&quot;rouge-code&quot;&gt;&lt;pre&gt;Offline Phase (one-time construction)
  ├─ ① Entity extraction
  ├─ ② Extract KG triples (OpenIE)
  ├─ ③ Synonym edge completion (cosine similarity &amp;gt; τ)
  └─ ④ Dense-Sparse fusion (core design)
         - Sparse layer: phrase nodes
         - Dense layer: complete vector representation of each passage abstracted as passage node
         - Bridging: passage nodes connected to corresponding entities via contains edges
         ↓
Online Phase (per query)
  ├─ ⑤ Query embedding
  ├─ ⑥ Vector recall
  │      - top-k passages (embedding similarity)
  │      - top-k KG triples (Query to Triple)
  ├─ ⑦ LLM filters triples (produces T&apos; ⊆ T)
  ├─ ⑧ Seed node selection
  │      - If filtered set is empty → directly return top-k passages
  │      - Otherwise
  │        · Phrase nodes: identified from T&apos;, top-k selected by average ranking score, probabilities assigned by normalized ranking scores
  │        · Passage nodes: all recalled passages serve as seeds, probability = embedding similarity × weight factor
  ├─ ⑨ Merge &amp;amp; global normalization
  ├─ ⑩ Execute Personalized PageRank
  └─ ⑪ Sort and output retrieval results
         ↓
  ⑫ Retrieved passages input to LLM for final answer generation
&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;1-high-level-design-indexing--retrieval--generation-1&quot;&gt;1. High-level Design (Indexing → Retrieval → Generation)&lt;/h3&gt;

&lt;h4 id=&quot;11-indexing-1&quot;&gt;1.1 Indexing&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Dimension&lt;/th&gt;
      &lt;th&gt;Approach&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Chunking strategy&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Processed by passage&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Index structure&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Graph index (open KG + passage nodes + synonym edges + contains edges)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Knowledge representation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Entity-relation graph + passage nodes + dense-sparse fusion&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Construction cost&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Medium (LLM extraction + encoder similarity computation + passage vector encoding)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Core characteristic&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Dense-Sparse fusion — phrase nodes as sparse encoding, passage nodes as dense encoding, contains edges bridge the two&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;12-retrieval-1&quot;&gt;1.2 Retrieval&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Dimension&lt;/th&gt;
      &lt;th&gt;Approach&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Retrieval method&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Hybrid (graph traversal PPR + vector recall of triples/passages)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Retrieval granularity&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Passage&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Iteration strategy&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Single retrieval (PPR achieves multi-hop effect in a single step)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Query processing&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Query to Triple (embedding recall of top-k triples + LLM filtering)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Core characteristic&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Query-to-Triple incorporates richer contextual information from KG; phrase + passage nodes jointly serve as PPR seeds&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;13-generation-1&quot;&gt;1.3 Generation&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Dimension&lt;/th&gt;
      &lt;th&gt;Approach&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Context injection&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Retrieved and ranked passages as context input to LLM&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Citation tracing&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Based on passage-node associations&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Quality control&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Notes do not explicitly document special mechanisms in the generation phase&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Core characteristic&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Positioned as a retrieval framework; the generation phase relies on standard LLM generation&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;2-offline-construction-indexing-detailed-execution-1&quot;&gt;2. Offline Construction: Indexing (Detailed Execution)&lt;/h3&gt;

&lt;h4 id=&quot;step-21-entity-extraction&quot;&gt;Step 2.1 Entity Extraction&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Raw passages&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Extract named entities&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Named entity set&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;step-22-extract-kg-triples&quot;&gt;Step 2.2 Extract KG Triples&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Passages + named entities&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Extract noun phrase nodes and relation edges via OpenIE&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Nodes and edges of schemaless open KG&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;step-23-synonym-edge-completion&quot;&gt;Step 2.3 Synonym Edge Completion&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Node set + retrieval encoder&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Add synonym edges between entities with cosine similarity above threshold τ&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Extended edge set&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
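
&lt;p&gt;As a rough illustration of the thresholding rule in Step 2.3, the sketch below links every pair of nodes whose embedding cosine similarity exceeds τ. The function names, the toy 2-d vectors, and τ = 0.8 are assumptions made for this example; the notes do not state the encoder or threshold actually used, and a real index would likely use nearest-neighbor search instead of comparing all pairs.&lt;/p&gt;

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def synonym_edges(embeddings, tau):
    """Return (node_a, node_b) pairs whose cosine similarity exceeds tau."""
    return [
        (a, b)
        for a, b in combinations(sorted(embeddings), 2)
        if cosine(embeddings[a], embeddings[b]) > tau
    ]


# Toy 2-d embeddings and threshold (hypothetical values, for illustration only).
toy = {"NYC": [1.0, 0.1], "New York City": [0.9, 0.2], "Paris": [0.0, 1.0]}
edges = synonym_edges(toy, tau=0.8)
print(edges)
```

&lt;p&gt;Only the two New York variants end up connected here; the added edge lets graph search treat them as near-equivalent entry points.&lt;/p&gt;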

&lt;h4 id=&quot;step-24-dense-sparse-fusion-core-design&quot;&gt;Step 2.4 Dense-Sparse Fusion (Core Design)&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;KG nodes + passages&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Represent each passage as a passage node that carries the passage’s full vector representation; connect each passage node to the entities it mentions via contains edges&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Key decision&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Inspired by the brain’s dense-sparse integration: phrase nodes act as a sparse encoding of extracted concepts, and passage nodes act as a dense encoding of the context from which those concepts originate&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Enhanced KG (containing phrase nodes, passage nodes, contains edges)&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;Why this approach?&lt;/strong&gt; Concepts are concise and generalizable but lose information; context is semantically rich but adds complexity. Dense-sparse fusion retains the advantages of both, resolving the concept-context trade-off.&lt;/p&gt;
&lt;/blockquote&gt;
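
&lt;p&gt;One possible way to picture the enhanced KG is an adjacency map that holds both node types. The sketch below is an assumption-laden toy: the tuple-based node ids, the function name, and the sample triple are invented for illustration, and passage embeddings are left out for brevity.&lt;/p&gt;

```python
from collections import defaultdict


def build_enhanced_kg(passages, triples_by_passage):
    """Build an adjacency map with phrase nodes, passage nodes, and contains edges.

    passages: dict of passage id to text.
    triples_by_passage: dict of passage id to (subject, relation, object) tuples.
    Node ids are ("phrase", text) or ("passage", id); this layout is hypothetical.
    """
    graph = defaultdict(set)
    for pid in passages:
        passage_node = ("passage", pid)
        graph.setdefault(passage_node, set())  # keep passages without triples
        for subj, rel, obj in triples_by_passage.get(pid, []):
            s, o = ("phrase", subj), ("phrase", obj)
            # Relation edge between the two phrase nodes (stored in both directions).
            graph[s].add((rel, o))
            graph[o].add((rel, s))
            # Contains edges link the passage node to each phrase it mentions.
            for phrase in (s, o):
                graph[passage_node].add(("contains", phrase))
                graph[phrase].add(("contains", passage_node))
    return graph


passages = {"p1": "Marie Curie was born in Warsaw."}
triples = {"p1": [("Marie Curie", "born in", "Warsaw")]}
kg = build_enhanced_kg(passages, triples)
```

&lt;p&gt;Because the passage node sits in the same graph as the phrase nodes, a graph walk that starts from a concept can reach the surrounding context in one hop.&lt;/p&gt;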

&lt;hr /&gt;

&lt;h3 id=&quot;3-online-query-retrieval-detailed-execution-1&quot;&gt;3. Online Query: Retrieval (Detailed Execution)&lt;/h3&gt;

&lt;h4 id=&quot;31-retrieval-mode-overview&quot;&gt;3.1 Retrieval Mode Overview&lt;/h4&gt;

&lt;p&gt;HippoRAG 2 supports three query mapping methods, with Query to Triple as the default:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Method&lt;/th&gt;
      &lt;th&gt;Mechanism&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;NER to Node&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Extract query entities → embedding matches KG nodes&lt;/td&gt;
      &lt;td&gt;HippoRAG’s original method&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Query to Node&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Entire query embedding directly matches KG nodes&lt;/td&gt;
      &lt;td&gt;Does not extract individual entities&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Query to Triple&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Entire query embedding matches triples in the graph&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;Default&lt;/strong&gt;, incorporates richer contextual information&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;32-retrieval-procedure&quot;&gt;3.2 Retrieval Procedure&lt;/h4&gt;

&lt;h5 id=&quot;step-31-query-embedding&quot;&gt;Step 3.1 Query Embedding&lt;/h5&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Operation&lt;/strong&gt;: Encode the query as a vector representation&lt;/li&gt;
&lt;/ul&gt;

&lt;h5 id=&quot;step-32-vector-recall&quot;&gt;Step 3.2 Vector Recall&lt;/h5&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;top-k passages&lt;/strong&gt;: Recalled via embedding similarity&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;top-k triples&lt;/strong&gt;: Query to Triple, recall triples in the graph via embedding similarity&lt;/li&gt;
&lt;/ul&gt;

&lt;h5 id=&quot;step-33-llm-filters-triples&quot;&gt;Step 3.3 LLM Filters Triples&lt;/h5&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Operation&lt;/strong&gt;: Use LLM to filter retrieved triples T, producing T’ ⊆ T&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Purpose&lt;/strong&gt;: Improve retrieval quality, remove irrelevant triples&lt;/li&gt;
&lt;/ul&gt;

&lt;h5 id=&quot;step-34-seed-node-selection&quot;&gt;Step 3.4 Seed Node Selection&lt;/h5&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;If filtered set is empty&lt;/strong&gt;: Directly return embedding-recalled top-k passages&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Otherwise&lt;/strong&gt;:
    &lt;ul&gt;
      &lt;li&gt;&lt;strong&gt;Phrase nodes&lt;/strong&gt;: Identify phrase nodes from filtered triples T’, select top-k by average ranking score, assign probabilities based on normalized ranking scores&lt;/li&gt;
      &lt;li&gt;&lt;strong&gt;Passage nodes&lt;/strong&gt;: All recalled passage nodes also serve as seed nodes, with initial probability being embedding similarity multiplied by a weight factor (§6.2), balancing the influence of phrase nodes and passage nodes&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h5 id=&quot;step-35-merge-and-global-normalization&quot;&gt;Step 3.5 Merge and Global Normalization&lt;/h5&gt;

&lt;ul&gt;
  &lt;li&gt;Merge probability distributions of phrase nodes and passage nodes&lt;/li&gt;
  &lt;li&gt;Global normalization&lt;/li&gt;
&lt;/ul&gt;
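
&lt;p&gt;Steps 3.4 and 3.5 can be sketched as a single function, under stated assumptions: phrase scores are normalized among themselves, passage similarities are scaled by a weight factor, and the merged values are normalized to sum to 1. The function name and all numeric values (including the weight 0.05) are hypothetical, and at least one seed is assumed to have a positive score.&lt;/p&gt;

```python
def seed_distribution(phrase_scores, passage_sims, passage_weight):
    """Merge phrase and passage seed scores into one probability distribution."""
    seeds = {}
    phrase_total = sum(phrase_scores.values())
    for node, score in phrase_scores.items():
        # Phrase seeds: ranking scores normalized among phrase nodes.
        seeds[("phrase", node)] = score / phrase_total
    for pid, sim in passage_sims.items():
        # Passage seeds: embedding similarity scaled by the weight factor.
        seeds[("passage", pid)] = sim * passage_weight
    total = sum(seeds.values())
    # Global normalization so the merged seed vector sums to 1.
    return {node: value / total for node, value in seeds.items()}


# Illustrative inputs only; none of these numbers come from the notes.
dist = seed_distribution(
    {"Warsaw": 3.0, "Marie Curie": 1.0},
    {"p1": 0.8, "p2": 0.2},
    passage_weight=0.05,
)
```

&lt;p&gt;A small weight keeps the passage seeds from drowning out the phrase seeds, which is the balancing role the notes assign to the weight factor.&lt;/p&gt;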

&lt;h5 id=&quot;step-36-personalized-pagerank&quot;&gt;Step 3.6 Personalized PageRank&lt;/h5&gt;

&lt;ul&gt;
  &lt;li&gt;Execute PPR using the merged seed node probability distribution as the personalized vector&lt;/li&gt;
&lt;/ul&gt;
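
&lt;p&gt;For readers unfamiliar with PPR, here is a minimal power-iteration sketch over an unweighted, undirected adjacency map. It is a generic textbook formulation, not the implementation used by HippoRAG 2; the damping value 0.5, the iteration count, and the three-node graph are assumptions, and every seed is assumed to be a node of the graph.&lt;/p&gt;

```python
def personalized_pagerank(adjacency, seeds, damping=0.5, iterations=50):
    """Power-iteration PPR.

    adjacency: dict of node to list of neighbor nodes.
    seeds: dict of node to restart probability (assumed to sum to 1).
    """
    nodes = list(adjacency)
    rank = {n: seeds.get(n, 0.0) for n in nodes}
    for _ in range(iterations):
        # Restart term: jump back to the seeds with probability 1 - damping.
        nxt = {n: (1.0 - damping) * seeds.get(n, 0.0) for n in nodes}
        for n in nodes:
            neighbors = adjacency[n]
            if not neighbors:
                # Dangling node: return its mass to the seed distribution.
                for s, p in seeds.items():
                    nxt[s] += damping * rank[n] * p
                continue
            share = damping * rank[n] / len(neighbors)
            for m in neighbors:
                nxt[m] += share
        rank = nxt
    return rank


# Toy chain a - b - c with all restart mass on "a".
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
scores = personalized_pagerank(adj, {"a": 1.0})
```

&lt;p&gt;Probability mass stays concentrated near the seed and decays with graph distance, which is why the choice of seed nodes in Step 3.4 matters so much.&lt;/p&gt;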

&lt;h5 id=&quot;step-37-sort-and-output&quot;&gt;Step 3.7 Sort and Output&lt;/h5&gt;

&lt;ul&gt;
  &lt;li&gt;Multiply the node probabilities output by PPR by the co-occurrence matrix to obtain the final passage ranking scores, then sort passages by score&lt;/li&gt;
&lt;/ul&gt;
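
&lt;p&gt;The final scoring step amounts to a matrix-vector product. The sketch below assumes the co-occurrence matrix is stored as a per-passage dictionary of node counts; that storage layout, the function names, and the sample numbers are illustrative assumptions.&lt;/p&gt;

```python
def passage_scores(node_probs, cooccurrence):
    """Score each passage as the sum of node probability times occurrence count."""
    return {
        pid: sum(node_probs.get(node, 0.0) * count for node, count in counts.items())
        for pid, counts in cooccurrence.items()
    }


def rank_passages(node_probs, cooccurrence):
    """Return passage ids sorted from highest to lowest score."""
    scores = passage_scores(node_probs, cooccurrence)
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical PPR output and node counts per passage.
probs = {"Warsaw": 0.5, "Marie Curie": 0.25, "Paris": 0.25}
cooc = {"p1": {"Warsaw": 1, "Marie Curie": 2}, "p2": {"Paris": 1}}
ranking = rank_passages(probs, cooc)
```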

&lt;hr /&gt;

&lt;h3 id=&quot;4-online-generation-generation-detailed-execution-1&quot;&gt;4. Online Generation: Generation (Detailed Execution)&lt;/h3&gt;

&lt;p&gt;HippoRAG 2 is positioned as a retrieval framework; the notes do not explicitly document special design in the generation phase. Retrieved and ranked passages serve as context input to the downstream LLM for answer generation.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;User query
    │
    ▼
┌─────────────────┐
│ Query embedding │
└────────┬────────┘
         │
    ┌────┴────┐
    ▼         ▼
┌────────┐ ┌─────────────┐
│ top-k  │ │ top-k       │
│passages│ │ triples     │
└───┬────┘ └──────┬──────┘
    │             │
    │             ▼
    │      ┌─────────────┐
    │      │ LLM filters │
    │      └──────┬──────┘
    │             │
    │      ┌──────┴──────┐
    │      ▼             ▼
    │  ┌────────┐   ┌──────────┐
    │  │ T&apos; is  │   │ T&apos; is    │
    │  │ empty  │   │ non-empty│
    │  └───┬────┘   └────┬─────┘
    │      │             │
    ▼      ▼             ▼
┌─────────────────────────────────┐
│ Directly return passages        │   ← T&apos; is empty
│ or                              │
│ Phrase nodes + Passage nodes    │   ← T&apos; is non-empty
│ → merge &amp;amp; normalize → PPR       │
└─────────────────────────────────┘
         │
         ▼
┌─────────────────┐
│ Passage ranking │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ LLM generates   │
│ answer          │
└─────────────────┘
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;5-key-design-decisions-1&quot;&gt;5. Key Design Decisions&lt;/h3&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Decision Point&lt;/th&gt;
      &lt;th&gt;HippoRAG 2’s Choice&lt;/th&gt;
      &lt;th&gt;Alternative&lt;/th&gt;
      &lt;th&gt;Rationale&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Concept-context fusion&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Dense-Sparse fusion (passage nodes + contains edges)&lt;/td&gt;
      &lt;td&gt;Pure concepts (phrase nodes) / pure context&lt;/td&gt;
      &lt;td&gt;Resolves concept-context trade-off, retaining conceptual conciseness while incorporating contextual richness&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Retrieval strategy&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Query to Triple + LLM filtering&lt;/td&gt;
      &lt;td&gt;NER to Node / Query to Node&lt;/td&gt;
      &lt;td&gt;Incorporates richer contextual information from KG&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Seed nodes&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Phrase nodes + Passage nodes jointly as PPR seeds&lt;/td&gt;
      &lt;td&gt;Phrase nodes only (HippoRAG)&lt;/td&gt;
      &lt;td&gt;Broader activation improves multi-hop reasoning capability&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Passage node probability&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Embedding similarity × weight factor&lt;/td&gt;
      &lt;td&gt;Ranking scores / uniform distribution&lt;/td&gt;
      &lt;td&gt;Balances the influence between phrase nodes and passage nodes&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;6-evaluation-1&quot;&gt;6. Evaluation&lt;/h3&gt;

&lt;h4 id=&quot;61-evaluation-metrics-1&quot;&gt;6.1 Evaluation Metrics&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Metric&lt;/th&gt;
      &lt;th&gt;Meaning&lt;/th&gt;
      &lt;th&gt;System vs. Baseline&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;recall@5&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Retrieval recall&lt;/td&gt;
      &lt;td&gt;Superior to standard RAG and HippoRAG&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;F1&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;F1 score&lt;/td&gt;
      &lt;td&gt;Comprehensively superior to standard RAG on factual, sense-making, and associative memory tasks&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;62-comparative-experimental-setup-1&quot;&gt;6.2 Comparative Experimental Setup&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Condition&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;HippoRAG 2&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Full system (Dense-Sparse fusion + Query to Triple + LLM filtering)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;BM25&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Sparse retrieval baseline&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Contriever / GTR&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Dense retrieval baseline&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;RAPTOR&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Hierarchical summary retrieval baseline&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;GraphRAG / LightRAG&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Graph RAG baselines&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;HippoRAG&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Predecessor method baseline&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;strong&gt;Datasets:&lt;/strong&gt;&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Task Type&lt;/th&gt;
      &lt;th&gt;Datasets&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Simple QA&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;NaturalQuestions, PopQA&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Multi-hop QA&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;MuSiQue, 2WikiMultihopQA, HotpotQA, LV-Eval&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Discourse Understanding&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;NarrativeQA&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;strong&gt;Key findings:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Comprehensively superior to standard RAG on factual memory, sense-making, and associative memory tasks&lt;/li&gt;
  &lt;li&gt;7% improvement over SOTA embedding models on associative memory tasks&lt;/li&gt;
  &lt;li&gt;Resolves HippoRAG’s performance degradation on factual memory tasks&lt;/li&gt;
&lt;/ul&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;7-limitations-and-applicability-1&quot;&gt;7. Limitations and Applicability&lt;/h3&gt;

&lt;p&gt;The notes do not explicitly document specific limitations. From a design perspective, fusion mitigates the concept-context trade-off, but balancing the dense and sparse weights still relies on hyperparameter tuning.&lt;/p&gt;

&lt;h4 id=&quot;best-applicable-scenarios-1&quot;&gt;Best Applicable Scenarios&lt;/h4&gt;

&lt;ul&gt;
  &lt;li&gt;RAG scenarios requiring simultaneous consideration of &lt;strong&gt;factual memory&lt;/strong&gt; and &lt;strong&gt;associative reasoning&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;Environments requiring &lt;strong&gt;multi-hop QA&lt;/strong&gt; without bearing the high cost of iterative retrieval&lt;/li&gt;
  &lt;li&gt;Scenarios where query information can be expressed through both &lt;strong&gt;entity relations&lt;/strong&gt; and &lt;strong&gt;surrounding context&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id=&quot;unsuitable-scenarios-1&quot;&gt;Unsuitable Scenarios&lt;/h4&gt;

&lt;ul&gt;
  &lt;li&gt;Notes do not explicitly document unsuitable scenarios&lt;/li&gt;
&lt;/ul&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;8-quick-reference-1&quot;&gt;8. Quick Reference&lt;/h3&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;What You Want to Know&lt;/th&gt;
      &lt;th&gt;See Which Section&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;What is the complete pipeline?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#0-execution-overview&quot;&gt;0. Execution Overview&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;High-level design comparison?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#1-high-level-design-indexing--retrieval--generation&quot;&gt;1. High-level Design&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;How is the index constructed?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#2-offline-construction-indexing-detailed-execution&quot;&gt;2. Offline Construction&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;How does the retrieval strategy work?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#3-online-query-retrieval-detailed-execution-1&quot;&gt;3. Online Query&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;How does it differ from HippoRAG?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#5-key-design-decisions-1&quot;&gt;5. Key Design Decisions&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;How does it perform?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#6-evaluation-1&quot;&gt;6. Evaluation&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

</description>
        <pubDate>Mon, 25 May 2026 00:00:00 +0000</pubDate>
        <link>https://www.caozihang.com//2026/05/25/week/</link>
        <guid isPermaLink="true">https://www.caozihang.com//2026/05/25/week/</guid>
        
        <category>日常</category>
        
        
      </item>
    
      <item>
        <title>【阅】本周阅读摘选2026-05-11 → 2026-05-17</title>
        <description>&lt;center style=&quot;margin-bottom: 20px; margin-top: 50px&quot;&gt;&lt;font color=&quot;#3879B1&quot; style=&quot;line-height: 1.4;font-weight: 700;font-size: 36px;box-sizing: border-box; &quot;&gt;本周阅读摘选&lt;/font&gt;&lt;/center&gt;

&lt;center style=&quot; margin-bottom: 30px;&quot;&gt;2026-05-11 → 2026-05-17&lt;/center&gt;

&lt;font style=&quot;font-weight: bold;&quot;&gt;目录&lt;/font&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#学术相关&quot; id=&quot;markdown-toc-学术相关&quot;&gt;学术相关&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#lightrag&quot; id=&quot;markdown-toc-lightrag&quot;&gt;LightRAG&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#0-execution-overview&quot; id=&quot;markdown-toc-0-execution-overview&quot;&gt;0. Execution Overview&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#1-high-level-design-indexing--retrieval--generation&quot; id=&quot;markdown-toc-1-high-level-design-indexing--retrieval--generation&quot;&gt;1. High-level Design (Indexing → Retrieval → Generation)&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#2-offline-construction-indexing-detailed-execution&quot; id=&quot;markdown-toc-2-offline-construction-indexing-detailed-execution&quot;&gt;2. Offline Construction: Indexing (Detailed Execution)&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#3-online-query-retrieval-detailed-execution&quot; id=&quot;markdown-toc-3-online-query-retrieval-detailed-execution&quot;&gt;3. Online Query: Retrieval (Detailed Execution)&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#4-online-generation-generation-detailed-execution&quot; id=&quot;markdown-toc-4-online-generation-generation-detailed-execution&quot;&gt;4. Online Generation: Generation (Detailed Execution)&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#5-key-design-decisions&quot; id=&quot;markdown-toc-5-key-design-decisions&quot;&gt;5. Key Design Decisions&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#6-evaluation&quot; id=&quot;markdown-toc-6-evaluation&quot;&gt;6. Evaluation&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#7-limitations-and-applicability&quot; id=&quot;markdown-toc-7-limitations-and-applicability&quot;&gt;7. Limitations and Applicability&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#8-quick-reference&quot; id=&quot;markdown-toc-8-quick-reference&quot;&gt;8. Quick Reference&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#graphrag&quot; id=&quot;markdown-toc-graphrag&quot;&gt;GraphRAG&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#0-execution-overview-1&quot; id=&quot;markdown-toc-0-execution-overview-1&quot;&gt;0. Execution Overview&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#1-offline-construction-knowledge-graph-indexing&quot; id=&quot;markdown-toc-1-offline-construction-knowledge-graph-indexing&quot;&gt;1. Offline Construction: Knowledge Graph Indexing&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#2-online-query-four-search-modes&quot; id=&quot;markdown-toc-2-online-query-four-search-modes&quot;&gt;2. Online Query: Four Search Modes&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#3-online-generation-map-reduce-answer-aggregation-global-search-example&quot; id=&quot;markdown-toc-3-online-generation-map-reduce-answer-aggregation-global-search-example&quot;&gt;3. Online Generation: Map-Reduce Answer Aggregation (Global Search Example)&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#4-key-design-decisions&quot; id=&quot;markdown-toc-4-key-design-decisions&quot;&gt;4. Key Design Decisions&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#5-evaluation&quot; id=&quot;markdown-toc-5-evaluation&quot;&gt;5. Evaluation&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#6-limitations-and-applicability&quot; id=&quot;markdown-toc-6-limitations-and-applicability&quot;&gt;6. Limitations and Applicability&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#7-quick-reference&quot; id=&quot;markdown-toc-7-quick-reference&quot;&gt;7. Quick Reference&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#相关项目&quot; id=&quot;markdown-toc-相关项目&quot;&gt;相关项目&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;学术相关&quot;&gt;学术相关&lt;/h1&gt;

&lt;h2 id=&quot;lightrag&quot;&gt;LightRAG&lt;/h2&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;One-sentence positioning&lt;/strong&gt;: Integrates graph structure into the text indexing and retrieval process, achieving comprehensive answers to complex queries while maintaining retrieval efficiency through a dual-layer retrieval system (low-level entity retrieval + high-level relation retrieval) with graph-vector hybrid representation.&lt;/p&gt;

  &lt;p&gt;&lt;strong&gt;Key innovation&lt;/strong&gt;: Replaces traditional chunk traversal and pure embedding matching with key-value structures, combining dual-layer query decomposition with one-hop neighbor expansion to support both concrete entity local retrieval and abstract concept global association.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;0-execution-overview&quot;&gt;0. Execution Overview&lt;/h3&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Offline Phase (one-time construction)
  ├─ ① Text chunking (chunk size = 1200)
  ├─ ② Extract entities and relations (LLM-based)
  ├─ ③ Generate key-value pairs for each entity/relation
  │      - K: words or phrases convenient for retrieval
  │      - V: relevant segments from the original document
  ├─ ④ Deduplication (reduce graph scale)
  └─ ⑤ Build graph structure (nodes = entities, edges = relations)
         ↓
Online Phase (per query)
  ├─ ⑥ Query keyword extraction (intent identification: Low-Level vs High-Level)
  ├─ ⑦ Keyword matching (vector similarity retrieval)
  │      - Low-Level → match entity Nodes
  │      - High-Level → match relation Edges
  ├─ ⑧ Incorporate high-order relatedness (supplement one-hop neighbors)
  └─ ⑨ Concatenate context and generate answer
         ↓
Incremental Update (new data added)
  └─ ⑩ Execute ①~③ on new documents, union with original graph, no full rebuild needed
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;1-high-level-design-indexing--retrieval--generation&quot;&gt;1. High-level Design (Indexing → Retrieval → Generation)&lt;/h3&gt;

&lt;h4 id=&quot;11-indexing&quot;&gt;1.1 Indexing&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Dimension&lt;/th&gt;
      &lt;th&gt;Approach&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Chunking strategy&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Fixed token size chunking (default 1200)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Index structure&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Graph index + vector index (hybrid)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Knowledge representation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Entity-relation graph + key-value pairs (K=retrieval keywords, V=original document segments)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Construction cost&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Medium (LLM extracts entities/relations + generates K-V pairs, but lighter than GraphRAG’s recursive community summarization)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Core characteristic&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Graph-based text index with incremental updates requiring no full database reprocessing&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;12-retrieval&quot;&gt;1.2 Retrieval&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Dimension&lt;/th&gt;
      &lt;th&gt;Approach&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Retrieval method&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Hybrid (vector similarity + graph traversal expansion)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Retrieval granularity&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Entity / Relation / One-hop neighbor subgraph&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Iteration strategy&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Single retrieval + neighbor expansion (not iterative multi-hop)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Query processing&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Query decomposed into Low-Level (concrete/local) and High-Level (abstract/global) keywords&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Core characteristic&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Dual-layer retrieval system: matches entity nodes for concrete queries, matches relation edges for abstract queries&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;13-generation&quot;&gt;1.3 Generation&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Dimension&lt;/th&gt;
      &lt;th&gt;Approach&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Context injection&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Direct concatenation of retrieved entities, relation descriptions, and neighbor information&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Citation tracing&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Based on V in key-value pairs (original document segments)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Quality control&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Standard LLM generation (no special post-processing)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Core characteristic&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Graph structure provides cross-chunk multi-hop associations, solving vector RAG’s context fragmentation problem&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;2-offline-construction-indexing-detailed-execution&quot;&gt;2. Offline Construction: Indexing (Detailed Execution)&lt;/h3&gt;

&lt;h4 id=&quot;step-21-text-chunking&quot;&gt;Step 2.1 Text Chunking&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Raw document collection&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Segment text by fixed token size (default chunk size = 1200)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Key decision&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Chunk size fixed at 1200, consistent with GraphRAG and other baselines for fair comparison&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Text Chunks&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
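
&lt;p&gt;Fixed-size chunking is simple enough to sketch in a few lines. The version below is a simplification under stated assumptions: whitespace splitting stands in for a real tokenizer, chunks do not overlap, and the function name is invented for this example.&lt;/p&gt;

```python
def chunk_tokens(tokens, chunk_size=1200):
    """Split a token list into consecutive chunks of at most chunk_size tokens."""
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


# Whitespace splitting is a stand-in for a real tokenizer (assumption).
chunks = chunk_tokens("a b c d e".split(), chunk_size=2)
print(chunks)
```

&lt;p&gt;The last chunk may be shorter than the others; with the default size, a 2500-token document would yield chunks of 1200, 1200, and 100 tokens.&lt;/p&gt;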

&lt;h4 id=&quot;step-22-entity-and-relation-extraction&quot;&gt;Step 2.2 Entity and Relation Extraction&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Text Chunks&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;LLM extracts entities and relations from each chunk&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Key decision&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Uses LLM (default GPT-4o-mini) for extraction; gleaning parameter fixed at 1&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Entity set + relation set&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;step-23-key-value-pair-generation-core-design&quot;&gt;Step 2.3 Key-Value Pair Generation (Core Design)&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Extracted entities and relations&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;For each entity and relation, LLM generates (K, V) pairs&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Key decision&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;K&lt;/strong&gt; uses retrieval-friendly words or phrases (not full descriptions); &lt;strong&gt;V&lt;/strong&gt; uses relevant segments from the original document&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Entity nodes and relation edges with K-V attributes&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;Why this design?&lt;/strong&gt; Traditional RAG relies on embedding matching or chunk traversal — the former has low precision, the latter has low efficiency. The key-value structure separates retrieval keywords from original evidence, optimizing retrieval speed while preserving provenance capability.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h4 id=&quot;step-24-deduplication&quot;&gt;Step 2.4 Deduplication&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Entities/relations extracted from each chunk&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Merge duplicate entities and relations, reducing graph scale&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Deduplicated knowledge graph&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
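
&lt;p&gt;The notes do not say how duplicates are detected. As one plausible reading, the sketch below merges entity records whose names match after trimming and lowercasing, and keeps every distinct source segment; the matching rule, the function name, and the sample records are all assumptions.&lt;/p&gt;

```python
def deduplicate(entities):
    """Merge entity records that share a normalized name.

    entities: list of (name, source_segment) tuples gathered from all chunks.
    Returns a dict of normalized name to the list of distinct source segments.
    """
    merged = {}
    for name, segment in entities:
        key = name.strip().lower()  # hypothetical normalization rule
        bucket = merged.setdefault(key, [])
        if segment not in bucket:
            bucket.append(segment)
    return merged


records = [
    ("LightRAG", "s1"),
    ("lightrag ", "s2"),
    ("GraphRAG", "s3"),
    ("LightRAG", "s1"),
]
merged = deduplicate(records)
```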

&lt;h4 id=&quot;step-25-incremental-update&quot;&gt;Step 2.5 Incremental Update&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;New document D’&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Execute the same indexing steps on new documents, generate new graph data, then take the union of node sets and edge sets with the original graph&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Key decision&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;No full rebuild required; union-based merging; stands in stark contrast to GraphRAG’s community structure decomposition and rebuild&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Updated knowledge graph&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;Comparison with GraphRAG&lt;/strong&gt;: When new data is added, GraphRAG must decompose existing community structures, incorporate new entities/relations, and then fully rebuild; LightRAG’s union-based incremental update significantly reduces maintenance cost in dynamic environments.&lt;/p&gt;
&lt;/blockquote&gt;
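
&lt;p&gt;The union-based update can be sketched in a few lines of Python. This is a minimal illustration, not LightRAG’s actual code: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;KnowledgeGraph&lt;/code&gt; class, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;union_update&lt;/code&gt;, and the sample entities are all hypothetical, and descriptions are simply accumulated per key rather than re-summarized.&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeGraph:
    """Toy graph: nodes and edges map a key to a list of descriptions."""
    nodes: dict = field(default_factory=dict)   # entity name -> [descriptions]
    edges: dict = field(default_factory=dict)   # (source, target) -> [descriptions]

    def add_entity(self, name, description):
        bucket = self.nodes.setdefault(name, [])
        if description not in bucket:           # skip identical descriptions
            bucket.append(description)

    def add_relation(self, source, target, description):
        self.nodes.setdefault(source, [])       # make sure both endpoints exist
        self.nodes.setdefault(target, [])
        bucket = self.edges.setdefault((source, target), [])
        if description not in bucket:
            bucket.append(description)


def union_update(base, delta):
    """Merge delta into base by taking the union of nodes and edges (no rebuild)."""
    for name, descriptions in delta.nodes.items():
        base.nodes.setdefault(name, [])
        for description in descriptions:
            base.add_entity(name, description)
    for (source, target), descriptions in delta.edges.items():
        for description in descriptions:
            base.add_relation(source, target, description)
    return base


original = KnowledgeGraph()
original.add_entity("LightRAG", "graph-based RAG system")
original.add_relation("LightRAG", "GraphRAG", "compared against")

new_doc_graph = KnowledgeGraph()   # stands for the graph extracted from the new document
new_doc_graph.add_entity("LightRAG", "supports incremental updates")
new_doc_graph.add_entity("UltraDomain", "evaluation benchmark")
new_doc_graph.add_relation("LightRAG", "UltraDomain", "evaluated on")

merged = union_update(original, new_doc_graph)
print(sorted(merged.nodes))
print(len(merged.edges))
```

&lt;p&gt;Because the merge only adds to existing sets, its cost scales with the size of the new graph data rather than the whole graph. The sketch also makes the trade-off visible: conflicting or stale descriptions are never removed.&lt;/p&gt;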

&lt;hr /&gt;

&lt;h3 id=&quot;3-online-query-retrieval-detailed-execution&quot;&gt;3. Online Query: Retrieval (Detailed Execution)&lt;/h3&gt;

&lt;h4 id=&quot;31-retrieval-mode-overview&quot;&gt;3.1 Retrieval Mode Overview&lt;/h4&gt;

&lt;p&gt;LightRAG does not distinguish multiple search modes; instead, it adaptively splits &lt;strong&gt;the same retrieval procedure&lt;/strong&gt; into two layers based on query intent:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Layer&lt;/th&gt;
      &lt;th&gt;Applicable Queries&lt;/th&gt;
      &lt;th&gt;Matching Target&lt;/th&gt;
      &lt;th&gt;Core Mechanism&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Low-Level (Local)&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Concrete entity queries (“What is the attribute of X?”)&lt;/td&gt;
      &lt;td&gt;Entity Nodes&lt;/td&gt;
      &lt;td&gt;Vector similarity matches entity nodes → expand neighbors&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;High-Level (Global)&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Abstract concept queries (“What are the overall trends of X?”)&lt;/td&gt;
      &lt;td&gt;Relation Edges&lt;/td&gt;
      &lt;td&gt;Vector similarity matches relation edges → expand neighbors&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;32-retrieval-procedure&quot;&gt;3.2 Retrieval Procedure&lt;/h4&gt;

&lt;h5 id=&quot;step-31-query-keyword-extraction&quot;&gt;Step 3.1 Query Keyword Extraction&lt;/h5&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Intent identification&lt;/strong&gt;: Decompose query content into Low-Level (Local) and High-Level (Global) keyword layers&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Key requirement&lt;/strong&gt;: Extracted keywords must keep the query’s original wording rather than paraphrase it&lt;/li&gt;
&lt;/ul&gt;

&lt;h5 id=&quot;step-32-keyword-matching&quot;&gt;Step 3.2 Keyword Matching&lt;/h5&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Low-Level&lt;/strong&gt;: Match entity nodes to query keywords via vector similarity retrieval&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;High-Level&lt;/strong&gt;: Match relation edges to query keywords via vector similarity retrieval&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Implementation&lt;/strong&gt;: Uses nano vector database for vector data management and access&lt;/li&gt;
&lt;/ul&gt;

&lt;h5 id=&quot;step-33-incorporating-high-order-relatedness&quot;&gt;Step 3.3 Incorporating High-Order Relatedness&lt;/h5&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Operation&lt;/strong&gt;: Supplement retrieved nodes/edges with their one-hop neighbors&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Purpose&lt;/strong&gt;: Complete context, capture implicit associations, improve answer coherence&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Characteristic&lt;/strong&gt;: Compared to pure vector retrieval, graph-neighbor expansion provides cross-chunk association capability&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;Why is dual-layer retrieval effective?&lt;/strong&gt; Traditional RAG only performs local retrieval and cannot answer global questions like “overall trends.” LightRAG maps queries to both entity and relation layers simultaneously, accommodating both concrete facts and abstract associations.&lt;/p&gt;
&lt;/blockquote&gt;
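
&lt;p&gt;Steps 3.2 and 3.3 can be sketched as follows. This is a toy illustration under stated assumptions, not LightRAG’s implementation: the 2-D embeddings, the in-memory dictionaries standing in for the vector store, and the function names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;top_k&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dual_level_retrieve&lt;/code&gt; are all made up for the example.&lt;/p&gt;

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, index, k):
    """Return the k keys of index whose vectors are most similar to query_vec."""
    ranked = sorted(index, key=lambda key: cosine(query_vec, index[key]), reverse=True)
    return ranked[:k]


# Hypothetical toy embeddings (2-D for readability).
entity_vecs = {"battery": [1.0, 0.0], "solar panel": [0.8, 0.6], "policy": [0.0, 1.0]}
relation_vecs = {
    ("battery", "solar panel"): [0.9, 0.3],   # "stores energy from"
    ("policy", "solar panel"): [0.1, 0.9],    # "subsidizes"
}


def dual_level_retrieve(low_vec, high_vec, k=1):
    # Low-level: keyword vector matched against entity nodes.
    entities = set(top_k(low_vec, entity_vecs, k))
    # High-level: keyword vector matched against relation edges.
    relations = set(top_k(high_vec, relation_vecs, k))
    # One-hop expansion: edges touching a matched entity ...
    for source, target in relation_vecs:
        if source in entities or target in entities:
            relations.add((source, target))
    # ... and the endpoints of every collected edge.
    for source, target in list(relations):
        entities.update((source, target))
    return entities, relations


entities, relations = dual_level_retrieve(low_vec=[1.0, 0.1], high_vec=[0.0, 1.0])
print(sorted(entities))
print(sorted(relations))
```

&lt;p&gt;In this toy graph the low-level keyword matches only the battery node and the high-level keyword matches only the subsidy edge, yet after one-hop expansion the context covers all three entities and both relations.&lt;/p&gt;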

&lt;hr /&gt;

&lt;h3 id=&quot;4-online-generation-generation-detailed-execution&quot;&gt;4. Online Generation: Generation (Detailed Execution)&lt;/h3&gt;

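&lt;p&gt;LightRAG’s generation phase is standard: the retrieved entity descriptions, relation descriptions, and original segments are assembled into a context and passed to the LLM. The full online flow, from query to answer:&lt;/p&gt;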

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;User query
    │
    ▼
┌─────────────────┐
│ Query           │  ← Extract Low-Level + High-Level keywords
│ Decomposition   │
│ (Keyword        │
│  Extraction)    │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Vector          │  ← Retrieve matching entity nodes and relation edges
│ Retrieval       │
│ (Keyword        │
│  Matching)      │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Neighbor        │  ← Supplement one-hop neighbors, build subgraph context
│ Expansion       │
│ (High-Order     │
│  Relatedness)   │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Context         │  ← Combine entity descriptions, relation descriptions, original segments
│ Assembly        │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ LLM generates   │
│ answer          │
│ (Generation)    │
└─────────────────┘
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;5-key-design-decisions&quot;&gt;5. Key Design Decisions&lt;/h3&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Decision Point&lt;/th&gt;
      &lt;th&gt;LightRAG’s Choice&lt;/th&gt;
      &lt;th&gt;Alternative&lt;/th&gt;
      &lt;th&gt;Rationale&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Index structure&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Graph + key-value pairs&lt;/td&gt;
      &lt;td&gt;Pure vector index / text chunk index / community summaries&lt;/td&gt;
      &lt;td&gt;K-V balances retrieval efficiency and provenance capability, outperforming pure embedding matching and chunk traversal&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Dual-layer retrieval&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Low-Level (entity) + High-Level (relation)&lt;/td&gt;
      &lt;td&gt;Single-layer unified retrieval&lt;/td&gt;
      &lt;td&gt;Simultaneously covers concrete fact queries and abstract association queries&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Neighbor expansion&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;One-hop neighbors only&lt;/td&gt;
      &lt;td&gt;Multi-hop traversal / no expansion&lt;/td&gt;
      &lt;td&gt;One-hop neighbors balance quality and efficiency, avoiding over-expansion that introduces noise&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Incremental update&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Union merging (node set + edge set)&lt;/td&gt;
      &lt;td&gt;Full rebuild (e.g., GraphRAG)&lt;/td&gt;
      &lt;td&gt;No need to decompose existing structure, significantly reducing update cost in dynamic environments&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Deduplication strategy&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Simple deduplication to reduce graph scale&lt;/td&gt;
      &lt;td&gt;Semantic similarity merging / entity alignment&lt;/td&gt;
      &lt;td&gt;Keeps indexing simple and cheap; lightweight deduplication is enough to shrink the graph&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;6-evaluation&quot;&gt;6. Evaluation&lt;/h3&gt;

&lt;h4 id=&quot;61-evaluation-metrics&quot;&gt;6.1 Evaluation Metrics&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;LLM-as-a-Judge&lt;/strong&gt; four-dimensional scoring:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Metric&lt;/th&gt;
      &lt;th&gt;Meaning&lt;/th&gt;
      &lt;th&gt;LightRAG vs Baseline&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Comprehensiveness&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Topical coverage of answers&lt;/td&gt;
      &lt;td&gt;Superior to chunk-based methods, comparable to or better than GraphRAG&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Diversity&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Diversity of viewpoints/angles in answers&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;Significant advantage&lt;/strong&gt; (against all baselines)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Empowerment&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Degree to which answers help users&lt;/td&gt;
      &lt;td&gt;Superior to chunk-based methods&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Overall&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Composite score&lt;/td&gt;
      &lt;td&gt;Superior to GraphRAG and all chunk-based baselines&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;62-comparative-experimental-setup&quot;&gt;6.2 Comparative Experimental Setup&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Condition&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;LightRAG&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Full system (graph + vector + dual-layer retrieval)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;GraphRAG&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Graph RAG baseline (community summaries + Map-Reduce)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;NaiveRAG&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Traditional vector RAG baseline&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;HyDE&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Hypothetical document embedding baseline&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;RQ-RAG&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Query rewriting RAG baseline&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;strong&gt;Dataset&lt;/strong&gt;: UltraDomain benchmark (4 subsets, derived from 428 university textbooks, covering 18 domains including Agriculture, Mix, etc.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key findings&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;On large datasets and complex linguistic contexts, LightRAG consistently outperforms GraphRAG&lt;/li&gt;
  &lt;li&gt;Graph-based RAG systems (LightRAG, GraphRAG) consistently outperform pure chunk-based methods&lt;/li&gt;
  &lt;li&gt;LightRAG demonstrates significant advantages on the Diversity metric&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id=&quot;63-ablation-study&quot;&gt;6.3 Ablation Study&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Variant&lt;/th&gt;
      &lt;th&gt;Modification&lt;/th&gt;
      &lt;th&gt;Result&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;-Low-Level&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Remove low-level retrieval module&lt;/td&gt;
      &lt;td&gt;Performance drops&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;-High-Level&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Remove high-level retrieval module&lt;/td&gt;
      &lt;td&gt;Performance drops&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;-Origin&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Do not use original text in retrieval, rely only on graph structure&lt;/td&gt;
      &lt;td&gt;No significant drop on most datasets; some even improve (Agriculture, Mix)&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;-Origin finding&lt;/strong&gt;: On most datasets the semantic graph alone already supplies the context a query needs; adding the original text may even introduce irrelevant noise.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;7-limitations-and-applicability&quot;&gt;7. Limitations and Applicability&lt;/h3&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Limitation&lt;/th&gt;
      &lt;th&gt;Specific Manifestation&lt;/th&gt;
      &lt;th&gt;Mitigation&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Extraction quality depends on LLM&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Entity and relation extraction quality directly determines graph quality, affecting retrieval performance&lt;/td&gt;
      &lt;td&gt;Use stronger LLMs; increase gleaning iterations&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Incremental update is a simple merge&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Union-based update does not handle entity conflicts, relation contradictions, or stale data&lt;/td&gt;
      &lt;td&gt;Periodic full rebuilds; introduce conflict detection mechanisms&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Fixed neighbor expansion depth&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Expansion covers only one-hop neighbors, which may be insufficient for complex queries requiring multi-hop reasoning&lt;/td&gt;
      &lt;td&gt;Configurable expansion depth; combine with iterative retrieval&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Limited evaluation scope&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Mainly validated on UltraDomain benchmark; domain coverage is broad but not comprehensive&lt;/td&gt;
      &lt;td&gt;Validate on more real-world scenarios and vertical domains&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;best-applicable-scenarios&quot;&gt;Best Applicable Scenarios&lt;/h4&gt;

&lt;ul&gt;
  &lt;li&gt;General-purpose RAG scenarios requiring &lt;strong&gt;both concrete facts and abstract associations&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;Environments with &lt;strong&gt;frequent dynamic data updates&lt;/strong&gt; (incremental update advantage)&lt;/li&gt;
  &lt;li&gt;Requirements for &lt;strong&gt;retrieval efficiency&lt;/strong&gt; where GraphRAG-style full rebuild cost is unacceptable&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Local + global hybrid queries&lt;/strong&gt; on medium-scale document collections&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id=&quot;unsuitable-scenarios&quot;&gt;Unsuitable Scenarios&lt;/h4&gt;

&lt;ul&gt;
  &lt;li&gt;Complex queries requiring &lt;strong&gt;deep multi-hop reasoning&lt;/strong&gt; (&amp;gt;1 hop), since neighbor expansion covers only one hop&lt;/li&gt;
  &lt;li&gt;Scenarios with extremely high demands on &lt;strong&gt;entity alignment precision&lt;/strong&gt; (simple deduplication may be insufficient)&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Pure global semantic understanding&lt;/strong&gt; on very large document collections (GraphRAG’s community summaries are more advantageous here)&lt;/li&gt;
&lt;/ul&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;8-quick-reference&quot;&gt;8. Quick Reference&lt;/h3&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;What You Want to Know&lt;/th&gt;
      &lt;th&gt;Where to Look&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;What is the complete pipeline?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#0-execution-overview&quot;&gt;0. Execution Overview&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;How is the index constructed?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#2-offline-construction-indexing-detailed-execution&quot;&gt;2. Offline Construction&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;How does dual-layer retrieval work?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#3-online-query-retrieval-detailed-execution&quot;&gt;3. Online Query&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Incremental update mechanism?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#step-25-incremental-update&quot;&gt;2.5 Incremental Update&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Difference from GraphRAG?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#5-key-design-decisions&quot;&gt;5. Key Design Decisions&lt;/a&gt; + &lt;a href=&quot;#step-25-incremental-update&quot;&gt;2.5 Incremental Update&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;How does it perform?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#6-evaluation&quot;&gt;6. Evaluation&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;When should it NOT be used?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#7-limitations-and-applicability&quot;&gt;7. Limitations and Applicability&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h2 id=&quot;graphrag&quot;&gt;GraphRAG&lt;/h2&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;One-sentence positioning&lt;/strong&gt;: Uses LLMs to construct a hierarchical knowledge graph from text corpora, then aggregates community summaries via Map-Reduce to answer global questions like “what does the entire dataset discuss.”&lt;/p&gt;

  &lt;p&gt;&lt;strong&gt;Key innovation&lt;/strong&gt;: Transforms traditional RAG’s “retrieve then generate” into “pre-compute community summaries, then aggregate on demand,” solving the inability of vector RAG to perform global semantic understanding.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;0-execution-overview-1&quot;&gt;0. Execution Overview&lt;/h3&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Offline Phase (one-time construction)
  ├─ ① Text chunking
  ├─ ② Entity and relation extraction
  ├─ ③ Knowledge graph construction
  ├─ ④ Graph community detection (hierarchical)
  └─ ⑤ Community summary generation (bottom-up recursive aggregation)
         ↓
Online Phase (per query)
  ├─ ⑥ Load all community summaries at the target level
  ├─ ⑦ Map: Generate community answers in parallel + score
  └─ ⑧ Reduce: Sort by score and merge into global answer
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;1-offline-construction-knowledge-graph-indexing&quot;&gt;1. Offline Construction: Knowledge Graph Indexing&lt;/h3&gt;

&lt;h4 id=&quot;step-11-text-chunking&quot;&gt;Step 1.1 Text Chunking&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Raw document collection&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Split text into fixed token-size chunks&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Key decision&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Larger chunks → fewer LLM calls (cost savings), but recall of early information within chunks decreases&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Text Chunks&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;step-12-entity-and-relation-extraction&quot;&gt;Step 1.2 Entity and Relation Extraction&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Text Chunks&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;LLM extracts entities, relations, and claims from each chunk&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Entities &amp;amp; Relationships&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;strong&gt;Self-Reflection (Gleaning loop)&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;① First extract entities → ② Feed extraction results back to LLM → ③ Force yes/no judgment on “anything missed” → ④ If missed, prompt “MANY entities were missed in the last extraction” to complete&lt;/p&gt;
&lt;/blockquote&gt;
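
&lt;p&gt;The loop above can be sketched as follows. This is an illustration of the control flow only, not GraphRAG’s source: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;extract_with_gleaning&lt;/code&gt;, the prompt strings, and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ScriptedLLM&lt;/code&gt; fake are hypothetical, and the model is assumed to return a comma-separated entity list.&lt;/p&gt;

```python
def extract_with_gleaning(chunk, llm, max_gleanings=1):
    """Extract entities, then ask the model whether anything was missed.

    llm is any callable(prompt: str) -> str; it is a stand-in, not a real API.
    """
    entities = llm("Extract entities from: " + chunk).split(", ")
    for _ in range(max_gleanings):
        verdict = llm(
            "Found so far: " + ", ".join(entities)
            + ". Were any entities missed? Answer Y or N."
        )
        if verdict.strip().upper() != "Y":     # anything but Y ends the loop
            break
        more = llm(
            "MANY entities were missed in the last extraction. "
            "Add them for: " + chunk
        )
        for candidate in more.split(", "):     # keep only entities not seen yet
            if candidate not in entities:
                entities.append(candidate)
    return entities


class ScriptedLLM:
    """Deterministic fake that replays canned replies in order."""

    def __init__(self, replies):
        self.replies = list(replies)

    def __call__(self, prompt):
        return self.replies.pop(0)


fake = ScriptedLLM(["Alice, Acme", "Y", "Bob, Acme", "N"])
result = extract_with_gleaning("Alice and Bob work at Acme.", fake, max_gleanings=2)
print(result)
```

&lt;p&gt;Capping the loop with a maximum number of rounds keeps the extra LLM calls bounded even if the model keeps answering Y.&lt;/p&gt;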

&lt;p&gt;&lt;strong&gt;Implementation evolution:&lt;/strong&gt;&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Version&lt;/th&gt;
      &lt;th&gt;Implementation&lt;/th&gt;
      &lt;th&gt;Principle&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Before v2.2.0&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;logit_bias=100&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Adds a +100 bias to the logits of the “Y”/“N” tokens, so that effectively only Y or N can be sampled&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;After v2.2.0&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Pure Prompt&lt;/td&gt;
      &lt;td&gt;Uses instruction “Please answer with a single letter Y or N” (because o-series reasoning models do not support logit_bias)&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;strong&gt;Source code references:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Loop Prompt: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;prompts/index/extract_graph.py:129&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Gleaning loop: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;graph_extractor.py:115-121&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id=&quot;step-13-entity-alignment--knowledge-graph&quot;&gt;Step 1.3 Entity Alignment → Knowledge Graph&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Entities/relations extracted from each chunk&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Exact string matching for entity alignment (mentions with identical names are merged; variant expressions of the same entity stay as separate nodes)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Error tolerance&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Duplicate entities are typically clustered into the same community in subsequent steps, with limited impact on results&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Knowledge Graph (nodes = entities, edges = relations)&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;step-14-community-detection-hierarchical&quot;&gt;Step 1.4 Community Detection (Hierarchical)&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Knowledge Graph&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Leiden algorithm recursively detects communities: large graph → sub-communities → sub-sub-communities… until no further division (leaf communities)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Hierarchical Graph Communities (C0 = root-level communities, the coarsest partition; C1, C2, C3… increasingly fine-grained)&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;step-15-community-summary-generation-core-pre-computation&quot;&gt;Step 1.5 Community Summary Generation (Core Pre-computation)&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Hierarchical community structure&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Bottom-up recursive generation: leaf community summaries → use leaf summaries to generate parent summaries → layer by layer upward&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Summary content&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Comprehensive description of entities, relations, and claims within the community&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Community Summaries (one report-style summary per community)&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;Why this approach?&lt;/strong&gt; Traditional RAG retrieves relevant documents at query time; GraphRAG pre-computes “what each cluster of related entities is discussing” offline, making it directly available at query time.&lt;/p&gt;
&lt;/blockquote&gt;
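
&lt;p&gt;The bottom-up recursion in Step 1.5 can be sketched like this. It is a simplified illustration, not GraphRAG’s code: the community ids, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;summarize_bottom_up&lt;/code&gt; helper, and the string-joining stand-in for the LLM are all hypothetical.&lt;/p&gt;

```python
def summarize_bottom_up(root, children, members, summarize):
    """Return {community_id: summary}, building leaves first, then parents.

    children:  community_id -> list of sub-community ids (missing for leaves)
    members:   leaf community_id -> list of entity/relation descriptions
    summarize: callable(list of str) -> str, a stand-in for an LLM call
    """
    summaries = {}

    def visit(cid):
        subs = children.get(cid, [])
        if not subs:                       # leaf: summarize raw descriptions
            summaries[cid] = summarize(members[cid])
        else:                              # parent: summarize child summaries
            summaries[cid] = summarize([visit(sub) for sub in subs])
        return summaries[cid]

    visit(root)
    return summaries


# Hypothetical two-level hierarchy.
children = {"C0-0": ["C1-0", "C1-1"]}
members = {
    "C1-0": ["Alice founded Acme", "Acme builds batteries"],
    "C1-1": ["Bob regulates energy markets"],
}


def fake_llm(texts):
    return " | ".join(texts)               # concatenation instead of a real summary


reports = summarize_bottom_up("C0-0", children, members, fake_llm)
print(reports["C0-0"])
```

&lt;p&gt;Each parent sees only its children’s summaries, never the raw descriptions, which is what keeps higher-level prompts short as the hierarchy grows.&lt;/p&gt;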

&lt;hr /&gt;

&lt;h3 id=&quot;2-online-query-four-search-modes&quot;&gt;2. Online Query: Four Search Modes&lt;/h3&gt;

&lt;p&gt;Select mode based on query type:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Mode&lt;/th&gt;
      &lt;th&gt;Applicable Query&lt;/th&gt;
      &lt;th&gt;Context Source&lt;/th&gt;
      &lt;th&gt;Core Mechanism&lt;/th&gt;
      &lt;th&gt;Cost&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Global Search&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;“What are the main themes in the dataset?”&lt;/td&gt;
      &lt;td&gt;Community reports (full load)&lt;/td&gt;
      &lt;td&gt;Map-Reduce + LLM scoring and filtering&lt;/td&gt;
      &lt;td&gt;Medium&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Local Search&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;“What are the attributes of entity X?”&lt;/td&gt;
      &lt;td&gt;Entity neighborhood + graph structure&lt;/td&gt;
      &lt;td&gt;Vector retrieval of entities → graph traversal to expand neighbors → mixed context&lt;/td&gt;
      &lt;td&gt;Medium&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;DRIFT Search&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Complex multi-hop reasoning questions&lt;/td&gt;
      &lt;td&gt;Iterative graph exploration&lt;/td&gt;
      &lt;td&gt;HyDE → Action loop (LocalSearch) → Reduce&lt;/td&gt;
      &lt;td&gt;High&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Basic Search&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Simple QA&lt;/td&gt;
      &lt;td&gt;Raw text chunks&lt;/td&gt;
      &lt;td&gt;Pure vector retrieval (no graph structure utilization)&lt;/td&gt;
      &lt;td&gt;Low&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;21-global-search&quot;&gt;2.1 Global Search&lt;/h4&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Core idea&lt;/strong&gt;: Map-Reduce architecture, global summarization based on community reports&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Map phase&lt;/strong&gt;: Independently question each community report, generating local answers (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;global_search/search.py&lt;/code&gt;, using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MAP_SYSTEM_PROMPT&lt;/code&gt;)&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Reduce phase&lt;/strong&gt;: Aggregate all local answers to generate the final answer (using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;REDUCE_SYSTEM_PROMPT&lt;/code&gt;)&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Characteristic&lt;/strong&gt;: Does not rely on vector retrieval; uses the hierarchical community structure to answer top-down&lt;/li&gt;
&lt;/ul&gt;
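
&lt;p&gt;A minimal sketch of the Map-Reduce flow, assuming each map call returns an answer together with a helpfulness score. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;global_search&lt;/code&gt; function here, its &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;min_score&lt;/code&gt; cutoff, and the fake map/reduce callables are illustrative placeholders, not the functions in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;global_search/search.py&lt;/code&gt;.&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor


def global_search(query, reports, map_fn, reduce_fn, min_score=1):
    """Map each community report to (answer, score), then reduce the best ones.

    map_fn(query, report) -> (answer, score) and reduce_fn(query, answers) -> str
    are placeholders for LLM calls.
    """
    with ThreadPoolExecutor() as pool:                      # Map runs in parallel
        partials = list(pool.map(lambda r: map_fn(query, r), reports))
    useful = [p for p in partials if p[1] >= min_score]     # drop unhelpful answers
    useful.sort(key=lambda p: p[1], reverse=True)           # highest score first
    return reduce_fn(query, [answer for answer, _ in useful])


reports = ["solar subsidies grew", "battery prices fell", "unrelated sports news"]


def fake_map(query, report):
    # Score = number of query words that appear in the report.
    score = sum(word in report for word in query.split())
    return ("From report: " + report, score)


def fake_reduce(query, answers):
    return " / ".join(answers)


final = global_search("solar battery prices", reports, fake_map, fake_reduce)
print(final)
```

&lt;p&gt;Every report is consulted on every query, so the per-query cost grows with the number of community reports at the chosen level.&lt;/p&gt;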

&lt;h4 id=&quot;22-local-search&quot;&gt;2.2 Local Search&lt;/h4&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Core idea&lt;/strong&gt;: Build context based on entity-relation graph neighborhoods&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Procedure&lt;/strong&gt;:
    &lt;ol&gt;
      &lt;li&gt;Find entities most relevant to the query via vector retrieval (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mixed_context.py:map_query_to_entities&lt;/code&gt;)&lt;/li&gt;
      &lt;li&gt;Fetch these entities’ neighbor relations, descriptions, community reports, and text fragments from the graph&lt;/li&gt;
      &lt;li&gt;Concatenate into mixed context and pass to LLM for answering&lt;/li&gt;
    &lt;/ol&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Characteristic&lt;/strong&gt;: Leverages graph structure for local diffusion, adding one layer of structured information beyond pure vector retrieval&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id=&quot;23-drift-search-dynamic-reasoning-graph-search&quot;&gt;2.3 DRIFT Search (Dynamic Reasoning and Inference with Flexible Traversal)&lt;/h4&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Core idea&lt;/strong&gt;: Adds iterative reasoning and exploration on top of Local Search&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Essence&lt;/strong&gt;: Iteratively enhanced version of Local Search, similar to ReAct pattern, where the LLM autonomously decides the next exploration direction&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Introduced in&lt;/strong&gt;: v0.4.0 (2024-11-05)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Complete procedure (4 phases):&lt;/strong&gt;&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;User query
    │
    ▼
┌─────────────────────────────────────────────────────────────────────┐
│ Phase 1: Primer (Initialization)                                    │
│   HyDE hypothesis → Vector retrieval of community reports           │
│   LLM generates intermediate answer + score + follow-up query list  │
└──────────────┬──────────────────────────────────────────────────────┘
               │
               ▼
┌─────────────────────────────────────────────────────────────────────┐
│ Phase 2: Action exploration loop (n_depth rounds)                   │
│   Select top-k incomplete actions from follow-up queries            │
│   Each action uses LocalSearch to search graph neighborhood         │
│   Generate new follow-ups → add to graph                            │
└──────────────┬──────────────────────────────────────────────────────┘
               │
               ▼
┌─────────────────────────────────────────────────────────────────────┐
│ Phase 3: Serialization                                              │
│   Serialize entire action graph as JSON                             │
└──────────────┬──────────────────────────────────────────────────────┘
               │
               ▼
┌─────────────────────────────────────────────────────────────────────┐
│ Phase 4: Reduce (Aggregation)                                       │
│   Aggregate all intermediate answers into one final answer          │
└─────────────────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Phase 1: Primer — Initialize exploration (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;drift_search/primer.py&lt;/code&gt;)&lt;/strong&gt;&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Step&lt;/th&gt;
      &lt;th&gt;Operation&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;1a. HyDE query expansion&lt;/td&gt;
      &lt;td&gt;Randomly select a community report as template, have LLM generate hypothetical answer&lt;/td&gt;
      &lt;td&gt;HyDE (Hypothetical Document Embeddings) strategy&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;1b. Vector retrieval of community reports&lt;/td&gt;
      &lt;td&gt;Embed the hypothetical answer, rank community reports by cosine similarity, and take the top-k&lt;/td&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;drift_context.py:168-229&lt;/code&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;1c. LLM query decomposition&lt;/td&gt;
      &lt;td&gt;Use structured output to force a JSON response&lt;/td&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DRIFT_PRIMER_PROMPT&lt;/code&gt;&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;&lt;strong&gt;PrimerResponse structure:&lt;/strong&gt;&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Field&lt;/th&gt;
      &lt;th&gt;Meaning&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;intermediate_answer&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Intermediate answer based on community reports (2000 characters)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;score&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;0-100; how well the answer matches the query&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;follow_up_queries&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;At least 5 follow-up exploration queries&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
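
&lt;p&gt;To make the three fields concrete, here is a minimal Python sketch of a container for the primer output. The class name mirrors the heading above, but the validation rules (rejecting scores outside 0-100 and fewer than five follow-ups) and the constant &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MIN_FOLLOW_UPS&lt;/code&gt; are assumptions read off the table, not the library's actual implementation.&lt;/p&gt;

```python
from dataclasses import dataclass, field

MIN_FOLLOW_UPS = 5  # assumption: taken from "at least 5" in the table above


@dataclass
class PrimerResponse:
    """Illustrative container for the primer output (not the library's class)."""

    intermediate_answer: str
    score: int
    follow_up_queries: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # The score is described as a 0-100 match between answer and query.
        if self.score not in range(0, 101):
            raise ValueError(f"score must be in 0-100, got {self.score}")
        # Too few follow-ups would starve the exploration loop in Phase 2.
        if MIN_FOLLOW_UPS > len(self.follow_up_queries):
            raise ValueError("expected at least 5 follow-up queries")
```

&lt;p&gt;Validating at construction time means a malformed LLM response fails early, before any follow-up query is scheduled.&lt;/p&gt;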

&lt;p&gt;&lt;strong&gt;Phase 2: Action exploration loop — Core (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;drift_search/search.py:188-263&lt;/code&gt;)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Data structure&lt;/strong&gt;: NetworkX directed multigraph (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;QueryState.graph = nx.MultiDiGraph()&lt;/code&gt;)&lt;/p&gt;

    &lt;ul&gt;
      &lt;li&gt;Nodes = DriftAction (query + answer + score)&lt;/li&gt;
      &lt;li&gt;Edges = parent-child relations (which action generated the follow-up query)&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Loop logic&lt;/strong&gt;:&lt;/p&gt;

    &lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;table class=&quot;rouge-table&quot;&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td class=&quot;rouge-gutter gl&quot;&gt;&lt;pre class=&quot;lineno&quot;&gt;1
2
3
4
5
6
7
8
9
10
11
12
&lt;/pre&gt;&lt;/td&gt;&lt;td class=&quot;rouge-code&quot;&gt;&lt;pre&gt;&lt;span class=&quot;k&quot;&gt;while&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;epochs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n_depth&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# n_depth rounds
&lt;/span&gt;    &lt;span class=&quot;c1&quot;&gt;# 1. Find all incomplete actions, sort by score
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;actions&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;query_state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rank_incomplete_actions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# 2. Take top-k (drift_k_followups)
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;actions&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;actions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;drift_k_followups&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# 3. Concurrently execute search for each action (LocalSearch)
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;results&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;await&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;search_step&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;actions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# 4. Add new results to graph
&lt;/span&gt;    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;action&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;results&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;query_state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_action&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;action&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;query_state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_all_follow_ups&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;action&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;action&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;follow_ups&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;epochs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# 5. Advance to the next round&lt;/span&gt;
&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;    &lt;/div&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Single Action search&lt;/strong&gt;: Calls LocalSearch, returns &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;{&quot;response&quot;, &quot;score&quot;, &quot;follow_up_queries&quot;}&lt;/code&gt;&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;
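
&lt;p&gt;The loop and its action graph can be sketched in a self-contained way as follows. This is a simplified, synchronous stand-in rather than the library's code: plain dicts and an edge list replace the NetworkX graph, and the names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Action&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;explore&lt;/code&gt; and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;search&lt;/code&gt; callback are hypothetical.&lt;/p&gt;

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Action:
    """One node of the action graph: a query plus its (eventual) answer."""

    query: str
    answer: Optional[str] = None  # None marks the action as incomplete
    score: float = 0.0
    follow_ups: list[str] = field(default_factory=list)


def explore(
    seed_queries: list[str],
    search: Callable[[str], tuple[str, float, list[str]]],
    n_depth: int = 2,
    k: int = 2,
) -> tuple[dict[str, Action], list[tuple[str, str]]]:
    """Run n_depth rounds, answering at most k incomplete actions per round."""
    actions = {q: Action(q) for q in seed_queries}
    edges: list[tuple[str, str]] = []  # (parent query, follow-up query)

    for _ in range(n_depth):
        # 1. Collect incomplete actions and rank them by score (stable sort).
        incomplete = [a for a in actions.values() if a.answer is None]
        incomplete.sort(key=lambda a: a.score, reverse=True)
        # 2. Answer the top-k; 3. attach their follow-ups as new child nodes.
        for action in incomplete[:k]:
            action.answer, action.score, action.follow_ups = search(action.query)
            for follow_up in action.follow_ups:
                if follow_up not in actions:
                    actions[follow_up] = Action(follow_up)
                edges.append((action.query, follow_up))
    return actions, edges
```

&lt;p&gt;Here &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;n_depth&lt;/code&gt; bounds how deep the exploration goes and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;k&lt;/code&gt; bounds how many searches run per round, matching the depth and width roles of the parameters described in this section.&lt;/p&gt;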

&lt;p&gt;&lt;strong&gt;Key configuration parameters:&lt;/strong&gt;&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Parameter&lt;/th&gt;
      &lt;th&gt;Function&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;n_depth&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Number of exploration loop rounds (depth)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;drift_k_followups&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Number of actions executed in parallel per round (width)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;primer_folds&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Number of folds the community reports are split into for parallel processing in the Primer phase&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;local_search_*&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Various parameters for LocalSearch (inherited)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;reduce_*&lt;/code&gt;&lt;/td&gt;
      &lt;td&gt;Temperature and token limits for the Reduce phase&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;24-basic-search&quot;&gt;2.4 Basic Search&lt;/h4&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Core idea&lt;/strong&gt;: Pure vector retrieval, no graph structure utilization&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Procedure&lt;/strong&gt;: Directly performs vector similarity retrieval on raw text chunks (text units), using the most relevant text chunks as context&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Applicable scenario&lt;/strong&gt;: Simple QA, or scenarios where graph structure information is not important&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Characteristic&lt;/strong&gt;: The simplest form of RAG, provided as a baseline (code comment: “generic RAG algorithm (vector search on raw text chunks)”)&lt;/li&gt;
&lt;/ul&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;3-online-generation-map-reduce-answer-aggregation-global-search-example&quot;&gt;3. Online Generation: Map-Reduce Answer Aggregation (Global Search Example)&lt;/h3&gt;

&lt;h4 id=&quot;step-31-prepare-prepare-community-summaries&quot;&gt;Step 3.1 Prepare: Prepare Community Summaries&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;User query + all community summaries at the target level&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;① Take all community summaries at that level ② Randomly shuffle order ③ Chunk by preset token size&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Purpose&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Eliminate order bias + adapt to LLM context window&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Shuffled community summary chunks&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
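
&lt;p&gt;The prepare step can be illustrated with a short Python sketch. It is only an approximation under stated assumptions: tokens are counted by whitespace splitting instead of a real tokenizer, and the function name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;prepare_chunks&lt;/code&gt; and its parameters are hypothetical.&lt;/p&gt;

```python
import random
from typing import Optional


def prepare_chunks(
    summaries: list[str], max_tokens: int, seed: Optional[int] = None
) -> list[list[str]]:
    """Shuffle community summaries, then pack them greedily into chunks."""
    rng = random.Random(seed)
    shuffled = list(summaries)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)       # removes any bias from the original ordering

    chunks: list[list[str]] = []
    current: list[str] = []
    used = 0
    for summary in shuffled:
        cost = len(summary.split())  # assumption: whitespace token count
        # Start a new chunk when this summary would overflow the budget.
        if current and used + cost > max_tokens:
            chunks.append(current)
            current, used = [], 0
        current.append(summary)
        used += cost
    if current:
        chunks.append(current)
    return chunks
```

&lt;p&gt;A summary longer than the budget still gets a chunk of its own in this sketch; a production version would need to truncate or split it.&lt;/p&gt;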

&lt;h4 id=&quot;step-32-map-parallelly-generate-intermediate-answers&quot;&gt;Step 3.2 Map: Generate Intermediate Answers in Parallel&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Each community summary chunk + user query&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Independently call LLM for each chunk: “Based on these community summaries, answer the query”&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Key output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;① &lt;strong&gt;Partial answer&lt;/strong&gt; (Answer Fragment) ② &lt;strong&gt;Helpfulness score&lt;/strong&gt; (0-100, indicating how helpful this chunk is for answering the target question)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Parallelism&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;All chunks processed simultaneously&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;A set of (partial answer, score) pairs&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;step-33-reduce-merge-into-global-answer&quot;&gt;Step 3.3 Reduce: Merge into Global Answer&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;All (partial answer, score) pairs&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;① Sort by helpfulness score ② Concatenate partial answers from high to low ③ Call LLM to synthesize the final answer&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Global Answer&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
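
&lt;p&gt;As a rough illustration of the reduce step, the sketch below sorts the (partial answer, score) pairs and concatenates them from high to low until a token budget is exhausted. The budget, the whitespace token count, and the name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;build_reduce_context&lt;/code&gt; are assumptions for illustration; the final LLM call that synthesizes the answer is left out.&lt;/p&gt;

```python
def build_reduce_context(pairs: list[tuple[str, int]], max_tokens: int) -> str:
    """Rank partial answers by helpfulness and join the best ones."""
    # 1. Sort by helpfulness score, highest first.
    ranked = sorted(pairs, key=lambda pair: pair[1], reverse=True)

    parts: list[str] = []
    used = 0
    # 2. Concatenate from high to low until the budget would be exceeded.
    for answer, score in ranked:
        cost = len(answer.split())  # assumption: whitespace token count
        if used + cost > max_tokens:
            break
        parts.append(f"[score {score}] {answer}")
        used += cost
    # 3. The joined text would be passed to the LLM as context for the final answer.
    return "\n\n".join(parts)
```

&lt;p&gt;With the scores 85, 72 and 90 and room for only two answers, the context keeps the answers scored 90 and 85, in that order, and drops the one scored 72.&lt;/p&gt;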

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;table class=&quot;rouge-table&quot;&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td class=&quot;rouge-gutter gl&quot;&gt;&lt;pre class=&quot;lineno&quot;&gt;1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
&lt;/pre&gt;&lt;/td&gt;&lt;td class=&quot;rouge-code&quot;&gt;&lt;pre&gt;Query: &quot;What are the main themes in the dataset?&quot;
         │
         ▼
  ┌─────────────────┐
  │ Load C2 level    │  ← Select community level (C0/C1/C2/C3)
  │ all community    │     Toward C0 (top) = more macro = fewer summaries
  │ summaries        │     Toward C3 (bottom) = more detailed = more summaries
  └────────┬────────┘
           │
           ▼
  ┌─────────────────┐
  │ Shuffle + Chunk  │
  └────────┬────────┘
           │
     ┌─────┼─────┐
     ▼     ▼     ▼
  [Chunk1] [Chunk2] [Chunk3]  ← Map: Process in parallel
     │     │     │
     ▼     ▼     ▼
  (Answer1, 85) (Answer2, 72) (Answer3, 90)  ← With scores
     │     │     │
     └─────┴─────┘
           │
           ▼
  ┌─────────────────┐
  │ Sort by score    │  ← Reduce
  │ and merge        │
  │ Generate final   │
  │ answer           │
  └─────────────────┘
&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;4-key-design-decisions&quot;&gt;4. Key Design Decisions&lt;/h3&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Decision Point&lt;/th&gt;
      &lt;th&gt;GraphRAG’s Choice&lt;/th&gt;
      &lt;th&gt;Alternative&lt;/th&gt;
      &lt;th&gt;Rationale&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Index structure&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Graph index + community summaries&lt;/td&gt;
      &lt;td&gt;Pure vector index / text index&lt;/td&gt;
      &lt;td&gt;Global questions require understanding entity relations, which pure vector retrieval cannot capture&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Community detection&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Hierarchical Leiden&lt;/td&gt;
      &lt;td&gt;Single-level communities&lt;/td&gt;
      &lt;td&gt;Hierarchical supports multi-granularity queries (macro → micro)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Summary generation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Bottom-up recursive&lt;/td&gt;
      &lt;td&gt;Independent generation per community&lt;/td&gt;
      &lt;td&gt;Recursive generation ensures higher-level summaries include lower-level information, avoiding omissions&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Answer aggregation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Map-Reduce + scoring&lt;/td&gt;
      &lt;td&gt;Direct concatenation of all summaries&lt;/td&gt;
      &lt;td&gt;Scoring mechanism filters highly relevant information, controls context length&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Entity alignment&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Exact string matching&lt;/td&gt;
      &lt;td&gt;Semantic similarity matching&lt;/td&gt;
      &lt;td&gt;Simple and efficient; duplicate entities naturally cluster during community detection&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;5-evaluation&quot;&gt;5. Evaluation&lt;/h3&gt;

&lt;h4 id=&quot;51-llm-as-a-judge-no-gold-standard&quot;&gt;5.1 LLM-as-a-Judge (No Gold Standard)&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Dimension&lt;/th&gt;
      &lt;th&gt;Meaning&lt;/th&gt;
      &lt;th&gt;GraphRAG vs. Vector RAG&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Comprehensiveness&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Thematic coverage of the answer&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;Superior&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Diversity&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Diversity of viewpoints/angles in the answer&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;Superior&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Empowerment&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Degree of helpfulness to the user&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;Superior&lt;/strong&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Directness&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Whether the answer is concise and direct&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;Inferior&lt;/strong&gt; (trade-off)&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;52-factuality-verification&quot;&gt;5.2 Factuality Verification&lt;/h4&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Tool&lt;/strong&gt;: Claimify (LLM-based atomic fact decomposition)&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Method&lt;/strong&gt;: Decompose answer sentences into simple, self-contained factual claims, verify each one individually&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id=&quot;53-comparative-experimental-setup&quot;&gt;5.3 Comparative Experimental Setup&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Condition&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;GraphRAG C0/C1/C2/C3&lt;/td&gt;
      &lt;td&gt;Four different community levels (C0 = top-level, C3 = bottom-level)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;TS (Text Summarization)&lt;/td&gt;
      &lt;td&gt;Direct Map-Reduce summarization on source text (no graph structure)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;SS (Semantic Search)&lt;/td&gt;
      &lt;td&gt;Traditional vector RAG baseline&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;6-limitations-and-applicability&quot;&gt;6. Limitations and Applicability&lt;/h3&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Limitation&lt;/th&gt;
      &lt;th&gt;Specific Manifestation&lt;/th&gt;
      &lt;th&gt;Mitigation&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Hierarchical rigidity&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Enumerative classification; new categories require index reconstruction&lt;/td&gt;
      &lt;td&gt;Periodically rebuild index; or use more flexible dynamic community detection&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;High construction cost&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Multiple LLM calls (entity extraction × N + summary generation × M)&lt;/td&gt;
      &lt;td&gt;Batch processing; caching; select appropriate community level to balance quality/cost&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Global query focus&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Not optimal for “finding a specific piece of information”&lt;/td&gt;
      &lt;td&gt;Combine with Local Search or DRIFT Search&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Lower Directness&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Global summarization may be verbose&lt;/td&gt;
      &lt;td&gt;Adjust Map-Reduce scoring thresholds; post-processing refinement&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;best-applicable-scenarios-1&quot;&gt;Best Applicable Scenarios&lt;/h4&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Global semantic understanding&lt;/strong&gt; of million-token-scale document collections&lt;/li&gt;
  &lt;li&gt;Queries like “what are the main themes/trends/debates in the dataset” — questions with &lt;strong&gt;no clear retrieval target&lt;/strong&gt;&lt;/li&gt;
  &lt;li&gt;Scenarios requiring &lt;strong&gt;synthesis across multiple documents&lt;/strong&gt; to generate comprehensive answers (literature review, dataset exploration)&lt;/li&gt;
  &lt;li&gt;QA over &lt;strong&gt;private data&lt;/strong&gt; (does not rely on pre-trained knowledge)&lt;/li&gt;
&lt;/ul&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;7-quick-reference&quot;&gt;7. Quick Reference&lt;/h3&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;What You Want to Know&lt;/th&gt;
      &lt;th&gt;Relevant Section&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;What is the complete pipeline?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#0-execution-overview&quot;&gt;0. Execution Overview&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;How is the index constructed?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#1-offline-construction-knowledge-graph-indexing&quot;&gt;1. Offline Construction&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;How are queries answered?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#2-online-query-four-search-modes&quot;&gt;2. Four Search Modes&lt;/a&gt; + &lt;a href=&quot;#3-online-generation-map-reduce-answer-aggregation-global-search-example&quot;&gt;3. Map-Reduce Aggregation&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Why is it better than vector RAG?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#4-key-design-decisions&quot;&gt;4. Key Design Decisions&lt;/a&gt; + &lt;a href=&quot;#5-evaluation&quot;&gt;5. Evaluation&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;When should it NOT be used?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#6-limitations-and-applicability&quot;&gt;6. Limitations&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h3 id=&quot;相关项目&quot;&gt;Related Projects&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;LazyGraphRAG&lt;/li&gt;
  &lt;li&gt;nano GraphRAG&lt;/li&gt;
&lt;/ul&gt;

</description>
        <pubDate>Mon, 18 May 2026 00:00:00 +0000</pubDate>
        <link>https://www.caozihang.com//2026/05/18/week/</link>
        <guid isPermaLink="true">https://www.caozihang.com//2026/05/18/week/</guid>
        
        <category>日常</category>
        
        
      </item>
    
      <item>
        <title>[Reading] Weekly Reading Selections 2026-05-11 → 2026-05-17</title>
        <description>&lt;center style=&quot;margin-bottom: 20px; margin-top: 50px&quot;&gt;&lt;font color=&quot;#3879B1&quot; style=&quot;line-height: 1.4;font-weight: 700;font-size: 36px;box-sizing: border-box; &quot;&gt;Weekly Reading Selections&lt;/font&gt;&lt;/center&gt;

&lt;center style=&quot; margin-bottom: 30px;&quot;&gt;2026-05-11 → 2026-05-17&lt;/center&gt;

&lt;font style=&quot;font-weight: bold;&quot;&gt;Contents&lt;/font&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#学术相关&quot; id=&quot;markdown-toc-学术相关&quot;&gt;Academic&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#pasa&quot; id=&quot;markdown-toc-pasa&quot;&gt;PaSa&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#0-execution-overview&quot; id=&quot;markdown-toc-0-execution-overview&quot;&gt;0. Execution Overview&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#1-high-level-design-indexing--retrieval--generation&quot; id=&quot;markdown-toc-1-high-level-design-indexing--retrieval--generation&quot;&gt;1. High-level Design (Indexing → Retrieval → Generation)&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#2-offline-construction-training-detailed-execution&quot; id=&quot;markdown-toc-2-offline-construction-training-detailed-execution&quot;&gt;2. Offline Construction: Training (Detailed Execution)&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#3-online-query-retrieval-detailed-execution&quot; id=&quot;markdown-toc-3-online-query-retrieval-detailed-execution&quot;&gt;3. Online Query: Retrieval (Detailed Execution)&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#4-online-generation-generation-detailed-execution&quot; id=&quot;markdown-toc-4-online-generation-generation-detailed-execution&quot;&gt;4. Online Generation: Generation (Detailed Execution)&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#5-key-design-decisions&quot; id=&quot;markdown-toc-5-key-design-decisions&quot;&gt;5. Key Design Decisions&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#6-evaluation&quot; id=&quot;markdown-toc-6-evaluation&quot;&gt;6. Evaluation&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#7-limitations-and-applicability&quot; id=&quot;markdown-toc-7-limitations-and-applicability&quot;&gt;7. Limitations and Applicability&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#8-quick-reference&quot; id=&quot;markdown-toc-8-quick-reference&quot;&gt;8. Quick Reference&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;学术相关&quot;&gt;Academic&lt;/h1&gt;

&lt;h2 id=&quot;pasa&quot;&gt;PaSa&lt;/h2&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;One-sentence positioning&lt;/strong&gt;: An LLM-powered academic literature search agent that autonomously completes search tool invocations, paper reading, and citation expansion through a Crawler-Selector dual-component architecture, optimized via reinforcement learning for complex academic queries.&lt;/p&gt;

  &lt;p&gt;&lt;strong&gt;Key innovation&lt;/strong&gt;: Models academic search as an agentic decision process, where the Crawler maximizes recall in citation networks and the Selector precisely determines relevance; both components are jointly optimized via the AGILE RL framework, with dedicated synthetic dataset AutoScholarQuery and real-world benchmark RealScholarQuery constructed for training and evaluation.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;0-execution-overview&quot;&gt;0. Execution Overview&lt;/h3&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;table class=&quot;rouge-table&quot;&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td class=&quot;rouge-gutter gl&quot;&gt;&lt;pre class=&quot;lineno&quot;&gt;1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
&lt;/pre&gt;&lt;/td&gt;&lt;td class=&quot;rouge-code&quot;&gt;&lt;pre&gt;Offline Phase (one-time training)
  ├─ ① Dataset preparation
  │   ├─ AutoScholarQuery: synthetic academic query-paper pairs (35k), generated by GPT-4o from Related Work
  │   └─ RealScholarQuery: real-world academic query benchmark
  ├─ ② Selector training (SFT)
  │   ├─ Input: academic query + paper
  │   ├─ Output: decision token (True/False) + supporting rationale
  │   └─ Decision token placed before rationale, enabling it to serve as a single-token auxiliary reward model for the Crawler
  └─ ③ Crawler training (SFT → RL)
      ├─ SFT: generate trajectories session by session (Search session / Expand session)
      └─ RL (PPO): token-level MDP, each session independently optimized
         ├─ Action space: [Search] generate query and retrieve / [Expand] extract citations / [Stop] switch to next paper
         ├─ Reward function: α × Σ I(q, p_i, t) − c(a_t)
         └─ Auxiliary reward: Selector as reward model to mitigate sparse rewards

Online Phase (per query)
  ├─ ① Input academic query q
  ├─ ② Crawler iterative execution
  │   ├─ [Search] → generate search query → invoke retrieval tool → add results to paper queue
  │   ├─ [Expand] → extract sub-section citations from current paper → add to paper queue
  │   ├─ [Stop] → switch to next paper in queue
  │   └─ Navigate through citation network, continuously discovering more relevant papers
  ├─ ③ Selector per-paper judgment
  │   ├─ Input: query q + each paper in paper queue
  │   ├─ Output: True/False judgment + rationale
  │   └─ Decision token probability used for ranking search results
  └─ ④ Return final retrieval result set
&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
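
&lt;p&gt;The Crawler reward α × Σ I(q, p_i, t) − c(a_t) from the overview can be written out as a small Python function. This is a sketch only: the indicator I is passed in as a callback because the notes do not spell out its definition, and the function name and the numbers in the usage note are hypothetical.&lt;/p&gt;

```python
from typing import Callable


def step_reward(
    papers: list[str],
    indicator: Callable[[str], bool],
    alpha: float,
    action_cost: float,
) -> float:
    """Reward for one action: alpha * (number of hits) - cost of the action.

    papers      -- papers p_i returned by the action a_t
    indicator   -- stands in for I(q, p_i, t); True when a paper counts as a hit
    alpha       -- weight on each hit
    action_cost -- c(a_t), the cost charged for taking the action
    """
    hits = sum(1 for paper in papers if indicator(paper))
    return alpha * hits - action_cost
```

&lt;p&gt;For example, an action that returns three papers of which two count as hits, with α = 1.5 and a cost of 0.1, earns 1.5 × 2 − 0.1 = 2.9. An action that returns no hits is charged only its cost, which is what discourages the Crawler from issuing unproductive searches.&lt;/p&gt;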

&lt;hr /&gt;

&lt;h3 id=&quot;1-high-level-design-indexing--retrieval--generation&quot;&gt;1. High-level Design (Indexing → Retrieval → Generation)&lt;/h3&gt;

&lt;h4 id=&quot;11-indexing&quot;&gt;1.1 Indexing&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Dimension&lt;/th&gt;
      &lt;th&gt;Approach&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Chunking strategy&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;—&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Index structure&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;— (relies on external search engine; notes do not explicitly document offline index construction)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Knowledge representation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;—&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Construction cost&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;—&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Core characteristic&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Notes do not document index construction; offline phase focuses on model training and dataset construction&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;12-retrieval&quot;&gt;1.2 Retrieval&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Dimension&lt;/th&gt;
      &lt;th&gt;Approach&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Retrieval method&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Agentic dynamic search (Crawler autonomously decides Search / Expand / Stop)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Retrieval granularity&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Paper level (collects entire papers through citation networks)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Iteration strategy&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Multi-round iteration (Crawler continuously navigates citation network, expanding paper queue)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Query processing&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Model automatically generates search queries based on current context and query&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Core characteristic&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Crawler-Selector dual-component: Crawler maximizes recall, Selector maximizes precision; RL optimization teaches the agent optimal search strategies&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;13-generation&quot;&gt;1.3 Generation&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Dimension&lt;/th&gt;
      &lt;th&gt;Approach&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Context injection&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Selector matches retrieved papers with query, generating True/False judgments and rationales&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Citation tracing&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Selector outputs rationale supporting its judgment for each paper&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Quality control&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Dual-component division: Crawler recall + Selector filtering; Selector decision token probability used for result ranking&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Core characteristic&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Selector serves as single-token reward model, used both for online judgment and as auxiliary reward for Crawler RL training&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
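
&lt;p&gt;Ranking by decision-token probability can be sketched as follows. The sketch assumes the Selector exposes one logit for the True token and one for the False token per paper and turns them into a probability with a two-way softmax; that normalization, the function names, and the input format are assumptions, since the notes only state that the decision token probability is used for ranking.&lt;/p&gt;

```python
import math


def true_probability(logit_true: float, logit_false: float) -> float:
    """Two-way softmax over the True/False decision-token logits."""
    peak = max(logit_true, logit_false)  # subtract the max for numerical stability
    e_true = math.exp(logit_true - peak)
    e_false = math.exp(logit_false - peak)
    return e_true / (e_true + e_false)


def rank_papers(logits: dict[str, tuple[float, float]]) -> list[str]:
    """Order paper titles by the Selector's probability of answering True."""
    scored = {
        title: true_probability(l_true, l_false)
        for title, (l_true, l_false) in logits.items()
    }
    return sorted(scored, key=scored.get, reverse=True)
```

&lt;p&gt;Because the decision token comes before the rationale, a score like this needs only the first generated token, which is what makes the Selector cheap enough to double as a reward model.&lt;/p&gt;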

&lt;hr /&gt;

&lt;h3 id=&quot;2-offline-construction-training-detailed-execution&quot;&gt;2. Offline Construction: Training (Detailed Execution)&lt;/h3&gt;

&lt;h4 id=&quot;step-21-dataset-construction&quot;&gt;Step 2.1 Dataset Construction&lt;/h4&gt;

&lt;h5 id=&quot;autoscholarquery&quot;&gt;AutoScholarQuery&lt;/h5&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Related Work section of each paper&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Use GPT-4o to generate academic queries; answers correspond to references cited in Related Work&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Key decision&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Synthetic but high-quality dataset, curated for the AI domain&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;35k query-paper pairs&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Quality verification&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Sample 100 query-paper pairs to assess reasonableness and relevance&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h5 id=&quot;realscholarquery&quot;&gt;RealScholarQuery&lt;/h5&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Real-world academic queries&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Collect real academic queries to construct benchmark&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Real academic query benchmark for evaluating more realistic scenarios&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;step-22-selector-training&quot;&gt;Step 2.2 Selector Training&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Academic query and a paper&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Fine-tune model to generate two outputs: (1) decision token d (True/False); (2) supporting rationale r&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Key decision&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Decision token placed before rationale, enabling Selector to serve as single-token reward model for Crawler training; token probability can be used for ranking search results&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Training config&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;SFT, 1 epoch, learning rate 1e-5, batch size 4&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Trained Selector model&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;step-23-crawler-training-imitation-learning&quot;&gt;Step 2.3 Crawler Training: Imitation Learning&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;AutoScholarQuery training data subset&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Generate trajectories session by session for imitation learning&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Key decision&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Two session types: Search session (starting from state S&lt;sub&gt;q&lt;/sub&gt;) and Expand session (starting from state S&lt;sub&gt;{q+p}&lt;/sub&gt;)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Search Session&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;GPT-4o generates search query trajectories based on user query&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Expand Session&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Given the query, sample an initial paper from Google search results, then examine each of its sub-sections: a sub-section that cites at least one paper from the training answers for that query is always selected; any other sub-section is selected with 10% probability to increase trajectory diversity&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;SFT-initialized Crawler policy model&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
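
&lt;p&gt;The Expand-session sampling rule can be sketched as follows. This is a minimal illustration under assumptions, not the authors&apos; code: the function name &lt;code&gt;select_subsections&lt;/code&gt;, the dictionary layout, and the &lt;code&gt;rng&lt;/code&gt; parameter are hypothetical; only the always-keep rule and the 10% fallback probability come from the notes above.&lt;/p&gt;

```python
import random


def select_subsections(subsections, answer_ids, keep_prob=0.1, rng=None):
    '''Choose which sub-sections of a paper to expand in a synthetic trajectory.

    subsections: dict mapping sub-section name to the paper ids it cites.
    answer_ids: set of paper ids labelled as answers for the query.
    '''
    rng = rng or random.Random()
    selected = []
    for name, cited_ids in subsections.items():
        cites_answer = any(pid in answer_ids for pid in cited_ids)
        # Always keep sub-sections that cite an answer; otherwise keep with
        # probability keep_prob to diversify trajectories.
        if cites_answer or keep_prob > rng.random():
            selected.append(name)
    return selected
```

&lt;p&gt;Setting &lt;code&gt;keep_prob=0.0&lt;/code&gt; keeps only sub-sections that cite an answer paper, which makes the effect of the 10% diversity term easy to isolate.&lt;/p&gt;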

&lt;h4 id=&quot;step-24-crawler-training-reinforcement-learning&quot;&gt;Step 2.4 Crawler Training: Reinforcement Learning&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Query q, SFT-initialized policy model π&lt;sub&gt;θ&lt;/sub&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Model Crawler as token-level MDP, optimize with PPO&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;State&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Current LLM context + paper queue&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Action space&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;LLM vocabulary; each token represents an action; when action matches function name, execute corresponding function ([Search]/[Expand]/[Stop])&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Reward function&lt;/strong&gt;&lt;/td&gt;
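      &lt;td&gt;r(s&lt;sub&gt;t&lt;/sub&gt;, a&lt;sub&gt;t&lt;/sub&gt;) = α × Σ&lt;sub&gt;i=1&lt;/sub&gt;&lt;sup&gt;n&lt;sub&gt;t&lt;/sub&gt;&lt;/sup&gt; I(q, p&lt;sub&gt;i&lt;/sub&gt;, t) − c(a&lt;sub&gt;t&lt;/sub&gt;), where p&lt;sub&gt;1&lt;/sub&gt;, …, p&lt;sub&gt;n&lt;sub&gt;t&lt;/sub&gt;&lt;/sub&gt; are the papers returned by action a&lt;sub&gt;t&lt;/sub&gt;; I(q, p&lt;sub&gt;i&lt;/sub&gt;, t) = 1 if p&lt;sub&gt;i&lt;/sub&gt; matches query q and is not already in the queue, and 0 otherwise; α is the reward coefficient; and c(a&lt;sub&gt;t&lt;/sub&gt;) is the cost of action a&lt;sub&gt;t&lt;/sub&gt;&lt;/td&gt;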
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Auxiliary reward&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Selector acts as an auxiliary reward model, mitigating the sparse rewards that arise when only AutoScholarQuery matches are rewarded&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Key decision&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Independent PPO per session; session defined as sub-trajectory ending with [Stop]; Monte Carlo sampling estimates returns&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Total objective&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;L&lt;sub&gt;RL&lt;/sub&gt;(θ, φ) = L&lt;sub&gt;policy&lt;/sub&gt;(θ) + η · L&lt;sub&gt;value&lt;/sub&gt;(φ)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Trained Crawler policy model&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
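
&lt;p&gt;As a worked illustration of the reward function in the table, the sketch below computes the reward for a single action. It is a simplified reading, not the training code: the name &lt;code&gt;step_reward&lt;/code&gt;, the &lt;code&gt;is_match&lt;/code&gt; callback (which could combine dataset labels with Selector decisions), and the default values of &lt;code&gt;alpha&lt;/code&gt; and &lt;code&gt;action_cost&lt;/code&gt; are assumptions chosen for the example.&lt;/p&gt;

```python
def step_reward(returned_ids, queue_ids, is_match, alpha=1.0, action_cost=0.1):
    '''Reward for one Search or Expand action.

    returned_ids: paper ids produced by the action.
    queue_ids: ids already in the paper queue before the action.
    is_match: callable(paper_id) returning True if the paper satisfies the query.
    alpha, action_cost: placeholder values, not taken from the notes.
    '''
    seen = set(queue_ids)
    new_matches = 0
    for pid in returned_ids:
        # The indicator is 1 only for matching papers not yet in the queue.
        if pid not in seen and is_match(pid):
            new_matches += 1
        seen.add(pid)  # a duplicate within the same action is counted once
    return alpha * new_matches - action_cost
```

&lt;p&gt;An action that returns nothing new still pays &lt;code&gt;action_cost&lt;/code&gt;, so its reward is negative; this is what discourages redundant Search or Expand calls.&lt;/p&gt;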

&lt;h4 id=&quot;step-25-model-assembly&quot;&gt;Step 2.5 Model Assembly&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Trained Selector and Crawler&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Train the two components sequentially (Selector first, since it supplies the auxiliary reward for Crawler RL); both are based on Qwen2.5-7B&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Final agent PaSa-7B&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;3-online-query-retrieval-detailed-execution&quot;&gt;3. Online Query: Retrieval (Detailed Execution)&lt;/h3&gt;

&lt;h4 id=&quot;31-retrieval-mode-overview&quot;&gt;3.1 Retrieval Mode Overview&lt;/h4&gt;

&lt;p&gt;PaSa uses a single agentic search mode, with the Crawler autonomously navigating through citation networks:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Mode&lt;/th&gt;
      &lt;th&gt;Applicable Scenario&lt;/th&gt;
      &lt;th&gt;Core Mechanism&lt;/th&gt;
      &lt;th&gt;Characteristic&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Agentic Search&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Complex academic queries&lt;/td&gt;
      &lt;td&gt;Crawler autonomously chooses among [Search], [Expand], and [Stop]; Selector judges each paper&lt;/td&gt;
      &lt;td&gt;Learned optimal search strategy, not manually predefined&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;32-retrieval-procedure&quot;&gt;3.2 Retrieval Procedure&lt;/h4&gt;

&lt;h5 id=&quot;step-31-crawler-initialization-and-execution&quot;&gt;Step 3.1: Crawler Initialization and Execution&lt;/h5&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;User academic query q&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Crawler executes token-level MDP, autonomously deciding the next action&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Action [Search]&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Generate search query, invoke retrieval tool, add all results to paper queue&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Action [Expand]&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Extract sub-sections from current paper context, parse all citations and add to paper queue&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Action [Stop]&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Stop current paper processing, reset context, begin processing next paper in queue&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Key decision&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Crawler aims to maximize recall of relevant papers, reaching further relevant papers by following citation networks&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Continuously growing paper queue (potentially containing hundreds or even thousands of papers)&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
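
&lt;p&gt;The queue-driven control flow described in this table can be sketched as a toy loop. This is only an illustration under assumptions: in PaSa the actions are emitted token by token by the LLM policy, whereas here &lt;code&gt;policy&lt;/code&gt;, &lt;code&gt;search&lt;/code&gt;, and &lt;code&gt;citations&lt;/code&gt; are hypothetical callables supplied by the caller, and &lt;code&gt;max_papers&lt;/code&gt; is an added safety budget.&lt;/p&gt;

```python
from collections import deque


def crawl(query, policy, search, citations, max_papers=50):
    '''Toy Crawler loop over a paper queue.

    policy(query, paper) yields (action, argument) pairs; paper is None for
    the initial state that holds only the query.
    search(text) and citations(paper, section) return lists of paper ids.
    '''
    queue = deque([None])
    found = []  # paper ids in discovery order

    def enqueue(paper_ids):
        for pid in paper_ids:
            if pid not in found and max_papers > len(found):
                found.append(pid)
                queue.append(pid)

    while queue:
        paper = queue.popleft()  # fresh context for each queue entry
        for action, arg in policy(query, paper):
            if action == 'search':
                enqueue(search(arg))
            elif action == 'expand':
                enqueue(citations(paper, arg))
            elif action == 'stop':
                break  # move on to the next paper in the queue
    return found
```

&lt;p&gt;Because every discovered paper is appended to the same queue, papers found through one citation hop are themselves expanded later, which is how the queue can grow to hundreds of entries.&lt;/p&gt;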

&lt;h5 id=&quot;step-32-selector-judgment-and-ranking&quot;&gt;Step 3.2: Selector Judgment and Ranking&lt;/h5&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Query q + each paper in paper queue&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Selector carefully reads each paper, determining whether it meets query requirements&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;(1) True/False judgment; (2) supporting rationale; (3) decision token probability for ranking&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Key decision&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Selector emphasizes precise identification of papers meeting user needs, complementing Crawler’s recall objective&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;4-online-generation-generation-detailed-execution&quot;&gt;4. Online Generation: Generation (Detailed Execution)&lt;/h3&gt;

&lt;p&gt;PaSa’s “generation” phase is not traditional RAG answer generation, but rather relevance judgment and ranking of retrieval results through the Selector, ultimately outputting the set of papers satisfying the query.&lt;/p&gt;

&lt;h4 id=&quot;41-judgment-and-ranking-procedure&quot;&gt;4.1 Judgment and Ranking Procedure&lt;/h4&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;table class=&quot;rouge-table&quot;&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td class=&quot;rouge-code&quot;&gt;&lt;pre&gt;Input: query q, paper queue Q = {p_1, p_2, ..., p_n} collected by Crawler
    │
    ▼
┌─────────────────────────────┐
│ For each paper p_i in Q     │
│ Call Selector(q, p_i)       │
└─────────────┬───────────────┘
              │
         ┌────┴────┐
         ▼         ▼
      True       False
         │         │
         ▼         ▼
    Retain and   Filter
    rank
    (by decision probability)
         │
         ▼
    Output final paper set
&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h4 id=&quot;step-41-relevance-judgment&quot;&gt;Step 4.1: Relevance Judgment&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Item&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Input&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Query q, paper p&lt;sub&gt;i&lt;/sub&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Operation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Selector generates decision token d (True/False) and supporting rationale r&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Key decision&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Decision token probability can be used for ranking search results, enabling fine-grained sorting&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Relevance judgment + rationale + ranking score&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
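
&lt;p&gt;The judgment-and-ranking step can be sketched as below. This is a hedged illustration: &lt;code&gt;selector&lt;/code&gt; is a hypothetical callable standing in for the fine-tuned model, assumed to return the decision, the probability of the True decision token, and the rationale.&lt;/p&gt;

```python
def filter_and_rank(query, papers, selector):
    '''Keep papers the selector accepts, ordered by decision-token probability.

    selector(query, paper) returns (decision, p_true, rationale), where
    p_true is the probability assigned to the True decision token.
    '''
    kept = []
    for paper in papers:
        decision, p_true, rationale = selector(query, paper)
        if decision:
            kept.append((paper, p_true, rationale))
    # Highest probability first; the sort is stable, so ties keep queue order.
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept
```

&lt;p&gt;Ranking by the token probability rather than by the binary decision is what gives an ordering among accepted papers.&lt;/p&gt;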

&lt;hr /&gt;

&lt;h3 id=&quot;5-key-design-decisions&quot;&gt;5. Key Design Decisions&lt;/h3&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Decision Point&lt;/th&gt;
      &lt;th&gt;PaSa’s Choice&lt;/th&gt;
      &lt;th&gt;Alternative&lt;/th&gt;
      &lt;th&gt;Rationale&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;System architecture&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Crawler-Selector dual-component&lt;/td&gt;
      &lt;td&gt;Single-component end-to-end&lt;/td&gt;
      &lt;td&gt;Crawler maximizes recall, Selector maximizes precision; clear division of labor&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Training paradigm&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;RL (AGILE framework, PPO)&lt;/td&gt;
      &lt;td&gt;SFT / prompt engineering&lt;/td&gt;
      &lt;td&gt;Optimize PaSa in the AGILE RL framework, teaching the agent optimal search strategies&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Crawler training&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Imitation Learning + RL two-stage&lt;/td&gt;
      &lt;td&gt;SFT only / RL only&lt;/td&gt;
      &lt;td&gt;First generate trajectories for imitation learning, then apply RL&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Reward design&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Outcome-oriented + Selector auxiliary reward&lt;/td&gt;
      &lt;td&gt;Sparse matching reward only&lt;/td&gt;
      &lt;td&gt;AutoScholarQuery matching alone causes sparse rewards; Selector as auxiliary reward model mitigates this&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;RL optimization granularity&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Independent PPO per session&lt;/td&gt;
      &lt;td&gt;Unified optimization over entire trajectory&lt;/td&gt;
      &lt;td&gt;Session defined as sub-trajectory ending with [Stop]; each session independently optimized&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Selector output&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Decision token first, followed by the rationale&lt;/td&gt;
      &lt;td&gt;Rationale before decision token&lt;/td&gt;
      &lt;td&gt;Decision token placement enables Selector to serve as single-token reward model; probability usable for ranking&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Data generation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;GPT-4o synthesizes from Related Work&lt;/td&gt;
      &lt;td&gt;Manual annotation&lt;/td&gt;
      &lt;td&gt;Generates queries from Related Work sections; answers correspond to cited references&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;6-evaluation&quot;&gt;6. Evaluation&lt;/h3&gt;

&lt;h4 id=&quot;61-evaluation-metrics&quot;&gt;6.1 Evaluation Metrics&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Metric&lt;/th&gt;
      &lt;th&gt;Meaning&lt;/th&gt;
      &lt;th&gt;PaSa vs Baseline&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Recall@20&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Relevant paper recall in top 20 results&lt;/td&gt;
      &lt;td&gt;PaSa-7B vs Google with GPT-4o: +37.78%&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Recall@50&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Relevant paper recall in top 50 results&lt;/td&gt;
      &lt;td&gt;PaSa-7B vs Google with GPT-4o: +39.90%&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Recall&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Overall recall&lt;/td&gt;
      &lt;td&gt;PaSa-7B vs PaSa-GPT-4o: +30.36%&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Precision&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Precision&lt;/td&gt;
      &lt;td&gt;PaSa-7B vs PaSa-GPT-4o: +4.25%&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;F1&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Composite F1 score&lt;/td&gt;
      &lt;td&gt;Not recorded in these notes&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
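
&lt;p&gt;For reference, the metrics in the table can be computed with their standard definitions, sketched below. The notes do not spell out how the paper computes them, so treat the function names and the set-based formulation as assumptions.&lt;/p&gt;

```python
def recall_at_k(ranked, relevant, k):
    '''Fraction of relevant papers that appear in the top k ranked results.'''
    if not relevant:
        return 0.0
    return len(set(ranked[:k]).intersection(relevant)) / len(relevant)


def precision_recall_f1(returned, relevant):
    '''Set-based precision, recall, and F1 for an unranked result set.'''
    hits = len(set(returned).intersection(relevant))
    precision = hits / len(set(returned)) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f1
```

&lt;p&gt;For example, with four relevant papers and a ranked list whose first two entries contain one of them, &lt;code&gt;recall_at_k(ranked, relevant, 2)&lt;/code&gt; is 1/4.&lt;/p&gt;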

&lt;p&gt;&lt;img src=&quot;attachments/Pasted%20image%2020260514160522.png&quot; alt=&quot;Pasted image 20260514160522.png&quot; /&gt;&lt;/p&gt;

&lt;h4 id=&quot;62-comparative-experimental-setup&quot;&gt;6.2 Comparative Experimental Setup&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Condition&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;PaSa-7B&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Full system (Qwen2.5-7B base, SFT + RL training)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Google&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Traditional search engine baseline&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Google Scholar&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Academic search engine baseline&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Google with GPT-4o&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Google + GPT-4o query rewriting baseline&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;ChatGPT (search-enabled GPT-4o)&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Search-augmented ChatGPT baseline&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;GPT-o1&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;OpenAI o1 reasoning model baseline&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;PaSa-GPT-4o&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;PaSa baseline implemented by prompting GPT-4o&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;PaSa w/o RL&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;No-RL version baseline&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;63-datasets&quot;&gt;6.3 Datasets&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Dataset&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;AutoScholarQuery&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Synthetic academic query dataset (35k), for training&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;RealScholarQuery&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Real-world academic query benchmark, for evaluation&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;7-limitations-and-applicability&quot;&gt;7. Limitations and Applicability&lt;/h3&gt;

&lt;p&gt;The source notes do not explicitly document PaSa’s limitations.&lt;/p&gt;

&lt;h4 id=&quot;best-applicable-scenarios&quot;&gt;Best Applicable Scenarios&lt;/h4&gt;

&lt;ul&gt;
  &lt;li&gt;Complex academic queries that require long-tail expertise, survey-level coverage, or fine-grained matching&lt;/li&gt;
  &lt;li&gt;Literature retrieval tasks requiring deep exploration within citation networks&lt;/li&gt;
  &lt;li&gt;Academic search scenarios with high demands on both recall and precision&lt;/li&gt;
&lt;/ul&gt;

&lt;h4 id=&quot;unsuitable-scenarios&quot;&gt;Unsuitable Scenarios&lt;/h4&gt;

&lt;ul&gt;
  &lt;li&gt;None recorded in the source notes&lt;/li&gt;
&lt;/ul&gt;

&lt;hr /&gt;

&lt;h3 id=&quot;8-quick-reference&quot;&gt;8. Quick Reference&lt;/h3&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;What You Want to Know&lt;/th&gt;
      &lt;th&gt;See Which Section&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;What is the complete pipeline?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#0-execution-overview&quot;&gt;0. Execution Overview&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;High-level design comparison?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#1-high-level-design-indexing--retrieval--generation&quot;&gt;1. High-level Design&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;How is the model trained?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#2-offline-construction-training-detailed-execution&quot;&gt;2. Offline Construction&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;How is retrieval executed?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#3-online-query-retrieval-detailed-execution&quot;&gt;3. Online Query&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;How is relevance judged?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#4-online-generation-generation-detailed-execution&quot;&gt;4. Online Generation&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Why is it designed this way?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#5-key-design-decisions&quot;&gt;5. Key Design Decisions&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;How does it perform?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#6-evaluation&quot;&gt;6. Evaluation&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;When should it NOT be used?&lt;/td&gt;
      &lt;td&gt;&lt;a href=&quot;#7-limitations-and-applicability&quot;&gt;7. Limitations and Applicability&lt;/a&gt;&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

</description>
        <pubDate>Mon, 11 May 2026 00:00:00 +0000</pubDate>
        <link>https://www.caozihang.com//2026/05/11/week/</link>
        <guid isPermaLink="true">https://www.caozihang.com//2026/05/11/week/</guid>
        
        <category>日常</category>
        
        
      </item>
    
      <item>
        <title>[Reading] Weekly Reading Selections 2026-04-27 → 2026-05-03</title>
        <description>&lt;center style=&quot;margin-bottom: 20px; margin-top: 50px&quot;&gt;&lt;font color=&quot;#3879B1&quot; style=&quot;line-height: 1.4;font-weight: 700;font-size: 36px;box-sizing: border-box; &quot;&gt;Weekly Reading Selections&lt;/font&gt;&lt;/center&gt;

&lt;center style=&quot; margin-bottom: 30px;&quot;&gt;2026-04-27 → 2026-05-03&lt;/center&gt;

&lt;font style=&quot;font-weight: bold;&quot;&gt;Contents&lt;/font&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#学术相关&quot; id=&quot;markdown-toc-学术相关&quot;&gt;Academic&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#1-a-brief-review-of-traditional-information-retrieval&quot; id=&quot;markdown-toc-1-a-brief-review-of-traditional-information-retrieval&quot;&gt;1. A Brief Review of Traditional Information Retrieval&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#11-the-evolution-of-retrieval-methods&quot; id=&quot;markdown-toc-11-the-evolution-of-retrieval-methods&quot;&gt;1.1 The Evolution of Retrieval Methods&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#12-the-evolution-of-ranking-methods&quot; id=&quot;markdown-toc-12-the-evolution-of-ranking-methods&quot;&gt;1.2 The Evolution of Ranking Methods&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#13-pre-trained-transformers-and-the-representation-learning-revolution-in-ir&quot; id=&quot;markdown-toc-13-pre-trained-transformers-and-the-representation-learning-revolution-in-ir&quot;&gt;1.3 Pre-trained Transformers and the Representation Learning Revolution in IR&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#2-ir-systems-in-the-llm-era-the-technical-evolution-path-of-rag&quot; id=&quot;markdown-toc-2-ir-systems-in-the-llm-era-the-technical-evolution-path-of-rag&quot;&gt;2. IR Systems in the LLM Era: The Technical Evolution Path of RAG&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#21-why-rag-is-needed&quot; id=&quot;markdown-toc-21-why-rag-is-needed&quot;&gt;2.1 Why RAG Is Needed&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#22-naive-rag-the-simplest-form&quot; id=&quot;markdown-toc-22-naive-rag-the-simplest-form&quot;&gt;2.2 Naive RAG: The Simplest Form&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#23-the-technical-evolution-path-of-rag&quot; id=&quot;markdown-toc-23-the-technical-evolution-path-of-rag&quot;&gt;2.3 The Technical Evolution Path of RAG&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#24-core-limitations-of-rag&quot; id=&quot;markdown-toc-24-core-limitations-of-rag&quot;&gt;2.4 Core Limitations of RAG&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#3-graph-based-rag&quot; id=&quot;markdown-toc-3-graph-based-rag&quot;&gt;3. Graph-based RAG&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#31-definitions-and-taxonomy-of-graph-based-rag&quot; id=&quot;markdown-toc-31-definitions-and-taxonomy-of-graph-based-rag&quot;&gt;3.1 Definitions and Taxonomy of Graph-based RAG&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#32-the-three-stage-workflow-of-graph-based-rag&quot; id=&quot;markdown-toc-32-the-three-stage-workflow-of-graph-based-rag&quot;&gt;3.2 The Three-Stage Workflow of Graph-based RAG&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#33-applicability-of-graph-based-rag-when-are-graph-structures-needed&quot; id=&quot;markdown-toc-33-applicability-of-graph-based-rag-when-are-graph-structures-needed&quot;&gt;3.3 Applicability of Graph-based RAG: When Are Graph Structures Needed?&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#4-agentic-search&quot; id=&quot;markdown-toc-4-agentic-search&quot;&gt;4. Agentic Search&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#41-definition-and-core-components&quot; id=&quot;markdown-toc-41-definition-and-core-components&quot;&gt;4.1 Definition and Core Components&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#42-workflow-patterns-from-linear-reasoning-to-graph-structured-exploration&quot; id=&quot;markdown-toc-42-workflow-patterns-from-linear-reasoning-to-graph-structured-exploration&quot;&gt;4.2 Workflow Patterns: From Linear Reasoning to Graph-structured Exploration&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#43-agent-orchestration-and-training-paradigms&quot; id=&quot;markdown-toc-43-agent-orchestration-and-training-paradigms&quot;&gt;4.3 Agent Orchestration and Training Paradigms&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#44-are-graph-structures-necessary-for-agentic-search--ragsearch-empirical-evidence&quot; id=&quot;markdown-toc-44-are-graph-structures-necessary-for-agentic-search--ragsearch-empirical-evidence&quot;&gt;4.4 Are Graph Structures Necessary for Agentic Search? — RAGSearch Empirical Evidence&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#45-open-challenges-and-future-directions&quot; id=&quot;markdown-toc-45-open-challenges-and-future-directions&quot;&gt;4.5 Open Challenges and Future Directions&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#5-literature-retrieval-systems-in-scientific-research&quot; id=&quot;markdown-toc-5-literature-retrieval-systems-in-scientific-research&quot;&gt;5. Literature Retrieval Systems in Scientific Research&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#51-academic-search-platforms-and-tools&quot; id=&quot;markdown-toc-51-academic-search-platforms-and-tools&quot;&gt;5.1 Academic Search Platforms and Tools&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#52-four-core-limitations-of-academic-search&quot; id=&quot;markdown-toc-52-four-core-limitations-of-academic-search&quot;&gt;5.2 Four Core Limitations of Academic Search&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#53-research-ai-frameworks-and-evaluation-benchmarks&quot; id=&quot;markdown-toc-53-research-ai-frameworks-and-evaluation-benchmarks&quot;&gt;5.3 Research AI Frameworks and Evaluation Benchmarks&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#6-references&quot; id=&quot;markdown-toc-6-references&quot;&gt;6. References&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;学术相关&quot;&gt;Academic&lt;/h1&gt;

&lt;blockquote&gt;
  &lt;p&gt;Survey notes on RAG that I compiled recently; the text below was generated by Claude Code from those notes.
Information Retrieval, RAG, and Intelligent Literature Systems: A Technical Survey&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&quot;1-a-brief-review-of-traditional-information-retrieval&quot;&gt;1. A Brief Review of Traditional Information Retrieval&lt;/h2&gt;

&lt;p&gt;Modern large-scale information retrieval (IR) systems almost universally adopt a two-stage &lt;strong&gt;retrieval + ranking&lt;/strong&gt; pipeline (the retrieve-then-rerank architecture): a cheap retriever narrows the corpus to a candidate set, and a costlier ranker reorders only those candidates, trading off efficiency against effectiveness [Hambarde et al., 2023; Xu et al., 2025].&lt;/p&gt;
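
&lt;p&gt;A minimal sketch of a retrieve-then-rerank pipeline is shown below. It is illustrative only: term overlap stands in for a first-stage retriever such as BM25, Jaccard similarity stands in for a costlier neural ranker, and all function names are hypothetical.&lt;/p&gt;

```python
def retrieve(query, docs, top_n):
    '''Stage 1: cheap term-overlap scoring over the whole corpus.'''
    q_terms = set(query.lower().split())
    scored = []
    for doc_id, text in docs.items():
        overlap = len(q_terms.intersection(text.lower().split()))
        if overlap > 0:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_n]]


def jaccard(query, text):
    '''Stand-in for a costlier ranker: Jaccard similarity of term sets.'''
    a, b = set(query.lower().split()), set(text.lower().split())
    union = a.union(b)
    return len(a.intersection(b)) / len(union) if union else 0.0


def rerank(query, candidates, docs, scorer=jaccard):
    '''Stage 2: rescore only the retrieved candidates.'''
    return sorted(candidates, key=lambda d: scorer(query, docs[d]), reverse=True)
```

&lt;p&gt;The second stage only ever sees &lt;code&gt;top_n&lt;/code&gt; documents, so its per-document cost can be much higher than that of the first stage without affecting corpus-scale latency.&lt;/p&gt;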

&lt;h3 id=&quot;11-the-evolution-of-retrieval-methods&quot;&gt;1.1 The Evolution of Retrieval Methods&lt;/h3&gt;

&lt;p&gt;Retrieval methods have evolved from term-based matching to semantic vector-based approaches, which can be summarized into four categories:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Category&lt;/th&gt;
      &lt;th&gt;Core Idea&lt;/th&gt;
      &lt;th&gt;Representative Methods&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Traditional&lt;/td&gt;
      &lt;td&gt;Term matching, query expansion, topic models&lt;/td&gt;
      &lt;td&gt;TF-IDF, BM25, Query Expansion, Topic Model&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Sparse&lt;/td&gt;
      &lt;td&gt;Represent documents/queries as sparse vectors, activating few dimensions&lt;/td&gt;
      &lt;td&gt;Neural weight prediction, document expansion, BM25 variants&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Dense&lt;/td&gt;
      &lt;td&gt;Encode queries and documents as dense vectors, retrieve via vector similarity&lt;/td&gt;
      &lt;td&gt;Word2Vec, Sentence-BERT, DPR, ColBERT&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Hybrid&lt;/td&gt;
      &lt;td&gt;Combine the strengths of sparse and dense representations&lt;/td&gt;
      &lt;td&gt;SPLADE, ColBERT, and other late-fusion schemes&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;To enhance retrieval effectiveness, researchers have proposed two augmentation strategies. &lt;strong&gt;Query Augmentation&lt;/strong&gt; enriches the query representation on the query side through expansion, reformulation, and feedback, but is subject to &lt;strong&gt;query drift&lt;/strong&gt; and overfitting risks [Hambarde et al., 2023]. &lt;strong&gt;Document Augmentation&lt;/strong&gt; augments or semantically rewrites indexed content on the document side to narrow the semantic gap between queries and documents.&lt;/p&gt;

&lt;h3 id=&quot;12-the-evolution-of-ranking-methods&quot;&gt;1.2 The Evolution of Ranking Methods&lt;/h3&gt;

&lt;p&gt;The ranking stage is responsible for fine-grained relevance scoring of candidate documents returned by retrieval, and has similarly undergone a transformation from traditional methods to deep learning approaches:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Category&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
      &lt;th&gt;Representative Methods&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Learning-to-Rank&lt;/td&gt;
      &lt;td&gt;Classified by loss function into pointwise / pairwise / listwise&lt;/td&gt;
      &lt;td&gt;LambdaRank, LambdaMART&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Representation-based&lt;/td&gt;
      &lt;td&gt;Encode query and document separately, compute global similarity&lt;/td&gt;
      &lt;td&gt;DSSM, Siamese Network&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Interaction-based&lt;/td&gt;
      &lt;td&gt;Model query-document term-level interactions before scoring&lt;/td&gt;
      &lt;td&gt;DRMM, BERT-based Re-ranker&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Hybrid&lt;/td&gt;
      &lt;td&gt;Run a representation-based network and an interaction-based network in parallel, combining the efficiency of the former with the accuracy of the latter&lt;/td&gt;
      &lt;td&gt;DUET&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;The introduction of continuous vector representations enabled text ranking to transcend the ceiling of exact term matching, while neural networks eliminated the need for manual feature engineering [Hambarde et al., 2023]. In practical systems, representation-based models typically pre-encode documents into vectors during offline stages, making them suitable for efficient initial retrieval; interaction-based models jointly process queries and documents during the online stage, achieving deeper and more precise relevance matching but at higher computational cost, making them better suited for re-ranking a smaller set of candidate results [Xu et al., 2025]. Recognizing the complementary strengths of these two architectures, researchers further proposed hybrid models that combine the efficiency of representation-based methods with the effectiveness of interaction-based methods [Xu et al., 2025].&lt;/p&gt;
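
&lt;p&gt;The division of labor described above can be sketched as a toy two-stage pipeline. This is a minimal illustration, not an implementation from any of the cited surveys: &lt;code&gt;embed&lt;/code&gt; (a bag-of-words counter) stands in for a representation-based encoder, &lt;code&gt;interaction_score&lt;/code&gt; (unigram plus ordered-bigram overlap) stands in for an interaction-based scorer, and the three documents are made up.&lt;/p&gt;

```python
import math
from collections import Counter

# Hypothetical mini-corpus used only for illustration.
DOCS = {
    "d1": "bm25 is a sparse term matching ranking function",
    "d2": "dense retrieval encodes queries and documents as vectors",
    "d3": "cross encoders jointly process the query and the document",
}

def embed(text):
    """Toy stand-in for a representation-based encoder: a bag-of-words vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Offline stage: documents are encoded once, independently of any query.
DOC_VECTORS = {doc_id: embed(text) for doc_id, text in DOCS.items()}

def first_stage(query, k=2):
    """Cheap candidate retrieval: compare the query vector with stored document vectors."""
    q = embed(query)
    ranked = sorted(DOC_VECTORS, key=lambda d: cosine(q, DOC_VECTORS[d]), reverse=True)
    return ranked[:k]

def interaction_score(query, doc_text):
    """Toy stand-in for an interaction-based scorer: it reads query and document together
    and rewards query terms that appear adjacent and in order in the document."""
    q_terms = query.lower().split()
    d_terms = doc_text.lower().split()
    d_bigrams = set(zip(d_terms, d_terms[1:]))
    unigram_hits = sum(1 for t in q_terms if t in d_terms)
    bigram_hits = sum(1 for pair in zip(q_terms, q_terms[1:]) if pair in d_bigrams)
    return unigram_hits + 2 * bigram_hits

def rerank(query, candidates):
    """Online stage: the costlier scorer runs only on the short candidate list."""
    return sorted(candidates, key=lambda d: interaction_score(query, DOCS[d]), reverse=True)

candidates = first_stage("dense retrieval vectors", k=2)
final = rerank("dense retrieval vectors", candidates)
```

&lt;p&gt;The point of the sketch is the cost profile: the first stage touches every document but only through precomputed vectors, while the second stage reads full text for just &lt;code&gt;k&lt;/code&gt; candidates.&lt;/p&gt;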

&lt;h3 id=&quot;13-pre-trained-transformers-and-the-representation-learning-revolution-in-ir&quot;&gt;1.3 Pre-trained Transformers and the Representation Learning Revolution in IR&lt;/h3&gt;

&lt;p&gt;The advent of pre-trained Transformers has further reshaped IR model architectures. By network structure, they can be broadly categorized into three types:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Architecture&lt;/th&gt;
      &lt;th&gt;Representative Models&lt;/th&gt;
      &lt;th&gt;Characteristics&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Encoder-only&lt;/td&gt;
      &lt;td&gt;BERT (Cross-Encoder)&lt;/td&gt;
      &lt;td&gt;Deep interaction, high ranking quality, but computationally expensive&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Encoder-decoder&lt;/td&gt;
      &lt;td&gt;T5, BART&lt;/td&gt;
      &lt;td&gt;Balances understanding and generation capabilities&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Decoder-only&lt;/td&gt;
      &lt;td&gt;GPT series&lt;/td&gt;
      &lt;td&gt;Strong generation capability&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;To bridge the effectiveness gap between efficient Bi-Encoders and powerful Cross-Encoders, &lt;strong&gt;Knowledge Distillation (KD)&lt;/strong&gt; has been widely adopted in the IR domain. However, regardless of architectural evolution, retrieval systems always face a fundamental trade-off between &lt;strong&gt;effectiveness&lt;/strong&gt; (recall, MRR, nDCG) and &lt;strong&gt;resource efficiency&lt;/strong&gt; (latency, throughput, memory footprint, indexing and update costs). While neural retrievers can significantly improve effectiveness, they require additional computational and indexing engineering investment. Furthermore, &lt;strong&gt;Calibration&lt;/strong&gt; mechanisms aim to align model output scores with true relevance probabilities, ensuring that confidence scores faithfully reflect correctness.&lt;/p&gt;
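
&lt;p&gt;For reference, the effectiveness metrics named above can be computed as follows. This is a minimal sketch using the common textbook definitions (linear gain and a log2 discount for nDCG); the function names and input shapes are assumptions made for illustration, and evaluation toolkits may differ in details such as the gain function.&lt;/p&gt;

```python
import math

def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of all relevant documents that appear in the top k."""
    hits = len(set(ranked_ids[:k]).intersection(relevant_ids))
    return hits / len(relevant_ids) if relevant_ids else 0.0

def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0

def mean_reciprocal_rank(runs):
    """runs: list of (ranked_ids, relevant_ids) pairs, one pair per query."""
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)

def dcg(gains):
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))

def ndcg_at_k(ranked_ids, graded_relevance, k):
    """graded_relevance maps doc id to gain; unjudged documents count as 0."""
    gains = [graded_relevance.get(d, 0) for d in ranked_ids[:k]]
    ideal = sorted(graded_relevance.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0
```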

&lt;p&gt;At this point, information retrieval has fully entered the representation learning era of deep learning. Pre-trained models have endowed retrieval systems with powerful semantic understanding capabilities, yet the fundamental paradigm remains &lt;strong&gt;user issues a query → system returns a ranked list of relevant documents&lt;/strong&gt;. The emergence of large language models is poised to fundamentally alter this paradigm.&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;2-ir-systems-in-the-llm-era-the-technical-evolution-path-of-rag&quot;&gt;2. IR Systems in the LLM Era: The Technical Evolution Path of RAG&lt;/h2&gt;

&lt;p&gt;The fundamental paradigm of traditional information retrieval systems is to return a ranked list of relevant documents in response to user queries; the system itself is not responsible for understanding document content and generating coherent answers. Large Language Models (LLMs), while possessing powerful language understanding and generation capabilities, exhibit significant limitations when relying solely on their internal parametric knowledge. This complementarity gave rise to Retrieval-Augmented Generation (RAG), which combines the external knowledge acquisition capability of IR with the text generation capability of LLMs, forming an entirely new information processing paradigm.&lt;/p&gt;

&lt;h3 id=&quot;21-why-rag-is-needed&quot;&gt;2.1 Why RAG Is Needed&lt;/h3&gt;

&lt;p&gt;Before RAG, the field of information access went through two major phases, each with fundamental shortcomings.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Limitations of Pure LLMs (Chatbot Mode).&lt;/strong&gt; When LLMs rely solely on parametric knowledge learned during pre-training to answer user queries, they face three core challenges: first, &lt;strong&gt;hallucination&lt;/strong&gt;, where the model may generate plausible but inaccurate content; second, &lt;strong&gt;lack of timeliness&lt;/strong&gt;, as the model has no knowledge of events beyond its training data cutoff and cannot provide up-to-date information; and third, &lt;strong&gt;context window limitations&lt;/strong&gt;, where even with massive parameter counts, the context length the model can process in a single pass remains limited (typically 2K–32K tokens), making it difficult to comprehensively understand complex queries requiring integration of extensive background knowledge [Zhang et al., 2025b].&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Limitations of Traditional IR Systems.&lt;/strong&gt; While traditional search engines or retrieval systems can return large volumes of relevant documents based on indexing, their responsibility ends there — they are designed to “find” information, not to “understand” and “answer.” After obtaining a document list, users must still read, screen, and synthesize on their own to form a complete answer. For complex questions requiring cross-document correlation analysis or synthetic reasoning, the cognitive burden of this process is extremely high.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Core Motivation of RAG.&lt;/strong&gt; The central idea of RAG is &lt;strong&gt;Retrieve-then-Read&lt;/strong&gt;: first use a retrieval system to retrieve relevant document fragments from an external knowledge base, then concatenate these retrieved texts as context into the LLM’s input prompt, guiding the model to generate responses based on these external factual inputs. This approach leverages the powerful language understanding and generation capabilities of LLMs while harnessing the retrieval system to ensure the factuality and timeliness of responses, simultaneously alleviating hallucination to some extent [Zhang et al., 2025b; Gao et al., 2024].&lt;/p&gt;

&lt;h3 id=&quot;22-naive-rag-the-simplest-form&quot;&gt;2.2 Naive RAG: The Simplest Form&lt;/h3&gt;

&lt;p&gt;The most rudimentary implementation of RAG is exceedingly straightforward: a user query enters the system, relevant fragments are retrieved from the document collection via keyword matching (such as TF-IDF or BM25), these fragments are concatenated with the original query to form a prompt, which is then passed to the LLM to generate a final answer. This design is conceptually clear and easy to implement, and can achieve reasonable results for simple factual question answering.&lt;/p&gt;
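
&lt;p&gt;The flow just described can be sketched in a few lines. This is a minimal illustration under stated assumptions: the three chunks, the simplified TF-IDF weighting, and the prompt template are all made up, and the final LLM call is deliberately left out because it depends on whichever model API is in use.&lt;/p&gt;

```python
import math
from collections import Counter

# Hypothetical document fragments standing in for a chunked corpus.
CHUNKS = [
    "BM25 ranks documents by term frequency and inverse document frequency.",
    "Dense retrievers encode text into vectors.",
    "RAG concatenates retrieved text with the query before generation.",
]

def tokenize(text):
    return [t.strip(".,?!").lower() for t in text.split()]

def tfidf_scores(query, chunks):
    """Score each chunk by the summed TF-IDF weight of the query terms it contains."""
    tokenized = [tokenize(c) for c in chunks]
    n = len(chunks)
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = sum(tf[t] * math.log(1 + n / df[t]) for t in tokenize(query) if t in tf)
        scores.append(score)
    return scores

def retrieve(query, chunks, k=2):
    """One-shot keyword retrieval: keep the top-k chunks with a non-zero score."""
    scores = tfidf_scores(query, chunks)
    order = sorted(range(len(chunks)), key=lambda i: scores[i], reverse=True)
    return [chunks[i] for i in order[:k] if scores[i] > 0]

def build_prompt(query, retrieved):
    """Concatenate retrieved fragments with the query; the template is an assumption."""
    context = "\n".join(f"[{i}] {c}" for i, c in enumerate(retrieved, start=1))
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}\nAnswer:"

query = "How does RAG use retrieved text?"
prompt = build_prompt(query, retrieve(query, CHUNKS))
# The prompt would now be sent to an LLM of choice.
```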

&lt;p&gt;However, this naive “one-shot retrieval + direct generation” architecture quickly reveals its fragility in complex scenarios: the retrieval stage may return insufficiently relevant fragments due to inherent limitations in similarity computation; the generation stage struggles to deeply integrate retrieved content with the query, as multi-source similar information tends to produce redundancy or incoherent content; and the entire pipeline is static and linear, lacking the ability to adjust based on intermediate results [Gao et al., 2024]. These limitations prompted researchers to explore more advanced RAG variants.&lt;/p&gt;

&lt;h3 id=&quot;23-the-technical-evolution-path-of-rag&quot;&gt;2.3 The Technical Evolution Path of RAG&lt;/h3&gt;

&lt;p&gt;From the historical perspective of information retrieval, the way humans access information has evolved from Web Search (static keyword-based search) to LLM as Chatbot (pure parametric knowledge generation) to LLM with RAG (retrieval-augmented generation), and is now advancing toward Multi-hop Retrieval (iterative multi-hop retrieval) and Deep Research [Zhang et al., 2025b]. Deep Research emphasizes a dynamic feedback loop between reasoning and search: the reasoning process actively influences search strategies (e.g., refining queries based on intermediate deductions), while retrieved information in turn recursively refines the reasoning process.&lt;/p&gt;

&lt;p&gt;From a technical perspective, the evolution of RAG can be summarized as a clear five-stage progression [Gao et al., 2024]:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Naive RAG → Advanced RAG → Modular RAG → Graph RAG → Agentic RAG
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Stage&lt;/th&gt;
      &lt;th&gt;Key Characteristics&lt;/th&gt;
      &lt;th&gt;Core Improvements&lt;/th&gt;
      &lt;th&gt;Limitations&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Naive RAG&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Keyword retrieval (TF-IDF, BM25), results directly concatenated to prompt&lt;/td&gt;
      &lt;td&gt;Simple architecture, easy to implement&lt;/td&gt;
      &lt;td&gt;Lacks context awareness, fragmented output, limited scalability&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Advanced RAG&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Introduces dense vector retrieval models, vector search, context re-ranking, multi-hop iterative retrieval&lt;/td&gt;
      &lt;td&gt;Significantly improves retrieval quality and context relevance&lt;/td&gt;
      &lt;td&gt;Increased computational overhead, scalability still limited&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Modular RAG&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Decouples retrieval, ranking, and generation into independent reusable modules, supports hybrid retrieval strategies and external tool integration&lt;/td&gt;
      &lt;td&gt;Substantially improved system flexibility and scalability&lt;/td&gt;
      &lt;td&gt;Inter-module collaborative optimization still requires manual design&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Graph RAG&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Leverages graph node connectivity and edge relations to organize knowledge, enabling hierarchical knowledge management and structured navigation&lt;/td&gt;
      &lt;td&gt;Supports cross-document correlation, multi-hop reasoning, and global context understanding&lt;/td&gt;
      &lt;td&gt;High graph construction cost, scalability limited by graph scale, reliance on high-quality data&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Agentic RAG&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Introduces autonomous agents for iterative query refinement, adaptive retrieval strategy selection, and dynamic workflow orchestration&lt;/td&gt;
      &lt;td&gt;Possesses multi-step reasoning and autonomous decision-making capabilities&lt;/td&gt;
      &lt;td&gt;High coordination complexity, significant computational overhead, increased latency&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Meanwhile, Li et al. (2025), from the perspective of the interplay between RAG and Reasoning, proposed a higher-level framework that categorizes their integration into three paradigms:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Paradigm&lt;/th&gt;
      &lt;th&gt;Core Idea&lt;/th&gt;
      &lt;th&gt;Direction&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Reasoning-Enhanced RAG&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Leverages reasoning capabilities to optimize RAG’s retrieval, integration, and generation stages&lt;/td&gt;
      &lt;td&gt;Reasoning → RAG&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;RAG-Enhanced Reasoning&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Leverages external knowledge from retrieval to enhance LLM reasoning capabilities&lt;/td&gt;
      &lt;td&gt;RAG → Reasoning&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Synergized RAG-Reasoning&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Deep coupling of retrieval and reasoning, iterative alternating execution, forming a dynamic feedback loop&lt;/td&gt;
      &lt;td&gt;RAG ↔ Reasoning&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;These three paradigms show that RAG is more than “attaching an external retriever to an LLM”: it is a process of continuous integration and mutual enhancement between retrieval and reasoning capabilities. Early RAG systems primarily belonged to the first two, unidirectional modes, while current frontier work is moving toward the third mode of deep synergy.&lt;/p&gt;

&lt;h3 id=&quot;24-core-limitations-of-rag&quot;&gt;2.4 Core Limitations of RAG&lt;/h3&gt;

&lt;p&gt;Despite continuous advances in RAG technology, traditional RAG systems (primarily Naive and Advanced RAG) face fundamental challenges across multiple dimensions. Synthesizing findings from multiple surveys, these limitations can be summarized along three dimensions:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Retrieval Dimension.&lt;/strong&gt; A single retrieval pass cannot guarantee acquisition of all relevant information needed to resolve a query, and assessing the relevance of retrieval results is itself challenging. When queries involve complex knowledge requiring synthesis across multiple data sources, simultaneously satisfying the sufficiency and accuracy of retrieval becomes difficult [Li et al., 2025; Gao et al., 2024]. Furthermore, deep-seated limitations at the embedding level — including biases in training data leading to high-frequency task/language/domain preferences, poor performance of general-purpose corpus models in specialized domains (e.g., biomedicine, scientific literature), and insufficient capture of long-range structural information — further constrain the upper bound of retrieval quality [Zhang et al., 2025c].&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reasoning Dimension.&lt;/strong&gt; Traditional RAG lacks genuine multi-step reasoning capabilities. The system cannot dynamically refine retrieval strategies based on intermediate insights or user feedback, making it difficult to handle tasks requiring complex logical chains and deep contextual understanding [Singh et al., 2026; Li et al., 2025]. Errors in early reasoning paths may propagate through subsequent steps, affecting the completeness of the final output [Zhang et al., 2025b]. Additionally, models struggle to maintain fidelity to retrieved evidence when conflicts arise between external retrieved evidence and internal parametric knowledge.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;System Dimension.&lt;/strong&gt; RAG pipelines are typically static and linear, lacking adaptive adjustment mechanisms, making it difficult to accommodate queries of varying complexity and dynamically updating knowledge bases [Li et al., 2025; Singh et al., 2026]. The entire workflow — from preprocessing and index construction to real-time retrieval and generation — faces efficiency bottlenecks in large-scale deployment [Zhang et al., 2025b]. Moreover, effectively integrating similar information retrieved from multiple sources, avoiding redundant output while ensuring stylistic and tonal consistency of generated content, remains a persistent practical challenge [Gao et al., 2024].&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Emergence of Enhanced Retrieval Modes.&lt;/strong&gt; To overcome the limitations of single-pass retrieval, researchers proposed three enhanced retrieval modes [Gao et al., 2024]: &lt;strong&gt;Iterative Retrieval&lt;/strong&gt; gradually enriches context through multiple retrieval-generation cycles; &lt;strong&gt;Recursive Retrieval&lt;/strong&gt; progressively decomposes complex problems for deeper retrieval; and &lt;strong&gt;Adaptive Retrieval&lt;/strong&gt; dynamically controls retrieval and generation behavior on demand. These ideas were explored in preliminary form in the first two generations of RAG, but the static system architecture kept their potential from being fully realized. True breakthroughs came from two directions: first, using graph structures to explicitly model knowledge associations (Section 3, Graph-based RAG); and second, using autonomous agents to dynamically orchestrate retrieval and reasoning processes (Section 4, Agentic Search).&lt;/p&gt;
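
&lt;p&gt;As an illustration of the iterative mode, the loop below alternates retrieval with query refinement until the gathered evidence is judged sufficient. It is a toy sketch under stated assumptions: the three facts are invented, term overlap stands in for a real retriever, adding evidence terms to the query stands in for LLM-driven refinement, and &lt;code&gt;is_sufficient&lt;/code&gt; is a hypothetical callback standing in for a model deciding whether to stop.&lt;/p&gt;

```python
# Hypothetical knowledge base: answering the question needs a chain of three facts.
FACTS = [
    "The survey was written by Alice",
    "Alice works at the Example Institute",
    "The Example Institute is located in Utopia",
]
STOPWORDS = {"the", "was", "by", "is", "in", "at", "of", "a"}

def terms(text):
    return {t.strip(".,?").lower() for t in text.split()} - STOPWORDS

def retrieve(query_terms, seen):
    """Return the unseen fact with the largest term overlap, or None if nothing matches."""
    best, best_overlap = None, 0
    for fact in FACTS:
        if fact in seen:
            continue
        overlap = len(query_terms.intersection(terms(fact)))
        if overlap > best_overlap:
            best, best_overlap = fact, overlap
    return best

def iterative_retrieval(question, is_sufficient, max_rounds=4):
    """Alternate retrieval and query refinement until the evidence suffices."""
    query_terms = terms(question)
    context = []
    for _ in range(max_rounds):
        fact = retrieve(query_terms, context)
        if fact is None:
            break
        context.append(fact)
        if is_sufficient(context):
            break
        # Stand-in for LLM-driven refinement: enrich the next query with new evidence.
        query_terms = query_terms.union(terms(fact))
    return context

question = "Which country hosts the survey author?"
found_location = lambda ctx: any("located" in fact for fact in ctx)
context = iterative_retrieval(question, found_location)
single_pass = iterative_retrieval(question, found_location, max_rounds=1)
```

&lt;p&gt;With one round only the first fact is found; the location fact shares no terms with the original question and becomes reachable only after two rounds of accumulated evidence.&lt;/p&gt;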

&lt;hr /&gt;

&lt;h2 id=&quot;3-graph-based-rag&quot;&gt;3. Graph-based RAG&lt;/h2&gt;

&lt;p&gt;When RAG needs to handle cross-document correlation, multi-hop reasoning, and global knowledge integration, traditional text-fragment-based indexing and retrieval methods face structural limitations. Graph-based RAG elevates RAG from “retrieving text fragments” to “navigating and reasoning within structured knowledge networks” by introducing graph structures to explicitly model inter-knowledge associations.&lt;/p&gt;

&lt;h3 id=&quot;31-definitions-and-taxonomy-of-graph-based-rag&quot;&gt;3.1 Definitions and Taxonomy of Graph-based RAG&lt;/h3&gt;

&lt;h4 id=&quot;311-text-attributed-graphs-and-formal-definition&quot;&gt;3.1.1 Text-Attributed Graphs and Formal Definition&lt;/h4&gt;

&lt;p&gt;Graph-based RAG uniformly represents graph data as &lt;strong&gt;Text-Attributed Graphs (TAGs)&lt;/strong&gt; [Peng et al., 2024]:&lt;/p&gt;

\[G = (V, E, A, \{x_v\}_{v \in V}, \{e_{i,j}\}_{(i,j) \in E})\]

&lt;p&gt;where $V$ is the node set, $E \subseteq V \times V$ is the edge set, $A$ is the adjacency matrix, and $\{x_v\}_{v \in V}$ and $\{e_{i,j}\}_{(i,j) \in E}$ are the text attributes of nodes and edges, respectively. The objective of Graph-based RAG is to find the optimal answer $a^*$ given a query $q$ and a TAG $G$:&lt;/p&gt;

\[a^* = \arg\max_{a} p(a \mid q, G)\]

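&lt;p&gt;Through joint modeling, the answer probability is factored into the probability of retrieving a subgraph and the probability of generating the answer from that subgraph. Because the number of candidate subgraphs grows quickly with graph size, the sum over all subgraphs is approximated by its value at the optimal retrieved subgraph $G^*$, where $p_\theta$ is the graph retriever and $p_\phi$ is the answer generator:&lt;/p&gt;

\[p(a \mid q, G) \approx p_\phi(a \mid q, G^*)\, p_\theta(G^* \mid q, G)\]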

&lt;h4 id=&quot;312-three-category-taxonomy&quot;&gt;3.1.2 Three-Category Taxonomy&lt;/h4&gt;

&lt;p&gt;Based on the role the graph plays in the RAG system, Graph-based RAG can be classified into three types [Zhang et al., 2025a]:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Type&lt;/th&gt;
      &lt;th&gt;Core Idea&lt;/th&gt;
      &lt;th&gt;Characteristics&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Knowledge-based&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Graph as knowledge carrier&lt;/td&gt;
      &lt;td&gt;Explicitly models domain knowledge and semantic relations; understands complex relations through graph transformations&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Index-based&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Graph as indexing tool&lt;/td&gt;
      &lt;td&gt;Organizes raw text through graphs; optimizes retrieval and global navigation&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Hybrid&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Combines strengths of both&lt;/td&gt;
      &lt;td&gt;Provides more advanced solutions for complex reasoning tasks&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Graph-based RAG systems can reduce token usage by &lt;strong&gt;26% to 97%&lt;/strong&gt; compared to conventional methods when generating answers, with significant improvements in both speed and resource utilization [Zhang et al., 2025a].&lt;/p&gt;

&lt;h3 id=&quot;32-the-three-stage-workflow-of-graph-based-rag&quot;&gt;3.2 The Three-Stage Workflow of Graph-based RAG&lt;/h3&gt;

&lt;p&gt;The workflow of Graph-based RAG can be summarized as three core stages: &lt;strong&gt;G-Indexing&lt;/strong&gt; (graph-based index construction), &lt;strong&gt;G-Retrieval&lt;/strong&gt; (graph-guided retrieval), and &lt;strong&gt;G-Generation&lt;/strong&gt; (graph-enhanced generation) [Peng et al., 2024; Zhang et al., 2025a; Huang et al., 2026]. Rich optimization methods exist before and after each stage, enabling systematic improvement of retrieval and generation quality.&lt;/p&gt;

&lt;hr /&gt;

&lt;h4 id=&quot;321-g-indexing-graph-based-index-construction&quot;&gt;3.2.1 G-Indexing: Graph-based Index Construction&lt;/h4&gt;

&lt;h5 id=&quot;pre-indexing-data-preprocessing-and-index-optimization&quot;&gt;Pre-indexing: Data Preprocessing and Index Optimization&lt;/h5&gt;

&lt;p&gt;Index quality is directly determined by the effectiveness of raw data processing. Before constructing the graph index, systematic preprocessing and optimization of the data are required [Huang et al., 2026; Gao et al., 2024]:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Optimization Direction&lt;/th&gt;
      &lt;th&gt;Specific Methods&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Data Cleaning&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Remove irrelevant/redundant information, supplement additional information&lt;/td&gt;
      &lt;td&gt;Improves index purity, reduces noise interference&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Document Segmentation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Sliding Window, Fine-grained Segmentation&lt;/td&gt;
      &lt;td&gt;Balances context completeness with index granularity&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Metadata Augmentation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Metadata Incorporation&lt;/td&gt;
      &lt;td&gt;Enriches index information with metadata, supporting subsequent filtering and routing&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;These preprocessing techniques establish the data foundation for subsequent graph index construction.&lt;/p&gt;
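
&lt;p&gt;A minimal sketch of sliding-window segmentation with attached metadata is shown below. The function name, the whitespace tokenization, and the &lt;code&gt;source&lt;/code&gt; and &lt;code&gt;start&lt;/code&gt; fields are assumptions chosen for illustration; real pipelines usually count model tokens and carry richer metadata.&lt;/p&gt;

```python
def sliding_window_chunks(tokens, window, stride, source="doc-0"):
    """Split a token list into overlapping windows.

    Consecutive windows share (window - stride) tokens, so context that straddles
    a boundary still appears intact in at least one chunk.
    """
    if not (window >= stride >= 1):
        raise ValueError("stride must be at least 1 and at most window")
    chunks = []
    start = 0
    while tokens:
        piece = tokens[start:start + window]
        # Metadata travels with each chunk so later stages can filter or route on it.
        chunks.append({"source": source, "start": start, "text": " ".join(piece)})
        if start + window >= len(tokens):
            break
        start += stride
    return chunks

tokens = "a b c d e f g".split()
chunks = sliding_window_chunks(tokens, window=4, stride=2)
```

&lt;p&gt;A smaller stride increases overlap and index size; a larger window preserves more context per chunk at the cost of coarser retrieval granularity.&lt;/p&gt;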

&lt;h5 id=&quot;index-method-taxonomy&quot;&gt;Index Method Taxonomy&lt;/h5&gt;

&lt;p&gt;Based on the degree of graph structure preservation, indexing methods can be classified into four categories [Peng et al., 2024; Zhang et al., 2025a]:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Method&lt;/th&gt;
      &lt;th&gt;Characteristics&lt;/th&gt;
      &lt;th&gt;Algorithms&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Graph Indexing&lt;/td&gt;
      &lt;td&gt;Preserves complete graph structure&lt;/td&gt;
      &lt;td&gt;BFS, Shortest Path&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Text Indexing&lt;/td&gt;
      &lt;td&gt;Converts graph data into text descriptions&lt;/td&gt;
      &lt;td&gt;Sparse Retrieval, Dense Retrieval&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Vector Indexing&lt;/td&gt;
      &lt;td&gt;Converts to vector representations for efficiency&lt;/td&gt;
      &lt;td&gt;LSH (Locality Sensitive Hashing)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Hybrid Indexing&lt;/td&gt;
      &lt;td&gt;Combines all three approaches&lt;/td&gt;
      &lt;td&gt;—&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
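
&lt;p&gt;To make the first row concrete, the sketch below keeps the graph structure intact and uses BFS to pull the k-hop neighborhood of a seed node, then verbalizes the retrieved edges as text for a downstream generator. The tiny graph, the dictionary layout, and the function names are hypothetical; edges are traversed as undirected, which is a simplifying assumption.&lt;/p&gt;

```python
from collections import deque

# A tiny text-attributed graph: node ids map to node text, edge pairs map to edge text.
NODE_TEXT = {
    "bm25": "BM25",
    "sparse": "sparse retrieval",
    "ir": "information retrieval",
    "dpr": "DPR",
}
EDGES = {
    ("bm25", "sparse"): "is a method of",
    ("sparse", "ir"): "is a branch of",
    ("dpr", "ir"): "is used in",
}

def neighbors(node):
    """Yield (neighbor, edge_key) pairs, treating edges as undirected for traversal."""
    for (src, dst) in EDGES:
        if src == node:
            yield dst, (src, dst)
        elif dst == node:
            yield src, (src, dst)

def k_hop_subgraph(seed, k):
    """Breadth-first search from seed, collecting nodes and edges within k hops."""
    visited = {seed}
    edges = []
    queue = deque([(seed, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == k:
            continue  # do not expand beyond the hop limit
        for nxt, edge in neighbors(node):
            if edge not in edges:
                edges.append(edge)
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, depth + 1))
    return visited, edges

def verbalize(edges):
    """Turn retrieved edges into sentences using the node and edge text attributes."""
    return [f"{NODE_TEXT[s]} {EDGES[(s, d)]} {NODE_TEXT[d]}" for s, d in edges]

nodes, edges = k_hop_subgraph("bm25", k=2)
sentences = verbalize(edges)
```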

&lt;p&gt;Beyond graph indexing, Graph-based RAG can additionally leverage &lt;strong&gt;hierarchical index structures&lt;/strong&gt; to establish multi-granularity retrieval support for documents, as well as &lt;strong&gt;knowledge graph indexing&lt;/strong&gt; that uses knowledge graphs to organize document relations and enable structured navigation [Gao et al., 2024].&lt;/p&gt;

&lt;h5 id=&quot;knowledge-graphs-as-structural-indices&quot;&gt;Knowledge Graphs as Structural Indices&lt;/h5&gt;

&lt;p&gt;In Graph-based RAG, knowledge graphs serve not merely as a knowledge representation form but as a powerful structural indexing mechanism for organizing inter-document relations and supporting structured navigation [Peng et al., 2024; Zhang et al., 2025a].&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;KG Construction Challenges.&lt;/strong&gt; At the knowledge graph construction level, different scenarios face distinct challenges. Domain-specific corpora face a triple challenge: complex knowledge dependencies (domain knowledge progresses from foundational to advanced concepts layer by layer, requiring cross-reference analysis), domain specificity (dense technical terminology and abbreviations), and limited reference knowledge (private technical documents are difficult to obtain externally) [Sun et al., 2025]. For general-purpose knowledge graphs, Li et al. (2026) note that while LLM pipelines can extract entities and relations at scale, the resulting graphs often lack a shared schema, with entity types and relation vocabularies being ad hoc. To address this, ontology-oriented construction methods emphasize building schema as a first-class resource for downstream tasks from the outset, rather than as a byproduct of graph construction.&lt;/p&gt;

&lt;hr /&gt;

&lt;h5 id=&quot;two-view-kg-the-ontological-dual-layer-architecture&quot;&gt;Two-View KG: The Ontological Dual-Layer Architecture&lt;/h5&gt;

&lt;p&gt;The &lt;strong&gt;Two-View KG&lt;/strong&gt; concept proposed by Hao et al. (2019) reveals the ontological dual-layer architecture of knowledge graphs, simultaneously representing two complementary views:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Ontology View&lt;/strong&gt;: The abstract concept layer, defining types, relation schemas, and other meta-knowledge&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Instance View&lt;/strong&gt;: The concrete entity layer, containing actual fact triples&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Between the two views exist &lt;strong&gt;cross-view links&lt;/strong&gt; that connect ontological concepts with their instantiated entities, while satisfying mutual disjointness constraints (the entity vocabulary set and concept vocabulary set are disjoint, as are the relation set and meta-relation set). This dual-layer structure provides Graph-based RAG systems with navigation paths from abstract concepts to concrete instances, enabling the system to both understand high-level semantic patterns and locate specific knowledge details.&lt;/p&gt;
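
&lt;p&gt;One way to picture this structure is as a small container that holds both views plus the cross-view links and checks the disjointness constraints. The class below is an illustrative sketch only: the field names, the &lt;code&gt;validate&lt;/code&gt; method, and the example data are assumptions, not the formalization used by Hao et al. (2019).&lt;/p&gt;

```python
from dataclasses import dataclass, field

@dataclass
class TwoViewKG:
    entities: set
    concepts: set
    relations: set
    meta_relations: set
    instance_triples: list = field(default_factory=list)  # (entity, relation, entity)
    ontology_triples: list = field(default_factory=list)  # (concept, meta_relation, concept)
    cross_links: list = field(default_factory=list)       # (entity, concept)

    def validate(self):
        """Check the disjointness constraints and that links join the two views."""
        if self.entities.intersection(self.concepts):
            raise ValueError("entity and concept vocabularies must be disjoint")
        if self.relations.intersection(self.meta_relations):
            raise ValueError("relation and meta-relation sets must be disjoint")
        for entity, concept in self.cross_links:
            if entity not in self.entities or concept not in self.concepts:
                raise ValueError("cross-view links must connect an entity to a concept")

    def instances_of(self, concept):
        """Navigate from an abstract concept down to its concrete entities."""
        return {e for e, c in self.cross_links if c == concept}

    def concepts_of(self, entity):
        """Navigate from a concrete entity up to its concepts."""
        return {c for e, c in self.cross_links if e == entity}

# Hypothetical example data.
kg = TwoViewKG(
    entities={"alice", "example_institute"},
    concepts={"Person", "Organization"},
    relations={"works_at"},
    meta_relations={"is_employed_by"},
    instance_triples=[("alice", "works_at", "example_institute")],
    ontology_triples=[("Person", "is_employed_by", "Organization")],
    cross_links=[("alice", "Person"), ("example_institute", "Organization")],
)
kg.validate()
```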

&lt;p&gt;Yeom et al. (2024) further note that a large number of knowledge graphs are essentially Two-View KGs: abstract classes in the ontology view form tree-like hierarchical structures through class inheritance, while concrete entities in the instance view are instantiated from ontological classes. This structured dual-layer representation holds significant value for Graph-based RAG tasks that must simultaneously leverage abstract concept reasoning and concrete fact verification.&lt;/p&gt;

&lt;hr /&gt;

&lt;h4 id=&quot;322-g-retrieval-graph-guided-retrieval&quot;&gt;3.2.2 G-Retrieval: Graph-Guided Retrieval&lt;/h4&gt;

&lt;h5 id=&quot;pre-retrieval-query-optimization&quot;&gt;Pre-retrieval: Query Optimization&lt;/h5&gt;

&lt;p&gt;Before formal retrieval, the system can optimize queries through various means to improve retrieval effectiveness [Gao et al., 2024; Huang et al., 2026]:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Category&lt;/th&gt;
      &lt;th&gt;Sub-category&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Query Expansion&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Multi-Query&lt;/td&gt;
      &lt;td&gt;Generates multiple related queries to expand retrieval coverage&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt;Sub-Query&lt;/td&gt;
      &lt;td&gt;Decomposes complex queries into sub-queries&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt;CoVe (Chain-of-Verification)&lt;/td&gt;
      &lt;td&gt;Has an LLM verify the expanded queries before retrieval, so that only validated, more reliable expansions are used&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Query Transformation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Query Rewrite&lt;/td&gt;
      &lt;td&gt;Rewrites queries to improve retrieval effectiveness&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt; &lt;/td&gt;
      &lt;td&gt;Query Routing&lt;/td&gt;
      &lt;td&gt;Routes queries to different processing paths based on query characteristics (via metadata filtering or semantic similarity)&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Additionally, &lt;strong&gt;RRR (Rewrite-Retrieve-Read)&lt;/strong&gt; uses dedicated small language models for query rewriting, with some methods further introducing external adapters to assist retriever-generator alignment [Gao et al., 2024]. Query optimization essentially clarifies and expands information needs before retrieval, reducing the semantic gap between queries and the index.&lt;/p&gt;

&lt;h5 id=&quot;retriever-types&quot;&gt;Retriever Types&lt;/h5&gt;

&lt;p&gt;By degree of parameterization, retrievers can be classified into three categories [Peng et al., 2024]:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Non-parametric Retriever&lt;/strong&gt;: Relies on heuristic rules or classical graph search algorithms rather than learned models; high retrieval efficiency, no training required&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;LM-based Retriever&lt;/strong&gt;: E.g., fine-tuned RoBERTa models, balancing semantic understanding with efficiency&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;GNN-based Retriever&lt;/strong&gt;: Leverages graph neural networks to capture structural information, but with higher computational cost&lt;/li&gt;
&lt;/ul&gt;

&lt;h5 id=&quot;retrieval-techniques&quot;&gt;Retrieval Techniques&lt;/h5&gt;

&lt;p&gt;By the mode of interaction between queries and graph data, retrieval techniques can be classified into three categories [Zhang et al., 2025a]:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Type&lt;/th&gt;
      &lt;th&gt;Core Idea&lt;/th&gt;
      &lt;th&gt;Representative Methods&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Semantic Similarity-based&lt;/td&gt;
      &lt;td&gt;Models similarity in discrete space (substring matching, regex) or embedding space (TF-IDF, Word2Vec)&lt;/td&gt;
      &lt;td&gt;Substring matching, TF-IDF, Word2Vec&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Logical Reasoning-based&lt;/td&gt;
      &lt;td&gt;Uses rule mining, inductive logic programming, constraint satisfaction to reveal implicit insights&lt;/td&gt;
      &lt;td&gt;Rule Mining, ILP, Constraint Satisfaction&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;GNN-based&lt;/td&gt;
      &lt;td&gt;Uses graph neural networks for graph modeling and mining&lt;/td&gt;
      &lt;td&gt;GCN, GAT&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Semantic similarity methods are simple to implement, but they cannot fully exploit graph structure information, so the inherent advantages of graph databases remain largely underused [Zhang et al., 2025a].&lt;/p&gt;
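
&lt;p&gt;To make the contrast between discrete-space and embedding-space matching concrete, here is a minimal pure-Python sketch. The smoothed IDF formula and all function names are assumptions chosen for illustration, not something prescribed by the cited survey.&lt;/p&gt;

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict) per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF (an assumed variant); any monotone IDF would do here.
    idf = {term: math.log((1 + n) / (1 + count)) + 1.0
           for term, count in df.items()}
    vectors = [{t: c * idf[t] for t, c in Counter(toks).items()}
               for toks in tokenized]
    return vectors, idf


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = (math.sqrt(sum(w * w for w in a.values()))
            * math.sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm else 0.0


def substring_match(query, docs):
    """Discrete-space matching: indices of documents containing the query verbatim."""
    return [i for i, d in enumerate(docs) if query.lower() in d.lower()]


def tfidf_search(query, docs):
    """Embedding-space matching: index of the document most similar to the query."""
    vectors, idf = tfidf_vectors(docs)
    q_vec = {t: c * idf.get(t, 0.0) for t, c in Counter(tokenize(query)).items()}
    scores = [cosine(q_vec, v) for v in vectors]
    return max(range(len(docs)), key=lambda i: scores[i])
```

&lt;p&gt;Neither function looks at edges between entities, which is exactly the limitation noted above: both treat each node description as isolated text.&lt;/p&gt;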

&lt;h5 id=&quot;retrieval-paradigms-and-granularity&quot;&gt;Retrieval Paradigms and Granularity&lt;/h5&gt;

&lt;p&gt;Retrieval paradigms include &lt;strong&gt;Once Retrieval&lt;/strong&gt;, &lt;strong&gt;Iterative Retrieval&lt;/strong&gt; (with non-adaptive and adaptive variants), and &lt;strong&gt;Multi-Stage Retrieval&lt;/strong&gt; [Huang et al., 2026]. Retrieval granularity spans four levels: Nodes, Triplets, Paths, and Subgraphs. The goal of granularity optimization is to balance relevance with efficiency: coarse-grained units provide richer context but may introduce redundancy and noise; fine-grained units are more semantically focused but may lack completeness and increase retrieval burden [Zhang et al., 2025a].&lt;/p&gt;

&lt;h5 id=&quot;retrieval-augmentation&quot;&gt;Retrieval Augmentation&lt;/h5&gt;

&lt;p&gt;To enhance retrieval, the system can perform &lt;strong&gt;Query Enhancement&lt;/strong&gt; (query expansion, query decomposition) before retrieval or &lt;strong&gt;Knowledge Enhancement&lt;/strong&gt; (knowledge merging, knowledge pruning) after retrieval.&lt;/p&gt;

&lt;h5 id=&quot;post-retrieval-re-ranking-and-context-compression&quot;&gt;Post-retrieval: Re-ranking and Context Compression&lt;/h5&gt;

&lt;p&gt;After retrieval and before generation, the retrieval results typically require processing to improve final generation quality [Gao et al., 2024]:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Re-ranking&lt;/strong&gt;: Fine-grained relevance scoring of the retrieved candidates, so that the most relevant ones are placed first&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Context Compression&lt;/strong&gt;: Removing redundant information and compressing retrieval results to a context length suitable for LLM processing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Redundant information disrupts LLM generation quality, while excessively long contexts may cause the LLM to exhibit the &lt;strong&gt;“Lost in the Middle”&lt;/strong&gt; problem — difficulty effectively utilizing information in the middle portions of long contexts [Gao et al., 2024].&lt;/p&gt;
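
&lt;p&gt;The two post-retrieval steps can be sketched as follows. The word-overlap scorer and the word-count budget are simplifying assumptions; production systems typically use a learned re-ranker and a token-based budget.&lt;/p&gt;

```python
def rerank(query, passages):
    """Order passages by a toy relevance score: share of query words they contain."""
    q_words = set(query.lower().split())

    def score(passage):
        overlap = q_words.intersection(passage.lower().split())
        return len(overlap) / max(len(q_words), 1)

    # sorted() is stable, so equally scored passages keep their original order.
    return sorted(passages, key=score, reverse=True)


def compress(passages, max_words):
    """Drop exact duplicates, then keep passages in order while the budget allows."""
    kept, seen, used = [], set(), 0
    for passage in passages:
        n_words = len(passage.split())
        if passage in seen or used + n_words > max_words:
            continue
        kept.append(passage)
        seen.add(passage)
        used += n_words
    return kept
```

&lt;p&gt;Calling &lt;code&gt;compress(rerank(query, passages), max_words)&lt;/code&gt; yields a shorter, de-duplicated context with the highest-scoring passages first.&lt;/p&gt;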

&lt;hr /&gt;

&lt;h4 id=&quot;323-g-generation-graph-enhanced-generation-and-knowledge-integration&quot;&gt;3.2.3 G-Generation: Graph-enhanced Generation and Knowledge Integration&lt;/h4&gt;

&lt;h5 id=&quot;knowledge-integration&quot;&gt;Knowledge Integration&lt;/h5&gt;

&lt;p&gt;Retrieved graph data must be transformed into natural language responses through appropriate generation strategies. &lt;strong&gt;G-Integration&lt;/strong&gt; (knowledge integration) is the process of effectively fusing retrieval results with LLM generation capabilities. Zhang et al. (2025a) summarize the conventional RAG pipeline as comprising three core components: knowledge organization, knowledge retrieval, and knowledge integration. Knowledge integration occupies the pre-generation stage, and its quality directly impacts the accuracy and coherence of the final answer.&lt;/p&gt;

&lt;h5 id=&quot;generation-paradigms&quot;&gt;Generation Paradigms&lt;/h5&gt;

&lt;p&gt;Graph-enhanced generation can be categorized into three types [Peng et al., 2024]:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Type&lt;/th&gt;
      &lt;th&gt;Paradigm&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;GNNs&lt;/td&gt;
      &lt;td&gt;—&lt;/td&gt;
      &lt;td&gt;First processes graph data with GNNs, encapsulating structural and relational information for LM comprehension&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;LMs&lt;/td&gt;
      &lt;td&gt;—&lt;/td&gt;
      &lt;td&gt;Language models generate the final text response&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Hybrid&lt;/td&gt;
      &lt;td&gt;Cascaded Paradigm&lt;/td&gt;
      &lt;td&gt;Models sequentially process different aspects of the data&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Hybrid&lt;/td&gt;
      &lt;td&gt;Parallel Paradigm&lt;/td&gt;
      &lt;td&gt;Models simultaneously receive inputs, process collaboratively; outputs merged via rules or another model&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h5 id=&quot;generation-control-context-aware-and-grounded-constraints&quot;&gt;Generation Control: Context-Aware and Grounded Constraints&lt;/h5&gt;

&lt;p&gt;During generation, the system can actively control output quality through reasoning capabilities [Li et al., 2025]:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Control Direction&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
      &lt;th&gt;Representative Methods&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Context-Aware Generation&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Selectively utilizes context, avoids interference from irrelevant information&lt;/td&gt;
      &lt;td&gt;Open-RAG, RARE, Self-Reasoning&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Grounded Generation Control&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Fact verification, citation generation, ensures output fidelity to retrieval evidence&lt;/td&gt;
      &lt;td&gt;RARR, TRACE, AlignRAG&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h5 id=&quot;post-generation-iterative-refinement-and-verification&quot;&gt;Post-generation: Iterative Refinement and Verification&lt;/h5&gt;

&lt;p&gt;Generation is not the endpoint. Through Test-Time Scaling, the system can further refine output quality after generation [Zhang et al., 2025a]:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Iterative Self-Refinement&lt;/strong&gt;: Multiple rounds of self-improvement, progressively correcting errors in generated content [Madaan et al., 2023]&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Self-Consistency Decoding&lt;/strong&gt;: Consistency verification across multiple decoding paths to select the most reliable answer [Hao et al., 2023]&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These methods transform generation from a one-shot output into an iteratively optimizable process, which is particularly important in complex reasoning tasks.&lt;/p&gt;
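
&lt;p&gt;Self-consistency decoding reduces, in its simplest form, to a majority vote over independently sampled answers. The sketch below assumes the samples have already been produced (for example by decoding the same prompt several times at non-zero temperature); the function name and the normalization rule are illustrative choices.&lt;/p&gt;

```python
from collections import Counter


def self_consistent_answer(samples):
    """Majority vote over final answers from independently sampled reasoning paths."""
    normalized = [s.strip().lower() for s in samples if s.strip()]
    if not normalized:
        return None, 0.0
    answer, votes = Counter(normalized).most_common(1)[0]
    # The agreement ratio can serve as a rough confidence signal.
    return answer, votes / len(normalized)
```

&lt;p&gt;A low agreement ratio is a natural trigger for another refinement or retrieval round.&lt;/p&gt;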

&lt;h3 id=&quot;33-applicability-of-graph-based-rag-when-are-graph-structures-needed&quot;&gt;3.3 Applicability of Graph-based RAG: When Are Graph Structures Needed?&lt;/h3&gt;

&lt;p&gt;Not all tasks require introducing graph structures. Xiang et al. (2026) systematically compared vanilla RAG and Graph-based RAG across different task complexity levels through GraphRAG-Bench and report a series of key observations.&lt;/p&gt;

&lt;h4 id=&quot;331-task-complexity-and-the-graph-based-rag-advantage-threshold&quot;&gt;3.3.1 Task Complexity and the Graph-based RAG Advantage Threshold&lt;/h4&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Observation&lt;/th&gt;
      &lt;th&gt;Core Conclusion&lt;/th&gt;
      &lt;th&gt;Task Type&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Obs.1&lt;/td&gt;
      &lt;td&gt;Vanilla RAG and Graph-based RAG perform comparably on simple factual retrieval tasks&lt;/td&gt;
      &lt;td&gt;Simple factual retrieval&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Obs.2&lt;/td&gt;
      &lt;td&gt;Graph-based RAG excels in complex tasks&lt;/td&gt;
      &lt;td&gt;Complex reasoning&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Obs.3&lt;/td&gt;
      &lt;td&gt;Graph-based RAG ensures higher factual reliability in creative tasks&lt;/td&gt;
      &lt;td&gt;Creative generation&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Obs.4&lt;/td&gt;
      &lt;td&gt;Vanilla RAG is adept at extracting discrete facts for simple questions that do not require complex logic&lt;/td&gt;
      &lt;td&gt;Simple QA&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Obs.5&lt;/td&gt;
      &lt;td&gt;As questions grow increasingly complex, the advantage of Graph-based RAG becomes evident&lt;/td&gt;
      &lt;td&gt;Increasing complexity&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Vanilla RAG performs well in scenarios requiring rapid access to discrete information, while Graph-based RAG excels at tasks requiring nuanced understanding of interconnected data [Xiang et al., 2026].&lt;/p&gt;

&lt;p&gt;In terms of retrieval performance, a trade-off exists: &lt;strong&gt;Global Graph-based RAG&lt;/strong&gt; achieves superior Evidence Recall (83.1%), accessing more relevant information; whereas &lt;strong&gt;vanilla RAG&lt;/strong&gt; achieves superior Context Relevance (78.8%), with more focused retrieval results and less redundancy. This indicates that while Graph-based RAG retrieves broader information, its retrieval method inevitably introduces some redundancy [Xiang et al., 2026].&lt;/p&gt;

&lt;p&gt;Additionally, different Graph-based RAG implementations produce index graphs with significantly different structures. Compared to vanilla RAG, Graph-based RAG also substantially increases prompt length, and prompt length rises further as task complexity increases.&lt;/p&gt;

&lt;h4 id=&quot;332-practical-implications&quot;&gt;3.3.2 Practical Implications&lt;/h4&gt;

&lt;p&gt;Based on the above empirical findings, four scenario-specific recommendations can be summarized:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;Simple Factual Retrieval&lt;/strong&gt;: Vanilla RAG is sufficient; there is no need to incur the additional overhead of Graph-based RAG.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Complex Reasoning / Multi-hop Queries&lt;/strong&gt;: Graph-based RAG offers clear advantages; graph structures can explicitly model cross-document associations.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Creative Generation&lt;/strong&gt;: Graph-based RAG provides higher factual reliability, but attention must be paid to the trade-off between retrieval redundancy and context relevance.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Efficiency-Sensitive Scenarios&lt;/strong&gt;: Prompt inflation in Graph-based RAG is a significant constraint, particularly in long-context or token-constrained environments.&lt;/li&gt;
&lt;/ol&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;4-agentic-search&quot;&gt;4. Agentic Search&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Core Paradigm Shift: From Fixed Pipelines to Autonomous Multi-turn RAG&lt;/strong&gt;&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt; &lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;&lt;strong&gt;Traditional RAG&lt;/strong&gt;&lt;/th&gt;
      &lt;th style=&quot;text-align: center&quot;&gt;&lt;strong&gt;Agentic Search&lt;/strong&gt;&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Retrieval Turns&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;Single turn&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;Multi-turn&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Retrieval Timing&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;Fixed (one retrieval before generation)&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;Dynamic (triggered on-demand during reasoning)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Query Construction&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;Raw query used directly&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;Agent dynamically constructs based on intermediate reasoning&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Decision-Making Entity&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;Predefined workflow&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;LLM autonomous decision-making&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Typical Paradigm&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;Retrieve → Generate&lt;/td&gt;
      &lt;td style=&quot;text-align: center&quot;&gt;Reason ⟷ Retrieve ⟷ Reason ⟷ … → Generate&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h3 id=&quot;41-definition-and-core-components&quot;&gt;4.1 Definition and Core Components&lt;/h3&gt;

&lt;p&gt;The core characteristics of Agentic Search include: autonomous reasoning — the agent dynamically plans and adjusts its approach based on intermediate results rather than following preset patterns; on-demand retrieval — dynamically triggered based on uncertainty or information needs during the reasoning process; and iterative synthesis — retrieved information recursively refines reasoning, forming a feedback loop where reasoning and retrieval mutually reinforce each other.&lt;/p&gt;

&lt;p&gt;The system comprises four core components:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Component&lt;/th&gt;
      &lt;th&gt;Function&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;LLM&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Core reasoning engine, providing role definition and task understanding capabilities&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Memory&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Short-term memory maintains current reasoning context; long-term memory stores cross-session knowledge and preferences&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Planning&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Dynamically plans task step sequences through Reflection and Self-Critique&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Tools&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;&lt;strong&gt;Retrieval backends&lt;/strong&gt; (Dense RAG, GraphRAG, web search, etc.) form core infrastructure; additionally includes external capabilities such as API calls&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Planning is the core differentiating component of Agentic Search. While traditional RAG follows a fixed “retrieve-then-generate” pattern, an agent with planning capability can autonomously decompose complex problems, prioritize information needs, and adjust its strategy based on feedback during execution, transforming retrieval from passive data supply into an active reasoning resource.&lt;/p&gt;

&lt;h3 id=&quot;42-workflow-patterns-from-linear-reasoning-to-graph-structured-exploration&quot;&gt;4.2 Workflow Patterns: From Linear Reasoning to Graph-structured Exploration&lt;/h3&gt;

&lt;p&gt;The workflow design of Agentic Search determines how the system handles complex queries. From the perspective of macro control logic, Singh et al. (2026) summarize five general patterns: Prompt Chaining (improving accuracy through sequential processing), Routing (dispatching inputs to different processing strategies based on their characteristics), Parallelization (processing independent sub-tasks in parallel), Orchestrator-Workers (an orchestrator dynamically assigning sub-tasks to worker agents), and Evaluator-Optimizer (iteratively evaluating and optimizing outputs).&lt;/p&gt;

&lt;p&gt;Li et al. (2025) take the perspective of reasoning structure instead, which yields a categorization of greater theoretical significance:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Workflow Type&lt;/th&gt;
      &lt;th&gt;Structural Characteristics&lt;/th&gt;
      &lt;th&gt;Representative Methods&lt;/th&gt;
      &lt;th&gt;Advantages&lt;/th&gt;
      &lt;th&gt;Limitations&lt;/th&gt;
      &lt;th&gt;Applicable Scenarios&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Chain-based&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Linear sequence, one retrieval per reasoning step&lt;/td&gt;
      &lt;td&gt;IRCoT, RAT, CoV-RAG, RAFT&lt;/td&gt;
      &lt;td&gt;Low latency, low token cost, easy caching&lt;/td&gt;
      &lt;td&gt;Error propagation, rapid context growth&lt;/td&gt;
      &lt;td&gt;Single-hop or short multi-hop QA&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Tree-based (ToT)&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Parallel exploration of multiple branches to hedge early errors&lt;/td&gt;
      &lt;td&gt;RATT, Tree of Clarifications, AirRAG&lt;/td&gt;
      &lt;td&gt;High recall, transparent hypothesis analysis&lt;/td&gt;
      &lt;td&gt;Quadratic cost, multiple retrieval calls&lt;/td&gt;
      &lt;td&gt;Ambiguous or multi-path tasks&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Tree-based (MCTS)&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Budget-aware exploration, focusing on promising branches&lt;/td&gt;
      &lt;td&gt;MCTS-RAG, SeRTS&lt;/td&gt;
      &lt;td&gt;Graceful anytime stopping&lt;/td&gt;
      &lt;td&gt;Parameter-dependent, may converge to suboptima&lt;/td&gt;
      &lt;td&gt;Deep search under strict budgets&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Graph-based (Walk-on-Graph)&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Efficient walks on explicit KG/document graphs&lt;/td&gt;
      &lt;td&gt;QA-GNN, LightRAG&lt;/td&gt;
      &lt;td&gt;Efficient on KGs, short paths&lt;/td&gt;
      &lt;td&gt;Requires high-quality KG, limited flexibility&lt;/td&gt;
      &lt;td&gt;Domain QA with existing KGs&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Graph-based (Think-on-Graph)&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;LLM dynamically updates evidence graph, adaptive and verifiable&lt;/td&gt;
      &lt;td&gt;ToG, ToG-2.0, Graph-CoT&lt;/td&gt;
      &lt;td&gt;Node-level citation checking, high accuracy&lt;/td&gt;
      &lt;td&gt;High latency, search space explosion risk&lt;/td&gt;
      &lt;td&gt;Open-domain deep research&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Chain-based methods, exemplified by Chain-of-Thought (CoT), structure reasoning into linear sequences of intermediate steps. However, relying solely on LLM parametric knowledge readily leads to error propagation, where small deviations at each step are amplified in subsequent steps. Tree-based methods hedge early errors through parallel exploration of multiple branches; Tree-of-Thought (ToT) allows multiple hypotheses to coexist and be evaluated simultaneously, while Monte Carlo Tree Search (MCTS) focuses exploration on the most promising branches through budget-aware strategies.&lt;/p&gt;

&lt;p&gt;Graph-based methods represent deeper reasoning-retrieval coupling. Walk-on-Graph methods primarily rely on graph learning techniques for retrieval and reasoning, including GNNs (leveraging graph neural networks for graph modeling and retrieval reasoning) and lightweight graph techniques (vector indexing, PageRank, and other link-structure-based ranking methods). Think-on-Graph methods embed graph structures directly into the LLM reasoning loop: the LLM acts as an agent on the graph, dynamically deciding which connected entity or relation to explore next and progressively constructing a path to the answer. The main advantages of this approach are node-level citation checking and higher accuracy, at the cost of higher latency and a search space that may grow explosively.&lt;/p&gt;

&lt;h3 id=&quot;43-agent-orchestration-and-training-paradigms&quot;&gt;4.3 Agent Orchestration and Training Paradigms&lt;/h3&gt;

&lt;h4 id=&quot;431-system-architecture-taxonomy&quot;&gt;4.3.1 System Architecture Taxonomy&lt;/h4&gt;

&lt;p&gt;Based on the comprehensive classification of Singh et al. (2026) and Li et al. (2025), Agentic Search systems can be categorized by architectural complexity into multiple levels:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Type&lt;/th&gt;
      &lt;th&gt;Core Characteristics&lt;/th&gt;
      &lt;th&gt;Representative Methods&lt;/th&gt;
      &lt;th&gt;Applicable Scenarios&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Single-Agent (Prompt-only)&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;ReAct loop, simple implementation&lt;/td&gt;
      &lt;td&gt;ReAct, Search-o1&lt;/td&gt;
      &lt;td&gt;Prototype demonstrations, simple queries&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Single-Agent (SFT/RL)&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Fine-tuning or reinforcement learning enhances retrieval and reasoning capabilities&lt;/td&gt;
      &lt;td&gt;Toolformer, Search-R1&lt;/td&gt;
      &lt;td&gt;Production systems, open-domain research&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Multi-Agent (Decentralized)&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Parallel expert collaboration, high recall&lt;/td&gt;
      &lt;td&gt;M-RAG, MDocAgent&lt;/td&gt;
      &lt;td&gt;Large-scale evidence aggregation across heterogeneous sources&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Multi-Agent (Centralized)&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Hierarchical manager coordinates sub-tasks&lt;/td&gt;
      &lt;td&gt;Chain of Agents&lt;/td&gt;
      &lt;td&gt;Complex tasks under strict budgets&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Hierarchical&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Strategic decision-making → delegation → aggregation&lt;/td&gt;
      &lt;td&gt;—&lt;/td&gt;
      &lt;td&gt;Scenarios requiring multi-level task decomposition&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Adaptive&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Dynamically selects strategies based on query complexity&lt;/td&gt;
      &lt;td&gt;—&lt;/td&gt;
      &lt;td&gt;Systems with diverse query types&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Corrective and Adaptive are two behavioral enhancement modes: Corrective introduces self-correction mechanisms to improve document utilization; Adaptive uses a classifier to assess query complexity and dynamically switches between single-step, multi-step, or skip-retrieval modes.&lt;/p&gt;
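
&lt;p&gt;The Adaptive mode can be pictured as a small router in front of the retrieval pipeline. In the sketch below a keyword heuristic stands in for the trained complexity classifier; the cue words, labels, and function names are hypothetical.&lt;/p&gt;

```python
def classify_complexity(query):
    """Hypothetical heuristic standing in for a trained complexity classifier."""
    words = query.lower().split()
    multi_hop_cues = {"and", "then", "compare", "both", "before", "after"}
    if len(words) == 0 or words[0] in {"hi", "hello", "thanks"}:
        return "none"
    if multi_hop_cues.intersection(words):
        return "multi"
    return "single"


def route(query):
    """Map the predicted complexity to one of the three retrieval modes."""
    modes = {
        "none": "skip-retrieval",
        "single": "single-step",
        "multi": "multi-step",
    }
    return modes[classify_complexity(query)]
```

&lt;p&gt;Swapping the heuristic for a learned classifier leaves &lt;code&gt;route&lt;/code&gt; unchanged, which is the appeal of keeping the complexity decision separate from the retrieval modes.&lt;/p&gt;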

&lt;h4 id=&quot;432-evolution-of-training-paradigms&quot;&gt;4.3.2 Evolution of Training Paradigms&lt;/h4&gt;

&lt;p&gt;Agent capability acquisition has undergone a three-stage evolution:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Paradigm&lt;/th&gt;
      &lt;th&gt;Core Mechanism&lt;/th&gt;
      &lt;th&gt;Advantages&lt;/th&gt;
      &lt;th&gt;Limitations&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Prompt-based&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Prompt engineering defines retrieval and reasoning behavior&lt;/td&gt;
      &lt;td&gt;Simple, no training required&lt;/td&gt;
      &lt;td&gt;Constrained by fixed instruction patterns&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;SFT&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Fine-tuning on reasoning-retrieval joint data&lt;/td&gt;
      &lt;td&gt;Higher precision than prompting&lt;/td&gt;
      &lt;td&gt;Requires large amounts of synthetic data, prone to overfitting&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;RL&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Reward functions incentivize strategy discovery and adaptive optimization&lt;/td&gt;
      &lt;td&gt;Genuine agentic behavior&lt;/td&gt;
      &lt;td&gt;Difficult to define reward signals, expensive training&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;The fundamental distinction: Prompt-based and SFT rely on offline supervision and fixed patterns; RL-trained agents are incentivized to autonomously discover search strategies rather than being told how to search.&lt;/p&gt;

&lt;h3 id=&quot;44-are-graph-structures-necessary-for-agentic-search--ragsearch-empirical-evidence&quot;&gt;4.4 Are Graph Structures Necessary for Agentic Search? — RAGSearch Empirical Evidence&lt;/h3&gt;

&lt;p&gt;The core question posed by Fan et al. (2026) addresses a key debate in Agentic Search: &lt;strong&gt;Do we still need GraphRAG?&lt;/strong&gt; — that is, can Agentic Search compensate for the absence of explicit graph structures through dynamic multi-turn retrieval and reasoning, thereby reducing dependence on high-cost GraphRAG?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Core Conclusion:&lt;/strong&gt; Agentic search can partially compensate for missing structural information in dense RAG through iterative retrieval, but explicit graph retrieval remains essential for robust multi-hop reasoning. GraphRAG consistently provides stronger performance and greater stability in complex settings, while dense RAG, with its lower construction cost, remains a practical choice for general-purpose QA.&lt;/p&gt;

&lt;p&gt;The following analysis supports this conclusion across three dimensions: the formal framework, eight empirical findings, and system case studies.&lt;/p&gt;

&lt;h4 id=&quot;441-formal-framework&quot;&gt;4.4.1 Formal Framework&lt;/h4&gt;

&lt;p&gt;RAGSearch formalizes Agentic Search as follows: given a query $q$, an LLM-equipped agent interacts with a retrieval backend $B$ (dense RAG or GraphRAG) over multiple turns. At each step, the agent decides whether to trigger retrieval or generate an answer based on reasoning history, with retrieved information appended to the reasoning sequence. Its core characteristics are: retrieval is executed dynamically rather than as one-time preprocessing; the same control logic can operate across different backends.&lt;/p&gt;
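
&lt;p&gt;The formalization above maps onto a short control loop. The sketch below is one possible reading of it, not RAGSearch's actual code: &lt;code&gt;policy&lt;/code&gt; stands in for the LLM agent, and the two backends are placeholder functions that only show how the same loop can run over either $B$.&lt;/p&gt;

```python
def agentic_search(query, backend, policy, max_turns=4):
    """Alternate reasoning and retrieval until the policy answers or turns run out."""
    history = [("query", query)]
    for _ in range(max_turns):
        action, payload = policy(history)
        if action == "answer":
            return payload, history
        # Otherwise the action is "search": payload is the search string
        # the agent built from its reasoning history.
        history.append(("search", payload))
        history.append(("evidence", backend(payload)))
    return None, history


# Two interchangeable toy backends B; both are placeholders, not real systems.
def dense_backend(q):
    return "chunk mentioning " + q


def graph_backend(q):
    return "triple (" + q + ", related_to, ?)"


def toy_policy(history):
    """Hypothetical policy: search once, then answer with the evidence found."""
    evidence = [item for kind, item in history if kind == "evidence"]
    if evidence:
        return "answer", "based on: " + evidence[-1]
    return "search", history[0][1]
```

&lt;p&gt;Because the backend is just a callable, &lt;code&gt;agentic_search(q, dense_backend, toy_policy)&lt;/code&gt; and &lt;code&gt;agentic_search(q, graph_backend, toy_policy)&lt;/code&gt; share the same control logic and differ only in the evidence that is appended to the history.&lt;/p&gt;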

&lt;p&gt;&lt;strong&gt;Two Implementation Paths:&lt;/strong&gt;&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Path&lt;/th&gt;
      &lt;th&gt;Mechanism&lt;/th&gt;
      &lt;th&gt;Representative Methods&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Training-Free&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Reasoning-driven on-demand search or orchestrated multi-agent workflows&lt;/td&gt;
      &lt;td&gt;Search-o1, GraphSearch&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;RL-Based&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;GRPO training with outcome-based and format-based rewards&lt;/td&gt;
      &lt;td&gt;Search-R1, Graph-R1&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
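
&lt;p&gt;For the RL-based path, the following sketch shows how an outcome reward and a format reward might be combined and then turned into group-relative advantages in the style of GRPO. The &lt;code&gt;Answer:&lt;/code&gt; prefix, the 0.2 format weight, and the use of population standard deviation are assumptions made for illustration; the actual reward design of Search-R1 or Graph-R1 may differ.&lt;/p&gt;

```python
import statistics


def reward(response, gold, format_weight=0.2):
    """Outcome reward (exact match) plus a small format reward (assumed weights)."""
    has_format = response.startswith("Answer:")
    if has_format:
        predicted = response[len("Answer:"):].strip()
    else:
        predicted = response.strip()
    outcome = 1.0 if predicted.lower() == gold.lower() else 0.0
    return outcome + (format_weight if has_format else 0.0)


def group_relative_advantages(rewards):
    """GRPO-style advantage: standardize each reward within its sampled group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    if std == 0:
        # All samples scored the same, so no sample is preferred over another.
        return [0.0 for _ in rewards]
    return [(r - mean) / std for r in rewards]
```

&lt;p&gt;Responses that are both correct and well-formatted end up with the largest positive advantage within their group, so the policy update favors them without needing a separate value model.&lt;/p&gt;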

&lt;h4 id=&quot;442-eight-core-findings&quot;&gt;4.4.2 Eight Core Findings&lt;/h4&gt;

&lt;p&gt;RAGSearch revealed the relationship between Agentic Search and retrieval backends through systematic experiments:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Finding&lt;/th&gt;
      &lt;th&gt;Conclusion&lt;/th&gt;
      &lt;th&gt;Practical Implication&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Obs.1&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;In single-pass (non-agentic) inference, dense RAG is effective for general QA; GraphRAG primarily provides decisive improvements in multi-hop QA&lt;/td&gt;
      &lt;td&gt;Task complexity determines GraphRAG necessity&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Obs.2&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Agentic search can enhance dense RAG and partially close the gap with GraphRAG, but effectiveness depends on agent design&lt;/td&gt;
      &lt;td&gt;Structured agentic design is key, not simply increasing interaction turns&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Obs.3&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;RL-based training generally improves performance, but well-designed training-free pipelines remain competitive&lt;/td&gt;
      &lt;td&gt;Training cost and performance require trade-off&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Obs.4&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;In training-free workflows, explicit graph structures provide consistent and significant benefits for multi-hop QA&lt;/td&gt;
      &lt;td&gt;The robust advantage of graph structures in zero-training settings cannot be overlooked&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Obs.5&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;RL-based agentic performance is highly backend-dependent: graph retrievers gain larger improvements on multi-hop QA&lt;/td&gt;
      &lt;td&gt;RL + GraphRAG exhibits synergistic effects&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Obs.6&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;GraphRAG is more robust and stable than dense RAG in agentic search&lt;/td&gt;
      &lt;td&gt;Explicit structures reduce agentic control uncertainty&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Obs.7&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;GRPO is a favorable training paradigm for RL-based agentic systems&lt;/td&gt;
      &lt;td&gt;Validates GRPO’s effectiveness in retrieval augmentation&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Obs.8&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Larger backbones not only improve reasoning performance but also narrow the performance gap between GraphRAG and dense RAG&lt;/td&gt;
      &lt;td&gt;Larger models may reduce the marginal benefit of graph structures&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;h4 id=&quot;443-system-case-studies&quot;&gt;4.4.3 System Case Studies&lt;/h4&gt;

&lt;p&gt;This conclusion is corroborated in concrete system designs. Agent-G dynamically assigns retrieval tasks to specialized agents, simultaneously leveraging both graph knowledge bases and text documents. GeAR enhances conventional retrievers through graph expansion and introduces an agent framework for managing graph-structured data retrieval tasks. These systems demonstrate the deep synergy between graph structures and Agentic Search: explicit graph structures not only provide higher-quality retrieval results but also offer interpretable knowledge relation paths for agent decision-making, reducing the uncertainty of agentic control.&lt;/p&gt;

&lt;h3 id=&quot;45-open-challenges-and-future-directions&quot;&gt;4.5 Open Challenges and Future Directions&lt;/h3&gt;

&lt;h4 id=&quot;451-core-challenges&quot;&gt;4.5.1 Core Challenges&lt;/h4&gt;

&lt;p&gt;Agentic Search should not be viewed as a universal replacement for traditional RAG. While it provides superior adaptability and multi-step reasoning capabilities, it also introduces coordination complexity, latency, and computational costs. Core challenges include:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;Evaluation Difficulty&lt;/strong&gt; — Output-level metrics are insufficient to measure Agentic system quality; multi-dimensional evaluation of reasoning trajectories, planning depth, adaptability, robustness, and cost-effectiveness is needed.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Long-term Memory Design&lt;/strong&gt; — Knowledge drift, bias reinforcement, and frequent updates may amplify hallucination risks.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Coordination Complexity&lt;/strong&gt; — Multi-agent collaboration introduces communication and consensus overhead.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Computational Overhead&lt;/strong&gt; — Additional latency from agent reasoning is non-negligible in practical deployment.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Furthermore, a fundamental constraint is that agentic reasoning cannot compensate for persistently poor retrieval. Failures often stem from insufficient retrieval coverage, poorly constructed indices, or inadequate integration of structured and unstructured knowledge.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Optimal Application Domains&lt;/strong&gt;: Agentic Search gains the strongest benefits in domains with structured knowledge and explicit constraints. Healthcare, finance, and legal analysis particularly benefit from combining retrieval with rule-based reasoning and graph-structured knowledge.&lt;/p&gt;

&lt;h4 id=&quot;452-efficiency-and-latency-optimization&quot;&gt;4.5.2 Efficiency and Latency Optimization&lt;/h4&gt;

&lt;p&gt;While Synergized RAG-Reasoning systems excel in complex reasoning, their iterative retrieval and multi-step reasoning loops can add significant latency. Two optimization directions stand out: budget-aware query planning, which optimizes query strategies under strict API-call or token budgets, and memory-aware mechanisms, which cache prior evidence or belief states to reduce redundant access.&lt;/p&gt;
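
&lt;p&gt;As a rough illustration of budget-aware planning and memory-aware caching, the sketch below caps the number of retrieval calls and reuses cached evidence for repeated queries. It is a minimal sketch under stated assumptions, not a method from any cited system: the class name BudgetedRetriever, the search_fn callable, and the call-count budget are all hypothetical.&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass
class BudgetedRetriever:
    """Wraps a retrieval function with a call budget and an evidence cache."""
    search_fn: object            # hypothetical callable: query string to list of passages
    max_calls: int = 3           # strict cap on backend calls (assumed budget unit)
    calls_used: int = 0
    cache: dict = field(default_factory=dict)

    def retrieve(self, query):
        key = query.strip().lower()
        if key in self.cache:            # memory-aware: reuse prior evidence for free
            return self.cache[key]
        if self.calls_used == self.max_calls:
            return None                  # budget exhausted: caller works with what it has
        self.calls_used += 1
        passages = self.search_fn(query)
        self.cache[key] = passages
        return passages


def answer(sub_queries, retriever):
    """Collects evidence for each sub-query until the budget runs out."""
    evidence = []
    for q in sub_queries:
        passages = retriever.retrieve(q)
        if passages is None:
            break
        evidence.extend(passages)
    return evidence
```

&lt;p&gt;Counting backend calls is the simplest budget unit; a token budget would follow the same pattern with a different counter.&lt;/p&gt;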

&lt;h4 id=&quot;453-trustworthiness-and-adversarial-robustness&quot;&gt;4.5.3 Trustworthiness and Adversarial Robustness&lt;/h4&gt;

&lt;p&gt;Agentic Search systems remain vulnerable to adversarial attacks through poisoned or misleading external knowledge sources. Ensuring the trustworthiness of retrieved content is critical for reliable downstream reasoning. Systems need credibility verification mechanisms for retrieved content, particularly in high-stakes scientific research and legal analysis scenarios.&lt;/p&gt;

&lt;h4 id=&quot;454-structured-data-and-multi-agent-deep-research&quot;&gt;4.5.4 Structured Data and Multi-agent Deep Research&lt;/h4&gt;

&lt;p&gt;Key future development directions include iterative data organization and multi-agent deep research. Iterative data organization structures the intermediate search and reasoning content produced during agentic retrieval, helping agents stay coherent and relevant over long contexts. Multi-agent deep research uses graph structures to model task requirements, roles, and relationships, which supports complex task decomposition and makes task assignment and coordination more efficient.&lt;/p&gt;

&lt;hr /&gt;

&lt;h2 id=&quot;5-literature-retrieval-systems-in-scientific-research&quot;&gt;5. Literature Retrieval Systems in Scientific Research&lt;/h2&gt;

&lt;p&gt;The volume of scientific literature is growing exponentially: by a commonly cited bibliometric estimate, the number of scientific papers doubles roughly every 17 years. This growth has made traditional literature retrieval methods increasingly inadequate and has given rise to a new generation of AI-driven academic search platforms and research intelligence frameworks.&lt;/p&gt;

&lt;h3 id=&quot;51-academic-search-platforms-and-tools&quot;&gt;5.1 Academic Search Platforms and Tools&lt;/h3&gt;

&lt;p&gt;The current academic search ecosystem can be categorized by function into two types: &lt;strong&gt;search and synthesis platforms&lt;/strong&gt; (focused on literature discovery and content comprehension) and &lt;strong&gt;recommendation systems&lt;/strong&gt; (focused on personalized delivery and trend tracking).&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Platform&lt;/th&gt;
      &lt;th&gt;Core Functionality&lt;/th&gt;
      &lt;th&gt;Technical Characteristics&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Elicit&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Semantic search, paper summary extraction&lt;/td&gt;
      &lt;td&gt;AI-enhanced academic search&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Consensus&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Evidence synthesis, trend analysis&lt;/td&gt;
      &lt;td&gt;LLM-based scientific question answering&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;OpenScholar&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Large-scale academic literature retrieval&lt;/td&gt;
      &lt;td&gt;Semantic search + open access&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;SciSpace&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Paper summarization, multi-document information synthesis&lt;/td&gt;
      &lt;td&gt;Cross-document understanding and synthesis&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Connected Papers&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Visual literature graph exploration&lt;/td&gt;
      &lt;td&gt;Visualization based on bibliographic coupling and co-citation&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;ORKG ASK&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Structured knowledge access&lt;/td&gt;
      &lt;td&gt;KG-organized structured retrieval, more interpretable than conventional LLM QA&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Arxiv Sanity&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Paper recommendation&lt;/td&gt;
      &lt;td&gt;Personalized delivery based on ML and IR techniques&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Scholar Inbox&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Personalized academic information subscription&lt;/td&gt;
      &lt;td&gt;Interest-customized literature streams&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;ResearchTrend.ai&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Research trend discovery&lt;/td&gt;
      &lt;td&gt;Trend analysis and emerging direction identification&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Research Rabbit&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Visual literature exploration&lt;/td&gt;
      &lt;td&gt;Similarity-based literature network mapping&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
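
&lt;p&gt;The two relations named in the Connected Papers row are easy to confuse, so a toy computation may help. Bibliographic coupling links two papers that share references; co-citation links two papers that are cited together by a third. The citation data and function names below are invented for illustration and say nothing about how Connected Papers itself is implemented.&lt;/p&gt;

```python
from collections import Counter
from itertools import combinations

# Toy citation data (hypothetical): each paper maps to the set of papers it cites.
cites = {
    "A": {"X", "Y"},
    "B": {"X", "Y", "Z"},
    "C": {"A", "B"},
    "D": {"A", "B", "X"},
}


def bibliographic_coupling(cites):
    """Counts shared references for each pair of citing papers."""
    strength = {}
    for p, q in combinations(sorted(cites), 2):
        shared = len(cites[p].intersection(cites[q]))
        if shared:
            strength[(p, q)] = shared
    return strength


def co_citation(cites):
    """Counts how often each pair of papers appears together in one reference list."""
    counts = Counter()
    for refs in cites.values():
        for pair in combinations(sorted(refs), 2):
            counts[pair] += 1
    return dict(counts)
```

&lt;p&gt;In this toy data, A and B are coupled because both cite X and Y, and they are also co-cited because C and D each cite both of them.&lt;/p&gt;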

&lt;p&gt;Mainstream technical approaches for recommendation systems include &lt;strong&gt;content-based filtering&lt;/strong&gt;, &lt;strong&gt;collaborative filtering&lt;/strong&gt;, and &lt;strong&gt;hybrid approaches&lt;/strong&gt;. Notably, graph-structured systems such as ORKG ASK organize research contributions as structured data rather than unstructured text, offering unique advantages in interpretability.&lt;/p&gt;

&lt;h3 id=&quot;52-four-core-limitations-of-academic-search&quot;&gt;5.2 Four Core Limitations of Academic Search&lt;/h3&gt;

&lt;p&gt;Despite the important role these platforms play in literature discovery, academic search still faces systemic challenges:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Limitation&lt;/th&gt;
      &lt;th&gt;Description&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Data Quality and Coverage Gaps&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Incomplete, non-standard, or outdated data sources lead to inaccurate and inconsistent retrieval information&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Model Bias&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Search and ranking algorithms inherit biases from training data, affecting the visibility of certain research domains&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Scalability and Real-time Processing&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Efficiently processing large-scale datasets while maintaining low latency and high retrieval accuracy is challenging&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Matthew Effect Reinforcement&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Established researchers receive disproportionate attention; algorithms may exacerbate academic inequality&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Additionally, existing systems generally lack rigorous filtering options and advanced relevance ranking mechanisms. Many AI-assisted research tools rely on proprietary data, closed APIs, or evolving LLM backends, making strict reproducibility and long-term comparability difficult to ensure.&lt;/p&gt;

&lt;h3 id=&quot;53-research-ai-frameworks-and-evaluation-benchmarks&quot;&gt;5.3 Research AI Frameworks and Evaluation Benchmarks&lt;/h3&gt;

&lt;p&gt;To address the above challenges, researchers have proposed a series of AI frameworks targeting the full research pipeline, covering literature retrieval, survey generation, hypothesis generation, and experimental automation:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Framework&lt;/th&gt;
      &lt;th&gt;Core Functionality&lt;/th&gt;
      &lt;th&gt;Technical Characteristics&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;LitSearch&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Literature retrieval evaluation benchmark&lt;/td&gt;
      &lt;td&gt;Evaluates complex literature retrieval queries in ML and NLP domains&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;ResearchArena&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Academic survey LLM Agent evaluation&lt;/td&gt;
      &lt;td&gt;Three-stage: Information Discovery → Selection → Organization&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;SciLitLLM&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Scientific literature understanding enhancement&lt;/td&gt;
      &lt;td&gt;CPT + SFT hybrid strategy, domain knowledge injection&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;CiteME&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Citation attribution benchmark&lt;/td&gt;
      &lt;td&gt;Tests whether language models can identify the paper cited in a text excerpt&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;ResearchAgent&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;Research hypothesis generation&lt;/td&gt;
      &lt;td&gt;Multi-hop reasoning-assisted research ideation&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;&lt;strong&gt;Agent Laboratory&lt;/strong&gt;&lt;/td&gt;
      &lt;td&gt;End-to-end research automation&lt;/td&gt;
      &lt;td&gt;High success rates in data preparation, experimentation, and report writing; weak in literature review&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;The central tension facing current research AI frameworks is the &lt;strong&gt;trade-off between end-to-end automation capability and domain depth&lt;/strong&gt;. Systems such as Agent Laboratory perform excellently in data preparation, experiment execution, and report writing, but exhibit significant performance degradation during the literature review stage — precisely the phase requiring the most structured evaluation and domain expertise. While SciLitLLM and ResearchArena demonstrate promising results, they remain insufficient for tasks demanding deep domain knowledge and nuanced understanding. These limitations indicate that automated literature review remains a far-from-solved challenge, requiring better balance between structured evaluation, domain expertise, and reproducibility.&lt;/p&gt;

&lt;h2 id=&quot;6-references&quot;&gt;6. References&lt;/h2&gt;

&lt;p&gt;[1] Abou Ali, et al. (2025). Agentic AI: a comprehensive survey of architectures, applications, and future directions.&lt;/p&gt;

&lt;p&gt;[2] Bai, et al. (2023). Advancing abductive reasoning in knowledge graphs through complex logical hypothesis generation.&lt;/p&gt;

&lt;p&gt;[3] Borrego, et al. (2025). Research hypothesis generation over scientific knowledge graphs.&lt;/p&gt;

&lt;p&gt;[4] Eger, et al. (2025). Transforming science with large language models: a survey on AI-assisted scientific discovery, experimentation, content generation, and evaluation.&lt;/p&gt;

&lt;p&gt;[5] Fan, et al. (2026). Do we still need GraphRAG? Benchmarking RAG and GraphRAG for agentic search systems.&lt;/p&gt;

&lt;p&gt;[6] Gao, et al. (2024). Retrieval-augmented generation for large language models: a survey.&lt;/p&gt;

&lt;p&gt;[7] Gridach, et al. (2025). Agentic AI for scientific discovery: a survey of progress, challenges, and future directions.&lt;/p&gt;

&lt;p&gt;[8] Hambarde, et al. (2023). Information retrieval: recent advances and beyond.&lt;/p&gt;

&lt;p&gt;[9] Hao, et al. (2019). Universal representation learning of knowledge bases by jointly embedding instances and ontological concepts.&lt;/p&gt;

&lt;p&gt;[10] Huang, et al. (2026). A survey on retrieval-augmented text generation for large language models.&lt;/p&gt;

&lt;p&gt;[11] Li, et al. (2025). Towards agentic RAG with deep reasoning: a survey of RAG-reasoning systems in LLMs.&lt;/p&gt;

&lt;p&gt;[12] Li, et al. (2026). OntoKG: ontology-oriented knowledge graph construction with intrinsic-relational routing.&lt;/p&gt;

&lt;p&gt;[13] Niu, et al. (2026). A comprehensive survey of knowledge graph reasoning: approaches and applications.&lt;/p&gt;

&lt;p&gt;[14] Peng, et al. (2024). Graph retrieval-augmented generation: a survey.&lt;/p&gt;

&lt;p&gt;[15] Singh, et al. (2026). Agentic retrieval-augmented generation: a survey on agentic RAG.&lt;/p&gt;

&lt;p&gt;[16] Sun, et al. (2025). LKD-KGC: domain-specific KG construction via LLM-driven knowledge dependency parsing.&lt;/p&gt;

&lt;p&gt;[17] Thakur, et al. (2021). BEIR: a heterogenous benchmark for zero-shot evaluation of information retrieval models.&lt;/p&gt;

&lt;p&gt;[18] Xiang, et al. (2026). When to use graphs in RAG: a comprehensive analysis for graph retrieval-augmented generation.&lt;/p&gt;

&lt;p&gt;[19] Xu, et al. (2025). A survey of model architectures in information retrieval.&lt;/p&gt;

&lt;p&gt;[20] Yeom, et al. (2024). Embedding two-view knowledge graphs with class inheritance and structural similarity.&lt;/p&gt;

&lt;p&gt;[21] Zhang, et al. (2025a). A survey of graph retrieval-augmented generation for customized large language models.&lt;/p&gt;

&lt;p&gt;[22] Zhang, et al. (2025b). From web search towards agentic deep research: incentivizing search with reasoning agents.&lt;/p&gt;

&lt;p&gt;[23] Zhang, et al. (2025c). On the role of pretrained language models in general-purpose text embeddings: a survey.&lt;/p&gt;
</description>
        <pubDate>Mon, 04 May 2026 00:00:00 +0000</pubDate>
        <link>https://www.caozihang.com//2026/05/04/week/</link>
        <guid isPermaLink="true">https://www.caozihang.com//2026/05/04/week/</guid>
        
        <category>Daily</category>
        
        
      </item>
    
      <item>
        <title>[Read] Weekly Reading Selections 2026-04-20 → 2026-04-26</title>
        <description>&lt;center style=&quot;margin-bottom: 20px; margin-top: 50px&quot;&gt;&lt;font color=&quot;#3879B1&quot; style=&quot;line-height: 1.4;font-weight: 700;font-size: 36px;box-sizing: border-box; &quot;&gt;Weekly Reading Selections&lt;/font&gt;&lt;/center&gt;

&lt;center style=&quot; margin-bottom: 30px;&quot;&gt;2026-04-20 → 2026-04-26&lt;/center&gt;

&lt;font style=&quot;font-weight: bold;&quot;&gt;Contents&lt;/font&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#学术相关&quot; id=&quot;markdown-toc-学术相关&quot;&gt;Academic&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#is-ai-an-algorithm-by-any-other-name-behavioral-reactions-to-ai--and-model-based-demand-planning-algorithms-&quot; id=&quot;markdown-toc-is-ai-an-algorithm-by-any-other-name-behavioral-reactions-to-ai--and-model-based-demand-planning-algorithms-&quot;&gt;Is AI an algorithm by any other name? Behavioral reactions to AI- and model-based demand planning algorithms &lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#unraveling-multifaceted-user-preferences-on-digital-platforms-a-bayesian-deep-learning-approach-2&quot; id=&quot;markdown-toc-unraveling-multifaceted-user-preferences-on-digital-platforms-a-bayesian-deep-learning-approach-2&quot;&gt;Unraveling multifaceted user preferences on digital platforms: a Bayesian deep-learning approach [^2]&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;学术相关&quot;&gt;Academic&lt;/h1&gt;

&lt;h2 id=&quot;is-ai-an-algorithm-by-any-other-name-behavioral-reactions-to-ai--and-model-based-demand-planning-algorithms-&quot;&gt;Is AI an algorithm by any other name? Behavioral reactions to AI- and model-based demand planning algorithms &lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/h2&gt;

&lt;h2 id=&quot;unraveling-multifaceted-user-preferences-on-digital-platforms-a-bayesian-deep-learning-approach-2&quot;&gt;Unraveling multifaceted user preferences on digital platforms: a Bayesian deep-learning approach [^2]&lt;/h2&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;McKinley, F., Brau, R., Aloysius, J., &amp;amp; Hofer, A. R. (n.d.). Is AI an algorithm by any other name? Behavioral reactions to AI- and model-based demand planning algorithms. https://doi.org/10.1002/joom.70045 &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;
</description>
        <pubDate>Mon, 27 Apr 2026 00:00:00 +0000</pubDate>
        <link>https://www.caozihang.com//2026/04/27/week/</link>
        <guid isPermaLink="true">https://www.caozihang.com//2026/04/27/week/</guid>
        
        <category>Daily</category>
        
        
      </item>
    
      <item>
        <title>[Read] Weekly Reading Selections 2026-04-13 → 2026-04-19</title>
        <description>&lt;center style=&quot;margin-bottom: 20px; margin-top: 50px&quot;&gt;&lt;font color=&quot;#3879B1&quot; style=&quot;line-height: 1.4;font-weight: 700;font-size: 36px;box-sizing: border-box; &quot;&gt;Weekly Reading Selections&lt;/font&gt;&lt;/center&gt;

&lt;center style=&quot; margin-bottom: 30px;&quot;&gt;2026-04-13 → 2026-04-19&lt;/center&gt;

&lt;font style=&quot;font-weight: bold;&quot;&gt;Contents&lt;/font&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#学术相关&quot; id=&quot;markdown-toc-学术相关&quot;&gt;Academic&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#towards-real-world-human-behavior-simulation-benchmarking-large-language-models-on-long-horizon-cross-scenario-heterogeneous-behavior-traces-&quot; id=&quot;markdown-toc-towards-real-world-human-behavior-simulation-benchmarking-large-language-models-on-long-horizon-cross-scenario-heterogeneous-behavior-traces-&quot;&gt;Towards real-world human behavior simulation: benchmarking large language models on long-horizon, cross-scenario, heterogeneous behavior traces &lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#an-introduction-to-doubledebiased-machine-learning-&quot; id=&quot;markdown-toc-an-introduction-to-doubledebiased-machine-learning-&quot;&gt;An introduction to double/debiased machine learning &lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#娱乐&quot; id=&quot;markdown-toc-娱乐&quot;&gt;Entertainment&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#集智俱乐部丨物理学报告复杂系统的可预测性&quot; id=&quot;markdown-toc-集智俱乐部丨物理学报告复杂系统的可预测性&quot;&gt;Swarma Club | Physics Reports: Predictability of Complex Systems&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;学术相关&quot;&gt;Academic&lt;/h1&gt;

&lt;h2 id=&quot;towards-real-world-human-behavior-simulation-benchmarking-large-language-models-on-long-horizon-cross-scenario-heterogeneous-behavior-traces-&quot;&gt;Towards real-world human behavior simulation: benchmarking large language models on long-horizon, cross-scenario, heterogeneous behavior traces &lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/h2&gt;

&lt;h2 id=&quot;an-introduction-to-doubledebiased-machine-learning-&quot;&gt;An introduction to double/debiased machine learning &lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/h2&gt;

&lt;h1 id=&quot;娱乐&quot;&gt;Entertainment&lt;/h1&gt;

&lt;h2 id=&quot;集智俱乐部丨物理学报告复杂系统的可预测性&quot;&gt;&lt;a href=&quot;https://mp.weixin.qq.com/s/YM4rrJO7I-9mZxWMLbg0UQ?scene=1&amp;amp;click_id=26&quot;&gt;Swarma Club | Physics Reports: Predictability of Complex Systems&lt;/a&gt;&lt;/h2&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Chen, J., Xu, R., Cao, B., Pan, R., Zhang, Y., Hu, Y., Du, Y., Gao, T., Lu, Y., Sun, Y., Han, X., Sun, L., Wu, X., &amp;amp; Lin, H. (2026). Towards real-world human behavior simulation: Benchmarking large language models on long-horizon, cross-scenario, heterogeneous behavior traces (arXiv:2604.08362). arXiv. https://doi.org/10.48550/arXiv.2604.08362 &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Ahrens, A., Chernozhukov, V., Hansen, C., Kozbur, D., Schaffer, M., &amp;amp; Wiemann, T. (2025). An introduction to double/debiased machine learning (Version 2). arXiv. https://doi.org/10.48550/ARXIV.2504.08324 &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;
</description>
        <pubDate>Mon, 20 Apr 2026 00:00:00 +0000</pubDate>
        <link>https://www.caozihang.com//2026/04/20/week/</link>
        <guid isPermaLink="true">https://www.caozihang.com//2026/04/20/week/</guid>
        
        <category>Daily</category>
        
        
      </item>
    
      <item>
        <title>[Read] Weekly Reading Selections 2026-04-06 → 2026-04-12</title>
        <description>&lt;center style=&quot;margin-bottom: 20px; margin-top: 50px&quot;&gt;&lt;font color=&quot;#3879B1&quot; style=&quot;line-height: 1.4;font-weight: 700;font-size: 36px;box-sizing: border-box; &quot;&gt;Weekly Reading Selections&lt;/font&gt;&lt;/center&gt;

&lt;center style=&quot; margin-bottom: 30px;&quot;&gt;2026-04-06 → 2026-04-12&lt;/center&gt;

&lt;font style=&quot;font-weight: bold;&quot;&gt;Contents&lt;/font&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#学术相关&quot; id=&quot;markdown-toc-学术相关&quot;&gt;Academic&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#技术技巧&quot; id=&quot;markdown-toc-技术技巧&quot;&gt;Technical Tips&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#腾讯技术工程丨技术教科书顶级开发团队设计的harness工程项目源码什么样&quot; id=&quot;markdown-toc-腾讯技术工程丨技术教科书顶级开发团队设计的harness工程项目源码什么样&quot;&gt;Tencent Tech Engineering | A Technical Textbook: What the Source Code of a Harness Engineering Project Designed by a Top Development Team Looks Like&lt;/a&gt;        &lt;ul&gt;
          &lt;li&gt;&lt;a href=&quot;#快速启动&quot; id=&quot;markdown-toc-快速启动&quot;&gt;Fast Startup&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#工具系统&quot; id=&quot;markdown-toc-工具系统&quot;&gt;Tool System&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#查询系统&quot; id=&quot;markdown-toc-查询系统&quot;&gt;Query System&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#multi-agent编排与任务系统&quot; id=&quot;markdown-toc-multi-agent编排与任务系统&quot;&gt;Multi-Agent Orchestration and Task System&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#tui&quot; id=&quot;markdown-toc-tui&quot;&gt;TUI&lt;/a&gt;&lt;/li&gt;
          &lt;li&gt;&lt;a href=&quot;#harness-engineering&quot; id=&quot;markdown-toc-harness-engineering&quot;&gt;Harness Engineering&lt;/a&gt;&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#cloc丨count-lines-of-code&quot; id=&quot;markdown-toc-cloc丨count-lines-of-code&quot;&gt;cloc丨Count Lines of Code&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#tree-sitter&quot; id=&quot;markdown-toc-tree-sitter&quot;&gt;Tree-Sitter&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#娱乐&quot; id=&quot;markdown-toc-娱乐&quot;&gt;Entertainment&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#deepscientist--paperorchestra&quot; id=&quot;markdown-toc-deepscientist--paperorchestra&quot;&gt;DeepScientist &amp;amp; PaperOrchestra&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;学术相关&quot;&gt;Academic&lt;/h1&gt;

&lt;h1 id=&quot;技术技巧&quot;&gt;Technical Tips&lt;/h1&gt;

&lt;h2 id=&quot;腾讯技术工程丨技术教科书顶级开发团队设计的harness工程项目源码什么样&quot;&gt;&lt;a href=&quot;https://mp.weixin.qq.com/s/MKWckXraK1irNvMgCIJXZw&quot;&gt;Tencent Tech Engineering | A Technical Textbook: What the Source Code of a Harness Engineering Project Designed by a Top Development Team Looks Like&lt;/a&gt;&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Runtime: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Bun&lt;/code&gt;, for fast startup&lt;/li&gt;
  &lt;li&gt;Language: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TypeScript&lt;/code&gt;, for type safety&lt;/li&gt;
  &lt;li&gt;UI: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;React+Ink&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;CLI parsing: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Commander.js&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Schema validation: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Zod v4&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Code search: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ripgrep&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Protocols: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MCP SDK + LSP&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Telemetry: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;OpenTelemetry + gRPC&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Feature flags: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GrowthBook&lt;/code&gt;, supporting A/B testing and progressive rollout&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;快速启动&quot;&gt;Fast Startup&lt;/h3&gt;

&lt;p&gt;Layered startup architecture&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Entry dispatch &amp;amp; lazy import
    &lt;ul&gt;
      &lt;li&gt;Heavyweight modules are loaded lazily&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Parallel Prefetch&lt;/li&gt;
  &lt;li&gt;Global initialization
    &lt;ul&gt;
      &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;memoize&lt;/code&gt; ensures initialization runs only once even when the module is imported multiple times&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Session-level initialization&lt;/li&gt;
&lt;/ul&gt;
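
&lt;p&gt;The once-only global initialization and lazy loading listed above can be sketched as follows. The original project is written in TypeScript, so this Python version is only an analogy: functools.lru_cache stands in for a memoization helper and the standard json module stands in for a heavyweight dependency. All names are hypothetical.&lt;/p&gt;

```python
import functools
import importlib

init_log = []  # records each time the expensive setup really executes


@functools.lru_cache(maxsize=None)
def global_init():
    """Global initialization; memoization makes repeated calls reuse the first result."""
    init_log.append("init")
    return {"config_loaded": True}


def get_heavy_module(name="json"):
    """Lazy import: the module is loaded on first use, not at startup.

    "json" is only a stand-in for a heavyweight dependency.
    """
    return importlib.import_module(name)
```

&lt;p&gt;Calling global_init() from several entry points returns the same cached object, so the setup body runs a single time.&lt;/p&gt;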

&lt;h3 id=&quot;工具系统&quot;&gt;Tool System&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Interface: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Tool&amp;lt;Input, Output, Progress&amp;gt;&lt;/code&gt;
    &lt;ul&gt;
      &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;inputSchema&lt;/code&gt;: Pydantic-style model definition, validated with a Zod v4 schema&lt;/li&gt;
      &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;checkPermissions&lt;/code&gt;: permission check&lt;/li&gt;
      &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;isConcurrencySafe&lt;/code&gt;: concurrency-safety flag used by the control flow&lt;/li&gt;
      &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;isDestructive&lt;/code&gt;: marks dangerous operations&lt;/li&gt;
      &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;isAutoClassifierInput&lt;/code&gt;: provides a compact representation for the auto-mode safety classifier&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;buildTool&lt;/code&gt; factory function
    &lt;ul&gt;
      &lt;li&gt;Every tool must be created through the factory function, which supplies safe defaults (a tool is assumed unsafe unless declared otherwise)&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Unified tool registration&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;assembleToolPool()&lt;/code&gt; merges built-in tools with MCP tools and builds a single exit point for the tool pool&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;StreamingToolExecutor&lt;/code&gt;: streaming parallel execution&lt;/li&gt;
&lt;/ul&gt;
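
&lt;p&gt;The fail-closed factory idea can be illustrated with a small sketch. The field names mirror the flags listed above, but the concrete default values, the build_tool and assemble_tool_pool signatures, and the rule that built-in tools win on a name clash are assumptions made for this example, and the code is Python rather than the project's TypeScript.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    run: object                         # callable taking the validated input
    is_concurrency_safe: bool = False   # fail closed: assume not safe to run in parallel
    is_destructive: bool = True         # fail closed: assume the tool can do damage


def build_tool(name, run, **overrides):
    """Single creation path, so a forgotten flag falls back to the strict default."""
    return Tool(name=name, run=run, **overrides)


def assemble_tool_pool(builtin_tools, mcp_tools):
    """Merges both sources into one pool; built-ins win on a name clash (an assumption)."""
    pool = {tool.name: tool for tool in mcp_tools}
    pool.update({tool.name: tool for tool in builtin_tools})
    return pool
```

&lt;p&gt;A tool author has to opt in to the permissive flags explicitly, for example build_tool("grep", fn, is_concurrency_safe=True, is_destructive=False); omitting them keeps the restrictive defaults.&lt;/p&gt;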

&lt;h3 id=&quot;查询系统&quot;&gt;Query System&lt;/h3&gt;

&lt;blockquote&gt;
  &lt;p&gt;The core of the Agent Loop&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ul&gt;
  &lt;li&gt;An async generator drives the main loop
    &lt;ul&gt;
      &lt;li&gt;Streaming UI updates&lt;/li&gt;
      &lt;li&gt;Mid-run interruption&lt;/li&gt;
      &lt;li&gt;Backpressure control&lt;/li&gt;
      &lt;li&gt;Each message is persisted independently&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Loop state management
    &lt;ul&gt;
      &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;State&lt;/code&gt; object
        &lt;ul&gt;
          &lt;li&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;transition&lt;/code&gt; field records why the previous iteration continued&lt;/li&gt;
        &lt;/ul&gt;
      &lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
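  &lt;li&gt;Four-level context compaction pipeline with progressive degradation
    &lt;ul&gt;
      &lt;li&gt;Snip Compact: marker-based history trimming (no API call needed)&lt;/li&gt;
      &lt;li&gt;Micro Compact: cache-editing compaction (removes specific tool-call results without invalidating the overall cache)&lt;/li&gt;
      &lt;li&gt;Context Collapse: folds multi-turn tool-call results into summaries while preserving structure&lt;/li&gt;
      &lt;li&gt;Auto Compact: full summary compaction (when the context nears the window limit, history messages are replaced with a generated summary)&lt;/li&gt;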
    &lt;/ul&gt;
  &lt;/li&gt;
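  &lt;li&gt;Output truncation recovery
    &lt;ul&gt;
      &lt;li&gt;When &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;stop_reason === &apos;max_output_tokens&apos;&lt;/code&gt; signals truncated output, three recovery tiers apply
        &lt;ul&gt;
          &lt;li&gt;Token escalation: try raising the token budget cap&lt;/li&gt;
          &lt;li&gt;Multi-turn recovery (up to 3 attempts): inject a recovery message asking the model to continue from where it stopped&lt;/li&gt;
          &lt;li&gt;Give up: surface the error&lt;/li&gt;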
        &lt;/ul&gt;
      &lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Model fallback for fault tolerance&lt;/li&gt;
  &lt;li&gt;QueryEngine encapsulates the full query lifecycle&lt;/li&gt;
  &lt;li&gt;Query configuration snapshot &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;QueryConfig&lt;/code&gt;
    &lt;ul&gt;
      &lt;li&gt;Reading configuration live cannot guarantee consistent behavior across the iterations of one query, hence the snapshot&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;task_budget&lt;/code&gt;: token budget&lt;/li&gt;
&lt;/ul&gt;
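
&lt;p&gt;The three-tier truncation recovery listed above might look roughly like this. Here call_model is a hypothetical callable returning a (text, stop_reason) pair, the budgets are made-up numbers, and discarding the first truncated reply once the budget is raised is a simplification of this sketch, not documented behavior.&lt;/p&gt;

```python
def run_with_recovery(call_model, base_budget=1000, raised_budget=4000, max_recoveries=3):
    """Three-tier recovery for a reply cut off by the output-token limit.

    call_model(messages, budget) is a hypothetical callable returning
    (text, stop_reason). Budgets and the retry cap are illustrative numbers.
    """
    messages = ["user: task"]
    text, stop = call_model(messages, base_budget)
    if stop != "max_output_tokens":
        return text

    # Tier 1: retry once with a higher token budget.
    text, stop = call_model(messages, raised_budget)
    parts = [text]

    # Tier 2: inject a recovery message and ask the model to continue.
    recoveries = 0
    while stop == "max_output_tokens" and recoveries != max_recoveries:
        messages = messages + ["assistant: " + text, "user: continue from where you stopped"]
        text, stop = call_model(messages, raised_budget)
        parts.append(text)
        recoveries += 1

    # Tier 3: give up and surface the error.
    if stop == "max_output_tokens":
        raise RuntimeError("output still truncated after recovery attempts")
    return "".join(parts)
```

&lt;p&gt;Ordering the tiers from cheapest to most disruptive means most truncations are absorbed before the user ever sees an error.&lt;/p&gt;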

&lt;h3 id=&quot;multi-agent编排与任务系统&quot;&gt;Multi-Agent Orchestration and Task System&lt;/h3&gt;

&lt;blockquote&gt;
  &lt;p&gt;Seven task types, each with independent lifecycle management; every task ID consists of a type-specific prefix letter plus an 8-character random suffix&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;local_bash&lt;/code&gt; Shell command -&amp;gt; background process&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;local_agent&lt;/code&gt; local subagent -&amp;gt; separate process&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;remote_agent&lt;/code&gt; remote subagent -&amp;gt; WebSocket connection&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;in_process_teammate&lt;/code&gt; in-process teammate -&amp;gt; shared memory&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;local_workflow&lt;/code&gt; local workflow script&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;monitor_mcp&lt;/code&gt; MCP monitoring task&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;dream&lt;/code&gt; “dream” task -&amp;gt; background analysis&lt;/li&gt;
&lt;/ul&gt;
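
&lt;p&gt;One possible shape for the task ID scheme described above is sketched here. The prefix letters and the lowercase-alphanumeric alphabet are assumptions; the source only states that each task type has its own prefix letter followed by an 8-character random suffix.&lt;/p&gt;

```python
import secrets
import string

# Hypothetical prefix letters; the source only says each task type has its own.
TASK_PREFIXES = {
    "local_bash": "b",
    "local_agent": "a",
    "remote_agent": "r",
    "in_process_teammate": "t",
    "local_workflow": "w",
    "monitor_mcp": "m",
    "dream": "d",
}

ALPHABET = string.ascii_lowercase + string.digits  # assumed character set


def new_task_id(task_type):
    """Returns the type's prefix letter followed by 8 random characters."""
    prefix = TASK_PREFIXES[task_type]
    suffix = "".join(secrets.choice(ALPHABET) for _ in range(8))
    return prefix + suffix
```

&lt;p&gt;A typed prefix lets logs and dispatch code tell task kinds apart from the ID alone, without a lookup.&lt;/p&gt;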

&lt;p&gt;&lt;strong&gt;Agent Tool: Subagent Spawning&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A subagent has its own conversation history and tool pool&lt;/p&gt;

&lt;p&gt;Key constraints:&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Tool allowlist
    &lt;ul&gt;
      &lt;li&gt;Tools for spawning and managing agents are prohibited&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Permission inheritance&lt;/li&gt;
  &lt;li&gt;Independent AbortController&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Coordinator Mode&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AGENT_COORDINATOR_MODE=1&lt;/code&gt; is set, the main thread becomes a coordinator that only assigns tasks&lt;/p&gt;

&lt;p&gt;In this mode the coordinator thread has only &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AgentTool&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TaskStopTool&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SendMessageTool&lt;/code&gt;, while tools such as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Bash&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;File Read/Edit/Write&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Grep&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Glob&lt;/code&gt; belong to the Worker threads&lt;/p&gt;

&lt;p&gt;Separating the tool sets ensures the coordinator does not take part in task execution&lt;/p&gt;
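
&lt;p&gt;The tool split in coordinator mode can be pictured with a small lookup. This is a sketch under assumptions: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;tools_for&lt;/code&gt; helper, the role names, the flattened file-tool names, and the behaviour outside coordinator mode (the main thread keeping every tool) are illustrative and not taken from the source.&lt;/p&gt;

```python
import os
from typing import Mapping, Optional

COORDINATOR_TOOLS = frozenset({"AgentTool", "TaskStopTool", "SendMessageTool"})
# File Read/Edit/Write are spelled as three hypothetical tool names here.
WORKER_TOOLS = frozenset({"Bash", "FileRead", "FileEdit", "FileWrite", "Grep", "Glob"})


def tools_for(role: str, env: Optional[Mapping[str, str]] = None) -> frozenset:
    """Return the tool names visible to the "main" thread or a "worker"."""
    env = os.environ if env is None else env
    coordinator_mode = env.get("AGENT_COORDINATOR_MODE") == "1"
    if role == "worker":
        return WORKER_TOOLS
    if coordinator_mode:
        # The main thread only delegates: it has no execution tools at all.
        return COORDINATOR_TOOLS
    # Assumption: outside coordinator mode the main thread keeps every tool.
    return COORDINATOR_TOOLS | WORKER_TOOLS
```

&lt;p&gt;Since the two sets are disjoint, a coordinator cannot run a shell command even if the model asks for one; the request has to go to a worker.&lt;/p&gt;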

&lt;p&gt;&lt;strong&gt;Agent Swarms&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In-process teammates communicate over a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Unix Domain Socket&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This gives fast local communication&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;DreamTask: Background Analysis&lt;/strong&gt;&lt;/p&gt;

&lt;h3 id=&quot;tui&quot;&gt;TUI&lt;/h3&gt;

&lt;p&gt;Built-in Ink rendering engine&lt;/p&gt;

&lt;h3 id=&quot;harness-engineering&quot;&gt;Harness Engineering&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Context cost&lt;/li&gt;
  &lt;li&gt;Architectural constraints
    &lt;ul&gt;
      &lt;li&gt;Hard constraints enforced in code and tools&lt;/li&gt;
      &lt;li&gt;Deny Rules -&amp;gt; Tool-level Permissions -&amp;gt; Generic Rules -&amp;gt; Permission Mode -&amp;gt; Auto Classifier&lt;/li&gt;
      &lt;li&gt;Defaults apply the strictest restrictions, so a forgotten setting cannot open a hole&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Self-verification loop&lt;/li&gt;
  &lt;li&gt;Context isolation
    &lt;ul&gt;
      &lt;li&gt;Process-level isolation: each subagent runs in its own process&lt;/li&gt;
      &lt;li&gt;Communication through defined interfaces&lt;/li&gt;
      &lt;li&gt;Coordinator mode: separation of control and data&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Entropy management: counter the system&apos;s natural growth in entropy (disorder in context, memory, and documents)
    &lt;ul&gt;
      &lt;li&gt;Context distillation: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/compact&lt;/code&gt; + &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Auto Compact&lt;/code&gt;&lt;/li&gt;
      &lt;li&gt;Knowledge retention: persistent writes to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;memdir/&lt;/code&gt;&lt;/li&gt;
      &lt;li&gt;State cleanup: session memory expires automatically&lt;/li&gt;
      &lt;li&gt;Background consolidation: AutoDream&lt;/li&gt;
      &lt;li&gt;Defragmentation: Dream Phase 4: Prune &amp;amp; Index&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Modularity: avoid deep coupling to any specific model&lt;/li&gt;
&lt;/ul&gt;
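
&lt;p&gt;The permission chain and the strictest-by-default rule can be sketched as an ordered list of checks. This is an assumption-heavy illustration: the first-decisive-layer-wins semantics, the function names, and every concrete rule below are hypothetical; only the order of the five layers comes from the notes.&lt;/p&gt;

```python
from typing import Callable, Optional

# A check returns "allow", "deny", or None when it has no opinion.
Check = Callable[[str, str], Optional[str]]


def deny_rules(tool: str, command: str) -> Optional[str]:
    # Hypothetical deny rule: recursive deletes are always blocked.
    return "deny" if "rm -rf" in command else None


def tool_permissions(tool: str, command: str) -> Optional[str]:
    # Hypothetical tool-level rule: read-only search tools are allowed.
    return "allow" if tool in {"Grep", "Glob"} else None


def generic_rules(tool: str, command: str) -> Optional[str]:
    return None  # no generic rule configured in this sketch


def permission_mode(tool: str, command: str) -> Optional[str]:
    return None  # no blanket approval mode active in this sketch


def auto_classifier(tool: str, command: str) -> Optional[str]:
    return None  # stand-in for an automatic classifier; undecided here


CHAIN: list[Check] = [deny_rules, tool_permissions, generic_rules,
                      permission_mode, auto_classifier]


def decide(tool: str, command: str, chain: Optional[list[Check]] = None) -> str:
    for check in (CHAIN if chain is None else chain):
        verdict = check(tool, command)
        if verdict is not None:
            return verdict  # the first layer with an opinion decides
    return "deny"  # nothing matched: fall back to the strictest outcome
```

&lt;p&gt;With this layout, forgetting to write a rule for a new tool leaves it denied rather than silently allowed.&lt;/p&gt;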

&lt;h2 id=&quot;cloc丨count-lines-of-code&quot;&gt;&lt;a href=&quot;https://github.com/AlDanial/cloc&quot;&gt;cloc | Count Lines of Code&lt;/a&gt;&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;file &amp;amp; directory &amp;amp; archive &amp;amp; git repo: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cloc [path]&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Iterative &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;for d in ./*/ ; do (cd &quot;$d&quot; &amp;amp;&amp;amp; echo &quot;$d&quot; &amp;amp;&amp;amp; cloc --vcs git); done&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;help &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cloc --help&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Provides options for controlling input and output, and supports diff comparison&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;table class=&quot;rouge-table&quot;&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td class=&quot;rouge-code&quot;&gt;&lt;pre&gt;prompt&amp;gt; cloc perl-5.22.0.tar.gz
    5605 text files.
    5386 unique files.
    2176 files ignored.

https://github.com/AlDanial/cloc v 1.65  &lt;span class=&quot;nv&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;25.49 s &lt;span class=&quot;o&quot;&gt;(&lt;/span&gt;134.7 files/s, 51980.3 lines/s&lt;span class=&quot;o&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;nt&quot;&gt;-----------------------------------------------------------------------------------&lt;/span&gt;
Language                         files          blank        comment           code
&lt;span class=&quot;nt&quot;&gt;-----------------------------------------------------------------------------------&lt;/span&gt;
Perl                              2892         136396         184362         536445
C                                  130          24676          33684         155648
C/C++ Header                       148           9766          16569         147858
Bourne Shell                       112           4044           6796          42668
Pascal                               8            458           1603           8592
XML                                 33            142              0           2410
YAML                                49             20             15           2078
C++                                 10            313            277           2033
make                                 4            426            488           1986
Prolog                              12            438              2           1146
JSON                                14              1              0           1037
yacc                                 1             85             76            998
Windows Message File                 1            102             11            489
DOS Batch                           14             92             41            389
Windows Resource File                3             10              0             85
D                                    1              5              7              8
Lisp                                 2              0              3              4
&lt;span class=&quot;nt&quot;&gt;-----------------------------------------------------------------------------------&lt;/span&gt;
SUM:                              3434         176974         243934         903874
&lt;span class=&quot;nt&quot;&gt;-----------------------------------------------------------------------------------&lt;/span&gt;
&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;tree-sitter&quot;&gt;&lt;a href=&quot;https://tree-sitter.github.io/tree-sitter/&quot;&gt;Tree-Sitter&lt;/a&gt;&lt;/h2&gt;

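&lt;p&gt;A parser generator and incremental parsing library that turns source code into a concrete syntax tree and provides an S-expression-based query language for matching patterns in that tree&lt;/p&gt;

&lt;p&gt;Editors such as VS Code and Neovim use Tree-Sitter for syntax highlighting&lt;/p&gt;

&lt;h1 id=&quot;娱乐&quot;&gt;Entertainment&lt;/h1&gt;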

&lt;h2 id=&quot;deepscientist--paperorchestra&quot;&gt;&lt;a href=&quot;https://github.com/ResearAI/DeepScientist&quot;&gt;DeepScientist&lt;/a&gt; &amp;amp; &lt;a href=&quot;https://github.com/Ar9av/PaperOrchestra&quot;&gt;PaperOrchestra&lt;/a&gt;&lt;/h2&gt;

&lt;p&gt;The main point of interest is the harness&lt;/p&gt;
</description>
        <pubDate>Mon, 13 Apr 2026 00:00:00 +0000</pubDate>
        <link>https://www.caozihang.com//2026/04/13/week/</link>
        <guid isPermaLink="true">https://www.caozihang.com//2026/04/13/week/</guid>
        
        <category>日常</category>
        
        
      </item>
    
      <item>
        <title>[Reading] Weekly Reading Picks 2026-03-30 → 2026-04-05</title>
        <description>&lt;center style=&quot;margin-bottom: 20px; margin-top: 50px&quot;&gt;&lt;font color=&quot;#3879B1&quot; style=&quot;line-height: 1.4;font-weight: 700;font-size: 36px;box-sizing: border-box; &quot;&gt;Weekly Reading Picks&lt;/font&gt;&lt;/center&gt;

&lt;center style=&quot; margin-bottom: 30px;&quot;&gt;2026-03-30 → 2026-04-05&lt;/center&gt;

&lt;font style=&quot;font-weight: bold;&quot;&gt;Contents&lt;/font&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#学术相关&quot; id=&quot;markdown-toc-学术相关&quot;&gt;Academic&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#beyond-means-a-dynamic-framework-for-predicting-customer-satisfaction-&quot; id=&quot;markdown-toc-beyond-means-a-dynamic-framework-for-predicting-customer-satisfaction-&quot;&gt;Beyond Means: A Dynamic Framework for Predicting Customer Satisfaction &lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#search-games-with-predictions-&quot; id=&quot;markdown-toc-search-games-with-predictions-&quot;&gt;Search games with predictions &lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#娱乐&quot; id=&quot;markdown-toc-娱乐&quot;&gt;Entertainment&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#leveraging-llm-based-agents-for-social-science-research-insights-from-citation-network-simulations-&quot; id=&quot;markdown-toc-leveraging-llm-based-agents-for-social-science-research-insights-from-citation-network-simulations-&quot;&gt;Leveraging LLM-based agents for social science research: insights from citation network simulations &lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;学术相关&quot;&gt;Academic&lt;/h1&gt;
&lt;h2 id=&quot;beyond-means-a-dynamic-framework-for-predicting-customer-satisfaction-&quot;&gt;Beyond Means: A Dynamic Framework for Predicting Customer Satisfaction &lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/h2&gt;

&lt;h2 id=&quot;search-games-with-predictions-&quot;&gt;Search games with predictions &lt;sup id=&quot;fnref:2&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/h2&gt;

&lt;h1 id=&quot;娱乐&quot;&gt;Entertainment&lt;/h1&gt;

&lt;h2 id=&quot;leveraging-llm-based-agents-for-social-science-research-insights-from-citation-network-simulations-&quot;&gt;Leveraging LLM-based agents for social science research: insights from citation network simulations &lt;sup id=&quot;fnref:3&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;&lt;/h2&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Naumzik, C., Maarouf, A., Feuerriegel, S., &amp;amp; Weinmann, M. (2026). Beyond Means: A Dynamic Framework for Predicting Customer Satisfaction. International Journal of Research in Marketing. https://doi.org/10.1016/j.ijresmar.2026.03.006 &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Angelopoulos, S., Lidbetter, T., &amp;amp; Panagiotou, K. (2026). Search games with predictions. Operations Research, opre.2024.1498. https://doi.org/10.1287/opre.2024.1498 &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Ji, J., Lei, R., Pan, X., Wei, Z., Sun, H., Lin, Y., Chen, X., Yang, Y., Li, Y., Ding, B., &amp;amp; Wen, J.-R. (2025). Leveraging LLM-based agents for social science research: Insights from citation network simulations (arXiv:2511.03758). arXiv. https://doi.org/10.48550/arXiv.2511.03758 &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;
</description>
        <pubDate>Tue, 07 Apr 2026 00:00:00 +0000</pubDate>
        <link>https://www.caozihang.com//2026/04/07/week/</link>
        <guid isPermaLink="true">https://www.caozihang.com//2026/04/07/week/</guid>
        
        <category>日常</category>
        
        
      </item>
    
      <item>
        <title>[Reading] Weekly Reading Picks 2026-03-23 → 2026-03-29</title>
        <description>&lt;center style=&quot;margin-bottom: 20px; margin-top: 50px&quot;&gt;&lt;font color=&quot;#3879B1&quot; style=&quot;line-height: 1.4;font-weight: 700;font-size: 36px;box-sizing: border-box; &quot;&gt;Weekly Reading Picks&lt;/font&gt;&lt;/center&gt;

&lt;center style=&quot; margin-bottom: 30px;&quot;&gt;2026-03-23 → 2026-03-29&lt;/center&gt;

&lt;font style=&quot;font-weight: bold;&quot;&gt;Contents&lt;/font&gt;

&lt;ul id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#学术相关&quot; id=&quot;markdown-toc-学术相关&quot;&gt;Academic&lt;/a&gt;    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#diversity-of-interactions-within-connectivist-learning-context-insights-from-flow-of-collective-attention-&quot; id=&quot;markdown-toc-diversity-of-interactions-within-connectivist-learning-context-insights-from-flow-of-collective-attention-&quot;&gt;Diversity of interactions within connectivist learning context: insights from flow of collective attention &lt;/a&gt;&lt;/li&gt;
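      &lt;li&gt;&lt;a href=&quot;#计量经济圈丨全网最全的社科计量方法手册含实操-社会科学计量模型一网打尽收藏级&quot; id=&quot;markdown-toc-计量经济圈丨全网最全的社科计量方法手册含实操-社会科学计量模型一网打尽收藏级&quot;&gt;计量经济圈 | The most complete handbook of quantitative methods for the social sciences (with hands-on practice): every social science econometric model in one place (worth bookmarking)&lt;/a&gt;&lt;/li&gt;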
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;学术相关&quot;&gt;Academic&lt;/h1&gt;
&lt;h2 id=&quot;diversity-of-interactions-within-connectivist-learning-context-insights-from-flow-of-collective-attention-&quot;&gt;Diversity of interactions within connectivist learning context: insights from flow of collective attention &lt;sup id=&quot;fnref:1&quot; role=&quot;doc-noteref&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/h2&gt;

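&lt;h2 id=&quot;计量经济圈丨全网最全的社科计量方法手册含实操-社会科学计量模型一网打尽收藏级&quot;&gt;&lt;a href=&quot;https://mp.weixin.qq.com/s/qBB06bua4LI5ICMyu4g1CA&quot;&gt;计量经济圈 | The most complete handbook of quantitative methods for the social sciences (with hands-on practice): every social science econometric model in one place (worth bookmarking)&lt;/a&gt;&lt;/h2&gt;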

&lt;p&gt;Some classic methods&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;Research Scenario&lt;/th&gt;
      &lt;th&gt;Recommended Method&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;Continuous dependent variable, no endogeneity&lt;/td&gt;
      &lt;td&gt;OLS / Multiple regression&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Binary dependent variable&lt;/td&gt;
      &lt;td&gt;Logit / Probit&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Unordered multicategorical dependent variable&lt;/td&gt;
      &lt;td&gt;Multinomial Logit&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Ordered multicategorical dependent variable&lt;/td&gt;
      &lt;td&gt;Ordered Logit&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Count data (no overdispersion)&lt;/td&gt;
      &lt;td&gt;Poisson&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Count data (overdispersed)&lt;/td&gt;
      &lt;td&gt;Negative binomial&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Count data with excess zeros&lt;/td&gt;
      &lt;td&gt;ZIP / ZINB&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Censored dependent variable&lt;/td&gt;
      &lt;td&gt;Tobit&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Nested data (students within schools)&lt;/td&gt;
      &lt;td&gt;HLM / Mixed-effects model&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Panel data, time-invariant variables matter&lt;/td&gt;
      &lt;td&gt;Random effects (RE)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Panel data, control for individual heterogeneity&lt;/td&gt;
      &lt;td&gt;Fixed effects (FE)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Causal effect of policy intervention&lt;/td&gt;
      &lt;td&gt;DID / Synthetic control&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Endogeneity problem&lt;/td&gt;
      &lt;td&gt;Instrumental variables (IV / 2SLS)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Threshold-based policy eligibility&lt;/td&gt;
      &lt;td&gt;Regression discontinuity (RDD)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Selection bias (observable)&lt;/td&gt;
      &lt;td&gt;PSM / IPW&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Complex theoretical model + latent variables&lt;/td&gt;
      &lt;td&gt;SEM / CFA&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Time-to-event data&lt;/td&gt;
      &lt;td&gt;Cox / Discrete-time event history&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Multiple/competing events&lt;/td&gt;
      &lt;td&gt;Competing risks model&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Discovering types in data&lt;/td&gt;
      &lt;td&gt;LCA / k-means&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Longitudinal developmental trajectories&lt;/td&gt;
      &lt;td&gt;Latent growth model (LGM)&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Large-scale text data&lt;/td&gt;
      &lt;td&gt;LDA / BERT&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;Heterogeneous treatment effects&lt;/td&gt;
      &lt;td&gt;Causal forest / Double ML&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot; role=&quot;doc-endnote&quot;&gt;
      &lt;p&gt;Gao, M., Zhang, J., &amp;amp; Zhang, J. (2025). Diversity of interactions within connectivist learning context: Insights from flow of collective attention. International Journal of Educational Technology in Higher Education, 22(1), 76. https://doi.org/10.1186/s41239-025-00575-5 &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;
</description>
        <pubDate>Mon, 30 Mar 2026 00:00:00 +0000</pubDate>
        <link>https://www.caozihang.com//2026/03/30/week/</link>
        <guid isPermaLink="true">https://www.caozihang.com//2026/03/30/week/</guid>
        
        <category>日常</category>
        
        
      </item>
    
  </channel>
</rss>
