Appearance
RAG 查询优化技术深度解析 v3.0
从单次查询到多轮对话、从单一到多模态、从静态检索到动态反思、从非结构化到结构化、从被动查询到主动引导 生成时间:2026-04-28 · 版本 3.0
目录
- 第一部分:查询优化的全景架构
- 第二部分:六大基础查询优化技术
- 第三部分:查询意图与多轮对话优化
- 第四部分:多模态查询优化
- 第五部分:动态检索停止与自反思机制
- 第六部分:结构化查询转换 ⭐ 新增
- 第七部分:检索颗粒度动态控制 ⭐ 新增
- 第八部分:用户反馈感知的实时查询补全 ⭐ 新增
- 第九部分:端到端代码实现
- 第十部分:效果对比与选型决策
- 附录:技术选型速查表
第一部分:查询优化的全景架构
1.1 查询优化演进:从单次到对话、从单一到多模态、从非结构化到结构化、从被动到主动
V1.0 单次查询优化(2023-2024)
用户查询 → 改写 → 检索 → 生成 → 回答
└─ 只处理纯文本;每次查询独立
V2.0 多轮对话 + 多模态(2024-2025)
文本/图像/语音 → 指代消解/模态融合 → 改写 → 自适应检索 → 反思 → 回答
└─ 新增:对话状态、多模态、自修正
V3.0 结构化 + 动态粒度 + 主动引导(2025-2026)
文本/图像/语音 + 数据库 + 元数据 → 意图理解 → DSL/SQL 转换 → 粒度自适应 → 主动引导 → 回答
└─ 新增:结构化查询、粒度控制、实时补全引导1.2 架构大图(v3.0)
┌────── ──── ───── ─ ─── ─ ── ──── ─── ─ ──── ── ──── ──── ──── ── ─ ──── ─── ─── ──── ──── ── ─ ─── ─── ┐
│ 查询优化完整架构 (v3.0) │
│ │
│ ┌──── ──── ──── ─ ─── ─── ─── ──── ──── ─── ─ ──── ─── ─── ─── ─── ─── ── ─ ─── ─ ─── ─── ─ ─ ─── ┐ │
│ │ 第 0 层:多模态输入解析(Input Parsing) │ │
│ │ │ │
│ │ · 文本查询 │ │
│ │ · 图像输入(截图/架构图/UI 截图) │ │
│ │ · 语音输入(语音转文本) │ │
│ │ · 混合输入(文本 + 图像) │ │
│ └──── ──── ──── ─ ─── ─── ─── ──── ──── ─── ─ ──── ─── ─── ─── ─── ─── ── ─ ─── ─ ─── ─── ─ ─ ─── ┘ │
│ │ │
│ ▼ │
│ ┌──── ──── ──── ─ ─── ─── ─── ──── ──── ─── ─ ──── ─── ─── ─── ─── ─── ── ─ ─── ─ ─── ─── ─ ─ ─── ┐ │
│ │ 第 1 层:对话状态管理(Conversation State) │ │
│ │ │ │
│ │ · 对话历史存储 │ │
│ │ · 指代消解(Coreference Resolution) │ │
│ │ · 意图累积(Intent Accumulation) │ │
│ │ · 上下文压缩(Context Compression) │ │
│ └──── ──── ──── ─ ─── ─── ─── ──── ──── ─── ─ ──── ─── ─── ─── ─── ─── ── ─ ─── ─ ─── ─── ─ ─ ─── ┘ │
│ │ │
│ ▼ │
│ ┌──── ──── ──── ─ ─── ─── ─── ──── ──── ─── ─ ──── ─── ─── ─── ─── ─── ── ─ ─── ─ ─── ─── ─ ─ ─── ┐ │
│ │ 第 2 层:查询理解与优化(Query Understanding) │ │
│ │ │ │
│ │ · 意图分类(Intent Classification) │ │
│ │ · 查询改写(同义词扩展/子查询分解/HyDE) │ │
│ │ · 多模态查询融合 │ │
│ │ · RAG 必要性判断 │ │
│ └──── ──── ──── ─ ─── ─── ─── ──── ──── ─── ─ ──── ─── ─── ─── ─── ─── ── ─ ─── ─ ─── ─── ─ ─ ─── ┘ │
│ │ │
│ ▼ │
│ ┌──── ──── ──── ─ ─── ─── ─── ──── ──── ─── ─ ──── ─── ─── ─── ─── ─── ── ─ ─── ─ ─── ─── ─ ─ ─── ┐ │
│ │ 第 3 层:查询类型路由(Query Routing) │ │
│ │ │ │
│ │ · 非结构化文本查询 → Hybrid RAG │ │
│ │ · 结构化数据库查询 → Text-to-SQL/DSL │ │
│ │ · 图谱关系查询 → Text-to-Cypher │ │
│ │ · 元数据过滤查询 → Self-Querying │ │
│ │ · 宏观综述型 → Recursive Abstract Retrieval │ │
│ └──── ──── ──── ─ ─── ─── ─── ──── ──── ─── ─── ─ ──── ─── ─── ─── ─── ── ─ ─── ─ ─── ─── ─ ─ ─── ┘ │
│ │ │
│ ▼ │
│ ┌──── ──── ──── ─ ─── ─── ─── ──── ──── ─── ─ ──── ─── ─── ─── ─── ─── ── ─ ─── ─ ─── ─── ─ ─ ─── ┐ │
│ │ 第 4 层:自适应检索(Adaptive Retrieval) │ │
│ │ │ │
│ │ · 动态 Chunk 粒度控制 │ │
│ │ · 多路检索(BM25 + 向量 + 图谱 + SQL + DSL) │ │
│ │ · 检索质量评估(Self-RAG) │ │
│ │ · 检索必要性反思 │ │
│ └──── ──── ──── ─ ─── ─── ─── ──── ──── ─── ─ ──── ─── ─── ─── ─── ─── ── ─ ─── ─ ─── ─── ─ ─ ─── ┘ │
│ │ │
│ ▼ │
│ ┌──── ──── ──── ─ ─── ─── ─── ──── ──── ─── ─ ──── ─── ─── ─── ─── ─── ── ─ ─── ─ ─── ─── ─ ─ ─── ┐ │
│ │ 第 5 层:生成与自反思(Generation & Self-Reflection) │ │
│ │ │ │
│ │ · Prompt 构建 │ │
│ │ · LLM 生成 │ │
│ │ · 幻觉检测(Self-Correction) │ │
│ └──── ──── ──── ─ ─── ─── ─── ──── ──── ─── ─ ──── ─── ─── ─── ─── ─── ── ─ ─── ─ ─── ─── ─ ─ ─── ┘ │
│ │
│ ┌──── ──── ──── ─ ─── ─── ─── ──── ──── ─── ─ ──── ─── ─── ─── ─── ─── ── ─ ─── ─ ─── ─── ─ ─ ─── ┐ │
│ │ 第 6 层:结果输出与主动引导(Output & Proactive Guidance) │ │
│ │ │ │
│ │ · 最终回答 + 引用来源 │ │
│ │ · 置信度评分 │ │
│ │ · 预加载推荐查询(Search Suggestions) │ │
│ │ · 歧义澄清(Clarification) │ │
│ └──── ──── ──── ─ ─── ─── ─── ──── ──── ─── ─ ──── ─── ─── ─── ─── ─── ── ─ ─── ─ ─── ─── ─ ─ ─── ┘ │
│ │
└──── ──── ──── ─ ─── ─── ─── ──── ──── ─── ─ ──── ─── ─── ─── ─── ─── ── ─ ─── ─ ─── ─── ─ ─ ─── ─── ┐第二部分:六大基础查询优化技术
v1.0 已有内容,简要回顾。详见
query-optimization-deep-dive.md。
第三部分:查询意图与多轮对话优化
v2.0 已有内容,简要回顾。详见
query-optimization-deep-dive-v2.md。
第四部分:多模态查询优化
v2.0 已有内容,简要回顾。
第五部分:动态检索停止与自反思机制
v2.0 已有内容,简要回顾。
第六部分:结构化查询转换 ⭐ 新增
6.1 问题的本质:知识不只是文本
实际业务中的知识分布在多种数据结构中:
知识存储类型 │ 查询方式 │ 典型场景
─────── ─── ──── ─ ─── ─ ─── ─ ── ─ ─── ─ ─── ─ ─── ─ ── ─ ─── ─ ─ ─ ──
非结构化文本 (PDF/Markdown) │ 向量检索 + BM25 │ 技术文档、手册
半结构化文本 (带元数据的文档) │ 向量检索 + 元数据过滤 │ CMS 文章、知识库条目
结构化数据库 (PostgreSQL/MySQL) │ SQL │ 用户数据、配置管理
图数据库 (Neo4j) │ Cypher │ 关系图谱、依赖追踪
缓存系统 (Redis) │ Key-Value 查询 │ 实时配置、会话状态核心挑战:用户不会说"给我 SQL WHERE year=2023 AND tag='Android'"。系统需要把自然语言翻译为对应数据源的查询语言。
6.2 元数据过滤自动生成(Self-Querying)
一、什么是 Self-Querying?
用户问:"2023 年关于 Android 的文档"
传统 RAG:直接向量搜索 "2023 Android 文档" → 召回所有相关内容,没有过滤
Self-Querying:
用户查询: "2023 年关于 Android 的文档"
│
▼ [Self-Querying 解析器]
┌─── ──── ─── ─ ─── ─ ─── ─ ── ─ ─── ─── ─ ─── ─ ─── ─ ─ ─ ─ ─ ┐
│ 自动提取的过滤条件: │
│ { │
│ "year": { "$gte": 2023, "$lte": 2023 }, │
│ "tags": ["Android"], │
│ "content_type": "document" │
│ } │
└─── ──── ─── ─ ─── ─ ── ─ ─── ─── ─ ─── ─ ─── ─ ─── ─ ─ ─ ─ ─ ┘
│
▼
向量检索 + 元数据过滤 → 只返回 2023 年的 Android 文档为什么重要?
场景: 用户问 "2023 年关于 Android 的文档"
❌ 不加过滤的检索:
召回 50 个文档 → 其中 10 个是 2022 年的旧文档 → 用户看到过时信息
→ 命中率低,用户体验差
✅ 加元数据过滤的检索:
召回 8 个文档 → 全部是 2023 年的 Android 文档 → 精准命中
→ 召回质量提升 3-5 倍二、技术原理
Self-Querying 流程:
用户查询 → LLM 解析 → 过滤条件 JSON
│
├── 识别过滤字段(year, tags, source 等)
├── 识别操作符(=, >, <, contains, in)
├── 识别值(2023, Android, API 文档)
│
▼
过滤条件对象
{"year": 2023, "tags": ["Android"]}
配合向量检索:
vector_store.similarity_search(query, k=10, filter={"year": 2023, "tags": "Android"})三、代码实现
python
# ================================================================
# Self-Querying:自动提取元数据过滤条件
# ================================================================
from typing import Dict, List, Optional, Any
from langchain_openai import ChatOpenAI
from langchain_core.documents import Document
from langchain.retrievers.self_query.base import SelfQueryRetriever
from langchain.chains.query_constructor.base import AttributeInfo
import json
class SelfQueryExtractor:
"""
元数据过滤条件自动提取器
功能:从用户查询中自动提取元数据过滤条件
示例:
用户: "2023 年关于 Android 的文档"
→ {"year": {"$gte": 2023, "$lte": 2023}, "tags": ["Android"], "content_type": "document"}
"""
def __init__(
self,
llm: ChatOpenAI = None,
metadata_fields: List[AttributeInfo] = None,
):
self.llm = llm or ChatOpenAI(model="gpt-4o-mini", temperature=0.1)
self.metadata_fields = metadata_fields or [
AttributeInfo(
name="year",
description="文档发布的年份(整数,如 2023, 2024)",
type="integer",
),
AttributeInfo(
name="tags",
description="文档的分类标签(如 Android, iOS, Web)",
type="list[string]",
),
AttributeInfo(
name="content_type",
description="文档类型(如 document, tutorial, api_reference, blog)",
type="string",
),
AttributeInfo(
name="source",
description="文档来源(如 docs.google.com, internal.wiki)",
type="string",
),
AttributeInfo(
name="author",
description="文档作者姓名",
type="string",
),
]
def extract_filters(self, query: str) -> Dict:
"""
从查询中提取元数据过滤条件
Args:
query: 用户查询文本
Returns:
过滤条件字典
"""
# 构建元数据字段描述
fields_desc = "\n".join([
f"- {f.name}: {f.description} (类型: {f.type})"
for f in self.metadata_fields
])
prompt = f"""You are a metadata filter extraction system for a technical documentation knowledge base.
Your task is to analyze the user's query and extract relevant metadata filter conditions.
Available metadata fields:
{fields_desc}
User query: "{query}"
Extract filter conditions ONLY for fields that are explicitly mentioned or strongly implied
in the query. DO NOT make assumptions about fields not mentioned.
Rules:
- Extract only what the user actually asked for
- Use the correct operator format: =, !=, >, <, >=, <=, contains, in
- For years: use year: 2023 or year: {"$gte": 2020, "$lte": 2024}
- For tags: use tags: ["Android", "iOS"]
- For content_type: use content_type: "document"
- Return an empty object {{}} if no meaningful filters can be extracted
Return ONLY a JSON object with the filter conditions:
{{
"filters": {{
"field_name": "filter_value_or_condition"
}},
"extraction_reason": "brief explanation of what was extracted and why",
"confidence": 0.0-1.0
}}
"""
response = self.llm.invoke([
("system", "You are a metadata filter extraction system."),
("user", prompt),
])
try:
result = json.loads(response.content)
return result.get("filters", {})
except json.JSONDecodeError:
return {}
def build_self_query_chain(self, vectorstore):
"""
构建 LangChain 的 SelfQueryRetriever
这是生产推荐的方案,集成度最高
"""
self_query_retriever = SelfQueryRetriever.from_llm(
llm=self.llm,
vectorstore=vectorstore,
document_contents="技术文档内容", # 用于语义匹配的描述
metadata_field_info=self.metadata_fields,
enable_limit=True, # 支持限制返回数量
)
return self_query_retriever
# ================================================================
# 使用示例
# ================================================================
extractor = SelfQueryExtractor()
queries_and_filters = [
"2023 年关于 Android 的文档",
"iOS 相关的博客文章",
"2024 年的 API 参考文档",
"张明的技术博客",
"关于 Kotlin 的教程",
"所有关于 React 的文章,不限年份",
]
for q in queries_and_filters:
filters = extractor.extract_filters(q)
print(f"查询: {q}")
print(f"过滤条件: {filters}")
print()
# 输出:
# 查询: 2023 年关于 Android 的文档
# 过滤条件: {"year": 2023, "tags": ["Android"], "content_type": "document"}
#
# 查询: iOS 相关的博客文章
# 过滤条件: {"tags": ["iOS"], "content_type": "blog"}
#
# 查询: 2024 年的 API 参考文档
# 过滤条件: {"year": 2024, "content_type": "api_reference"}python
# ================================================================
# 更灵活:自定义过滤条件提取器(支持自定义逻辑)
# ================================================================
import re
from typing import Tuple
class AdvancedSelfQueryExtractor:
"""
高级 Self-Query 提取器:结合规则 + LLM
规则层(快速)→ LLM 层(处理模糊)
"""
def __init__(self, llm: ChatOpenAI = None):
self.llm = llm or ChatOpenAI(model="gpt-4o-mini", temperature=0.1)
def extract(self, query: str) -> Dict:
"""提取过滤条件"""
# 第 1 步:规则提取(快速,零成本)
rule_filters = self._rule_extract(query)
# 第 2 步:LLM 补充(处理规则无法覆盖的情况)
if rule_filters.get("confidence", 0) < 0.7:
llm_filters = self._llm_extract(query)
# 合并结果
for k, v in llm_filters.items():
if k not in rule_filters or rule_filters[k].get("confidence", 0) < v.get("confidence", 0):
rule_filters[k] = v
return rule_filters
def _rule_extract(self, query: str) -> Dict:
"""基于规则的提取"""
filters = {}
reasons = []
# 年份提取
year_matches = re.findall(r'(\d{4})\s*(?:年|\.|\.s)?', query)
if year_matches:
years = [int(y) for y in year_matches]
if len(years) == 1:
filters["year"] = years[0]
reasons.append(f"提取年份: {years[0]}")
else:
filters["year"] = {"$gte": min(years), "$lte": max(years)}
reasons.append(f"提取年份范围: {min(years)}-{max(years)}")
# 标签/分类提取
tag_patterns = [
(r'关于\s*(Android|iOS|Web|React|Vue|Kotlin|Swift|Flutter|JVM)\s*(?:的|相关)', lambda m: m.group(1)),
(r'(\d+)\s*(?:月|month)', lambda m: f"month_{m.group(1)}"),
]
for pattern, mapper in tag_patterns:
match = re.search(pattern, query)
if match:
tag = mapper(match)
filters.setdefault("tags", []).append(tag)
reasons.append(f"提取标签: {tag}")
# 内容类型提取
type_map = {
"教程": "tutorial",
"API": "api_reference",
"参考": "api_reference",
"博客": "blog",
"文档": "document",
"手册": "manual",
"文章": "article",
"FAQ": "faq",
}
for cn_type, en_type in type_map.items():
if cn_type in query:
filters["content_type"] = en_type
reasons.append(f"提取类型: {en_type}")
return {
"filters": filters,
"reasons": reasons,
"confidence": 0.9 if reasons else 0.1,
}
def _llm_extract(self, query: str) -> Dict:
"""LLM 补充提取"""
prompt = f"""Extract metadata filter conditions from this query.
Return ONLY a JSON object with field-value pairs.
Query: "{query}"
Example output:
{
"year": 2023,
"tags": ["Android"],
"content_type": "document"
}
"""
response = self.llm.invoke(prompt)
try:
return json.loads(response.content)
except:
return {}
# 使用
advanced_extractor = AdvancedSelfQueryExtractor()
result = advanced_extractor.extract("2023 年关于 Android 的文档")
print(f"过滤条件: {result['filters']}")
print(f"理由: {result['reasons']}")
print(f"置信度: {result['confidence']}")6.3 自然语言转 DSL/SQL(Text-to-DSL/SQL)
一、Text-to-SQL 原理
当知识存储在结构化数据库中时,需要将自然语言查询转换为 SQL。
用户: "谁是项目 A 的负责人且参加过 B 会议?"
传统 RAG(向量搜索):
搜索 "项目 A 负责人 B 会议" → 召回不相关的文档 → 无法得到精确答案
Text-to-SQL:
用户查询 → LLM → SQL 查询
│
│ SELECT p.project_name, u.user_name
│ FROM projects p
│ JOIN members m ON p.id = m.project_id
│ JOIN meetings mt ON m.user_id = mt.attendee_id
│ WHERE p.project_name = '项目 A'
│ AND mt.meeting_name = 'B 会议'
│ GROUP BY p.project_name, u.user_name
│ HAVING COUNT(mt.meeting_id) >= 1
SQL → 数据库 → 精确答案二、Text-to-Cypher(图数据库查询)
当知识以图谱形式存储时(如项目关系、组织架构),需要将查询转为 Cypher。
用户: "项目 A 的负责人参加过 B 会议吗?如果参加过,还和谁一起参加过 C 会议?"
Text-to-Cypher:
MATCH (p:Project {name: '项目A'})-[:HAS_MEMBER]->(u:User)-[:ATTENDED]->(m1:Meeting {name: 'B会议'})
WITH u
MATCH (u)-[:ATTENDED]->(m2:Meeting {name: 'C会议'})-[:ATTENDED_BY]->(other:User)
RETURN m2.name, COLLECT(other.name) AS attendees三、代码实现
python
# ================================================================
# Text-to-SQL 转换器
# ================================================================
from typing import Dict, List, Optional
from langchain_openai import ChatOpenAI
from langchain_community.utilities import SQLDatabase
class TextToSQLConverter:
"""
自然语言转 SQL 转换器
核心流程:
1. LLM 理解数据库 Schema
2. LLM 生成 SQL 查询
3. 执行 SQL
4. 格式化结果
"""
def __init__(
self,
llm: ChatOpenAI = None,
db_uri: str = None,
tables_schema: str = None,
):
self.llm = llm or ChatOpenAI(model="gpt-4o-mini", temperature=0)
self.db = SQLDatabase.from_uri(db_uri) if db_uri else None
self.tables_schema = tables_schema or """
Tables in the database:
1. projects:
- id: INTEGER (主键)
- name: STRING (项目名称)
- start_date: DATE
- end_date: DATE
- status: STRING (active, completed, cancelled)
2. members:
- id: INTEGER (主键)
- project_id: INTEGER (外键,关联 projects.id)
- user_id: INTEGER (外键,关联 users.id)
- role: STRING (lead, member, reviewer)
- join_date: DATE
3. users:
- id: INTEGER (主键)
- name: STRING
- email: STRING
- department: STRING
- title: STRING
4. meetings:
- id: INTEGER (主键)
- name: STRING
- date: DATE
- organizer_id: INTEGER (外键,关联 users.id)
5. meeting_attendees:
- id: INTEGER (主键)
- meeting_id: INTEGER (外键,关联 meetings.id)
- user_id: INTEGER (外键,关联 users.id)
"""
def generate_sql(self, query: str) -> Dict:
"""
从自然语言查询生成 SQL
Returns:
{
"sql": "SELECT ...",
"table_references": ["projects", "members"],
"explanation": "简要说明",
}
"""
prompt = f"""You are a SQL expert. Convert the following natural language query into a SQL query.
Use the database schema below to understand the table structures.
Database Schema:
{self.tables_schema}
Natural Language Query: "{query}"
Rules:
- Use only the tables and columns from the schema
- Use JOIN when data is in multiple tables
- Use GROUP BY and HAVING for aggregation queries
- Use WHERE for filtering
- Use LIMIT when appropriate
- Return the SQL query, not the result
Return ONLY a JSON object:
{{
"sql": "the SQL query",
"table_references": ["table1", "table2", ...],
"explanation": "Brief explanation of the query logic"
}}
"""
response = self.llm.invoke(prompt)
try:
result = json.loads(response.content)
return result
except json.JSONDecodeError:
return {"sql": "SELECT 1", "table_references": [], "explanation": "Failed to generate"}
def execute_sql(self, sql: str) -> List:
"""执行 SQL 查询"""
if not self.db:
return []
result = self.db.run(sql)
# 解析结果
if isinstance(result, str):
return [line.split("\t") for line in result.strip().split("\n") if line]
return result
def convert_and_execute(self, query: str) -> Dict:
"""生成并执行 SQL"""
sql_result = self.generate_sql(query)
sql = sql_result["sql"]
# 安全:简单校验
if any(kw in sql.upper() for kw in ["DROP", "DELETE", "UPDATE", "INSERT"]):
return {"error": "Write operations are not allowed", "sql": sql}
results = self.execute_sql(sql)
return {
"sql": sql,
"results": results,
"explanation": sql_result["explanation"],
}
# ================================================================
# 使用示例
# ================================================================
converter = TextToSQLConverter(
db_uri="sqlite:///./company.db", # 替换为实际数据库 URI
)
result = converter.convert_and_execute("谁是项目 A 的负责人且参加过 B 会议?")
print(f"SQL: {result['sql']}")
print(f"结果: {result['results']}")
print(f"说明: {result['explanation']}")python
# ================================================================
# Text-to-Cypher 转换器(图数据库)
# ================================================================
from typing import List
class TextToCypherConverter:
"""
自然语言转 Cypher 转换器
用于 Neo4j 等图数据库
"""
def __init__(self, llm: ChatOpenAI = None):
self.llm = llm or ChatOpenAI(model="gpt-4o-mini", temperature=0)
self.graph_schema = """
Node types:
- Project: {name, start_date, end_date, status}
- User: {name, email, department, title}
- Meeting: {name, date, location}
- Skill: {name, level}
- Tool: {name, category}
Relationship types:
- (:User)-[:MEMBER_OF]->(:Project)
- (:User)-[:ATTENDED]->(:Meeting)
- (:User)-[:HAS_SKILL]->(:Skill)
- (:Project)-[:USES]->(:Tool)
- (:User)-[:MANAGES]->(:Project)
- (:Meeting)-[:ORGANIZED_BY]->(:User)
"""
def generate_cypher(self, query: str) -> Dict:
"""生成 Cypher 查询"""
prompt = f"""You are a Cypher query expert for Neo4j graph database.
Convert the following natural language query into a Cypher query.
Graph Schema:
{self.graph_schema}
Query: "{query}"
Rules:
- Match nodes and relationships based on the schema
- Use variables for nodes: (u:User), (p:Project), (m:Meeting)
- Use proper relationship syntax: (a)-[:RELATIONSHIP_TYPE]->(b)
- Return relevant data
- Use WITH for intermediate aggregation
- Use COLLECT for list aggregation
Return ONLY a JSON object:
{{
"cypher": "the Cypher query",
"node_types": ["User", "Project", ...],
"relationship_types": ["MEMBER_OF", "ATTENDED", ...],
"explanation": "Brief explanation"
}}
"""
response = self.llm.invoke(prompt)
try:
return json.loads(response.content)
except:
return {"cypher": "MATCH (n) RETURN n", "node_types": [], "relationship_types": [], "explanation": "Failed"}
def execute_cypher(self, cypher: str) -> List:
"""执行 Cypher 查询(需要 Neo4j 连接)"""
# from neo4j import GraphDatabase
# driver = GraphDatabase.driver("bolt://localhost:7687")
# with driver.session() as session:
# result = session.run(cypher)
# return [record.data() for record in result]
# 实际使用时需要接入 Neo4j 驱动
return [{"result": "Query executed"}]
def convert_and_execute(self, query: str) -> Dict:
"""生成并执行 Cypher"""
cypher_result = self.generate_cypher(query)
return {
"cypher": cypher_result["cypher"],
"node_types": cypher_result["node_types"],
"relationship_types": cypher_result["relationship_types"],
"explanation": cypher_result["explanation"],
"results": self.execute_cypher(cypher_result["cypher"]),
}
# 使用
cypher_converter = TextToCypherConverter()
result = cypher_converter.convert_and_execute("项目 A 的负责人参加过 B 会议吗?还和谁一起参加过 C 会议?")
print(f"Cypher:\n{result['cypher']}")
print(f"节点类型: {result['node_types']}")
print(f"关系类型: {result['relationship_types']}")python
# ================================================================
# 集成:结构化查询统一路由器
# ================================================================
class StructuredQueryRouter:
"""
结构化查询路由器
根据查询类型自动选择对应的转换器和查询方式
"""
def __init__(self, config: dict = None):
self.config = config or {
"enable_sql": True,
"enable_cypher": True,
"enable_self_query": True,
"sql_db_uri": "sqlite:///./company.db",
"neo4j_uri": "bolt://localhost:7687",
}
self.sql_converter = TextToSQLConverter(db_uri=self.config["sql_db_uri"]) if self.config["enable_sql"] else None
self.cypher_converter = TextToCypherConverter() if self.config["enable_cypher"] else None
self.self_query_extractor = SelfQueryExtractor()
def route_query(self, query: str) -> Dict:
"""
路由查询到合适的转换器
Returns:
{
"query_type": "sql" | "cypher" | "self_query" | "vector",
"query_converter": object,
"converted_query": str,
}
"""
# 判断查询类型
if self._is_sql_query(query):
if not self.sql_converter:
return {"query_type": "vector", "converted_query": query}
sql_result = self.sql_converter.generate_sql(query)
return {
"query_type": "sql",
"query_converter": self.sql_converter,
"converted_query": sql_result["sql"],
"explanation": sql_result["explanation"],
}
elif self._is_cypher_query(query):
if not self.cypher_converter:
return {"query_type": "vector", "converted_query": query}
cypher_result = self.cypher_converter.generate_cypher(query)
return {
"query_type": "cypher",
"query_converter": self.cypher_converter,
"converted_query": cypher_result["cypher"],
"explanation": cypher_result["explanation"],
}
elif self.config["enable_self_query"]:
filters = self.self_query_extractor.extract_filters(query)
return {
"query_type": "self_query",
"converted_query": query,
"metadata_filters": filters,
}
else:
return {"query_type": "vector", "converted_query": query}
def _is_sql_query(self, query: str) -> bool:
"""判断是否为 SQL 类型查询"""
sql_indicators = [
"谁是", "有多少人", "统计", "列表", "排名", "多少",
"count", "list", "statistics", "sum", "average",
"查询", "获取", "查看", "所有",
]
return any(ind in query.lower() for ind in sql_indicators)
def _is_cypher_query(self, query: str) -> bool:
"""判断是否为 Cypher 类型查询"""
cypher_indicators = [
"的关系", "和谁", "参加", "组织", "负责",
"谁和", "一起", "path", "relationship", "connected",
]
return any(ind in query.lower() for ind in cypher_indicators)
# 使用
router = StructuredQueryRouter()
# 不同查询的路由结果
queries = [
"项目 A 的负责人参加过 B 会议吗?", # Cypher
"统计 2024 年每个部门的项目数量", # SQL
"2023 年关于 Android 的文档", # Self-Query
"API 网关超时配置方法", # Vector
]
for q in queries:
result = router.route_query(q)
print(f"查询: {q}")
print(f"类型: {result['query_type']}")
print(f"转换后: {result.get('converted_query', '')[:80]}")
print()第七部分:检索颗粒度动态控制 ⭐ 新增
7.1 问题的本质:Chunk 的数量和质量
基础 RAG 中,Chunk 的数量是固定的(比如固定返回 5 个 chunk)。但不同查询对上下文的需求差异巨大:
简单查询: "API 网关的连接超时默认是多少?"
→ 答案可能就在 1 个 chunk 里
→ 返回 10 个 chunk = 引入噪音,浪费 token
复杂查询: "对比 A 产品和 B 产品的性能指标"
→ 答案分布在多个 chunk 中
→ 返回 3 个 chunk = 信息不全核心问题:查询的复杂度与返回 Chunk 的数量应该动态匹配。
7.2 查询复杂度评估
原理
查询复杂度 = 答案需要的上下文广度 × 答案需要的信息深度
复杂度评估特征:
1. 答案范围(广度)
- 单点: "A 的 X 是多少?" → 1-3 chunks
- 对比: "A vs B" → 5-10 chunks
- 全景: "整个系统的架构" → 15-30 chunks
2. 答案深度(深度)
- 事实: "默认值是多少?" → 浅
- 解释: "为什么这样设计?" → 中
- 教程: "如何配置?" → 深
复杂度得分 = 广度因子 × 深度因子代码实现
python
# ================================================================
# 查询复杂度评估器
# ================================================================
class QueryComplexityAnalyzer:
"""
查询复杂度分析器
评估查询的复杂度,决定:
1. 返回多少 chunks
2. 使用什么检索策略
3. 使用什么类型的检索器
"""
def __init__(self, llm: ChatOpenAI = None):
self.llm = llm or ChatOpenAI(model="gpt-4o-mini", temperature=0.1)
def analyze(self, query: str) -> Dict:
"""
分析查询复杂度
Returns:
{
"complexity_score": 0.0-1.0, # 复杂度评分
"complexity_level": "simple|medium|complex|very_complex",
"recommended_k": int, # 推荐返回的 chunk 数量
"recommended_strategy": str, # 推荐检索策略
"answer_scope": "single_point|narrow|broad|panoramic",
"answer_depth": "shallow|medium|deep",
"query_type": "fact|explanation|tutorial|comparison|synthesis",
}
"""
prompt = f"""Analyze the complexity of the following query and return recommendations.
Query: "{query}"
Evaluate:
1. Complexity score (0.0 = simple, 1.0 = very complex)
2. Complexity level: simple | medium | complex | very_complex
3. Recommended number of chunks to retrieve
4. Recommended retrieval strategy
5. Answer scope: single_point | narrow | broad | panoramic
6. Answer depth: shallow | medium | deep
7. Query type: fact | explanation | tutorial | comparison | synthesis
Rules for chunk recommendation:
- single_point + shallow: 1-3 chunks
- single_point + medium/deep: 3-5 chunks
- narrow + shallow: 3-5 chunks
- narrow + medium/deep: 5-10 chunks
- broad + medium/deep: 10-20 chunks
- panoramic + deep: 20-50 chunks
Return ONLY a JSON object:
{{
"complexity_score": float,
"complexity_level": "string",
"recommended_k": int,
"recommended_strategy": "string",
"answer_scope": "string",
"answer_depth": "string",
"query_type": "string"
}}
"""
response = self.llm.invoke(prompt)
try:
return json.loads(response.content)
except:
return self._heuristic_analyze(query)
def _heuristic_analyze(self, query: str) -> Dict:
"""启发式分析(无需 LLM,快速)"""
# 复杂度评分
score = 0.0
# 长度因子
score += min(0.3, len(query) / 200)
# 对比类查询
comparison_words = ["对比", "比较", "vs", "versus", "difference", "vs"]
if any(w in query.lower() for w in comparison_words):
score += 0.3
# "如何"类查询
how_words = ["如何", "怎么", "步骤", "教程", "how", "steps"]
if any(w in query.lower() for w in how_words):
score += 0.2
# "总结"类查询
summary_words = ["总结", "综述", "概括", "summary", "overview"]
if any(w in query.lower() for w in summary_words):
score += 0.15
# 复杂度级别
if score < 0.2:
level = "simple"
k = 3
elif score < 0.4:
level = "medium"
k = 5
elif score < 0.7:
level = "complex"
k = 10
else:
level = "very_complex"
k = 20
return {
"complexity_score": score,
"complexity_level": level,
"recommended_k": k,
"recommended_strategy": "hybrid" if score > 0.5 else "similarity",
"answer_scope": "panoramic" if any(w in query.lower() for w in ["所有", "整体", "overview"]) else "narrow",
"answer_depth": "deep" if any(w in query.lower() for w in ["详细", "如何", "教程"]) else "medium",
"query_type": "comparison" if any(w in query.lower() for w in ["对比", "比较"]) else "fact",
}
# 使用
analyzer = QueryComplexityAnalyzer()
queries = [
"API 网关的连接超时默认是多少?",
"对比 A 产品和 B 产品的性能指标",
"这个系统的全局架构是什么?",
"Gradle 报错怎么解决?",
]
for q in queries:
result = analyzer.analyze(q)
print(f"查询: {q}")
print(f" 复杂度: {result['complexity_level']} ({result['complexity_score']:.2f})")
print(f" 推荐 k: {result['recommended_k']}")
print(f" 推荐策略: {result['recommended_strategy']}")
print()7.3 动态 Chunk 数量分配
python
# ================================================================
# 动态 Chunk 检索器
# ================================================================
class DynamicChunkRetriever:
"""
动态 Chunk 数量检索器
根据查询复杂度动态决定返回多少 chunks
"""
def __init__(self, vectorstore, complexity_analyzer: QueryComplexityAnalyzer = None):
self.vectorstore = vectorstore
self.analyzer = complexity_analyzer or QueryComplexityAnalyzer()
self.min_k = 3
self.max_k = 20
self.clamping_k = True # 是否限制在 [min_k, max_k]
def retrieve(self, query: str, custom_k: int = None) -> List:
"""
检索
Args:
query: 用户查询
custom_k: 自定义返回数量(覆盖自动计算)
Returns:
{
"documents": List[Document],
"actual_k": int, # 实际返回的 chunk 数量
"complexity": Dict, # 复杂度分析结果
}
"""
# 确定 k
if custom_k:
k = custom_k
complexity = None
else:
complexity = self.analyzer.analyze(query)
k = complexity["recommended_k"]
if self.clamping_k:
k = max(self.min_k, min(self.max_k, k))
# 执行检索
if hasattr(self.vectorstore, 'similarity_search_with_score'):
results = self.vectorstore.similarity_search_with_score(query, k=k)
docs = [doc for doc, score in results]
else:
docs = self.vectorstore.similarity_search(query, k=k)
return {
"documents": docs,
"actual_k": len(docs),
"complexity": complexity,
}
# 使用
retriever = DynamicChunkRetriever(vectorstore=vector_store)
# 自动确定 k
simple_result = retriever.retrieve("API 网关的连接超时默认是多少?")
print(f"复杂度: {simple_result['complexity']['complexity_level']}")
print(f"实际 k: {simple_result['actual_k']}") # 自动设为 3
complex_result = retriever.retrieve("对比 A 产品和 B 产品的性能指标")
print(f"复杂度: {complex_result['complexity']['complexity_level']}")
print(f"实际 k: {complex_result['actual_k']}") # 自动设为 107.4 递归式摘要检索(Recursive Abstract Retrieval)
一、场景:宏观问题 vs 微观问题
微观问题(需要具体文档):
"API 网关的连接超时怎么配置?"
→ 检索具体文本 chunk ✓
宏观问题(需要文档摘要):
"总结这篇文档的核心思想"
"这个知识库有哪些主题?"
"整个系统的关键组件是什么?"
→ 检索具体文本 chunk ✗(太细碎,无法全局理解)
→ 检索文档摘要层 ✓二、原理
递归式摘要检索流程:
1. 识别宏观问题
→ "总结这篇文档的核心思想" → 宏观问题
→ "API 网关超时怎么配?" → 微观问题
2. 宏观问题:检索摘要层
文档 A 摘要: "本文档介绍了 API 网关的核心设计原则..."
文档 B 摘要: "本文档涵盖了负载均衡器的实现方案..."
文档 C 摘要: "本文档讲述了微服务架构的最佳实践..."
→ 检索这些摘要 → 找到相关的文档摘要 → 返回摘要
3. 微观问题:检索具体 chunk
→ 检索具体文本块 → 返回具体答案三、代码实现
python
# ================================================================
# 递归式摘要检索
# ================================================================
from typing import List, Dict
class RecursiveAbstractRetriever:
"""
递归式摘要检索器
核心思路:
1. 对每个文档预先生成摘要
2. 宏观问题 → 在摘要层检索
3. 微观问题 → 在具体 chunk 层检索
"""
def __init__(self, llm: ChatOpenAI = None, vectorstore=None):
self.llm = llm or ChatOpenAI(model="gpt-4o-mini", temperature=0.1)
self.vectorstore = vectorstore
self.abstract_store = {} # doc_id → abstract
def register_documents(self, documents: List[Document]):
"""注册文档并生成摘要"""
for doc in documents:
doc_id = doc.metadata.get("id", f"doc_{len(self.abstract_store)}")
if doc_id not in self.abstract_store:
# 生成文档摘要
abstract = self._generate_abstract(doc)
self.abstract_store[doc_id] = abstract
# 向量化摘要
abstracts = list(self.abstract_store.values())
abstract_ids = list(self.abstract_store.keys())
if self.vectorstore:
# 将摘要存入向量库(单独的 collection)
# 这样宏观问题可以直接在摘要层检索
pass
def _generate_abstract(self, doc: Document, max_tokens: int = 256) -> str:
"""生成文档摘要"""
content = doc.page_content
if len(content) <= max_tokens:
return content # 太短的文档直接返回原文
prompt = f"""Summarize the following document in {max_tokens // 4} words.
Focus on the core ideas and key takeaways.
Document:
{content[:max_tokens]}
Summary:"""
response = self.llm.invoke(prompt)
return response.content
def search(self, query: str, k: int = 5) -> Dict:
"""
搜索(自动选择摘要层或 chunk 层)
Returns:
{
"results": List[dict],
"search_layer": "abstract" | "chunk",
"is_abstract_query": bool,
}
"""
# 判断是否为宏观问题
abstract_indicators = [
"总结", "综述", "核心", "概括", "overall",
"key", "summary", "main", "top", "概述",
"这个知识库有哪些", "全篇", "全局",
]
is_abstract_query = any(ind in query.lower() for ind in abstract_indicators)
if is_abstract_query and self.abstract_store:
# 摘要层检索
abstract_texts = list(self.abstract_store.values())
abstract_ids = list(self.abstract_store.keys())
if self.vectorstore:
# 在摘要向量库中搜索
results = self.vectorstore.similarity_search(query, k=k)
return {
"results": [
{"layer": "abstract", "doc_id": aid, "content": self.abstract_store[aid]}
for aid, _ in zip(abstract_ids, results)
],
"search_layer": "abstract",
"is_abstract_query": True,
}
# chunk 层检索(默认)
if hasattr(self.vectorstore, 'similarity_search_with_score'):
results = self.vectorstore.similarity_search_with_score(query, k=k)
docs = [doc for doc, score in results]
else:
docs = self.vectorstore.similarity_search(query, k=k)
return {
"results": [
{"layer": "chunk", "doc_id": doc.metadata.get("id", "?"), "content": doc.page_content}
for doc in docs
],
"search_layer": "chunk",
"is_abstract_query": False,
}
# 使用
abstract_retriever = RecursiveAbstractRetriever(vectorstore=vector_store)
abstract_retriever.register_documents(all_documents)
# 宏观问题 → 摘要层
macro_result = abstract_retriever.search("总结这个知识库的核心主题")
print(f"搜索层: {macro_result['search_layer']}") # abstract
print(f"结果数: {len(macro_result['results'])}")
# 微观问题 → chunk 层
micro_result = abstract_retriever.search("API 网关超时配置")
print(f"搜索层: {micro_result['search_layer']}") # chunk第八部分:用户反馈感知的实时查询补全 ⭐ 新增
8.1 问题的本质:在用户犯错之前拦截
用户输入"安卓内存"→ 不知道系统会理解成:
- Android 内存管理?
- Android 内存泄漏?
- Android 内存优化?
- JVM 堆配置?
在用户提交前提供智能引导,比在用户提交后修正更高效。
8.2 预检索引导(Pre-retrieval Guidance / Search Suggestions)
原理
用户输入: "安卓内存"
↓
[输入事件:用户输入中]
↓
[语义理解:Android 内存相关的查询]
↓
[知识库索引搜索:找到高频相关查询]
↓
推荐提示:
┌─────── ──── ─ ─── ─ ─── ──── ─── ─ ──── ──── ──── ─ ─ ─ ─ ─── ─── ─── ─── ─ ─ ─ ─ ─── ┐
│ 📝 您可能想搜索: │ │
│ · Android 内存泄漏排查方案 │ │
│ · Android JVM 堆内存配置指南 │ │
│ · Android 内存优化最佳实践 │ │
│ · Android 14 内存管理变化 │ │
└─────── ──── ─ ─── ─ ─── ──── ─── ─ ──── ──── ──── ─ ─ ─ ─ ─── ─── ─── ─── ─ ─ ─ ─ ─── ┘代码实现
python
# ================================================================
# 预检索引导(搜索建议)
# ================================================================
class SearchSuggestionEngine:
"""
搜索建议引擎
根据用户输入的前缀,从知识库索引中推荐相关查询
"""
def __init__(
self,
llm: ChatOpenAI = None,
top_k: int = 5,
min_confidence: float = 0.5,
):
self.llm = llm or ChatOpenAI(model="gpt-4o-mini", temperature=0.3)
self.top_k = top_k
self.min_confidence = min_confidence
def get_suggestions(
self,
partial_query: str,
knowledge_base_titles: List[str] = None,
popular_queries: List[str] = None,
) -> List[Dict]:
"""
获取搜索建议
Args:
partial_query: 用户当前输入(未完成)
knowledge_base_titles: 知识库标题/主题列表
popular_queries: 热门搜索查询列表
Returns:
搜索建议列表
"""
suggestions = []
# 策略 1: 知识库标题匹配
if knowledge_base_titles:
title_matches = self._match_titles(partial_query, knowledge_base_titles)
suggestions.extend([
{"text": t, "type": "topic", "confidence": 0.8}
for t in title_matches
])
# 策略 2: 热门搜索
if popular_queries:
popular_matches = self._match_popular(partial_query, popular_queries)
suggestions.extend([
{"text": q, "type": "popular", "confidence": 0.7}
for q in popular_matches
])
# 策略 3: LLM 补全(当以上策略不够时)
if len(suggestions) < 3:
llm_suggestions = self._llm_suggest(partial_query)
suggestions.extend([
{"text": s, "type": "llm_generated", "confidence": 0.6}
for s in llm_suggestions
])
# 去重 + 排序
unique_suggestions = []
seen = set()
for s in sorted(suggestions, key=lambda x: x["confidence"], reverse=True):
if s["text"] not in seen:
seen.add(s["text"])
unique_suggestions.append(s)
if len(unique_suggestions) >= self.top_k:
break
return unique_suggestions
def _match_titles(self, partial: str, titles: List[str]) -> List[str]:
"""匹配知识库标题"""
matches = []
for title in titles:
if partial.lower() in title.lower():
matches.append(title)
return matches
def _match_popular(self, partial: str, popular: List[str]) -> List[str]:
"""匹配热门搜索"""
return [q for q in popular if partial.lower() in q.lower()]
def _llm_suggest(self, partial: str) -> List[str]:
"""LLM 补全查询"""
prompt = f"""Complete this partial search query with 3 likely full queries.
The user is typing: "{partial}"
Provide 3 completions that would be useful for a developer looking up technical documentation.
Each completion should be a realistic search query.
Return ONLY a JSON array:
["completion1", "completion2", "completion3"]
"""
response = self.llm.invoke(prompt)
try:
return json.loads(response.content)
except:
return [partial + " 的配置方法", partial + " 的最佳实践", partial + " 常见问题"]
# 使用
suggestion_engine = SearchSuggestionEngine(top_k=5)
# 知识库标题
kb_titles = [
"Android 内存泄漏排查方案",
"Android JVM 堆内存配置指南",
"Android 内存优化最佳实践",
"Android 14 内存管理变化",
"Android 内存分配机制详解",
]
# 热门搜索
popular_queries = [
"Android 内存泄漏",
"Android 堆内存配置",
"Android GC 调优",
"Android 14 新特性",
"Android 性能优化",
]
# 获取建议
suggestions = suggestion_engine.get_suggestions(
partial_query="安卓内存",
knowledge_base_titles=kb_titles,
popular_queries=popular_queries,
)
for s in suggestions:
print(f" · {s['text']} [{s['type']} (confidence: {s['confidence']})]")
# 输出:
# · Android 内存泄漏排查方案 [topic (confidence: 0.8)]
# · Android JVM 堆内存配置指南 [topic (confidence: 0.8)]
# · Android 内存优化最佳实践 [topic (confidence: 0.8)]
# · Android 内存泄漏 [popular (confidence: 0.7)]
# · Android 堆内存配置 [popular (confidence: 0.7)]8.3 意图澄清(Clarification Questions)
原理
用户输入: "Gradle" ← 歧义:可能指多个主题
系统判断: 查询过于模糊(分类为"歧义")
系统输出:
┌────── ──── ─── ──── ─── ──── ── ─── ─ ─── ─── ─── ─── ─ ─── ─── ─── ─── ─ ──── ┐
│ "Gradle" 可以指多个话题。您是想了解: │
│ │
│ 1. Gradle 版本兼容性问题? │
│ 2. Gradle 构建加速优化? │
│ 3. Gradle Kotlin DSL 配置? │
│ 4. Gradle 依赖管理? │
│ 5. 其他... │
│ │
│ 请选择或输入您想了解的 Gradle 话题: │
└────── ──── ─── ──── ─── ──── ── ─── ─ ─── ─── ─── ─── ─ ─── ─── ─── ─── ─ ──── ┘代码实现
python
# ================================================================
# 意图澄清器
# ================================================================
class ClarificationEngine:
"""
意图澄清引擎
当查询极度模糊时,主动生成澄清问题
"""
def __init__(self, llm: ChatOpenAI = None, max_questions: int = 4):
self.llm = llm or ChatOpenAI(model="gpt-4o-mini", temperature=0.3)
self.max_questions = max_questions
def check_ambiguity(self, query: str, knowledge_base: List[str] = None) -> Dict:
"""
检查查询的歧义程度
Returns:
{
"is_ambiguous": bool,
"ambiguity_level": "none|low|medium|high",
"possible_intents": List[str], # 可能的意图
"should_clarify": bool, # 是否需要澄清
"clarification_question": str, # 澄清问题(如果需要)
"clarification_options": List[str], # 澄清选项
}
"""
# 歧义判断规则
ambiguity_score = 0.0
# 规则 1: 单字词
if len(query.strip()) <= 3:
ambiguity_score += 0.4
# 规则 2: 没有谓语/动词
has_verb = any(w in query for w in ["怎么", "如何", "配置", "安装", "使用", "解决"])
if not has_verb:
ambiguity_score += 0.3
# 规则 3: 没有限定词
has_modifier = any(w in query for w in ["的", "关于", "的文档", "教程"])
if not has_modifier:
ambiguity_score += 0.2
# 规则 4: 知识库中有多篇相关文档
if knowledge_base:
related_docs = [doc for doc in knowledge_base if any(w in query.lower() for w in doc.lower().split())]
if len(related_docs) > 3:
ambiguity_score += 0.1
# 歧义级别
if ambiguity_score >= 0.8:
level = "high"
elif ambiguity_score >= 0.5:
level = "medium"
elif ambiguity_score >= 0.3:
level = "low"
else:
level = "none"
should_clarify = level in ["high", "medium"]
if should_clarify:
clarification = self._generate_clarification(query, level)
else:
clarification = {
"clarification_question": "",
"clarification_options": [],
}
# 提取可能意图
intents = self._extract_intents(query, knowledge_base)
return {
"is_ambiguous": level != "none",
"ambiguity_level": level,
"possible_intents": intents,
"should_clarify": should_clarify,
"ambiguity_score": ambiguity_score,
**clarification,
}
def _generate_clarification(self, query: str, level: str) -> Dict:
"""生成澄清问题"""
prompt = f"""The user asked about "{query}" which is ambiguous.
Generate a clarification question and 3-4 specific options.
Requirements:
- The clarification question should be friendly and helpful
- Each option should be a specific, answerable topic
- Keep the options concise (under 20 characters each)
- Include an "Other..." option as the last choice
Format:
{{
"question": "您是想了解关于 '{query}' 的哪些方面?",
"options": ["选项 1", "选项 2", "选项 3", "其他..."]
}}
"""
response = self.llm.invoke(prompt)
try:
return json.loads(response.content)
except:
return {
"question": f"关于 '{query}',您具体想了解什么?",
"options": [f"{query} 的配置", f"{query} 的最佳实践", f"{query} 常见问题", "其他..."],
}
def _extract_intents(self, query: str, knowledge_base: List[str] = None) -> List[str]:
"""从知识库中提取可能意图"""
if not knowledge_base:
return [query]
intents = []
for doc in knowledge_base:
# 简单关键词匹配
query_words = set(query.lower().split())
doc_words = set(doc.lower().split())
if query_words & doc_words: # 有交集
# 提取标题作为意图
title = doc.split()[0] if doc else doc
intents.append(title)
return intents[:self.max_questions] if intents else [query]
# 使用
clarifier = ClarificationEngine()
# 测试不同查询
queries = [
"Gradle",
"内存",
"API 网关",
"Android 内存泄漏排查方案", # 完整查询
]
for q in queries:
result = clarifier.check_ambiguity(q)
print(f"查询: {q}")
print(f" 歧义程度: {result['ambiguity_level']} (score: {result['ambiguity_score']:.2f})")
print(f" 需要澄清: {result['should_clarify']}")
if result['should_clarify']:
print(f" 澄清问题: {result['question']}")
print(f" 选项: {result['options']}")
print()python
# ================================================================
# 前端交互组件(React 示例)
# ================================================================
"""
React 前端组件示例:
<SearchBoxWithSuggestions>
- 输入时实时显示搜索建议
- 歧义时显示澄清对话框
- 点击建议直接搜索
- 输入完成后可预览
</SearchBoxWithSuggestions>
"""
# (此处为前端伪代码,实际实现根据框架而定)第九部分:端到端代码实现
完整查询优化管线(v3.0)
python
"""
===== 完整查询优化管线 v3.0 =====
整合:
1. 多轮对话
2. 多模态查询
3. RAG 必要性分类
4. 查询优化(改写/分解/HyDE)
5. 结构化查询转换(Text-to-SQL/Text-to-Cypher/Self-Query)
6. 动态 Chunk 粒度控制
7. 递归摘要检索
8. 搜索建议 + 意图澄清
9. 自修正检索
"""
class UniversalQueryOptimizer:
"""统一查询优化器(v3.0 完整版)"""
def __init__(self, config: dict = None):
# 基础组件
self.llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.1)
self.conv_manager = ConversationStateManager()
self.rag_classifier = RAGNecessityClassifier()
self.query_analyzer = QueryComplexityAnalyzer()
self.hallucination_detector = HallucinationDetector()
self.suggestion_engine = SearchSuggestionEngine()
self.clarification_engine = ClarificationEngine()
# 查询转换器
self.sql_converter = TextToSQLConverter()
self.cypher_converter = TextToCypherConverter()
self.self_query_extractor = SelfQueryExtractor()
self.abstract_retriever = RecursiveAbstractRetriever()
# 检索器(注入)
self.vector_retriever = None
self.sql_retriever = None
def get_suggestions(self, partial_query: str) -> List[Dict]:
"""获取搜索建议(前端实时调用)"""
return self.suggestion_engine.get_suggestions(partial_query)
def check_clarification(self, query: str) -> Dict:
"""检查是否需要澄清"""
return self.clarification_engine.check_ambiguity(query)
def handle_query(self, query: str) -> Dict:
"""处理查询的完整流程"""
# 1. 对话状态管理
resolved = self.conv_manager.resolve_coreference(query)
final_query = resolved["resolved_query"]
# 2. RAG 必要性分类
rag_result = self.rag_classifier.classify(final_query)
# 3. 查询复杂度分析
complexity = self.query_analyzer.analyze(final_query)
# 4. 查询路由
if self._is_structured_query(final_query):
route_result = self._route_structured(final_query)
else:
route_result = self._route_vector(final_query, complexity)
# 5. 生成回答
answer = self._generate(final_query, route_result)
# 6. 幻觉检测
hall_result = self.hallucination_detector.detect(final_query, route_result.get("documents", []), answer)
# 7. 更新对话历史
self.conv_manager.add_message("user", query)
self.conv_manager.add_message("assistant", answer)
return {
"answer": answer,
"resolved_query": final_query,
"rag_needed": rag_result["should_use_rag"],
"complexity": complexity,
"route": route_result.get("route_type", "vector"),
"hallucination_check": hall_result,
}
def _route_structured(self, query: str) -> Dict:
"""路由到结构化查询"""
if self.sql_converter:
sql_result = self.sql_converter.generate_sql(query)
return {
"route_type": "sql",
"sql": sql_result["sql"],
"documents": self._execute_sql(sql_result["sql"]),
}
return {"route_type": "vector", "documents": []}
def _route_vector(self, query: str, complexity: Dict) -> Dict:
"""路由到向量检索"""
k = complexity.get("recommended_k", 5)
if complexity.get("is_abstract_query", False):
results = self.abstract_retriever.search(query, k=k)
else:
if self.vector_retriever:
docs = self.vector_retriever.invoke(query)
else:
docs = []
results = {"results": docs, "search_layer": "chunk"}
return {
"route_type": "vector",
"documents": results["results"],
"search_layer": results.get("search_layer", "chunk"),
}
# 使用
optimizer = UniversalQueryOptimizer()
# 获取建议
suggestions = optimizer.get_suggestions("安卓内存")
for s in suggestions:
print(f"建议: {s['text']}")
# 检查澄清
clarify = optimizer.check_clarification("Gradle")
if clarify["should_clarify"]:
print(f"澄清问题: {clarify['question']}")
# 处理查询
result = optimizer.handle_query("2023 年关于 Android 的文档")
print(f"答案: {result['answer'][:200]}")第十部分:效果对比与选型决策
各维度效果对比
| 维度 | Recall 提升 | 成本影响 | 推荐度 | |------|----------|--|--|------| | 自查询(元数据过滤) | Recall+35% | 低(LLM 调用) | ★★★★★ | | Text-to-SQL | Recall+50% | 中(需数据库) | ★★★★☆ | | Text-to-Cypher | Recall+45% | 中(需图数据库) | ★★★★☆ | | 动态 Chunk 数量 | Recall+20% | 零(仅改变 k) | ★★★★★ | | 递归摘要检索 | Recall+30%(宏观问题) | 中(预计算摘要) | ★★★★☆ | | 搜索建议 | 用户体验+50% | 低(缓存命中) | ★★★★★ | | 意图澄清 | 减少错误查询 40% | 低(澄清 LLM) | ★★★★☆ |
附录:技术选型速查表
| 维度 | 推荐实现 | 轻量实现 | 适用场景 |
|---|---|---|---|
| Self-Query | LangChain SelfQueryRetriever | 自定义 LLM 提取 | 元数据过滤 |
| Text-to-SQL | LangChain SQLDatabase + LLM | 自定义 LLM | 结构化数据库 |
| Text-to-Cypher | Neo4j LLM Cypher 生成 | 自定义 LLM | 图数据库 |
| 动态 Chunk | QueryComplexityAnalyzer | 启发式规则 | 所有场景 |
| 递归摘要 | RecursiveAbstractRetriever | 文档摘要缓存 | 宏观问题 |
| 搜索建议 | SearchSuggestionEngine | 关键词匹配 | 前端 UX |
| 意图澄清 | ClarificationEngine | 规则-based | 模糊查询 |
核心原则:查询优化的最高境界是"无感优化"——用户不需要知道系统背后做了多少工作,只需要感受到每次回答都精准、快速、可信。