<?xml version="1.0" encoding="utf-8"?><?xml-stylesheet type="text/xsl" href="rss.xsl"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/">
    <channel>
        <title>James' Blog Blog</title>
        <link>https://jameszbl.github.io/</link>
        <description>James' Blog Blog</description>
        <lastBuildDate>Fri, 03 Apr 2026 00:00:00 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>https://github.com/jpmonette/feed</generator>
        <language>zh-Hans</language>
        <item>
            <title><![CDATA[The Evolution of an AI Reply System for Product Reviews]]></title>
            <link>https://jameszbl.github.io/feedback-reply-rag-history</link>
            <guid>https://jameszbl.github.io/feedback-reply-rag-history</guid>
            <pubDate>Fri, 03 Apr 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[]]></description>
            <content:encoded><![CDATA[
<h1>Stage 1: Rules + Prompt</h1>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="业务阶段">Business stage<a href="https://jameszbl.github.io/feedback-reply-rag-history#%E4%B8%9A%E5%8A%A1%E9%98%B6%E6%AE%B5" class="hash-link" aria-label="Direct link to Business stage" title="Direct link to Business stage" translate="no">​</a></h3>
<p>At first the daily review volume was low, and the operations team cared mainly about reply efficiency.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="方案">Approach<a href="https://jameszbl.github.io/feedback-reply-rag-history#%E6%96%B9%E6%A1%88" class="hash-link" aria-label="Direct link to Approach" title="Direct link to Approach" translate="no">​</a></h3>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Sentiment classification</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Prompt template</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">LLM generates reply</span><br></div></code></pre></div></div>
<p>For example:</p>
<p>User review:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">The apples are delicious</span><br></div></code></pre></div></div>
<p>Prompt:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">You are an e-commerce customer service agent. Please reply politely to the user's positive review.</span><br></div></code></pre></div></div>
<p>Generated reply:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Thank you for your support. We look forward to your next purchase.</span><br></div></code></pre></div></div>
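<p>The flow above can be sketched in a few lines of Python. This is only a minimal illustration: the keyword list stands in for a real sentiment classifier, and the names <code>PROMPT_TEMPLATES</code>, <code>NEGATIVE_WORDS</code>, <code>classify_sentiment</code> and <code>build_prompt</code> are hypothetical, not taken from the actual system.</p>

```python
# Hypothetical prompt templates keyed by sentiment label.
PROMPT_TEMPLATES = {
    "positive": (
        "You are an e-commerce customer service agent. "
        "Politely reply to this positive review: {review}"
    ),
    "negative": (
        "You are an e-commerce customer service agent. "
        "Apologize and respond to this negative review: {review}"
    ),
}

# Toy keyword list (an assumption); it stands in for a real sentiment classifier.
NEGATIVE_WORDS = ("not sweet", "bad", "slow", "rotten")


def classify_sentiment(review: str) -> str:
    """Return "negative" if any negative keyword appears, else "positive"."""
    text = review.lower()
    return "negative" if any(word in text for word in NEGATIVE_WORDS) else "positive"


def build_prompt(review: str) -> str:
    """Pick the template for the review's sentiment and fill in the review text."""
    template = PROMPT_TEMPLATES[classify_sentiment(review)]
    return template.format(review=review)


# The resulting string is what would be sent to the LLM.
print(build_prompt("The apples are delicious"))
```

<p>Because the template is chosen only by a coarse sentiment label, every review with the same label gets the same instructions.</p>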
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="遇到的问题">Problems encountered<a href="https://jameszbl.github.io/feedback-reply-rag-history#%E9%81%87%E5%88%B0%E7%9A%84%E9%97%AE%E9%A2%98" class="hash-link" aria-label="Direct link to Problems encountered" title="Direct link to Problems encountered" translate="no">​</a></h3>
<p>As review volume grew, mismatched replies started to appear. For example, given the review:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">The strawberries aren't sweet</span><br></div></code></pre></div></div>
<p>the model might reply:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Thank you for your support</span><br></div></code></pre></div></div>
<p>or</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Sweetness of fruit varies from person to person</span><br></div></code></pre></div></div>
<p>Both clearly fall short of what the operations team expected.</p>
<hr>
<h1>Stage 2: Intent Recognition + Knowledge Base Matching</h1>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="业务阶段-1">Business stage<a href="https://jameszbl.github.io/feedback-reply-rag-history#%E4%B8%9A%E5%8A%A1%E9%98%B6%E6%AE%B5-1" class="hash-link" aria-label="Direct link to Business stage" title="Direct link to Business stage" translate="no">​</a></h3>
<p>Negative reviews started to increase.</p>
<p>The operations team wanted:</p>
<p>Different problems should follow different reply logic.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="方案-1">Approach<a href="https://jameszbl.github.io/feedback-reply-rag-history#%E6%96%B9%E6%A1%88-1" class="hash-link" aria-label="Direct link to Approach" title="Direct link to Approach" translate="no">​</a></h3>
<p>Add intent classification:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Sentiment classification</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Intent recognition</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ├─Logistics issue</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ├─Product quality</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ├─Packaging issue</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ├─After-sales issue</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Knowledge base matching</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Prompt assembly</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">LLM</span><br></div></code></pre></div></div>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">The delivery was too slow</span><br></div></code></pre></div></div>
<p>Recognized intent:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Logistics issue</span><br></div></code></pre></div></div>
<p>Matched knowledge:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Logistics FAQ</span><br></div></code></pre></div></div>
<p>Prompt:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Reply to the user with reference to the following logistics notes:</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">......</span><br></div></code></pre></div></div>
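<p>A rough sketch of this stage, assuming keyword rules for intent recognition and one knowledge entry per intent. The keyword lists, the placeholder knowledge text and the function names are all hypothetical, chosen only to make the pipeline concrete.</p>

```python
# Hypothetical keyword rules per intent; a real system might use a trained classifier.
INTENT_KEYWORDS = {
    "logistics": ("delivery", "shipping", "courier"),
    "product_quality": ("sweet", "sour", "rotten"),
    "packaging": ("box", "packaging"),
    "after_sales": ("refund", "return"),
}

# Hypothetical knowledge base: one placeholder entry per intent.
KNOWLEDGE_BASE = {
    "logistics": "(placeholder logistics notes)",
    "product_quality": "(placeholder product quality notes)",
    "packaging": "(placeholder packaging notes)",
    "after_sales": "(placeholder after-sales notes)",
}


def recognize_intent(review: str) -> str:
    """Return the first intent whose keywords appear in the review, else "other"."""
    text = review.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return intent
    return "other"


def assemble_prompt(review: str) -> str:
    """Look up the knowledge entry for the intent and splice it into the prompt."""
    reference = KNOWLEDGE_BASE.get(recognize_intent(review), "")
    return (
        "Reply to the user with reference to the following notes:\n"
        f"{reference}\n\nReview: {review}"
    )


print(assemble_prompt("The delivery was too slow"))
```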
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="遇到的问题-1">Problems encountered<a href="https://jameszbl.github.io/feedback-reply-rag-history#%E9%81%87%E5%88%B0%E7%9A%84%E9%97%AE%E9%A2%98-1" class="hash-link" aria-label="Direct link to Problems encountered" title="Direct link to Problems encountered" translate="no">​</a></h3>
<p>The knowledge base kept growing:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">500+</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">1000+</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">5000+</span><br></div></code></pre></div></div>
<p>Keyword matching started to break down.</p>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">It was already rotten when it arrived</span><br></div></code></pre></div></div>
<p>might hit:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Fruit not sweet</span><br></div></code></pre></div></div>
<p>instead of:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Damaged in transit</span><br></div></code></pre></div></div>
<hr>
<h1>Stage 3: Vector Retrieval RAG</h1>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="业务阶段-2">Business stage<a href="https://jameszbl.github.io/feedback-reply-rag-history#%E4%B8%9A%E5%8A%A1%E9%98%B6%E6%AE%B5-2" class="hash-link" aria-label="Direct link to Business stage" title="Direct link to Business stage" translate="no">​</a></h3>
<p>The number of products grew quickly.</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Fruit</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Snacks</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Alcohol and beverages</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Fresh food</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Home appliances</span><br></div></code></pre></div></div>
<p>The knowledge base reached several thousand entries.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="方案-2">Approach<a href="https://jameszbl.github.io/feedback-reply-rag-history#%E6%96%B9%E6%A1%88-2" class="hash-link" aria-label="Direct link to Approach" title="Direct link to Approach" translate="no">​</a></h3>
<p>Introduce embeddings.</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Embedding</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Vector Search</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Top-K recall</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Prompt</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">LLM</span><br></div></code></pre></div></div>
<p>For example:</p>
<p>Review:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">The lychees were all black when I opened them</span><br></div></code></pre></div></div>
<p>Vector retrieval returns:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Handling guidelines for spoiled fruit</span><br></div></code></pre></div></div>
<p>instead of:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Lychee product description</span><br></div></code></pre></div></div>
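<p>The retrieval step can be illustrated with plain cosine similarity over a handful of vectors. The 3-dimensional vectors below are made up for the example, since real embeddings come from a model and have far more dimensions. The document IDs and the function names are assumptions, and the sketch assumes no vector is all zeros.</p>

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return the IDs of the k documents most similar to the query vector."""
    scored = [
        (cosine_similarity(query_vec, vec), doc_id)
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


# Made-up vectors standing in for real embeddings of knowledge base entries.
DOC_VECS = {
    "fruit_spoilage_policy": [0.9, 0.1, 0.0],
    "lychee_product_intro": [0.1, 0.9, 0.1],
    "logistics_faq": [0.0, 0.2, 0.9],
}

# Pretend embedding of "The lychees were all black when I opened them".
query = [0.8, 0.2, 0.1]
print(top_k(query, DOC_VECS, k=2))
```

<p>In production a vector database performs this search with an approximate index instead of scanning every entry.</p>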
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="收益">Benefits<a href="https://jameszbl.github.io/feedback-reply-rag-history#%E6%94%B6%E7%9B%8A" class="hash-link" aria-label="Direct link to Benefits" title="Direct link to Benefits" translate="no">​</a></h3>
<p>Recall improved noticeably.</p>
<p>The operations team began to find the quality of AI replies acceptable.</p>
<hr>
<h1>Stage 4: Hybrid Search</h1>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="业务阶段-3">Business stage<a href="https://jameszbl.github.io/feedback-reply-rag-history#%E4%B8%9A%E5%8A%A1%E9%98%B6%E6%AE%B5-3" class="hash-link" aria-label="Direct link to Business stage" title="Direct link to Business stage" translate="no">​</a></h3>
<p>We found that vector retrieval is not a cure-all.</p>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Model A123</span><br></div></code></pre></div></div>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Order 998812</span><br></div></code></pre></div></div>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Shine Muscat grapes</span><br></div></code></pre></div></div>
<p>For exact identifiers and names like these (model numbers, order numbers, product names), embeddings actually perform worse.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="方案-3">Approach<a href="https://jameszbl.github.io/feedback-reply-rag-history#%E6%96%B9%E6%A1%88-3" class="hash-link" aria-label="Direct link to Approach" title="Direct link to Approach" translate="no">​</a></h3>
<p>Use Hybrid Search: run keyword retrieval and vector retrieval side by side and merge the results.</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">User review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">     ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ┌─────────┐</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> │Keyword  │</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> └─────────┘</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">     ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">      Merge</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">     ↑</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ┌─────────┐</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> │Vector   │</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> └─────────┘</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">     ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">   Recall</span><br></div></code></pre></div></div>
<p>A common implementation:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">ES BM25</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">+</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Qdrant</span><br></div></code></pre></div></div>
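<p>The post does not say how the Merge step combines the two result lists. One widely used option is Reciprocal Rank Fusion (RRF), sketched below as an assumption rather than as the method this system uses. The document IDs are invented, and <code>k=60</code> is simply the constant conventionally used with RRF.</p>

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one list.

    Each document scores 1 / (k + rank) for every list it appears in,
    so documents ranked well by both retrievers rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Invented results: the keyword side matches the exact model number,
# the vector side matches by meaning.
keyword_hits = ["a123_manual", "refund_rules", "shipping_faq"]
vector_hits = ["appliance_overview", "a123_manual", "refund_rules"]

print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
```

<p>RRF works on ranks alone, which sidesteps the fact that BM25 scores and vector similarities are on different scales.</p>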
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="效果">Results<a href="https://jameszbl.github.io/feedback-reply-rag-history#%E6%95%88%E6%9E%9C" class="hash-link" aria-label="Direct link to Results" title="Direct link to Results" translate="no">​</a></h3>
<p>Recall for product model numbers and brand names improved substantially.</p>
<hr>
<h1>Stage 5: ReRank</h1>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="业务阶段-4">Business stage<a href="https://jameszbl.github.io/feedback-reply-rag-history#%E4%B8%9A%E5%8A%A1%E9%98%B6%E6%AE%B5-4" class="hash-link" aria-label="Direct link to Business stage" title="Direct link to Business stage" translate="no">​</a></h3>
<p>Retrieval returned more and more candidates.</p>
<p>Within the top 10:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Relevant</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Relevant</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Irrelevant</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Relevant</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Irrelevant</span><br></div></code></pre></div></div>
<p>Noise got mixed into the prompt.</p>
<p>This lowered reply quality.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="方案-4">Approach<a href="https://jameszbl.github.io/feedback-reply-rag-history#%E6%96%B9%E6%A1%88-4" class="hash-link" aria-label="Direct link to Approach" title="Direct link to Approach" translate="no">​</a></h3>
<p>Add ReRank.</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Query</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Recall Top50</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Cross Encoder</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">ReRank</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Top5</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">LLM</span><br></div></code></pre></div></div>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">The strawberries tasted sour when they arrived</span><br></div></code></pre></div></div>
<p>Recalled:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Spoiled fruit</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Fruit ripeness</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Delivery time</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Refund rules</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">...</span><br></div></code></pre></div></div>
<p>After ReRank:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Handling guidelines for spoiled fruit</span><br></div></code></pre></div></div>
<p>is ranked first.</p>
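<p>The recall-then-rerank shape can be sketched as follows. A real cross encoder scores each (query, passage) pair with a model. To keep the example runnable offline, a toy word-overlap score is substituted for it here, so <code>overlap_score</code> is explicitly a stand-in and all names and candidate texts are hypothetical.</p>

```python
def overlap_score(query: str, passage: str) -> float:
    """Toy relevance score: share of query words that also appear in the passage.

    Stand-in for a cross encoder, which would score the pair with a model.
    """
    query_words = set(query.lower().split())
    passage_words = set(passage.lower().split())
    return len(query_words.intersection(passage_words)) / max(len(query_words), 1)


def rerank(query, candidates, score_fn=overlap_score, top_n=5):
    """Score every recalled candidate against the query and keep the best top_n."""
    ranked = sorted(
        candidates,
        key=lambda passage: score_fn(query, passage),
        reverse=True,
    )
    return ranked[:top_n]


# Invented recall results for one review.
candidates = [
    "logistics delivery time",
    "fruit ripeness guide",
    "spoiled fruit handling: strawberries that turned sour",
    "refund rules",
]

print(rerank("strawberries turned sour on arrival", candidates, top_n=2))
```

<p>Swapping in a real model only requires passing a different <code>score_fn</code>. The recall stage stays cheap and broad, and the expensive scoring runs only on the recalled candidates.</p>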
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="效果-1">Results<a href="https://jameszbl.github.io/feedback-reply-rag-history#%E6%95%88%E6%9E%9C-1" class="hash-link" aria-label="Direct link to Results" title="Direct link to Results" translate="no">​</a></h3>
<p>Accuracy improved further.</p>
<hr>
<h1>Stage 6: Query Rewrite</h1>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="业务阶段-5">Business stage<a href="https://jameszbl.github.io/feedback-reply-rag-history#%E4%B8%9A%E5%8A%A1%E9%98%B6%E6%AE%B5-5" class="hash-link" aria-label="Direct link to Business stage" title="Direct link to Business stage" translate="no">​</a></h3>
<p>User reviews became increasingly colloquial.</p>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">No good</span><br></div></code></pre></div></div>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Bad</span><br></div></code></pre></div></div>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">So-so</span><br></div></code></pre></div></div>
<p>Retrieval performed poorly on reviews like these.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="方案-5">Approach<a href="https://jameszbl.github.io/feedback-reply-rag-history#%E6%96%B9%E6%A1%88-5" class="hash-link" aria-label="Direct link to Approach" title="Direct link to Approach" translate="no">​</a></h3>
<p>Add Query Rewrite.</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Rewrite</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Retrieval</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Reply</span><br></div></code></pre></div></div>
<p>For example:</p>
<p>Original review:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Not sweet</span><br></div></code></pre></div></div>
<p>Rewritten:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">The user reports that the fruit is not sweet enough</span><br></div></code></pre></div></div>
<p>Original review:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">All gone bad</span><br></div></code></pre></div></div>
<p>Rewritten:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">The user reports that the received product was spoiled</span><br></div></code></pre></div></div>
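<p>One way to implement the rewrite step is to ask an LLM to expand the terse review before retrieval. The sketch below assumes that approach. The prompt wording, the <code>category</code> parameter and the function names are hypothetical, and <code>fake_llm</code> is a stub that returns a canned answer so the example runs without a model.</p>

```python
# Hypothetical rewrite prompt; the product category gives the model missing context.
REWRITE_PROMPT = (
    "Rewrite the following product review as one complete sentence that states "
    "the user's issue explicitly. Product category: {category}. Review: {review}"
)


def rewrite_query(review, category, llm):
    """Expand a terse review into a retrieval-friendly query.

    `llm` is any callable that takes a prompt string and returns the model's text.
    """
    prompt = REWRITE_PROMPT.format(category=category, review=review)
    return llm(prompt).strip()


def fake_llm(prompt):
    # Stub standing in for a real model call; always returns the same canned text.
    return " The user reports that the fruit is not sweet enough. "


print(rewrite_query("Not sweet", "fruit", fake_llm))
```

<p>The rewritten sentence, not the original two-word review, is then used as the retrieval query.</p>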
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="效果-2">Results<a href="https://jameszbl.github.io/feedback-reply-rag-history#%E6%95%88%E6%9E%9C-2" class="hash-link" aria-label="Direct link to Results" title="Direct link to Results" translate="no">​</a></h3>
<p>Recall accuracy improved.</p>
<hr>
<h1>Stage 7: Online Evaluation Loop</h1>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="业务阶段-6">Business Stage<a href="https://jameszbl.github.io/feedback-reply-rag-history#%E4%B8%9A%E5%8A%A1%E9%98%B6%E6%AE%B5-6" class="hash-link" aria-label="Direct link to Business Stage" title="Direct link to Business Stage" translate="no">​</a></h3>
<p>The operations team started asking:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Are the AI replies actually effective?</span><br></div></code></pre></div></div>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="方案-6">Solution<a href="https://jameszbl.github.io/feedback-reply-rag-history#%E6%96%B9%E6%A1%88-6" class="hash-link" aria-label="Direct link to Solution" title="Direct link to Solution" translate="no">​</a></h3>
<p>Build an evaluation system.</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">User review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">AI reply</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Human review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">User feedback</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Outcome statistics</span><br></div></code></pre></div></div>
<p>Metrics:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Knowledge hit rate</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Retrieval accuracy</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Review pass rate</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Human edit rate</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">User satisfaction</span><br></div></code></pre></div></div>
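<p>As a rough illustration of how such rates could be computed from reviewed replies, the sketch below assumes each reply is logged as a hypothetical <code>ReplyOutcome</code> record with three boolean flags. The record, its field names, and the exact rate definitions are assumptions made for illustration; the original system's definitions are not given here.</p>

```java
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical per-reply record; the field names are illustrative, not from the original system. */
record ReplyOutcome(boolean knowledgeHit, boolean approved, boolean editedByHuman) {}

class ReplyMetrics {
    /** Share of outcomes matching the predicate; 0.0 for an empty sample. */
    static double rate(List<ReplyOutcome> outcomes, Predicate<ReplyOutcome> matches) {
        if (outcomes.isEmpty()) {
            return 0.0;
        }
        long hits = outcomes.stream().filter(matches).count();
        return (double) hits / outcomes.size();
    }

    public static void main(String[] args) {
        List<ReplyOutcome> sample = List.of(
                new ReplyOutcome(true, true, false),
                new ReplyOutcome(true, true, true),
                new ReplyOutcome(false, false, true),
                new ReplyOutcome(true, true, false));
        System.out.printf("knowledge hit rate: %.2f%n", rate(sample, ReplyOutcome::knowledgeHit));
        System.out.printf("review pass rate:   %.2f%n", rate(sample, ReplyOutcome::approved));
        System.out.printf("human edit rate:    %.2f%n", rate(sample, ReplyOutcome::editedByHuman));
    }
}
```

<p>User satisfaction and retrieval accuracy need labelled data or user signals, so they are left out of this sketch.</p>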
<hr>
<h1>Final Enterprise Architecture</h1>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">User review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Sentiment analysis</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Intent recognition</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Query Rewrite</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Router</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Vector</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">ReRank</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Prompt Builder</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">LLM</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">AI reply</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Human review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Feedback loop</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">RAG evaluation platform</span><br></div></code></pre></div></div>]]></content:encoded>
            <category>ai</category>
            <category>java</category>
        </item>
        <item>
            <title><![CDATA[Building an Enterprise AI Platform from 0 to 1]]></title>
            <link>https://jameszbl.github.io/enterprise-ai-platform</link>
            <guid>https://jameszbl.github.io/enterprise-ai-platform</guid>
            <pubDate>Sat, 28 Mar 2026 00:00:00 GMT</pubDate>
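            <description><![CDATA[An overall approach to building an enterprise AI platform from 0 to 1, drawing on technical approaches common in enterprise AI application projects.]]></description>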
            <content:encoded><![CDATA[
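<p>As large models grow more capable, more and more enterprises are extending AI from isolated pilot projects to multiple business scenarios. From early intelligent Q&amp;A and knowledge base assistants to customer service Agents, operations assistants, and content generation platforms, an enterprise often ends up running several AI applications at once. As the number of applications grows, a new problem emerges: if every team integrates models, maintains Prompts, and builds knowledge bases on its own, the technology stack becomes increasingly fragmented, costs keep rising, and later governance becomes very difficult.</p>
<p>Therefore, for enterprises that have already reached the AI application rollout stage, building a unified AI platform is often more important than developing any single AI application. The platform's goal is not to replace business teams, but to provide unified model, knowledge base, Agent, and monitoring and governance capabilities, so that business teams can focus on business innovation instead of rebuilding the underlying infrastructure.</p>
<p>Drawing on technical approaches commonly seen when building enterprise AI applications, this article outlines an overall approach to building an enterprise AI platform from 0 to 1.</p>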
<hr>
<h1>I. Overall Architecture of the Enterprise AI Platform</h1>
<p>An enterprise AI platform is essentially a layered architecture. Its core goal is to decouple business logic from AI capabilities, so that model upgrades, knowledge updates, and Agent capability extensions do not affect the business applications above.</p>
<p>The overall architecture is as follows:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Application layer</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Agent layer</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Workflow layer</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">RAG layer</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Model layer</span><br></div></code></pre></div></div>
<p>Each layer has its own responsibilities.</p>
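<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="model层">Model Layer<a href="https://jameszbl.github.io/enterprise-ai-platform#model%E5%B1%82" class="hash-link" aria-label="Direct link to Model Layer" title="Direct link to Model Layer" translate="no">​</a></h2>
<p>The Model layer is the lowest-level capability center of the AI platform and is responsible for unified access to the various large models.</p>
<p>In practice, a platform rarely integrates just one model. Different models have different strengths: some reason better, some generate faster, and some cost less, so the platform usually needs to manage multiple models in a unified way.</p>
<p>For example:</p>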
<ul>
<li class="">DeepSeek</li>
<li class="">GPT</li>
<li class="">Claude</li>
<li class="">Qwen</li>
<li class="">Gemini</li>
</ul>
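<p>To keep business systems from depending directly on a particular vendor's SDK, the platform usually wraps a unified model gateway internally.</p>
<p>The business side only needs to care about:</p>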
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">chatModel.chat(prompt);</span><br></div></code></pre></div></div>
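<p>Which model the request is ultimately routed to, whether it is retried, whether streaming output is enabled, and whether a fallback is applied are all managed centrally by the platform.</p>
<p>The greatest value of this design is that switching models is very cheap: when a new model appears or model pricing changes, migration can often be done by adjusting configuration alone.</p>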
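<p>A minimal sketch of what such a gateway might look like is shown below. <code>ModelGateway</code> and <code>Backend</code> are hypothetical names introduced for illustration, and each backend is modelled as a plain function standing in for a vendor SDK call; streaming and configuration loading are left out.</p>

```java
import java.util.List;
import java.util.function.Function;

/** Hypothetical gateway: tries each configured backend in priority order until one succeeds. */
class ModelGateway {
    /** A named model backend; `call` stands in for a vendor SDK call (an assumption). */
    record Backend(String name, Function<String, String> call) {}

    private final List<Backend> backends;      // ordered by priority, e.g. loaded from configuration
    private final int maxAttemptsPerBackend;   // retries before falling back to the next backend

    ModelGateway(List<Backend> backends, int maxAttemptsPerBackend) {
        this.backends = List.copyOf(backends);
        this.maxAttemptsPerBackend = maxAttemptsPerBackend;
    }

    /** Business code only sees this method; routing, retries, and fallback stay inside. */
    String chat(String prompt) {
        RuntimeException last = null;
        for (Backend backend : backends) {
            for (int attempt = 1; attempt <= maxAttemptsPerBackend; attempt++) {
                try {
                    return backend.call().apply(prompt);
                } catch (RuntimeException e) {
                    last = e; // retry this backend, then fall through to the next one
                }
            }
        }
        throw new IllegalStateException("all model backends failed", last);
    }
}
```

<p>With this shape, changing the preferred model amounts to reordering or replacing entries in the backend list, which is the kind of configuration-only switch described above.</p>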
<hr>
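<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="rag层">RAG Layer<a href="https://jameszbl.github.io/enterprise-ai-platform#rag%E5%B1%82" class="hash-link" aria-label="Direct link to RAG Layer" title="Direct link to RAG Layer" translate="no">​</a></h2>
<p>For enterprise applications, relying only on the model's own knowledge is often not enough to meet business needs.</p>
<p>For example:</p>
<ul>
<li class="">Product information</li>
<li class="">Customer service scripts</li>
<li class="">Company policies</li>
<li class="">Product documentation</li>
<li class="">Operations and maintenance knowledge</li>
</ul>
<p>All of this content changes constantly, and a model's training data usually cannot keep up in real time.</p>
<p>As a result, RAG (Retrieval-Augmented Generation) has become a core component of enterprise AI platforms.</p>
<p>The RAG layer is mainly responsible for:</p>
<p>Document parsing, chunking, Embedding generation, vector storage, retrieval, and reranking.</p>
<p>After a user asks a question, the system first performs vector retrieval and then sends the retrieved results together with the user's question to the large model to obtain a more accurate answer.</p>
<p>This layer is usually where enterprise AI projects differ most in quality, because what really determines answer quality is often not the model but whether the retrieved context is accurate.</p>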
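<p>The retrieve-then-prompt step can be sketched in a few lines. The example below assumes embeddings are already available as <code>double[]</code> vectors and ranks chunks by cosine similarity in memory; <code>SimpleRetriever</code>, <code>Chunk</code>, and the prompt wording are hypothetical, and a real platform would delegate storage and search to a vector database.</p>

```java
import java.util.Comparator;
import java.util.List;

class SimpleRetriever {
    /** A text chunk with its precomputed embedding (how it is computed is out of scope here). */
    record Chunk(String text, double[] embedding) {}

    /** Cosine similarity of two equal-length vectors; 0 if either vector is all zeros. */
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return (normA == 0 || normB == 0) ? 0 : dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the k chunks most similar to the query embedding, best first. */
    static List<Chunk> topK(double[] query, List<Chunk> chunks, int k) {
        return chunks.stream()
                .sorted(Comparator.comparingDouble((Chunk c) -> cosine(query, c.embedding())).reversed())
                .limit(k)
                .toList();
    }

    /** Combines the retrieved context with the user's question into one prompt. */
    static String buildPrompt(String question, List<Chunk> context) {
        StringBuilder sb = new StringBuilder("Answer using only the context below.\n");
        for (Chunk c : context) {
            sb.append("- ").append(c.text()).append('\n');
        }
        return sb.append("Question: ").append(question).toString();
    }
}
```

<p>The resulting prompt is what would be passed to <code>chatModel.chat(prompt)</code>; if the wrong chunks are retrieved here, no model choice downstream can fully compensate.</p>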
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="workflow层">Workflow Layer<a href="https://jameszbl.github.io/enterprise-ai-platform#workflow%E5%B1%82" class="hash-link" aria-label="Direct link to Workflow Layer" title="Direct link to Workflow Layer" translate="no">​</a></h2>
<p>As business logic grows more complex, a single Prompt call is no longer enough.</p>
<p>Take an automatic product review reply scenario as an example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Review content</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Sentiment analysis</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Intent recognition</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Knowledge base retrieval</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Prompt construction</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Model generation</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Human review</span><br></div></code></pre></div></div>
<p>The whole process already forms a workflow.</p>
<p>The Workflow layer manages the execution relationships between these nodes.</p>
<p>These include:</p>
<ul>
<li class="">Sequential execution</li>
<li class="">Parallel execution</li>
<li class="">Conditional routing</li>
<li class="">Human review</li>
<li class="">Exception handling</li>
<li class="">Retry mechanisms</li>
</ul>
<p>In the early stage of a project, many enterprises write all of this logic in the Service layer, but as scenarios multiply the code quickly gets out of control, so a unified workflow engine is needed to manage it.</p>
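<p>To make the idea concrete, here is a deliberately small sketch of a workflow runner that covers sequential execution, conditional routing, and retries. <code>MiniWorkflow</code> and its methods are hypothetical and not taken from any particular workflow engine; parallel execution and human-review pauses are omitted.</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Predicate;

class MiniWorkflow {
    /** One node: runs `action` on the shared context when `when` holds, retrying on failure. */
    record Node(String name, Predicate<Map<String, Object>> when,
                Consumer<Map<String, Object>> action, int maxAttempts) {}

    private final List<Node> nodes = new ArrayList<>();

    /** Adds a node that always runs. */
    MiniWorkflow then(String name, Consumer<Map<String, Object>> action) {
        return thenIf(name, ctx -> true, action);
    }

    /** Adds a node that only runs when the condition holds (conditional routing). */
    MiniWorkflow thenIf(String name, Predicate<Map<String, Object>> when,
                        Consumer<Map<String, Object>> action) {
        nodes.add(new Node(name, when, action, 2));
        return this;
    }

    /** Executes the nodes in order on a mutable copy of the input and returns the final context. */
    Map<String, Object> run(Map<String, Object> input) {
        Map<String, Object> ctx = new HashMap<>(input);
        for (Node node : nodes) {
            if (node.when().test(ctx)) {
                runWithRetry(node, ctx);
            }
        }
        return ctx;
    }

    private static void runWithRetry(Node node, Map<String, Object> ctx) {
        for (int attempt = 1; ; attempt++) {
            try {
                node.action().accept(ctx);
                return;
            } catch (RuntimeException e) {
                if (attempt >= node.maxAttempts()) {
                    throw new IllegalStateException("node failed: " + node.name(), e);
                }
            }
        }
    }
}
```

<p>Keeping the node list as data, rather than as nested <code>if</code> statements in a Service class, is what lets the platform add monitoring, retries, and review steps in one place.</p>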
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="agent层">Agent Layer<a href="https://jameszbl.github.io/enterprise-ai-platform#agent%E5%B1%82" class="hash-link" aria-label="Direct link to Agent Layer" title="Direct link to Agent Layer" translate="no">​</a></h2>
<p>If Workflow solves the problem of process orchestration, Agent solves the problem of autonomous decision-making.</p>
<p>A traditional Workflow is more like a flowchart designed in advance, whereas an Agent can dynamically decide what to do next based on the goal.</p>
<p>For example:</p>
<p>A user asks:</p>
<blockquote>
<p>Look up this month's sales and analyze why sales volume dropped.</p>
</blockquote>
<p>The Agent might automatically:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Call the sales system</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Fetch the data</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Analyze abnormal metrics</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Look up operations campaigns</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Generate an analysis report</span><br></div></code></pre></div></div>
<p>The process no longer depends on a fixed flow, but on the model's reasoning and tool-calling abilities.</p>
<p>The Agent layer therefore usually includes:</p>
<ul>
<li class="">Tool Calling</li>
<li class="">Memory</li>
<li class="">Planning</li>
<li class="">Reflection</li>
<li class="">Multi-Step Reasoning</li>
</ul>
<p>Together, these capabilities form the infrastructure for enterprise-grade Agents.</p>
<hr>
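<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="应用层">Application Layer<a href="https://jameszbl.github.io/enterprise-ai-platform#%E5%BA%94%E7%94%A8%E5%B1%82" class="hash-link" aria-label="Direct link to Application Layer" title="Direct link to Application Layer" translate="no">​</a></h2>
<p>The application layer is the part that ultimately faces the business.</p>
<p>For example:</p>
<ul>
<li class="">AI customer service</li>
<li class="">AI operations assistant</li>
<li class="">AI knowledge base</li>
<li class="">AI coding assistant</li>
<li class="">AI data analysis platform</li>
</ul>
<p>These applications do not depend on the underlying models directly; they are built quickly on top of platform capabilities.</p>
<p>This avoids duplicated work and improves overall R&amp;D efficiency.</p>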
<hr>
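<h1>II. Technology Selection</h1>
<p>On the Java stack, the combination of technologies for building an enterprise AI platform is relatively mature.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="springboot3">Spring Boot 3<a href="https://jameszbl.github.io/enterprise-ai-platform#springboot3" class="hash-link" aria-label="Direct link to Spring Boot 3" title="Direct link to Spring Boot 3" translate="no">​</a></h2>
<p>Spring Boot 3 remains the foundational framework of the whole system.</p>
<p>Its advantages are:</p>
<ul>
<li class="">A mature enterprise ecosystem</li>
<li class="">Compatibility with microservice architectures</li>
<li class="">Easy integration with caches, middleware, and databases</li>
</ul>
<p>For the vast majority of enterprises, there is no need to introduce a new backend framework just for AI projects.</p>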
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="langchain4j">LangChain4j<a href="https://jameszbl.github.io/enterprise-ai-platform#langchain4j" class="hash-link" aria-label="Direct link to LangChain4j" title="Direct link to LangChain4j" translate="no">​</a></h2>
<p>If Spring Boot handles system construction, LangChain4j handles the AI capabilities.</p>
<p>Among other capabilities, it provides:</p>
<ul>
<li class="">ChatModel</li>
<li class="">StreamingChatModel</li>
<li class="">AiService</li>
<li class="">Tool Calling</li>
<li class="">RAG</li>
<li class="">Memory</li>
</ul>
<p>A lot of code that would otherwise require extensive Prompt concatenation and model invocation logic can be encapsulated quickly with LangChain4j.</p>
<p>For a Java team, its learning curve is much gentler than adopting LangChain from the Python ecosystem.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="qdrant">Qdrant<a href="https://jameszbl.github.io/enterprise-ai-platform#qdrant" class="hash-link" aria-label="Direct link to Qdrant" title="Direct link to Qdrant" translate="no">​</a></h2>
<p>Qdrant is one of the more popular vector databases today.</p>
<p>It is mainly responsible for:</p>
<ul>
<li class="">Vector storage</li>
<li class="">Similarity search</li>
<li class="">Metadata filtering</li>
</ul>
<p>In enterprise knowledge base scenarios, vector search and metadata filtering are often used together.</p>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">product = phone</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">department = customer service</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">year &gt;= 2025</span><br></div></code></pre></div></div>
<p>Applying the filter so that vector search only considers matching records can significantly improve retrieval quality.</p>
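<p>The effect of combining a metadata filter with similarity ranking can be shown without any database. The sketch below is an in-memory illustration only: <code>FilteredSearch</code> and <code>Point</code> are hypothetical names, the similarity score is assumed to be already computed for the query, and none of this is Qdrant's client API.</p>

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;

class FilteredSearch {
    /** A stored point: payload metadata plus a similarity score already computed for the query. */
    record Point(String id, Map<String, String> payload, int year, double similarity) {}

    /** Keeps points matching every required key/value and the minimum year, then ranks by similarity. */
    static List<Point> search(List<Point> points, Map<String, String> mustMatch, int minYear, int limit) {
        return points.stream()
                .filter(p -> p.year() >= minYear)
                .filter(p -> mustMatch.entrySet().stream()
                        .allMatch(e -> e.getValue().equals(p.payload().get(e.getKey()))))
                .sorted(Comparator.<Point>comparingDouble(Point::similarity).reversed())
                .limit(limit)
                .toList();
    }
}
```

<p>Note how a highly similar but out-of-scope record (wrong product or an old year) can never crowd out a relevant one, because it is excluded before ranking.</p>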
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="redis">Redis<a href="https://jameszbl.github.io/enterprise-ai-platform#redis" class="hash-link" aria-label="Direct link to Redis" title="Direct link to Redis" translate="no">​</a></h2>
<p>Redis plays several roles.</p>
<p>These include:</p>
<ul>
<li class="">Prompt cache</li>
<li class="">Session cache</li>
<li class="">Token cache</li>
<li class="">Rate limiting</li>
<li class="">Distributed locks</li>
</ul>
<p>Especially under high concurrency, many repeated questions do not need to call the model again; caching can significantly reduce cost.</p>
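<p>The caching idea can be sketched as follows. To stay self-contained, the example uses a <code>ConcurrentHashMap</code> as a stand-in for Redis and a plain function as a stand-in for the model; <code>CachedChat</code> and the <code>prompt:</code> key prefix are hypothetical. It assumes Java 17 for <code>HexFormat</code>.</p>

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class CachedChat {
    private final Map<String, String> cache = new ConcurrentHashMap<>(); // in-memory stand-in for Redis
    private final Function<String, String> model;                        // stand-in for the real model call
    final AtomicInteger modelCalls = new AtomicInteger();                // counts cache misses

    CachedChat(Function<String, String> model) {
        this.model = model;
    }

    /** Returns the cached answer for a repeated prompt, calling the model only on a miss. */
    String chat(String prompt) {
        return cache.computeIfAbsent(cacheKey(prompt), key -> {
            modelCalls.incrementAndGet();
            return model.apply(prompt);
        });
    }

    /** Hash of the trimmed prompt, so long prompts map to short fixed-size keys. */
    static String cacheKey(String prompt) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(prompt.strip().getBytes(StandardCharsets.UTF_8));
            return "prompt:" + HexFormat.of().formatHex(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }
}
```

<p>A production version backed by Redis would also set an expiry on each key, so that cached answers do not outlive the knowledge they were generated from.</p>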
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="elasticsearch">Elasticsearch<a href="https://jameszbl.github.io/enterprise-ai-platform#elasticsearch" class="hash-link" aria-label="Direct link to Elasticsearch" title="Direct link to Elasticsearch" translate="no">​</a></h2>
<p>After building a RAG system, many enterprises find that vector search alone cannot solve every problem.</p>
<p>For example:</p>
<p>Product model lookups.</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">iPhone16 Pro Max</span><br></div></code></pre></div></div>
<p>For exact-match scenarios like this, keyword search often works better.</p>
<p>So real projects often adopt:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Keyword Search</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">+</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Vector Search</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">=</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Hybrid Search</span><br></div></code></pre></div></div>
<p>Elasticsearch is an important piece of infrastructure for the keyword search side.</p>
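<p>The article does not say how the two result lists are merged. One common option is Reciprocal Rank Fusion (RRF), sketched below as an assumption rather than as the method used here: each document scores <code>1 / (k + rank)</code> in every list it appears in, and the scores are summed. <code>HybridSearch</code> is a hypothetical name, and the inputs are just ranked lists of document ids.</p>

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class HybridSearch {
    /**
     * Fuses two ranked id lists with Reciprocal Rank Fusion.
     * `k` dampens the influence of top ranks; 60 is a value often used in practice.
     */
    static List<String> fuse(List<String> keywordHits, List<String> vectorHits, int k) {
        Map<String, Double> scores = new HashMap<>();
        addRanks(scores, keywordHits, k);
        addRanks(scores, vectorHits, k);
        return scores.entrySet().stream()
                .sorted(Map.Entry.<String, Double>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<String, Double>comparingByKey())) // stable tie-break by id
                .map(Map.Entry::getKey)
                .toList();
    }

    /** Adds 1 / (k + rank) for each hit, where rank starts at 1. */
    private static void addRanks(Map<String, Double> scores, List<String> hits, int k) {
        for (int i = 0; i < hits.size(); i++) {
            scores.merge(hits.get(i), 1.0 / (k + i + 1), Double::sum);
        }
    }
}
```

<p>Because RRF only looks at ranks, it avoids having to normalize keyword relevance scores and vector similarities onto a common scale.</p>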
<hr>
<h1>III. Observability</h1>
<p>Many teams invest heavily in optimizing model quality but neglect observability.</p>
<p>In practice, troubleshooting production issues is often more important than feature development.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="日志体系">Logging<a href="https://jameszbl.github.io/enterprise-ai-platform#%E6%97%A5%E5%BF%97%E4%BD%93%E7%B3%BB" class="hash-link" aria-label="Direct link to Logging" title="Direct link to Logging" translate="no">​</a></h2>
<p>A complete request usually needs to record:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">User question</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Prompt</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Model name</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Token usage</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Response</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Latency</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Error details</span><br></div></code></pre></div></div>
<p>That way, when hallucinations or abnormal replies occur, the problem can be located quickly.</p>
<p>Without these logs, later troubleshooting has almost nothing to go on.</p>
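<p>One way to make sure every call produces such a record is to wrap the model invocation itself. In the sketch below, <code>LoggedCall</code>, <code>Reply</code>, and <code>CallLog</code> are hypothetical types, and the model is assumed to report its own token usage alongside the text.</p>

```java
import java.time.Duration;
import java.util.function.Function;

class LoggedCall {
    /** Hypothetical model reply carrying the token usage reported by the provider. */
    record Reply(String text, int tokensUsed) {}

    /** One log entry per request, mirroring the fields listed above. */
    record CallLog(String question, String prompt, String model, int tokensUsed,
                   String response, Duration elapsed, String error) {}

    /** Calls the model and always returns a log entry, for failures as well as successes. */
    static CallLog invoke(String question, String prompt, String model,
                          Function<String, Reply> llm) {
        long start = System.nanoTime();
        try {
            Reply reply = llm.apply(prompt);
            return new CallLog(question, prompt, model, reply.tokensUsed(), reply.text(),
                    Duration.ofNanos(System.nanoTime() - start), null);
        } catch (RuntimeException e) {
            // Failed calls are recorded too, with the exception text in place of a response.
            return new CallLog(question, prompt, model, 0, null,
                    Duration.ofNanos(System.nanoTime() - start), e.toString());
        }
    }
}
```

<p>The returned <code>CallLog</code> can then be written to whatever log pipeline the platform already uses.</p>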
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="tracing链路追踪">Tracing<a href="https://jameszbl.github.io/enterprise-ai-platform#tracing%E9%93%BE%E8%B7%AF%E8%BF%BD%E8%B8%AA" class="hash-link" aria-label="Direct link to Tracing" title="Direct link to Tracing" translate="no">​</a></h2>
<p>As Agents and Workflows become more complex, a single request may trigger multiple model calls.</p>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Intent recognition</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Knowledge retrieval</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Model generation</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Review model</span><br></div></code></pre></div></div>
<p>Without tracing, it is hard to know which step is consuming the time.</p>
<p>So the platform usually adopts a solution such as OpenTelemetry to trace the entire execution.</p>
<p>This makes it easy to see:</p>
<ul>
<li class="">Which node takes the longest</li>
<li class="">Which tool call failed</li>
<li class="">Which retrieval missed</li>
</ul>
<p>Problems can then be located quickly.</p>
<hr>
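<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="token统计">Token Usage Statistics<a href="https://jameszbl.github.io/enterprise-ai-platform#token%E7%BB%9F%E8%AE%A1" class="hash-link" aria-label="Direct link to Token Usage Statistics" title="Direct link to Token Usage Statistics" translate="no">​</a></h2>
<p>For an enterprise, Tokens are effectively cost.</p>
<p>The platform therefore needs solid Token accounting.</p>
<p>Typical dimensions include:</p>
<ul>
<li class="">By user</li>
<li class="">By application</li>
<li class="">By department</li>
<li class="">By model</li>
<li class="">By tenant</li>
</ul>
<p>Many enterprises pay little attention to Token consumption early on, and only discover once usage scales up that costs are growing far faster than expected.</p>
<p>So it is well worth setting up Token monitoring from day one.</p>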
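<p>If each model call is recorded with its dimensions, the per-dimension totals are a simple group-and-sum. The sketch below assumes a hypothetical <code>TokenUsage</code> record with one field per dimension listed above; in a real platform the same aggregation would more likely run in the metrics or database layer.</p>

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Hypothetical usage record: one row per model call, tagged with every reporting dimension. */
record TokenUsage(String tenant, String department, String app, String user, String model, long tokens) {}

class TokenStats {
    /** Sums tokens grouped by whichever dimension the caller selects, e.g. TokenUsage::model. */
    static Map<String, Long> totalBy(List<TokenUsage> usages, Function<TokenUsage, String> dimension) {
        return usages.stream()
                .collect(Collectors.groupingBy(dimension, Collectors.summingLong(TokenUsage::tokens)));
    }
}
```

<p>For example, <code>TokenStats.totalBy(usages, TokenUsage::department)</code> gives the totals needed for per-department cost allocation.</p>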
<hr>
<h1>IV. Multi-Tenant Design</h1>
<p>Once the AI platform starts serving multiple business departments, multi-tenancy becomes a must-have.</p>
<p>The platform usually needs to isolate:</p>
<ul>
<li class="">Knowledge bases</li>
<li class="">Prompts</li>
<li class="">Model configuration</li>
<li class="">Agent configuration</li>
<li class="">Workflow configuration</li>
</ul>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Tenant A</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ├─ Knowledge base A</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ├─ AgentA</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> └─ WorkflowA</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Tenant B</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ├─ Knowledge base B</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ├─ AgentB</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> └─ WorkflowB</span><br></div></code></pre></div></div>
<p>In the database design, TenantId is generally introduced as the core isolation field.</p>
<p>Every query automatically carries the tenant condition.</p>
<p>This keeps data secure while still sharing the underlying infrastructure.</p>
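<p>The "every query carries the tenant condition" rule is easiest to enforce when callers cannot forget it. The sketch below is one possible shape under stated assumptions: a <code>ThreadLocal</code> holds the current tenant (hypothetically set by a request filter), and an in-memory list stands in for a table with a <code>tenantId</code> column. <code>TenantScopedStore</code> and its methods are illustrative names.</p>

```java
import java.util.ArrayList;
import java.util.List;

class TenantScopedStore {
    /** A stored row; tenantId plays the role of the TenantId column. */
    record Row(String tenantId, String title) {}

    // Hypothetical per-request tenant holder, e.g. populated by a web filter after authentication.
    private static final ThreadLocal<String> CURRENT_TENANT = new ThreadLocal<>();
    private final List<Row> table = new ArrayList<>(); // in-memory stand-in for a database table

    static void setTenant(String tenantId) {
        CURRENT_TENANT.set(tenantId);
    }

    static void clearTenant() {
        CURRENT_TENANT.remove();
    }

    private static String requireTenant() {
        String tenantId = CURRENT_TENANT.get();
        if (tenantId == null) {
            throw new IllegalStateException("no tenant bound to this request");
        }
        return tenantId;
    }

    /** Writes are stamped with the current tenant automatically. */
    void save(String title) {
        table.add(new Row(requireTenant(), title));
    }

    /** Reads always filter by the current tenant; callers cannot omit the condition. */
    List<String> findTitles() {
        String tenantId = requireTenant();
        return table.stream()
                .filter(row -> row.tenantId().equals(tenantId))
                .map(Row::title)
                .toList();
    }
}
```

<p>Failing loudly when no tenant is bound is a deliberate choice here: an unscoped query is treated as a bug rather than as "return everything".</p>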
<hr>
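<h1>V. High-Concurrency Design</h1>
<p>The biggest difference between an AI platform and a traditional system is that large model calls usually take far longer than ordinary API calls.</p>
<p>A traditional API may respond in tens of milliseconds.</p>
<p>A model request may take several seconds or even more than ten seconds.</p>
<p>Concurrency therefore has to be considered at the architecture level.</p>
<p>First, adopt an asynchronous design.</p>
<p>For long-running tasks, avoid blocking threads synchronously wherever possible.</p>
<p>Second, introduce streaming output.</p>
<p>When users see the first Token within a few hundred milliseconds, the experience improves noticeably.</p>
<p>In addition, the platform needs:</p>
<ul>
<li class="">A model connection pool</li>
<li class="">A request queue</li>
<li class="">Circuit breaking and fallback</li>
<li class="">Rate limiting</li>
</ul>
<p>When the model service becomes unstable, the system can protect itself automatically instead of collapsing as a whole.</p>
<p>In real production environments, many performance problems are not database bottlenecks; model call capacity becomes the new system bottleneck, so capacity planning also needs to be redesigned around model QPS.</p>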
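<p>Of the protections listed in this section, circuit breaking with fallback is the easiest to show in isolation. The sketch below is a simplified, hand-written breaker, not a replacement for a mature resilience library: <code>SimpleCircuitBreaker</code> is a hypothetical class, it counts consecutive failures only, and the clock is injected so the behaviour can be exercised without sleeping.</p>

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

class SimpleCircuitBreaker {
    private final int failureThreshold; // consecutive failures before the breaker opens
    private final long openMillis;      // how long to stay open before allowing a trial call
    private final LongSupplier clock;   // injectable time source, e.g. System::currentTimeMillis
    private int consecutiveFailures = 0;
    private long openedAt = -1;         // -1 means the breaker is closed

    SimpleCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    /** Runs the model call unless the breaker is open; uses the fallback on failure or when open. */
    <T> T call(Supplier<T> modelCall, Supplier<T> fallback) {
        if (isOpen()) {
            return fallback.get(); // fail fast without touching the model
        }
        try {
            T result = modelCall.get();
            recordSuccess();
            return result;
        } catch (RuntimeException e) {
            recordFailure();
            return fallback.get();
        }
    }

    private synchronized boolean isOpen() {
        if (openedAt < 0) {
            return false;
        }
        if (clock.getAsLong() - openedAt < openMillis) {
            return true;
        }
        // Cool-down elapsed: close again, but let a single failure reopen the breaker.
        openedAt = -1;
        consecutiveFailures = failureThreshold - 1;
        return false;
    }

    private synchronized void recordSuccess() {
        consecutiveFailures = 0;
    }

    private synchronized void recordFailure() {
        if (++consecutiveFailures >= failureThreshold) {
            openedAt = clock.getAsLong();
        }
    }
}
```

<p>Only the state checks are synchronized; the model call itself runs outside the lock, so slow calls do not serialize each other.</p>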
<hr>
<h1>VI. Future Directions</h1>
<p>AI platforms are still evolving rapidly, and many capabilities continue to develop.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="mcp">MCP<a href="https://jameszbl.github.io/enterprise-ai-platform#mcp" class="hash-link" aria-label="Direct link to MCP" title="Direct link to MCP" translate="no">​</a></h2>
<p>MCP (Model Context Protocol) is gradually becoming an important standard for AI tool calling.</p>
<p>In the future, an enterprise's ERP, CRM, order systems, and data platforms may all expose their capabilities to Agents through the MCP protocol.</p>
<p>Agents would then no longer need a dedicated interface developed for each system.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="a2a">A2A<a href="https://jameszbl.github.io/enterprise-ai-platform#a2a" class="hash-link" aria-label="Direct link to A2A" title="Direct link to A2A" translate="no">​</a></h2>
<p>As the number of Agents grows, collaboration between Agents will become a new requirement.</p>
<p>A2A (Agent To Agent) aims to solve communication between different Agents.</p>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Customer service Agent</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Logistics Agent</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Operations Agent</span><br></div></code></pre></div></div>
<p>Multiple Agents complete a complex task together.</p>
<p>In the future, enterprises are likely to see many specialized Agents working together.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="multi-agent">Multi-Agent<a href="https://jameszbl.github.io/enterprise-ai-platform#multi-agent" class="hash-link" aria-label="Direct link to Multi-Agent" title="Direct link to Multi-Agent" translate="no">​</a></h2>
<p>A single Agent's capabilities are ultimately limited.</p>
<p>For complex tasks, a more reasonable approach is to have multiple Agents take on different responsibilities.</p>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Planner Agent</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Research Agent</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Execution Agent</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Review Agent</span><br></div></code></pre></div></div>
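<p>Each Agent handles work in its own area, and together they produce the complete result.</p>
<p>This pattern has already begun to show considerable potential in complex business processes.</p>
<hr>
<h1>Summary</h1>
<p>Building an enterprise AI platform is not simply a matter of calling a large model API; it means building a complete engineering system on top of model capabilities. From model management, knowledge base management, and workflow orchestration to the Agent execution framework, observability, and multi-tenant governance, every part directly affects the platform's eventual stability and scalability.</p>
<p>For Java engineers, Spring Boot 3, LangChain4j, Qdrant, Redis, and Elasticsearch can already support the vast majority of enterprise AI platform needs. As MCP, A2A, and Multi-Agent mature, future AI platforms will gradually evolve from "calling models" to "managing agents", which may be the direction in enterprise AI that is truly worth investing in and building up over the long term.</p>]]></content:encoded>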
            <category>ai</category>
            <category>java</category>
        </item>
        <item>
            <title><![CDATA[A Governance System for AI Applications at Scale]]></title>
            <link>https://jameszbl.github.io/enterprise-ai-project</link>
            <guid>https://jameszbl.github.io/enterprise-ai-project</guid>
            <pubDate>Fri, 13 Mar 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[Core governance practices for AI applications at scale, based on real project experience.]]></description>
            <content:encoded><![CDATA[
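<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="前言">Preface<a href="https://jameszbl.github.io/enterprise-ai-project#%E5%89%8D%E8%A8%80" class="hash-link" aria-label="Direct link to Preface" title="Direct link to Preface" translate="no">​</a></h2>
<p>When teams first approach AI development, their attention tends to focus on technical questions such as model selection, Prompt design, RAG implementation, and Agent orchestration.</p>
<p>But once an AI application actually reaches production, a new problem appears:</p>
<p><strong>Going live is only the beginning; governance is the real challenge.</strong></p>
<p>An AI system that performed well during an internal pilot can easily run into all kinds of problems when it faces hundreds of thousands of users and millions of calls:</p>
<ul>
<li class="">Frequent Prompt changes cause quality to fluctuate</li>
<li class="">Knowledge base content goes stale</li>
<li class="">Model costs rise quickly</li>
<li class="">Response times are unstable</li>
<li class="">Output content carries risk</li>
</ul>
<p>Enterprise AI applications therefore need not only development capability but also a sound governance system.</p>
<p>Based on real project experience, this article introduces the core governance practices for AI applications at scale.</p>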
<hr>
<h1>I. Why Enterprise AI Projects Easily Get Out of Control</h1>
<p>The logic of a traditional software system is deterministic.</p>
<p>For example:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">if(score &gt; 60){</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    return "pass";</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">}</span><br></div></code></pre></div></div>
<p>The same input always produces the same output.</p>
<p>AI systems do not work that way.</p>
<p>An AI application is essentially made up of several dynamic components:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">User question</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Prompt</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Knowledge base</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Model</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Output</span><br></div></code></pre></div></div>
<p>Any of these can change.</p>
<p>For example:</p>
<ul>
<li class="">Prompt adjustments</li>
<li class="">Knowledge base updates</li>
<li class="">Embedding model upgrades</li>
<li class="">LLM version switches</li>
</ul>
<p>When these changes stack up, production behavior easily becomes unpredictable.</p>
<p>The goal of enterprise AI governance can therefore be summarized as:</p>
<blockquote>
<p>Make the AI system observable, controllable, able to roll back, and auditable.</p>
</blockquote>
<hr>
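<h1>II. Prompt Governance</h1>
<p>Prompts have become a core asset of AI applications.</p>
<p>In many enterprise AI projects, there are even more Prompts than there is business code.</p>
<p>Without a management mechanism, things quickly turn into:</p>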
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">prompt_v1</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">prompt_v2</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">prompt_v2_final</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">prompt_v2_final_new</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">prompt_v2_final_new2</span><br></div></code></pre></div></div>
<p>In the end, nobody knows which version is actually running in production.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="1-版本管理">1. Version Management<a href="https://jameszbl.github.io/enterprise-ai-project#1-%E7%89%88%E6%9C%AC%E7%AE%A1%E7%90%86" class="hash-link" aria-label="1. 版本管理的直接链接" title="1. 版本管理的直接链接" translate="no">​</a></h2>
<p>Prompts should be managed the same way as code.</p>
<p>For example:</p>
<div class="language-yaml codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-yaml codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token key atrule" style="color:#00a4db">name</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> product_review_reply</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token key atrule" style="color:#00a4db">version</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> 1.0.3</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token key atrule" style="color:#00a4db">description</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> Prompt for replying to product reviews</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token key atrule" style="color:#00a4db">owner</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> ai</span><span class="token punctuation" style="color:#393A34">-</span><span class="token plain">team</span><br></div></code></pre></div></div>
<p>Teams typically store prompts in one of:</p>
<ul>
<li class="">Git</li>
<li class="">A configuration center</li>
<li class="">A prompt management platform</li>
</ul>
<p>Every change should record:</p>
<ul>
<li class="">Who made the change</li>
<li class="">When it was made</li>
<li class="">What changed</li>
<li class="">Which flows are affected</li>
</ul>
<p>That way, when something goes wrong, you can roll back quickly.</p>
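<p>As a minimal sketch of the record-and-rollback idea, the Java below keeps every published revision of one prompt in memory together with its audit fields. The class names <code>PromptVersion</code> and <code>PromptHistory</code> are hypothetical; a real setup would more likely persist the history in Git or a configuration center.</p>

```java
import java.time.Instant;
import java.util.ArrayDeque;
import java.util.Deque;

/** One immutable revision of a prompt with its audit fields (hypothetical type). */
final class PromptVersion {
    final String version;     // e.g. "1.0.3"
    final String template;    // the prompt text itself
    final String author;      // who made the change
    final String changeNote;  // what changed and which flows are affected
    final Instant changedAt;  // when the change was published

    PromptVersion(String version, String template, String author, String changeNote) {
        this.version = version;
        this.template = template;
        this.author = author;
        this.changeNote = changeNote;
        this.changedAt = Instant.now();
    }
}

/** Keeps the history of one prompt; the newest revision is the active one. */
final class PromptHistory {
    private final Deque<PromptVersion> history = new ArrayDeque<>();

    void publish(PromptVersion v) {
        history.push(v); // newest revision sits at the head
    }

    PromptVersion active() {
        if (history.isEmpty()) {
            throw new IllegalStateException("no version published");
        }
        return history.peek();
    }

    /** Drops the active revision and reactivates the previous one. */
    PromptVersion rollback() {
        if (history.size() < 2) {
            throw new IllegalStateException("nothing to roll back to");
        }
        history.pop();
        return history.peek();
    }
}
```

<p>Publishing 1.0.2 and then 1.0.3 makes 1.0.3 active; a single <code>rollback()</code> call reactivates 1.0.2.</p>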
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="2-灰度发布">2. Gradual Rollout<a href="https://jameszbl.github.io/enterprise-ai-project#2-%E7%81%B0%E5%BA%A6%E5%8F%91%E5%B8%83" class="hash-link" aria-label="2. 灰度发布的直接链接" title="2. 灰度发布的直接链接" translate="no">​</a></h2>
<p>Do not push a modified prompt to all traffic at once.</p>
<p>A staged rollout is recommended:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">New prompt</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">   ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">5% of traffic</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">   ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">20% of traffic</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">   ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">50% of traffic</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">   ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">100% of traffic</span><br></div></code></pre></div></div>
<p>Metrics to watch:</p>
<ul>
<li class="">User satisfaction</li>
<li class="">Click-through rate</li>
<li class="">Human takeover rate</li>
<li class="">Reply accuracy</li>
</ul>
<p>If the metrics drop, roll back immediately.</p>
<p>This is the same practice as gradual rollouts in conventional internet services.</p>
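<p>One common way to split traffic by percentage is to hash a stable key into 100 buckets. The sketch below assumes a user id is available as that key and uses CRC32 from the JDK; the class name <code>PromptRollout</code> is hypothetical, and the source does not say how its own traffic split is implemented.</p>

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Decides per user whether the new prompt applies at the current rollout stage. */
final class PromptRollout {
    private final int percent; // share of traffic on the new prompt, 0..100

    PromptRollout(int percent) {
        if (percent < 0 || percent > 100) {
            throw new IllegalArgumentException("percent must be 0..100");
        }
        this.percent = percent;
    }

    /** Maps a user id to a stable bucket in [0, 100). */
    static int bucketOf(String userId) {
        CRC32 crc = new CRC32();
        crc.update(userId.getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % 100);
    }

    /** Users in buckets below the threshold get the new prompt. */
    boolean useNewPrompt(String userId) {
        return bucketOf(userId) < percent;
    }
}
```

<p>Because the bucket depends only on the user id, a user who receives the new prompt at 5% keeps receiving it at 20%, 50% and 100%, and setting the percentage back to 0 acts as the rollback.</p>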
<hr>
<h1>III. Knowledge Base Governance</h1>
<p>Many teams assume:</p>
<blockquote>
<p>Once RAG is in place, the knowledge base problem is solved.</p>
</blockquote>
<p>In practice, the knowledge base is often the part of an AI project most likely to break down.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="1-增量更新">1. Incremental Updates<a href="https://jameszbl.github.io/enterprise-ai-project#1-%E5%A2%9E%E9%87%8F%E6%9B%B4%E6%96%B0" class="hash-link" aria-label="1. 增量更新的直接链接" title="1. 增量更新的直接链接" translate="no">​</a></h2>
<p>Enterprise documents change every day.</p>
<p>For example:</p>
<ul>
<li class="">Product information updates</li>
<li class="">Warehouse rule adjustments</li>
<li class="">Logistics policy changes</li>
<li class="">After-sales process revisions</li>
</ul>
<p>If every change triggers a full rebuild of the vector store:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Re-parse</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Re-chunk</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Re-embed</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Re-write</span><br></div></code></pre></div></div>
<p>the cost becomes very high.</p>
<p>An incremental update mechanism is therefore needed:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Document change</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Change detection</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Re-chunk</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Update vectors</span><br></div></code></pre></div></div>
<p>Only the changed parts are updated.</p>
<p>This significantly reduces embedding cost.</p>
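<p>The change-detection step can be sketched with content hashes: store a SHA-256 digest for each document when it is indexed, and re-chunk and re-embed only documents whose digest differs. This is one possible approach, not necessarily the one used in the project; <code>ChangeDetector</code> and its in-memory map are hypothetical, and deleted documents are not handled here.</p>

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Finds documents whose content changed since they were last indexed. */
final class ChangeDetector {
    // docId -> SHA-256 of the content that was last embedded
    private final Map<String, String> indexedHashes = new HashMap<>();

    static String sha256(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(text.getBytes(StandardCharsets.UTF_8))) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on every Java platform
            throw new IllegalStateException(e);
        }
    }

    /** Returns ids of new or modified documents and records their new hashes. */
    List<String> detectChanged(Map<String, String> currentDocs) {
        List<String> changed = new ArrayList<>();
        for (Map.Entry<String, String> doc : currentDocs.entrySet()) {
            String hash = sha256(doc.getValue());
            String previous = indexedHashes.put(doc.getKey(), hash);
            if (!hash.equals(previous)) {
                changed.add(doc.getKey()); // only these need re-chunking and re-embedding
            }
        }
        return changed;
    }
}
```

<p>On the first run every document is reported; on later runs only edited documents are, so the embedding bill follows the volume of changes rather than the size of the corpus.</p>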
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="2-文档质量检查">2. Document Quality Checks<a href="https://jameszbl.github.io/enterprise-ai-project#2-%E6%96%87%E6%A1%A3%E8%B4%A8%E9%87%8F%E6%A3%80%E6%9F%A5" class="hash-link" aria-label="2. 文档质量检查的直接链接" title="2. 文档质量检查的直接链接" translate="no">​</a></h2>
<p>Poor RAG results often stem from document quality rather than from retrieval.</p>
<p>Common problems:</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="内容重复">Duplicate Content<a href="https://jameszbl.github.io/enterprise-ai-project#%E5%86%85%E5%AE%B9%E9%87%8D%E5%A4%8D" class="hash-link" aria-label="内容重复的直接链接" title="内容重复的直接链接" translate="no">​</a></h3>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Refund process</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Refund process</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Refund process</span><br></div></code></pre></div></div>
<p>Duplicates skew the retrieval results.</p>
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="内容缺失">Missing Content<a href="https://jameszbl.github.io/enterprise-ai-project#%E5%86%85%E5%AE%B9%E7%BC%BA%E5%A4%B1" class="hash-link" aria-label="内容缺失的直接链接" title="内容缺失的直接链接" translate="no">​</a></h3>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Product shelf life:</span><br></div></code></pre></div></div>
<p>The key field is empty.</p>
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="ocr错误">OCR Errors<a href="https://jameszbl.github.io/enterprise-ai-project#ocr%E9%94%99%E8%AF%AF" class="hash-link" aria-label="OCR错误的直接链接" title="OCR错误的直接链接" translate="no">​</a></h3>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">七天无理由退货 (7-day no-reason return)</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">七天无理由退贷 (OCR misread: 货 became 贷)</span><br></div></code></pre></div></div>
<p>The embedding then encodes the wrong meaning.</p>
<hr>
<p>Before going live, add a document quality-check pipeline:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Upload document</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Format check</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Content check</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Duplicate check</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Ingest</span><br></div></code></pre></div></div>
<p>This safeguards knowledge quality at the source.</p>
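<p>Two of the checks above, empty fields and exact duplicates, are simple enough to sketch directly. The class below is a hypothetical illustration that works on plain text lines; it does not attempt to catch OCR errors, which usually need a dictionary or a language model.</p>

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Flags empty "key:" fields and exact duplicate lines before ingestion. */
final class DocumentQualityChecker {

    /** Returns human-readable issues; an empty list means no issue was found. */
    static List<String> check(List<String> lines) {
        List<String> issues = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            // Content check: a field label with nothing after the colon.
            // \uFF1A is the full-width colon used in Chinese text.
            if (line.endsWith(":") || line.endsWith("\uFF1A")) {
                issues.add("missing value: " + line);
            }
            // Duplicate check: exact repeats after trimming
            if (!seen.add(line)) {
                issues.add("duplicate: " + line);
            }
        }
        return issues;
    }
}
```

<p>A document would proceed to ingestion only when <code>check</code> returns an empty list; near-duplicate detection would need fuzzier matching than the exact comparison used here.</p>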
<hr>
<h1>IV. Model Governance</h1>
<p>Enterprise AI systems usually do not rely on a single model.</p>
<p>The reason is simple:</p>
<ul>
<li class="">Costs differ</li>
<li class="">Performance differs</li>
<li class="">Stability differs</li>
</ul>
<p>Hence the need for model governance.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="1-模型路由">1. Model Routing<a href="https://jameszbl.github.io/enterprise-ai-project#1-%E6%A8%A1%E5%9E%8B%E8%B7%AF%E7%94%B1" class="hash-link" aria-label="1. 模型路由的直接链接" title="1. 模型路由的直接链接" translate="no">​</a></h2>
<p>A typical architecture:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">User request</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Router</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ├─ GPT</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ├─ DeepSeek</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ├─ Qwen</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> └─ Claude</span><br></div></code></pre></div></div>
<p>The router picks a model automatically based on the task.</p>
<p>For example:</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="简单问答">Simple Question Answering<a href="https://jameszbl.github.io/enterprise-ai-project#%E7%AE%80%E5%8D%95%E9%97%AE%E7%AD%94" class="hash-link" aria-label="简单问答的直接链接" title="简单问答的直接链接" translate="no">​</a></h3>
<p>Use a low-cost model:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">DeepSeek-V4-Flash</span><br></div></code></pre></div></div>
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="复杂推理">Complex Reasoning<a href="https://jameszbl.github.io/enterprise-ai-project#%E5%A4%8D%E6%9D%82%E6%8E%A8%E7%90%86" class="hash-link" aria-label="复杂推理的直接链接" title="复杂推理的直接链接" translate="no">​</a></h3>
<p>Use a high-performance model:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">GPT</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Claude</span><br></div></code></pre></div></div>
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="内容审核">Content Moderation<a href="https://jameszbl.github.io/enterprise-ai-project#%E5%86%85%E5%AE%B9%E5%AE%A1%E6%A0%B8" class="hash-link" aria-label="内容审核的直接链接" title="内容审核的直接链接" translate="no">​</a></h3>
<p>Use a dedicated moderation model.</p>
<p>This balances:</p>
<ul>
<li class="">Quality</li>
<li class="">Latency</li>
<li class="">Cost</li>
</ul>
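<p>The routing rules above amount to a lookup from task type to model. A minimal sketch follows, with a hypothetical <code>TaskType</code> enum and placeholder model names; how a request gets classified into a task type is left out and would be a separate component.</p>

```java
import java.util.EnumMap;
import java.util.Map;

/** Task categories taken from the three examples above (hypothetical enum). */
enum TaskType { SIMPLE_QA, COMPLEX_REASONING, MODERATION }

/** Maps each task type to a model name, falling back to a default. */
final class ModelRouter {
    private final Map<TaskType, String> routes = new EnumMap<>(TaskType.class);
    private final String defaultModel;

    ModelRouter(String defaultModel) {
        this.defaultModel = defaultModel;
    }

    /** Registers a route; returns this so that calls can be chained. */
    ModelRouter route(TaskType type, String model) {
        routes.put(type, model);
        return this;
    }

    /** Picks the model for a task, or the default when no route is registered. */
    String select(TaskType type) {
        return routes.getOrDefault(type, defaultModel);
    }
}
```

<p>Keeping the table in configuration rather than in code would allow a route to be changed without a redeploy, in the same spirit as the prompt versioning described earlier.</p>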
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="2-降级策略">2. Fallback Strategy<a href="https://jameszbl.github.io/enterprise-ai-project#2-%E9%99%8D%E7%BA%A7%E7%AD%96%E7%95%A5" class="hash-link" aria-label="2. 降级策略的直接链接" title="2. 降级策略的直接链接" translate="no">​</a></h2>
<p>No model service is 100% stable.</p>
<p>A fallback chain needs to be designed:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">GPT</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">DeepSeek</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Qwen</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Rule-based reply</span><br></div></code></pre></div></div>
<p>For example, the first hop of the chain:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">try {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    // Primary model</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    return gpt.generate();</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">} catch (Exception e) {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    // Primary failed: fall back to the next model in the chain</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    return deepseek.generate();</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">}</span><br></div></code></pre></div></div>
<p>Even when an upstream service fails, the business keeps running.</p>
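<p>Nesting one <code>try/catch</code> per model gets unwieldy as the chain grows. As a sketch under stated assumptions, the chain can be expressed as an ordered list of suppliers with a fixed rule-based reply at the end; <code>FallbackChain</code> is a hypothetical name and each <code>Supplier</code> stands in for a real model client.</p>

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

/** Tries each model in order and returns a canned reply if all of them fail. */
final class FallbackChain {
    private final List<Supplier<String>> models; // ordered from preferred to last resort
    private final String ruleBasedReply;         // final fallback, needs no model call

    @SafeVarargs
    FallbackChain(String ruleBasedReply, Supplier<String>... models) {
        this.ruleBasedReply = ruleBasedReply;
        this.models = Arrays.asList(models);
    }

    String generate() {
        for (Supplier<String> model : models) {
            try {
                return model.get();
            } catch (RuntimeException e) {
                // This model failed; continue with the next one in the chain.
            }
        }
        return ruleBasedReply;
    }
}
```

<p>A production version would normally add per-call timeouts and log each failure, since an unbounded wait on the first model defeats the purpose of the chain.</p>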
<hr>
<h1>V. Cost Governance</h1>
<p>Once an AI project reaches scale, cost often becomes the biggest challenge.</p>
<p>Many teams only discover after launch that:</p>
<blockquote>
<p>Token spend far exceeds expectations.</p>
</blockquote>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="1-token监控">1. Token Monitoring<a href="https://jameszbl.github.io/enterprise-ai-project#1-token%E7%9B%91%E6%8E%A7" class="hash-link" aria-label="1. Token监控的直接链接" title="1. Token监控的直接链接" translate="no">​</a></h2>
<p>Track:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Input tokens</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Output tokens</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Request count</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Model cost</span><br></div></code></pre></div></div>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">GPT-4</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Consumed today:</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">12 million tokens</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Cost:</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">USD 350</span><br></div></code></pre></div></div>
<p>Break the statistics down by day, by business line, and by user.</p>
<p>Common metrics:</p>
<ul>
<li class="">Total tokens</li>
<li class="">Tokens per user</li>
<li class="">Cost per request</li>
<li class="">Cost ranking by model</li>
</ul>
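<p>These metrics all derive from the same per-request record of input and output tokens. The sketch below accumulates cost under an arbitrary key such as a day, a business line, or a user id. <code>TokenCostTracker</code> is hypothetical, and the per-million-token prices are constructor parameters because real prices vary by model and vendor.</p>

```java
import java.util.HashMap;
import java.util.Map;

/** Accumulates token usage and cost per key (day, business line, user, ...). */
final class TokenCostTracker {
    private final double inputPricePerMillion;  // USD per 1M input tokens (caller-supplied)
    private final double outputPricePerMillion; // USD per 1M output tokens (caller-supplied)
    private final Map<String, Double> costByKey = new HashMap<>();
    private long totalTokens;
    private long requests;
    private double totalCost;

    TokenCostTracker(double inputPricePerMillion, double outputPricePerMillion) {
        this.inputPricePerMillion = inputPricePerMillion;
        this.outputPricePerMillion = outputPricePerMillion;
    }

    /** Records one request under the given aggregation key. */
    void record(String key, long inputTokens, long outputTokens) {
        double cost = inputTokens / 1_000_000.0 * inputPricePerMillion
                + outputTokens / 1_000_000.0 * outputPricePerMillion;
        costByKey.merge(key, cost, Double::sum);
        totalTokens += inputTokens + outputTokens;
        totalCost += cost;
        requests++;
    }

    double costOf(String key) {
        return costByKey.getOrDefault(key, 0.0);
    }

    long totalTokens() {
        return totalTokens;
    }

    /** Average cost of a single request; 0 when nothing has been recorded. */
    double averageCostPerRequest() {
        return requests == 0 ? 0.0 : totalCost / requests;
    }
}
```

<p>Recording the same request under several keys (for instance once per day and once per user) gives the multi-dimensional breakdown without changing the class.</p>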
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="2-缓存机制">2. Caching<a href="https://jameszbl.github.io/enterprise-ai-project#2-%E7%BC%93%E5%AD%98%E6%9C%BA%E5%88%B6" class="hash-link" aria-label="2. 缓存机制的直接链接" title="2. 缓存机制的直接链接" translate="no">​</a></h2>
<p>A large share of requests are effectively repeats.</p>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">What is the return process?</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">What are the return rules?</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">How do I request a refund?</span><br></div></code></pre></div></div>
<p>After normalization, these are essentially the same question.</p>
<p>A cache layer can be added:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Query</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Cache</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">LLM</span><br></div></code></pre></div></div>
<p>On a cache hit, the result is returned directly without calling the LLM.</p>
<p>In enterprise practice:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Cache hit rate</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">20% ~ 50%</span><br></div></code></pre></div></div>
<p>Hit rates in this range are not uncommon.</p>
<p>For high-frequency question-answering scenarios, the cost reduction is substantial.</p>
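<p>The cache layer can be sketched as a map keyed by the normalized query. The normalization here is deliberately simple (case, whitespace and punctuation only) and is an assumption: it would not merge the three differently worded questions above, which would need semantic matching such as embedding similarity. <code>ReplyCache</code> is a hypothetical name and the <code>Function</code> stands in for the LLM call.</p>

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

/** Serves repeated queries from memory and calls the LLM only on a miss. */
final class ReplyCache {
    private final Map<String, String> cache = new HashMap<>();
    private final Function<String, String> llm; // stand-in for the real model call
    private int hits;
    private int misses;

    ReplyCache(Function<String, String> llm) {
        this.llm = llm;
    }

    /** Lowercases and collapses punctuation and whitespace (simplified normalization). */
    static String normalize(String query) {
        return query.toLowerCase(Locale.ROOT)
                .replaceAll("[\\p{P}\\s]+", " ")
                .trim();
    }

    String answer(String query) {
        String key = normalize(query);
        String cached = cache.get(key);
        if (cached != null) {
            hits++;
            return cached;
        }
        misses++;
        String reply = llm.apply(query);
        cache.put(key, reply);
        return reply;
    }

    /** Fraction of requests served from the cache so far. */
    double hitRate() {
        int total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }
}
```

<p>Entries should expire or be invalidated when the knowledge base changes, otherwise the cache keeps serving answers based on outdated policy text.</p>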
<hr>
<h1>VI. Risk Governance</h1>
<p>AI applications ultimately face users, so risk governance is indispensable.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="1-敏感词检测">1. Sensitive Word Detection<a href="https://jameszbl.github.io/enterprise-ai-project#1-%E6%95%8F%E6%84%9F%E8%AF%8D%E6%A3%80%E6%B5%8B" class="hash-link" aria-label="1. 敏感词检测的直接链接" title="1. 敏感词检测的直接链接" translate="no">​</a></h2>
<p>User input may contain:</p>
<ul>
<li class="">Politically sensitive content</li>
<li class="">Illegal content</li>
<li class="">Advertising</li>
<li class="">Malicious or abusive content</li>
</ul>
<p>Processing flow:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">User input</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Sensitive word detection</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Risk assessment</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Model call</span><br></div></code></pre></div></div>
<p>High-risk content is blocked outright and never reaches the model.</p>
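<p>The flow above can be sketched as a guard placed in front of the model call. This version uses plain substring matching against a blocklist, which is an assumption for illustration; production systems often use a trie or Aho-Corasick matcher, or a moderation service. <code>InputGuard</code> and <code>RiskLevel</code> are hypothetical names.</p>

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;

/** Two-level risk scale (hypothetical; real systems often use more levels). */
enum RiskLevel { LOW, HIGH }

/** Screens user input against a blocklist before the model is called. */
final class InputGuard {
    private final List<String> blockedTerms;

    InputGuard(String... blockedTerms) {
        this.blockedTerms = Arrays.asList(blockedTerms);
    }

    /** Case-insensitive substring match against every blocked term. */
    RiskLevel assess(String input) {
        String text = input.toLowerCase(Locale.ROOT);
        for (String term : blockedTerms) {
            if (text.contains(term.toLowerCase(Locale.ROOT))) {
                return RiskLevel.HIGH;
            }
        }
        return RiskLevel.LOW;
    }

    /** Calls the model only for low-risk input; otherwise returns the fixed refusal. */
    String handle(String input, Function<String, String> model, String refusal) {
        return assess(input) == RiskLevel.HIGH ? refusal : model.apply(input);
    }
}
```

<p>Because blocked input never reaches <code>model.apply</code>, the guard also saves the token cost of requests that would be rejected anyway.</p>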
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="2-内容审核">2. Content Moderation<a href="https://jameszbl.github.io/enterprise-ai-project#2-%E5%86%85%E5%AE%B9%E5%AE%A1%E6%A0%B8" class="hash-link" aria-label="2. 内容审核的直接链接" title="2. 内容审核的直接链接" translate="no">​</a></h2>
<p>Besides screening input, the model's output must be reviewed too.</p>
<p>A model may produce:</p>
<ul>
<li class="">Hallucinated content</li>
<li class="">False promises</li>
<li class="">Inappropriate remarks</li>
<li class="">Non-compliant replies</li>
</ul>
<p>Typical flow:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">User question</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">LLM generation</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Moderation model</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Return to user</span><br></div></code></pre></div></div>
<p>For high-risk domains:</p>
<ul>
<li class="">Finance</li>
<li class="">Healthcare</li>
<li class="">Legal</li>
</ul>
<p>an extra step is usually added:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">AI generation</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Human review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Final publication</span><br></div></code></pre></div></div>
<p>to ensure the result is safe and reliable.</p>
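<p>Both flows can be folded into one decision point after generation. In the sketch below, the <code>Predicate</code> stands in for the moderation model and the boolean flag marks a high-risk domain; <code>OutputGate</code> and <code>Decision</code> are hypothetical names, and the three outcomes are one reasonable reading of the flows above rather than the author's implementation.</p>

```java
import java.util.function.Predicate;

/** What happens to a generated draft (hypothetical outcome set). */
enum Decision { PUBLISH, HUMAN_REVIEW, BLOCK }

/** Applies automated moderation, then routes high-risk domains to a human reviewer. */
final class OutputGate {
    private final Predicate<String> violatesPolicy; // stand-in for the moderation model
    private final boolean highRiskDomain;           // e.g. finance, healthcare, legal

    OutputGate(Predicate<String> violatesPolicy, boolean highRiskDomain) {
        this.violatesPolicy = violatesPolicy;
        this.highRiskDomain = highRiskDomain;
    }

    Decision decide(String draft) {
        if (violatesPolicy.test(draft)) {
            return Decision.BLOCK; // never shown to the user
        }
        // Drafts that pass automated moderation still need sign-off in high-risk domains.
        return highRiskDomain ? Decision.HUMAN_REVIEW : Decision.PUBLISH;
    }
}
```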
<hr>
<h1>Summary</h1>
<p>As AI applications move from PoC to production, the technical focus gradually shifts from “how to generate an answer” to “how to run reliably”.</p>
<p>A mature enterprise AI governance system usually covers:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Prompt governance</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Knowledge governance</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Model governance</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Cost governance</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Risk governance</span><br></div></code></pre></div></div>
<p>The early-stage work determines whether a project can be built at all,</p>
<p>while the governance system determines whether it can run reliably over the long term.</p>]]></content:encoded>
            <category>ai</category>
            <category>java</category>
        </item>
        <item>
            <title><![CDATA[Design and Implementation of an AI Auto-Reply System for Product Reviews]]></title>
            <link>https://jameszbl.github.io/feedback-reply-bot</link>
            <guid>https://jameszbl.github.io/feedback-reply-bot</guid>
            <pubDate>Sun, 11 Jan 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[{/ truncate /}]]></description>
            <content:encoded><![CDATA[
<p>As large-model technology makes its way into enterprise production, more and more customer-service and operations scenarios are turning to AI to improve efficiency. Drawing on a product-review smart-reply project I took part in, this article describes the design and implementation of an AI auto-reply system for a scenario with millions of product reviews.</p>
<hr>
<h1>I. Business Background</h1>
<p>On an e-commerce platform, product reviews not only influence purchase decisions but are also an important data source for merchant operations.</p>
<p>Take 东方甄选 as an example; it receives a large volume of product reviews every day:</p>
<ul>
<li class="">Thanks and praise</li>
<li class="">Product quality feedback</li>
<li class="">Logistics complaints</li>
<li class="">After-sales service inquiries</li>
<li class="">Product usage suggestions</li>
</ul>
<p>As the business has grown, the total number of product reviews has reached the millions.</p>
<p>Under the traditional operating model:</p>
<ul>
<li class="">Staff read reviews one by one</li>
<li class="">Replies are written from experience</li>
<li class="">Reply style varies from one operator to another</li>
<li class="">Negative reviews are handled slowly</li>
</ul>
<p>We therefore set out to build an AI auto-reply system for product reviews that would:</p>
<ul>
<li class="">Automatically identify review content</li>
<li class="">Automatically generate replies</li>
<li class="">Publish after human review</li>
<li class="">Improve reply efficiency and consistency</li>
</ul>
<hr>
<h1>II. Core Workflow Design</h1>
<p>The system is designed as a multi-stage AI workflow.</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Sentiment analysis</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Intent recognition</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Knowledge base retrieval</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Prompt assembly</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Multi-model generation</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Moderation</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Reply</span><br></div></code></pre></div></div>
<p>Each node has a single responsibility.</p>
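<p>One way to keep each node single-purpose is to model the workflow as an ordered list of stages that read from and write to a shared context. The sketch below is an assumption about structure, not the project's actual code; <code>ReviewContext</code> and <code>ReviewPipeline</code> are hypothetical names, and each stage would wrap a real classifier, retriever or model call.</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Shared state passed from stage to stage (hypothetical type). */
final class ReviewContext {
    final String review;
    // Results written by earlier stages, e.g. "sentiment", "intent", "reply"
    final Map<String, String> attributes = new HashMap<>();

    ReviewContext(String review) {
        this.review = review;
    }
}

/** Runs single-responsibility stages in the order they were added. */
final class ReviewPipeline {
    private final List<Consumer<ReviewContext>> stages = new ArrayList<>();

    ReviewPipeline then(Consumer<ReviewContext> stage) {
        stages.add(stage);
        return this;
    }

    ReviewContext run(String review) {
        ReviewContext ctx = new ReviewContext(review);
        for (Consumer<ReviewContext> stage : stages) {
            stage.accept(ctx);
        }
        return ctx;
    }
}
```

<p>With this shape a stage can be replaced or tested in isolation, for example swapping the sentiment classifier without touching retrieval or generation.</p>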
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="1-情感分析">1. Sentiment Analysis<a href="https://jameszbl.github.io/feedback-reply-bot#1-%E6%83%85%E6%84%9F%E5%88%86%E6%9E%90" class="hash-link" aria-label="1. 情感分析的直接链接" title="1. 情感分析的直接链接" translate="no">​</a></h2>
<p>First, classify the review as:</p>
<ul>
<li class="">Positive</li>
<li class="">Neutral</li>
<li class="">Negative</li>
</ul>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">The apples are very sweet, I will buy them again</span><br></div></code></pre></div></div>
<p>Output:</p>
<div class="language-json codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-json codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token property" style="color:#36acaa">"sentiment"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"positive"</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><br></div></code></pre></div></div>
<p>Whereas:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">It was already spoiled when it arrived</span><br></div></code></pre></div></div>
<p>Output:</p>
<div class="language-json codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-json codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token property" style="color:#36acaa">"sentiment"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"negative"</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><br></div></code></pre></div></div>
<p>The sentiment result determines which path the rest of the workflow takes.</p>
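<p>A small sketch of that branching: parse the <code>sentiment</code> label into an enum and pick the next step from it. Only the <code>positive</code> and <code>negative</code> labels appear in the examples; the <code>neutral</code> label and the step names returned here are assumptions made for illustration.</p>

```java
import java.util.Locale;

/** Sentiment classes; the "neutral" label is assumed, the other two appear in the examples. */
enum Sentiment {
    POSITIVE, NEUTRAL, NEGATIVE;

    /** Parses a label such as "positive"; throws IllegalArgumentException if unknown. */
    static Sentiment fromLabel(String label) {
        return valueOf(label.trim().toUpperCase(Locale.ROOT));
    }
}

/** Chooses the next workflow step from the sentiment (step names are hypothetical). */
final class SentimentRouter {
    static String nextStep(Sentiment sentiment) {
        switch (sentiment) {
            case NEGATIVE:
                // Negative reviews go on to intent recognition
                return "intent_recognition";
            default:
                // Assumption: other reviews can be answered from a template
                return "template_reply";
        }
    }
}
```

<p>Rejecting unknown labels with an exception, rather than defaulting silently, makes a misbehaving classifier visible early.</p>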
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="2-意图识别">2. Intent Recognition<a href="https://jameszbl.github.io/feedback-reply-bot#2-%E6%84%8F%E5%9B%BE%E8%AF%86%E5%88%AB" class="hash-link" aria-label="2. 意图识别的直接链接" title="2. 意图识别的直接链接" translate="no">​</a></h2>
<p>For negative reviews, the user's actual complaint needs to be identified.</p>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">The fruit is not sweet</span><br></div></code></pre></div></div>
<p>Identified as:</p>
<div class="language-json codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-json codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token property" style="color:#36acaa">"intent"</span><span class="token operator" style="color:#393A34">:</span><span class="token string" style="color:#e3116c">"product_quality"</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><br></div></code></pre></div></div>
<hr>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">The delivery was too slow</span><br></div></code></pre></div></div>
<p>is classified as:</p>
<div class="language-json codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-json codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token property" style="color:#36acaa">"intent"</span><span class="token operator" style="color:#393A34">:</span><span class="token string" style="color:#e3116c">"logistics"</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><br></div></code></pre></div></div>
<hr>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Customer service never replied</span><br></div></code></pre></div></div>
<p>is classified as:</p>
<div class="language-json codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-json codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token property" style="color:#36acaa">"intent"</span><span class="token operator" style="color:#393A34">:</span><span class="token string" style="color:#e3116c">"service"</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><br></div></code></pre></div></div>
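<p>Classifying the intent first lets the retrieval step query only the relevant knowledge base, which makes recall more precise.</p>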
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="3-知识库检索">3. Knowledge base retrieval<a href="https://jameszbl.github.io/feedback-reply-bot#3-%E7%9F%A5%E8%AF%86%E5%BA%93%E6%A3%80%E7%B4%A2" class="hash-link" aria-label="Direct link to 3. Knowledge base retrieval" title="Direct link to 3. Knowledge base retrieval" translate="no">​</a></h2>
<p>The recognized intent selects which knowledge domain is searched.</p>
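<p>A minimal sketch of this routing step. The three intent labels come from the JSON examples above; the enum names, the <code>IntentRouter</code> class, and the choice to answer service complaints from the after-sales knowledge base are assumptions made for illustration.</p>

```java
import java.util.EnumMap;
import java.util.Map;

// Intent labels as they appear in the classifier's JSON output.
enum Intent {
    PRODUCT_QUALITY("product_quality"),
    LOGISTICS("logistics"),
    SERVICE("service");

    final String label;

    Intent(String label) {
        this.label = label;
    }

    // Parse the "intent" field returned by the classifier.
    static Intent fromLabel(String label) {
        for (Intent intent : values()) {
            if (intent.label.equals(label)) {
                return intent;
            }
        }
        throw new IllegalArgumentException("Unknown intent: " + label);
    }
}

// Hypothetical names for the three knowledge bases described in this section.
enum KnowledgeDomain { PRODUCT, LOGISTICS, AFTER_SALES }

final class IntentRouter {
    private static final Map<Intent, KnowledgeDomain> ROUTES = new EnumMap<>(Intent.class);
    static {
        ROUTES.put(Intent.PRODUCT_QUALITY, KnowledgeDomain.PRODUCT);
        ROUTES.put(Intent.LOGISTICS, KnowledgeDomain.LOGISTICS);
        // Assumption: service complaints are answered from the after-sales knowledge base.
        ROUTES.put(Intent.SERVICE, KnowledgeDomain.AFTER_SALES);
    }

    // Map the classifier label to the knowledge domain that should be searched.
    static KnowledgeDomain route(String intentLabel) {
        return ROUTES.get(Intent.fromLabel(intentLabel));
    }
}
```

<p>Keeping the mapping in one table means a new intent only needs a new enum constant and one extra route.</p>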
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="商品知识库">Product knowledge base<a href="https://jameszbl.github.io/feedback-reply-bot#%E5%95%86%E5%93%81%E7%9F%A5%E8%AF%86%E5%BA%93" class="hash-link" aria-label="Direct link to Product knowledge base" title="Direct link to Product knowledge base" translate="no">​</a></h3>
<p>Contains:</p>
<ul>
<li class="">Product selling points</li>
<li class="">Product specifications</li>
<li class="">Frequently asked questions</li>
</ul>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Product: Shandong Yantai Red Fuji apples</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Sugar content:</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">13%-15%</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Ripening period:</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Picked in October</span><br></div></code></pre></div></div>
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="物流知识库">Logistics knowledge base<a href="https://jameszbl.github.io/feedback-reply-bot#%E7%89%A9%E6%B5%81%E7%9F%A5%E8%AF%86%E5%BA%93" class="hash-link" aria-label="Direct link to Logistics knowledge base" title="Direct link to Logistics knowledge base" translate="no">​</a></h3>
<p>Contains:</p>
<ul>
<li class="">Warehouse information</li>
<li class="">Shipping rules</li>
<li class="">Delivery rules</li>
</ul>
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="售后知识库">After-sales knowledge base<a href="https://jameszbl.github.io/feedback-reply-bot#%E5%94%AE%E5%90%8E%E7%9F%A5%E8%AF%86%E5%BA%93" class="hash-link" aria-label="Direct link to After-sales knowledge base" title="Direct link to After-sales knowledge base" translate="no">​</a></h3>
<p>Contains:</p>
<ul>
<li class="">Refund rules</li>
<li class="">Reshipment rules</li>
<li class="">Complaint-handling guidelines</li>
</ul>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="4-prompt组装">4. Prompt assembly<a href="https://jameszbl.github.io/feedback-reply-bot#4-prompt%E7%BB%84%E8%A3%85" class="hash-link" aria-label="Direct link to 4. Prompt assembly" title="Direct link to 4. Prompt assembly" translate="no">​</a></h2>
<p>The system assembles the prompt dynamically from the review and the retrieved knowledge.</p>
<p>Example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">You are an e-commerce customer service agent.</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Customer review:</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">The fruit is not sweet.</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Product knowledge:</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">This product is naturally ripened fruit,</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">and its sugar content varies considerably with the weather.</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Write a professional and polite reply.</span><br></div></code></pre></div></div>
<p>Prompt templates are maintained centrally by the operations team.</p>
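<p>A rough sketch of how such a template might be filled at request time. The class name <code>ReplyPromptBuilder</code> and the hard-coded template are illustrative assumptions; in the system described here the template text would be loaded from wherever operations maintains it.</p>

```java
import java.util.List;

final class ReplyPromptBuilder {
    // Mirrors the example prompt above; the two %s slots are filled per review.
    private static final String TEMPLATE =
        "You are an e-commerce customer service agent.\n\n"
        + "Customer review:\n%s\n\n"
        + "Product knowledge:\n%s\n\n"
        + "Write a professional and polite reply.";

    // Join the retrieved snippets and substitute them together with the review text.
    static String build(String review, List<String> knowledgeSnippets) {
        String knowledge = String.join("\n", knowledgeSnippets);
        return String.format(TEMPLATE, review.trim(), knowledge);
    }
}
```

<p>Because the review and snippets are passed as format arguments rather than concatenated into the format string, a stray <code>%</code> in a review cannot break the template.</p>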
<hr>
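<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="5-多模型生成">5. Multi-model generation<a href="https://jameszbl.github.io/feedback-reply-bot#5-%E5%A4%9A%E6%A8%A1%E5%9E%8B%E7%94%9F%E6%88%90" class="hash-link" aria-label="Direct link to 5. Multi-model generation" title="Direct link to 5. Multi-model generation" translate="no">​</a></h2>
<p>To keep generation quality high, we call several models in parallel.</p>
<p>The following models are called at the same time:</p>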
<ul>
<li class="">DeepSeek</li>
<li class="">Qwen</li>
<li class="">GPT</li>
</ul>
<p>This yields three candidate replies.</p>
<p>For example:</p>
<p>Reply A:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">We are very sorry the product did not meet your expectations...</span><br></div></code></pre></div></div>
<p>Reply B:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Thank you for your feedback. We have passed the issue on to our supply chain team...</span><br></div></code></pre></div></div>
<p>Reply C:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Fruit sweetness varies with weather and batch...</span><br></div></code></pre></div></div>
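<p>One way the parallel fan-out could look in Java, using <code>CompletableFuture</code>. Each model is represented here by a hypothetical prompt-to-reply function; a real client would wrap the vendor's HTTP API. Returning an empty candidate when a model fails is also an assumption, not something the original design specifies.</p>

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

final class ParallelGenerator {
    // Send the same prompt to every model at once and collect one reply per model name.
    static Map<String, String> generateAll(String prompt,
                                           Map<String, Function<String, String>> models) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, models.size()));
        try {
            Map<String, CompletableFuture<String>> futures = new LinkedHashMap<>();
            for (Map.Entry<String, Function<String, String>> e : models.entrySet()) {
                futures.put(e.getKey(),
                    CompletableFuture.supplyAsync(() -> e.getValue().apply(prompt), pool)
                        // Assumption: a failing model yields an empty candidate
                        // instead of failing the whole batch.
                        .exceptionally(ex -> ""));
            }
            Map<String, String> replies = new LinkedHashMap<>();
            // join() waits for each future; all calls are already running concurrently.
            futures.forEach((name, future) -> replies.put(name, future.join()));
            return replies;
        } finally {
            pool.shutdown();
        }
    }
}
```

<p>Total latency is then roughly that of the slowest model rather than the sum of all three.</p>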
<hr>
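<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="6-人工审核">6. Human review<a href="https://jameszbl.github.io/feedback-reply-bot#6-%E4%BA%BA%E5%B7%A5%E5%AE%A1%E6%A0%B8" class="hash-link" aria-label="Direct link to 6. Human review" title="Direct link to 6. Human review" translate="no">​</a></h2>
<p>AI-generated replies are never sent directly.</p>
<p>The operations console shows:</p>
<ul>
<li class="">The original review</li>
<li class="">The retrieved knowledge</li>
<li class="">The AI-generated replies</li>
</ul>
<p>A reviewer can:</p>
<ul>
<li class="">Accept a reply as is</li>
<li class="">Edit a reply, then accept it</li>
<li class="">Regenerate</li>
</ul>
<p>This keeps reply quality under human control.</p>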
<hr>
<h1>III. Agent Workflow Design</h1>
<p>The project uses an Agent Workflow pattern rather than a single prompt call.</p>
<p>Workflow structure:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">ReviewAgent</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        │</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        ▼</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">SentimentAgent</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        │</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        ▼</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">IntentAgent</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        │</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        ▼</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">RetrieverAgent</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        │</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        ▼</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">PromptAgent</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        │</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        ▼</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">GeneratorAgent</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        │</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        ▼</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">AuditAgent</span><br></div></code></pre></div></div>
<p>Each Agent handles one independent task.</p>
<p>The Java implementation uses a</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">StateMachine</span><br></div></code></pre></div></div>
<p>to manage state transitions.</p>
<p>State definitions:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">RECEIVED</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">SENTIMENT_DONE</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">INTENT_DONE</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">RETRIEVE_DONE</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">GENERATED</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">AUDITING</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">FINISHED</span><br></div></code></pre></div></div>
<p>The state machine makes the workflow:</p>
<ul>
<li class="">Traceable</li>
<li class="">Retryable</li>
<li class="">Extensible</li>
</ul>
<p>New nodes can be added later without restructuring the main flow.</p>
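<p>A minimal sketch of such a state machine in plain Java. The state names are copied from the list above; the strictly linear transition table and the <code>regenerate()</code> loop back from <code>AUDITING</code> are assumptions, since the original does not spell out which transitions are legal. A production system might use a state machine library instead.</p>

```java
import java.util.EnumMap;
import java.util.Map;

// States are copied from the state list in this section.
enum ReviewState {
    RECEIVED, SENTIMENT_DONE, INTENT_DONE, RETRIEVE_DONE, GENERATED, AUDITING, FINISHED
}

final class ReviewStateMachine {
    // Assumed linear flow: each state has exactly one legal successor.
    private static final Map<ReviewState, ReviewState> NEXT = new EnumMap<>(ReviewState.class);
    static {
        ReviewState[] all = ReviewState.values();
        for (int i = 0; i + 1 < all.length; i++) {
            NEXT.put(all[i], all[i + 1]);
        }
    }

    private ReviewState current = ReviewState.RECEIVED;

    ReviewState current() {
        return current;
    }

    // Move to `target` only if it is the legal successor of the current state.
    void advance(ReviewState target) {
        if (NEXT.get(current) != target) {
            throw new IllegalStateException(current + " -> " + target + " is not allowed");
        }
        current = target;
    }

    // Assumption: when a reviewer asks for regeneration, the review returns to
    // RETRIEVE_DONE so that generation runs again on the same retrieved knowledge.
    void regenerate() {
        if (current != ReviewState.AUDITING) {
            throw new IllegalStateException("regenerate is only valid while AUDITING");
        }
        current = ReviewState.RETRIEVE_DONE;
    }
}
```

<p>Persisting <code>current</code> per review is what makes a run traceable and lets a failed step be retried from its last completed state.</p>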
<hr>
<h1>IV. Knowledge Base Design</h1>
<p>The knowledge base is the deciding factor in the system's output quality.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="分层结构">Layered structure<a href="https://jameszbl.github.io/feedback-reply-bot#%E5%88%86%E5%B1%82%E7%BB%93%E6%9E%84" class="hash-link" aria-label="Direct link to Layered structure" title="Direct link to Layered structure" translate="no">​</a></h2>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Product knowledge base</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Logistics knowledge base</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">After-sales knowledge base</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Customer service script library</span><br></div></code></pre></div></div>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="chunk切分">Chunk splitting<a href="https://jameszbl.github.io/feedback-reply-bot#chunk%E5%88%87%E5%88%86" class="hash-link" aria-label="Direct link to Chunk splitting" title="Direct link to Chunk splitting" translate="no">​</a></h2>
<p>Documents are split using:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">512 Token</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">+</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">50 Token Overlap</span><br></div></code></pre></div></div>
<p>The overlap keeps context that spans a chunk boundary from being cut off.</p>
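<p>The sliding-window split can be sketched as follows. The code assumes the text has already been tokenized into a list; which tokenizer is used is not stated in the original, and <code>TokenChunker</code> is an illustrative name.</p>

```java
import java.util.ArrayList;
import java.util.List;

final class TokenChunker {
    // Split a token sequence into windows of `size` tokens;
    // consecutive windows share `overlap` tokens.
    static <T> List<List<T>> chunk(List<T> tokens, int size, int overlap) {
        if (size <= 0 || overlap < 0 || overlap >= size) {
            throw new IllegalArgumentException("require 0 <= overlap < size");
        }
        List<List<T>> chunks = new ArrayList<>();
        int step = size - overlap; // how far the window start moves each time
        for (int start = 0; start < tokens.size(); start += step) {
            int end = Math.min(start + size, tokens.size());
            chunks.add(new ArrayList<>(tokens.subList(start, end)));
            if (end == tokens.size()) {
                break; // the last window already reached the end of the document
            }
        }
        return chunks;
    }
}
```

<p>With the settings above, <code>TokenChunker.chunk(tokens, 512, 50)</code> starts a new chunk every 462 tokens, so the last 50 tokens of one chunk are repeated at the start of the next.</p>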
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="检索方案">Retrieval strategy<a href="https://jameszbl.github.io/feedback-reply-bot#%E6%A3%80%E7%B4%A2%E6%96%B9%E6%A1%88" class="hash-link" aria-label="Direct link to Retrieval strategy" title="Direct link to Retrieval strategy" translate="no">​</a></h2>
<p>Hybrid Search:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Keyword Search</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">+</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Vector Search</span><br></div></code></pre></div></div>
<p>Advantages:</p>
<ul>
<li class="">Keyword search keeps recall high for exact terms</li>
<li class="">Vector search improves semantic matching</li>
</ul>
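<p>The original does not say how the keyword and vector result lists are merged. One common choice is reciprocal rank fusion (RRF), sketched below purely as an assumption: each document scores 1 / (k + rank) in every list it appears in, and the scores are summed. The class name and the constant <code>k</code> are illustrative.</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class HybridSearch {
    // Merge two ranked lists of document ids with reciprocal rank fusion.
    static List<String> fuse(List<String> keywordHits, List<String> vectorHits, int k, int topN) {
        Map<String, Double> score = new HashMap<>();
        accumulate(score, keywordHits, k);
        accumulate(score, vectorHits, k);

        List<String> ids = new ArrayList<>(score.keySet());
        // Highest fused score first; ties are broken by id so the output is deterministic.
        ids.sort((a, b) -> {
            int byScore = Double.compare(score.get(b), score.get(a));
            return byScore != 0 ? byScore : a.compareTo(b);
        });
        return new ArrayList<>(ids.subList(0, Math.min(topN, ids.size())));
    }

    // Add 1 / (k + rank) for each hit, with rank starting at 1.
    private static void accumulate(Map<String, Double> score, List<String> ranked, int k) {
        for (int i = 0; i < ranked.size(); i++) {
            score.merge(ranked.get(i), 1.0 / (k + i + 1), Double::sum);
        }
    }
}
```

<p>A document that appears in both lists accumulates two contributions, so it tends to outrank one that only a single retriever found.</p>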
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="rerank排序">ReRank ordering<a href="https://jameszbl.github.io/feedback-reply-bot#rerank%E6%8E%92%E5%BA%8F" class="hash-link" aria-label="Direct link to ReRank ordering" title="Direct link to ReRank ordering" translate="no">​</a></h2>
<p>After recall, a ReRank model reorders the candidates.</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Top20</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">ReRank</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Top5</span><br></div></code></pre></div></div>
<p>Only the five most relevant passages reach the prompt, which keeps noisy data out of it.</p>
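<p>A sketch of the Top20 to Top5 step. The <code>scorer</code> parameter stands in for the ReRank model, which the original does not name; here it is any function from (query, passage) to a relevance score.</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToDoubleBiFunction;

final class Reranker {
    // Score every recalled passage once, sort by score, and keep the best `keep`.
    static List<String> rerank(String query, List<String> candidates,
                               ToDoubleBiFunction<String, String> scorer, int keep) {
        // Cache scores so the (potentially expensive) model is called once per passage.
        Map<String, Double> scores = new HashMap<>();
        for (String passage : candidates) {
            scores.put(passage, scorer.applyAsDouble(query, passage));
        }
        List<String> sorted = new ArrayList<>(candidates);
        // List.sort is stable, so passages with equal scores keep their recall order.
        sorted.sort((a, b) -> Double.compare(scores.get(b), scores.get(a)));
        return new ArrayList<>(sorted.subList(0, Math.min(keep, sorted.size())));
    }
}
```

<p>In the pipeline described here the call would be <code>Reranker.rerank(review, top20, model, 5)</code>.</p>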
<hr>
<h1>V. Multi-Model Comparison</h1>
<p>Before launch we ran extensive A/B tests.</p>
<table><thead><tr><th>Model</th><th>Reply quality</th><th>Cost</th><th>Speed</th></tr></thead><tbody><tr><td>GPT</td><td>★★★★★</td><td>High</td><td>Medium</td></tr><tr><td>DeepSeek</td><td>★★★★☆</td><td>Low</td><td>Fast</td></tr><tr><td>Qwen</td><td>★★★★</td><td>Low</td><td>Fast</td></tr></tbody></table>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="评测维度">Evaluation dimensions<a href="https://jameszbl.github.io/feedback-reply-bot#%E8%AF%84%E6%B5%8B%E7%BB%B4%E5%BA%A6" class="hash-link" aria-label="Direct link to Evaluation dimensions" title="Direct link to Evaluation dimensions" translate="no">​</a></h2>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="准确率">Accuracy<a href="https://jameszbl.github.io/feedback-reply-bot#%E5%87%86%E7%A1%AE%E7%8E%87" class="hash-link" aria-label="Direct link to Accuracy" title="Direct link to Accuracy" translate="no">​</a></h3>
<p>Whether the reply actually addresses the customer's issue.</p>
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="礼貌度">Politeness<a href="https://jameszbl.github.io/feedback-reply-bot#%E7%A4%BC%E8%B2%8C%E5%BA%A6" class="hash-link" aria-label="Direct link to Politeness" title="Direct link to Politeness" translate="no">​</a></h3>
<p>Whether the reply follows customer service guidelines.</p>
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="知识引用率">Knowledge citation rate<a href="https://jameszbl.github.io/feedback-reply-bot#%E7%9F%A5%E8%AF%86%E5%BC%95%E7%94%A8%E7%8E%87" class="hash-link" aria-label="Direct link to Knowledge citation rate" title="Direct link to Knowledge citation rate" translate="no">​</a></h3>
<p>Whether the reply uses the retrieved content correctly.</p>
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="幻觉率">Hallucination rate<a href="https://jameszbl.github.io/feedback-reply-bot#%E5%B9%BB%E8%A7%89%E7%8E%87" class="hash-link" aria-label="Direct link to Hallucination rate" title="Direct link to Hallucination rate" translate="no">​</a></h3>
<p>Whether the reply fabricates facts.</p>
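<p>One way these four dimensions could be turned into numbers, assuming each reply in a test sample has been labelled with four yes/no verdicts. How the original evaluation was actually scored is not stated; the class and field names below are illustrative.</p>

```java
import java.util.List;
import java.util.function.Predicate;

// One verdict per generated reply, for example from a human reviewer.
final class ReplyJudgement {
    final boolean answersQuestion;
    final boolean polite;
    final boolean usesRetrievedKnowledge;
    final boolean hallucinated;

    ReplyJudgement(boolean answersQuestion, boolean polite,
                   boolean usesRetrievedKnowledge, boolean hallucinated) {
        this.answersQuestion = answersQuestion;
        this.polite = polite;
        this.usesRetrievedKnowledge = usesRetrievedKnowledge;
        this.hallucinated = hallucinated;
    }
}

final class EvalReport {
    // Share of judgements that satisfy `criterion`, in [0, 1]; 0 for an empty sample.
    static double rate(List<ReplyJudgement> sample, Predicate<ReplyJudgement> criterion) {
        if (sample.isEmpty()) {
            return 0.0;
        }
        long hits = sample.stream().filter(criterion).count();
        return (double) hits / sample.size();
    }
}
```

<p>For instance, <code>EvalReport.rate(sample, j -> j.hallucinated)</code> gives the hallucination rate for one model's sample, and the same call with a different predicate gives the other three metrics.</p>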
<hr>
<p>In the end we chose:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">DeepSeek</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">+</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Qwen</span><br></div></code></pre></div></div>
<p>as the primary models.</p>
<p>Complex cases are escalated to GPT, trading higher cost for higher reply quality.</p>
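<p>A sketch of that routing rule. The original does not define what counts as a complex case, so the <code>isComplex</code> heuristic below (more than one detected intent, or a long review) and its threshold are pure assumptions for illustration.</p>

```java
import java.util.List;

final class ModelRouter {
    static final List<String> PRIMARY = List.of("DeepSeek", "Qwen");
    static final String ESCALATION = "GPT";

    // Assumption: a review is "complex" when it touches several intents
    // or is unusually long. The real rule is not given in the article.
    static boolean isComplex(String review, int intentCount) {
        return intentCount > 1 || review.length() > 200;
    }

    // Primary models handle ordinary reviews; complex ones go to the stronger model.
    static List<String> selectModels(String review, int intentCount) {
        return isComplex(review, intentCount) ? List.of(ESCALATION) : PRIMARY;
    }
}
```

<p>Routing only the hard cases to the most expensive model keeps average cost close to that of the two cheaper models.</p>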
<hr>
<h1>VI. Results</h1>
<p>After launch we monitored the system continuously.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="运营效率提升">Operational efficiency gains<a href="https://jameszbl.github.io/feedback-reply-bot#%E8%BF%90%E8%90%A5%E6%95%88%E7%8E%87%E6%8F%90%E5%8D%87" class="hash-link" aria-label="Direct link to Operational efficiency gains" title="Direct link to Operational efficiency gains" translate="no">​</a></h2>
<p>Reply handling time:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Manual reply:</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">30 s per review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">AI-assisted:</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">5 s per review</span><br></div></code></pre></div></div>
<p>That cuts handling time per review by roughly 80% (30 s down to 5 s is about an 83% reduction).</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="自动采纳率">Direct adoption rate<a href="https://jameszbl.github.io/feedback-reply-bot#%E8%87%AA%E5%8A%A8%E9%87%87%E7%BA%B3%E7%8E%87" class="hash-link" aria-label="Direct link to Direct adoption rate" title="Direct link to Direct adoption rate" translate="no">​</a></h2>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Share of AI replies accepted without edits</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">≈ 75%</span><br></div></code></pre></div></div>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="差评处理时效">Negative-review response time<a href="https://jameszbl.github.io/feedback-reply-bot#%E5%B7%AE%E8%AF%84%E5%A4%84%E7%90%86%E6%97%B6%E6%95%88" class="hash-link" aria-label="Direct link to Negative-review response time" title="Direct link to Negative-review response time" translate="no">​</a></h2>
<p>From:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">24 hours</span><br></div></code></pre></div></div>
<p>down to:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Under 2 hours</span><br></div></code></pre></div></div>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="成本收益">Cost benefit<a href="https://jameszbl.github.io/feedback-reply-bot#%E6%88%90%E6%9C%AC%E6%94%B6%E7%9B%8A" class="hash-link" aria-label="Direct link to Cost benefit" title="Direct link to Cost benefit" translate="no">​</a></h2>
<p>Overall operating cost fell by about:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">60%</span><br></div></code></pre></div></div>
<p>while reply quality stayed stable.</p>
<hr>
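<h1>Summary</h1>
<p>An AI auto-reply system for product reviews is not, at its core, simple LLM question answering. It is an enterprise-grade AI application that combines:</p>
<ul>
<li class="">Agent workflows</li>
<li class="">A RAG knowledge base</li>
<li class="">Prompt engineering</li>
<li class="">Multi-model collaboration</li>
<li class="">Human review</li>
</ul>
<p>For developers, the real challenge is not calling an LLM API but building an AI workflow system that is stable, extensible, and governable.</p>]]></content:encoded>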
            <category>ai</category>
            <category>java</category>
            <category>rag</category>
        </item>
        <item>
            <title><![CDATA[Enterprise-Grade Agent Workflow Design]]></title>
            <link>https://jameszbl.github.io/agent-workflow-design</link>
            <guid>https://jameszbl.github.io/agent-workflow-design</guid>
            <pubDate>Sun, 04 Jan 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[{/ truncate /}]]></description>
            <content:encoded><![CDATA[
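<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="企业级agent工作流设计">Enterprise-Grade Agent Workflow Design<a href="https://jameszbl.github.io/agent-workflow-design#%E4%BC%81%E4%B8%9A%E7%BA%A7agent%E5%B7%A5%E4%BD%9C%E6%B5%81%E8%AE%BE%E8%AE%A1" class="hash-link" aria-label="Direct link to Enterprise-Grade Agent Workflow Design" title="Direct link to Enterprise-Grade Agent Workflow Design" translate="no">​</a></h2>
<p>As large models grow more capable, more and more companies are trying to build their own AI Agents. In real business scenarios, however, many teams quickly find that:</p>
<blockquote>
<p>One large model plus one Agent is nowhere near enough to support complex enterprise processes.</p>
</blockquote>
<p>What actually makes it into production is usually not a single Agent but an Agent Workflow that can be orchestrated, governed, and audited.</p>
<p>This article presents the design ideas behind enterprise-grade Agent workflows from an engineering perspective.</p>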
<hr>
<h1>1. Why a Single Agent Is Not Enough</h1>
<p>Many newcomers picture an Agent like this:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">User question</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    LLM</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Call a tool</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Return the result</span><br></div></code></pre></div></div>
<p>For example:</p>
<p>User:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Check my order status for me</span><br></div></code></pre></div></div>
<p>Agent:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Call the order system API</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Fetch the result</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Generate the reply</span><br></div></code></pre></div></div>
<p>This pattern works well for simple tasks.</p>
<p>Real enterprise scenarios, however, tend to be far more complex.</p>
<p>Take a customer service scenario:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Customer complains about product quality</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Detect sentiment</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Detect intent</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Look up the order</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Query the product knowledge base</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Generate a resolution plan</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Human review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Send the reply</span><br></div></code></pre></div></div>
<p>This is where the difficulties start to pile up:</p>
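<ul>
<li class="">Dependencies across multiple steps</li>
<li class="">Calls to multiple systems</li>
<li class="">Several models working together</li>
<li class="">Human intervention is required</li>
<li class="">Failed steps need to be retried</li>
<li class="">Audit logs are required</li>
</ul>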
<p>If all of this is left to a single Agent deciding on its own:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">The Agent reasons on its own</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">The Agent decides on its own</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">The Agent calls tools on its own</span><br></div></code></pre></div></div>
<p>the result is a system that is:</p>
<ul>
<li class="">Unstable</li>
<li class="">Unpredictable</li>
<li class="">Hard to debug</li>
<li class="">Hard to govern</li>
</ul>
<p>Enterprises therefore tend to prefer:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Agent capabilities</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">+</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Workflow orchestration</span><br></div></code></pre></div></div>
<p>rather than a fully autonomous Agent.</p>
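<p>To make the idea concrete, here is a rough sketch of a code-defined workflow: the step order is fixed by the developer, each step may call a model or a system, failed steps are retried, and every attempt is written to an audit log. The <code>Workflow</code> class and its API are hypothetical and only illustrate the pattern.</p>

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

final class Workflow {
    // Steps run in insertion order; each one reads and returns the shared context.
    private final Map<String, UnaryOperator<Map<String, Object>>> steps = new LinkedHashMap<>();
    private final List<String> auditLog = new ArrayList<>();
    private final int maxAttempts;

    Workflow(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    Workflow step(String name, UnaryOperator<Map<String, Object>> action) {
        steps.put(name, action);
        return this;
    }

    Map<String, Object> run(Map<String, Object> input) {
        Map<String, Object> ctx = new LinkedHashMap<>(input);
        for (Map.Entry<String, UnaryOperator<Map<String, Object>>> s : steps.entrySet()) {
            ctx = runWithRetry(s.getKey(), s.getValue(), ctx);
        }
        return ctx;
    }

    // Retry a failing step up to maxAttempts times, logging every attempt.
    private Map<String, Object> runWithRetry(String name,
                                             UnaryOperator<Map<String, Object>> action,
                                             Map<String, Object> ctx) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                Map<String, Object> out = action.apply(ctx);
                auditLog.add(name + " ok (attempt " + attempt + ")");
                return out;
            } catch (RuntimeException e) {
                last = e;
                auditLog.add(name + " failed (attempt " + attempt + "): " + e.getMessage());
            }
        }
        throw new IllegalStateException("step " + name + " exhausted retries", last);
    }

    List<String> auditLog() {
        return auditLog;
    }
}
```

<p>The model still does the work inside each step, but which step runs next is decided by code, which is what makes the run predictable and auditable.</p>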
<hr>
<h1>2. 工作流与Agent区别</h1>
<p>很多人容易混淆 Workflow 和 Agent。</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="agent">Agent<a href="https://jameszbl.github.io/agent-workflow-design#agent" class="hash-link" aria-label="Agent的直接链接" title="Agent的直接链接" translate="no">​</a></h2>
<p>Core characteristic:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Autonomous decision-making</span><br></div></code></pre></div></div>
<p>Execution loop:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Thought</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Action</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Observation</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Thought</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Action</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Observation</span><br></div></code></pre></div></div>
<p>At every step, the model decides what the Agent does next.</p>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">User:</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Analyze this customer for me</span><br></div></code></pre></div></div>
<p>The Agent might:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Query the CRM</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Query orders</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Query chat history</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Generate a report</span><br></div></code></pre></div></div>
<p>The path as a whole is not determined in advance.</p>
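<p>The Thought / Action / Observation loop can be sketched in a few lines. This is a simplified illustration, not any framework's API: the <code>policy</code> function stands in for the model call, <code>tools</code> stands in for tool execution, and the action names are made up. The stub policy walks a fixed list only so that the example is deterministic; a real model could pick a different action on each run.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Sketch of a Thought/Action/Observation loop with a stubbed "model".
class AgentLoopSketch {
    static final String FINISH = "finish";

    // policy: given the observations so far, returns the next action name.
    // tools:  maps an action name to an observation.
    // Returns the sequence of actions that were chosen.
    static List<String> run(Function<List<String>, String> policy,
                            Function<String, String> tools,
                            int maxSteps) {
        List<String> observations = new ArrayList<>();
        List<String> trace = new ArrayList<>();
        for (int step = 0; step < maxSteps; step++) {
            String action = policy.apply(observations); // Thought + Action
            trace.add(action);
            if (FINISH.equals(action)) {
                break;
            }
            observations.add(tools.apply(action));      // Observation
        }
        return trace;
    }

    public static void main(String[] args) {
        // Hypothetical action names; the plan is fixed only for the demo.
        List<String> plan = List.of("queryCrm", "queryOrders", "queryChatHistory", FINISH);
        List<String> trace = run(obs -> plan.get(obs.size()), action -> action + ":result", 10);
        System.out.println(trace);
    }
}
```

<p>The <code>maxSteps</code> cap is the only hard limit in this loop; everything else is up to the policy, which is why the path is hard to predict.</p>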
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="workflow">Workflow<a href="https://jameszbl.github.io/agent-workflow-design#workflow" class="hash-link" aria-label="Workflow的直接链接" title="Workflow的直接链接" translate="no">​</a></h2>
<p>Core characteristic:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">A fixed process</span><br></div></code></pre></div></div>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Sentiment analysis</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Intent recognition</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Knowledge base retrieval</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Generate reply</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Human review</span><br></div></code></pre></div></div>
<p>The path is defined in advance.</p>
<p>The model only supplies the capability of an individual node.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="企业为什么偏爱workflow">Why Enterprises Prefer Workflow<a href="https://jameszbl.github.io/agent-workflow-design#%E4%BC%81%E4%B8%9A%E4%B8%BA%E4%BB%80%E4%B9%88%E5%81%8F%E7%88%B1workflow" class="hash-link" aria-label="企业为什么偏爱Workflow的直接链接" title="企业为什么偏爱Workflow的直接链接" translate="no">​</a></h2>
<p>Because a Workflow is:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Observable</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Debuggable</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Replayable</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Auditable</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Governable</span><br></div></code></pre></div></div>
<p>In production, stability matters far more than how intelligent the system is.</p>
<p>So many enterprise Agents are, in essence:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Workflow + Agent</span><br></div></code></pre></div></div>
<p>rather than a pure Agent.</p>
<hr>
<h1>3. The Ideas Behind LangGraph</h1>
<p>One of the most popular Agent Workflow frameworks today is:</p>
<p>LangGraph</p>
<p>Its core idea is very simple:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Graph</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">=</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Node + Edge</span><br></div></code></pre></div></div>
<p>That is:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Processing steps + the connections between them</span><br></div></code></pre></div></div>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">User input</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Intent recognition</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Knowledge base retrieval</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Answer generation</span><br></div></code></pre></div></div>
<p>which corresponds to:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Node A</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Node B</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Node C</span><br></div></code></pre></div></div>
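<p>To make "graph = nodes + edges" concrete, here is a minimal plain-Java sketch. It is not the LangGraph API; <code>MiniGraph</code>, <code>addNode</code>, and <code>addEdge</code> are hypothetical names chosen for this illustration, and each node is just a function from text to text.</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Plain-Java illustration of "graph = nodes + edges". Not the LangGraph API.
class MiniGraph {
    private final Map<String, UnaryOperator<String>> nodes = new HashMap<>();
    private final Map<String, String> edges = new HashMap<>();

    MiniGraph addNode(String name, UnaryOperator<String> action) {
        nodes.put(name, action);
        return this;
    }

    MiniGraph addEdge(String from, String to) {
        edges.put(from, to);
        return this;
    }

    // Starts at `entry` and follows edges until a node has no outgoing edge.
    // Assumes every referenced node exists and that the edges contain no cycle.
    String run(String entry, String input, List<String> visited) {
        String current = entry;
        String value = input;
        while (current != null) {
            visited.add(current);
            value = nodes.get(current).apply(value);
            current = edges.get(current);
        }
        return value;
    }

    public static void main(String[] args) {
        MiniGraph graph = new MiniGraph()
            .addNode("intent", q -> q + " [intent=shipping]")   // stub for intent recognition
            .addNode("retrieve", q -> q + " [docs=2]")          // stub for knowledge base retrieval
            .addNode("answer", q -> q + " [answered]")          // stub for answer generation
            .addEdge("intent", "retrieve")
            .addEdge("retrieve", "answer");
        List<String> visited = new ArrayList<>();
        System.out.println(graph.run("intent", "Where is my parcel?", visited));
        System.out.println(visited);
    }
}
```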
<hr>
<p>LangGraph adds three more concepts on top of this:</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="状态">State<a href="https://jameszbl.github.io/agent-workflow-design#%E7%8A%B6%E6%80%81" class="hash-link" aria-label="状态的直接链接" title="状态的直接链接" translate="no">​</a></h3>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">State</span><br></div></code></pre></div></div>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="条件路由">Conditional Routing<a href="https://jameszbl.github.io/agent-workflow-design#%E6%9D%A1%E4%BB%B6%E8%B7%AF%E7%94%B1" class="hash-link" aria-label="条件路由的直接链接" title="条件路由的直接链接" translate="no">​</a></h3>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Conditional Edge</span><br></div></code></pre></div></div>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="循环">Loops<a href="https://jameszbl.github.io/agent-workflow-design#%E5%BE%AA%E7%8E%AF" class="hash-link" aria-label="循环的直接链接" title="循环的直接链接" translate="no">​</a></h3>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Loop</span><br></div></code></pre></div></div>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Generate answer</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Quality check</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Fails the check</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Regenerate</span><br></div></code></pre></div></div>
<p>This forms a closed loop.</p>
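<p>A generate, check, regenerate loop needs an exit condition, otherwise a draft that never passes would loop forever. The sketch below assumes a hard attempt limit; the generator and the length-based quality check are placeholder stubs, not a real model call or a real quality metric.</p>

```java
import java.util.function.IntFunction;
import java.util.function.Predicate;

// Sketch of a generate -> check -> regenerate loop with a hard attempt limit.
class RegenerateLoop {
    // Returns the first draft that passes the check,
    // or null if none passes within maxAttempts.
    static String generateUntilAccepted(IntFunction<String> generator,
                                        Predicate<String> qualityCheck,
                                        int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String draft = generator.apply(attempt);
            if (qualityCheck.test(draft)) {
                return draft;
            }
        }
        return null; // the caller decides what happens next, e.g. hand over to a human
    }

    public static void main(String[] args) {
        // Stub generator: drafts get longer on each attempt.
        IntFunction<String> generator = attempt -> "Sorry".repeat(attempt);
        // Stub check: require at least 10 characters.
        Predicate<String> longEnough = draft -> draft.length() >= 10;
        System.out.println(generateUntilAccepted(generator, longEnough, 3));
    }
}
```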
<hr>
<h1>4. State Machine Design</h1>
<p>The most important design decision in an enterprise Agent is actually not the Prompt.</p>
<p>It is:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">State</span><br></div></code></pre></div></div>
<p>that is, state management.</p>
<hr>
<p>Take a customer service workflow as an example:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">import java.util.List;</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">// Shared state passed through every node of the customer service workflow.</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">public class CustomerServiceState {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    private String userQuestion;       // raw text from the user</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    private String sentiment;          // written by the sentiment node</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    private String intent;             // written by the intent node</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    private List&lt;Document&gt; documents;  // retrieval hits; Document is an application-defined type</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    private String draftReply;         // written by the reply generation node</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    private boolean needHumanReview;   // true when a person must review the reply</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">}</span><br></div></code></pre></div></div>
<p>The whole flow revolves around State.</p>
<hr>
<p>Flow execution:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Node 1</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">updates sentiment</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Node 2</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">updates intent</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Node 3</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">updates documents</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Node 4</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">updates draftReply</span><br></div></code></pre></div></div>
<p>The end result is:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Shared context</span><br></div></code></pre></div></div>
<p>rather than nodes passing parameters to one another.</p>
<hr>
<p>Recommended principle:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Nodes are stateless</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">State is managed centrally</span><br></div></code></pre></div></div>
<p>This makes the system:</p>
<ul>
<li class="">Easy to extend</li>
<li class="">Easy to test</li>
<li class="">Easy to replay</li>
</ul>
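<p>One way to picture stateless nodes over centrally managed state is the sketch below. It is an assumption-laden simplification: the state is a plain <code>Map</code> rather than a typed class, the node bodies are stubs, and <code>StatefulRun</code> is a hypothetical name. Each node only reads and writes the shared state, and a snapshot is stored after every node so a run can be inspected or replayed step by step.</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Sketch: nodes keep no fields of their own; they only read and write the shared state.
class StatefulRun {
    // Runs the nodes in order and returns one state snapshot per node.
    static List<Map<String, String>> run(Map<String, String> state,
                                         List<Consumer<Map<String, String>>> nodes) {
        List<Map<String, String>> snapshots = new ArrayList<>();
        for (Consumer<Map<String, String>> node : nodes) {
            node.accept(state);
            snapshots.add(new HashMap<>(state)); // copy, so later nodes cannot alter history
        }
        return snapshots;
    }

    public static void main(String[] args) {
        Map<String, String> state = new HashMap<>();
        state.put("userQuestion", "The fruit was not sweet");
        // Stub nodes; in a real flow each would call a model or a service.
        List<Consumer<Map<String, String>>> nodes = List.of(
            s -> s.put("sentiment", "negative"),
            s -> s.put("intent", "product_quality"),
            s -> s.put("draftReply", "We are sorry about: " + s.get("userQuestion")));
        List<Map<String, String>> snapshots = run(state, nodes);
        System.out.println(snapshots.size());
        System.out.println(state.get("draftReply"));
    }
}
```

<p>Because a node is just a function of the state, it can be unit-tested by handing it a hand-built state and checking what it wrote.</p>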
<hr>
<h1>5. Conditional Routing</h1>
<p>Enterprise processes almost always branch.</p>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Positive review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Automatic thank-you</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Negative review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Enter after-sales handling</span><br></div></code></pre></div></div>
<hr>
<p>Routing node:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">// routeTo(...) is a placeholder for the routing call; goto is a reserved word in Java.</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">// Constant-first equals() avoids a NullPointerException when sentiment is null.</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">if ("positive".equals(sentiment)) {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    routeTo(POSITIVE_FLOW);</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">} else {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    routeTo(NEGATIVE_FLOW);</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">}</span><br></div></code></pre></div></div>
<p>Flow diagram:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">      Sentiment analysis</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">               ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        ┌──────┴──────┐</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        ↓             ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    Positive      Negative</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        ↓             ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">   Auto-reply    After-sales</span><br></div></code></pre></div></div>
<hr>
<p>Another example:</p>
<p>Intent recognition:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Logistics issue</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Product issue</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Refund issue</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Invoice issue</span><br></div></code></pre></div></div>
<p>Each is routed to its own Agent:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Logistics Agent</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Product Agent</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Refund Agent</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Finance Agent</span><br></div></code></pre></div></div>
<p>This forms a multi-Agent collaboration architecture.</p>
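<p>Intent-based routing can be sketched as a lookup from intent to handler. The four intents mirror the categories above; the <code>IntentRouter</code> class and the handler bodies are hypothetical placeholders for the real Agents, which would each run their own flow.</p>

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Sketch of intent-based routing; handler bodies are placeholders for the real Agents.
class IntentRouter {
    enum Intent { LOGISTICS, PRODUCT, REFUND, INVOICE }

    private final Map<Intent, UnaryOperator<String>> handlers = new EnumMap<>(Intent.class);

    IntentRouter() {
        handlers.put(Intent.LOGISTICS, q -> "logistics-agent: " + q);
        handlers.put(Intent.PRODUCT, q -> "product-agent: " + q);
        handlers.put(Intent.REFUND, q -> "refund-agent: " + q);
        handlers.put(Intent.INVOICE, q -> "finance-agent: " + q);
    }

    // Looks up the handler for the intent and delegates the question to it.
    String route(Intent intent, String question) {
        UnaryOperator<String> handler = handlers.get(intent);
        if (handler == null) {
            throw new IllegalArgumentException("No handler for intent: " + intent);
        }
        return handler.apply(question);
    }

    public static void main(String[] args) {
        IntentRouter router = new IntentRouter();
        System.out.println(router.route(Intent.REFUND, "When will I get my money back?"));
    }
}
```

<p>Keeping the mapping in one table means adding a new intent is a one-line change, and an unmapped intent fails loudly instead of falling through to a default Agent.</p>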
<hr>
<h1>6. Human Review Node</h1>
<p>This is one of the most important parts of putting an Agent into production in an enterprise.</p>
<p>A common reason teams fail when launching an Agent:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">No Human-in-the-Loop</span><br></div></code></pre></div></div>
<p>In other words, the system lacks:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">A way for a human to step in</span><br></div></code></pre></div></div>
<hr>
<p>Typical design:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Generate reply</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Risk detection</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Human review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Send to user</span><br></div></code></pre></div></div>
<hr>
<p>High-risk scenarios:</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="金融">Finance<a href="https://jameszbl.github.io/agent-workflow-design#%E9%87%91%E8%9E%8D" class="hash-link" aria-label="金融的直接链接" title="金融的直接链接" translate="no">​</a></h3>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Loan approval</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Investment advice</span><br></div></code></pre></div></div>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="医疗">Healthcare<a href="https://jameszbl.github.io/agent-workflow-design#%E5%8C%BB%E7%96%97" class="hash-link" aria-label="医疗的直接链接" title="医疗的直接链接" translate="no">​</a></h3>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Diagnostic advice</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Medication advice</span><br></div></code></pre></div></div>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="客服">Customer Service<a href="https://jameszbl.github.io/agent-workflow-design#%E5%AE%A2%E6%9C%8D" class="hash-link" aria-label="客服的直接链接" title="客服的直接链接" translate="no">​</a></h3>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Compensation amounts</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Refund amounts</span><br></div></code></pre></div></div>
<hr>
<p>Review node design:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">// ReviewResult is assumed to be an enum with the values WAIT_HUMAN and PASS.</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">public class ReviewNode {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    // Decides whether the drafted reply may be sent automatically.</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    public ReviewResult review(State state) {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        if (state.isHighRisk()) {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            // Pause the flow until a human reviewer acts.</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            return ReviewResult.WAIT_HUMAN;</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        }</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        return ReviewResult.PASS;</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    }</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">}</span><br></div></code></pre></div></div>
<hr>
<p>Enterprises usually also give the reviewer three actions:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Approve</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Reject</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Approve with edits</span><br></div></code></pre></div></div>
<p>Together these form a:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Human Approval Workflow</span><br></div></code></pre></div></div>
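<p>The three reviewer actions can be modelled as a small decision step that turns a draft into the text that is actually sent. This is a hedged sketch: <code>HumanApproval</code>, the <code>Decision</code> enum, and the choice to return <code>null</code> for a rejected draft are assumptions made for illustration, not part of any described system.</p>

```java
// Sketch of applying a reviewer decision to a drafted reply.
class HumanApproval {
    enum Decision { APPROVE, REJECT, APPROVE_WITH_EDITS }

    // Returns the text to send, or null when the draft is rejected and must not be sent.
    static String resolve(Decision decision, String draft, String editedText) {
        switch (decision) {
            case APPROVE:
                return draft;
            case APPROVE_WITH_EDITS:
                if (editedText == null || editedText.trim().isEmpty()) {
                    throw new IllegalArgumentException("Edited text is required");
                }
                return editedText;
            case REJECT:
                return null;
            default:
                throw new IllegalStateException("Unknown decision: " + decision);
        }
    }

    public static void main(String[] args) {
        String draft = "We will refund 50 yuan.";
        System.out.println(resolve(Decision.APPROVE, draft, null));
        System.out.println(resolve(Decision.APPROVE_WITH_EDITS, draft, "We will refund 30 yuan."));
        System.out.println(resolve(Decision.REJECT, draft, null));
    }
}
```

<p>In a full workflow, a rejected draft would typically be routed back to regeneration or to manual handling rather than silently dropped.</p>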
<hr>
<h1>7. Case Study: An Enterprise Customer Service Agent</h1>
<p>The following uses an e-commerce customer service Agent as the example.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="整体架构">Overall Architecture<a href="https://jameszbl.github.io/agent-workflow-design#%E6%95%B4%E4%BD%93%E6%9E%B6%E6%9E%84" class="hash-link" aria-label="整体架构的直接链接" title="整体架构的直接链接" translate="no">​</a></h2>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">User review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Sentiment analysis</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Intent recognition</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Conditional routing</span><br></div></code></pre></div></div>
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="好评流程">Positive Review Flow<a href="https://jameszbl.github.io/agent-workflow-design#%E5%A5%BD%E8%AF%84%E6%B5%81%E7%A8%8B" class="hash-link" aria-label="好评流程的直接链接" title="好评流程的直接链接" translate="no">​</a></h3>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Positive review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Product knowledge base</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Generate thank-you reply</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Send automatically</span><br></div></code></pre></div></div>
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="差评流程">Negative review flow<a href="https://jameszbl.github.io/agent-workflow-design#%E5%B7%AE%E8%AF%84%E6%B5%81%E7%A8%8B" class="hash-link" aria-label="Direct link to Negative review flow" title="Direct link to Negative review flow" translate="no">​</a></h3>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Negative review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Intent recognition</span><br></div></code></pre></div></div>
<p>Intents that may be recognized:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Fruit not sweet</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Slow delivery</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Damaged packaging</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">After-sales issue</span><br></div></code></pre></div></div>
<hr>
<p>The review is then routed to the matching Agent:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Product Agent</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Logistics Agent</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">After-sales Agent</span><br></div></code></pre></div></div>
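<p>The routing from a recognized intent to an Agent can be sketched in plain Java. This is only an illustration under stated assumptions, not the system's actual code: the names Intent, AgentType and IntentRouter are hypothetical, and sending damaged packaging to the logistics Agent is an assumed mapping.</p>

```java
// Hypothetical types for illustration only; none of these names come from a framework.
enum Intent { FRUIT_NOT_SWEET, SLOW_DELIVERY, DAMAGED_PACKAGING, AFTER_SALES_ISSUE }

enum AgentType { PRODUCT_AGENT, LOGISTICS_AGENT, AFTER_SALES_AGENT }

class IntentRouter {
    // Conditional routing: each recognized intent is handled by exactly one Agent.
    static AgentType route(Intent intent) {
        switch (intent) {
            case FRUIT_NOT_SWEET:
                return AgentType.PRODUCT_AGENT;
            case SLOW_DELIVERY:
            case DAMAGED_PACKAGING:   // assumed mapping, adjust to the real business rules
                return AgentType.LOGISTICS_AGENT;
            case AFTER_SALES_ISSUE:
                return AgentType.AFTER_SALES_AGENT;
            default:
                throw new IllegalArgumentException("Unrouted intent: " + intent);
        }
    }
}
```

<p>Keeping the mapping in one place means that adding a new intent forces an explicit routing decision instead of silently falling through.</p>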
<hr>
<p>Each Agent then runs:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">RAG retrieval</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Knowledge augmentation</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Prompt assembly</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Reply generation</span><br></div></code></pre></div></div>
<hr>
<p>The reply then goes through:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Risk detection</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Human review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Send to user</span><br></div></code></pre></div></div>
<hr>
<p>Overall flowchart:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">                User review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                     ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            Sentiment analysis</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                     ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">           ┌─────────┴─────────┐</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">           ↓                   ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    Positive review     Negative review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">           ↓                   ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">   Automatic thanks   Intent recognition</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                               ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">             ┌─────────────────┼─────────────────┐</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">             ↓                 ↓                 ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">       Product Agent    Logistics Agent  After-sales Agent</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">             └─────────────────┬─────────────────┘</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                               ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                         RAG retrieval</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                               ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                       Reply generation</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                               ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                        Risk detection</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                               ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                         Human review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                               ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                         Reply to user</span><br></div></code></pre></div></div>
<hr>
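<h1>Summary</h1>
<p>The core of an enterprise-grade Agent is not to give the model unlimited autonomy, but to embed large-model capabilities into business processes that can be governed.</p>
<p>A mature enterprise Agent system typically provides:</p>
<ul>
<li class="">Workflow orchestration</li>
<li class="">State machine management</li>
<li class="">Conditional routing</li>
<li class="">Multi-Agent collaboration</li>
<li class="">RAG knowledge augmentation</li>
<li class="">Human-in-the-Loop</li>
<li class="">End-to-end logging and auditing</li>
</ul>
<p>Only when Agents are combined with workflows does AI become truly capable of entering an enterprise's core business systems.</p>]]></content:encoded>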
            <category>ai</category>
            <category>java</category>
            <category>rag</category>
        </item>
        <item>
            <title><![CDATA[The node clearly ran, but the state was lost]]></title>
            <link>https://jameszbl.github.io/state-missed</link>
            <guid>https://jameszbl.github.io/state-missed</guid>
            <pubDate>Mon, 15 Dec 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[]]></description>
            <content:encoded><![CDATA[
<hr>
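<p>I recently tried to rework a serial pipeline originally built on LangChain4j into a LangGraph workflow, and ran into a problem.
This situation is very common in real projects. It easily makes you suspect a framework bug, only to find in the end that you did not understand LangGraph's state mechanism well enough.</p>
<p><strong>The node clearly ran, but the state was lost</strong></p>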
<hr>
<p>The overall flow is simple:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">User review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Sentiment analysis</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Intent recognition</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Knowledge base retrieval</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Reply generation</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Human review</span><br></div></code></pre></div></div>
<p>Because I also planned to add conditional routing and human-review nodes later, I decided to give LangGraph a try.</p>
<p>Development went very smoothly at first, and the nodes all ran.</p>
<p>Then, at the third node, a particularly strange problem suddenly appeared.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="问题现象">Symptoms<a href="https://jameszbl.github.io/state-missed#%E9%97%AE%E9%A2%98%E7%8E%B0%E8%B1%A1" class="hash-link" aria-label="Direct link to Symptoms" title="Direct link to Symptoms" translate="no">​</a></h2>
<p>The sentiment analysis node produced normal output:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">state.put("sentiment", "NEGATIVE");</span><br></div></code></pre></div></div>
<p>The log showed it as well:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">EmotionNode execute</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">sentiment=NEGATIVE</span><br></div></code></pre></div></div>
<p>Then, in the intent recognition node:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">String sentiment = (String) state.get("sentiment");</span><br></div></code></pre></div></div>
<p>The value simply came back as:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">null</span><br></div></code></pre></div></div>
<p>The flow kept running regardless.</p>
<p>Then reply generation failed with:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">java.lang.NullPointerException</span><br></div></code></pre></div></div>
<p>My first reaction was:</p>
<blockquote>
<p>Did LangGraph fail to pass the state along?</p>
</blockquote>
<p>So I started troubleshooting frantically.</p>
<hr>
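<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="第一轮排查">First round of troubleshooting<a href="https://jameszbl.github.io/state-missed#%E7%AC%AC%E4%B8%80%E8%BD%AE%E6%8E%92%E6%9F%A5" class="hash-link" aria-label="Direct link to First round of troubleshooting" title="Direct link to First round of troubleshooting" translate="no">​</a></h2>
<p>I first suspected a Spring Bean scope problem.</p>
<p>That was because every node was declared like this:</p>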
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">@Component</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">public class EmotionNode implements Node</span><br></div></code></pre></div></div>
<p>I suspected that multiple requests were sharing state.</p>
<p>So I added logging:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">System.out.println(state);</span><br></div></code></pre></div></div>
<p>It turned out that each request received a different state object.</p>
<p>Ruled out.</p>
<hr>
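<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="第二轮排查">Second round of troubleshooting<a href="https://jameszbl.github.io/state-missed#%E7%AC%AC%E4%BA%8C%E8%BD%AE%E6%8E%92%E6%9F%A5" class="hash-link" aria-label="Direct link to Second round of troubleshooting" title="Direct link to Second round of troubleshooting" translate="no">​</a></h2>
<p>Next I suspected asynchronous execution.</p>
<p>That was because the workflow configuration used:</p>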
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">ExecutorService executor =</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    Executors.newFixedThreadPool(10);</span><br></div></code></pre></div></div>
<p>So I removed the thread pool entirely.</p>
<p>I switched to a single thread.</p>
<p>The problem persisted.</p>
<hr>
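<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="第三轮排查">Third round of troubleshooting<a href="https://jameszbl.github.io/state-missed#%E7%AC%AC%E4%B8%89%E8%BD%AE%E6%8E%92%E6%9F%A5" class="hash-link" aria-label="Direct link to Third round of troubleshooting" title="Direct link to Third round of troubleshooting" translate="no">​</a></h2>
<p>I started digging through the official documentation.</p>
<p>I came across one inconspicuous sentence:</p>
<blockquote>
<p>State updates are merged using reducers.</p>
</blockquote>
<p>At the time I did not understand it at all.</p>
<p>I kept going and read the source code.</p>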
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="发现异常点">Finding the anomaly<a href="https://jameszbl.github.io/state-missed#%E5%8F%91%E7%8E%B0%E5%BC%82%E5%B8%B8%E7%82%B9" class="hash-link" aria-label="Direct link to Finding the anomaly" title="Direct link to Finding the anomaly" translate="no">​</a></h2>
<p>In LangGraph:</p>
<p>A node does not modify the global state directly.</p>
<p>Instead, it returns an update:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">return Map.of(</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    "sentiment",</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    "NEGATIVE"</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">);</span><br></div></code></pre></div></div>
<p>The framework merges the node's return value into the State.</p>
<p>What I had written instead was:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">state.put("sentiment", "NEGATIVE");</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">return state;</span><br></div></code></pre></div></div>
<p>It looks fine.</p>
<p>In fact, this is exactly where the problem was.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="langgraph状态机制原理">How LangGraph's state mechanism works<a href="https://jameszbl.github.io/state-missed#langgraph%E7%8A%B6%E6%80%81%E6%9C%BA%E5%88%B6%E5%8E%9F%E7%90%86" class="hash-link" aria-label="Direct link to How LangGraph's state mechanism works" title="Direct link to How LangGraph's state mechanism works" translate="no">​</a></h2>
<p>Many people who encounter LangGraph for the first time instinctively assume:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Node A</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Modify State</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Node B reads State</span><br></div></code></pre></div></div>
<p>Internally, it is actually closer to:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">State</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">   ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Node A</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">   ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Delta</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">   ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Reducer</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">   ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">New State</span><br></div></code></pre></div></div>
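<p>In other words:</p>
<p>A node returns a state increment (Delta).</p>
<p>The framework is responsible for merging it.</p>
<p>Nodes do not directly share one and the same object.</p>
<p>So a call like:</p>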
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">state.put(...)</span><br></div></code></pre></div></div>
<p>only changes the object that the current node received.</p>
<p>The final state tree will not necessarily use that object.</p>
<p>This is especially noticeable when any of the following is involved:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Parallel nodes</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Conditional routing</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Checkpoint recovery</span><br></div></code></pre></div></div>
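<p>The State, Node, Delta, Reducer, New State cycle can be sketched in plain Java. This is a minimal illustration that does not use LangGraph's API: DeltaMergeSketch, step and demo are hypothetical names, and the merge rule shown (a later value replaces an earlier one) is an assumption made for the sketch.</p>

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of one graph step; not the framework's implementation.
class DeltaMergeSketch {

    // Runs one node against a read-only snapshot and merges the delta it returns.
    static Map<String, Object> step(Map<String, Object> state,
                                    Function<Map<String, Object>, Map<String, Object>> node) {
        Map<String, Object> snapshot = Map.copyOf(state);  // unmodifiable copy handed to the node
        Map<String, Object> delta = node.apply(snapshot);  // the node returns only what changed
        Map<String, Object> next = new HashMap<>(state);   // the previous state is left untouched
        next.putAll(delta);                                // assumed merge rule: later value wins
        return next;
    }

    // Two nodes in sequence: the second one sees what the first one returned.
    static Map<String, Object> demo() {
        Map<String, Object> s0 = Map.of("review", "The apples were not sweet");
        Map<String, Object> s1 = step(s0, s -> Map.of("sentiment", "NEGATIVE"));
        return step(s1, s -> Map.of("intent", "product:" + s.get("sentiment")));
    }
}
```

<p>Because the snapshot in this sketch is unmodifiable, a node that calls put on it fails immediately with UnsupportedOperationException instead of having its write silently dropped.</p>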
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="为什么设计成这样">Why it is designed this way<a href="https://jameszbl.github.io/state-missed#%E4%B8%BA%E4%BB%80%E4%B9%88%E8%AE%BE%E8%AE%A1%E6%88%90%E8%BF%99%E6%A0%B7" class="hash-link" aria-label="Direct link to Why it is designed this way" title="Direct link to Why it is designed this way" translate="no">​</a></h2>
<p>I only understood this later, after reading the source code.</p>
<p>Suppose there are two parallel nodes:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">          → NodeB →</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">NodeA →</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">          → NodeC →</span><br></div></code></pre></div></div>
<p>NodeB returns:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">{</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    "intent":"logistics issue"</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">}</span><br></div></code></pre></div></div>
<p>NodeC returns:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">{</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    "product":"apple"</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">}</span><br></div></code></pre></div></div>
<p>The framework then merges everything at the end:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">{</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    "intent":"logistics issue",</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    "product":"apple"</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">}</span><br></div></code></pre></div></div>
<p>If every node modified the same object directly:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">sharedState</span><br></div></code></pre></div></div>
<p>the concurrency conflicts would be severe.</p>
<p>So LangGraph adopts an immutable-state approach:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Node output</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Reducer</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">New State</span><br></div></code></pre></div></div>
<p>This is in fact very similar to how React handles State updates.</p>
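<p>The NodeB and NodeC example can be sketched as follows. This is a plain-Java illustration, not LangGraph's implementation: ParallelMergeSketch, mergeAll and demo are hypothetical names, and treating two parallel writes to the same key as an error is an assumption made for this sketch. A real framework would hand that decision to the key's reducer.</p>

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: merging the deltas of nodes that ran in parallel.
class ParallelMergeSketch {

    static Map<String, Object> mergeAll(Map<String, Object> state,
                                        List<Map<String, Object>> deltas) {
        Map<String, Object> next = new HashMap<>(state);
        Set<String> writtenThisStep = new HashSet<>();
        for (Map<String, Object> delta : deltas) {
            for (Map.Entry<String, Object> entry : delta.entrySet()) {
                // Assumption for this sketch: parallel branches must write disjoint keys.
                if (!writtenThisStep.add(entry.getKey())) {
                    throw new IllegalStateException(
                        "Two parallel nodes wrote key: " + entry.getKey());
                }
                next.put(entry.getKey(), entry.getValue());
            }
        }
        return Map.copyOf(next);  // hand out an unmodifiable new state
    }

    static Map<String, Object> demo() {
        Map<String, Object> afterNodeA = Map.of("sentiment", "NEGATIVE");
        Map<String, Object> fromNodeB = Map.of("intent", "logistics issue");
        Map<String, Object> fromNodeC = Map.of("product", "apple");
        return mergeAll(afterNodeA, List.of(fromNodeB, fromNodeC));
    }
}
```

<p>Neither branch ever touches a shared object, so the order in which NodeB and NodeC finish does not affect the merged result.</p>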
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="真正解决方案">The real fix<a href="https://jameszbl.github.io/state-missed#%E7%9C%9F%E6%AD%A3%E8%A7%A3%E5%86%B3%E6%96%B9%E6%A1%88" class="hash-link" aria-label="Direct link to The real fix" title="Direct link to The real fix" translate="no">​</a></h2>
<p>A node should not modify the original State.</p>
<p>It should return only what changed.</p>
<p>Incorrect:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">public Map&lt;String,Object&gt; execute(</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    Map&lt;String,Object&gt; state){</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    state.put("sentiment","NEGATIVE");</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    return state;</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">}</span><br></div></code></pre></div></div>
<p>Correct:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">public Map&lt;String,Object&gt; execute(</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    Map&lt;String,Object&gt; state){</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    return Map.of(</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        "sentiment",</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        "NEGATIVE"</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    );</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">}</span><br></div></code></pre></div></div>
<p>Running it again:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">EmotionNode execute</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">sentiment=NEGATIVE</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">IntentNode execute</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">sentiment=NEGATIVE</span><br></div></code></pre></div></div>
<p>Everything was back to normal.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="第二个隐藏坑字段覆盖">Second hidden pitfall: field overwriting<a href="https://jameszbl.github.io/state-missed#%E7%AC%AC%E4%BA%8C%E4%B8%AA%E9%9A%90%E8%97%8F%E5%9D%91%E5%AD%97%E6%AE%B5%E8%A6%86%E7%9B%96" class="hash-link" aria-label="Direct link to Second hidden pitfall: field overwriting" title="Direct link to Second hidden pitfall: field overwriting" translate="no">​</a></h2>
<p>Barely two days after fixing that, I hit a new pitfall.</p>
<p>The knowledge base retrieval node:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">return Map.of(</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    "documents",</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    docs</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">);</span><br></div></code></pre></div></div>
<p>The generation node:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">return Map.of(</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    "documents",</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    rerankDocs</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">);</span><br></div></code></pre></div></div>
<p>As a result, the final State contained only:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">rerankDocs</span><br></div></code></pre></div></div>
<p>The original recall results were gone.</p>
<p>At first I thought it was a framework caching problem.</p>
<p>Later I found that it was the Reducer's default behavior.</p>
<p>By default:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Same-named field</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">=</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Last write overwrites earlier writes</span><br></div></code></pre></div></div>
<p>That is:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">documents = newValue</span><br></div></code></pre></div></div>
<p>rather than:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">documents.addAll(...)</span><br></div></code></pre></div></div>
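<p>The difference between the two merge strategies can be sketched in plain Java. This is only an illustration of the idea, not LangGraph's actual API: the class <code>ReducerSketch</code>, the <code>OVERWRITE</code> and <code>APPEND</code> reducers, and the <code>apply</code> method are all hypothetical names introduced here.</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

// Hypothetical illustration of per-key merge strategies; NOT LangGraph's API.
class ReducerSketch {

    // Default strategy: the later write replaces the earlier value.
    static final BinaryOperator<Object> OVERWRITE = (oldValue, newValue) -> newValue;

    // Append strategy: both values are treated as lists and concatenated.
    static final BinaryOperator<Object> APPEND = (oldValue, newValue) -> {
        List<Object> merged = new ArrayList<>((List<?>) oldValue);
        merged.addAll((List<?>) newValue);
        return merged;
    };

    // Merges a node's partial update into a copy of the state, one reducer per key.
    // Keys without a registered reducer fall back to OVERWRITE.
    static Map<String, Object> apply(Map<String, Object> state,
                                     Map<String, Object> update,
                                     Map<String, BinaryOperator<Object>> reducers) {
        Map<String, Object> next = new HashMap<>(state);
        update.forEach((key, value) -> {
            BinaryOperator<Object> reducer = reducers.getOrDefault(key, OVERWRITE);
            next.merge(key, value, reducer);
        });
        return next;
    }

    public static void main(String[] args) {
        Map<String, Object> state = Map.of("documents", List.of("recalled-1", "recalled-2"));
        Map<String, Object> update = Map.of("documents", List.of("reranked-1"));

        // No reducer registered: prints [reranked-1]
        System.out.println(apply(state, update, Map.of()).get("documents"));

        // Append reducer registered: prints [recalled-1, recalled-2, reranked-1]
        System.out.println(apply(state, update, Map.of("documents", APPEND)).get("documents"));
    }
}
```

<p>With no reducer registered for <code>documents</code>, the rerank node's output silently replaces the recalled documents, which matches the behavior observed above.</p>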
<hr>
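<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="为什么官方一直强调reducer">Why the official docs keep emphasizing the Reducer<a href="https://jameszbl.github.io/state-missed#%E4%B8%BA%E4%BB%80%E4%B9%88%E5%AE%98%E6%96%B9%E4%B8%80%E7%9B%B4%E5%BC%BA%E8%B0%83reducer" class="hash-link" aria-label="Direct link to Why the official docs keep emphasizing the Reducer" title="Direct link to Why the official docs keep emphasizing the Reducer" translate="no">​</a></h2>
<p>At this point I finally understood why the official documentation keeps talking about:</p>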
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Reducer</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">State Schema</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Merge Strategy</span><br></div></code></pre></div></div>
<p>Because LangGraph is, at its core, not a workflow orchestration framework.</p>
<p>It is a:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">state-driven framework</span><br></div></code></pre></div></div>
<p>The flow is only the surface.</p>
<p>State is the core.</p>
<p>Every node exists to evolve the State.</p>
<p>That is also its biggest difference from a traditional workflow engine.</p>
<hr>
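<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="这个坑背后的启示">The lesson behind this pitfall<a href="https://jameszbl.github.io/state-missed#%E8%BF%99%E4%B8%AA%E5%9D%91%E8%83%8C%E5%90%8E%E7%9A%84%E5%90%AF%E7%A4%BA" class="hash-link" aria-label="Direct link to The lesson behind this pitfall" title="Direct link to The lesson behind this pitfall" translate="no">​</a></h2>
<p>Tracking this problem down took close to two hours.</p>
<p>I went from Spring Beans, thread pools, and asynchronous execution all the way into the LangGraph source code, only to find that it was not a framework problem at all: I had been treating LangGraph as an ordinary chain-of-responsibility pattern.</p>
<p>If you picture LangGraph as:</p>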
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Node1</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Node2</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Node3</span><br></div></code></pre></div></div>
<p>many of its design decisions look odd.</p>
<p>But from a different perspective:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">State</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Node</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">State</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Node</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">State</span><br></div></code></pre></div></div>
<p>much of the source code's design suddenly makes sense.</p>]]></content:encoded>
            <category>ai</category>
            <category>java</category>
            <category>langgraph</category>
        </item>
        <item>
            <title><![CDATA[Data Was Written to Qdrant but Search Finds Nothing]]></title>
            <link>https://jameszbl.github.io/qdrant-no-data</link>
            <guid>https://jameszbl.github.io/qdrant-no-data</guid>
            <pubDate>Fri, 12 Dec 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[]]></description>
            <content:encoded><![CDATA[
<hr>
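<p>I have recently been integrating RAG into a system I am building, which means writing the product knowledge base and the logistics knowledge base into Qdrant for RAG retrieval.</p>
<p>Tech stack:</p>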
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Spring Boot</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">LangChain4j</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Qdrant</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">OpenAI Embedding</span><br></div></code></pre></div></div>
<p>The overall flow is very simple:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Documents</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Chunking</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Embedding</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Qdrant storage</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Vector search</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">LLM generation</span><br></div></code></pre></div></div>
<p>During testing, however, a very strange problem showed up.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="问题现象">Symptoms<a href="https://jameszbl.github.io/qdrant-no-data#%E9%97%AE%E9%A2%98%E7%8E%B0%E8%B1%A1" class="hash-link" aria-label="Direct link to Symptoms" title="Direct link to Symptoms" translate="no">​</a></h2>
<p>Writes completed without any problem.</p>
<p>The data was also visible in the Qdrant dashboard:</p>
<div class="language-json codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-json codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token property" style="color:#36acaa">"id"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token property" style="color:#36acaa">"vector"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain">...</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><br></div></code></pre></div></div>
<p>Insert log:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">INFO Insert 5000 points success</span><br></div></code></pre></div></div>
<p>But every query came back empty:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">List&lt;ScoredPoint&gt; result = client.search(...)</span><br></div></code></pre></div></div>
<p>Returned:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">[]</span><br></div></code></pre></div></div>
<p>Search found no data at all.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="第一轮排查">First round of investigation<a href="https://jameszbl.github.io/qdrant-no-data#%E7%AC%AC%E4%B8%80%E8%BD%AE%E6%8E%92%E6%9F%A5" class="hash-link" aria-label="Direct link to First round of investigation" title="Direct link to First round of investigation" translate="no">​</a></h2>
<p>I suspected a problem with the embedding.</p>
<p>Print the vector length:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Embedding embedding =</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    embeddingModel.embed(text).content();</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">System.out.println(</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    embedding.vector().length</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">);</span><br></div></code></pre></div></div>
<p>Result:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">1536</span><br></div></code></pre></div></div>
<p>As expected.</p>
<p>Next, inspect the Qdrant collection:</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">curl localhost:6333/collections/docs</span><br></div></code></pre></div></div>
<p>Returned:</p>
<div class="language-json codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-json codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token property" style="color:#36acaa">"vectors"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token property" style="color:#36acaa">"size"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">1536</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><br></div></code></pre></div></div>
<p>The dimensions match.</p>
<p>That ruled out the embedding.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="第二轮排查">Second round of investigation<a href="https://jameszbl.github.io/qdrant-no-data#%E7%AC%AC%E4%BA%8C%E8%BD%AE%E6%8E%92%E6%9F%A5" class="hash-link" aria-label="Direct link to Second round of investigation" title="Direct link to Second round of investigation" translate="no">​</a></h2>
<p>I suspected the similarity threshold was too high.</p>
<p>Code:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">SearchParams.newBuilder()</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    .setHnswEf(128)</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    .build();</span><br></div></code></pre></div></div>
<p>Search:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">scoreThreshold=0.8</span><br></div></code></pre></div></div>
<p>Changed to:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">scoreThreshold=0.3</span><br></div></code></pre></div></div>
<p>Still nothing.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="第三轮排查">Third round of investigation<a href="https://jameszbl.github.io/qdrant-no-data#%E7%AC%AC%E4%B8%89%E8%BD%AE%E6%8E%92%E6%9F%A5" class="hash-link" aria-label="Direct link to Third round of investigation" title="Direct link to Third round of investigation" translate="no">​</a></h2>
<p>I began to suspect the data had never been written.</p>
<p>Check the point count:</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">curl localhost:6333/collections/docs</span><br></div></code></pre></div></div>
<p>Result:</p>
<div class="language-json codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-json codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token property" style="color:#36acaa">"points_count"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">5000</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><br></div></code></pre></div></div>
<p>The points were there.</p>
<p>Even a lookup by ID:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">retrieve(1)</span><br></div></code></pre></div></div>
<p>returned data.</p>
<p>Which means:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Data exists</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Vectors exist</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Collection exists</span><br></div></code></pre></div></div>
<p>Yet search still returned nothing.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="真正原因">The real cause<a href="https://jameszbl.github.io/qdrant-no-data#%E7%9C%9F%E6%AD%A3%E5%8E%9F%E5%9B%A0" class="hash-link" aria-label="Direct link to The real cause" title="Direct link to The real cause" translate="no">​</a></h2>
<p>Only after going back to the official documentation did I find the problem.</p>
<p>When the collection was created:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">VectorParams.newBuilder()</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    .setSize(1536)</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    .setDistance(Distance.Cosine)</span><br></div></code></pre></div></div>
<p>Later the project switched to a different embedding model:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">bge-large-zh</span><br></div></code></pre></div></div>
<p>The vector dimension became:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">1024</span><br></div></code></pre></div></div>
<p>But the collection was never rebuilt.</p>
<p>That left us with:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Qdrant Collection: 1536 dimensions</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Embedding output: 1024 dimensions</span><br></div></code></pre></div></div>
<p>On write, the wrapper layer did not raise an explicit error, so the mismatch went unnoticed.</p>
<p>At query time:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">1536 vs 1024</span><br></div></code></pre></div></div>
<p>The vectors are simply not in the same space.</p>
<p>So naturally the search returns nothing.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="为什么会这样">Why does this happen?<a href="https://jameszbl.github.io/qdrant-no-data#%E4%B8%BA%E4%BB%80%E4%B9%88%E4%BC%9A%E8%BF%99%E6%A0%B7" class="hash-link" aria-label="Direct link to Why does this happen?" title="Direct link to Why does this happen?" translate="no">​</a></h2>
<p>This comes down to how Qdrant works underneath.</p>
<p>At its core, Qdrant is:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Vector ID</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Vector array</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">HNSW index</span><br></div></code></pre></div></div>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">[0.12,0.33,0.55...]</span><br></div></code></pre></div></div>
<p>HNSW builds its index over a space of fixed dimension.</p>
<p>For instance:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">1536-dimensional space</span><br></div></code></pre></div></div>
<p>Every vector must have:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">exactly the same length</span><br></div></code></pre></div></div>
<p>Otherwise none of the following can be computed:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Cosine Similarity</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Euclidean Distance</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Dot Product</span><br></div></code></pre></div></div>
<p>Because, given:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">A=[1,2,3]</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">B=[1,2]</span><br></div></code></pre></div></div>
<p>no distance between them is mathematically defined.</p>
<p>So Qdrant requires:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Once a collection is created,</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">its vector size cannot change</span><br></div></code></pre></div></div>
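<p>The equal-length requirement is easy to see in a hand-written cosine similarity. The following is a minimal plain-Java sketch, not Qdrant code: the class <code>CosineSketch</code> and its <code>cosine</code> method are hypothetical names used only for illustration.</p>

```java
// Hypothetical helper, not part of Qdrant: cosine similarity in plain Java.
class CosineSketch {

    // Returns the cosine similarity of a and b.
    // Throws if the two vectors do not have the same number of dimensions.
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                "Dimension mismatch: " + a.length + " vs " + b.length);
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        // Each term pairs the i-th component of a with the i-th component of b,
        // which is only possible when both vectors have the same length.
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Same dimension: a similarity can be computed.
        System.out.println(cosine(new float[] {1, 2, 3}, new float[] {1, 2, 3}));

        // Different dimensions: there is no component to pair with a[2].
        try {
            cosine(new float[] {1, 2, 3}, new float[] {1, 2});
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // prints "Dimension mismatch: 3 vs 2"
        }
    }
}
```

<p>The same pairing of components underlies Euclidean distance and dot product, so all three metrics need vectors of one fixed size.</p>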
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="深入源码">Digging into the source<a href="https://jameszbl.github.io/qdrant-no-data#%E6%B7%B1%E5%85%A5%E6%BA%90%E7%A0%81" class="hash-link" aria-label="Direct link to Digging into the source" title="Direct link to Digging into the source" translate="no">​</a></h2>
<p>When Qdrant creates a collection:</p>
<div class="language-protobuf codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-protobuf codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">message VectorParams {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  uint64 size = 1;</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  Distance distance = 2;</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">}</span><br></div></code></pre></div></div>
<p>which corresponds to:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">CreateCollection.newBuilder()</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    .setCollectionName("docs")</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    .setVectorsConfig(...)</span><br></div></code></pre></div></div>
<p>and is ultimately stored in:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Collection Config</span><br></div></code></pre></div></div>
<p>Every subsequent point is validated against:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">vector.length == size</span><br></div></code></pre></div></div>
<p>On a mismatch:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Wrong input vector dimension</span><br></div></code></pre></div></div>
<p>or, with some SDKs, the exception is swallowed outright.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="如何避免">How to avoid it<a href="https://jameszbl.github.io/qdrant-no-data#%E5%A6%82%E4%BD%95%E9%81%BF%E5%85%8D" class="hash-link" aria-label="Direct link to How to avoid it" title="Direct link to How to avoid it" translate="no">​</a></h2>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="方案一启动时校验">Option 1: validate at startup<a href="https://jameszbl.github.io/qdrant-no-data#%E6%96%B9%E6%A1%88%E4%B8%80%E5%90%AF%E5%8A%A8%E6%97%B6%E6%A0%A1%E9%AA%8C" class="hash-link" aria-label="Direct link to Option 1: validate at startup" title="Direct link to Option 1: validate at startup" translate="no">​</a></h3>
<p>Read the collection info when the application starts:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">CollectionInfo info =</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    client.getCollection("docs");</span><br></div></code></pre></div></div>
<p>Check:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">info.getConfig()</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    .getParams()</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    .getVectors()</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    .getSize();</span><br></div></code></pre></div></div>
<p>Compare it with the dimension configured for the Embedding model:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">1536 == embeddingDimension</span><br></div></code></pre></div></div>
<p>If the two differ, fail startup immediately.</p>
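<p>A minimal sketch of such a startup check, assuming the Collection's vector size and the model's dimension have already been read. <code>DimensionGuard</code> and <code>requireSameDimension</code> are hypothetical names for illustration, not part of the Qdrant client or LangChain4j:</p>

```java
// Hypothetical startup guard; not part of any Qdrant or LangChain4j API.
final class DimensionGuard {

    private DimensionGuard() {}

    /**
     * Fails fast when the Collection's vector size differs from the
     * dimension the Embedding model is configured to produce.
     */
    static void requireSameDimension(String collection, long collectionSize, int embeddingDimension) {
        if (collectionSize != embeddingDimension) {
            throw new IllegalStateException(
                "Collection '" + collection + "' expects " + collectionSize
                    + "-dimensional vectors, but the Embedding model produces "
                    + embeddingDimension);
        }
    }

    public static void main(String[] args) {
        requireSameDimension("docs", 1536, 1536); // matching sizes pass silently
        try {
            requireSameDimension("docs", 1536, 1024);
        } catch (IllegalStateException e) {
            System.out.println("Startup aborted: " + e.getMessage());
        }
    }
}
```

<p>Calling a guard like this from an application startup hook makes a mismatch surface before any write or query happens.</p>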
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="方案二collection版本化">Option 2: Version Your Collections<a href="https://jameszbl.github.io/qdrant-no-data#%E6%96%B9%E6%A1%88%E4%BA%8Ccollection%E7%89%88%E6%9C%AC%E5%8C%96" class="hash-link" aria-label="方案二：Collection版本化的直接链接" title="方案二：Collection版本化的直接链接" translate="no">​</a></h3>
<p>Do not reuse a Collection.</p>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">knowledge_v1</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">knowledge_v2</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">knowledge_v3</span><br></div></code></pre></div></div>
<p>When the Embedding model is upgraded:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">text-embedding-3-small</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">bge-large-zh</span><br></div></code></pre></div></div>
<p>Create a new Collection directly:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">knowledge_v4</span><br></div></code></pre></div></div>
<p>Then rebuild the index.</p>
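<p>The naming convention can be sketched as a small helper. <code>CollectionVersions</code> is a hypothetical class, and the <code>_v&lt;N&gt;</code> suffix is only assumed from the example names above:</p>

```java
// Hypothetical naming helper; the "_v" + number suffix mirrors the names above.
final class CollectionVersions {

    private CollectionVersions() {}

    /** Returns the next versioned name, e.g. knowledge_v3 becomes knowledge_v4. */
    static String next(String current) {
        int idx = current.lastIndexOf("_v");
        if (idx < 0) {
            throw new IllegalArgumentException("Not a versioned name: " + current);
        }
        // parseInt rejects a non-numeric suffix with NumberFormatException
        int version = Integer.parseInt(current.substring(idx + 2));
        return current.substring(0, idx + 2) + (version + 1);
    }

    public static void main(String[] args) {
        System.out.println(next("knowledge_v3"));
    }
}
```

<p>The old Collection stays untouched until the new one is fully indexed, so switching back is a configuration change rather than a data migration.</p>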
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="方案三记录embedding模型">Option 3: Record the Embedding Model<a href="https://jameszbl.github.io/qdrant-no-data#%E6%96%B9%E6%A1%88%E4%B8%89%E8%AE%B0%E5%BD%95embedding%E6%A8%A1%E5%9E%8B" class="hash-link" aria-label="方案三：记录Embedding模型的直接链接" title="方案三：记录Embedding模型的直接链接" translate="no">​</a></h3>
<p>Add a field to the Payload:</p>
<div class="language-json codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-json codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token property" style="color:#36acaa">"model"</span><span class="token operator" style="color:#393A34">:</span><span class="token string" style="color:#e3116c">"bge-large-zh"</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><br></div></code></pre></div></div>
<p>When retrieval misbehaves, this lets you quickly pinpoint:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">exactly which batch of data an Embedding came from</span><br></div></code></pre></div></div>
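<p>As a rough illustration of how that field helps, the sketch below scans point payloads and reports which models appear. <code>PayloadAudit</code> is a hypothetical helper, and the payloads are assumed to have already been fetched as plain maps:</p>

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical audit helper operating on payloads already loaded as maps.
final class PayloadAudit {

    private PayloadAudit() {}

    /** Collects the distinct "model" values found across point payloads. */
    static Set<String> modelsIn(List<Map<String, Object>> payloads) {
        Set<String> models = new TreeSet<>();
        for (Map<String, Object> payload : payloads) {
            Object model = payload.get("model");
            // Points written before the field existed are reported explicitly
            models.add(model == null ? "(missing)" : model.toString());
        }
        return models;
    }
}
```

<p>More than one value in the result means the Collection holds vectors from more than one model.</p>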
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="后续引出的qdrant原理">Qdrant Principles This Led To<a href="https://jameszbl.github.io/qdrant-no-data#%E5%90%8E%E7%BB%AD%E5%BC%95%E5%87%BA%E7%9A%84qdrant%E5%8E%9F%E7%90%86" class="hash-link" aria-label="后续引出的Qdrant原理的直接链接" title="后续引出的Qdrant原理的直接链接" translate="no">​</a></h2>
<p>After fixing this issue, I also came to understand a few points I had overlooked before:</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="collection为什么不能随便改维度">Why a Collection's Dimension Cannot Be Changed Casually<a href="https://jameszbl.github.io/qdrant-no-data#collection%E4%B8%BA%E4%BB%80%E4%B9%88%E4%B8%8D%E8%83%BD%E9%9A%8F%E4%BE%BF%E6%94%B9%E7%BB%B4%E5%BA%A6" class="hash-link" aria-label="Collection为什么不能随便改维度的直接链接" title="Collection为什么不能随便改维度的直接链接" translate="no">​</a></h3>
<p>Because the HNSW index is built over vectors of a fixed dimension.</p>
<p>Changing the dimension means:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">the entire index becomes invalid</span><br></div></code></pre></div></div>
<p>It must be rebuilt.</p>
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="为什么embedding模型不能随便切换">Why the Embedding Model Cannot Be Switched Casually<a href="https://jameszbl.github.io/qdrant-no-data#%E4%B8%BA%E4%BB%80%E4%B9%88embedding%E6%A8%A1%E5%9E%8B%E4%B8%8D%E8%83%BD%E9%9A%8F%E4%BE%BF%E5%88%87%E6%8D%A2" class="hash-link" aria-label="为什么Embedding模型不能随便切换的直接链接" title="为什么Embedding模型不能随便切换的直接链接" translate="no">​</a></h3>
<p>Because different models produce different vector spaces:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">OpenAI</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">1536 dimensions</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">BGE</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">1024 dimensions</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Jina</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">768 dimensions</span><br></div></code></pre></div></div>
<p>Even when the dimensions match:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">the semantic spaces still differ</span><br></div></code></pre></div></div>
<p>Storing old and new data together causes recall quality to drop sharply.</p>
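<p>A deliberately simplified toy can make this concrete. Assume two hypothetical encoders whose 2-dimensional spaces differ only by a 90-degree rotation. Real models differ in far more complex ways, but the effect on cross-model similarity is comparable. <code>VectorSpaces</code> and <code>rotate90</code> are illustrative names only:</p>

```java
// Toy illustration: the same text encoded by two hypothetical models whose
// spaces differ by a rotation. Real embedding models are not related this simply.
final class VectorSpaces {

    private VectorSpaces() {}

    /** Cosine similarity of two vectors of equal length. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Stand-in for "the other model": rotates a 2-d vector by 90 degrees. */
    static double[] rotate90(double[] v) {
        return new double[] {-v[1], v[0]};
    }

    public static void main(String[] args) {
        double[] doc = {3.0, 4.0};           // document vector from the old model
        double[] sameModelQuery = {3.0, 4.0}; // same text, same model
        double[] newModelQuery = rotate90(doc); // same text, different model
        System.out.println("same model:  " + cosine(doc, sameModelQuery));
        System.out.println("cross model: " + cosine(doc, newModelQuery));
    }
}
```

<p>Both vectors have the same dimension and encode the same text, yet the cross-model similarity carries no signal, which is why a query vector from a new model cannot be matched against documents embedded by an old one.</p>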
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="为什么rag要做全量重建">Why RAG Needs a Full Rebuild<a href="https://jameszbl.github.io/qdrant-no-data#%E4%B8%BA%E4%BB%80%E4%B9%88rag%E8%A6%81%E5%81%9A%E5%85%A8%E9%87%8F%E9%87%8D%E5%BB%BA" class="hash-link" aria-label="为什么RAG要做全量重建的直接链接" title="为什么RAG要做全量重建的直接链接" translate="no">​</a></h3>
<p>Many teams upgrade their Embedding model and then find:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">the retrieval hit rate drops</span><br></div></code></pre></div></div>
<p>The root cause is usually not Qdrant.</p>
<p>It is that:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">old Embeddings</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">+</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">new Embeddings</span><br></div></code></pre></div></div>
<p>are mixed in the same Collection.</p>
<hr>
<p>This case is very common in real projects and quite deceptive, because:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">data can be written</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">the Collection exists</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">there is no obvious error</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">yet queries always come back empty</span><br></div></code></pre></div></div>
<p>In the end, the cause was neither Spring nor LangChain4j, but Qdrant's most fundamental design principle: <strong>a Collection can correspond to only one Embedding model, with a fixed dimension and a fixed semantic space.</strong></p>]]></content:encoded>
            <category>ai</category>
            <category>java</category>
            <category>qdrant</category>
        </item>
        <item>
            <title><![CDATA[What Exactly Is an Agent]]></title>
            <link>https://jameszbl.github.io/what-is-agent</link>
            <guid>https://jameszbl.github.io/what-is-agent</guid>
            <pubDate>Mon, 01 Dec 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[{/ truncate /}]]></description>
            <content:encoded><![CDATA[
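<p>As large models grow more capable, more and more companies are trying to build their own AI Agents. In real business scenarios, however, many teams quickly discover:</p>
<blockquote>
<p>One large model plus one Agent falls far short of supporting complex enterprise processes.</p>
</blockquote>
<p>What actually makes it into production is usually not a single Agent but an Agent Workflow that can be orchestrated, governed, and audited.</p>
<p>This article introduces the design ideas behind enterprise-grade Agent workflows from an engineering perspective.</p>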
<hr>
<h1>1. Why a Single Agent Is Not Enough</h1>
<p>Many beginners picture an Agent like this:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">User question</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    LLM</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Call a tool</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Return the result</span><br></div></code></pre></div></div>
<p>For example:</p>
<p>User:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Check my order status for me</span><br></div></code></pre></div></div>
<p>Agent:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Call the order system API</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Get the result</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Generate a reply</span><br></div></code></pre></div></div>
<p>This pattern suits simple tasks.</p>
<p>But real enterprise scenarios are usually more complex.</p>
<p>Take customer service as an example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">User complains about product quality</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Detect sentiment</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Identify intent</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Look up the order</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Query the product knowledge base</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Generate a resolution plan</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Human review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Send the reply</span><br></div></code></pre></div></div>
<p>This is where the problems appear:</p>
<ul>
<li class="">Multi-step dependencies</li>
<li class="">Calls to multiple systems</li>
<li class="">Multiple models working together</li>
<li class="">Human intervention is required</li>
<li class="">Retries on failure are required</li>
<li class="">Audit logs are required</li>
</ul>
<p>If everything is left to a single Agent to decide autonomously:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">The Agent thinks on its own</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">The Agent decides on its own</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">The Agent calls tools on its own</span><br></div></code></pre></div></div>
<p>the result is a system that is:</p>
<ul>
<li class="">Unstable</li>
<li class="">Unpredictable</li>
<li class="">Hard to debug</li>
<li class="">Hard to supervise</li>
</ul>
<p>So enterprises tend to prefer:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Agent capabilities</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">+</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Workflow orchestration</span><br></div></code></pre></div></div>
<p>rather than a fully autonomous Agent.</p>
<hr>
<h1>2. Workflow vs. Agent</h1>
<p>Workflow and Agent are easy to confuse.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="agent">Agent<a href="https://jameszbl.github.io/what-is-agent#agent" class="hash-link" aria-label="Agent的直接链接" title="Agent的直接链接" translate="no">​</a></h2>
<p>Core characteristic:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Autonomous decision-making</span><br></div></code></pre></div></div>
<p>Execution process:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Thought</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Action</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Observation</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Thought</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Action</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Observation</span><br></div></code></pre></div></div>
<p>Every step an Agent takes is decided by the model.</p>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">User:</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Analyze this customer for me</span><br></div></code></pre></div></div>
<p>The Agent might:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Query the CRM</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Look up orders</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Look up chat history</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Generate a report</span><br></div></code></pre></div></div>
<p>The whole path is nondeterministic: the model may pick different tools, in a different order, on each run.</p>
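<p>The Thought, Action, Observation cycle can be sketched as a plain loop. This is only an illustration under stated assumptions: <code>AgentLoop</code> is a hypothetical class, the <code>policy</code> function stands in for the model, and the tools are simple functions rather than real system calls:</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of an agent loop; the policy plays the role of the model.
final class AgentLoop {

    private AgentLoop() {}

    /**
     * Repeats Thought -> Action -> Observation. The policy sees the
     * observations so far and names the next tool, or returns "finish".
     */
    static List<String> run(Function<List<String>, String> policy,
                            Map<String, Function<String, String>> tools,
                            String input,
                            int maxSteps) {
        List<String> observations = new ArrayList<>();
        for (int step = 0; step < maxSteps; step++) {
            String action = policy.apply(observations); // Thought + Action
            if (action.equals("finish")) {
                break;
            }
            Function<String, String> tool = tools.get(action);
            if (tool == null) {
                throw new IllegalStateException("Unknown tool: " + action);
            }
            observations.add(action + ": " + tool.apply(input)); // Observation
        }
        return observations;
    }
}
```

<p>Nothing in the loop fixes the order of tool calls. The path exists only as the sequence of choices the policy makes at run time, which is exactly what makes it hard to predict and audit.</p>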
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="workflow">Workflow<a href="https://jameszbl.github.io/what-is-agent#workflow" class="hash-link" aria-label="Workflow的直接链接" title="Workflow的直接链接" translate="no">​</a></h2>
<p>Core characteristic:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">A deterministic flow</span><br></div></code></pre></div></div>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Sentiment analysis</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Intent recognition</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Knowledge base retrieval</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Generate reply</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Human review</span><br></div></code></pre></div></div>
<p>The path is defined in advance.</p>
<p>The model is only responsible for the capability of an individual node.</p>
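<p>By contrast, a predefined path can be sketched as an ordered set of named steps. <code>FixedWorkflow</code> is a hypothetical class, and each step is a plain function standing in for a model call or a system call:</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical sketch: the order of steps is fixed by the caller, not by a model.
final class FixedWorkflow {

    private FixedWorkflow() {}

    /**
     * Runs every step in the map's iteration order and records a trace.
     * Pass a LinkedHashMap so the declared order is preserved.
     */
    static List<String> run(Map<String, UnaryOperator<String>> steps, String input) {
        List<String> trace = new ArrayList<>();
        String value = input;
        for (Map.Entry<String, UnaryOperator<String>> step : steps.entrySet()) {
            value = step.getValue().apply(value);
            trace.add(step.getKey() + " -> " + value); // one audit record per node
        }
        return trace;
    }
}
```

<p>Because the sequence is data rather than a model decision, the same input always walks the same nodes, and the trace doubles as a simple audit log.</p>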
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="企业为什么偏爱workflow">Why Enterprises Prefer Workflow<a href="https://jameszbl.github.io/what-is-agent#%E4%BC%81%E4%B8%9A%E4%B8%BA%E4%BB%80%E4%B9%88%E5%81%8F%E7%88%B1workflow" class="hash-link" aria-label="企业为什么偏爱Workflow的直接链接" title="企业为什么偏爱Workflow的直接链接" translate="no">​</a></h2>
<p>Because a Workflow is:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Observable</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Debuggable</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Replayable</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Auditable</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Governable</span><br></div></code></pre></div></div>
<p>In production, stability matters far more than how intelligent the system is.</p>
<p>So many enterprise Agents are, in essence:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Workflow + Agent</span><br></div></code></pre></div></div>
<p>rather than a pure Agent.</p>
<hr>
<h1>3. The Ideas Behind LangGraph</h1>
<p>One of the most popular Agent Workflow frameworks today is:</p>
<p>LangGraph</p>
<p>Its core idea is very simple:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Graph</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">=</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Node + Edge</span><br></div></code></pre></div></div>
<p>That is:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">nodes + edges</span><br></div></code></pre></div></div>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">User input</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Intent recognition</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Knowledge base retrieval</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Answer generation</span><br></div></code></pre></div></div>
<p>which corresponds to:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Node A</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Node B</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Node C</span><br></div></code></pre></div></div>
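<p>To make the Node + Edge idea concrete, here is a minimal Java sketch. It is not LangGraph's API (LangGraph is a Python and JavaScript library). <code>MiniGraph</code> and its methods are hypothetical, and the sketch assumes a simple chain with at most one outgoing edge per node and no cycles:</p>

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical miniature graph; not LangGraph's API.
final class MiniGraph {

    private final Map<String, UnaryOperator<String>> nodes = new HashMap<>();
    private final Map<String, String> edges = new HashMap<>();

    MiniGraph addNode(String name, UnaryOperator<String> action) {
        nodes.put(name, action);
        return this;
    }

    MiniGraph addEdge(String from, String to) {
        edges.put(from, to);
        return this;
    }

    /** Executes nodes from start, following edges until a node has no outgoing edge. */
    String run(String start, String input) {
        String current = start;
        String value = input;
        while (current != null) {
            UnaryOperator<String> node = nodes.get(current);
            if (node == null) {
                throw new IllegalStateException("Unknown node: " + current);
            }
            value = node.apply(value);
            current = edges.get(current); // null ends the run
        }
        return value;
    }
}
```

<p>Separating what a node does from where control goes next is the design choice that lets routing be inspected, tested, and changed without touching the node logic.</p>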
<hr>
<p>LangGraph adds three more concepts on top of this:</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="状态">State<a href="https://jameszbl.github.io/what-is-agent#%E7%8A%B6%E6%80%81" class="hash-link" aria-label="状态的直接链接" title="状态的直接链接" translate="no">​</a></h3>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">State</span><br></div></code></pre></div></div>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="条件路由">Conditional Routing<a href="https://jameszbl.github.io/what-is-agent#%E6%9D%A1%E4%BB%B6%E8%B7%AF%E7%94%B1" class="hash-link" aria-label="条件路由的直接链接" title="条件路由的直接链接" translate="no">​</a></h3>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Conditional Edge</span><br></div></code></pre></div></div>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="循环">Loops<a href="https://jameszbl.github.io/what-is-agent#%E5%BE%AA%E7%8E%AF" class="hash-link" aria-label="循环的直接链接" title="循环的直接链接" translate="no">​</a></h3>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Loop</span><br></div></code></pre></div></div>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Generate answer</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Quality check</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Fails the check</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Regenerate</span><br></div></code></pre></div></div>
<p>This forms a closed loop.</p>
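<p>The generate, check, regenerate cycle amounts to a conditional edge that either exits or loops back. A minimal sketch, assuming a bounded number of attempts (the bound is an addition here, not something the flow above specifies) and hypothetical names throughout:</p>

```java
import java.util.function.IntFunction;
import java.util.function.Predicate;

// Hypothetical sketch of a conditional edge with a loop back to generation.
final class RetryLoop {

    private RetryLoop() {}

    /**
     * Generates drafts until one passes the quality check.
     * The generator receives the attempt number, starting at 1.
     */
    static String generateUntilAccepted(IntFunction<String> generator,
                                        Predicate<String> qualityCheck,
                                        int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String draft = generator.apply(attempt);
            if (qualityCheck.test(draft)) {
                return draft; // conditional edge: check passed, leave the loop
            }
            // check failed: fall through and regenerate
        }
        throw new IllegalStateException(
            "No acceptable answer after " + maxAttempts + " attempts");
    }
}
```

<p>Capping the attempts keeps a failing quality check from looping forever. What happens after the cap, for example escalating to human review, is a separate design decision.</p>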
<hr>
<h1>4. State Machine Design</h1>
<p>The most important design decision in an enterprise-grade Agent is actually not the Prompt.</p>
<p>It is:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">State</span><br></div></code></pre></div></div>
<p>that is, state management.</p>
<hr>
<p>For example, a customer service workflow:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">import java.util.List;</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">public class CustomerServiceState {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    private String userQuestion;</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    private String sentiment;</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    private String intent;</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    // Document is the retrieved-chunk type of your RAG library (import omitted)</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    private List&lt;Document&gt; documents;</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    private String draftReply;</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    private boolean needHumanReview;</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">}</span><br></div></code></pre></div></div>
<p>The entire flow revolves around State.</p>
<hr>
<p>Execution flow:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Node 1</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">updates sentiment</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Node 2</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">updates intent</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Node 3</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">updates documents</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Node 4</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">updates draftReply</span><br></div></code></pre></div></div>
<p>The end result is:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Shared context</span><br></div></code></pre></div></div>
<p>rather than nodes passing parameters to one another.</p>
<hr>
<p>Recommended principles:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Stateless nodes</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Centralized state management</span><br></div></code></pre></div></div>
<p>This makes the workflow:</p>
<ul>
<li class="">Easy to extend</li>
<li class="">Easy to test</li>
<li class="">Easy to replay</li>
</ul>
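<p>The "stateless nodes, centralized state" principle can be sketched roughly as follows. This is an illustration, not the original system's code: <code>ReviewState</code>, <code>StateWorkflow</code>, and the keyword checks that stand in for model calls are all hypothetical. The sketch also copies the state on each update instead of mutating it, a design choice assumed here so that every intermediate state can be kept for replay.</p>

```java
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical, trimmed-down state object; every update returns a new copy.
final class ReviewState {
    final String userQuestion;
    final String sentiment;
    final String intent;

    ReviewState(String userQuestion, String sentiment, String intent) {
        this.userQuestion = userQuestion;
        this.sentiment = sentiment;
        this.intent = intent;
    }

    ReviewState withSentiment(String value) {
        return new ReviewState(userQuestion, value, intent);
    }

    ReviewState withIntent(String value) {
        return new ReviewState(userQuestion, sentiment, value);
    }
}

final class StateWorkflow {
    // Nodes hold no fields of their own: each one maps a state to an updated state.
    // The keyword checks are placeholders for real model calls.
    static final UnaryOperator<ReviewState> SENTIMENT_NODE =
        s -> s.withSentiment(s.userQuestion.contains("not") ? "negative" : "positive");

    static final UnaryOperator<ReviewState> INTENT_NODE =
        s -> s.withIntent(s.userQuestion.contains("delivery") ? "logistics" : "product");

    // Runs the nodes in order, threading the shared state through them.
    static ReviewState run(ReviewState initial, List<UnaryOperator<ReviewState>> nodes) {
        ReviewState current = initial;
        for (UnaryOperator<ReviewState> node : nodes) {
            current = node.apply(current);
        }
        return current;
    }
}
```

<p>Because each node only maps one state to the next, a node can be tested on its own with a hand-built state, and a run can be replayed by re-applying the same nodes to a saved state.</p>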
<hr>
<h1>5. Conditional Routing</h1>
<p>Enterprise workflows inevitably branch.</p>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Positive review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Automatic thank-you</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Negative review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Enter after-sales handling</span><br></div></code></pre></div></div>
<hr>
<p>Routing node:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">// routeTo(...) is a placeholder for the engine's transition call; goto is a reserved word in Java.</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">if ("positive".equals(sentiment)) { // constant-first equals is null-safe</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    routeTo(POSITIVE_FLOW);</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">} else {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    routeTo(NEGATIVE_FLOW);</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">}</span><br></div></code></pre></div></div>
<p>Flow diagram:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">    Sentiment analysis</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">             ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">      ┌──────┴──────┐</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">      ↓             ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  Positive      Negative</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">      ↓             ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> Auto reply    After-sales</span><br></div></code></pre></div></div>
<hr>
<p>Another example:</p>
<p>Intent recognition:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Logistics issue</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Product issue</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Refund issue</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Invoice issue</span><br></div></code></pre></div></div>
<p>Each intent is routed to its own Agent:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Logistics Agent</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Product Agent</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Refund Agent</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Finance Agent</span><br></div></code></pre></div></div>
<p>This forms a multi-Agent collaboration architecture.</p>
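<p>One way to express intent-based routing is a lookup table from intent label to handler. The sketch below is only an illustration: the class name <code>IntentRouter</code>, the intent labels, and the string-returning handlers are hypothetical stand-ins for real Agents, and the fallback to a human for unknown intents is an assumption rather than something the original design specifies.</p>

```java
import java.util.Map;
import java.util.function.Function;

final class IntentRouter {
    // Hypothetical intent labels mapped to placeholder agents.
    private static final Map<String, Function<String, String>> AGENTS = Map.of(
        "logistics", q -> "LogisticsAgent handles: " + q,
        "product",   q -> "ProductAgent handles: " + q,
        "refund",    q -> "RefundAgent handles: " + q,
        "invoice",   q -> "FinanceAgent handles: " + q
    );

    static String route(String intent, String question) {
        // Map.of(...) rejects null keys on lookup, so guard before calling get().
        Function<String, String> agent = (intent == null) ? null : AGENTS.get(intent);
        if (agent == null) {
            // Assumption: unrecognized intents go to a person instead of a guessed agent.
            return "Human handles: " + question;
        }
        return agent.apply(question);
    }
}
```

<p>Adding a new intent then means adding one entry to the table; the routing logic itself does not change.</p>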
<hr>
<h1>6. Human Review Node</h1>
<p>This is a critical part of putting an Agent into production in an enterprise.</p>
<p>Many teams fail to launch an Agent for one reason:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">No Human-in-the-Loop</span><br></div></code></pre></div></div>
<p>That is:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">The ability for a human to step in</span><br></div></code></pre></div></div>
<hr>
<p>Typical design:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Generate reply</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Risk detection</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Human review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Send to user</span><br></div></code></pre></div></div>
<hr>
<p>High-risk scenarios:</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="金融">Finance<a href="https://jameszbl.github.io/what-is-agent#%E9%87%91%E8%9E%8D" class="hash-link" aria-label="金融的直接链接" title="金融的直接链接" translate="no">​</a></h3>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Loan approval</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Investment advice</span><br></div></code></pre></div></div>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="医疗">Healthcare<a href="https://jameszbl.github.io/what-is-agent#%E5%8C%BB%E7%96%97" class="hash-link" aria-label="医疗的直接链接" title="医疗的直接链接" translate="no">​</a></h3>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Diagnostic advice</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Medication advice</span><br></div></code></pre></div></div>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="客服">Customer service<a href="https://jameszbl.github.io/what-is-agent#%E5%AE%A2%E6%9C%8D" class="hash-link" aria-label="客服的直接链接" title="客服的直接链接" translate="no">​</a></h3>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Compensation amount</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Refund amount</span><br></div></code></pre></div></div>
<hr>
<p>Review node design:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">public class ReviewNode {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    // ReviewResult is assumed to be an enum with WAIT_HUMAN and PASS.</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    public ReviewResult review(State state) {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        // High-risk replies pause the workflow until a human decides.</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        if (state.isHighRisk()) {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            return ReviewResult.WAIT_HUMAN;</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        }</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        return ReviewResult.PASS;</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    }</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">}</span><br></div></code></pre></div></div>
<hr>
<p>Enterprises usually also give the reviewer three operations:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Approve</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Reject</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Approve with edits</span><br></div></code></pre></div></div>
<p>Together they form a:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Human Approval Workflow</span><br></div></code></pre></div></div>
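<p>The three reviewer operations can be modeled as a small decision type. The following is a hedged sketch, not the original system's code: <code>ApprovalWorkflow</code>, <code>Decision</code>, and <code>resolve</code> are hypothetical names, and returning an empty <code>Optional</code> for a rejected draft is an assumed convention.</p>

```java
import java.util.Optional;

final class ApprovalWorkflow {
    // The three reviewer operations: approve, reject, approve with edits.
    enum Decision { APPROVE, REJECT, EDIT_AND_APPROVE }

    // Returns the text to send to the user, or empty when nothing should be sent.
    static Optional<String> resolve(String draftReply, Decision decision, String editedReply) {
        switch (decision) {
            case APPROVE:
                return Optional.of(draftReply);
            case EDIT_AND_APPROVE:
                if (editedReply == null || editedReply.trim().isEmpty()) {
                    throw new IllegalArgumentException("an edited reply is required");
                }
                // The reviewer's text replaces the model's draft.
                return Optional.of(editedReply);
            case REJECT:
                return Optional.empty();
            default:
                throw new IllegalStateException("unknown decision: " + decision);
        }
    }
}
```

<p>Keeping the decision separate from the send step makes it straightforward to log who decided what before anything reaches the user.</p>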
<hr>
<h1>7. Enterprise Customer Service Agent Case Study</h1>
<p>The following uses an e-commerce customer service Agent as an example.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="整体架构">Overall architecture<a href="https://jameszbl.github.io/what-is-agent#%E6%95%B4%E4%BD%93%E6%9E%B6%E6%9E%84" class="hash-link" aria-label="整体架构的直接链接" title="整体架构的直接链接" translate="no">​</a></h2>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">User review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Sentiment analysis</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Intent recognition</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Conditional routing</span><br></div></code></pre></div></div>
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="好评流程">Positive review flow<a href="https://jameszbl.github.io/what-is-agent#%E5%A5%BD%E8%AF%84%E6%B5%81%E7%A8%8B" class="hash-link" aria-label="好评流程的直接链接" title="好评流程的直接链接" translate="no">​</a></h3>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Positive review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Product knowledge base</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Generate thank-you reply</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Send automatically</span><br></div></code></pre></div></div>
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="差评流程">Negative review flow<a href="https://jameszbl.github.io/what-is-agent#%E5%B7%AE%E8%AF%84%E6%B5%81%E7%A8%8B" class="hash-link" aria-label="差评流程的直接链接" title="差评流程的直接链接" translate="no">​</a></h3>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Negative review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Intent recognition</span><br></div></code></pre></div></div>
<p>Intents that may be recognized:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Fruit is not sweet</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Shipping is too slow</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Packaging is damaged</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">After-sales issue</span><br></div></code></pre></div></div>
<hr>
<p>The review is then routed to the corresponding Agent:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Product Agent</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Logistics Agent</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">After-sales Agent</span><br></div></code></pre></div></div>
<hr>
<p>Each Agent runs:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">RAG retrieval</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Knowledge augmentation</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Prompt assembly</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Generate reply</span><br></div></code></pre></div></div>
<hr>
<p>The reply then goes through:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Risk detection</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Human review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Send to user</span><br></div></code></pre></div></div>
<hr>
<p>Overall flow diagram:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">               User review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">           Sentiment analysis</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">          ┌─────────┴─────────┐</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">          ↓                   ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">   Positive review     Negative review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">          ↓                   ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">   Auto thank-you    Intent recognition</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                              ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            ┌─────────────────┼─────────────────┐</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            ↓                 ↓                 ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">      Product Agent    Logistics Agent  After-sales Agent</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            └─────────────────┼─────────────────┘</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                              ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                        RAG retrieval</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                              ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                      Reply generation</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                              ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                       Risk detection</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                              ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                        Human review</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                              ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                       Reply to user</span><br></div></code></pre></div></div>
<hr>
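<h1>Summary</h1>
<p>The core of an enterprise-grade Agent is not giving the model unlimited autonomy, but embedding large-model capabilities into governable business processes.</p>
<p>A mature enterprise Agent system typically includes:</p>
<ul>
<li class="">Workflow orchestration</li>
<li class="">State machine management</li>
<li class="">Conditional routing</li>
<li class="">Multi-Agent collaboration</li>
<li class="">RAG knowledge augmentation</li>
<li class="">Human-in-the-Loop</li>
<li class="">End-to-end logging and auditing</li>
</ul>
<p>Only when Agents are combined with workflows does AI become truly capable of entering an enterprise's core business systems.</p>]]></content:encoded>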
            <category>ai</category>
            <category>java</category>
            <category>rag</category>
        </item>
        <item>
            <title><![CDATA[Enterprise RAG Optimization in Practice: The Engineering Evolution from Usable to Good]]></title>
            <link>https://jameszbl.github.io/make-better-rag</link>
            <guid>https://jameszbl.github.io/make-better-rag</guid>
            <pubDate>Fri, 21 Nov 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[]]></description>
            <content:encoded><![CDATA[
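<p>As the technology has matured, RAG (Retrieval-Augmented Generation) has become close to a standard architecture for enterprise AI applications.</p>
<p>After going live, however, many teams find that:</p>
<ul>
<li class="">Retrieval results are inaccurate</li>
<li class="">The answer is clearly in the knowledge base but cannot be found</li>
<li class="">Overlong context drives up cost</li>
<li class="">User feedback is inconsistent: sometimes good, sometimes bad</li>
</ul>
<p>In practice, the hard part of enterprise RAG is not building a demo, but continuously improving recall and answer accuracy.</p>
<p>Drawing on real project experience, this article walks through common optimization approaches for enterprise RAG systems and the engineering thinking behind them.</p>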
<hr>
<h1>1 How to Choose the Chunk Size</h1>
<p>Chunking (splitting text into slices) is one of the most critical factors affecting RAG quality.</p>
<p>Many beginners simply use:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">One chunk every 500 characters</span><br></div></code></pre></div></div>
<p>This approach is simple, but the results are often poor.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="chunk过小的问题">Problems with chunks that are too small<a href="https://jameszbl.github.io/make-better-rag#chunk%E8%BF%87%E5%B0%8F%E7%9A%84%E9%97%AE%E9%A2%98" class="hash-link" aria-label="Chunk过小的问题的直接链接" title="Chunk过小的问题的直接链接" translate="no">​</a></h2>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">The order has shipped.</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Tracking number:</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">SF123456789</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Estimated delivery in 3 days.</span><br></div></code></pre></div></div>
<p>If the chunk length is too small:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Chunk1:</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">The order has shipped</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Chunk2:</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Tracking number: SF123456789</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Chunk3:</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Estimated delivery in 3 days</span><br></div></code></pre></div></div>
<p>The user asks:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">When will my order arrive?</span><br></div></code></pre></div></div>
<p>Retrieval may return only Chunk3.</p>
<p>The model then lacks the surrounding context.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="chunk过大的问题">Problems with chunks that are too large<a href="https://jameszbl.github.io/make-better-rag#chunk%E8%BF%87%E5%A4%A7%E7%9A%84%E9%97%AE%E9%A2%98" class="hash-link" aria-label="Chunk过大的问题的直接链接" title="Chunk过大的问题的直接链接" translate="no">​</a></h2>
<p>If a single chunk contains:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Product description</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Shipping policy</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">After-sales policy</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Coupon rules</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Membership benefits</span><br></div></code></pre></div></div>
<p>the embedding becomes unfocused.</p>
<p>The user asks:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">How do I return an item?</span><br></div></code></pre></div></div>
<p>The recalled results may contain a lot of irrelevant information.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="企业常用策略">Common enterprise strategies<a href="https://jameszbl.github.io/make-better-rag#%E4%BC%81%E4%B8%9A%E5%B8%B8%E7%94%A8%E7%AD%96%E7%95%A5" class="hash-link" aria-label="企业常用策略的直接链接" title="企业常用策略的直接链接" translate="no">​</a></h2>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="fixed-size">Fixed Size<a href="https://jameszbl.github.io/make-better-rag#fixed-size" class="hash-link" aria-label="Fixed Size的直接链接" title="Fixed Size的直接链接" translate="no">​</a></h3>
<p>Fixed-length splitting</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Chunk Size = 500~1000 Tokens</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Overlap = 100~200 Tokens</span><br></div></code></pre></div></div>
<p>Suitable for:</p>
<ul>
<li class="">FAQ</li>
<li class="">Product manuals</li>
<li class="">Operations documents</li>
</ul>
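<p>As a rough sketch of fixed-size splitting with overlap: the code below takes an already tokenized document, so any tokenizer can be plugged in. <code>FixedSizeChunker</code> and its parameters are hypothetical names, and the sizes passed in are up to the caller (the 500~1000 and 100~200 token ranges above are the article's guidance, not something the code enforces).</p>

```java
import java.util.ArrayList;
import java.util.List;

final class FixedSizeChunker {
    // Splits tokens into chunks of at most chunkSize, each sharing `overlap` tokens with the previous one.
    static List<List<String>> chunk(List<String> tokens, int chunkSize, int overlap) {
        if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException("require chunkSize > 0 and 0 <= overlap < chunkSize");
        }
        List<List<String>> chunks = new ArrayList<>();
        int step = chunkSize - overlap; // how far the start moves between chunks
        for (int start = 0; start < tokens.size(); start += step) {
            int end = Math.min(start + chunkSize, tokens.size());
            chunks.add(new ArrayList<>(tokens.subList(start, end)));
            if (end == tokens.size()) {
                break; // the last chunk already reaches the end of the document
            }
        }
        return chunks;
    }
}
```

<p>With 10 tokens, a chunk size of 4, and an overlap of 1, this yields three chunks starting at tokens 0, 3, and 6.</p>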
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="sliding-window">Sliding Window<a href="https://jameszbl.github.io/make-better-rag#sliding-window" class="hash-link" aria-label="Sliding Window的直接链接" title="Sliding Window的直接链接" translate="no">​</a></h3>
<p>Sliding-window splitting</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Chunk1 0~500</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Chunk2 400~900</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Chunk3 800~1300</span><br></div></code></pre></div></div>
<p>The overlap preserves context across chunk boundaries.</p>
<p>Suitable for:</p>
<ul>
<li class="">Long documents</li>
<li class="">Technical manuals</li>
<li class="">Legal contracts</li>
</ul>
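<p>A minimal sketch of how the window offsets above can be computed, assuming positions are counted in tokens. The class and method names are illustrative and do not come from the original project:</p>

```java
import java.util.ArrayList;
import java.util.List;

/** Sliding-window chunking over token offsets (illustrative sketch). */
final class SlidingWindowChunker {

    /** Returns [start, end) offset pairs; consecutive windows share `overlap` tokens. */
    static List<int[]> windows(int totalTokens, int chunkSize, int overlap) {
        if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException("require 0 <= overlap < chunkSize");
        }
        List<int[]> result = new ArrayList<>();
        int step = chunkSize - overlap; // how far each window advances
        for (int start = 0; start < totalTokens; start += step) {
            int end = Math.min(start + chunkSize, totalTokens);
            result.add(new int[] {start, end});
            if (end == totalTokens) {
                break; // the last window reached the end of the document
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // 1300 tokens, 500-token windows, 100-token overlap
        for (int[] w : windows(1300, 500, 100)) {
            System.out.println(w[0] + "~" + w[1]);
        }
    }
}
```

<p>With a 500-token window and a 100-token overlap over 1300 tokens, this yields the same three ranges as the diagram: 0~500, 400~900 and 800~1300. Setting the overlap to 0 gives plain fixed-size chunking.</p>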
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="semantic-chunk">Semantic Chunk<a href="https://jameszbl.github.io/make-better-rag#semantic-chunk" class="hash-link" aria-label="Direct link to Semantic Chunk" title="Direct link to Semantic Chunk" translate="no">​</a></h3>
<p>Split on semantic boundaries.</p>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain"># After-sales policy</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">...</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">...</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"># Logistics rules</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">...</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">...</span><br></div></code></pre></div></div>
<p>Here the text is split at each heading, so every chunk covers a single topic.</p>
<p>Advantages:</p>
<ul>
<li class="">Each chunk is semantically complete</li>
<li class="">Higher-quality embeddings</li>
<li class="">More accurate retrieval</li>
</ul>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="实际经验">Practical experience<a href="https://jameszbl.github.io/make-better-rag#%E5%AE%9E%E9%99%85%E7%BB%8F%E9%AA%8C" class="hash-link" aria-label="Direct link to Practical experience" title="Direct link to Practical experience" translate="no">​</a></h2>
<p>Enterprise projects usually combine:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Heading-based splitting</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">+</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Sliding window</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">+</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Maximum token limit</span><br></div></code></pre></div></div>
<p>This combination usually works better than fixed-length splitting alone.</p>
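<p>One way the combination could look, as a rough sketch. It assumes Markdown-style headings that start with <code>#</code> and uses whitespace-separated words as a stand-in for tokens; a real pipeline would count tokens with the embedding model's tokenizer. <code>HeadingAwareChunker</code> and its methods are hypothetical names:</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Heading split first, then a sliding window for sections over the token limit. */
final class HeadingAwareChunker {

    /** Splits the document into sections, one per line that starts with "# ". */
    static List<String> splitByHeading(String doc) {
        List<String> sections = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String line : doc.split("\n")) {
            if (line.startsWith("# ")) {
                flush(current, sections); // a new heading closes the previous section
            }
            current.append(line).append('\n');
        }
        flush(current, sections);
        return sections;
    }

    private static void flush(StringBuilder current, List<String> sections) {
        String text = current.toString().trim();
        if (!text.isEmpty()) {
            sections.add(text);
        }
        current.setLength(0);
    }

    /** Keeps small sections whole; windows larger ones with the given overlap. */
    static List<String> chunk(String doc, int maxTokens, int overlap) {
        if (maxTokens <= 0 || overlap < 0 || overlap >= maxTokens) {
            throw new IllegalArgumentException("require 0 <= overlap < maxTokens");
        }
        List<String> chunks = new ArrayList<>();
        int step = maxTokens - overlap;
        for (String section : splitByHeading(doc)) {
            String[] tokens = section.split("\\s+"); // assumption: words approximate tokens
            if (tokens.length <= maxTokens) {
                chunks.add(section);
                continue;
            }
            for (int start = 0; start < tokens.length; start += step) {
                int end = Math.min(start + maxTokens, tokens.length);
                chunks.add(String.join(" ", Arrays.copyOfRange(tokens, start, end)));
                if (end == tokens.length) {
                    break;
                }
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        String doc = "# After-sales policy\nrefunds are issued within seven days\n"
                + "# Logistics rules\nships in 48 hours";
        chunk(doc, 6, 2).forEach(c -> System.out.println("[" + c + "]"));
    }
}
```

<p>Splitting by heading first keeps each chunk on one topic; the window and the token limit only come into play for sections that are too long to embed as a whole.</p>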
<hr>
<h1>2 Hybrid Search</h1>
<p>After going live, many teams find that:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Vector search alone</span><br></div></code></pre></div></div>
<p>does not perform well.</p>
<p>The reason:</p>
<p>Embeddings are good at capturing meaning,</p>
<p>but poor at exact matching.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="keyword-search">Keyword Search<a href="https://jameszbl.github.io/make-better-rag#keyword-search" class="hash-link" aria-label="Direct link to Keyword Search" title="Direct link to Keyword Search" translate="no">​</a></h2>
<p>Typical implementations:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">BM25</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Elasticsearch</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">OpenSearch</span><br></div></code></pre></div></div>
<p>For example,</p>
<p>a user searches for:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">SKU123456</span><br></div></code></pre></div></div>
<p>An embedding model will not necessarily understand a SKU,</p>
<p>but keyword search matches it exactly.</p>
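<p>To illustrate why an exact token such as a SKU is easy for keyword search, here is a small in-memory BM25 scorer. It is only a sketch: the <code>k1</code> and <code>b</code> values are common defaults rather than values from the original system, the whitespace tokenizer is a simplification, and in production the scoring is normally left to Elasticsearch or OpenSearch:</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

/** Minimal in-memory BM25 scorer (illustrative only). */
final class Bm25 {
    private static final double K1 = 1.2;  // term-frequency saturation, a common default
    private static final double B = 0.75;  // document-length normalisation, a common default

    private final List<List<String>> docs = new ArrayList<>();
    private final Map<String, Integer> docFreq = new HashMap<>();
    private final double avgLen;

    Bm25(List<String> rawDocs) {
        long totalTokens = 0;
        for (String raw : rawDocs) {
            List<String> tokens = tokenize(raw);
            docs.add(tokens);
            totalTokens += tokens.size();
            for (String term : new HashSet<>(tokens)) {
                docFreq.merge(term, 1, Integer::sum); // count each document once per term
            }
        }
        avgLen = rawDocs.isEmpty() ? 0.0 : (double) totalTokens / rawDocs.size();
    }

    /** Whitespace tokenizer; Chinese text would need a word segmenter instead. */
    static List<String> tokenize(String text) {
        return Arrays.asList(text.trim().toLowerCase().split("\\s+"));
    }

    /** BM25 score of one document for the query; 0 when no query term occurs in it. */
    double score(String query, int docIndex) {
        List<String> doc = docs.get(docIndex);
        double total = 0.0;
        for (String term : tokenize(query)) {
            int tf = Collections.frequency(doc, term);
            if (tf == 0) {
                continue;
            }
            int df = docFreq.get(term);
            double idf = Math.log(1.0 + (docs.size() - df + 0.5) / (df + 0.5));
            double lengthNorm = 1.0 - B + B * doc.size() / avgLen;
            total += idf * tf * (K1 + 1.0) / (tf + K1 * lengthNorm);
        }
        return total;
    }

    public static void main(String[] args) {
        Bm25 index = new Bm25(List.of(
                "refund policy for damaged fruit",
                "SKU123456 gift box apples 5kg",
                "delivery delay compensation rules"));
        for (int i = 0; i < 3; i++) {
            System.out.println(i + " -> " + index.score("SKU123456", i));
        }
    }
}
```

<p>Only the document that literally contains the SKU receives a non-zero score, which is the exact-match behaviour that embeddings lack.</p>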
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="vector-search">Vector Search<a href="https://jameszbl.github.io/make-better-rag#vector-search" class="hash-link" aria-label="Direct link to Vector Search" title="Direct link to Vector Search" translate="no">​</a></h2>
<p>Flow:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Query</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Embedding</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Vector search</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">TopK</span><br></div></code></pre></div></div>
<p>Suitable for queries such as:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">The delivery is too slow</span><br></div></code></pre></div></div>
<p>which retrieves:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Logistics delay</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Delivery overdue</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Shipping exception</span><br></div></code></pre></div></div>
<p>and other semantically similar content.</p>
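<p>The last two steps of the flow (vector search, then TopK) can be sketched as a brute-force cosine-similarity scan. The vectors are assumed to come from an embedding model that is outside this example, and a production system would use an approximate nearest-neighbour index rather than a full scan:</p>

```java
import java.util.ArrayList;
import java.util.List;

/** Brute-force cosine-similarity TopK (illustrative sketch). */
final class VectorSearch {

    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return (normA == 0.0 || normB == 0.0) ? 0.0 : dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the positions of the k stored vectors most similar to the query, best first. */
    static List<Integer> topK(double[] query, List<double[]> index, int k) {
        double[] scores = new double[index.size()];
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < index.size(); i++) {
            scores[i] = cosine(query, index.get(i));
            ids.add(i);
        }
        ids.sort((x, y) -> Double.compare(scores[y], scores[x])); // descending by similarity
        return new ArrayList<>(ids.subList(0, Math.min(Math.max(k, 0), ids.size())));
    }

    public static void main(String[] args) {
        // Toy 2-dimensional "embeddings"; real ones have hundreds of dimensions.
        List<double[]> index = List.of(
                new double[] {0.0, 1.0},
                new double[] {1.0, 0.1},
                new double[] {0.7, 0.7});
        System.out.println(topK(new double[] {1.0, 0.0}, index, 2));
    }
}
```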
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="hybrid-search">Hybrid Search<a href="https://jameszbl.github.io/make-better-rag#hybrid-search" class="hash-link" aria-label="Direct link to Hybrid Search" title="Direct link to Hybrid Search" translate="no">​</a></h2>
<p>Production systems usually use:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">BM25</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">+</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Vector Search</span><br></div></code></pre></div></div>
<p>The architecture:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">User question</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">      ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">BM25 retrieval</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">----------------</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Vector retrieval</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">      ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> Merge</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">      ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ReRank</span><br></div></code></pre></div></div>
<p>Advantages:</p>
<ul>
<li class="">Higher recall</li>
<li class="">Fewer missed documents</li>
<li class="">Better coverage of long-tail questions</li>
</ul>
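<p>The article does not say how the Merge step combines the two result lists. One widely used option is Reciprocal Rank Fusion (RRF), sketched below as an assumption rather than as the author's method. It needs only the rank positions, so BM25 scores and cosine similarities never have to be put on a common scale. The constant <code>k = 60</code> is the value commonly used for RRF:</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Reciprocal Rank Fusion: score(doc) = sum over lists of 1 / (k + rank), rank from 1. */
final class RrfMerger {

    static List<String> merge(List<List<String>> rankedLists, int k) {
        Map<String, Double> scores = new HashMap<>();
        for (List<String> list : rankedLists) {
            for (int i = 0; i < list.size(); i++) {
                scores.merge(list.get(i), 1.0 / (k + i + 1), Double::sum);
            }
        }
        List<String> merged = new ArrayList<>(scores.keySet());
        merged.sort((a, b) -> {
            int byScore = Double.compare(scores.get(b), scores.get(a));
            return byScore != 0 ? byScore : a.compareTo(b); // deterministic tie-break by id
        });
        return merged;
    }

    public static void main(String[] args) {
        List<String> bm25 = List.of("sku-doc", "faq-1", "faq-2");   // hypothetical document ids
        List<String> vector = List.of("faq-1", "faq-3", "sku-doc");
        System.out.println(merge(List.of(bm25, vector), 60));
    }
}
```

<p>Documents found by both retrievers rise to the top, while a document found by only one of them still stays in the candidate set for ReRank.</p>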
<hr>
<h1>3 How ReRank Works</h1>
<p>Many teams assume that</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">feeding the TopK retrieval results</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">directly to the LLM</span><br></div></code></pre></div></div>
<p>is enough.</p>
<p>In practice, this is a major cause of degraded answer quality.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="为什么需要rerank">Why ReRank is needed<a href="https://jameszbl.github.io/make-better-rag#%E4%B8%BA%E4%BB%80%E4%B9%88%E9%9C%80%E8%A6%81rerank" class="hash-link" aria-label="Direct link to Why ReRank is needed" title="Direct link to Why ReRank is needed" translate="no">​</a></h2>
<p>Suppose a user asks:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">What can I do if the fruit isn't sweet?</span><br></div></code></pre></div></div>
<p>Retrieval returns:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">A Introduction to the product's growing base</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">B Fruit sugar-content standard</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">C After-sales compensation rules</span><br></div></code></pre></div></div>
<p>Vector similarity:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">A 0.82</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">B 0.81</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">C 0.80</span><br></div></code></pre></div></div>
<p>The scores are almost identical.</p>
<p>The document that actually helps the user most is:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">C After-sales compensation rules</span><br></div></code></pre></div></div>
<p>yet vector similarity ranks it last.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="rerank流程">ReRank flow<a href="https://jameszbl.github.io/make-better-rag#rerank%E6%B5%81%E7%A8%8B" class="hash-link" aria-label="Direct link to ReRank flow" title="Direct link to ReRank flow" translate="no">​</a></h2>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Query</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> +</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> </span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Candidate Docs</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">      ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Cross Encoder</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">      ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Re-scoring</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">      ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">TopN</span><br></div></code></pre></div></div>
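<p>The flow above can be sketched as follows. The cross-encoder is represented by a plain function argument, because the actual model call (BGE-Reranker, Cohere Rerank, and so on) depends on the serving setup; the scorer used in <code>main</code> is a hard-coded placeholder, not a real model:</p>

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToDoubleBiFunction;

/** Re-scores each (query, candidate) pair and keeps the best topN (illustrative sketch). */
final class Reranker {

    static List<String> rerank(String query, List<String> candidates, int topN,
                               ToDoubleBiFunction<String, String> crossEncoder) {
        // Score every candidate exactly once; model calls are the expensive part.
        Map<String, Double> scores = new LinkedHashMap<>();
        for (String doc : candidates) {
            scores.put(doc, crossEncoder.applyAsDouble(query, doc));
        }
        List<String> sorted = new ArrayList<>(scores.keySet());
        sorted.sort((a, b) -> Double.compare(scores.get(b), scores.get(a))); // best first
        return new ArrayList<>(sorted.subList(0, Math.min(Math.max(topN, 0), sorted.size())));
    }

    public static void main(String[] args) {
        List<String> candidates = List.of(
                "A Introduction to the product's growing base",
                "B Fruit sugar-content standard",
                "C After-sales compensation rules");
        // Placeholder relevance scores standing in for a real cross-encoder.
        ToDoubleBiFunction<String, String> stub =
                (q, d) -> d.startsWith("C") ? 0.9 : d.startsWith("B") ? 0.6 : 0.1;
        System.out.println(rerank("What can I do if the fruit isn't sweet?", candidates, 2, stub));
    }
}
```

<p>Unlike the embedding model, which encodes the query and each document separately, a cross-encoder reads the query and the document together, which is why it can separate candidates whose vector similarities are nearly identical.</p>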
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="常见模型">Common models<a href="https://jameszbl.github.io/make-better-rag#%E5%B8%B8%E8%A7%81%E6%A8%A1%E5%9E%8B" class="hash-link" aria-label="Direct link to Common models" title="Direct link to Common models" translate="no">​</a></h2>
<p>Open source:</p>
<ul>
<li class="">BGE-Reranker</li>
<li class="">BCE-Reranker</li>
</ul>
<p>Commercial:</p>
<ul>
<li class="">Cohere Rerank</li>
</ul>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="效果提升">Improvement<a href="https://jameszbl.github.io/make-better-rag#%E6%95%88%E6%9E%9C%E6%8F%90%E5%8D%87" class="hash-link" aria-label="Direct link to Improvement" title="Direct link to Improvement" translate="no">​</a></h2>
<p>In real projects:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Vector retrieval only</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Accuracy:</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">about 70%</span><br></div></code></pre></div></div>
<p>With ReRank added:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">80%~90%</span><br></div></code></pre></div></div>
<p>This makes ReRank one of the optimizations with the best return on investment.</p>
<hr>
<h1>4 Query Rewrite</h1>
<p>User questions are often unsuitable for direct retrieval.</p>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Why hasn't this thing arrived yet?</span><br></div></code></pre></div></div>
<p>An embedding can hardly tell what</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">this thing</span><br></div></code></pre></div></div>
<p>refers to.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="query-rewrite流程">Query Rewrite flow<a href="https://jameszbl.github.io/make-better-rag#query-rewrite%E6%B5%81%E7%A8%8B" class="hash-link" aria-label="Direct link to Query Rewrite flow" title="Direct link to Query Rewrite flow" translate="no">​</a></h2>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Original question</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">LLM rewrite</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Retrieval</span><br></div></code></pre></div></div>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Why hasn't this thing arrived yet</span><br></div></code></pre></div></div>
<p>is rewritten as:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">What is the reason for the order's delivery delay</span><br></div></code></pre></div></div>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="多轮对话场景">Multi-turn conversations<a href="https://jameszbl.github.io/make-better-rag#%E5%A4%9A%E8%BD%AE%E5%AF%B9%E8%AF%9D%E5%9C%BA%E6%99%AF" class="hash-link" aria-label="Direct link to Multi-turn conversations" title="Direct link to Multi-turn conversations" translate="no">​</a></h2>
<p>User:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">How are the apples?</span><br></div></code></pre></div></div>
<p>Assistant:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">They are quite sweet.</span><br></div></code></pre></div></div>
<p>User:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">And what if they go bad?</span><br></div></code></pre></div></div>
<p>After rewriting:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">What to do if the apples are rotten on arrival</span><br></div></code></pre></div></div>
<p>Only the rewritten query retrieves the right documents.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="企业实现">Enterprise implementation<a href="https://jameszbl.github.io/make-better-rag#%E4%BC%81%E4%B8%9A%E5%AE%9E%E7%8E%B0" class="hash-link" aria-label="Direct link to Enterprise implementation" title="Direct link to Enterprise implementation" translate="no">​</a></h2>
<p>A lightweight model is usually added:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Rewrite Model</span><br></div></code></pre></div></div>
<p>It handles:</p>
<ul>
<li class="">Coreference resolution</li>
<li class="">Context completion</li>
<li class="">Synonym expansion</li>
</ul>
<p>This makes retrieval easier.</p>
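<p>A rough sketch of such a rewrite step is shown below. The prompt wording is an assumption, and the model is passed in as a plain <code>Function&lt;String, String&gt;</code> (prompt in, completion out) because the article does not name a specific model or client library:</p>

```java
import java.util.List;
import java.util.function.Function;

/** Rewrites a follow-up question into a standalone search query (illustrative sketch). */
final class QueryRewriter {
    private final Function<String, String> llm; // hypothetical model call: prompt -> completion

    QueryRewriter(Function<String, String> llm) {
        this.llm = llm;
    }

    /** Builds the rewrite prompt from earlier turns plus the latest question. */
    static String buildPrompt(List<String> history, String question) {
        StringBuilder sb = new StringBuilder();
        sb.append("Rewrite the final user question as a standalone search query.\n");
        sb.append("Resolve pronouns and fill in missing context from the conversation.\n\n");
        for (String turn : history) {
            sb.append(turn).append('\n');
        }
        sb.append("User: ").append(question).append('\n');
        sb.append("Standalone query:");
        return sb.toString();
    }

    String rewrite(List<String> history, String question) {
        String rewritten = llm.apply(buildPrompt(history, question));
        // Fall back to the original question if the model returns nothing usable.
        return (rewritten == null || rewritten.trim().isEmpty()) ? question : rewritten.trim();
    }

    public static void main(String[] args) {
        // Stub model that always returns the same answer; replace with a real client.
        QueryRewriter rewriter = new QueryRewriter(
                prompt -> "What to do if the apples are rotten on arrival");
        List<String> history = List.of("User: How are the apples?", "Assistant: They are quite sweet.");
        System.out.println(rewriter.rewrite(history, "And what if they go bad?"));
    }
}
```

<p>Falling back to the original question keeps retrieval working when the rewrite model times out or returns an empty answer.</p>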
<hr>
<h1>5 Multi-Source Retrieval</h1>
<p>In a large knowledge base, a single retrieval path is often not enough.</p>
<p>Enterprises usually adopt:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Multi Retrieval</span><br></div></code></pre></div></div>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="常见召回源">Common retrieval sources<a href="https://jameszbl.github.io/make-better-rag#%E5%B8%B8%E8%A7%81%E5%8F%AC%E5%9B%9E%E6%BA%90" class="hash-link" aria-label="Direct link to Common retrieval sources" title="Direct link to Common retrieval sources" translate="no">​</a></h2>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="faq知识库">FAQ knowledge base<a href="https://jameszbl.github.io/make-better-rag#faq%E7%9F%A5%E8%AF%86%E5%BA%93" class="hash-link" aria-label="Direct link to FAQ knowledge base" title="Direct link to FAQ knowledge base" translate="no">​</a></h3>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Maintained by the operations team</span><br></div></code></pre></div></div>
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="商品知识库">Product knowledge base<a href="https://jameszbl.github.io/make-better-rag#%E5%95%86%E5%93%81%E7%9F%A5%E8%AF%86%E5%BA%93" class="hash-link" aria-label="Direct link to Product knowledge base" title="Direct link to Product knowledge base" translate="no">​</a></h3>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Product attributes</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Product selling points</span><br></div></code></pre></div></div>
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="售后知识库">After-sales knowledge base<a href="https://jameszbl.github.io/make-better-rag#%E5%94%AE%E5%90%8E%E7%9F%A5%E8%AF%86%E5%BA%93" class="hash-link" aria-label="Direct link to After-sales knowledge base" title="Direct link to After-sales knowledge base" translate="no">​</a></h3>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Return rules</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Compensation rules</span><br></div></code></pre></div></div>
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="业务数据库">Business database<a href="https://jameszbl.github.io/make-better-rag#%E4%B8%9A%E5%8A%A1%E6%95%B0%E6%8D%AE%E5%BA%93" class="hash-link" aria-label="Direct link to Business database" title="Direct link to Business database" translate="no">​</a></h3>
<p>Queried in real time:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Order status</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Logistics status</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Inventory status</span><br></div></code></pre></div></div>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="多路召回架构">Multi-source retrieval architecture<a href="https://jameszbl.github.io/make-better-rag#%E5%A4%9A%E8%B7%AF%E5%8F%AC%E5%9B%9E%E6%9E%B6%E6%9E%84" class="hash-link" aria-label="Direct link to Multi-source retrieval architecture" title="Direct link to Multi-source retrieval architecture" translate="no">​</a></h2>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">       User question</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">             ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">       Intent Router</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">             ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">FAQ | Product | After-sales</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">             ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">           Merge</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">             ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">          ReRank</span><br></div></code></pre></div></div>
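<p>The article does not describe how the Intent Router decides which sources to query. As one simple possibility, the sketch below routes by keyword and falls back to all sources when nothing matches. The keyword lists and the <code>Source</code> enum are made up for illustration; a production router might use a classifier or an LLM instead:</p>

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Keyword-based intent routing to retrieval sources (illustrative sketch). */
final class IntentRouter {
    enum Source { FAQ, PRODUCT, AFTERSALE }

    // Hypothetical keyword lists; real ones would come from the business domain.
    private static final Map<Source, List<String>> KEYWORDS = new EnumMap<>(Source.class);
    static {
        KEYWORDS.put(Source.FAQ, List.of("shipping", "delivery", "invoice"));
        KEYWORDS.put(Source.PRODUCT, List.of("spec", "size", "weight", "origin"));
        KEYWORDS.put(Source.AFTERSALE, List.of("refund", "return", "compensation", "damaged"));
    }

    /** Returns the sources whose keywords occur in the question, or all sources if none do. */
    static Set<Source> route(String question) {
        String q = question.toLowerCase();
        Set<Source> targets = EnumSet.noneOf(Source.class);
        for (Map.Entry<Source, List<String>> entry : KEYWORDS.entrySet()) {
            for (String keyword : entry.getValue()) {
                if (q.contains(keyword)) {
                    targets.add(entry.getKey());
                    break;
                }
            }
        }
        return targets.isEmpty() ? EnumSet.allOf(Source.class) : targets;
    }

    public static void main(String[] args) {
        System.out.println(route("Can I get a refund for damaged fruit?"));
        System.out.println(route("Hello there"));
    }
}
```

<p>Querying every source when the intent is unclear trades some latency for recall; the later Merge and ReRank steps then filter out the irrelevant candidates.</p>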
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="工程实现">工程实现<a href="https://jameszbl.github.io/make-better-rag#%E5%B7%A5%E7%A8%8B%E5%AE%9E%E7%8E%B0" class="hash-link" aria-label="工程实现的直接链接" title="工程实现的直接链接" translate="no">​</a></h2>
<p>常见做法：</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">CompletableFuture</span><br></div></code></pre></div></div>
<p>并行召回：</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">faqRetriever.retrieve();</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">productRetriever.retrieve();</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">aftersaleRetriever.retrieve();</span><br></div></code></pre></div></div>
<p>The results are then merged.</p>
<p>Because the routes run concurrently, overall recall latency is bounded by the slowest route rather than the sum of all routes, which can typically keep it within:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">200~500ms</span><br></div></code></pre></div></div>
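<p>The parallel recall described above can be sketched as follows. This is a minimal illustration, not the article's production code: the <code>Retriever</code> interface, the three stand-in routes, and the 500 ms per-route timeout are all assumptions made for the example. Each route is started with <code>CompletableFuture.supplyAsync</code>, and a route that fails or times out simply contributes an empty list to the merged result.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelRecall {

    // Hypothetical retriever abstraction for this sketch; not part of any library.
    interface Retriever {
        List<String> retrieve(String query);
    }

    // Runs every retriever concurrently and merges the results in route order.
    // A route that throws or exceeds the timeout contributes an empty list.
    static List<String> recall(String query, List<Retriever> retrievers,
                               ExecutorService pool, long timeoutMs) {
        List<CompletableFuture<List<String>>> futures = new ArrayList<>();
        for (Retriever r : retrievers) {
            futures.add(CompletableFuture
                    .supplyAsync(() -> r.retrieve(query), pool)
                    .completeOnTimeout(List.of(), timeoutMs, TimeUnit.MILLISECONDS)
                    .exceptionally(ex -> List.of()));
        }
        List<String> merged = new ArrayList<>();
        for (CompletableFuture<List<String>> f : futures) {
            merged.addAll(f.join());
        }
        return merged;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            // Stand-in routes; real ones would query a vector store or search engine.
            List<Retriever> routes = List.of(
                    q -> List.of("faq:" + q),
                    q -> List.of("product:" + q),
                    q -> { throw new IllegalStateException("aftersale route down"); });
            System.out.println(recall("refund", routes, pool, 500));
            // prints [faq:refund, product:refund]
        } finally {
            pool.shutdown();
        }
    }
}
```

<p>Degrading a failed route to an empty list is one possible design choice. It keeps the answer path alive when a single index is unavailable, at the cost of silently lower recall, so in practice such failures should at least be logged or counted.</p>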
<hr>
<h1>6 Evaluating online performance</h1>
<p>A problem many teams face after going live:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Users say it doesn't work well</span><br></div></code></pre></div></div>
<p>but nobody knows which stage is at fault.</p>
<p>An evaluation framework is therefore essential.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="retrieval指标">Retrieval metrics<a href="https://jameszbl.github.io/make-better-rag#retrieval%E6%8C%87%E6%A0%87" class="hash-link" aria-label="Retrieval指标的直接链接" title="Retrieval指标的直接链接" translate="no">​</a></h2>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="recallk">Recall@K<a href="https://jameszbl.github.io/make-better-rag#recallk" class="hash-link" aria-label="Recall@K的直接链接" title="Recall@K的直接链接" translate="no">​</a></h3>
<p>Whether the correct answer appears among the top K retrieved results. Commonly tracked cut-offs:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Recall@5</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Recall@10</span><br></div></code></pre></div></div>
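<p>As a small illustration, Recall@K for a single query can be computed as the share of relevant documents that appear in the top K results. The document ids and the <code>recallAtK</code> helper below are made up for the example.</p>

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RecallAtK {

    // Fraction of the relevant document ids that appear in the top-k results.
    static double recallAtK(List<String> ranked, Set<String> relevant, int k) {
        if (relevant.isEmpty()) {
            return 0.0;
        }
        int cut = Math.max(0, Math.min(k, ranked.size()));
        Set<String> topK = new HashSet<>(ranked.subList(0, cut));
        long hits = relevant.stream().filter(topK::contains).count();
        return (double) hits / relevant.size();
    }

    public static void main(String[] args) {
        List<String> ranked = List.of("d3", "d7", "d1", "d9", "d4", "d2");
        Set<String> relevant = Set.of("d1", "d2");
        System.out.println(recallAtK(ranked, relevant, 5));  // prints 0.5
        System.out.println(recallAtK(ranked, relevant, 10)); // prints 1.0
    }
}
```

<p>Averaging this value over a labelled query set gives the Recall@5 and Recall@10 figures listed above.</p>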
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="mrr">MRR<a href="https://jameszbl.github.io/make-better-rag#mrr" class="hash-link" aria-label="MRR的直接链接" title="MRR的直接链接" translate="no">​</a></h3>
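<p>Mean Reciprocal Rank.</p>
<p>Measures how high the first correct result ranks: the reciprocal of its position, averaged over all queries.</p>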
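<p>A minimal sketch of MRR, using invented document ids: each query contributes 1 divided by the position of its first relevant result (or 0 if none was retrieved), and the values are averaged.</p>

```java
import java.util.List;
import java.util.Set;

public class MeanReciprocalRank {

    // 1 / (1-based position of the first relevant result), or 0 if none is present.
    static double reciprocalRank(List<String> ranked, Set<String> relevant) {
        for (int i = 0; i < ranked.size(); i++) {
            if (relevant.contains(ranked.get(i))) {
                return 1.0 / (i + 1);
            }
        }
        return 0.0;
    }

    // Mean of the reciprocal ranks over all queries; both lists are index-aligned.
    static double mrr(List<List<String>> rankedPerQuery, List<Set<String>> relevantPerQuery) {
        if (rankedPerQuery.isEmpty()) {
            return 0.0;
        }
        double sum = 0.0;
        for (int q = 0; q < rankedPerQuery.size(); q++) {
            sum += reciprocalRank(rankedPerQuery.get(q), relevantPerQuery.get(q));
        }
        return sum / rankedPerQuery.size();
    }

    public static void main(String[] args) {
        List<List<String>> ranked = List.of(
                List.of("a", "b", "c"),        // first relevant at position 1
                List.of("x", "y", "z", "w"));  // first relevant at position 4
        List<Set<String>> relevant = List.of(Set.of("a"), Set.of("w"));
        System.out.println(mrr(ranked, relevant)); // prints 0.625
    }
}
```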
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="ndcg">NDCG<a href="https://jameszbl.github.io/make-better-rag#ndcg" class="hash-link" aria-label="NDCG的直接链接" title="NDCG的直接链接" translate="no">​</a></h3>
<p>Normalized Discounted Cumulative Gain. Measures ranking quality, rewarding relevant results more the higher they are placed.</p>
<p>Well suited to:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">ReRank evaluation</span><br></div></code></pre></div></div>
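<p>A sketch of NDCG@K under two stated assumptions: relevance is a graded integer label per retrieved result (for example 0 to 3), and the ideal ordering is taken over the retrieved list only rather than over every judged document. It uses the linear-gain form, DCG = &Sigma; gain<sub>i</sub> / log<sub>2</sub>(i + 1).</p>

```java
import java.util.Arrays;

public class Ndcg {

    // DCG@k with linear gain: sum of gain_i / log2(i + 1), positions starting at 1.
    static double dcg(int[] gains, int k) {
        double sum = 0.0;
        int n = Math.min(k, gains.length);
        for (int i = 0; i < n; i++) {
            sum += gains[i] / (Math.log(i + 2) / Math.log(2));
        }
        return sum;
    }

    // NDCG@k: DCG of the actual order divided by DCG of the ideal (descending) order.
    static double ndcg(int[] gains, int k) {
        int[] ideal = gains.clone();
        Arrays.sort(ideal);
        // Arrays.sort is ascending, so reverse in place to get the ideal order.
        for (int i = 0, j = ideal.length - 1; i < j; i++, j--) {
            int tmp = ideal[i];
            ideal[i] = ideal[j];
            ideal[j] = tmp;
        }
        double idcg = dcg(ideal, k);
        return idcg == 0.0 ? 0.0 : dcg(gains, k) / idcg;
    }

    public static void main(String[] args) {
        System.out.println(ndcg(new int[] {3, 2, 1, 0}, 4)); // already ideal: prints 1.0
        System.out.println(ndcg(new int[] {0, 1, 2, 3}, 4)); // reversed order scores lower
    }
}
```

<p>Comparing the score before and after the ReRank stage on the same candidate list shows whether reranking actually moved the relevant results upward.</p>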
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="generation指标">Generation metrics<a href="https://jameszbl.github.io/make-better-rag#generation%E6%8C%87%E6%A0%87" class="hash-link" aria-label="Generation指标的直接链接" title="Generation指标的直接链接" translate="no">​</a></h2>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="answer-correctness">Answer Correctness<a href="https://jameszbl.github.io/make-better-rag#answer-correctness" class="hash-link" aria-label="Answer Correctness的直接链接" title="Answer Correctness的直接链接" translate="no">​</a></h3>
<p>The proportion of answers that are correct.</p>
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="groundedness">Groundedness<a href="https://jameszbl.github.io/make-better-rag#groundedness" class="hash-link" aria-label="Groundedness的直接链接" title="Groundedness的直接链接" translate="no">​</a></h3>
<p>Whether the answer is supported by the knowledge base.</p>
<p>Used to measure hallucination.</p>
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="faithfulness">Faithfulness<a href="https://jameszbl.github.io/make-better-rag#faithfulness" class="hash-link" aria-label="Faithfulness的直接链接" title="Faithfulness的直接链接" translate="no">​</a></h3>
<p>Whether the answer stays faithful to the retrieved content.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="ab实验">A/B experiments<a href="https://jameszbl.github.io/make-better-rag#ab%E5%AE%9E%E9%AA%8C" class="hash-link" aria-label="A/B实验的直接链接" title="A/B实验的直接链接" translate="no">​</a></h2>
<p>In production, two variants usually run side by side:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Variant A</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Old retrieval strategy</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Variant B</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">New retrieval strategy</span><br></div></code></pre></div></div>
<p>and are compared on:</p>
<ul>
<li class="">Click-through rate</li>
<li class="">Adoption rate</li>
<li class="">User satisfaction</li>
<li class="">Human ratings</li>
</ul>
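<p>For the comparison to be meaningful, a given user should see the same variant on every request. One way to get that, sketched here as an assumption rather than something the original setup prescribes, is to hash the user id together with an experiment name and map the hash to a percentage bucket. The class name, method name, and <code>percentB</code> parameter are hypothetical.</p>

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class AbBucketing {

    // Stable assignment: the same user and experiment always map to the same variant.
    // percentB is the share of traffic (0..100) routed to variant B.
    static String assign(String userId, String experiment, int percentB) {
        CRC32 crc = new CRC32();
        crc.update((experiment + ":" + userId).getBytes(StandardCharsets.UTF_8));
        long bucket = crc.getValue() % 100; // getValue() is non-negative
        return bucket < percentB ? "B" : "A";
    }

    public static void main(String[] args) {
        // Route roughly 10% of users to the new retrieval strategy.
        System.out.println(assign("user-42", "rerank-v2", 10));
    }
}
```

<p>Including the experiment name in the hash keeps assignments independent across experiments, so a user who landed in variant B of one test is not systematically placed in variant B of the next.</p>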
<hr>
<h1>Summary</h1>
<p>Optimizing enterprise RAG is fundamentally a search engineering problem, not just an LLM problem.</p>
<p>A mature RAG system typically includes:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Chunk optimization</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Hybrid Search</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Query Rewrite</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Multi-route recall</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">ReRank</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Prompt assembly</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">LLM generation</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Evaluation</span><br></div></code></pre></div></div>
<p>For Java engineers, the real competitive edge lies not in calling an LLM API once, but in building a RAG engineering stack that is scalable, observable, and continuously optimizable.</p>
<p>As a knowledge base grows from a few hundred documents to several hundred thousand, what determines system quality is often not the model's parameters but every engineering detail along the retrieval pipeline.</p>]]></content:encoded>
            <category>ai</category>
            <category>java</category>
            <category>rag</category>
        </item>
        <item>
            <title><![CDATA[Building a Java RAG System from Scratch]]></title>
            <link>https://jameszbl.github.io/build-rag-by-java</link>
            <guid>https://jameszbl.github.io/build-rag-by-java</guid>
            <pubDate>Fri, 14 Nov 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[]]></description>
            <content:encoded><![CDATA[
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="前言">Introduction<a href="https://jameszbl.github.io/build-rag-by-java#%E5%89%8D%E8%A8%80" class="hash-link" aria-label="前言的直接链接" title="前言的直接链接" translate="no">​</a></h2>
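<p>The previous article introduced the overall architecture of RAG (Retrieval-Augmented Generation) and why it has become standard for enterprise AI applications.</p>
<p>This article builds a complete RAG system from scratch in Java, covering:</p>
<ul>
<li class="">Document parsing</li>
<li class="">Chunk splitting</li>
<li class="">Embedding generation</li>
<li class="">Qdrant storage</li>
<li class="">Vector retrieval</li>
<li class="">Final answer generation by the LLM</li>
</ul>
<p>The end result is the following flow:</p>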
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">User question</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Embedding</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Qdrant vector search</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Retrieve relevant knowledge chunks</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Prompt assembly</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">LLM generates the answer</span><br></div></code></pre></div></div>
<p>Tech stack:</p>
<ul>
<li class="">Spring Boot 3</li>
<li class="">LangChain4j</li>
<li class="">OpenAI Compatible API</li>
<li class="">Qdrant</li>
<li class="">Apache POI</li>
<li class="">PDFBox</li>
</ul>
<hr>
<h1>1. Document parsing</h1>
<p>Data in an enterprise knowledge base typically comes from:</p>
<ul>
<li class="">PDF</li>
<li class="">Word</li>
<li class="">Markdown</li>
<li class="">Wiki</li>
<li class="">Excel</li>
</ul>
<p>The first step is to convert all of these documents into plain text.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="pdf解析">PDF parsing<a href="https://jameszbl.github.io/build-rag-by-java#pdf%E8%A7%A3%E6%9E%90" class="hash-link" aria-label="PDF解析的直接链接" title="PDF解析的直接链接" translate="no">​</a></h2>
<p>Using PDFBox (the <code>Loader.loadPDF</code> entry point below is the 3.x API):</p>
<div class="language-xml codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-xml codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token tag punctuation" style="color:#393A34">&lt;</span><span class="token tag" style="color:#00009f">dependency</span><span class="token tag punctuation" style="color:#393A34">&gt;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token tag punctuation" style="color:#393A34">&lt;</span><span class="token tag" style="color:#00009f">groupId</span><span class="token tag punctuation" style="color:#393A34">&gt;</span><span class="token plain">org.apache.pdfbox</span><span class="token tag punctuation" style="color:#393A34">&lt;/</span><span class="token tag" style="color:#00009f">groupId</span><span class="token tag punctuation" style="color:#393A34">&gt;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token tag punctuation" style="color:#393A34">&lt;</span><span class="token tag" style="color:#00009f">artifactId</span><span class="token tag punctuation" style="color:#393A34">&gt;</span><span class="token plain">pdfbox</span><span class="token tag punctuation" style="color:#393A34">&lt;/</span><span class="token tag" style="color:#00009f">artifactId</span><span class="token tag punctuation" style="color:#393A34">&gt;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token tag punctuation" style="color:#393A34">&lt;</span><span class="token tag" style="color:#00009f">version</span><span class="token tag 
punctuation" style="color:#393A34">&gt;</span><span class="token plain">3.0.2</span><span class="token tag punctuation" style="color:#393A34">&lt;/</span><span class="token tag" style="color:#00009f">version</span><span class="token tag punctuation" style="color:#393A34">&gt;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token tag punctuation" style="color:#393A34">&lt;/</span><span class="token tag" style="color:#00009f">dependency</span><span class="token tag punctuation" style="color:#393A34">&gt;</span><br></div></code></pre></div></div>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">public String parsePdf(File file) throws IOException {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    try (PDDocument document = Loader.loadPDF(file)) {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        PDFTextStripper stripper = new PDFTextStripper();</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        return stripper.getText(document);</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    }</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">}</span><br></div></code></pre></div></div>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="word解析">Word parsing<a href="https://jameszbl.github.io/build-rag-by-java#word%E8%A7%A3%E6%9E%90" class="hash-link" aria-label="Word解析的直接链接" title="Word解析的直接链接" translate="no">​</a></h2>
<p>Using Apache POI (<code>XWPFDocument</code> reads the <code>.docx</code> format):</p>
<div class="language-xml codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-xml codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token tag punctuation" style="color:#393A34">&lt;</span><span class="token tag" style="color:#00009f">dependency</span><span class="token tag punctuation" style="color:#393A34">&gt;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token tag punctuation" style="color:#393A34">&lt;</span><span class="token tag" style="color:#00009f">groupId</span><span class="token tag punctuation" style="color:#393A34">&gt;</span><span class="token plain">org.apache.poi</span><span class="token tag punctuation" style="color:#393A34">&lt;/</span><span class="token tag" style="color:#00009f">groupId</span><span class="token tag punctuation" style="color:#393A34">&gt;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token tag punctuation" style="color:#393A34">&lt;</span><span class="token tag" style="color:#00009f">artifactId</span><span class="token tag punctuation" style="color:#393A34">&gt;</span><span class="token plain">poi-ooxml</span><span class="token tag punctuation" style="color:#393A34">&lt;/</span><span class="token tag" style="color:#00009f">artifactId</span><span class="token tag punctuation" style="color:#393A34">&gt;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token tag punctuation" style="color:#393A34">&lt;/</span><span class="token tag" style="color:#00009f">dependency</span><span class="token tag 
punctuation" style="color:#393A34">&gt;</span><br></div></code></pre></div></div>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">public String parseWord(File file) throws Exception {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    try (XWPFDocument document =</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">             new XWPFDocument(new FileInputStream(file))) {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        return document.getParagraphs()</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .stream()</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .map(XWPFParagraph::getText)</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .collect(Collectors.joining("\n"));</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    }</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">}</span><br></div></code></pre></div></div>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="markdown解析">Markdown parsing<a href="https://jameszbl.github.io/build-rag-by-java#markdown%E8%A7%A3%E6%9E%90" class="hash-link" aria-label="Markdown解析的直接链接" title="Markdown解析的直接链接" translate="no">​</a></h2>
<p>Markdown is the simplest case, since the file is already plain text:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">String content = Files.readString(</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        Path.of("rag.md"));</span><br></div></code></pre></div></div>
<p>A unified abstraction:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">public interface DocumentParser {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    // Parsers read from disk, so let I/O failures propagate to the caller.</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    String parse(File file) throws IOException;</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">}</span><br></div></code></pre></div></div>
<p>This makes it straightforward to add Excel, HTML, or Wiki parsers later.</p>
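<p>One way to put the abstraction to work is a small registry that picks a parser by file extension. The sketch below is illustrative only: <code>ParserRegistry</code> and its methods are hypothetical names, <code>parse</code> is assumed to be declared with <code>throws IOException</code>, and only a Markdown parser is registered so that the example runs without PDFBox or POI on the classpath.</p>

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class ParserRegistry {

    // Same shape as the DocumentParser abstraction; redeclared here so the sketch is self-contained.
    interface DocumentParser {
        String parse(File file) throws IOException;
    }

    private final Map<String, DocumentParser> parsers = new HashMap<>();

    // Associates a lower-cased file extension (without the dot) with a parser.
    void register(String extension, DocumentParser parser) {
        parsers.put(extension.toLowerCase(Locale.ROOT), parser);
    }

    // Looks up the parser by the file's extension and delegates to it.
    String parse(File file) throws IOException {
        String name = file.getName();
        int dot = name.lastIndexOf('.');
        String ext = dot < 0 ? "" : name.substring(dot + 1).toLowerCase(Locale.ROOT);
        DocumentParser parser = parsers.get(ext);
        if (parser == null) {
            throw new IllegalArgumentException("No parser registered for: " + name);
        }
        return parser.parse(file);
    }

    public static void main(String[] args) throws IOException {
        ParserRegistry registry = new ParserRegistry();
        registry.register("md", f -> Files.readString(f.toPath()));

        File tmp = File.createTempFile("rag", ".md");
        tmp.deleteOnExit();
        Files.writeString(tmp.toPath(), "# RAG");
        System.out.println(registry.parse(tmp)); // prints # RAG
    }
}
```

<p>Supporting a new format then only requires one more <code>register</code> call, and the ingestion code that calls <code>parse</code> stays unchanged.</p>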
<hr>
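<h1>2. Chunking strategies</h1>
<p>An LLM cannot take in a document of several hundred thousand characters in a single request, because its context window is limited.</p>
<p>The document therefore has to be split into multiple chunks.</p>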
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="fixed-size">Fixed Size<a href="https://jameszbl.github.io/build-rag-by-java#fixed-size" class="hash-link" aria-label="Fixed Size的直接链接" title="Fixed Size的直接链接" translate="no">​</a></h2>
<p>The simplest approach:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Cut once</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">every 500 characters</span><br></div></code></pre></div></div>
<p>Example:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">public List&lt;String&gt; split(</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        String text,</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        int chunkSize) {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    List&lt;String&gt; chunks = new ArrayList&lt;&gt;();</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    for (int i = 0; i &lt; text.length(); i += chunkSize) {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        chunks.add(</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                text.substring(</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                        i,</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                        Math.min(</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                                i + chunkSize,</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">          
                      text.length()</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                        )</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                )</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        );</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    }</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    return chunks;</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">}</span><br></div></code></pre></div></div>
<p>Pros:</p>
<ul>
<li class="">Simple to implement</li>
<li class="">Fast</li>
</ul>
<p>Cons:</p>
<ul>
<li class="">Tends to cut sentences and ideas in the middle</li>
</ul>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="sliding-window">Sliding Window<a href="https://jameszbl.github.io/build-rag-by-java#sliding-window" class="hash-link" aria-label="Sliding Window的直接链接" title="Sliding Window的直接链接" translate="no">​</a></h2>
<p>The most common choice in production.</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Chunk1</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">[1-500]</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Chunk2</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">[401-900]</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Chunk3</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">[801-1300]</span><br></div></code></pre></div></div>
<p>The overlapping region (100 characters here) carries context across chunk boundaries.</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">public List&lt;String&gt; split(</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        String text,</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        int chunkSize,</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        int overlap) {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    // The step (chunkSize - overlap) must be positive or the loop never advances.</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    if (overlap &lt; 0 || overlap &gt;= chunkSize) {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        throw new IllegalArgumentException("overlap must be in [0, chunkSize)");</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    }</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    List&lt;String&gt; chunks = new ArrayList&lt;&gt;();</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    for (int i = 0;</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">         i &lt; text.length();</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">         i += chunkSize - overlap) {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        int end = Math.min(i + chunkSize, text.length());</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        chunks.add(text.substring(i, end));</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        // Stop once the tail is covered to avoid a redundant final chunk.</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        if (end == text.length()) {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            break;</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        }</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    }</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    return chunks;</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">}</span><br></div></code></pre></div></div>
<p>Recommended starting values (given in tokens, whereas the sample code above counts characters):</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Chunk Size = 500~1000 Token</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Overlap = 100~200 Token</span><br></div></code></pre></div></div>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="semantic-chunk">Semantic Chunk<a href="https://jameszbl.github.io/build-rag-by-java#semantic-chunk" class="hash-link" aria-label="Semantic Chunk的直接链接" title="Semantic Chunk的直接链接" translate="no">​</a></h2>
<p>Split the document along semantic boundaries such as headings and paragraphs.</p>
<p>For example:</p>
<div class="language-markdown codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-markdown codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token title important punctuation" style="color:#393A34">#</span><span class="token title important"> Product review system</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Introduction...</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token title important punctuation" style="color:#393A34">#</span><span class="token title important"> AI reply system</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Introduction...</span><br></div></code></pre></div></div>
<p>Resulting chunks:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Chunk1</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Product review system</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Chunk2</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">AI reply system</span><br></div></code></pre></div></div>
<p>LangChain4j already ships a related implementation. Its recursive splitter tries paragraph, line, and sentence boundaries before falling back to smaller units; the two arguments are the maximum segment size and the maximum overlap, counted in characters for this overload:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">DocumentSplitter splitter =</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        DocumentSplitters.recursive(</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                500,</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                100</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        );</span><br></div></code></pre></div></div>
<p>Enterprise knowledge bases typically combine:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Semantic splitting</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">+</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Sliding Window</span><br></div></code></pre></div></div>
<p>which usually gives the best results in practice.</p>
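<p>To make the sliding-window half concrete, here is a minimal sketch in plain Java. It covers only the windowing, not semantic splitting, and it counts characters rather than tokens. The class name <code>SlidingWindowChunker</code> and its parameters are assumptions made for illustration; they are not part of LangChain4j.</p>

```java
import java.util.ArrayList;
import java.util.List;

public class SlidingWindowChunker {

    /**
     * Splits text into windows of at most `size` characters. Each window
     * starts `size - overlap` characters after the previous one, so
     * neighbouring windows share `overlap` characters.
     */
    public static List<String> split(String text, int size, int overlap) {
        if (size <= 0 || overlap < 0 || overlap >= size) {
            throw new IllegalArgumentException("require size > 0 and 0 <= overlap < size");
        }
        List<String> chunks = new ArrayList<>();
        int step = size - overlap;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + size, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last window already reached the end of the text
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Windows of 4 characters that share 2 characters with their neighbour.
        System.out.println(split("abcdefgh", 4, 2)); // prints [abcd, cdef, efgh]
    }
}
```

<p>The overlap means a sentence cut at a window boundary still appears whole in at least one neighbouring chunk, at the cost of storing some text twice.</p>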
<hr>
<h1>3. Generating Embeddings</h1>
<p>Once chunking is done, each chunk has to be converted into a vector. Add the LangChain4j OpenAI module:</p>
<div class="language-xml codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-xml codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token tag punctuation" style="color:#393A34">&lt;</span><span class="token tag" style="color:#00009f">dependency</span><span class="token tag punctuation" style="color:#393A34">&gt;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token tag punctuation" style="color:#393A34">&lt;</span><span class="token tag" style="color:#00009f">groupId</span><span class="token tag punctuation" style="color:#393A34">&gt;</span><span class="token plain">dev.langchain4j</span><span class="token tag punctuation" style="color:#393A34">&lt;/</span><span class="token tag" style="color:#00009f">groupId</span><span class="token tag punctuation" style="color:#393A34">&gt;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token tag punctuation" style="color:#393A34">&lt;</span><span class="token tag" style="color:#00009f">artifactId</span><span class="token tag punctuation" style="color:#393A34">&gt;</span><span class="token plain">langchain4j-open-ai</span><span class="token tag punctuation" style="color:#393A34">&lt;/</span><span class="token tag" style="color:#00009f">artifactId</span><span class="token tag punctuation" style="color:#393A34">&gt;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token tag punctuation" style="color:#393A34">&lt;/</span><span class="token tag" style="color:#00009f">dependency</span><span class="token 
tag punctuation" style="color:#393A34">&gt;</span><br></div></code></pre></div></div>
<p>Configure the embedding model:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">@Bean</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">EmbeddingModel embeddingModel() {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    return OpenAiEmbeddingModel.builder()</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            .baseUrl(baseUrl)</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            .apiKey(apiKey)</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            .modelName("text-embedding-v4")</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            .build();</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">}</span><br></div></code></pre></div></div>
<p>Generate a vector:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Embedding embedding =</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        embeddingModel.embed(text)</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .content();</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">float[] vector = embedding.vector();</span><br></div></code></pre></div></div>
<p>The result:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">[0.123,</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> 0.555,</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> -0.233,</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ...]</span><br></div></code></pre></div></div>
<p>Typical dimensions:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">768</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">1024</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">1536</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">3072</span><br></div></code></pre></div></div>
<hr>
<h1>4. Storing in Qdrant</h1>
<p>The vector database stores:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Vector</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">+</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Original text</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">+</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Metadata</span><br></div></code></pre></div></div>
<p>Run Qdrant with Docker Compose. Port 6333 serves the REST API and port 6334 serves gRPC, which is what the Qdrant Java client connects to:</p>
<div class="language-yaml codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-yaml codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token key atrule" style="color:#00a4db">version</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"3"</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token key atrule" style="color:#00a4db">services</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token key atrule" style="color:#00a4db">qdrant</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token key atrule" style="color:#00a4db">image</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> qdrant/qdrant</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token key atrule" style="color:#00a4db">ports</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">      </span><span class="token punctuation" style="color:#393A34">-</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"6333:6333"</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">      </span><span class="token punctuation" style="color:#393A34">-</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"6334:6334"</span><br></div></code></pre></div></div>
<p>Start it:</p>
<div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">docker compose up -d</span><br></div></code></pre></div></div>
<hr>
<p>Create the collection:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">// The vector size must equal the embedding model's output dimension.</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">client.createCollectionAsync(</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        "knowledge",</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        VectorParams.newBuilder()</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .setSize(1536)</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .setDistance(Distance.Cosine)</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .build()</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">).get(); // block until the collection has been created</span><br></div></code></pre></div></div>
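<p>The <code>Distance.Cosine</code> setting means points are ranked by cosine similarity to the query vector. As a rough illustration of what that score measures, here is a plain-Java sketch. The class <code>CosineSimilarity</code> is a hypothetical helper written for this article and is not how Qdrant computes the score internally.</p>

```java
public class CosineSimilarity {

    /**
     * Returns dot(a, b) / (|a| * |b|), a value in [-1, 1].
     * Returns 0 when either vector has zero length.
     */
    public static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Same direction, different length: similarity is 1.
        System.out.println(cosine(new float[]{3f, 4f}, new float[]{6f, 8f}));
        // Orthogonal vectors: similarity is 0.
        System.out.println(cosine(new float[]{1f, 0f}, new float[]{0f, 1f}));
    }
}
```

<p>Because the score depends only on direction, two chunks with the same meaning but different lengths can still score close to 1.</p>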
<hr>
<p>Write the vector:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">PointStruct point =</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        PointStruct.newBuilder()</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                // PointIdFactory.id accepts a UUID or a long, not a String</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .setId(</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                        PointIdFactory.id(</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                                UUID.randomUUID()</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                        )</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                )</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .setVectors(</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                        VectorsFactory.vectors(vector)</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                )</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .putAllPayload(</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                        Map.of(</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                                "content",</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                                ValueFactory.value(text)</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                        )</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                )</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .build();</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">client.upsertAsync(</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        "knowledge",</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        List.of(point)</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">).get(); // wait for the write to complete</span><br></div></code></pre></div></div>
<p>Stored fields:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">ID</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Vector</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Content</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Source</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Title</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">CreateTime</span><br></div></code></pre></div></div>
<hr>
<h1>5. Vector Search</h1>
<p>The user asks:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">How is AI reply to product reviews implemented?</span><br></div></code></pre></div></div>
<p>First convert the question into a vector:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Embedding queryEmbedding =</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        embeddingModel.embed(question)</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .content();</span><br></div></code></pre></div></div>
<hr>
<p>Run the similarity search:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">SearchPoints searchPoints =</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        SearchPoints.newBuilder()</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .setCollectionName("knowledge")</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .addAllVector(</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                        Floats.asList(</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                                queryEmbedding.vector()</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                        )</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                )</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                // return the stored payload so the chunk text is available</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .setWithPayload(WithPayloadSelectorFactory.enable(true))</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .setLimit(5)</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .build();</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">List&lt;ScoredPoint&gt; points =</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        client.searchAsync(searchPoints)</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .get();</span><br></div></code></pre></div></div>
<p>The results:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Chunk A</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Score = 0.91</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Chunk B</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Score = 0.88</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Chunk C</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Score = 0.85</span><br></div></code></pre></div></div>
<p>Keep the top K, where common choices are:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Top3</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Top5</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Top10</span><br></div></code></pre></div></div>
<p>In most cases:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Top5</span><br></div></code></pre></div></div>
<p>is enough.</p>
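<p>Top-K selection itself is just sort-and-truncate. The sketch below shows it in plain Java over scored chunks. The <code>ScoredChunk</code> record, the <code>TopKSelector</code> class and the <code>minScore</code> cut-off are assumptions added for illustration; the cut-off is not part of the flow described above, and the Qdrant client already returns at most <code>limit</code> results sorted by score.</p>

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class TopKSelector {

    /** A retrieved chunk together with its similarity score (hypothetical type). */
    public record ScoredChunk(String content, double score) {}

    /**
     * Drops chunks scoring below minScore, then returns at most k of the
     * remaining chunks, highest score first.
     */
    public static List<ScoredChunk> topK(List<ScoredChunk> candidates, int k, double minScore) {
        return candidates.stream()
                .filter(c -> c.score() >= minScore)
                .sorted(Comparator.comparingDouble(ScoredChunk::score).reversed())
                .limit(k)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<ScoredChunk> candidates = List.of(
                new ScoredChunk("Chunk B", 0.88),
                new ScoredChunk("Chunk D", 0.40),
                new ScoredChunk("Chunk A", 0.91),
                new ScoredChunk("Chunk C", 0.85));
        // Keeps A and B: the two best chunks at or above the 0.8 cut-off.
        System.out.println(topK(candidates, 2, 0.8));
    }
}
```

<p>A larger K gives the model more context but also more noise and a longer prompt, which is why a moderate value such as 5 is a common starting point.</p>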
<hr>
<h1>6. Generating the Final Answer</h1>
<p>The retrieved knowledge has to be injected into the prompt.</p>
<hr>
<p>Build the context:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">String context = chunks.stream()</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        .collect(Collectors.joining("\n"));</span><br></div></code></pre></div></div>
<p>Prompt template:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">String prompt = """</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">You are an enterprise knowledge base assistant.</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Answer strictly based on the knowledge provided.</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Knowledge:</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">%s</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Question:</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">%s</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">If the knowledge does not contain the answer,</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">say clearly that you do not know.</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">""".formatted(</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        context,</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        question</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">);</span><br></div></code></pre></div></div>
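<p>Joining every retrieved chunk with a newline works for a handful of short chunks, but the context can outgrow what the model accepts. One possible refinement, sketched below in plain Java, numbers the chunks and stops adding them once a size budget is reached. The <code>ContextBuilder</code> class and the <code>maxChars</code> parameter are assumptions for illustration; a character budget is only a rough stand-in for a token budget, and the chunks are assumed to arrive ordered from most to least relevant.</p>

```java
import java.util.List;

public class ContextBuilder {

    /**
     * Concatenates chunks as numbered entries ("[1] ...", "[2] ...") and
     * stops before the first entry that would push the context past maxChars.
     */
    public static String build(List<String> chunks, int maxChars) {
        StringBuilder context = new StringBuilder();
        int index = 1;
        for (String chunk : chunks) {
            String entry = "[" + index + "] " + chunk + "\n";
            if (context.length() + entry.length() > maxChars) {
                break; // budget exhausted: drop this and all less relevant chunks
            }
            context.append(entry);
            index++;
        }
        return context.toString();
    }

    public static void main(String[] args) {
        List<String> chunks = List.of("Sentiment is classified first.", "Negative reviews get intent detection.");
        System.out.print(build(chunks, 200));
    }
}
```

<p>Numbering the entries also lets the prompt ask the model to cite which entry an answer came from.</p>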
<hr>
<p>Call the large language model:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">// chat(String) returns the reply text directly</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">String answer =</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        chatModel.chat(prompt);</span><br></div></code></pre></div></div>
<p>The final result:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Question:</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">How is AI reply to product reviews implemented?</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Answer:</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">The system first runs sentiment recognition</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">to decide whether a review is positive or negative.</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">For negative reviews it then identifies the intent,</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">such as slow delivery, product quality issues,</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">or fruit that is not sweet.</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">It then retrieves from the matching knowledge base</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">and generates the reply using the prompt template.</span><br></div></code></pre></div></div>
<hr>
<h1>Complete Pipeline Code</h1>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">public String ask(String question) {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    // 1. Embed the question</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    Embedding queryEmbedding =</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            embeddingModel.embed(question)</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                    .content();</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    // 2. Search Qdrant</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    // (vectorStore is an application-level wrapper, not shown here)</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    List&lt;String&gt; chunks =</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            vectorStore.search(</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                    queryEmbedding.vector(),</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                    5</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            );</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    // 3. Join the chunks into a context</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    String context =</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            String.join("\n", chunks);</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    // 4. Build the prompt</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    String prompt = """</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            Answer the question based on the knowledge below.</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            Knowledge:</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            %s</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            Question:</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            %s</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            """</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            .formatted(</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                    context,</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                    question</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            );</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    // 5. Generate the answer; chat(String) returns the reply text directly</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    return chatModel.chat(prompt);</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">}</span><br></div></code></pre></div></div>
<p>Overall architecture:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Documents</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Parsing</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Chunk</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Embedding</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Qdrant</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Vector search</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Prompt augmentation</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">LLM answer</span><br></div></code></pre></div></div>
<hr>
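<h1>Summary</h1>
<p>A complete RAG system essentially comes down to six core steps:</p>
<ol>
<li class="">Document parsing (PDF/Word/Markdown)</li>
<li class="">Chunk splitting</li>
<li class="">Embedding generation</li>
<li class="">Storage in Qdrant</li>
<li class="">Vector search</li>
<li class="">Answer generation by the LLM</li>
</ol>
<p>Enterprise applications usually go on to add the following capabilities:</p>
<ul>
<li class="">Hybrid Search (keyword + vector retrieval)</li>
<li class="">ReRank (re-ranking of retrieved results)</li>
<li class="">Routing across multiple knowledge bases</li>
<li class="">Metadata filtering</li>
<li class="">Query Rewrite</li>
<li class="">Multi-turn conversation memory</li>
<li class="">Incremental document updates</li>
<li class="">Retrieval quality evaluation</li>
</ul>]]></content:encoded>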
            <category>ai</category>
            <category>java</category>
            <category>rag</category>
        </item>
        <item>
            <title><![CDATA[Why RAG Is Standard for Enterprise AI Applications]]></title>
            <link>https://jameszbl.github.io/rag-is-necessary</link>
            <guid>https://jameszbl.github.io/rag-is-necessary</guid>
            <pubDate>Fri, 07 Nov 2025 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[
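<p>Once you start building enterprise AI applications in earnest, you quickly run into one question:</p>
<blockquote>
<p>Why does the model seem so smart, yet so often answer incorrectly in real business scenarios?</p>
</blockquote>
<p>This is the problem that RAG (Retrieval-Augmented Generation) was created to solve.</p>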
<hr>
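<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="1-为什么模型回答不准">1. Why the model answers inaccurately<a href="https://jameszbl.github.io/rag-is-necessary#1-%E4%B8%BA%E4%BB%80%E4%B9%88%E6%A8%A1%E5%9E%8B%E5%9B%9E%E7%AD%94%E4%B8%8D%E5%87%86" class="hash-link" aria-label="Direct link to 1. Why the model answers inaccurately" title="Direct link to 1. Why the model answers inaccurately" translate="no">​</a></h2>
<p>Many people notice the same things the first time they integrate a large model:</p>
<ul>
<li class="">General knowledge questions are answered well</li>
<li class="">Questions about internal company knowledge are answered poorly</li>
<li class="">The model often makes things up with complete confidence</li>
</ul>
<p>For example:</p>
<p>A user asks:</p>
<blockquote>
<p>What is the shelf life of East Buy (东方甄选) own-brand grilled sausage?</p>
</blockquote>
<p>If the product launched recently, the model never saw any data about it during training.</p>
<p>The model can then only reason from what it learned during training, and it may produce an answer that looks plausible but is completely wrong.</p>
<p>This phenomenon is usually called:</p>
<p><strong>Hallucination</strong></p>
<p>The main causes are:</p>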
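<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="知识截止">Knowledge cutoff<a href="https://jameszbl.github.io/rag-is-necessary#%E7%9F%A5%E8%AF%86%E6%88%AA%E6%AD%A2" class="hash-link" aria-label="Direct link to Knowledge cutoff" title="Direct link to Knowledge cutoff" translate="no">​</a></h3>
<p>A large model only knows the data it was trained on.</p>
<p>For example:</p>
<ul>
<li class="">GPT is trained up to a certain point in time</li>
<li class="">DeepSeek is trained up to a certain point in time</li>
</ul>
<p>The model knows nothing about data created after that point.</p>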
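<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="企业私有知识缺失">Missing private enterprise knowledge<a href="https://jameszbl.github.io/rag-is-necessary#%E4%BC%81%E4%B8%9A%E7%A7%81%E6%9C%89%E7%9F%A5%E8%AF%86%E7%BC%BA%E5%A4%B1" class="hash-link" aria-label="Direct link to Missing private enterprise knowledge" title="Direct link to Missing private enterprise knowledge" translate="no">​</a></h3>
<p>Internal company data usually never makes it into public training sets.</p>
<p>For example:</p>
<ul>
<li class="">Product information</li>
<li class="">After-sales rules</li>
<li class="">Customer service scripts</li>
<li class="">Operations documentation</li>
<li class="">Internal policies</li>
</ul>
<p>The model has no built-in knowledge of any of this.</p>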
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="概率生成机制">Probabilistic generation<a href="https://jameszbl.github.io/rag-is-necessary#%E6%A6%82%E7%8E%87%E7%94%9F%E6%88%90%E6%9C%BA%E5%88%B6" class="hash-link" aria-label="Direct link to Probabilistic generation" title="Direct link to Probabilistic generation" translate="no">​</a></h3>
<p>At its core, an LLM predicts the next token.</p>
<p>What it aims for is:</p>
<blockquote>
<p>the most probable answer</p>
</blockquote>
<p>rather than:</p>
<blockquote>
<p>the most truthful answer</p>
</blockquote>
<p>That is why it produces wrong answers that “look perfectly reasonable”.</p>
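<p>A toy sketch of why “most probable” and “most truthful” can diverge. The candidate tokens and their probabilities are invented for illustration, and real models work over far larger vocabularies; the only point is that greedy selection looks at probability, not at facts.</p>

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy illustration of probability-driven decoding; all tokens and probabilities are made up. */
final class GreedyNextToken {

    /** Returns the candidate with the highest probability, or null if there are no candidates. */
    static String pickMostLikely(Map<String, Double> candidates) {
        String best = null;
        double bestProbability = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> entry : candidates.entrySet()) {
            if (entry.getValue() > bestProbability) {
                bestProbability = entry.getValue();
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical distribution for the token after "The shelf life is ...".
        Map<String, Double> next = new LinkedHashMap<>();
        next.put("30 days", 0.46); // frequent in training data, so highly probable
        next.put("90 days", 0.31); // imagine this is the true value for the new product
        next.put("unknown", 0.23);
        System.out.println(pickMostLikely(next));
    }
}
```

<p>Nothing in <code>pickMostLikely</code> checks the candidates against a source of truth, which is the gap that retrieval is meant to close.</p>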
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="2-prompt为什么解决不了知识更新问题">2. Why prompts cannot solve the knowledge update problem<a href="https://jameszbl.github.io/rag-is-necessary#2-prompt%E4%B8%BA%E4%BB%80%E4%B9%88%E8%A7%A3%E5%86%B3%E4%B8%8D%E4%BA%86%E7%9F%A5%E8%AF%86%E6%9B%B4%E6%96%B0%E9%97%AE%E9%A2%98" class="hash-link" aria-label="Direct link to 2. Why prompts cannot solve the knowledge update problem" title="Direct link to 2. Why prompts cannot solve the knowledge update problem" translate="no">​</a></h2>
<p>Many beginners think:</p>
<blockquote>
<p>Can't I just write the knowledge into the prompt?</p>
</blockquote>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">You are a customer service expert.</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Return policy for product A:</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">7-day no-reason returns</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Return policy for product B:</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">15-day no-reason returns</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">......</span><br></div></code></pre></div></div>
<p>This approach works for a small amount of data.</p>
<p>In an enterprise setting, however, it quickly runs into problems.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="问题1上下文长度有限">Problem 1: Limited context length<a href="https://jameszbl.github.io/rag-is-necessary#%E9%97%AE%E9%A2%981%E4%B8%8A%E4%B8%8B%E6%96%87%E9%95%BF%E5%BA%A6%E6%9C%89%E9%99%90" class="hash-link" aria-label="Direct link to Problem 1: Limited context length" title="Direct link to Problem 1: Limited context length" translate="no">​</a></h3>
<p>Every model has a context window limit.</p>
<p>Typical sizes, in tokens:</p>
<ul>
<li class="">32K</li>
<li class="">128K</li>
<li class="">256K</li>
</ul>
<p>An enterprise knowledge base may contain:</p>
<ul>
<li class="">Tens of thousands of products</li>
<li class="">Hundreds of thousands of FAQs</li>
<li class="">More than a million documents</li>
</ul>
<p>There is simply no way to fit all of that into a prompt.</p>
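<p>A back-of-the-envelope check makes the gap concrete. The figure of 200 tokens per FAQ is an assumption chosen for illustration, not a measurement, and 100,000 entries is the low end of “hundreds of thousands”.</p>

```java
/** Rough estimate of whether a whole knowledge base could be inlined into one prompt. */
final class ContextBudget {

    /** Total tokens needed to inline every entry into the prompt. */
    static long totalTokens(long entries, long avgTokensPerEntry) {
        return entries * avgTokensPerEntry;
    }

    /** True only if the whole knowledge base fits into the context window. */
    static boolean fits(long entries, long avgTokensPerEntry, long contextWindowTokens) {
        return totalTokens(entries, avgTokensPerEntry) <= contextWindowTokens;
    }

    public static void main(String[] args) {
        long faqs = 100_000;    // low end of "hundreds of thousands of FAQs"
        long avgTokens = 200;   // assumed average FAQ length in tokens
        long window = 128_000;  // a 128K context window
        System.out.println("tokens needed: " + totalTokens(faqs, avgTokens));
        System.out.println("fits in window: " + fits(faqs, avgTokens, window));
    }
}
```

<p>Under these assumptions the FAQs alone need 20 million tokens, more than 150 times a 128K window.</p>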
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="问题2知识更新成本高">Problem 2: High cost of updating knowledge<a href="https://jameszbl.github.io/rag-is-necessary#%E9%97%AE%E9%A2%982%E7%9F%A5%E8%AF%86%E6%9B%B4%E6%96%B0%E6%88%90%E6%9C%AC%E9%AB%98" class="hash-link" aria-label="Direct link to Problem 2: High cost of updating knowledge" title="Direct link to Problem 2: High cost of updating knowledge" translate="no">​</a></h3>
<p>Suppose:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Return policy for the apple gift box:</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">7-day no-reason returns</span><br></div></code></pre></div></div>
<p>Operations then changes it to:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">15-day no-reason returns</span><br></div></code></pre></div></div>
<p>If the knowledge is hard-coded in the prompt:</p>
<ul>
<li class="">The code has to be changed</li>
<li class="">The system has to be redeployed</li>
</ul>
<p>Every rule change then costs a development and release cycle, so maintenance becomes very expensive.</p>
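<p>For contrast, here is a minimal sketch of keeping the rule outside the code and reading it at request time. <code>ReturnRuleStore</code> is a hypothetical in-memory stand-in; a real system would back it with a database or a vector store.</p>

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory stand-in for an external knowledge store. */
final class ReturnRuleStore {

    private final Map<String, String> rules = new ConcurrentHashMap<>();

    /** Operations staff update a rule at runtime; no code change or redeploy is involved. */
    void upsert(String product, String rule) {
        rules.put(product, rule);
    }

    /** Builds the prompt from whatever rule is stored at request time. */
    String buildPrompt(String product, String question) {
        String rule = rules.getOrDefault(product, "(no rule on record)");
        return "Return policy for " + product + ": " + rule
                + "\nCustomer question: " + question;
    }

    public static void main(String[] args) {
        ReturnRuleStore store = new ReturnRuleStore();
        store.upsert("apple gift box", "7-day no-reason returns");
        store.upsert("apple gift box", "15-day no-reason returns"); // rule changed by operations
        System.out.println(store.buildPrompt("apple gift box", "Can I return it?"));
    }
}
```

<p>The prompt text is assembled per request, so the updated rule is picked up by the very next question.</p>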
<hr>
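<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="问题3token成本高">Problem 3: High token cost<a href="https://jameszbl.github.io/rag-is-necessary#%E9%97%AE%E9%A2%983token%E6%88%90%E6%9C%AC%E9%AB%98" class="hash-link" aria-label="Direct link to Problem 3: High token cost" title="Direct link to Problem 3: High token cost" translate="no">​</a></h3>
<p>The longer the prompt:</p>
<ul>
<li class="">The more input tokens it consumes</li>
<li class="">The longer inference takes</li>
<li class="">The higher the API bill</li>
</ul>
<p>Consider high-concurrency services such as:</p>
<ul>
<li class="">Intelligent customer service</li>
<li class="">AI assistants</li>
<li class="">Product question answering</li>
</ul>
<p>Because the full prompt is sent with every request, costs rise quickly.</p>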
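<p>The effect can be sketched with simple arithmetic. The request volume, prompt sizes, and the price of 1.0 currency unit per million input tokens are all placeholder assumptions; substitute your provider's actual pricing.</p>

```java
/** Rough daily input-token cost; all numbers used in main are placeholder assumptions. */
final class PromptCostEstimate {

    /** cost = requests per day * (prompt tokens / 1,000,000) * price per million input tokens */
    static double dailyInputCost(long requestsPerDay, long promptTokens, double pricePerMillionTokens) {
        return requestsPerDay * (promptTokens / 1_000_000.0) * pricePerMillionTokens;
    }

    public static void main(String[] args) {
        long requestsPerDay = 1_000_000; // assumed traffic
        double price = 1.0;              // assumed price per million input tokens
        double shortPrompt = dailyInputCost(requestsPerDay, 500, price);
        double longPrompt = dailyInputCost(requestsPerDay, 20_000, price);
        System.out.println("500-token prompt:    " + shortPrompt + " per day");
        System.out.println("20,000-token prompt: " + longPrompt + " per day");
    }
}
```

<p>Input cost scales linearly with prompt length, so a prompt that is 40 times longer costs 40 times as much at the same traffic.</p>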
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="3-rag整体架构">3. Overall RAG architecture<a href="https://jameszbl.github.io/rag-is-necessary#3-rag%E6%95%B4%E4%BD%93%E6%9E%B6%E6%9E%84" class="hash-link" aria-label="Direct link to 3. Overall RAG architecture" title="Direct link to 3. Overall RAG architecture" translate="no">​</a></h2>
<p>To solve the problems of updating and injecting knowledge, the industry gradually converged on the RAG architecture.</p>
<p>RAG stands for:</p>
<p><strong>Retrieval-Augmented Generation</strong></p>
<p>In other words, generation is augmented with content retrieved at request time.</p>
<p>The overall flow is:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">User question</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Embedding</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Vector search</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Context augmentation</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">LLM generation</span><br></div></code></pre></div></div>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="第一步用户提问">Step 1: User question<a href="https://jameszbl.github.io/rag-is-necessary#%E7%AC%AC%E4%B8%80%E6%AD%A5%E7%94%A8%E6%88%B7%E6%8F%90%E9%97%AE" class="hash-link" aria-label="Direct link to Step 1: User question" title="Direct link to Step 1: User question" translate="no">​</a></h3>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Can the apple gift box be returned?</span><br></div></code></pre></div></div>
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="第二步embedding向量化">Step 2: Embedding<a href="https://jameszbl.github.io/rag-is-necessary#%E7%AC%AC%E4%BA%8C%E6%AD%A5embedding%E5%90%91%E9%87%8F%E5%8C%96" class="hash-link" aria-label="Direct link to Step 2: Embedding" title="Direct link to Step 2: Embedding" translate="no">​</a></h3>
<p>Convert the question into a vector:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Can the apple gift box be returned?</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">[0.231, 0.754, ...]</span><br></div></code></pre></div></div>
<p>An embedding model maps semantically similar content to nearby points in the vector space.</p>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Can the apple gift box be returned</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">What are the after-sales rules for the apple gift box</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Can I get a refund for the apple gift box</span><br></div></code></pre></div></div>
<p>The three sentences are worded differently, but their vectors end up close together.</p>
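<p>“Close together” is commonly measured with cosine similarity. The sketch below uses hand-written three-dimensional vectors as stand-ins; real embedding models produce hundreds or thousands of dimensions, and the unrelated “weather” vector is a hypothetical addition for contrast.</p>

```java
/** Cosine similarity between two vectors of equal length. */
final class CosineSimilarity {

    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0; // a zero vector has no direction; treat it as unrelated
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Made-up vectors, not the output of a real embedding model.
        double[] returnQuestion = {0.9, 0.1, 0.2};
        double[] refundQuestion = {0.8, 0.2, 0.3};
        double[] weatherQuestion = {0.1, 0.9, 0.1};
        System.out.println("return vs refund:  " + cosine(returnQuestion, refundQuestion));
        System.out.println("return vs weather: " + cosine(returnQuestion, weatherQuestion));
    }
}
```

<p>A value near 1 means the vectors point in almost the same direction; a value near 0 means they are unrelated.</p>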
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="第三步向量检索">Step 3: Vector search<a href="https://jameszbl.github.io/rag-is-necessary#%E7%AC%AC%E4%B8%89%E6%AD%A5%E5%90%91%E9%87%8F%E6%A3%80%E7%B4%A2" class="hash-link" aria-label="Direct link to Step 3: Vector search" title="Direct link to Step 3: Vector search" translate="no">​</a></h3>
<p>Search the vector database for the most similar content.</p>
<p>Common vector databases:</p>
<ul>
<li class="">Qdrant</li>
<li class="">Milvus</li>
<li class="">Weaviate</li>
<li class="">Elasticsearch (vector search)</li>
</ul>
<p>For example, the search returns:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Product name: Apple gift box</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">After-sales rules:</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">15-day no-reason returns</span><br></div></code></pre></div></div>
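<p>Conceptually, this step ranks stored vectors by similarity to the query and keeps the top few. The sketch below does that with a brute-force scan over an in-memory list. It is not how Qdrant or the other databases are implemented (they typically use approximate nearest-neighbour indexes), and the <code>Doc</code> type and sample vectors are illustrative assumptions.</p>

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Brute-force top-k search over in-memory vectors; a conceptual stand-in for a vector database. */
final class BruteForceVectorSearch {

    /** A stored text together with its (precomputed) embedding. */
    static final class Doc {
        final String text;
        final double[] vector;

        Doc(String text, double[] vector) {
            this.text = text;
            this.vector = vector;
        }
    }

    /** Cosine similarity; assumes both vectors have the same length. */
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return (normA == 0 || normB == 0) ? 0.0 : dot / Math.sqrt(normA * normB);
    }

    /** Returns the texts of the k documents most similar to the query, best match first. */
    static List<String> topK(List<Doc> docs, double[] query, int k) {
        List<Doc> ranked = new ArrayList<>(docs);
        ranked.sort(Comparator.comparingDouble((Doc d) -> cosine(d.vector, query)).reversed());
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < Math.min(k, ranked.size()); i++) {
            hits.add(ranked.get(i).text);
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Doc> docs = List.of(
                new Doc("Apple gift box: 15-day no-reason returns", new double[]{0.9, 0.1}),
                new Doc("Shipping takes 2 to 3 days", new double[]{0.1, 0.9}));
        System.out.println(topK(docs, new double[]{1.0, 0.0}, 1));
    }
}
```

<p>A full scan is linear in the number of stored vectors, which is fine for a demo and is the reason production systems rely on a dedicated index.</p>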
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="第四步上下文增强">Step 4: Context augmentation<a href="https://jameszbl.github.io/rag-is-necessary#%E7%AC%AC%E5%9B%9B%E6%AD%A5%E4%B8%8A%E4%B8%8B%E6%96%87%E5%A2%9E%E5%BC%BA" class="hash-link" aria-label="Direct link to Step 4: Context augmentation" title="Direct link to Step 4: Context augmentation" translate="no">​</a></h3>
<p>Append the retrieved results to the prompt.</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">You are a customer service expert.</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Reference material:</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Product name: Apple gift box</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">After-sales rules:</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">15-day no-reason returns</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">User question:</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Can the apple gift box be returned?</span><br></div></code></pre></div></div>
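<p>Assembling such a prompt is plain string building. The helper below is a minimal sketch; the class name, the numbered layout, and the extra “answer only from the reference material” instruction are illustrative choices rather than part of the original template.</p>

```java
import java.util.List;

/** Builds an augmented prompt from a role line, retrieved passages, and the user question. */
final class RagPromptBuilder {

    static String build(String role, List<String> passages, String question) {
        StringBuilder prompt = new StringBuilder();
        prompt.append(role).append("\n\nReference material:\n");
        for (int i = 0; i < passages.size(); i++) {
            // Number the passages so that an answer can point back to its source.
            prompt.append(i + 1).append(". ").append(passages.get(i)).append('\n');
        }
        prompt.append("\nAnswer using only the reference material. ")
              .append("If it does not cover the question, say so.\n");
        prompt.append("\nUser question:\n").append(question);
        return prompt.toString();
    }

    public static void main(String[] args) {
        String prompt = build(
                "You are a customer service expert.",
                List.of("Product name: Apple gift box. After-sales rules: 15-day no-reason returns"),
                "Can the apple gift box be returned?");
        System.out.println(prompt);
    }
}
```

<p>Telling the model what to do when the material does not cover the question is one common way to keep it from falling back on guesses.</p>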
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="第五步llm生成">Step 5: LLM generation<a href="https://jameszbl.github.io/rag-is-necessary#%E7%AC%AC%E4%BA%94%E6%AD%A5llm%E7%94%9F%E6%88%90" class="hash-link" aria-label="Direct link to Step 5: LLM generation" title="Direct link to Step 5: LLM generation" translate="no">​</a></h3>
<p>At this point the model no longer has to rely solely on what it memorized during training.</p>
<p>Instead, it answers from the content retrieved at request time:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">The apple gift box supports 15-day no-reason returns.</span><br></div></code></pre></div></div>
<p>This is the core idea of RAG:</p>
<blockquote>
<p>Don't make the model memorize knowledge; let it look the knowledge up.</p>
</blockquote>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="4-企业知识库场景">4. Enterprise knowledge base scenarios<a href="https://jameszbl.github.io/rag-is-necessary#4-%E4%BC%81%E4%B8%9A%E7%9F%A5%E8%AF%86%E5%BA%93%E5%9C%BA%E6%99%AF" class="hash-link" aria-label="Direct link to 4. Enterprise knowledge base scenarios" title="Direct link to 4. Enterprise knowledge base scenarios" translate="no">​</a></h2>
<p>RAG has become close to a standard component of enterprise AI applications.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="商品知识库">Product knowledge base<a href="https://jameszbl.github.io/rag-is-necessary#%E5%95%86%E5%93%81%E7%9F%A5%E8%AF%86%E5%BA%93" class="hash-link" aria-label="Direct link to Product knowledge base" title="Direct link to Product knowledge base" translate="no">​</a></h3>
<p>This is very common in e-commerce.</p>
<p>Knowledge sources:</p>
<ul>
<li class="">Product details</li>
<li class="">Product attributes</li>
<li class="">After-sales rules</li>
<li class="">Product reviews</li>
</ul>
<p>A user asks:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Why isn't the durian sweet?</span><br></div></code></pre></div></div>
<p>The system can retrieve:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Durian ripeness guide</span><br></div></code></pre></div></div>
<p>The model then generates a professional reply.</p>
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="客服知识库">Customer service knowledge base<a href="https://jameszbl.github.io/rag-is-necessary#%E5%AE%A2%E6%9C%8D%E7%9F%A5%E8%AF%86%E5%BA%93" class="hash-link" aria-label="Direct link to Customer service knowledge base" title="Direct link to Customer service knowledge base" translate="no">​</a></h3>
<p>Customer service bots are currently among the most mature RAG applications.</p>
<p>Knowledge sources:</p>
<ul>
<li class="">FAQ</li>
<li class="">Customer service handbook</li>
<li class="">Business rules</li>
<li class="">Refund process</li>
</ul>
<p>A user asks:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Can an order still be cancelled after it has shipped?</span><br></div></code></pre></div></div>
<p>The system retrieves the relevant rules in real time and answers.</p>
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="运维知识库">Operations knowledge base<a href="https://jameszbl.github.io/rag-is-necessary#%E8%BF%90%E7%BB%B4%E7%9F%A5%E8%AF%86%E5%BA%93" class="hash-link" aria-label="Direct link to Operations knowledge base" title="Direct link to Operations knowledge base" translate="no">​</a></h3>
<p>Many companies are building AIOps systems.</p>
<p>Knowledge sources:</p>
<ul>
<li class="">Incident case studies</li>
<li class="">Operations runbooks</li>
<li class="">Kubernetes documentation</li>
<li class="">System architecture documentation</li>
</ul>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Why is the Pod stuck in CrashLoopBackOff?</span><br></div></code></pre></div></div>
<p>The system retrieves past incidents and operations documents, then generates troubleshooting suggestions.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="5-rag的优缺点">5. Pros and cons of RAG<a href="https://jameszbl.github.io/rag-is-necessary#5-rag%E7%9A%84%E4%BC%98%E7%BC%BA%E7%82%B9" class="hash-link" aria-label="Direct link to 5. Pros and cons of RAG" title="Direct link to 5. Pros and cons of RAG" translate="no">​</a></h2>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="优点">Pros<a href="https://jameszbl.github.io/rag-is-necessary#%E4%BC%98%E7%82%B9" class="hash-link" aria-label="Direct link to Pros" title="Direct link to Pros" translate="no">​</a></h3>
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="知识实时更新">Real-time knowledge updates<a href="https://jameszbl.github.io/rag-is-necessary#%E7%9F%A5%E8%AF%86%E5%AE%9E%E6%97%B6%E6%9B%B4%E6%96%B0" class="hash-link" aria-label="Direct link to Real-time knowledge updates" title="Direct link to Real-time knowledge updates" translate="no">​</a></h4>
<p>New documents take effect as soon as they have been indexed.</p>
<p>No model retraining is needed.</p>
<hr>
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="降低幻觉">Fewer hallucinations<a href="https://jameszbl.github.io/rag-is-necessary#%E9%99%8D%E4%BD%8E%E5%B9%BB%E8%A7%89" class="hash-link" aria-label="Direct link to Fewer hallucinations" title="Direct link to Fewer hallucinations" translate="no">​</a></h4>
<p>Answers are grounded in the actual knowledge base.</p>
<p>Accuracy usually improves noticeably, provided retrieval returns the right passages.</p>
<hr>
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="成本低">Low cost<a href="https://jameszbl.github.io/rag-is-necessary#%E6%88%90%E6%9C%AC%E4%BD%8E" class="hash-link" aria-label="Direct link to Low cost" title="Direct link to Low cost" translate="no">​</a></h4>
<p>Compared with fine-tuning, RAG is:</p>
<ul>
<li class="">Simpler to implement</li>
<li class="">Easier to maintain</li>
<li class="">Faster to update</li>
</ul>
<p>This makes it a better fit for enterprise scenarios.</p>
<hr>
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="模型无关">Model-agnostic<a href="https://jameszbl.github.io/rag-is-necessary#%E6%A8%A1%E5%9E%8B%E6%97%A0%E5%85%B3" class="hash-link" aria-label="Direct link to Model-agnostic" title="Direct link to Model-agnostic" translate="no">​</a></h4>
<p>Any LLM can be plugged in:</p>
<ul>
<li class="">GPT</li>
<li class="">DeepSeek</li>
<li class="">Qwen</li>
<li class="">Claude</li>
</ul>
<p>The knowledge base itself does not need to change, as long as the embedding model stays the same.</p>
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="缺点">Cons<a href="https://jameszbl.github.io/rag-is-necessary#%E7%BC%BA%E7%82%B9" class="hash-link" aria-label="Direct link to Cons" title="Direct link to Cons" translate="no">​</a></h3>
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="检索质量决定上限">Retrieval quality sets the ceiling<a href="https://jameszbl.github.io/rag-is-necessary#%E6%A3%80%E7%B4%A2%E8%B4%A8%E9%87%8F%E5%86%B3%E5%AE%9A%E4%B8%8A%E9%99%90" class="hash-link" aria-label="Direct link to Retrieval quality sets the ceiling" title="Direct link to Retrieval quality sets the ceiling" translate="no">​</a></h4>
<p>If retrieval fails to find the right content:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Garbage In</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Garbage Out</span><br></div></code></pre></div></div>
<p>The model will still give a wrong answer.</p>
<hr>
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="chunk切分困难">Chunking is hard<a href="https://jameszbl.github.io/rag-is-necessary#chunk%E5%88%87%E5%88%86%E5%9B%B0%E9%9A%BE" class="hash-link" aria-label="Direct link to Chunking is hard" title="Direct link to Chunking is hard" translate="no">​</a></h4>
<p>How should documents be split?</p>
<ul>
<li class="">Chunks that are too large make recall imprecise</li>
<li class="">Chunks that are too small lose context</li>
</ul>
<p>Getting this right takes continuous tuning.</p>
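<p>One common baseline is fixed-size chunking with overlap, where the overlap keeps some context across chunk boundaries. The sketch below counts plain Java characters and ignores sentence and token boundaries; the size and overlap values are the tuning knobs, and the ones used here are arbitrary.</p>

```java
import java.util.ArrayList;
import java.util.List;

/** Fixed-size character chunking with overlap; a baseline, not a recommendation. */
final class FixedSizeChunker {

    /**
     * Splits text into chunks of at most {@code size} characters, where each chunk
     * repeats the last {@code overlap} characters of the previous one.
     */
    static List<String> chunk(String text, int size, int overlap) {
        if (size <= 0 || overlap < 0 || overlap >= size) {
            throw new IllegalArgumentException("require size > 0 and 0 <= overlap < size");
        }
        List<String> chunks = new ArrayList<>();
        int step = size - overlap;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + size, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last chunk reached the end of the text
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        System.out.println(chunk("abcdefghij", 4, 1));
    }
}
```

<p>A larger <code>size</code> keeps more context per chunk but makes each match less specific; a larger <code>overlap</code> reduces cut-off sentences at the price of more stored vectors.</p>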
<hr>
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="检索延迟增加">Added retrieval latency<a href="https://jameszbl.github.io/rag-is-necessary#%E6%A3%80%E7%B4%A2%E5%BB%B6%E8%BF%9F%E5%A2%9E%E5%8A%A0" class="hash-link" aria-label="Direct link to Added retrieval latency" title="Direct link to Added retrieval latency" translate="no">​</a></h4>
<p>Compared with calling the LLM directly, a RAG request runs:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Retrieval</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">+</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Re-ranking</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">+</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Generation</span><br></div></code></pre></div></div>
<p>The extra stages add to the response time.</p>
<hr>
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="无法替代模型能力">No substitute for model capability<a href="https://jameszbl.github.io/rag-is-necessary#%E6%97%A0%E6%B3%95%E6%9B%BF%E4%BB%A3%E6%A8%A1%E5%9E%8B%E8%83%BD%E5%8A%9B" class="hash-link" aria-label="Direct link to No substitute for model capability" title="Direct link to No substitute for model capability" translate="no">​</a></h4>
<p>RAG solves:</p>
<blockquote>
<p>a knowledge problem</p>
</blockquote>
<p>not:</p>
<blockquote>
<p>a reasoning problem</p>
</blockquote>
<p>If a task requires complex logical reasoning, RAG alone will not solve it.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="总结">Summary<a href="https://jameszbl.github.io/rag-is-necessary#%E6%80%BB%E7%BB%93" class="hash-link" aria-label="Direct link to Summary" title="Direct link to Summary" translate="no">​</a></h2>
<p>The biggest weakness of large models is:</p>
<blockquote>
<p>They do not know the company's own knowledge.</p>
</blockquote>
<p>Prompts can inject a small amount of knowledge ad hoc, but they cannot handle large-scale knowledge management or real-time updates.</p>
<p>That is why the industry gradually converged on the RAG architecture:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">User question</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Embedding</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Vector search</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Context augmentation</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">LLM generation</span><br></div></code></pre></div></div>
<p>For the vast majority of enterprise AI projects, including:</p>
<ul>
<li class="">Intelligent customer service</li>
<li class="">Product question answering</li>
<li class="">Enterprise search</li>
<li class="">AI assistants</li>
<li class="">Operations copilots</li>
</ul>
<p>the first step is usually not an Agent, but getting RAG right.</p>
<p>The core problem in putting enterprise AI into production has never been that the model is not smart enough; it is that the model does not know the company's own knowledge.</p>]]></content:encoded>
            <category>ai</category>
            <category>java</category>
            <category>rag</category>
        </item>
        <item>
            <title><![CDATA[Best Practices for Integrating Large Models in Java]]></title>
            <link>https://jameszbl.github.io/best-practise-for-javaer-on-ai</link>
            <guid>https://jameszbl.github.io/best-practise-for-javaer-on-ai</guid>
            <pubDate>Wed, 22 Oct 2025 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[
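<p>As large-model technology moves into enterprise production environments, more and more Java engineers are taking part in AI application development. From early chatbots to RAG knowledge bases, Agents, and workflow orchestration, the Java stack can now cover the vast majority of AI application scenarios.</p>
<p>Drawing on hands-on project experience, this article introduces the mainstream options for integrating large models in Java, along with enterprise-grade best practices.</p>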
<hr>
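<h1>1. The OpenAI protocol has become the de facto standard</h1>
<p>In the early days of AI model integration, every vendor had its own interface specification:</p>
<ul>
<li class="">OpenAI API</li>
<li class="">Claude API</li>
<li class="">Gemini API</li>
<li class="">ERNIE Bot (文心一言) API</li>
<li class="">Tongyi Qianwen (通义千问) API</li>
</ul>
<p>Developers had to write separate adapter code for each vendor, which was very costly to maintain.</p>
<p>As OpenAI gained adoption, the OpenAI Chat Completions API gradually became the industry's de facto standard.</p>
<p>Today many model providers and gateways offer OpenAI-compatible endpoints:</p>
<ul>
<li class="">DeepSeek</li>
<li class="">Alibaba Cloud Bailian (阿里百炼)</li>
<li class="">Zhipu AI (智谱AI)</li>
<li class="">Moonshot AI (Kimi)</li>
<li class="">SiliconFlow</li>
<li class="">OpenRouter</li>
<li class="">Azure OpenAI</li>
</ul>
<p>A typical request:</p>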
<div class="language-json codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-json codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">POST /v1/chat/completions</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token property" style="color:#36acaa">"model"</span><span class="token operator" style="color:#393A34">:</span><span class="token string" style="color:#e3116c">"deepseek-chat"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token property" style="color:#36acaa">"messages"</span><span class="token operator" style="color:#393A34">:</span><span class="token punctuation" style="color:#393A34">[</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">      </span><span class="token property" style="color:#36acaa">"role"</span><span class="token operator" style="color:#393A34">:</span><span class="token string" style="color:#e3116c">"user"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token 
plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">      </span><span class="token property" style="color:#36acaa">"content"</span><span class="token operator" style="color:#393A34">:</span><span class="token string" style="color:#e3116c">"你好"</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><br></div></code></pre></div></div>
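<p>For reference, the same request can be built with the JDK's own <code>java.net.http</code> API, without any AI framework. This is a minimal sketch: the class and method names are made up, the base URL and API key are placeholders supplied by the caller, the JSON is assembled by hand only to keep the example dependency-free, and the request is built but not sent.</p>

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Builds (but does not send) an OpenAI-style chat completion request using only the JDK. */
final class ChatCompletionRequestSketch {

    /** Minimal JSON string escaping; a real project would use a JSON library instead. */
    static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"': sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Request body with a single user message, matching the JSON shown above. */
    static String body(String model, String userMessage) {
        return "{\"model\":\"" + jsonEscape(model) + "\","
                + "\"messages\":[{\"role\":\"user\",\"content\":\"" + jsonEscape(userMessage) + "\"}]}";
    }

    /** Switching providers only changes baseUrl, apiKey, and model. */
    static HttpRequest build(String baseUrl, String apiKey, String model, String userMessage) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/v1/chat/completions"))
                .header("Content-Type", "application/json")
                .header("Authorization", "Bearer " + apiKey)
                .POST(HttpRequest.BodyPublishers.ofString(body(model, userMessage)))
                .build();
    }

    public static void main(String[] args) {
        // Placeholder host and key; nothing is sent over the network here.
        HttpRequest request = build("https://api.example.com", "sk-placeholder", "deepseek-chat", "Hello");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(body("deepseek-chat", "Hello"));
    }
}
```

<p>Sending the request would be a call to <code>HttpClient.send</code>; it is left out so that the sketch runs without network access or credentials.</p>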
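<p>So for Java engineers:</p>
<p><strong>As long as your code speaks the OpenAI protocol, you can in principle switch the underlying model quickly.</strong></p>
<p>This is also why enterprise projects tend to prefer AI frameworks that support the OpenAI protocol.</p>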
<hr>
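<h1>2. LangChain4j architecture</h1>
<p>In the Java ecosystem, LangChain4j is currently one of the most popular large-model frameworks.</p>
<p>Its role is similar to what Spring Data does for databases.</p>
<p>Developers do not have to deal with the underlying HTTP calls and can focus on business logic.</p>
<p>The core architecture is:</p>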
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Application</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">AiService</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">ChatModel</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">LLM Provider</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">(OpenAI/DeepSeek/Qwen...)</span><br></div></code></pre></div></div>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="chatmodel">ChatModel<a href="https://jameszbl.github.io/best-practise-for-javaer-on-ai#chatmodel" class="hash-link" aria-label="Direct link to ChatModel" title="Direct link to ChatModel" translate="no">​</a></h2>
<p>ChatModel is the most basic model interface.</p>
<p>Calls are synchronous: <code>chat()</code> blocks until the full response has been generated.</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">// DeepSeek exposes an OpenAI-compatible API, so the OpenAI client is pointed at its baseUrl</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">ChatModel model = OpenAiChatModel.builder()</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        .apiKey(apiKey)</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        .baseUrl(baseUrl)</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        .modelName("deepseek-chat")</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        .build();</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">// Blocks until the complete answer has been generated</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">String answer = model.chat("Give a brief introduction to Spring Boot");</span><br></div></code></pre></div></div>
<p>Characteristics:</p>
<ul>
<li class="">Simple</li>
<li class="">Blocking call</li>
<li class="">Suitable for testing</li>
</ul>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="streamingchatmodel">StreamingChatModel<a href="https://jameszbl.github.io/best-practise-for-javaer-on-ai#streamingchatmodel" class="hash-link" aria-label="Direct link to StreamingChatModel" title="Direct link to StreamingChatModel" translate="no">​</a></h2>
<p>Used for streaming output.</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">StreamingChatModel model =</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        OpenAiStreamingChatModel.builder()</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .apiKey(apiKey)</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .baseUrl(baseUrl)</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .modelName("deepseek-chat")</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .build();</span><br></div></code></pre></div></div>
<p>It enables:</p>
<ul>
<li class="">A typewriter effect</li>
<li class="">Long-text output</li>
<li class="">Real-time push</li>
</ul>
<p>This is the most commonly used mode in production.</p>
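<p>To make the streaming idea concrete, here is a minimal plain-Java sketch of push-style token delivery. It does not use LangChain4j: <code>TokenHandler</code> and the word-by-word <code>stream</code> method are hypothetical stand-ins invented for this example, and the "model" simply replays a fixed string.</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv">class StreamingSketch {

    // Hypothetical callback interface; it mirrors the idea of a streaming handler,
    // not LangChain4j's actual type.
    interface TokenHandler {
        void onToken(String token);
        void onComplete(String fullText);
    }

    // Fake streaming "model": pushes a fixed reply word by word instead of calling a provider.
    static void stream(String reply, TokenHandler handler) {
        StringBuilder full = new StringBuilder();
        for (String word : reply.split(" ")) {
            String token = full.length() == 0 ? word : " " + word;
            full.append(token);
            handler.onToken(token); // delivered as soon as it is "generated"
        }
        handler.onComplete(full.toString());
    }

    public static void main(String[] args) {
        stream("Spring Boot simplifies setup", new TokenHandler() {
            @Override
            public void onToken(String token) {
                System.out.print(token); // typewriter effect: print without waiting for the rest
            }

            @Override
            public void onComplete(String fullText) {
                System.out.println();
            }
        });
    }
}</code></pre></div></div>
<p>The caller never waits for the whole reply; it reacts to each chunk as it is pushed, which is what makes the typewriter effect possible.</p>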
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="aiservice">AiService<a href="https://jameszbl.github.io/best-practise-for-javaer-on-ai#aiservice" class="hash-link" aria-label="Direct link to AiService" title="Direct link to AiService" translate="no">​</a></h2>
<p>AiService is the most powerful capability in LangChain4j.</p>
<p>Developers only need to define an interface:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">interface Assistant {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    String chat(String message);</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">}</span><br></div></code></pre></div></div>
<p>Create the proxy:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Assistant assistant =</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        AiServices.create(</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            Assistant.class,</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            model</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        );</span><br></div></code></pre></div></div>
<p>Call it:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">assistant.chat("Hello");</span><br></div></code></pre></div></div>
<p>This is similar to the declarative annotations commonly used in Spring projects:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">@Mapper</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">@Repository</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">@FeignClient</span><br></div></code></pre></div></div>
<p>Under the hood, a dynamic proxy assembles the prompt automatically.</p>
<p>In enterprise projects, most business agents are built on AiService.</p>
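<p>The dynamic-proxy idea can be illustrated with the JDK's own <code>java.lang.reflect.Proxy</code>. This is only a sketch of the mechanism, not LangChain4j's actual implementation: <code>FakeModel</code> and the prompt format are assumptions made up for the example.</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv">import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

class ProxySketch {

    // Same shape as the Assistant interface above.
    interface Assistant {
        String chat(String message);
    }

    // Hypothetical stand-in for a model client; a real one would call the provider over HTTP.
    interface FakeModel {
        String complete(String prompt);
    }

    // Builds an Assistant whose every method call is routed through the handler.
    static Assistant create(final FakeModel model) {
        InvocationHandler handler = new InvocationHandler() {
            @Override
            public Object invoke(Object proxy, Method method, Object[] args) {
                // Assemble a prompt from the method name and its first argument (made-up format).
                String prompt = "[" + method.getName() + "] user: " + args[0];
                return model.complete(prompt);
            }
        };
        return (Assistant) Proxy.newProxyInstance(
                Assistant.class.getClassLoader(),
                new Class[] { Assistant.class },
                handler);
    }

    public static void main(String[] args) {
        FakeModel echo = new FakeModel() {
            @Override
            public String complete(String prompt) {
                return "echo: " + prompt;
            }
        };
        Assistant assistant = create(echo);
        System.out.println(assistant.chat("Hello")); // prints "echo: [chat] user: Hello"
    }
}</code></pre></div></div>
<p>The interface has no hand-written implementation; the handler decides at call time what each method does, which is why defining the interface alone is enough.</p>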
<hr>
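<h1>3. Integrating LangChain4j with Spring Boot</h1>
<p>Maven dependency (this is the OpenAI-compatible model module; the <code>AiServices</code> API used above lives in the separate <code>langchain4j</code> artifact, which needs to be added as well):</p>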
<div class="language-xml codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-xml codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token tag punctuation" style="color:#393A34">&lt;</span><span class="token tag" style="color:#00009f">dependency</span><span class="token tag punctuation" style="color:#393A34">&gt;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token tag punctuation" style="color:#393A34">&lt;</span><span class="token tag" style="color:#00009f">groupId</span><span class="token tag punctuation" style="color:#393A34">&gt;</span><span class="token plain">dev.langchain4j</span><span class="token tag punctuation" style="color:#393A34">&lt;/</span><span class="token tag" style="color:#00009f">groupId</span><span class="token tag punctuation" style="color:#393A34">&gt;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token tag punctuation" style="color:#393A34">&lt;</span><span class="token tag" style="color:#00009f">artifactId</span><span class="token tag punctuation" style="color:#393A34">&gt;</span><span class="token plain">langchain4j-open-ai</span><span class="token tag punctuation" style="color:#393A34">&lt;/</span><span class="token tag" style="color:#00009f">artifactId</span><span class="token tag punctuation" style="color:#393A34">&gt;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token tag punctuation" style="color:#393A34">&lt;/</span><span class="token tag" style="color:#00009f">dependency</span><span class="token 
tag punctuation" style="color:#393A34">&gt;</span><br></div></code></pre></div></div>
<p>Configuration file:</p>
<div class="language-yaml codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-yaml codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token key atrule" style="color:#00a4db">ai</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token key atrule" style="color:#00a4db">base-url</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> https</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain">//api.deepseek.com</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token key atrule" style="color:#00a4db">api-key</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> xxx</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token key atrule" style="color:#00a4db">model</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> deepseek</span><span class="token punctuation" style="color:#393A34">-</span><span class="token plain">chat</span><br></div></code></pre></div></div>
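<p>Bean configuration:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">@Configuration</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">public class AiConfig {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    @Bean</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    public ChatModel chatModel(@Value("${ai.api-key}") String apiKey,</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                               @Value("${ai.base-url}") String baseUrl,</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                               @Value("${ai.model}") String model) {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        return OpenAiChatModel.builder()</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .apiKey(apiKey)</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .baseUrl(baseUrl)</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .modelName(model)</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .build();</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    }</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    // Exposes the AiService proxy so the controller can inject Assistant</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    @Bean</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    public Assistant assistant(ChatModel chatModel) {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        return AiServices.create(Assistant.class, chatModel);</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    }</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">}</span><br></div></code></pre></div></div>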
<p>Calling it from business code:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">@RestController</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">public class ChatController {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    @Autowired</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    private Assistant assistant;</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    @GetMapping("/chat")</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    public String chat(String msg) {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        return assistant.chat(msg);</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    }</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">}</span><br></div></code></pre></div></div>
<p>At this point, the most basic LLM integration is complete.</p>
<hr>
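<h1>4. Implementing Streaming Output</h1>
<p>In real projects, chat responses are almost never returned synchronously.</p>
<p>The reasons:</p>
<ul>
<li class="">Long response time</li>
<li class="">Poor user experience</li>
<li class="">Generated content cannot be displayed in real time</li>
</ul>
<p>The usual choice is SSE + WebFlux.</p>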
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="sse">SSE<a href="https://jameszbl.github.io/best-practise-for-javaer-on-ai#sse" class="hash-link" aria-label="Direct link to SSE" title="Direct link to SSE" translate="no">​</a></h2>
<p>Server side:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">@GetMapping(value = "/chat",</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        produces = MediaType.TEXT_EVENT_STREAM_VALUE)</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">public Flux&lt;String&gt; chat(String message) {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    return aiService.chat(message);</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">}</span><br></div></code></pre></div></div>
<p>Front end:</p>
<div class="language-javascript codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-javascript codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">const</span><span class="token plain"> eventSource </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">new</span><span class="token plain"> </span><span class="token class-name">EventSource</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"/chat?message=Hello"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">eventSource</span><span class="token punctuation" style="color:#393A34">.</span><span class="token method-variable function-variable method function property-access" style="color:#d73a49">onmessage</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token parameter">event</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token arrow operator" style="color:#393A34">=&gt;</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token console class-name">console</span><span class="token punctuation" style="color:#393A34">.</span><span class="token method function property-access" style="color:#d73a49">log</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">event</span><span class="token punctuation" style="color:#393A34">.</span><span class="token property-access">data</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">;</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">// EventSource reconnects automatically when the stream ends; close it to avoid re-sending the request</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">eventSource.onerror = () =&gt; eventSource.close();</span><br></div></code></pre></div></div>
<p>The browser keeps receiving tokens as they are generated.</p>
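<p>What travels over the connection is plain text in the <code>text/event-stream</code> format: each event is one or more <code>data:</code> lines followed by a blank line. Spring writes this framing for you when a <code>Flux</code> is returned with that media type; the helper below is only a hand-rolled sketch to show what the browser parses, and <code>SseFrame</code> is a name invented for the example.</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv">class SseFrame {

    // Formats one payload as a Server-Sent Events frame.
    // Each payload line becomes its own "data:" field; a blank line ends the event.
    static String frame(String payload) {
        StringBuilder sb = new StringBuilder();
        for (String line : payload.split("\n", -1)) {
            sb.append("data: ").append(line).append('\n');
        }
        return sb.append('\n').toString();
    }

    public static void main(String[] args) {
        String[] tokens = { "Spring", " Boot", " is", "..." };
        for (String token : tokens) {
            // One frame per token; the browser fires onmessage once per frame.
            System.out.print(frame(token));
        }
    }
}</code></pre></div></div>
<p>The browser strips a single space after <code>data:</code>, so a token that itself starts with a space (such as " Boot") survives the round trip.</p>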
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="webflux">WebFlux<a href="https://jameszbl.github.io/best-practise-for-javaer-on-ai#webflux" class="hash-link" aria-label="Direct link to WebFlux" title="Direct link to WebFlux" translate="no">​</a></h2>
<p>Recommended for enterprise projects:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">// Assumes streamingAssistant is an AiService backed by a StreamingChatModel</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">// whose chat(String) method is declared to return TokenStream</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">public Flux&lt;String&gt; chat(String message) {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    return Flux.create(emitter -&gt; {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        streamingAssistant.chat(message)</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .onPartialResponse(emitter::next)</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .onCompleteResponse(response -&gt; emitter.complete())</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .onError(emitter::error)</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .start();</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    });</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">}</span><br></div></code></pre></div></div>
<p>Advantages:</p>
<ul>
<li class="">Non-blocking</li>
<li class="">High concurrency</li>
<li class="">Fewer threads tied up</li>
</ul>
<p>This makes it a particularly good fit for AI chat scenarios.</p>
<hr>
<h1>5. Implementing Tool Calling</h1>
<p>Tool Calling essentially means:</p>
<p><strong>letting the LLM call Java methods.</strong></p>
<p>For example:</p>
<p>User:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">What's the weather like in Beijing?</span><br></div></code></pre></div></div>
<p>The model returns (simplified):</p>
<div class="language-json codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-json codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token property" style="color:#36acaa">"tool"</span><span class="token operator" style="color:#393A34">:</span><span class="token string" style="color:#e3116c">"getWeather"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token property" style="color:#36acaa">"city"</span><span class="token operator" style="color:#393A34">:</span><span class="token string" style="color:#e3116c">"Beijing"</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><br></div></code></pre></div></div>
<p>Java executes:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">public class WeatherTool {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    @Tool</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    public String getWeather(String city) {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        return weatherService.query(city);</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    }</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">}</span><br></div></code></pre></div></div>
<p>Registration:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Assistant assistant =</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        AiServices.builder(Assistant.class)</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .chatModel(model)</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .tools(new WeatherTool())</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">                .build();</span><br></div></code></pre></div></div>
<p>Execution flow:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">User</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">LLM</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Tool Calling</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Java Method</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" 
style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Tool Result</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">LLM</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Final Answer</span><br></div></code></pre></div></div>
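<p>Common tools in enterprise projects:</p>
<ul>
<li class="">Product lookup</li>
<li class="">User information lookup</li>
<li class="">Order lookup</li>
<li class="">Inventory lookup</li>
<li class="">CRM system</li>
<li class="">OA system</li>
<li class="">ERP system</li>
</ul>
<p>In essence, all of these expose existing business capabilities to the LLM.</p>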
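<p>The execution flow can be sketched in plain Java without any LLM library. Everything here is a simplified assumption for illustration: the "model" is faked by <code>askModel</code> and <code>finalAnswer</code>, <code>ToolRequest</code> is a made-up type, and the weather result is canned. The one point it demonstrates is that the application, not the model, looks up and runs the Java method.</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv">import java.lang.reflect.Method;

class ToolLoopSketch {

    // A tool class in the spirit of WeatherTool above, with a canned result instead of a real service.
    public static class WeatherTool {
        public String getWeather(String city) {
            return "Sunny in " + city;
        }
    }

    // Hypothetical, simplified tool request: which method to call and with what argument.
    static class ToolRequest {
        final String tool;
        final String argument;

        ToolRequest(String tool, String argument) {
            this.tool = tool;
            this.argument = argument;
        }
    }

    // Step 1 (faked): the "model" reads the question and asks for a tool call.
    static ToolRequest askModel(String question) {
        return new ToolRequest("getWeather", "Beijing");
    }

    // Step 2: the application finds the method by name via reflection and invokes it.
    static String execute(Object toolObject, ToolRequest request) {
        try {
            Method method = toolObject.getClass().getMethod(request.tool, String.class);
            return (String) method.invoke(toolObject, request.argument);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Tool call failed: " + request.tool, e);
        }
    }

    // Step 3 (faked): the tool result goes back to the "model", which writes the final answer.
    static String finalAnswer(String question, String toolResult) {
        return "Answer based on tool result: " + toolResult;
    }

    static String run(String question) {
        ToolRequest request = askModel(question);
        String result = execute(new WeatherTool(), request);
        return finalAnswer(question, result);
    }

    public static void main(String[] args) {
        System.out.println(run("What's the weather like in Beijing?"));
    }
}</code></pre></div></div>
<p>In a real setup the framework handles steps 1 to 3, and the tool class only needs to contain the business call.</p>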
<hr>
<h1>6. Encapsulation in Enterprise Projects</h1>
<p>Many teams, when they first integrate AI, write code like this:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">controller</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  -&gt; service</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">      -&gt; model.chat()</span><br></div></code></pre></div></div>
<p>Once the project becomes slightly more complex, this quickly gets out of control.</p>
<p>A layered architecture is recommended:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Controller</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Agent</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Workflow</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Tools</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" 
style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">LLM</span><br></div></code></pre></div></div>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="第一层agent">Layer 1: Agent<a href="https://jameszbl.github.io/best-practise-for-javaer-on-ai#%E7%AC%AC%E4%B8%80%E5%B1%82agent" class="hash-link" aria-label="Direct link to Layer 1: Agent" title="Direct link to Layer 1: Agent" translate="no">​</a></h2>
<p>Responsible for the business entry point.</p>
<p>For example:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">CustomerServiceAgent</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">ReviewReplyAgent</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">KnowledgeAgent</span><br></div></code></pre></div></div>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="第二层workflow">Layer 2: Workflow<a href="https://jameszbl.github.io/best-practise-for-javaer-on-ai#%E7%AC%AC%E4%BA%8C%E5%B1%82workflow" class="hash-link" aria-label="Direct link to Layer 2: Workflow" title="Direct link to Layer 2: Workflow" translate="no">​</a></h2>
<p>Responsible for process orchestration.</p>
<p>For example, replying to product reviews:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Review text</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Sentiment analysis</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Intent recognition</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Knowledge base retrieval</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Prompt assembly</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Reply generation</span><br></div></code></pre></div></div>
<p>这一层可以使用：</p>
<ul>
<li class="">LangGraph4j</li>
<li class="">Spring AI Workflow</li>
<li class="">Flowable</li>
<li class="">自研工作流</li>
</ul>
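<p>To make the orchestration idea concrete, the review-reply pipeline above can be sketched in plain Java as an ordered list of steps that share one context object. This is only an illustrative sketch: <code>ReviewContext</code>, <code>ReviewReplyWorkflow</code>, and every step body are hypothetical stubs, not the API of any of the frameworks listed above.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical context object handed from step to step.
final class ReviewContext {
    final String review;
    final List<String> trace = new ArrayList<>(); // records which steps ran, in order
    String sentiment;
    String intent;
    String knowledge;
    String prompt;
    String reply;

    ReviewContext(String review) {
        this.review = review;
    }
}

// Hypothetical workflow: steps run strictly in list order.
final class ReviewReplyWorkflow {
    private final List<Consumer<ReviewContext>> steps = List.of(
            ReviewReplyWorkflow::analyzeSentiment,
            ReviewReplyWorkflow::detectIntent,
            ReviewReplyWorkflow::retrieveKnowledge,
            ReviewReplyWorkflow::assemblePrompt,
            ReviewReplyWorkflow::generateReply);

    ReviewContext run(String review) {
        ReviewContext ctx = new ReviewContext(review);
        for (Consumer<ReviewContext> step : steps) {
            step.accept(ctx);
        }
        return ctx;
    }

    // Stub: a real system would call a classifier or a model here.
    static void analyzeSentiment(ReviewContext ctx) {
        ctx.sentiment = ctx.review.contains("not") ? "negative" : "positive";
        ctx.trace.add("sentiment");
    }

    static void detectIntent(ReviewContext ctx) {
        ctx.intent = "negative".equals(ctx.sentiment) ? "complaint" : "praise";
        ctx.trace.add("intent");
    }

    // Stub: the returned text is a placeholder, not a real policy.
    static void retrieveKnowledge(ReviewContext ctx) {
        ctx.knowledge = "complaint".equals(ctx.intent) ? "[after-sales policy text]" : "";
        ctx.trace.add("retrieval");
    }

    static void assemblePrompt(ReviewContext ctx) {
        ctx.prompt = "Review: " + ctx.review
                + "\nSentiment: " + ctx.sentiment
                + "\nIntent: " + ctx.intent
                + "\nKnowledge: " + ctx.knowledge;
        ctx.trace.add("prompt");
    }

    // Stub: a real system would send ctx.prompt to a model.
    static void generateReply(ReviewContext ctx) {
        ctx.reply = "[model reply for a " + ctx.intent + " review]";
        ctx.trace.add("generate");
    }
}
```

<p>Calling <code>new ReviewReplyWorkflow().run("The fruit is not sweet")</code> walks the five steps in order. A workflow engine adds what this sketch leaves out, such as branching, persistence, and retries.</p>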
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="第三层tool层">Layer 3: Tools<a href="https://jameszbl.github.io/best-practise-for-javaer-on-ai#%E7%AC%AC%E4%B8%89%E5%B1%82tool%E5%B1%82" class="hash-link" aria-label="Direct link to Layer 3: Tools" title="Direct link to Layer 3: Tools" translate="no">​</a></h2>
<p>This layer wraps external capabilities behind a uniform interface:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">UserTool</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">OrderTool</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">ProductTool</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">SearchTool</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">KnowledgeTool</span><br></div></code></pre></div></div>
<p>This keeps business code from accessing the database directly.</p>
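<p>As a rough illustration of what such a wrapper might look like, the sketch below puts an order lookup behind an interface so that calling code never touches storage. Only the name <code>OrderTool</code> comes from the list above. Its method, the in-memory implementation, and <code>ShippingAnswerService</code> are assumptions made for this example.</p>

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical tool contract: callers depend on this interface, not on SQL or a DAO.
interface OrderTool {
    Optional<String> findOrderStatus(String orderId);
}

// In-memory stand-in for a real implementation that would wrap an order service.
final class InMemoryOrderTool implements OrderTool {
    private final Map<String, String> statusByOrderId;

    InMemoryOrderTool(Map<String, String> statusByOrderId) {
        this.statusByOrderId = Map.copyOf(statusByOrderId);
    }

    @Override
    public Optional<String> findOrderStatus(String orderId) {
        return Optional.ofNullable(statusByOrderId.get(orderId));
    }
}

// Business code sees only the tool, so the storage behind it can change freely.
final class ShippingAnswerService {
    private final OrderTool orderTool;

    ShippingAnswerService(OrderTool orderTool) {
        this.orderTool = orderTool;
    }

    String answer(String orderId) {
        return orderTool.findOrderStatus(orderId)
                .map(status -> "Order " + orderId + " is currently: " + status)
                .orElse("Order " + orderId + " was not found.");
    }
}
```

<p>Because the service depends only on the interface, the same code can run against a database, a remote service, or a test double.</p>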
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="第四层模型网关">Layer 4: Model Gateway<a href="https://jameszbl.github.io/best-practise-for-javaer-on-ai#%E7%AC%AC%E5%9B%9B%E5%B1%82%E6%A8%A1%E5%9E%8B%E7%BD%91%E5%85%B3" class="hash-link" aria-label="Direct link to Layer 4: Model Gateway" title="Direct link to Layer 4: Model Gateway" translate="no">​</a></h2>
<p>This layer manages all models in one place:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">DeepSeek</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">GPT-4o</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Qwen-Max</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Claude</span><br></div></code></pre></div></div>
<p>Implementation:</p>
<div class="language-java codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-java codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">ChatModelFactory</span><br></div></code></pre></div></div>
<p>Its capabilities include:</p>
<ul>
<li class="">Model routing</li>
<li class="">Fallback to a backup model</li>
<li class="">Retries</li>
<li class="">Rate limiting</li>
<li class="">Monitoring</li>
</ul>
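<p>Two of these capabilities, retries and fallback, can be sketched in a few lines. The example below is a minimal sketch under stated assumptions: <code>SimpleChatModel</code> and <code>ModelGateway</code> are hypothetical types invented here, not classes from LangChain4j or Spring AI, and rate limiting and monitoring are left out.</p>

```java
import java.util.List;

// Hypothetical minimal model abstraction; real SDK interfaces are richer.
@FunctionalInterface
interface SimpleChatModel {
    String chat(String prompt) throws Exception;
}

// Tries each model in priority order, retrying a failing model before falling back.
final class ModelGateway {
    private final List<SimpleChatModel> modelsInPriorityOrder;
    private final int maxAttemptsPerModel;

    ModelGateway(List<SimpleChatModel> modelsInPriorityOrder, int maxAttemptsPerModel) {
        if (modelsInPriorityOrder.isEmpty() || maxAttemptsPerModel < 1) {
            throw new IllegalArgumentException("Need at least one model and one attempt");
        }
        this.modelsInPriorityOrder = List.copyOf(modelsInPriorityOrder);
        this.maxAttemptsPerModel = maxAttemptsPerModel;
    }

    String chat(String prompt) {
        Exception lastFailure = null;
        for (SimpleChatModel model : modelsInPriorityOrder) {
            for (int attempt = 1; attempt <= maxAttemptsPerModel; attempt++) {
                try {
                    return model.chat(prompt);
                } catch (Exception e) {
                    lastFailure = e; // retry this model, then fall back to the next one
                }
            }
        }
        throw new IllegalStateException("All models failed", lastFailure);
    }
}
```

<p>A production gateway would usually add backoff between retries and distinguish retryable errors (timeouts, throttling) from permanent ones (invalid requests).</p>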
<hr>
<h1>Summary</h1>
<p>For Java engineers, the core technical path for building large model applications has become increasingly clear:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">OpenAI protocol</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">      ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">LangChain4j</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">      ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">SpringBoot</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">      ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">SSE/WebFlux</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">      ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Tool Calling</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">      ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Workflow</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">      ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Agent</span><br></div></code></pre></div></div>
<p>If you treat an enterprise AI application as a systems engineering effort, the large model is only one link in the chain. What actually determines project quality is usually workflow orchestration, the tool layer, knowledge base design, and model governance.</p>]]></content:encoded>
            <category>ai</category>
            <category>java</category>
        </item>
        <item>
            <title><![CDATA[Why Java Engineers Also Need to Understand How Large Models Work]]></title>
            <link>https://jameszbl.github.io/why-javaer-know-ai</link>
            <guid>https://jameszbl.github.io/why-javaer-know-ai</guid>
            <pubDate>Wed, 08 Oct 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[{/ truncate /}]]></description>
            <content:encoded><![CDATA[
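<p>In recent years, AI has advanced far faster than most people expected. From ChatGPT to enterprise agents, from intelligent customer service to code generation, large models are rapidly changing the software industry.</p>
<p>Many Java engineers think:</p>
<blockquote>
<p>"I only need to know how to call the API. How large models work doesn't matter."</p>
</blockquote>
<p>For a demo, calling an API really is enough. But to build an enterprise AI application that actually ships, understanding how large models work is close to mandatory.</p>
<p>Sooner or later you will run into questions like these:</p>
<ul>
<li class="">Why does the model make things up?</li>
<li class="">Why does cost balloon when the context gets long?</li>
<li class="">Why does knowledge base Q&amp;A perform poorly?</li>
<li class="">Why do agents fail so often?</li>
<li class="">Why does the same question get a different answer each time?</li>
</ul>
<p>All of these questions trace back to how large models operate.</p>
<p>As the opening article of this series, this post aims to give you an overall mental model of large models.</p>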
<hr>
<h1>What Problem Did ChatGPT Actually Solve</h1>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="nlp的发展历史">A Brief History of NLP<a href="https://jameszbl.github.io/why-javaer-know-ai#nlp%E7%9A%84%E5%8F%91%E5%B1%95%E5%8E%86%E5%8F%B2" class="hash-link" aria-label="Direct link to A Brief History of NLP" title="Direct link to A Brief History of NLP" translate="no">​</a></h2>
<p>Natural language processing (NLP) had been developing for many years before ChatGPT appeared.</p>
<p>It went through roughly three stages:</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="第一阶段规则系统">Stage 1: Rule-Based Systems<a href="https://jameszbl.github.io/why-javaer-know-ai#%E7%AC%AC%E4%B8%80%E9%98%B6%E6%AE%B5%E8%A7%84%E5%88%99%E7%B3%BB%E7%BB%9F" class="hash-link" aria-label="Direct link to Stage 1: Rule-Based Systems" title="Direct link to Stage 1: Rule-Based Systems" translate="no">​</a></h3>
<p>Early customer service bots usually relied on large sets of rules.</p>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">If the user message contains “refund”</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    return the refund process</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">If the user message contains “invoice”</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    return the invoicing process</span><br></div></code></pre></div></div>
<p>This approach is simple to implement, but its problems are obvious:</p>
<ul>
<li class="">Rules are very expensive to maintain</li>
<li class="">The system cannot understand context</li>
<li class="">A slight change in wording is enough to break recognition</li>
</ul>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">I want my money back</span><br></div></code></pre></div></div>
<p>and</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">I don't want this item anymore</span><br></div></code></pre></div></div>
<p>express essentially the same intent, yet a rule-based system may fail to recognize either one.</p>
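<p>The brittleness is easy to reproduce. The following is a minimal sketch of a keyword-rule bot, assuming English keywords and a hypothetical class name <code>KeywordRuleBot</code>. It matches the literal word "refund" but misses a paraphrase with the same intent.</p>

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical keyword-rule bot: the first matching keyword decides the reply.
final class KeywordRuleBot {
    static final String FALLBACK = "Sorry, I did not understand.";

    private final Map<String, String> replyByKeyword = new LinkedHashMap<>();

    KeywordRuleBot() {
        replyByKeyword.put("refund", "Here is the refund process.");
        replyByKeyword.put("invoice", "Here is the invoicing process.");
    }

    String reply(String message) {
        String normalized = message.toLowerCase(Locale.ROOT);
        for (Map.Entry<String, String> rule : replyByKeyword.entrySet()) {
            if (normalized.contains(rule.getKey())) {
                return rule.getValue();
            }
        }
        return FALLBACK; // no keyword matched, even if the intent was clear to a human
    }
}
```

<p>With this bot, "I want a refund" gets the refund process, while "I want my money back" falls through to the fallback reply. Covering such paraphrases means adding more and more rules, which is the maintenance cost described above.</p>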
<hr>
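<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="第二阶段机器学习">Stage 2: Machine Learning<a href="https://jameszbl.github.io/why-javaer-know-ai#%E7%AC%AC%E4%BA%8C%E9%98%B6%E6%AE%B5%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0" class="hash-link" aria-label="Direct link to Stage 2: Machine Learning" title="Direct link to Stage 2: Machine Learning" translate="no">​</a></h3>
<p>Next came machine learning models such as:</p>
<ul>
<li class="">SVM</li>
<li class="">Random Forest</li>
<li class="">XGBoost</li>
</ul>
<p>At this point, systems could learn to classify from training data.</p>
<p>For example:</p>
<ul>
<li class="">Sentiment analysis</li>
<li class="">Spam detection</li>
<li class="">Intent recognition</li>
</ul>
<p>But one limitation remained:</p>
<p>Each model could handle only one specific task.</p>
<p>A model trained for sentiment analysis cannot be used directly for translation or question answering.</p>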
<hr>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="第三阶段深度学习与大模型">Stage 3: Deep Learning and Large Models<a href="https://jameszbl.github.io/why-javaer-know-ai#%E7%AC%AC%E4%B8%89%E9%98%B6%E6%AE%B5%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0%E4%B8%8E%E5%A4%A7%E6%A8%A1%E5%9E%8B" class="hash-link" aria-label="Direct link to Stage 3: Deep Learning and Large Models" title="Direct link to Stage 3: Deep Learning and Large Models" translate="no">​</a></h3>
<p>Then came:</p>
<ul>
<li class="">Word2Vec</li>
<li class="">LSTM</li>
<li class="">BERT</li>
</ul>
<p>followed by today's:</p>
<ul>
<li class="">GPT series</li>
<li class="">DeepSeek series</li>
<li class="">Qwen series</li>
<li class="">Gemini series</li>
</ul>
<p>For the first time, large models made this possible:</p>
<blockquote>
<p>A single model that handles many different language tasks.</p>
</blockquote>
<p>These include:</p>
<ul>
<li class="">Question answering</li>
<li class="">Translation</li>
<li class="">Summarization</li>
<li class="">Writing code</li>
<li class="">Text generation</li>
<li class="">Information extraction</li>
</ul>
<p>This is the fundamental reason ChatGPT took off.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="从规则系统到transformer">From Rule-Based Systems to the Transformer<a href="https://jameszbl.github.io/why-javaer-know-ai#%E4%BB%8E%E8%A7%84%E5%88%99%E7%B3%BB%E7%BB%9F%E5%88%B0transformer" class="hash-link" aria-label="Direct link to From Rule-Based Systems to the Transformer" title="Direct link to From Rule-Based Systems to the Transformer" translate="no">​</a></h2>
<p>Traditional software follows this logic:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Input</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Rule evaluation</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Fixed output</span><br></div></code></pre></div></div>
<p>With large models it becomes:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Input</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Neural network inference</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Generated output</span><br></div></code></pre></div></div>
<p>The development model shifts from:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Writing rules</span><br></div></code></pre></div></div>
<p>to:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Writing prompts</span><br></div></code></pre></div></div>
<p>So for Java engineers:</p>
<p>A growing share of business logic will be carried out by models rather than by if-else branches.</p>
<hr>
<h1>What Is a Token</h1>
<p>When we call a large model, the model does not process the raw text we send.</p>
<p>What the model sees is tokens.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="tokenization">Tokenization<a href="https://jameszbl.github.io/why-javaer-know-ai#tokenization" class="hash-link" aria-label="Direct link to Tokenization" title="Direct link to Tokenization" translate="no">​</a></h2>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Hello World</span><br></div></code></pre></div></div>
<p>may be split into:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Hello</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">World</span><br></div></code></pre></div></div>
<p>which is two tokens.</p>
<p>Chinese text is more complicated. Take this sentence, which means "the weather is nice today":</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">今天天气不错</span><br></div></code></pre></div></div>
<p>It may be split into:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">今天</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">天气</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">不错</span><br></div></code></pre></div></div>
<p>or into more tokens than that.</p>
<p>Different models use different tokenization strategies, so the same text can yield different token counts.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="token数量为什么影响成本">Why Token Count Affects Cost<a href="https://jameszbl.github.io/why-javaer-know-ai#token%E6%95%B0%E9%87%8F%E4%B8%BA%E4%BB%80%E4%B9%88%E5%BD%B1%E5%93%8D%E6%88%90%E6%9C%AC" class="hash-link" aria-label="Direct link to Why Token Count Affects Cost" title="Direct link to Why Token Count Affects Cost" translate="no">​</a></h2>
<p>Large model APIs are usually billed by token count, not by character count.</p>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Prompt</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">+</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Context</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">+</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Knowledge base content</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">+</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Conversation history</span><br></div></code></pre></div></div>
<p>all consume tokens.</p>
<p>Suppose:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Input: 5000 tokens</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Output: 1000 tokens</span><br></div></code></pre></div></div>
<p>Then a single request consumes:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">6000 tokens</span><br></div></code></pre></div></div>
<p>For an enterprise system handling:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">1 million calls</span><br></div></code></pre></div></div>
<p>the cost can be substantial.</p>
<p>This is why, when you study RAG later, you will find that:</p>
<p>Chunking, limiting the number of retrieved results, and compressing context are all, at their core, ways of optimizing token cost.</p>
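<p>The arithmetic behind this can be written down directly. The sketch below reuses the figures from the example above (5000 input tokens, 1000 output tokens, 1 million calls). The per-million-token prices are made-up placeholders, not any provider's real pricing. Providers commonly price input and output tokens separately, which is why the method takes two rates.</p>

```java
// Hypothetical helper for back-of-the-envelope token cost estimates.
final class TokenCostEstimator {

    // Total tokens consumed across all calls.
    static long totalTokens(long inputTokensPerCall, long outputTokensPerCall, long calls) {
        return (inputTokensPerCall + outputTokensPerCall) * calls;
    }

    // Prices are per one million tokens; the currency is whatever the prices are quoted in.
    static double estimateCost(long inputTokensPerCall, long outputTokensPerCall, long calls,
                               double inputPricePerMillion, double outputPricePerMillion) {
        double inputCost = inputTokensPerCall * calls / 1_000_000.0 * inputPricePerMillion;
        double outputCost = outputTokensPerCall * calls / 1_000_000.0 * outputPricePerMillion;
        return inputCost + outputCost;
    }
}
```

<p>At placeholder prices of 1.0 per million input tokens and 2.0 per million output tokens, the example workload comes to 5000 + 2000 = 7000 units of currency. Halving the input through better chunking or context compression would remove 2500 of that.</p>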
<hr>
<h1>What Did the Transformer Solve</h1>
<p>The Transformer is the foundation of modern large models.</p>
<p>In 2017, researchers at Google published the paper "Attention Is All You Need", which fundamentally changed the NLP field.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="rnn的问题">The Problem with RNNs<a href="https://jameszbl.github.io/why-javaer-know-ai#rnn%E7%9A%84%E9%97%AE%E9%A2%98" class="hash-link" aria-label="Direct link to The Problem with RNNs" title="Direct link to The Problem with RNNs" translate="no">​</a></h2>
<p>Before the Transformer, the mainstream models were:</p>
<ul>
<li class="">RNN</li>
<li class="">LSTM</li>
<li class="">GRU</li>
</ul>
<p>They process text roughly like this:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">1st word</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">2nd word</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">3rd word</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">...</span><br></div></code></pre></div></div>
<p>The input must be processed strictly in order.</p>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">I went to Beijing today,</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">the weather there was great.</span><br></div></code></pre></div></div>
<p>To resolve "there", the model has to remember "Beijing". In a sentence this short that is manageable, but as the text grows longer, earlier information has gradually faded by the time the model reaches the end.</p>
<p>This is the well-known:</p>
<blockquote>
<p>Long-range dependency problem</p>
</blockquote>
<p>In addition, because computation is serial, training is very slow.</p>
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="attention机制">The Attention Mechanism<a href="https://jameszbl.github.io/why-javaer-know-ai#attention%E6%9C%BA%E5%88%B6" class="hash-link" aria-label="Direct link to The Attention Mechanism" title="Direct link to The Attention Mechanism" translate="no">​</a></h2>
<p>The core idea of attention is simple:</p>
<blockquote>
<p>Every word can attend to every other word.</p>
</blockquote>
<p>For example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Xiao Ming gave the book to Xiao Hong,</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">she was very happy.</span><br></div></code></pre></div></div>
<p>Here, who does</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">she</span><br></div></code></pre></div></div>
<p>refer to?</p>
<p>Attention computes, for each of these pairs:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">she -&gt; Xiao Ming</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">she -&gt; Xiao Hong</span><br></div></code></pre></div></div>
<p>how strongly the two words are related.</p>
<p>It ends up finding that:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">she -&gt; Xiao Hong</span><br></div></code></pre></div></div>
<p>receives the higher weight.</p>
<p>This is how the model captures relationships within the context.</p>
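<p>A toy calculation shows how such weights can arise. The sketch below computes scaled dot-product scores between one query vector and several key vectors, then normalizes them with softmax, which is the scoring scheme described in "Attention Is All You Need". The two-dimensional vectors in the usage note are hand-picked for illustration. In a real model they come from learned projections of much larger embeddings, and the class name <code>ToyAttention</code> is hypothetical.</p>

```java
// Toy single-query attention: scaled dot-product scores followed by softmax.
final class ToyAttention {

    static double[] attentionWeights(double[] query, double[][] keys) {
        double scale = Math.sqrt(query.length);
        double[] weights = new double[keys.length];
        double maxScore = Double.NEGATIVE_INFINITY;

        // Score each key by its scaled dot product with the query.
        for (int i = 0; i < keys.length; i++) {
            double dot = 0.0;
            for (int d = 0; d < query.length; d++) {
                dot += query[d] * keys[i][d];
            }
            weights[i] = dot / scale;
            maxScore = Math.max(maxScore, weights[i]);
        }

        // Softmax, subtracting the max score first for numerical stability.
        double sum = 0.0;
        for (int i = 0; i < weights.length; i++) {
            weights[i] = Math.exp(weights[i] - maxScore);
            sum += weights[i];
        }
        for (int i = 0; i < weights.length; i++) {
            weights[i] /= sum;
        }
        return weights;
    }
}
```

<p>With a made-up query for "she" of <code>{0.2, 0.9}</code> and made-up keys <code>{0.9, 0.1}</code> for Xiao Ming and <code>{0.1, 0.8}</code> for Xiao Hong, the second weight comes out larger, matching the intuition above. The weights always sum to 1.</p>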
<hr>
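<p>The advantages the Transformer brings:</p>
<ul>
<li class="">Parallel computation</li>
<li class="">Longer context</li>
<li class="">Stronger semantic understanding</li>
<li class="">Better scalability</li>
</ul>
<p>Today almost all mainstream large models are built on the Transformer architecture.</p>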
<hr>
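<h1>Why Large Models Hallucinate</h1>
<p>Many people notice this the first time they use ChatGPT:</p>
<blockquote>
<p>Sometimes it sounds completely convincing, but it is actually wrong.</p>
</blockquote>
<p>This is hallucination.</p>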
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="概率生成本质">Generation Is Probabilistic<a href="https://jameszbl.github.io/why-javaer-know-ai#%E6%A6%82%E7%8E%87%E7%94%9F%E6%88%90%E6%9C%AC%E8%B4%A8" class="hash-link" aria-label="Direct link to Generation Is Probabilistic" title="Direct link to Generation Is Probabilistic" translate="no">​</a></h2>
<p>Many people assume:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">The model looks up the answer in a database</span><br></div></code></pre></div></div>
<p>It does not.</p>
<p>At its core, a large model:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Predicts the next token</span><br></div></code></pre></div></div>
<p>For example, given:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">The capital of China is</span><br></div></code></pre></div></div>
<p>the model may predict that:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Beijing</span><br></div></code></pre></div></div>
<p>has the highest probability.</p>
<p>So it outputs:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Beijing</span><br></div></code></pre></div></div>
<p>Then it goes on to predict the next token.</p>
<p>The whole answer is generated this way, one step at a time.</p>
<p>So what the model is really doing is:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Generating the most plausible answer</span><br></div></code></pre></div></div>
<p>rather than:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Looking up the true answer</span><br></div></code></pre></div></div>
<p>This is the root cause of hallucination.</p>
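<p>The difference between "pick the most probable token" and "draw a token at random according to its probability" can be shown in a short sketch. The class <code>NextTokenSampler</code> is hypothetical, and the probability table it receives stands in for a real model's output over its whole vocabulary. Sampling is also one reason the same question can produce different answers on different runs.</p>

```java
import java.util.Map;
import java.util.Random;

// Hypothetical helper contrasting greedy decoding with probabilistic sampling.
final class NextTokenSampler {

    // Greedy decoding: always return the single most probable token.
    static String greedy(Map<String, Double> probabilities) {
        String best = null;
        double bestProbability = -1.0;
        for (Map.Entry<String, Double> entry : probabilities.entrySet()) {
            if (entry.getValue() > bestProbability) {
                best = entry.getKey();
                bestProbability = entry.getValue();
            }
        }
        return best;
    }

    // Sampling: return each token with a chance proportional to its probability.
    static String sample(Map<String, Double> probabilities, Random random) {
        double threshold = random.nextDouble();
        double cumulative = 0.0;
        String last = null;
        for (Map.Entry<String, Double> entry : probabilities.entrySet()) {
            cumulative += entry.getValue();
            last = entry.getKey();
            if (threshold < cumulative) {
                return last;
            }
        }
        return last; // reached only if rounding leaves the total slightly below 1
    }
}
```

<p>Given a made-up distribution such as Beijing 0.90, Shanghai 0.07, Nanjing 0.03 after "The capital of China is", <code>greedy</code> always returns Beijing, while <code>sample</code> occasionally returns one of the other two. Neither method checks whether the chosen token is true, only how likely it is.</p>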
<hr>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="知识截止时间">Knowledge Cutoff<a href="https://jameszbl.github.io/why-javaer-know-ai#%E7%9F%A5%E8%AF%86%E6%88%AA%E6%AD%A2%E6%97%B6%E9%97%B4" class="hash-link" aria-label="Direct link to Knowledge Cutoff" title="Direct link to Knowledge Cutoff" translate="no">​</a></h2>
<p>Another cause:</p>
<p>A model's knowledge comes from its training data.</p>
<p>For example, if:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Training data ends in 2025</span><br></div></code></pre></div></div>
<p>then:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">News from 2026</span><br></div></code></pre></div></div>
<p>is something the model inherently cannot know.</p>
<p>If it is pushed to answer anyway, it may fabricate content.</p>
<p>For this reason, enterprise applications usually:</p>
<ul>
<li class="">Connect to a search system</li>
<li class="">Connect to a knowledge base</li>
<li class="">Connect to databases</li>
</ul>
<p>so that the model can access up-to-date information.</p>
<hr>
<h1>Why Enterprises Cannot Just Call ChatGPT Directly</h1>
<p>Many people think that:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">User question</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">ChatGPT</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Answer returned</span><br></div></code></pre></div></div>
<p>is the whole story.</p>
<p>Real enterprise systems are far more complex than that.</p>
<hr>
<p>Suppose a user asks:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Why hasn't my order shipped yet?</span><br></div></code></pre></div></div>
<p>ChatGPT does not know:</p>
<ul>
<li class="">Who the user is</li>
<li class="">What the order number is</li>
<li class="">The warehouse status</li>
<li class="">The current shipping information</li>
</ul>
<p>All of this lives in the enterprise's internal systems.</p>
<p>So you need:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Large model</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">+</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Enterprise knowledge</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">+</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">Enterprise systems</span><br></div></code></pre></div></div>
<p>working together.</p>
<hr>
<p>Here is another example:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">Why isn't this product sweet?</span><br></div></code></pre></div></div>
<p>The enterprise wants the model to:</p>
<ol>
<li class="">Recognize the intent behind the negative review</li>
<li class="">Query the product knowledge base</li>
<li class="">Look up the after-sales policy</li>
<li class="">Generate a reply</li>
<li class="">Submit the reply for review</li>
</ol>
<p>This is no longer simple chat.</p>
<p>It is business process execution.</p>
<hr>
<p>因此企业AI应用通常会演化成：</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">用户问题</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">RAG检索</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">大模型推理</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">工具调用</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">业务系统执行</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"> ↓</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">结果返回</span><br></div></code></pre></div></div>
<p>This is the most mainstream architecture today, built on:</p>
<ul>
<li class="">RAG (Retrieval-Augmented Generation)</li>
<li class="">Agent</li>
</ul>
<hr>
<h1>Summary</h1>
<p>For Java engineers, learning AI development is not about becoming an algorithm engineer; it is about understanding how large models operate.</p>]]></content:encoded>
            <category>ai</category>
            <category>java</category>
        </item>
        <item>
            <title><![CDATA[Welcome]]></title>
            <link>https://jameszbl.github.io/welcome</link>
            <guid>https://jameszbl.github.io/welcome</guid>
            <pubDate>Thu, 26 Aug 2021 00:00:00 GMT</pubDate>
            <description><![CDATA[Docusaurus blogging features are powered by the blog plugin.]]></description>
            <content:encoded><![CDATA[<p><a href="https://docusaurus.io/docs/blog" target="_blank" rel="noopener noreferrer" class="">Docusaurus blogging features</a> are powered by the <a href="https://docusaurus.io/docs/api/plugins/@docusaurus/plugin-content-blog" target="_blank" rel="noopener noreferrer" class="">blog plugin</a>.</p>
<p>Here are a few tips you might find useful.</p>
<!-- -->
<p>Simply add Markdown files (or folders) to the <code>blog</code> directory.</p>
<p>Regular blog authors can be added to <code>authors.yml</code>.</p>
<p>The blog post date can be extracted from filenames, such as:</p>
<ul>
<li class=""><code>2019-05-30-welcome.md</code></li>
<li class=""><code>2019-05-30-welcome/index.md</code></li>
</ul>
<p>A blog post folder can be convenient to co-locate blog post images:</p>
<p><img decoding="async" loading="lazy" alt="Docusaurus Plushie" src="https://jameszbl.github.io/assets/images/docusaurus-plushie-banner-a60f7593abca1e3eef26a9afa244e4fb.jpeg" width="1500" height="500" class="img_ev3q"></p>
<p>The blog supports tags as well!</p>
<p><strong>And if you don't want a blog</strong>: just delete this directory, and use <code>blog: false</code> in your Docusaurus config.</p>]]></content:encoded>
            <category>Facebook</category>
            <category>Hello</category>
            <category>Docusaurus</category>
        </item>
        <item>
            <title><![CDATA[MDX Blog Post]]></title>
            <link>https://jameszbl.github.io/mdx-blog-post</link>
            <guid>https://jameszbl.github.io/mdx-blog-post</guid>
            <pubDate>Sun, 01 Aug 2021 00:00:00 GMT</pubDate>
            <description><![CDATA[Blog posts support Docusaurus Markdown features, such as MDX.]]></description>
            <content:encoded><![CDATA[<p>Blog posts support <a href="https://docusaurus.io/docs/markdown-features" target="_blank" rel="noopener noreferrer" class="">Docusaurus Markdown features</a>, such as <a href="https://mdxjs.com/" target="_blank" rel="noopener noreferrer" class="">MDX</a>.</p>
<div class="theme-admonition theme-admonition-tip admonition_xJq3 alert alert--success"><div class="admonitionHeading_Gvgb"><span class="admonitionIcon_Rf37"><svg viewBox="0 0 12 16"><path fill-rule="evenodd" d="M6.5 0C3.48 0 1 2.19 1 5c0 .92.55 2.25 1 3 1.34 2.25 1.78 2.78 2 4v1h5v-1c.22-1.22.66-1.75 2-4 .45-.75 1-2.08 1-3 0-2.81-2.48-5-5.5-5zm3.64 7.48c-.25.44-.47.8-.67 1.11-.86 1.41-1.25 2.06-1.45 3.23-.02.05-.02.11-.02.17H5c0-.06 0-.13-.02-.17-.2-1.17-.59-1.83-1.45-3.23-.2-.31-.42-.67-.67-1.11C2.44 6.78 2 5.65 2 5c0-2.2 2.02-4 4.5-4 1.22 0 2.36.42 3.22 1.19C10.55 2.94 11 3.94 11 5c0 .66-.44 1.78-.86 2.48zM4 14h5c-.23 1.14-1.3 2-2.5 2s-2.27-.86-2.5-2z"></path></svg></span>Tip</div><div class="admonitionContent_BuS1"><p>Use the power of React to create interactive blog posts.</p></div></div>
<!-- -->
<p>For example, use JSX to create an interactive button:</p>
<div class="language-js codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-js codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token operator" style="color:#393A34">&lt;</span><span class="token plain">button onClick</span><span class="token operator" style="color:#393A34">=</span><span class="token punctuation" style="color:#393A34">{</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token arrow operator" style="color:#393A34">=&gt;</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">alert</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">'button clicked!'</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">}</span><span class="token operator" style="color:#393A34">&gt;</span><span class="token maybe-class-name">Click</span><span class="token plain"> me</span><span class="token operator" style="color:#393A34">!</span><span class="token operator" style="color:#393A34">&lt;</span><span class="token operator" style="color:#393A34">/</span><span class="token plain">button</span><span class="token operator" style="color:#393A34">&gt;</span><br></div></code></pre></div></div>
<button>Click me!</button>]]></content:encoded>
            <category>Docusaurus</category>
        </item>
        <item>
            <title><![CDATA[Long Blog Post]]></title>
            <link>https://jameszbl.github.io/long-blog-post</link>
            <guid>https://jameszbl.github.io/long-blog-post</guid>
            <pubDate>Wed, 29 May 2019 00:00:00 GMT</pubDate>
            <description><![CDATA[This is the summary of a very long blog post,]]></description>
            <content:encoded><![CDATA[<p>This is the summary of a very long blog post,</p>
<p>Use a <code>{/*</code> <code>truncate</code> <code>*/}</code> comment to limit blog post size in the list view.</p>
<!-- -->
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque elementum dignissim ultricies. Fusce rhoncus ipsum tempor eros aliquam consequat. Lorem ipsum dolor sit amet</p>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque elementum dignissim ultricies. Fusce rhoncus ipsum tempor eros aliquam consequat. Lorem ipsum dolor sit amet</p>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque elementum dignissim ultricies. Fusce rhoncus ipsum tempor eros aliquam consequat. Lorem ipsum dolor sit amet</p>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque elementum dignissim ultricies. Fusce rhoncus ipsum tempor eros aliquam consequat. Lorem ipsum dolor sit amet</p>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque elementum dignissim ultricies. Fusce rhoncus ipsum tempor eros aliquam consequat. Lorem ipsum dolor sit amet</p>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque elementum dignissim ultricies. Fusce rhoncus ipsum tempor eros aliquam consequat. Lorem ipsum dolor sit amet</p>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque elementum dignissim ultricies. Fusce rhoncus ipsum tempor eros aliquam consequat. Lorem ipsum dolor sit amet</p>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque elementum dignissim ultricies. Fusce rhoncus ipsum tempor eros aliquam consequat. Lorem ipsum dolor sit amet</p>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque elementum dignissim ultricies. Fusce rhoncus ipsum tempor eros aliquam consequat. Lorem ipsum dolor sit amet</p>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque elementum dignissim ultricies. Fusce rhoncus ipsum tempor eros aliquam consequat. Lorem ipsum dolor sit amet</p>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque elementum dignissim ultricies. Fusce rhoncus ipsum tempor eros aliquam consequat. Lorem ipsum dolor sit amet</p>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque elementum dignissim ultricies. Fusce rhoncus ipsum tempor eros aliquam consequat. Lorem ipsum dolor sit amet</p>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque elementum dignissim ultricies. Fusce rhoncus ipsum tempor eros aliquam consequat. Lorem ipsum dolor sit amet</p>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque elementum dignissim ultricies. Fusce rhoncus ipsum tempor eros aliquam consequat. Lorem ipsum dolor sit amet</p>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque elementum dignissim ultricies. Fusce rhoncus ipsum tempor eros aliquam consequat. Lorem ipsum dolor sit amet</p>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque elementum dignissim ultricies. Fusce rhoncus ipsum tempor eros aliquam consequat. Lorem ipsum dolor sit amet</p>]]></content:encoded>
            <category>Hello</category>
            <category>Docusaurus</category>
        </item>
        <item>
            <title><![CDATA[First Blog Post]]></title>
            <link>https://jameszbl.github.io/first-blog-post</link>
            <guid>https://jameszbl.github.io/first-blog-post</guid>
            <pubDate>Tue, 28 May 2019 00:00:00 GMT</pubDate>
            <description><![CDATA[Lorem ipsum dolor sit amet...]]></description>
            <content:encoded><![CDATA[<p>Lorem ipsum dolor sit amet...</p>
<!-- -->
<p>...consectetur adipiscing elit. Pellentesque elementum dignissim ultricies. Fusce rhoncus ipsum tempor eros aliquam consequat. Lorem ipsum dolor sit amet</p>]]></content:encoded>
            <category>Hola</category>
            <category>Docusaurus</category>
        </item>
    </channel>
</rss>