Implementing an Agent with GPT 🤖️

1. Introduction

Prerequisite reading:

Agent: js.langchain.com/docs/module…

OpenAI API: platform.openai.com/docs/api-re…

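In this article we explore how to use GPT to implement the Agent concept from LangChain. LangChain is a framework for programming with language models, and Agent, LLM, and Tool are its core components. We first look at what these components are and how they relate, then design a state machine that describes the Agent's workflow, and finally use GPT to provide the LLM functionality.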

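2. Background

First, what is an Agent? In one sentence: the AI acts as a central agent that makes decisions and carries them out.

Consider a scenario: your application bundles a calculation service, a web search service, and a weather service, but the format of user input is unpredictable. Normally you would have to infer the user's intent from the input, call the matching service, and return the final result. An Agent handles this automatically: you tell it what you need, and it returns the answer.

LangChain's Agents target exactly this kind of scenario. Combined with an LLM, an agent works out which service to use and which parameters that service needs, then produces the corresponding output.

In one sentence: in LangChain, the agent does the calling, the LLM makes the decisions, and the tools do the actual work.

The agent does the calling: it only orchestrates, calling the LLM to get a decision and calling tools to get concrete results.
The LLM makes the decisions: based on the information supplied by the agent (the user input and the available tools), it decides which tool to call and which arguments to pass, and hands that decision back to the agent.
The tools do the actual work: for example, a weather tool provides the weather service and a search tool provides the search service, and each returns its result to the agent.

Now that the Agent concept and the way GPT is called are clear, we can start building.

(Tip: for calling GPT, see "Quickly Building a ChatBot with the OpenAI API".)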

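3. Overall Approach

In practice, what we are building is the structure shown in the diagram above. There are three modules to implement.

Agent: the coordinator, responsible for orchestrating the whole flow.
LLM: the language-processing component, responsible for language-related tasks.
Tools: the tools or services that carry out concrete tasks and provide the functionality those tasks need.

Breaking the flow into steps gives the table below, along with the corresponding state machine states.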

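Step	State machine state	Modules involved	What the role does
Receive the user instruction	INITIAL	Agent	Receive and store the user's instruction
Understand the instruction	CALL_LLM	LLM	Parse the user's instruction and produce an action plan
Call a tool	CALL_TOOL	Agent, Tool	Select and call the appropriate tool according to the action plan
Process the tool result	PROCESS_RESULT	Agent, LLM	Receive the tool's result and, if needed, call the LLM again to process it
Output the result	OUTPUT_RESULT	Agent	Return the final result to the user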
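
The table maps naturally onto an enum plus a transition map. The sketch below is one possible encoding, not the project's actual code: the state names follow the table, HANDLE_ERROR is taken from the processing loop shown later in this article, and the string values, the TRANSITIONS map, and the canTransition helper are assumptions introduced for illustration.

```typescript
// State names follow the table; the string values are an assumption.
enum AgentState {
  INITIAL = "INITIAL",
  CALL_LLM = "CALL_LLM",
  CALL_TOOL = "CALL_TOOL",
  PROCESS_RESULT = "PROCESS_RESULT",
  OUTPUT_RESULT = "OUTPUT_RESULT",
  HANDLE_ERROR = "HANDLE_ERROR",
}

// Hypothetical transition map: which states may follow each state.
const TRANSITIONS: Record<AgentState, AgentState[]> = {
  [AgentState.INITIAL]: [AgentState.CALL_LLM, AgentState.HANDLE_ERROR],
  [AgentState.CALL_LLM]: [AgentState.CALL_TOOL, AgentState.OUTPUT_RESULT, AgentState.HANDLE_ERROR],
  [AgentState.CALL_TOOL]: [AgentState.PROCESS_RESULT, AgentState.HANDLE_ERROR],
  // The loop treats PROCESS_RESULT like CALL_LLM, so another tool call is possible.
  [AgentState.PROCESS_RESULT]: [AgentState.CALL_TOOL, AgentState.OUTPUT_RESULT, AgentState.HANDLE_ERROR],
  // Terminal states have no successors.
  [AgentState.OUTPUT_RESULT]: [],
  [AgentState.HANDLE_ERROR]: [],
};

// Returns true when moving from `from` to `to` is allowed by the map.
function canTransition(from: AgentState, to: AgentState): boolean {
  return TRANSITIONS[from].some((state) => state === to);
}
```

A map like this is optional, but it gives a single place to check that the loop never makes a move the table does not describe.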

With the modules and the flow laid out, let's implement them.

4. Implementation

We cover this in two parts: module design and flow design.

4.1 Module Design

Tool

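In LangChain, a tool (Tool) is the module that carries out a concrete task; tools give the Agent the functionality it needs. Each tool has one specific job, such as looking up the weather, doing arithmetic, or searching the web. The design goal is for every tool to accept a string argument and return a string result, so the Agent can call any tool through the same interface.

To achieve this, we define an interface for tools with three main members:

name: the tool's name, used to identify it when it is called.
description: the tool's description, explaining what it does and how to use it.
call: the tool's main method, which accepts a string argument and returns a string result.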

export interface Tool {
  name: string;  // The name of the tool, used to identify it when the tool is invoked.
  description: string;  // A description of the tool, explaining what the tool does and how to use it.
  call: (input: string) => Promise<string>; // The main method of the tool, accepts a string parameter and returns a string result.
}

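This design makes it easy to add new tools: implementing the interface is all it takes to create one. The new tool can then be added to the Agent's tool list, and the Agent can call it while carrying out a task.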
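
To make the interface concrete, here is a sketch of a tool that implements it. The weather_tool name matches the example response later in this article; the canned forecast table is made up for illustration and stands in for a real weather service.

```typescript
// Same shape as the Tool interface above, repeated so the sketch is self-contained.
interface Tool {
  name: string;
  description: string;
  call: (input: string) => Promise<string>;
}

// Hypothetical canned data standing in for a real weather API.
const FAKE_FORECASTS: { [city: string]: string | undefined } = {
  beijing: "sunny, 25°C",
  shanghai: "cloudy, 22°C",
};

const weatherTool: Tool = {
  name: "weather_tool",
  description: "Returns the current weather for a city. Input: the city name, e.g. \"Beijing\".",
  call: async (input: string): Promise<string> => {
    // Normalize the city name before looking it up.
    const forecast = FAKE_FORECASTS[input.trim().toLowerCase()];
    // Tools return strings, so "not found" is reported as text rather than thrown.
    return forecast ?? `No weather data for "${input}".`;
  },
};
```

Because the description is what the LLM sees when choosing a tool, it is worth stating the expected input format in it.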

LLM

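In LangChain, the LLM is a language model that understands the user's instruction and produces an action plan. Its main job is to parse the instruction, decide the next action, and generate the parameters for that action.

To do this, we use the OpenAI API. When calling the LLM, we build a prompt containing the user's instruction and the tool information and send it to the OpenAI API. The prompt asks the model to reply with a JSON string containing the next action and its parameters.

We want to call a method exposed by the LLM class and get back an action and its parameters.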

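import { Configuration, ConfigurationParameters, OpenAIApi } from "openai";
import { Tool } from "./tool";

export class LLM {
  openai: OpenAIApi;

  async understand(instruction: string, result: string | null, tools: Tool[]) {
    // Step 1: build the prompt (generatePrompt stands for the logic expanded below).
    const prompt = generatePrompt(instruction, result, tools);
    // Step 2: call GPT; the call is asynchronous, so it must be awaited.
    const response = await this.openai.createChatCompletion({
      model: "gpt-4-0613",
      messages: [{ role: "system", content: prompt }, { role: "user", content: instruction }],
    });
    // Step 3: extract the action and params from the JSON reply.
    const content = response.data.choices[0].message?.content ?? "";
    const { action, params } = JSON.parse(content);
    return { action, params };
  }
}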

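So there are three steps: building the prompt, calling GPT, and extracting the response.

Building the prompt: this tells GPT the instruction, the current tool set, and the format the response must follow.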

export class LLM {
  openai: OpenAIApi;
  async understand(instruction: string, result: string | null, tools: Tool[]) {
    // Describe the available tools for the prompt. The original omits this step,
    // so the "name: description" format here is an assumption.
    const toolInfo = tools.map((tool) => `${tool.name}: ${tool.description}`).join('; ');
    // Build the prompt
    let prompt = '';
    if (result) {
      // A tool result is available: ask the LLM to turn it into the final answer
      prompt = `You are a helpful assistant. The user says: "${instruction}". The result from the tool is: "${result}". Based on the user's instruction and the tool's result, decide the next action. You may process the result in params, but you must respond with {"action": "output_result", "params": {"result": "the final result"}}.`;
    } else {
      // No result yet: ask the LLM to decide the next action
      prompt = `You are a helpful assistant. You have the following tools at your disposal: ${toolInfo}. The user says: "${instruction}". Based on the user's instruction and the tools available, decide the next action. If a tool should be used, respond with {"action": "call_tool", "params": {"toolName": "the name of the tool", "toolInput": "the input for the tool"}}. If the final result should be output, respond with {"action": "output_result", "params": {"result": "the final result"}}. Please note that the tool name should be one of the available tools and the tool input should be a valid input for that tool.`;
    }
    // other code
  }
}

Calling GPT: here we simply call the API that OpenAI provides.

export class LLM {
  openai: OpenAIApi;
  async understand(instruction: string, result: string | null, tools: Tool[]) {
    // other code
    const response = await this.openai.createChatCompletion({
      model: 'gpt-4-0613',
      messages: [
        {
          role: 'system',
          content: prompt,
        },
        {
          role: 'user',
          content: instruction,
        },
      ],
    });
    // other code
  }
}

Extracting the response: the resulting JSON string has the format below, and we just extract action and params from it.

{
  "action": "call_tool",  // the action to take: call_tool calls a tool, output_result outputs the final result
  "params": {
    "toolName": "weather_tool",
    "toolInput": "Beijing"
  }
}
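
JSON.parse throws on malformed text and returns an untyped value, so it may be worth validating the model's reply before acting on it. The sketch below assumes the two response shapes requested by the prompts above; the AgentAction type and the parseAction helper are names introduced here and are not part of the original code.

```typescript
// The two response shapes the prompts ask the model to produce.
type AgentAction =
  | { action: "call_tool"; params: { toolName: string; toolInput: string } }
  | { action: "output_result"; params: { result: string } };

// Hypothetical helper: parse the raw model reply and check its shape.
function parseAction(content: string): AgentAction {
  let data: any;
  try {
    data = JSON.parse(content);
  } catch {
    throw new Error(`LLM reply is not valid JSON: ${content}`);
  }
  const params = data?.params;
  if (
    data?.action === "call_tool" &&
    typeof params?.toolName === "string" &&
    typeof params?.toolInput === "string"
  ) {
    return { action: "call_tool", params: { toolName: params.toolName, toolInput: params.toolInput } };
  }
  if (data?.action === "output_result" && typeof params?.result === "string") {
    return { action: "output_result", params: { result: params.result } };
  }
  // Valid JSON, but not one of the two shapes we asked for.
  throw new Error(`Unexpected LLM reply shape: ${content}`);
}
```

With a discriminated union like this, checking action narrows the type of params, so the agent cannot read toolName from an output_result reply by mistake.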

Agent

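The agent calls the LLM and the tools and ties the flow together, so it can be abstracted into three APIs.

callLLM: calls the LLM and lets it make the decision.
callTool: calls a tool and gets that tool's output.
processInstruction: executes the instruction and ties the flow together, calling the LLM and the tools along the way.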

export class Agent {
  state: AgentState;
  llm: LLM;
  tools: Tool[];
  constructor(llm: LLM, tools: Tool[]) {
    this.state = AgentState.INITIAL;
    this.llm = llm;
    this.tools = tools;
  }
  async processInstruction(instruction: string) {
  }
  async callLLM(instruction: string, result: string | null) {
  }
  async callTool({ toolName, toolInput }: { toolName: string; toolInput: string }) {
  }
}
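
The call methods are left empty in the skeleton above. One way callTool might be filled in is sketched below as a standalone function so the example is self-contained; inside the class it would read this.tools instead of taking a tools argument. The lookup-by-name logic and the error for an unknown tool are assumptions, and echoTool is a made-up tool used only for demonstration.

```typescript
// Same shape as the Tool interface above, repeated so the sketch is self-contained.
interface Tool {
  name: string;
  description: string;
  call: (input: string) => Promise<string>;
}

// Hypothetical body for Agent.callTool: find the tool by name, then invoke it.
async function callTool(
  tools: Tool[],
  { toolName, toolInput }: { toolName: string; toolInput: string }
): Promise<string> {
  const tool = tools.find((candidate) => candidate.name === toolName);
  if (!tool) {
    // The LLM may name a tool that does not exist, so fail with a clear message.
    const known = tools.map((candidate) => candidate.name).join(", ");
    throw new Error(`Unknown tool "${toolName}". Available tools: ${known}`);
  }
  return tool.call(toolInput);
}

// Made-up tool for demonstration only.
const echoTool: Tool = {
  name: "echo_tool",
  description: "Returns its input unchanged.",
  call: async (input) => input,
};
```

Throwing on an unknown tool name fits the processing loop in this article, whose catch block moves the agent to HANDLE_ERROR.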

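That covers the module design for Tool, Agent, and LLM. Next, the flow design.

4.2 Flow Design

As mentioned, the flow is driven by processInstruction in the Agent. We have already mapped each step to a state machine state and to the modules involved (the table is repeated below), so all that remains is to implement it.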

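Step	State machine state	Modules involved	What the role does
Receive the user instruction	INITIAL	Agent	Receive and store the user's instruction
Understand the instruction	CALL_LLM	LLM	Parse the user's instruction and produce an action plan
Call a tool	CALL_TOOL	Agent, Tool	Select and call the appropriate tool according to the action plan
Process the tool result	PROCESS_RESULT	Agent, LLM	Receive the tool's result and, if needed, call the LLM again to process it
Output the result	OUTPUT_RESULT	Agent	Return the final result to the user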

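class Agent {
  async processInstruction(instruction: string) {
    let result = null;
    let actionInfo = null;
    // Start every run from INITIAL; otherwise a reused agent would stay in its
    // previous terminal state and skip the loop entirely.
    this.state = AgentState.INITIAL;
    // Loop until a terminal state (OUTPUT_RESULT or HANDLE_ERROR) is reached.
    while (this.state !== AgentState.OUTPUT_RESULT && this.state !== AgentState.HANDLE_ERROR) {
      try {
        switch (this.state) {
          case AgentState.INITIAL:
            // Instruction received; next, ask the LLM what to do.
            this.state = AgentState.CALL_LLM;
            break;
          case AgentState.CALL_LLM:
          case AgentState.PROCESS_RESULT:
            // Both states ask the LLM for a decision; result is null on the first pass.
            actionInfo = await this.callLLM(instruction, result);
            // updateStateBasedOnAction is not shown in this article; it is expected
            // to set this.state from the action and return the result, if any.
            result = this.updateStateBasedOnAction(actionInfo);
            break;
          case AgentState.CALL_TOOL:
            // Run the tool the LLM picked, then hand its output back to the LLM.
            result = await this.callTool(actionInfo.params);
            this.state = AgentState.PROCESS_RESULT;
            break;
          default:
            throw new Error(`Invalid state, ${this.state}`);
        }
      } catch (error) {
        // Any failure ends the run with a generic message.
        this.state = AgentState.HANDLE_ERROR;
        result = 'An error occurred.';
      }
    }
    return result;
  }
}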
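
The loop relies on updateStateBasedOnAction, which is not shown in this article. A plausible version, written here as a pure function named nextStep, might look like the following. The ActionInfo type, the return shape, and the error on an unknown action are all assumptions; in the class, the method would assign the returned state to this.state and return the result.

```typescript
// State names follow the table; the string values are an assumption.
enum AgentState {
  INITIAL = "INITIAL",
  CALL_LLM = "CALL_LLM",
  CALL_TOOL = "CALL_TOOL",
  PROCESS_RESULT = "PROCESS_RESULT",
  OUTPUT_RESULT = "OUTPUT_RESULT",
  HANDLE_ERROR = "HANDLE_ERROR",
}

// Hypothetical shape of the decision returned by the LLM.
interface ActionInfo {
  action: string;
  params: { toolName?: string; toolInput?: string; result?: string };
}

// Hypothetical: map the LLM's decision to the next state and, when finished, the final result.
function nextStep(actionInfo: ActionInfo): { state: AgentState; result: string | null } {
  switch (actionInfo.action) {
    case "call_tool":
      // The tool has not run yet, so there is no result to carry forward.
      return { state: AgentState.CALL_TOOL, result: null };
    case "output_result":
      return { state: AgentState.OUTPUT_RESULT, result: actionInfo.params.result ?? null };
    default:
      // Inside the agent loop this error would be caught and lead to HANDLE_ERROR.
      throw new Error(`Unknown action: ${actionInfo.action}`);
  }
}
```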

5. Results

At this point we have a working agent in broad strokes. Adding logging lets us look at how it behaves in practice.
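
Logging can be added without touching the agent itself, for example by wrapping each tool. The sketch below is only one option: the withLogging wrapper and its log parameter are hypothetical names, and the original project's logging may well be implemented differently.

```typescript
// Same shape as the Tool interface above, repeated so the sketch is self-contained.
interface Tool {
  name: string;
  description: string;
  call: (input: string) => Promise<string>;
}

// Hypothetical wrapper: returns a tool that logs its input and output around each call.
function withLogging(tool: Tool, log: (line: string) => void = console.log): Tool {
  return {
    name: tool.name,
    description: tool.description,
    call: async (input: string): Promise<string> => {
      log(`[tool:${tool.name}] input: ${input}`);
      const output = await tool.call(input);
      log(`[tool:${tool.name}] output: ${output}`);
      return output;
    },
  };
}
```

Wrapping every tool before passing the list to the Agent constructor, for instance with tools.map((tool) => withLogging(tool)), would trace each tool call; the LLM's decisions could be logged the same way around callLLM.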

Here are two examples.

Weather example: What is the weather in Shanghai?

Calculation example: What is the result of 100 * 100 - 30?

Both examples follow our Agent's flow, and we can also see its decision process.

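Conclusion

In this project we used GPT to implement the Agent concept from LangChain, and designed a state machine to describe the Agent's workflow.

Overall, the code in this article matters less than the Agent concept itself, in my view.

The Agent concept is powerful and flexible: an agent can be seen as an intelligent entity that understands and carries out user instructions. In LangChain, an Agent fulfils the user's instruction by calling the LLM (Large Language Model) and various tools. This design lets the Agent handle a wide range of complex tasks, and it can be extended by adding new tools.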


Full code:

Agent-Github


WeChat official account: 华铧同学
Github: github.com/hua-bang
