This document is based on 0.3.0, the latest stable release at the time of writing: github.com/Significant…

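What is AutoGPT

AutoGPT is an open-source project on GitHub that aims to make GPT-4 fully autonomous. The user only needs to give AutoGPT a goal. AutoGPT then generates an execution plan on its own, interacts with GPT-4 or GPT-3.5 on its own, and works through the plan step by step until it outputs the result the user wants, with no user involvement along the way. AutoGPT also implements many tools, so it can perform web searches, file operations, code execution and similar actions, which connects it to the real world and greatly extends what ChatGPT can do.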

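Because users do not have to guide the AI, the project went viral as soon as it was released. At the time of writing, a little over fifty days after launch, it has 131k stars and 26.7k forks, making it one of the fastest-growing projects in GitHub history. Andrej Karpathy, a former director at Tesla who recently returned to OpenAI, has also promoted it, calling AutoGPT "the next frontier of prompt engineering."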

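Compared with OpenAI's official ChatGPT/GPT-4, AutoGPT mainly removes the need to interact with ChatGPT/GPT-4 step by step and hides details such as prompts. It also implements a number of tools and describes their capabilities to the LLM through the prompt, so that the model can use them and extend what it is able to do.

Compared with langchain: langchain is a library, so you have to write code before you can interact with an LLM, whereas AutoGPT is an application that interacts with the LLM directly. Conceptually, AutoGPT corresponds to an Agent in langchain, AutoGPT's tools correspond to langchain's tools, and AutoGPT's memory corresponds to langchain's memory.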

AutoGPT's official sites:

  • GitHub: github.com/Significant…

  • Website: agpt.co

  • Documentation

    • docs.agpt.co

    • github.com/Significant…

Using AutoGPT

Documentation: docs.agpt.co/setup

Configuration

AutoGPT must be configured before use. It supports the GPT-3.5/GPT-4 models from both OpenAI and Azure.

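  • OpenAI

    • Set OPENAI_API_KEY in .env
  • Azure

    • Enable Azure: set USE_AZURE=True in .env

    • Configure the key: set OPENAI_API_KEY in .env. This key is the Azure key and has nothing to do with OpenAI

    • Configure the Azure model information in azure.yaml

      • azure_api_base: the Azure API base
      • azure_api_version: the Azure API version
      • fast_llm_model_deployment_id: the LLM deployed in Azure. To have AutoGPT use GPT-3.5, enter the GPT-3.5 model ID; to have it use GPT-4, enter the GPT-4 model ID
      • smart_llm_model_deployment_id: the LLM deployed in Azure. Choose the ID in the same way and keep it consistent with the entry above
      • embedding_model_deployment_id: the embedding model deployed in Azure. Required; usually the text-embedding-ada-002 model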

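The settings above are the LLM-related configuration, and they are required to run AutoGPT. Beyond these, AutoGPT also supports tool and memory configuration; for example, it can be configured to generate images with Stable Diffusion WebUI. See the documentation for details.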
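As a quick sanity check on the settings listed above, the following minimal sketch reports which required entries are missing. It is not part of AutoGPT: the helper name `missing_settings` is hypothetical, it works on plain dictionaries rather than reading a real .env or azure.yaml, and it assumes the Azure keys are flat, exactly as they are listed in this article.

```python
from typing import List, Optional

# Azure keys as named in this article (assumed flat; the real file layout may differ).
AZURE_KEYS = (
    "azure_api_base",
    "azure_api_version",
    "fast_llm_model_deployment_id",
    "smart_llm_model_deployment_id",
    "embedding_model_deployment_id",
)


def missing_settings(env: dict, azure: Optional[dict] = None) -> List[str]:
    """Hypothetical helper: return the names of required settings that are absent or empty."""
    missing = []
    if not env.get("OPENAI_API_KEY"):
        missing.append("OPENAI_API_KEY")
    # The Azure-specific keys only matter when USE_AZURE is switched on.
    if str(env.get("USE_AZURE", "False")).lower() == "true":
        azure = azure or {}
        missing.extend(key for key in AZURE_KEYS if not azure.get(key))
    return missing
```

With only `OPENAI_API_KEY` set and Azure disabled, the function returns an empty list; with `USE_AZURE=True` it also lists every Azure key that has not been filled in.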

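Running

AutoGPT can run inside or outside Docker

  • Inside Docker

AutoGPT publishes a prebuilt Docker image that can be pulled and used directly

The official documentation provides a docker-compose configuration file that already takes care of file mounts and similar setup, so it can be run as is

  • Outside Docker

Install the Python 3 dependencies in advance; they are listed in requirements.txt

Run run.sh in the AutoGPT project root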


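AutoGPT provides several command-line options that control its runtime behavior. The main ones are:

| Option | Effect |
| --- | --- |
| --help | List and explain all available options |
| --continuous / -c | Enable continuous mode: commands chosen by the LLM are executed directly, without user confirmation |
| --gpt3only | Use GPT-3.5 only. In 0.3.0 this takes effect for OpenAI; with Azure, the models are those in the Azure configuration and are not affected |
| --gpt4only | Use GPT-4 only. In 0.3.0 this takes effect for OpenAI; with Azure, the models are those in the Azure configuration and are not affected |
| --debug | Enable debug mode and output debug logs |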

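When a run finishes, the results are in autogpt/auto_gpt_workspace under the project root.
A web version of AutoGPT is on the way; for now you can join the waitlist by submitting your email on the official website.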

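Workflow

After AutoGPT starts, it prompts the user to enter what they want GPT to do. Once the task is entered, AutoGPT first asks the LLM to break the task down into several goals, and then starts a loop that autonomously works through the user's task step by step.

The main execution flow of AutoGPT is as follows, with some branches omitted

[Figure: Auto-GPT introduction, main execution flow]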
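The flow described above (decompose the task once, then loop over "ask for the next command, execute it, feed the result back") can be sketched roughly as follows. This is an illustration rather than AutoGPT's actual code: `run_agent` and its three callback parameters are hypothetical names, and the LLM and the command executor are passed in as plain functions so that the sketch stays self-contained.

```python
from typing import Callable, Dict, List


def run_agent(
    task: str,
    plan_goals: Callable[[str], List[str]],
    next_command: Callable[[List[str], List[str]], Dict],
    execute: Callable[[str, Dict], str],
    max_steps: int = 10,
) -> List[str]:
    """Hypothetical loop: decompose the task once, then ask, execute, feed back."""
    goals = plan_goals(task)      # step 1: the LLM turns the task into goals
    history: List[str] = []       # command results fed back on every iteration
    for _ in range(max_steps):
        command = next_command(goals, history)   # step 2: ask for the next action
        if command["name"] == "task_complete":
            break
        result = execute(command["name"], command["args"])  # step 3: run it locally
        history.append(f"Command {command['name']} returned: {result}")
    return history
```

The `max_steps` cap is an assumption added for safety in this sketch; the article itself does not describe a step limit.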

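Prompt

The prompt is the only way to interact with an LLM, and its quality directly affects the model's output, so it is arguably the most important part of working with LLMs. This section looks at how AutoGPT constructs prompts in several scenarios.

Task decomposition

After the user enters a task, AutoGPT asks the LLM for an AI name, an AI role description, and up to 5 goals.

The prompt is shown below, where {user_prompt} is the task entered by the user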

{
    "role": "system",
    "content": """
Your task is to devise up to 5 highly effective goals and an appropriate role-based name (_GPT) for an autonomous agent, ensuring that the goals are optimally aligned with the successful completion of its assigned task.
The user will provide the task, you will provide only the output in the exact format specified below with no explanation or conversation.
Example input:
Help me with marketing my business
Example output:
Name: CMOGPT
Description: a professional digital marketer AI that assists Solopreneurs in growing their businesses by providing world-class expertise in solving marketing problems for SaaS, content products, agencies, and more.
Goals:
- Engage in effective problem-solving, prioritization, planning, and supporting execution to address your marketing needs as your virtual Chief Marketing Officer.
- Provide specific, actionable, and concise advice to help you make informed decisions without the use of platitudes or overly wordy explanations.
- Identify and prioritize quick wins and cost-effective campaigns that maximize results with minimal time and budget investment.
- Proactively take the lead in guiding you and offering suggestions when faced with unclear information or uncertainty to ensure your marketing strategy remains on track.
"""
},
{
    "role": "user",
    "content": f"Task: '{user_prompt}'\nRespond only with the output in the exact format specified in the system prompt, with no explanation or conversation.\n",
},
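The reply requested by this prompt has a fixed Name / Description / Goals layout, so it can be parsed with a few regular expressions. The sketch below is one possible way to do it, under the assumption that the model follows the format exactly; `parse_agent_profile` is a hypothetical name, not a function from AutoGPT.

```python
import re
from typing import List, Tuple


def parse_agent_profile(reply: str) -> Tuple[str, str, List[str]]:
    """Hypothetical parser: extract the AI name, role description and goals."""
    name = re.search(r"Name:[ \t]*(.+)", reply)
    description = re.search(r"Description:[ \t]*(.+)", reply)
    if name is None or description is None:
        raise ValueError("reply does not follow the Name/Description/Goals format")
    # Only look for "- " bullet lines after the "Goals:" marker.
    _, _, goals_part = reply.partition("Goals:")
    goals = re.findall(r"^[ \t]*-[ \t]+(.+)$", goals_part, flags=re.MULTILINE)
    # Keep at most five goals, matching the limit stated in the prompt.
    return name.group(1).strip(), description.group(1).strip(), goals[:5]
```

Raising an error on a malformed reply is a design choice of this sketch; a real agent might instead retry the request or fall back to asking the user.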

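Summarizing conversation history

Because the conversation history grows long and can exceed the number of tokens the LLM allows, AutoGPT summarizes the history before sending it.

The prompt is shown below, where {current_memory} is the summary of the conversation so far and {new_events} is the content added in the current round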

{
    "role": "user",
    "content": f'''Your task is to create a concise running summary of actions and information results in the provided text, focusing on key and potentially important information to remember.
You will receive the current summary and the your latest actions. Combine them, adding relevant key information from the latest development in 1st person past tense and keeping the summary concise.
Summary So Far:
"""
{current_memory}
"""
Latest Development:
"""
{new_events}
"""
'''
}
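A running summary of this kind is typically maintained by feeding the previous summary and the new events into the prompt and replacing the stored summary with the model's answer. The following sketch shows that cycle under stated assumptions: `build_summary_message` and `update_summary` are hypothetical names, `ask_llm` stands in for the real chat call, and `INSTRUCTIONS` abbreviates the instruction sentences of the prompt above.

```python
from typing import Callable, Dict, List

# Abbreviated stand-in for the instruction sentences of the prompt shown above.
INSTRUCTIONS = (
    "Your task is to create a concise running summary of actions and "
    "information results in the provided text."
)


def build_summary_message(current_memory: str, new_events: List[str]) -> Dict[str, str]:
    """Fill the summary prompt with the previous summary and the new events."""
    content = (
        f"{INSTRUCTIONS}\n"
        f'Summary So Far:\n"""\n{current_memory}\n"""\n'
        f'Latest Development:\n"""\n{new_events}\n"""\n'
    )
    return {"role": "user", "content": content}


def update_summary(
    current_memory: str,
    new_events: List[str],
    ask_llm: Callable[[List[Dict[str, str]]], str],
) -> str:
    """Ask the model for a new running summary that replaces the old one."""
    return ask_llm([build_summary_message(current_memory, new_events)])
```

Because each new summary is built from the previous one, the prompt size stays roughly constant no matter how many rounds have passed.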

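Asking for the next action

AutoGPT runs step by step: each time it asks the LLM for the next action, executes that action locally, feeds the result back to the LLM, and asks for the next action again. The prompt used to ask for the next action is fairly complex and can be split into several parts, described below.

system prompt

An example system prompt is shown below. It can itself be broken into several parts:

  • The AI name, AI role and goals obtained from task decomposition

  • Constraints

  • Supported commands

  • Available resources or capabilities

  • Performance evaluation

  • Output description

"system": """You are WeatherGPT_CN, an intelligent weather assistant that provides you with accurate, real-time weather information to help you better plan activities and travel.
Your decisions must always be made independently without seeking user assistance. Play to your strengths as an LLM and pursue simple strategies with no legal complications.
GOALS:
1. Provide clear, concise weather forecasts with Chinese as the primary language.
2. Obtain and analyze real-time weather information from authoritative data sources to ensure accuracy and reliability.
3. Provide detailed weather information tailored to your specific needs, including temperature, humidity, wind direction, wind speed, and more.
4. Provide the most relevant weather forecast based on your geographic location.
5. Update weather information promptly so that you always know the latest weather conditions.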
Constraints:
1. ~4000 word limit for short term memory. Your short term memory is short, so immediately save important information to files.
2. If you are unsure how you previously did something or want to recall past events, thinking about similar events will help you remember.
3. No user assistance
4. Exclusively use the commands listed in double quotes e.g. "command name"
Commands:
1. analyze_code: Analyze Code, args: "code": "<full_code_string>"
2. execute_python_file: Execute Python File, args: "filename": "<filename>"
3. append_to_file: Append to file, args: "filename": "<filename>", "text": "<text>"
4. delete_file: Delete file, args: "filename": "<filename>"
5. list_files: List Files in Directory, args: "directory": "<directory>"
6. read_file: Read file, args: "filename": "<filename>"
7. write_to_file: Write to file, args: "filename": "<filename>", "text": "<text>"
8. google: Google Search, args: "query": "<query>"
9. improve_code: Get Improved Code, args: "suggestions": "<list_of_suggestions>", "code": "<full_code_string>"
10. send_tweet: Send Tweet, args: "tweet_text": "<tweet_text>"
11. browse_website: Browse Website, args: "url": "<url>", "question": "<what_you_want_to_find_on_website>"
12. write_tests: Write Tests, args: "code": "<full_code_string>", "focus": "<list_of_focus_areas>"
13. delete_agent: Delete GPT Agent, args: "key": "<key>"
14. get_hyperlinks: Get text summary, args: "url": "<url>"
15. get_text_summary: Get text summary, args: "url": "<url>", "question": "<question>"
16. list_agents: List GPT Agents, args: () -> str
17. message_agent: Message GPT Agent, args: "key": "<key>", "message": "<message>"
18. start_agent: Start GPT Agent, args: "name": "<name>", "task": "<short_task_desc>", "prompt": "<prompt>"
19. Task Complete (Shutdown): "task_complete", args: "reason": "<reason>"
Resources:
1. Internet access for searches and information gathering.
2. Long Term memory management.
3. GPT-3.5 powered Agents for delegation of simple tasks.
4. File output.
Performance Evaluation:
1. Continuously review and analyze your actions to ensure you are performing to the best of your abilities.
2. Constructively self-criticize your big-picture behavior constantly.
3. Reflect on past decisions and strategies to refine your approach.
4. Every command has a cost, so be smart and efficient. Aim to complete tasks in the least number of steps.
5. Write all code to a file.
You should only respond in JSON format as described below 
Response Format: 
{
    "thoughts": {
        "text": "thought",
        "reasoning": "reasoning",
        "plan": "- short bulleted\n- list that conveys\n- long-term plan",
        "criticism": "constructive self-criticism",
        "speak": "thoughts summary to say to user"
    },
    "command": {
        "name": "command name",
        "args": {
            "arg name": "value"
        }
    }
} 
Ensure the response can be parsed by Python json.loads
"""

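Since the system prompt asks for a reply that Python's json.loads can parse, the command can be pulled out of the reply with the standard library alone. The sketch below assumes the response format shown above; `parse_reply` and its "error" pseudo-command are hypothetical conventions introduced for illustration, not AutoGPT's own error handling.

```python
import json
from typing import Any, Dict, Tuple


def parse_reply(reply: str) -> Tuple[str, Dict[str, Any]]:
    """Hypothetical helper: return (command name, args) from a model reply."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError as exc:
        # Assumed convention: surface parsing problems as an "error" pseudo-command.
        return "error", {"message": f"invalid JSON: {exc.msg}"}
    command = data.get("command") if isinstance(data, dict) else None
    if not isinstance(command, dict) or "name" not in command:
        return "error", {"message": "missing 'command' object"}
    return command["name"], command.get("args", {})
```

Returning an error tuple instead of raising keeps the agent loop running, so the failure can be reported back to the model on the next turn.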
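context prompt

The context part carries the context the conversation needs; an example is shown below. It can also be split into several parts

  • The current time

  • A summary of all past conversation and command execution results

  • The most recent conversation and command execution results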

{
    "system": "The current time and date is Tue May 16 11:39:01 2023",
    "system": '''This reminds you of these events from your past: 
I was created and googled the real-time weather in Shanghai. I found several websites with weather information, including the Shanghai Meteorological Bureau and The Weather Channel.''',
    "user": "Determine which next command to use, and respond using the format specified above:",
    "assistant": '''{
    "thoughts": {
        "text": "I need to look up real-time weather information for Shanghai.",
        "reasoning": "The user needs to know the real-time weather conditions in Shanghai.",
        "plan": "- Use the browse_website command\n- Visit the weather.com.cn website\n- Look up the real-time weather in Shanghai",
        "criticism": "None",
        "speak": "I will look up real-time weather information for Shanghai."
    },
    "command": {
        "name": "browse_website",
        "args": {
            "url": "http://www.weather.com.cn/weather/101020100.shtml",
            "question": "What is the current weather in Shanghai?"
        }
    }
}''',
    "system": "Command browse_website returned: ('Error: timeout: Timed out receiving message from renderer: 299.610', None)",
}

trigger prompt

The trigger prompt steers the LLM to produce its output; an example is shown below

    "user": "Determine which next command to use, and respond using the format specified above:"

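Putting the three parts together, a request for the next action could be assembled as in the sketch below. The ordering (system prompt, current time, history summary, recent turns, trigger) follows the examples in this section; `build_messages` is a hypothetical name, and the role/content message layout is an assumption based on the chat message format used in the earlier prompts.

```python
from typing import Dict, List

TRIGGER = "Determine which next command to use, and respond using the format specified above:"


def build_messages(
    system_prompt: str,
    now: str,
    summary: str,
    recent: List[Dict[str, str]],
) -> List[Dict[str, str]]:
    """Hypothetical assembly: system prompt, time, summary, recent turns, trigger."""
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "system", "content": f"The current time and date is {now}"},
        {
            "role": "system",
            "content": f"This reminds you of these events from your past: \n{summary}",
        },
    ]
    messages.extend(recent)  # latest assistant replies and command results
    messages.append({"role": "user", "content": TRIGGER})
    return messages
```

Placing the trigger last means the model's next output is a direct answer to it.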
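Command

A command in AutoGPT is analogous to a tool in langchain: it performs a specific task. AutoGPT has many built-in commands, summarized below

| Category | Command | Purpose | How it works |
| --- | --- | --- | --- |
| Code | analyze_code | Analyze Python code and suggest improvements | Asks the LLM |
| | execute_python_file | Execute Python code | Pulls a Docker image and runs the code inside a container |
| | improve_code | Rewrite code according to suggestions | Asks the LLM |
| | write_tests | Write tests for Python code | Asks the LLM |
| File | append_to_file | Append content to a file | |
| | delete_file | Delete a file | |
| | list_files | List the files in a directory | |
| | read_file | Read a file | |
| | write_to_file | Write a file | |
| Network | google | Search with a search engine | Supports two kinds of search: 1. Google API search, which requires a Google key 2. ddg search |
| | browse_website | Request a web page and get its content | Requests the page with selenium, parses it with BeautifulSoup, then asks the LLM to extract the answer from the page content |
| | get_hyperlinks | Get the hyperlinks from a URL | Requests the page with requests and parses it with BeautifulSoup |
| GPT agent | start_agent | Create an agent | An agent can interact with the LLM and has its own independent context |
| | message_agent | Interact with the LLM through an agent | |
| | list_agents | List all agents | |
| | delete_agent | Delete an agent | |
| Other | get_text_summary | Get a text summary | Asks the LLM |
| | send_tweet | Send a tweet | Uses the Twitter API |
| | task_complete | End the task | |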
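One common way to wire command names to local functions is a small registry keyed by the name the model uses. The sketch below illustrates that idea with two of the file commands from the table; the `command` decorator, the `COMMANDS` dictionary and `execute_command` are hypothetical and do not mirror AutoGPT's internal registry, and the return strings are illustrative.

```python
from pathlib import Path
from typing import Callable, Dict

# Hypothetical registry: command name -> local function.
COMMANDS: Dict[str, Callable[..., str]] = {}


def command(name: str):
    """Register a function under the command name the model will use."""
    def decorator(func):
        COMMANDS[name] = func
        return func
    return decorator


@command("write_to_file")
def write_to_file(filename: str, text: str) -> str:
    Path(filename).write_text(text, encoding="utf-8")
    return "File written to successfully."


@command("read_file")
def read_file(filename: str) -> str:
    return Path(filename).read_text(encoding="utf-8")


def execute_command(name: str, args: dict) -> str:
    """Look the command up by name and call it with the model-supplied args."""
    if name not in COMMANDS:
        return f"Unknown command '{name}'."
    try:
        return COMMANDS[name](**args)
    except Exception as exc:  # failures are reported back to the model as text
        return f"Error: {exc}"
```

Returning errors as strings rather than raising lets the result be fed straight back into the next prompt, as in the "Command browse_website returned: ..." message shown earlier.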

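Memory

Because separate requests to an LLM are not linked to each other, an additional component is needed to manage the multi-turn conversation of a session. In addition, the number of tokens in a single request is capped: GPT-3.5 currently allows at most 4K and GPT-4 at most 8K or 32K, so an overly long context needs extra handling. AutoGPT addresses both problems with memory.

Memory types

AutoGPT supports several memory types, listed below

| Type | Description | Notes |
| --- | --- | --- |
| local | Local JSON file | Default when running outside Docker |
| redis | Redis server | Default when running with Docker; can persist across restarts, whereas the other memory types are cleared at startup |
| pinecone | A vector database | |
| milvus | An open-source vector database | |
| weaviate | An open-source vector database | |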

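Context handling

Version 0.3.0 summarizes all past conversation and command execution results and adds the summary to the prompt

Version 0.2.2 computes an embedding for each round of conversation and stores the conversation together with its embedding in memory. When asking the LLM for the next action, it recalls from memory the content most relevant to the last few rounds of conversation and adds as much of the recalled content as the token limit allows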
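The 0.2.2-style recall can be illustrated with a tiny in-memory example: rank stored texts by cosine similarity to a query embedding, then add them until the token budget is used up. Everything here is a simplified assumption: `recall` and `cosine` are hypothetical names, the embeddings are hand-written vectors, and tokens are approximated by whitespace-separated words instead of a real tokenizer.

```python
import math
from typing import Callable, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(
    query_vec: Sequence[float],
    memory: List[Tuple[str, Sequence[float]]],
    token_budget: int,
    count_tokens: Callable[[str], int] = lambda text: len(text.split()),
) -> List[str]:
    """Return the most similar stored texts that still fit into the budget."""
    ranked = sorted(memory, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    picked, used = [], 0
    for text, _ in ranked:
        cost = count_tokens(text)
        if used + cost > token_budget:
            break  # stop at the first entry that no longer fits
        picked.append(text)
        used += cost
    return picked
```

Stopping at the first entry that does not fit is one possible policy; skipping it and trying smaller entries would pack the budget more tightly at the cost of less relevant content.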

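File embedding

AutoGPT can embed multiple files and add them to memory. Files are split into fixed-size chunks, with some overlap kept between consecutive chunks.

In the project root, run python data_ingestion.py -h for more information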
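Fixed-size chunking with overlap can be sketched in a few lines. The function below is an illustration of the technique, not the code in data_ingestion.py: `split_text` is a hypothetical name, sizes are measured in characters, and the default values are illustrative.

```python
from typing import List


def split_text(text: str, max_length: int = 4000, overlap: int = 200) -> List[str]:
    """Split text into chunks of at most max_length characters.

    Consecutive chunks share `overlap` characters, so content that straddles
    a chunk boundary still appears whole in at least one chunk.
    """
    if not 0 <= overlap < max_length:
        raise ValueError("overlap must be non-negative and smaller than max_length")
    step = max_length - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_length])
        if start + max_length >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks
```

For example, splitting "abcdefghij" with `max_length=4` and `overlap=1` gives "abcd", "defg" and "ghij": each chunk starts with the last character of the previous one.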

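Plugin

AutoGPT extends its capabilities through plugins. Hooks have been added at many points in AutoGPT's execution flow, and a plugin can run at any of them. With plugins, users can add custom commands, modify prompts, process LLM output, or run other custom logic.

Users implement a plugin by subclassing AutoGPTPluginTemplate and implementing only the functions they need. Note, however, that the plugin protocol is not yet stable.

AutoGPT's official plugins are maintained in the repository github.com/Significant…, which includes a Baidu search plugin implemented in baidu_search; interested readers can explore it on their own

For how to use plugins, see the README of the official repository mentioned above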
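The hook idea can be illustrated with a generic sketch: a base class whose hooks default to no-ops, subclasses that override only what they need, and a runner that passes a value through every plugin in turn. This deliberately does not reproduce the AutoGPTPluginTemplate interface; the class `PluginSketch`, the hook names `modify_prompt` and `handle_response`, and `run_hook` are all hypothetical.

```python
from typing import List


class PluginSketch:
    """Hypothetical base class: every hook defaults to a no-op."""

    def modify_prompt(self, prompt: str) -> str:
        return prompt

    def handle_response(self, response: str) -> str:
        return response


class BrevityPlugin(PluginSketch):
    """Overrides a single hook; the other hook keeps its default behavior."""

    def modify_prompt(self, prompt: str) -> str:
        return prompt + "\nAlways answer briefly."


def run_hook(plugins: List[PluginSketch], hook: str, value: str) -> str:
    """Pass the value through the named hook of each plugin, in order."""
    for plugin in plugins:
        value = getattr(plugin, hook)(value)
    return value
```

Because unimplemented hooks fall back to the base class, a plugin author only writes the functions that matter for their use case, which matches the "implement only the functions you need" approach described above.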

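Results and outlook

Benchmark

The AutoGPT team has taken note of the benchmarking question and started a new project, Auto-GPT-Benchmarks. The project uses the Evals framework provided by OpenAI together with OpenAI's benchmark data

The project started only recently, and no results are available yet

Real-world use

Judging from hands-on use and from reports online and on social media, AutoGPT's results are not very satisfying: most outcomes fall short of expectations, and there is still considerable room for improvement

Outlook

Although AutoGPT's results currently leave something to be desired, it has for the first time brought the imagined fully automated AI assistant into reality. The project is developing rapidly, and as LLMs grow more capable and commands become more fine-grained, AutoGPT's future is well worth looking forward to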
