What GPT Cache can do

Two questions:

  • question1 = "what do you think about chatgpt"
  • question2 = "what do you feel like chatgpt"

The first question is sent to ChatGPT for an answer, which took 12 s in testing. The second hits the cache and takes only 0.06 s.

GPT Cache: 1 Introduction

Chinese questions:

For this test, remember to switch the embedding model:

towhee = Towhee(model="uer/albert-base-chinese-cluecorpussmall")
  • question1 = "做博后有什么优势?" (What are the advantages of doing a postdoc?)
  • question2 = "为什么有人会挑选做博后?" (Why do some people choose to do a postdoc?)

The first question is sent to ChatGPT for an answer, which took 31 s in testing. The second hits the cache and takes only 0.14 s.

If you find it useful, a star is welcome. Project address: github.com/zilliztech/…

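What is GPT Cache?

Large language models (LLMs) are a promising and transformative technology that has developed rapidly in recent years. These models can generate natural-language text and have many applications, including chatbots, language translation, and creative writing. However, as the models grow in size, so do the cost and performance requirements of using them. This poses a significant challenge for building applications such as ChatGPT on top of large models.

To address this, we developed GPT Cache, a project focused on caching language model responses, also known as a semantic cache. The system offers two main benefits:

  1. Fast responses to user requests: the cache answers faster than large-model inference, which lowers latency and gets responses to users more quickly.
  2. Lower service cost: most ChatGPT services currently charge by usage (the number of requests or tokens). When a user request hits the cache, fewer requests reach the service and the cost goes down.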
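
To make the two benefits concrete, here is a back-of-the-envelope sketch. It assumes the latencies measured at the top of this article (about 12 s for a miss and 0.06 s for a hit) and ignores the small lookup cost paid on a miss. The hit rates and the function names `expected_latency` and `billed_requests` are illustrative assumptions, not part of GPT Cache.

```python
def expected_latency(hit_rate, hit_s=0.06, miss_s=12.0):
    """Average response time in seconds for a cache hit rate in [0, 1]."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * hit_s + (1.0 - hit_rate) * miss_s


def billed_requests(total_requests, hit_rate):
    """Requests that still reach the paid API once cache hits are removed."""
    return round(total_requests * (1.0 - hit_rate))


if __name__ == "__main__":
    # Illustrative hit rates only; real values depend on the workload.
    for rate in (0.0, 0.3, 0.6, 0.9):
        print(f"hit rate {rate:.0%}: "
              f"{expected_latency(rate):.2f}s average, "
              f"{billed_requests(1000, rate)} of 1000 requests billed")
```

Both quantities fall linearly with the hit rate, so the benefit depends on how often questions actually repeat in a given workload.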

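Why does GPT Cache help?

I think it is necessary, for these reasons:

  • Locality is everywhere. Like traditional application systems, AIGC applications also face hot topics. For example, ChatGPT itself may be a hot topic among programmers.
  • In domain-specific SaaS services, users tend to ask questions within one particular domain, which gives temporal and spatial locality.
  • With vector similarity search, similar relationships between questions and answers can be found at relatively low cost.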
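
The last point can be illustrated with a toy sketch of a similarity-based lookup. This is not GPT Cache's implementation: the bag-of-words `embed` function stands in for a real embedding model, the linear scan stands in for a vector index, and the class name `ToySemanticCache` and the 0.6 threshold are assumptions chosen for the example.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToySemanticCache:
    """Returns a cached answer when a stored question is similar enough."""

    def __init__(self, threshold=0.6):
        self.threshold = threshold
        self.entries = []  # list of (embedding, answer) pairs

    def put(self, question, answer):
        self.entries.append((embed(question), answer))

    def get(self, question):
        """Return the answer of the most similar cached question, or None."""
        query = embed(question)
        best_score, best_answer = 0.0, None
        for vector, answer in self.entries:  # linear scan; real systems use an index
            score = cosine_similarity(query, vector)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None


if __name__ == "__main__":
    toy = ToySemanticCache()
    toy.put("what do you think about chatgpt", "cached answer")
    print(toy.get("what do you feel like chatgpt"))  # similar wording: hit
    print(toy.get("how is the weather today"))       # unrelated: miss
```

The threshold trades hit rate against the risk of returning an answer to a question that only looks similar, so it needs tuning per application.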

Quick start

Project address: github.com/zilliztech/…

Install with pip

pip install gptcache

If you only need exact-match caching, that is, caching for two identical requests, the cache can be integrated in just two steps.

  1. Initialize the cache
from gptcache.core import cache
cache.init()
# If you set the API key with `openai.api_key = xxx`, replace that with the call below.
# It reads the OPENAI_API_KEY environment variable and sets the key, which keeps the key out of the code.
cache.set_openai_key()
  2. Replace the original openai package
from gptcache.adapter import openai
# The openai request itself needs no changes
answer = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "foo"}
    ],
)
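
For intuition about what exact-match caching means, here is a minimal offline sketch. It is not how GPT Cache works internally: the class `ExactMatchCache`, the hashing scheme, and the stand-in function `fake_llm` are all hypothetical and exist only so the example runs without an API key.

```python
import hashlib
import json


class ExactMatchCache:
    """Toy exact-match cache keyed by a hash of the full request."""

    def __init__(self, llm_call):
        self.llm_call = llm_call  # function that performs the real request
        self.store = {}
        self.hits = 0

    @staticmethod
    def _key(model, messages):
        # Serialize deterministically so identical requests map to the same key
        payload = json.dumps({"model": model, "messages": messages},
                             sort_keys=True, ensure_ascii=False)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def create(self, model, messages):
        key = self._key(model, messages)
        if key in self.store:
            self.hits += 1
            return self.store[key]
        answer = self.llm_call(model=model, messages=messages)
        self.store[key] = answer
        return answer


def fake_llm(model, messages):
    """Hypothetical stand-in for the real API call, so the sketch runs offline."""
    content = "echo: " + messages[-1]["content"]
    return {"choices": [{"message": {"role": "assistant", "content": content}}]}


if __name__ == "__main__":
    exact = ExactMatchCache(fake_llm)
    request = [{"role": "user", "content": "foo"}]
    exact.create("gpt-3.5-turbo", request)  # miss: calls fake_llm
    exact.create("gpt-3.5-turbo", request)  # hit: served from the dict
    print("cache hits:", exact.hits)
```

Any change to the model name or to any message produces a different key, which is why exact matching only helps with truly identical requests.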

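To quickly try the vector-similarity cache locally, use the reference code below. The first run takes a while, because it has to download the model environment, the model data, and the dependencies.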

import json
import os
import time

from gptcache.cache.factory import get_si_data_manager
from gptcache.core import cache
from gptcache.encoder import Towhee
from gptcache.ranker.simple import pair_evaluation
from gptcache.adapter import openai


def run():
    # Embedding model; use the commented line instead for Chinese questions
    towhee = Towhee()
    # towhee = Towhee(model="uer/albert-base-chinese-cluecorpussmall")

    # Replace the placeholder with your own key
    os.environ["OPENAI_API_KEY"] = "API KEY"
    cache.set_openai_key()

    # sqlite and faiss as the storage backends, sized to the embedding dimension
    data_manager = get_si_data_manager("sqlite", "faiss",
                                       dimension=towhee.dimension(), max_size=2000)
    cache.init(embedding_func=towhee.to_embeddings,
               data_manager=data_manager,
               evaluation_func=pair_evaluation,
               similarity_threshold=1,
               similarity_positive=False)

    question1 = "what do you think about chatgpt"
    question2 = "what do you feel like chatgpt"
    # question1 = "做博后有什么优势?"
    # question2 = "为什么有人会挑选做博后?"

    # First request: nothing is cached yet, so it goes to ChatGPT
    start_time = time.time()
    answer = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "user", "content": question1}
        ],
    )
    end_time = time.time()
    print("time consuming: {:.2f}s".format(end_time - start_time))
    print(json.dumps(answer))

    # Second request: a similar question, expected to hit the cache
    start_time = time.time()
    answer = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "user", "content": question2}
        ],
    )
    end_time = time.time()
    print("time consuming: {:.2f}s".format(end_time - start_time))
    print(answer)


if __name__ == '__main__':
    run()

Measured output:

time consuming: 12.83s
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "As an AI language model, I do not have personal opinions or biases towards websites or specific platforms. However, I can say that ChatGPT is a platform designed for interactive communication where users can connect with others and have conversations on various topics such as lifestyle, relationships, health, and personal growth among others. It could be a useful tool for individuals seeking support or advice from peers, or just for casual conversations to pass the time.",
        "role": "assistant"
      }
    }
  ],
  "created": 1680272771,
  "id": "chatcmpl-709zfmJyYDEl4dfTFfmAlfq7luTjM",
  "model": "gpt-3.5-turbo-0301",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 87,
    "prompt_tokens": 16,
    "total_tokens": 103
  }
}
time consuming: 0.06s
{'gpt_cache': True, 'choices': [{'message': {'role': 'assistant', 'content': 'As an AI language model, I do not have personal opinions or biases towards websites or specific platforms. However, I can say that ChatGPT is a platform designed for interactive communication where users can connect with others and have conversations on various topics such as lifestyle, relationships, health, and personal growth among others. It could be a useful tool for individuals seeking support or advice from peers, or just for casual conversations to pass the time.'}, 'finish_reason': 'stop', 'index': 0}]}

As the output shows, the first request took about 12 s, while the second, similar question hit the cache and took only 0.06 s.
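
The cached response in the output above carries a `'gpt_cache': True` entry, which the first response lacks. Assuming the response behaves like a dict, as the printed output suggests, a small helper can tell the two cases apart. The helper names `is_cache_hit` and `answer_text` are hypothetical and not part of the library.

```python
def is_cache_hit(answer):
    """Return True if the response carries the gpt_cache marker."""
    return bool(answer.get("gpt_cache", False))


def answer_text(answer):
    """Extract the assistant's text from a chat-completion style response."""
    return answer["choices"][0]["message"]["content"]


if __name__ == "__main__":
    # Trimmed-down responses in the same shape as the output shown above
    fresh = {"choices": [{"message": {"role": "assistant", "content": "hi"}}]}
    cached = {"gpt_cache": True,
              "choices": [{"message": {"role": "assistant", "content": "hi"}}]}
    print(is_cache_hit(fresh), is_cache_hit(cached))
    print(answer_text(cached))
```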

Project address: github.com/zilliztech/…

Next section: GPT Cache: 2 How to configure cache storage