I. Fine-tuning ChatGLM-6B with Prefix-tuning/LoRA to generate psychological-counseling Q&A

曼曼心理咨询 (ManMan Psychological Counseling), fine-tuned from ChatGLM-6B

1. Background

In modern society people face many pressures and challenges, such as buying a home, marriage, school admission, exams, and job performance, so physical and mental health has become an increasingly important concern. With the rapid development of artificial intelligence, we can use this technology to help people improve their quality of life and mental well-being, reduce conflict, prevent suicide, and build a positive society and life.

2. Goal

By combining AI with principles of psychology, the project offers a personalized mental-health assistant that helps users manage their emotions, grow personally, and maintain inner balance anytime and anywhere, without worrying about privacy, while greatly reducing the cost of counseling.

3. Design

  • User profile: as the user interacts with the large language model through text, images, and video, the system learns the user's interests, needs, and personality traits and builds a personalized profile.
  • Emotion analysis and management: built-in sentiment-analysis algorithms use the user's text, images, and video to help identify emotional changes and their causes, and offer coping strategies and suggestions in forms including (but not limited to) text, images, audio, video, and games.
  • Self-exploration and growth: combining psychological knowledge with the user's personal development goals, the system provides customized growth plans and feedback that guide users to discover their strengths and areas for improvement.
  • Calm and relaxation: meditation, breathing exercises, soft music, and other resources help users relax, build focus, and cultivate a sense of inner calm.
  • Social support and sharing: a user community lets users support each other, share experiences, and offer encouragement, strengthening social ties and easing loneliness.

4. Final result

(Screenshot of the fine-tuned 曼曼心理咨询 demo based on ChatGLM-6B)

2023-07-14: added LoRA fine-tuning.

II. Dataset

1. ChatGLM-6B dataset format

The default format is one JSON object per line: {"content": "the input text", "summary": "the target text, i.e. the label"}

content | summary
“从现在起,容许自己的事就尽力去做到,” | “容许自己要去的当地就尽力去抵达”
“若你困于无风之地” | “,我将奏响高天之歌”
“世事易变” | “,匪石弗转”
“若你困于无风之地” | “我将为你奏响高天之歌”
“漩涡无法击碎的磐岩” | “,也究竟会在时光的冲刷下磨损”
“从天堂到阴间,” | “我路过了人世”
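
The snippet below is a minimal sketch (the file name sample.jsonl is only illustrative) showing how one record in this content/summary format can be written and read back:

import json
# One fine-tuning record: "content" is the model input, "summary" is the target text (label).
sample = {"content": "若你困于无风之地", "summary": "我将奏响高天之歌"}
# Write it as one JSON object per line, the same layout used for train.json/dev.json below.
with open("sample.jsonl", "w", encoding="utf-8") as f:
    f.write(json.dumps(sample, ensure_ascii=False) + "\n")
# Read it back and check that exactly the two expected keys are present.
with open("sample.jsonl", encoding="utf-8") as f:
    for line in f:
        record = json.loads(line)
        assert set(record) == {"content", "summary"}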

2. The source data needs to be converted into this format

!unzip data/data231239/smileData_v3.zip -d data/data231239/
Archive:  data/data231239/smileData_v3.zip
  inflating: data/data231239/smileData_v3.json  
!head data/data231239/smileData_v3.json
[
  {
    "instruction": "假定你是友善的心思辅导师,请依据咨询者的问题答复实在有用的答复",
    "input": "求助者:最近我遇到了一个问题。大姑姐离婚,不要女儿,所以我收养了她女儿。可是现在我儿子和姑姐女儿的联系很严重,她自己也不做家务,我该怎么办?支撑者:",
    "output": "看来你很烦恼啊。那你先了解一下孩子的定见吧。问问她喜不喜欢在你家,或许她想不想跟她的父亲一起日子。"
  },
  {
    "instruction": "假定你是友善的心思辅导师,请依据咨询者的问题答复实在有用的答复",
    "input": "求助者:最近我遇到了一个问题。大姑姐离婚,不要女儿,所以我收养了她女儿。可是现在我儿子和姑姐女儿的联系很严重,她自己也不做家务,我该怎么办?支撑者:看来你很烦恼啊。那你先了解一下孩子的定见吧。问问她喜不喜欢在你家,或许她想不想跟她的父亲一起日子。求助者:我没有问过她,可是听我儿子说,她不喜欢在咱们家住,并且还总抱怨。支撑者:",
    "output": "那你也不能就这么轻易地抛弃她啊。究竟离婚关于孩子来说也是一种负面影响。可是这也不能成为她固执变坏的资本。要看她的实在情况,然后试着协助她。"

3. Data format conversion

The full dataset is large, so only 1,000 samples are selected to keep training fast.

!mkdir soul
# Training set
import json
# Read the source JSON file
with open('data/data231239/smileData_v3.json', 'r', encoding='utf-8') as f:
    data = json.load(f)
# Take the first 1000 samples and write them as one JSON object per line
with open('soul/train.json', 'w', encoding="utf-8") as f:
    for item in data[:1000]:
        temp = dict()
        temp['content'] = item['instruction'] + '。' + item['input']
        temp['summary'] = item['output']
        json.dump(temp, f, ensure_ascii=False)
        f.write('\n')
# Dev (validation) set
import json
# Read the source JSON file
with open('data/data231239/smileData_v3.json', 'r', encoding='utf-8') as f:
    data = json.load(f)
# Take the next 200 samples as the dev set
with open('soul/dev.json', 'w', encoding="utf-8") as f:
    for item in data[1000:1200]:
        temp = dict()
        temp['content'] = item['instruction'] + '。' + item['input']
        temp['summary'] = item['output']
        json.dump(temp, f, ensure_ascii=False)
        f.write('\n')
!head -n2 soul/train.json
{"content": "假定你是友善的心思辅导师,请依据咨询者的问题答复实在有用的答复。求助者:最近我遇到了一个问题。大姑姐离婚,不要女儿,所以我收养了她女儿。可是现在我儿子和姑姐女儿的联系很严重,她自己也不做家务,我该怎么办?支撑者:", "summary": "看来你很烦恼啊。那你先了解一下孩子的定见吧。问问她喜不喜欢在你家,或许她想不想跟她的父亲一起日子。"}
{"content": "假定你是友善的心思辅导师,请依据咨询者的问题答复实在有用的答复。求助者:最近我遇到了一个问题。大姑姐离婚,不要女儿,所以我收养了她女儿。可是现在我儿子和姑姐女儿的联系很严重,她自己也不做家务,我该怎么办?支撑者:看来你很烦恼啊。那你先了解一下孩子的定见吧。问问她喜不喜欢在你家,或许她想不想跟她的父亲一起日子。求助者:我没有问过她,可是听我儿子说,她不喜欢在咱们家住,并且还总抱怨。支撑者:", "summary": "那你也不能就这么轻易地抛弃她啊。究竟离婚关于孩子来说也是一种负面影响。可是这也不能成为她固执变坏的资本。要看她的实在情况,然后试着协助她。"}

III. Environment setup

Notes:

  • Use %%capture to hide the large amount of output produced during setup
  • Use the develop (dev) build of paddlepaddle-gpu
  • Use the latest version of PaddleNLP
%%capture
# Upgrade pip first, otherwise the installation below can easily fail
!pip install --upgrade pip
%%capture
!python -m pip install paddlepaddle-gpu==0.0.0.post112 -f https://www.paddlepaddle.org.cn/whl/linux/gpu/develop.html
# https://gitee.com/livingbody/PaddleNLP is my up-to-date mirror of PaddleNLP; the official gitee mirror is somewhat outdated
!git clone https://gitee.com/livingbody/PaddleNLP -b develop --depth=1
/home/aistudio
Cloning into 'PaddleNLP'...
remote: Enumerating objects: 6435, done.
remote: Counting objects: 100% (6435/6435), done.
remote: Compressing objects: 100% (4450/4450), done.
remote: Total 6435 (delta 2517), reused 3737 (delta 1674), pack-reused 0
Receiving objects: 100% (6435/6435), 24.00 MiB | 5.01 MiB/s, done.
Resolving deltas: 100% (2517/2517), done.
Checking connectivity... done.
%%capture
!pip install -e PaddleNLP/
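
As an optional sanity check (just a sketch, assuming the installs above succeeded), the versions and GPU availability can be verified before continuing:

# Optional sanity check after the installs above.
import paddle
import paddlenlp
print("paddle:", paddle.__version__)          # the develop build reports 0.0.0
print("paddlenlp:", paddlenlp.__version__)
print("compiled with CUDA:", paddle.device.is_compiled_with_cuda())
paddle.utils.run_check()                      # PaddlePaddle's built-in installation self-test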

IV. Creating the ChatGLM-6B model

1. Model loading

  • Load from a local copy
  • Download automatically and then load

Mounting the dataset and loading the weights locally is recommended; it is much faster and you will not have to wait long!

import json
import paddle
from paddle.distributed import fleet
from paddlenlp.peft import LoRAConfig, LoRAModel, PrefixConfig, PrefixModelForCausalLM
from paddlenlp.peft.prefix import (
    chatglm_pad_attention_mask,
    chatglm_postprocess_past_key_value,
)
from paddlenlp.transformers import ChatGLMConfig, ChatGLMForCausalLM, ChatGLMTokenizer
# Load the original ChatGLM-6B model
# model_name_or_path = 'THUDM/chatglm-6b' # this path downloads the model automatically and then loads it
model_name_or_path = 'THUDM/chatglm-6b'
tokenizer = ChatGLMTokenizer.from_pretrained(model_name_or_path)
config = ChatGLMConfig.from_pretrained(model_name_or_path)
paddle.set_default_dtype(config.paddle_dtype)
model = ChatGLMForCausalLM.from_pretrained(
    model_name_or_path,
    tensor_parallel_degree=0,
    tensor_parallel_rank=0,
    load_state_as_np=True,
    dtype=config.paddle_dtype,
)
model.eval()
[2023-07-21 10:30:01,867] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/THUDM/chatglm-6b/ice_text.model
[2023-07-21 10:30:01,869] [    INFO] - Downloading https://bj.bcebos.com/paddlenlp/models/community/THUDM/chatglm-6b/added_tokens.json and saved to /home/aistudio/.paddlenlp/models/THUDM/chatglm-6b
[2023-07-21 10:30:01,948] [ WARNING] - file<https://bj.bcebos.com/paddlenlp/models/community/THUDM/chatglm-6b/added_tokens.json> not exist
[2023-07-21 10:30:01,952] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/THUDM/chatglm-6b/special_tokens_map.json
[2023-07-21 10:30:01,955] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/THUDM/chatglm-6b/tokenizer_config.json
[2023-07-21 10:30:02,358] [    INFO] - Found /home/aistudio/.paddlenlp/models/THUDM/chatglm-6b/config.json
[2023-07-21 10:30:02,362] [    INFO] - Loading configuration file /home/aistudio/.paddlenlp/models/THUDM/chatglm-6b/config.json
[2023-07-21 10:30:02,366] [ WARNING] - `load_state_as_np` is deprecated,  please delete it!
[2023-07-21 10:30:02,423] [    INFO] - Found /home/aistudio/.paddlenlp/models/THUDM/chatglm-6b/config.json
[2023-07-21 10:30:02,427] [    INFO] - Loading configuration file /home/aistudio/.paddlenlp/models/THUDM/chatglm-6b/config.json
[2023-07-21 10:30:02,430] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/THUDM/chatglm-6b/model_state.pdparams
[2023-07-21 10:30:02,433] [    INFO] - loading weights file model_state.pdparams from cache at /home/aistudio/.paddlenlp/models/THUDM/chatglm-6b/model_state.pdparams
[2023-07-21 10:30:24,093] [    INFO] - Loaded weights file from disk, setting weights to model.
W0721 10:30:24.098392 16136 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2
W0721 10:30:24.102008 16136 gpu_resources.cc:149] device: 0, cuDNN Version: 8.2.
[2023-07-21 10:30:37,643] [ WARNING] - Some weights of the model checkpoint at THUDM/chatglm-6b were not used when initializing ChatGLMForCausalLM: ['transformer.rotary_emb.inv_freq']
- This IS expected if you are initializing ChatGLMForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing ChatGLMForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
[2023-07-21 10:30:37,646] [ WARNING] - Some weights of ChatGLMForCausalLM were not initialized from the model checkpoint at THUDM/chatglm-6b and are newly initialized: ['transformer.rotary_embeddings.inv_freq', 'lm_head.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

2. Dialogue inference

This part follows a ChatGLM-6B usage walkthrough that covers the full installation procedure and wraps the calling code and a graphical interface; only the single-turn dialogue function is used here. For other needs, such as multi-turn dialogue or a GUI, see the linked write-up (an illustrative multi-turn wrapper is also sketched right after the function below).

def glm_single_QA(model,tokenizer,next_inputs,input_length,output_length):
    # Convert the input into model tensors
    inputs = tokenizer(
        next_inputs,
        return_tensors="np",
        padding=True,
        max_length=input_length,
        truncation=True,
        truncation_side="left",
    )
    input_map = {}
    for key in inputs:
        input_map[key] = paddle.to_tensor(inputs[key])
    # Generate the result
    infer_result = model.generate(
        **input_map,
        decode_strategy="sampling",
        top_k=1,
        # top_p =5,
        max_length=output_length,
        use_cache=True,
        use_fast=True,
        use_fp16_decoding=True,
        repetition_penalty=1,
        temperature = 0.95,
        length_penalty=1,
    )[0]
    # Decode the result
    output = ''
    result = []
    for x in infer_result.tolist():
        res = tokenizer.decode(x, skip_special_tokens=True)
        res = res.strip("\n")
        result.append(res)
        output = output + res
    return output
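
glm_single_QA handles a single turn only. For multi-turn dialogue, a minimal sketch (not part of the original code; it assumes the common ChatGLM-style "[Round n]" prompt layout) could wrap it like this:

# Minimal multi-turn wrapper around glm_single_QA (illustrative sketch).
# Earlier (question, answer) pairs are folded back into the prompt so the
# model sees the conversation history on every call.
def glm_multi_QA(model, tokenizer, question, history=None,
                 input_length=2048, output_length=2048):
    history = history or []
    prompt = ""
    for i, (q, a) in enumerate(history):
        prompt += "[Round {}]\n问：{}\n答：{}\n".format(i, q, a)
    prompt += "[Round {}]\n问：{}\n答：".format(len(history), question)
    answer = glm_single_QA(model, tokenizer, prompt, input_length, output_length)
    history.append((question, answer))
    return answer, history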

3. ChatGLM-6B capability before fine-tuning

Q_motif = "假定你是友善的心思辅导师,请依据咨询者的问题答复实在有用的答复。求助者:最近我遇到了一个问题。大姑姐离婚,不要女儿,所以我收养了她女儿。可是现在我儿子和姑姐女儿的联系很严重,她自己也不做家务,我该怎么办?支撑者:"
print(Q_motif)
result=glm_single_QA(model,tokenizer,Q_motif,2048,2048)
print("A:"+result)
假定你是友善的心思辅导师,请依据咨询者的问题答复实在有用的答复。求助者:最近我遇到了一个问题。大姑姐离婚,不要女儿,所以我收养了她女儿。可是现在我儿子和姑姐女儿的联系很严重,她自己也不做家务,我该怎么办?支撑者:
A:首要,我能够了解您所面对的应战和困惑。收养一个孩子是一个艰难的决议,或许会对家庭发生深远的影响。
关于您儿子和姑姐女儿的联系,您能够测验与他们树立杰出的交流和互动。您能够试着与他们共享您的主意和感触,以及您对收养孩子的观点。一起,您也能够测验与他们一起讨论怎么改进他们之间的联系。您还能够寻求专业心思咨询师或家庭治疗师的协助,以协助您更好地处理这个问题。
关于您大姑姐的女儿,您能够测验与她树立杰出的联系。您能够试着让她感触到您对她的关心和支撑,并鼓舞她积极参加家庭活动和家务。一起,您也能够测验与她共享您对收养孩子的观点,以及您期望她怎么参加孩子的生长和开展。
最终,我主张您寻求专业心思咨询师或家庭治疗师的协助。他们能够协助您更好地处理您所面对的问题,并供给专业的主张和支撑。

V. Fine-tuning the large model

Two fine-tuning approaches are shown below, prefix-tuning and LoRA. Pick one of them together with the corresponding model-instantiation code; running both will raise an error.

1. Fine-tuning ChatGLM-6B with prefix-tuning

  • To run your own task, change the argument after --task_name_or_path to the directory that contains your dataset.

  • If fine-tuning fails with an out-of-memory error, set --per_device_train_batch_size and --per_device_eval_batch_size to 1.

  • The training script comes from gitee.com/paddlepaddl…

  • Make sure the code versions are consistent, otherwise you will run into all kinds of errors, for example:

(screenshots of version-mismatch errors)
# Create the directory for saving the fine-tuned model
!mkdir -p soul_bak/chatglm-6b
# Confirm that the develop build of paddlepaddle-gpu is in use
!pip list|grep paddlepaddle
paddlepaddle-gpu               0.0.0.post112
!python work/finetune_generation.py \
    --output_dir soul_bak/chatglm-6b \
    --per_device_train_batch_size 2 \
    --per_device_eval_batch_size 2 \
    --gradient_accumulation_steps 1 \
    --model_name_or_path data/data217141 \
    --task_name_or_path soul \
    --num_train_epochs 2 \
    --learning_rate 3e-2 \
    --warmup_ratio 0.03 \
    --logging_steps 250 \
    --eval_steps 500 \
    --save_steps 2000 \
    --src_length 128 \
    --tgt_length 512 \
    --fp16 \
    --fp16_opt_level O2 \
    --recompute True \
    --do_train \
    --do_eval \
    --disable_tqdm True \
    --metric_for_best_model accuracy \
    --load_best_model_at_end True \
    --do_generation False \
    --prefix_tuning True \
    --save_total_limit 1 
[2023-07-21 09:14:06,561] [    INFO] dygraph_sharding_optimizer.py:27 - g_shard_use_reduce 0
[2023-07-21 09:14:06,561] [    INFO] dygraph_sharding_optimizer.py:29 - g_shard_norm_align_dp 1
[2023-07-21 09:14:06,562] [    INFO] hybrid_parallel_optimizer.py:43 - g_shard_norm_align_dp 1
[2023-07-21 09:14:06,570] [    INFO] pipeline_parallel.py:48 - g_shard_use_reduce 0
[2023-07-21 09:14:08,233] [ WARNING] - evaluation_strategy reset to IntervalStrategy.STEPS for do_eval is True. you can also set evaluation_strategy='epoch'.
[2023-07-21 09:14:08,233] [    INFO] - The default value for the training argument `--report_to` will change in v5 (from all installed integrations to none). In v5, you will need to use `--report_to all` to get the same behavior as now. You should start updating your code and make this info disappear :-).
[2023-07-21 09:14:08,233] [    INFO] - ============================================================
[2023-07-21 09:14:08,233] [    INFO] -      Model Configuration Arguments      
[2023-07-21 09:14:08,233] [    INFO] - paddle commit id              : fa084e5e15b951900e3d1f0ea12262305cdebe30
[2023-07-21 09:14:08,233] [    INFO] - do_generation                 : False
[2023-07-21 09:14:08,233] [    INFO] - lora                          : False
[2023-07-21 09:14:08,233] [    INFO] - model_name_or_path            : data/data217141
[2023-07-21 09:14:08,233] [    INFO] - prefix_tuning                 : True
[2023-07-21 09:14:08,234] [    INFO] - 
[2023-07-21 09:14:08,234] [    INFO] - ============================================================
[2023-07-21 09:14:08,234] [    INFO] -       Data Configuration Arguments      
[2023-07-21 09:14:08,234] [    INFO] - paddle commit id              : fa084e5e15b951900e3d1f0ea12262305cdebe30
[2023-07-21 09:14:08,234] [    INFO] - generate_num                  : 100
[2023-07-21 09:14:08,234] [    INFO] - num_beams                     : 5
[2023-07-21 09:14:08,234] [    INFO] - src_length                    : 128
[2023-07-21 09:14:08,234] [    INFO] - task_name_or_path             : soul
[2023-07-21 09:14:08,234] [    INFO] - tgt_length                    : 512
[2023-07-21 09:14:08,234] [    INFO] - 
[2023-07-21 09:14:08,234] [ WARNING] - Process rank: -1, device: gpu, world_size: 1, distributed training: False, 16-bits training: True
[2023-07-21 09:14:08,235] [    INFO] - loading configuration file data/data217141/config.json
[2023-07-21 09:14:08,236] [    INFO] - Model config ChatGLMConfig {
  "activation": "gelu",
  "attention_scale": true,
  "bos_token_id": 130004,
  "eos_token_id": 130005,
  "gmask_token_id": 130001,
  "hidden_size": 4096,
  "inner_hidden_size": 16384,
  "layernorm_epsilon": 1e-05,
  "mask_token_id": 130000,
  "max_sequence_length": 2048,
  "model_type": "chatglm",
  "num_attention_heads": 32,
  "num_hidden_layers": 28,
  "num_image_tokens": 0,
  "output_predict": true,
  "pad_token_id": 3,
  "paddle_dtype": "float16",
  "paddlenlp_version": null,
  "position_encoding_2d": true,
  "pre_seq_len": null,
  "prefix_projection": false,
  "quantization_bit": 0,
  "recompute": false,
  "use_cache": true,
  "vocab_size": 130528
}
W0721 09:14:35.405668  3227 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2
W0721 09:14:35.409356  3227 gpu_resources.cc:149] device: 0, cuDNN Version: 8.2.
[2023-07-21 09:14:51,868] [ WARNING] - Some weights of the model checkpoint at data/data217141 were not used when initializing ChatGLMForConditionalGeneration: ['transformer.layers.16.attention.rotary_emb.inv_freq', 'transformer.layers.13.attention.rotary_emb.inv_freq', 'transformer.layers.12.attention.rotary_emb.inv_freq', 'transformer.layers.18.attention.rotary_emb.inv_freq', 'transformer.layers.1.attention.rotary_emb.inv_freq', 'transformer.layers.11.attention.rotary_emb.inv_freq', 'transformer.layers.26.attention.rotary_emb.inv_freq', 'transformer.layers.10.attention.rotary_emb.inv_freq', 'transformer.layers.24.attention.rotary_emb.inv_freq', 'transformer.layers.6.attention.rotary_emb.inv_freq', 'transformer.layers.5.attention.rotary_emb.inv_freq', 'transformer.layers.8.attention.rotary_emb.inv_freq', 'transformer.layers.9.attention.rotary_emb.inv_freq', 'transformer.layers.7.attention.rotary_emb.inv_freq', 'transformer.layers.4.attention.rotary_emb.inv_freq', 'transformer.layers.2
- This IS expected if you are initializing ChatGLMForConditionalGeneration from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing ChatGLMForConditionalGeneration from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
[2023-07-21 09:14:51,869] [ WARNING] - Some weights of ChatGLMForConditionalGeneration were not initialized from the model checkpoint at data/data217141 and are newly initialized: ['transformer.layers.16.attention.rotary_embeddings.inv_freq', 'transformer.layers.9.attention.rotary_embeddings.inv_freq', 'transformer.layers.0.attention.rotary_embeddings.inv_freq', 'transformer.layers.15.attention.rotary_embeddings.inv_freq', 'transformer.layers.3.attention.rotary_embeddings.inv_freq', 'transformer.layers.10.attention.rotary_embeddings.inv_freq', 'transformer.layers.7.attention.rotary_embeddings.inv_freq', 'transformer.layers.4.attention.rotary_embeddings.inv_freq', 'transformer.layers.21.attention.rotary_embeddings.inv_freq', 'transformer.layers.2.attention.rotary_embeddings.inv_freq', 'transformer.layers.14.attention.rotary_embeddings.inv_freq', 'transformer.layers.19.attention.rotary_embeddings.inv_freq', 'transformer.layers.22.attention.rotary_embeddings.inv_freq', 'transformer.layers
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[2023-07-21 09:14:51,970] [    INFO] - Frozen parameters: 6.71e+09 || Trainable parameters:1.47e+07 || Total parameters:6.72e+09|| Trainable:0.22%
[2023-07-21 09:14:52,507] [    INFO] - Using half precision
[2023-07-21 09:14:52,515] [    INFO] - ============================================================
[2023-07-21 09:14:52,515] [    INFO] -     Training Configuration Arguments    
[2023-07-21 09:14:52,515] [    INFO] - paddle commit id              : fa084e5e15b951900e3d1f0ea12262305cdebe30
[2023-07-21 09:14:52,515] [    INFO] - _no_sync_in_gradient_accumulation: True
[2023-07-21 09:14:52,515] [    INFO] - adam_beta1                    : 0.9
[2023-07-21 09:14:52,515] [    INFO] - adam_beta2                    : 0.999
[2023-07-21 09:14:52,516] [    INFO] - adam_epsilon                  : 1e-08
[2023-07-21 09:14:52,516] [    INFO] - bf16                          : False
[2023-07-21 09:14:52,516] [    INFO] - bf16_full_eval                : False
[2023-07-21 09:14:52,516] [    INFO] - current_device                : gpu:0
[2023-07-21 09:14:52,516] [    INFO] - data_parallel_rank            : 0
[2023-07-21 09:14:52,516] [    INFO] - dataloader_drop_last          : False
[2023-07-21 09:14:52,516] [    INFO] - dataloader_num_workers        : 0
[2023-07-21 09:14:52,516] [    INFO] - dataset_rank                  : 0
[2023-07-21 09:14:52,516] [    INFO] - dataset_world_size            : 1
[2023-07-21 09:14:52,516] [    INFO] - device                        : gpu
[2023-07-21 09:14:52,516] [    INFO] - disable_tqdm                  : True
[2023-07-21 09:14:52,516] [    INFO] - do_eval                       : True
[2023-07-21 09:14:52,516] [    INFO] - do_export                     : False
[2023-07-21 09:14:52,516] [    INFO] - do_predict                    : False
[2023-07-21 09:14:52,516] [    INFO] - do_train                      : True
[2023-07-21 09:14:52,516] [    INFO] - eval_accumulation_steps       : None
[2023-07-21 09:14:52,516] [    INFO] - eval_batch_size               : 2
[2023-07-21 09:14:52,516] [    INFO] - eval_steps                    : 500
[2023-07-21 09:14:52,516] [    INFO] - evaluation_strategy           : IntervalStrategy.STEPS
[2023-07-21 09:14:52,516] [    INFO] - flatten_param_grads           : False
[2023-07-21 09:14:52,516] [    INFO] - fp16                          : True
[2023-07-21 09:14:52,516] [    INFO] - fp16_full_eval                : False
[2023-07-21 09:14:52,516] [    INFO] - fp16_opt_level                : O2
[2023-07-21 09:14:52,517] [    INFO] - gradient_accumulation_steps   : 1
[2023-07-21 09:14:52,517] [    INFO] - greater_is_better             : True
[2023-07-21 09:14:52,517] [    INFO] - ignore_data_skip              : False
[2023-07-21 09:14:52,517] [    INFO] - label_names                   : None
[2023-07-21 09:14:52,517] [    INFO] - lazy_data_processing          : True
[2023-07-21 09:14:52,517] [    INFO] - learning_rate                 : 0.03
[2023-07-21 09:14:52,517] [    INFO] - load_best_model_at_end        : True
[2023-07-21 09:14:52,517] [    INFO] - local_process_index           : 0
[2023-07-21 09:14:52,517] [    INFO] - local_rank                    : -1
[2023-07-21 09:14:52,517] [    INFO] - log_level                     : -1
[2023-07-21 09:14:52,517] [    INFO] - log_level_replica             : -1
[2023-07-21 09:14:52,517] [    INFO] - log_on_each_node              : True
[2023-07-21 09:14:52,517] [    INFO] - logging_dir                   : soul_bak/chatglm-6b/runs/Jul21_09-14-08_jupyter-89263-6559395
[2023-07-21 09:14:52,517] [    INFO] - logging_first_step            : False
[2023-07-21 09:14:52,517] [    INFO] - logging_steps                 : 250
[2023-07-21 09:14:52,517] [    INFO] - logging_strategy              : IntervalStrategy.STEPS
[2023-07-21 09:14:52,517] [    INFO] - lr_scheduler_type             : SchedulerType.LINEAR
[2023-07-21 09:14:52,517] [    INFO] - max_grad_norm                 : 1.0
[2023-07-21 09:14:52,517] [    INFO] - max_steps                     : -1
[2023-07-21 09:14:52,517] [    INFO] - metric_for_best_model         : accuracy
[2023-07-21 09:14:52,517] [    INFO] - minimum_eval_times            : None
[2023-07-21 09:14:52,517] [    INFO] - no_cuda                       : False
[2023-07-21 09:14:52,517] [    INFO] - num_train_epochs              : 2.0
[2023-07-21 09:14:52,517] [    INFO] - optim                         : OptimizerNames.ADAMW
[2023-07-21 09:14:52,517] [    INFO] - optimizer_name_suffix         : None
[2023-07-21 09:14:52,517] [    INFO] - output_dir                    : soul_bak/chatglm-6b
[2023-07-21 09:14:52,518] [    INFO] - overwrite_output_dir          : False
[2023-07-21 09:14:52,518] [    INFO] - past_index                    : -1
[2023-07-21 09:14:52,518] [    INFO] - per_device_eval_batch_size    : 2
[2023-07-21 09:14:52,518] [    INFO] - per_device_train_batch_size   : 2
[2023-07-21 09:14:52,518] [    INFO] - pipeline_parallel_config      : 
[2023-07-21 09:14:52,518] [    INFO] - pipeline_parallel_degree      : -1
[2023-07-21 09:14:52,518] [    INFO] - pipeline_parallel_micro_batch_size: 1
[2023-07-21 09:14:52,518] [    INFO] - pipeline_parallel_rank        : 0
[2023-07-21 09:14:52,518] [    INFO] - prediction_loss_only          : False
[2023-07-21 09:14:52,518] [    INFO] - process_index                 : 0
[2023-07-21 09:14:52,518] [    INFO] - recompute                     : True
[2023-07-21 09:14:52,518] [    INFO] - remove_unused_columns         : True
[2023-07-21 09:14:52,518] [    INFO] - report_to                     : ['visualdl']
[2023-07-21 09:14:52,518] [    INFO] - resume_from_checkpoint        : None
[2023-07-21 09:14:52,518] [    INFO] - run_name                      : soul_bak/chatglm-6b
[2023-07-21 09:14:52,518] [    INFO] - save_on_each_node             : False
[2023-07-21 09:14:52,518] [    INFO] - save_steps                    : 2000
[2023-07-21 09:14:52,518] [    INFO] - save_strategy                 : IntervalStrategy.STEPS
[2023-07-21 09:14:52,518] [    INFO] - save_total_limit              : 1
[2023-07-21 09:14:52,518] [    INFO] - scale_loss                    : 32768
[2023-07-21 09:14:52,518] [    INFO] - seed                          : 42
[2023-07-21 09:14:52,518] [    INFO] - sharding                      : []
[2023-07-21 09:14:52,518] [    INFO] - sharding_degree               : -1
[2023-07-21 09:14:52,518] [    INFO] - sharding_parallel_degree      : -1
[2023-07-21 09:14:52,518] [    INFO] - sharding_parallel_rank        : 0
[2023-07-21 09:14:52,518] [    INFO] - should_log                    : True
[2023-07-21 09:14:52,518] [    INFO] - should_save                   : True
[2023-07-21 09:14:52,519] [    INFO] - should_save_model_state       : True
[2023-07-21 09:14:52,519] [    INFO] - skip_memory_metrics           : True
[2023-07-21 09:14:52,519] [    INFO] - tensor_parallel_degree        : -1
[2023-07-21 09:14:52,519] [    INFO] - tensor_parallel_rank          : 0
[2023-07-21 09:14:52,519] [    INFO] - train_batch_size              : 2
[2023-07-21 09:14:52,519] [    INFO] - use_hybrid_parallel           : False
[2023-07-21 09:14:52,519] [    INFO] - warmup_ratio                  : 0.03
[2023-07-21 09:14:52,519] [    INFO] - warmup_steps                  : 0
[2023-07-21 09:14:52,519] [    INFO] - weight_decay                  : 0.0
[2023-07-21 09:14:52,519] [    INFO] - weight_name_suffix            : None
[2023-07-21 09:14:52,519] [    INFO] - world_size                    : 1
[2023-07-21 09:14:52,519] [    INFO] - 
[2023-07-21 09:14:52,522] [    INFO] - ***** Running training *****
[2023-07-21 09:14:52,523] [    INFO] -   Num examples = 1000
[2023-07-21 09:14:52,523] [    INFO] -   Num Epochs = 2
[2023-07-21 09:14:52,523] [    INFO] -   Instantaneous batch size per device = 2
[2023-07-21 09:14:52,523] [    INFO] -   Total train batch size (w. parallel, distributed & accumulation) = 2
[2023-07-21 09:14:52,523] [    INFO] -   Gradient Accumulation steps = 1
[2023-07-21 09:14:52,523] [    INFO] -   Total optimization steps = 1000
[2023-07-21 09:14:52,523] [    INFO] -   Total num train samples = 2000
[2023-07-21 09:14:52,524] [    INFO] -   Number of trainable parameters = 14680064 (per device)
Found inf or nan, current scale is: 32768.0, decrease to: 32768.0*0.5
[2023-07-21 09:14:54,158] [ WARNING] - optimizer not run, scale_before: 32768.0, scale_after: 16384.0
Found inf or nan, current scale is: 16384.0, decrease to: 16384.0*0.5
[2023-07-21 09:14:57,539] [ WARNING] - optimizer not run, scale_before: 16384.0, scale_after: 8192.0
[2023-07-21 09:16:54,738] [    INFO] - loss: 3.24559375, learning_rate: 0.02326, global_step: 250, interval_runtime: 122.2133, interval_samples_per_second: 4.091, interval_steps_per_second: 2.046, ppl: 25.67695122484951, epoch: 0.5
[2023-07-21 09:18:56,960] [    INFO] - loss: 2.94326953, learning_rate: 0.01553, global_step: 500, interval_runtime: 122.2218, interval_samples_per_second: 4.091, interval_steps_per_second: 2.045, ppl: 18.977793453082345, epoch: 1.0
[2023-07-21 09:18:56,961] [    INFO] - ***** Running Evaluation *****
[2023-07-21 09:18:56,961] [    INFO] -   Num examples = 0
[2023-07-21 09:18:56,961] [    INFO] -   Total prediction steps = 0
[2023-07-21 09:18:56,961] [    INFO] -   Pre device batch size = 2
[2023-07-21 09:18:56,961] [    INFO] -   Total Batch size = 2
[2023-07-21 09:18:56,966] [    INFO] - eval_runtime: 0.0045, eval_samples_per_second: 0.0, eval_steps_per_second: 0.0, epoch: 1.0
[2023-07-21 09:20:59,372] [    INFO] - loss: 2.83462891, learning_rate: 0.007794, global_step: 750, interval_runtime: 122.4124, interval_samples_per_second: 4.085, interval_steps_per_second: 2.042, ppl: 17.024081661613064, epoch: 1.5
[2023-07-21 09:23:02,859] [    INFO] - loss: 2.80505469, learning_rate: 6.186e-05, global_step: 1000, interval_runtime: 123.4859, interval_samples_per_second: 4.049, interval_steps_per_second: 2.025, ppl: 16.52797979656062, epoch: 2.0
[2023-07-21 09:23:02,859] [    INFO] - ***** Running Evaluation *****
[2023-07-21 09:23:02,859] [    INFO] -   Num examples = 0
[2023-07-21 09:23:02,859] [    INFO] -   Total prediction steps = 0
[2023-07-21 09:23:02,859] [    INFO] -   Pre device batch size = 2
[2023-07-21 09:23:02,859] [    INFO] -   Total Batch size = 2
[2023-07-21 09:23:02,864] [    INFO] - eval_runtime: 0.0044, eval_samples_per_second: 0.0, eval_steps_per_second: 0.0, epoch: 2.0
[2023-07-21 09:23:02,864] [    INFO] - 
Training completed. 
[2023-07-21 09:23:02,865] [    INFO] - train_runtime: 490.3406, train_samples_per_second: 4.079, train_steps_per_second: 2.039, train_loss: 2.95713671875, epoch: 2.0
[2023-07-21 09:23:02,865] [    INFO] - Saving model checkpoint to soul_bak/chatglm-6b
[2023-07-21 09:23:03,034] [    INFO] - tokenizer config file saved in soul_bak/chatglm-6b/tokenizer_config.json
[2023-07-21 09:23:03,035] [    INFO] - Special tokens file saved in soul_bak/chatglm-6b/special_tokens_map.json
[2023-07-21 09:23:03,044] [    INFO] - ***** train metrics *****
[2023-07-21 09:23:03,044] [    INFO] -   epoch                    =        2.0
[2023-07-21 09:23:03,044] [    INFO] -   train_loss               =     2.9571
[2023-07-21 09:23:03,044] [    INFO] -   train_runtime            = 0:08:10.34
[2023-07-21 09:23:03,044] [    INFO] -   train_samples_per_second =      4.079
[2023-07-21 09:23:03,044] [    INFO] -   train_steps_per_second   =      2.039
[2023-07-21 09:23:03,046] [    INFO] - ***** Running Evaluation *****
[2023-07-21 09:23:03,046] [    INFO] -   Num examples = 0
[2023-07-21 09:23:03,046] [    INFO] -   Total prediction steps = 0
[2023-07-21 09:23:03,046] [    INFO] -   Pre device batch size = 2
[2023-07-21 09:23:03,046] [    INFO] -   Total Batch size = 2
[2023-07-21 09:23:03,051] [    INFO] - eval_runtime: 0.005, eval_samples_per_second: 0.0, eval_steps_per_second: 0.0, epoch: 2.0
[2023-07-21 09:23:03,051] [    INFO] - ***** test metrics *****
[2023-07-21 09:23:03,051] [    INFO] -   epoch                   =        2.0
[2023-07-21 09:23:03,051] [    INFO] -   eval_runtime            = 0:00:00.00
[2023-07-21 09:23:03,051] [    INFO] -   eval_samples_per_second =        0.0
[2023-07-21 09:23:03,051] [    INFO] -   eval_steps_per_second   =        0.0

2. Loading the prefix weights

Load the prefix-tuning weights directly into the model instantiated above; the base model does not need to be reloaded.

%cd ~
from paddlenlp.peft import PrefixModelForCausalLM
from paddlenlp.peft.prefix import (
    chatglm_pad_attention_mask,
    chatglm_postprocess_past_key_value,
)
model = PrefixModelForCausalLM.from_pretrained(model, '/home/aistudio/soul_bak/chatglm-6b', chatglm_postprocess_past_key_value, chatglm_pad_attention_mask)

3. Fine-tuning ChatGLM-6B with LoRA

Fine-tuning finishes quickly on a V100 (using the data under the soul directory).

To run your own task, change the argument after --task_name_or_path to the directory that contains your dataset.

If fine-tuning fails with an out-of-memory error, set --per_device_train_batch_size and --per_device_eval_batch_size to 1.

# Create the directory for saving the fine-tuned model
!mkdir -p soul_bak/chatglm-6b_lora
!python work/finetune_generation.py \
    --output_dir soul_bak/chatglm-6b_lora \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 2 \
    --per_device_eval_batch_size 8 \
    --model_name_or_path THUDM/chatglm-6b \
    --task_name_or_path  soul \
    --num_train_epochs 2 \
    --learning_rate 3e-4 \
    --warmup_steps 30 \
    --logging_steps 1 \
    --evaluation_strategy epoch \
    --save_strategy epoch \
    --src_length 1024 \
    --tgt_length 1024 \
    --fp16 \
    --fp16_opt_level O2 \
    --do_train \
    --do_eval \
    --disable_tqdm True \
    --load_best_model_at_end True \
    --metric_for_best_model accuracy \
    --eval_with_do_generation False \
    --recompute \
    --save_total_limit 1 \
    --overwrite_output_dir \
    --lora True \
    --lora_rank 8

4. Loading the LoRA weights

Load the LoRA weights directly into the model instantiated above; the base model does not need to be reloaded.

from paddlenlp.peft import LoRAModel
model = LoRAModel.from_pretrained(model, 'soul_bak/chatglm-6b_lora/checkpoint-125/')
model.mark_only_lora_as_trainable()
[2023-07-21 10:31:31,437] [ WARNING] - Reset tensor_parallel_degree of lora_config to 0.
[2023-07-21 10:31:31,475] [    INFO] - Loading the LoRA weights from soul_bak/chatglm-6b_lora/checkpoint-125/lora_model_state.pdparams
[2023-07-21 10:31:31,506] [    INFO] - Load lora weight successfully

5. Text preprocessing, inference, output post-processing, prediction, and input batching

# Preprocessing
def preprocess(input_text):
    inputs = tokenizer(
        input_text,
        return_tensors="np",
        padding=True,
        max_length=128,
        truncation=True,
        truncation_side="left",
    )
    inputs_tensor = {}
    for key in inputs:
        inputs_tensor[key] = paddle.to_tensor(inputs[key])
    return inputs_tensor
# Inference
def infer(inputs):
    result = model.generate(
        **inputs,
        decode_strategy="sampling",
        top_k=1,
        max_length=128,
        bos_token_id=tokenizer.bos_token_id,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.pad_token_id,
        use_cache=True,
    )
    result = result[0]
    return result
# Post-processing
def postprocess(infer_data):
    result = []
    for x in infer_data.tolist():
        res = tokenizer.decode(x, skip_special_tokens=True)
        res = res.strip("\n")
        result.append(res)
    out_dict = {"result": result}
    return out_dict
# Predict
def predict(texts):
    input_map = preprocess(texts)
    infer_result = infer(input_map)
    output = postprocess(infer_result)
    return output
# Batch the input texts
def batchfy_text(texts, batch_size):
    batch_texts = []
    batch_start = 0
    while batch_start < len(texts):
        batch_texts += [texts[batch_start : min(batch_start + batch_size, len(texts))]]
        batch_start += batch_size
    return batch_texts
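
For reference, predict and batchfy_text also work for several questions at once; the two questions below are only illustrative:

# Illustrative batch usage of batchfy_text + predict.
questions = [
    "我最近很焦虑,上班总是划水,我该怎么办?",
    "我晚上总是睡不着,有什么办法吗?",
]
for batch in batchfy_text(questions, batch_size=2):
    outputs = predict(batch)
    for q, a in zip(batch, outputs["result"]):
        print(q, "->", a)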

6. Comparison

def pre_Single_conversation(input_text):
    all_texts = [
        str(input_text)
    ]
    batch_texts = batchfy_text(all_texts, 1)
    for bs, texts in enumerate(batch_texts):
        outputs = predict(texts)
        for text, result in zip(texts, outputs["result"]):
            print("{}\n{}".format(text, text+result))
pre_Single_conversation(input_text = "假定你是友善的心思辅导师,请依据咨询者的问题答复实在有用的答复。求助者:最近我遇到了一个问题。大姑姐离婚,不要女儿,所以我收养了她女儿。可是现在我儿子和姑姐女儿的联系很严重,她自己也不做家务,我该怎么办?支撑者:")
假定你是友善的心思辅导师,请依据咨询者的问题答复实在有用的答复。求助者:最近我遇到了一个问题。大姑姐离婚,不要女儿,所以我收养了她女儿。可是现在我儿子和姑姐女儿的联系很严重,她自己也不做家务,我该怎么办?支撑者:
假定你是友善的心思辅导师,请依据咨询者的问题答复实在有用的答复。求助者:最近我遇到了一个问题。大姑姐离婚,不要女儿,所以我收养了她女儿。可是现在我儿子和姑姐女儿的联系很严重,她自己也不做家务,我该怎么办?支撑者:听到你的阅历,我很了解你的感触。你收养了她女儿,她也很感谢你。可是,你儿子和姑姐女儿的联系严重,或许是由于你们之间的共处方法不同。你能够测验和他们交流,了解他们的主意和感触。
  • Q_motif = “假定你是友善的心思辅导师,请依据咨询者的问题答复实在有用的答复。求助者:最近我遇到了一个问题。大姑姐离婚,不要女儿,所以我收养了她女儿。可是现在我儿子和姑姐女儿的联系很严重,她自己也不做家务,我该怎么办?支撑者:”
  • Before fine-tuning: 首要,我能够了解您所面对的应战和困惑。收养一个孩子是一个艰难的决议,或许会对家庭发生深远的影响。
    • 关于您儿子和姑姐女儿的联系,您能够测验与他们树立杰出的交流和互动。您能够试着与他们共享您的主意和感触,以及您对收养孩子的观点。一起,您也能够测验与他们一起讨论怎么改进他们之间的联系。您还能够寻求专业心思咨询师或家庭治疗师的协助,以协助您更好地处理这个问题。
    • 关于您大姑姐的女儿,您能够测验与她树立杰出的联系。您能够试着让她感触到您对她的关心和支撑,并鼓舞她积极参加家庭活动和家务。一起,您也能够测验与她共享您对收养孩子的观点,以及您期望她怎么参加孩子的生长和开展。
    • 最终,我主张您寻求专业心思咨询师或家庭治疗师的协助。他们能够协助您更好地处理您所面对的问题,并供给专业的主张和支撑。
  • After fine-tuning: 假定你是友善的心思辅导师,请依据咨询者的问题答复实在有用的答复。求助者:最近我遇到了一个问题。大姑姐离婚,不要女儿,所以我收养了她女儿。可是现在我儿子和姑姐女儿的联系很严重,她自己也不做家务,我该怎么办?支撑者:听到你的阅历,我很了解你的感触。你收养了她女儿,她也很感谢你。可是,你儿子和姑姐女儿的联系严重,或许是由于你们之间的共处方法不同。你能够测验和他们交流,了解他们的主意和感触。
Q_motif = "我最近很焦虑,上班总是划水,我该怎么办?"
print(Q_motif)
result=glm_single_QA(model,tokenizer,Q_motif,2048,2048)
print("A:"+result)
我最近很焦虑,上班总是划水,我该怎么办?
A:焦虑是一种很常见的心情,但也是能够被克服的。以下是一些主张:
1. 深呼吸:深呼吸能够协助你放松身体和思想。试着渐渐吸气,然后渐渐呼气,重复几回。
2. 寻求支撑:假如感到焦虑,能够寻求朋友或家人的支撑。他们能够供给一些安慰和主张。
3. 拟定计划:拟定一个计划能够协助你更好地办理时间。试着拟定一个日程表,并设置优先级。
4. 寻找支撑:假如感到作业划水,能够寻求同事或上司的协助。他们能够供给一些主张和指导。
5. 学习放松技巧:学习一些放松技巧,如冥想或瑜伽,能够协助你更好地办理焦虑。
期望这些主张能有所协助。假如感到焦虑心情继续存在,请考虑寻求专业协助。

7. Notes

  • 1. The base model already has some capability before fine-tuning;
  • 2. Fine-tuning does not need the full dataset; using roughly 1% of the data is enough and much faster;
  • 3. After fine-tuning, the model generates text that matches the target style much better;
  • 4. Use the latest PaddleNLP; its API has changed a lot, so it is best to follow this article.

VI. Summary

MyMind is a mental-health assistant built on a large AI model, designed to help users manage their emotions, grow personally, and keep their inner balance. Through personalized user profiles, emotion analysis and management, self-exploration and growth, calm and relaxation, and social support and sharing, it offers all-round mental-health support. Backed by AI technology and professional partners, MyMind aims to be a reliable companion for living a fuller life and staying mentally healthy.

Project page: aistudio.baidu.com/aistudio/pr…