chatglm2-6b becomes dumb after P-Tuning: after P-Tuning fine-tuning, chatglm2-6b can no longer hold a normal conversation and only talks nonsense

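I followed the official fine-tuning steps (GitHub). After deploying the model, it is basically unable to hold a normal conversation. Here is the result: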

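No errors are reported, and I followed the official reference steps exactly. Launching with web_demo.sh gives the same result. This happens with both ChatGLM2 and the original ChatGLM. Does anyone know the cause?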

```shell
# Training script
PRE_SEQ_LEN=128
LR=2e-2
NUM_GPUS=3

torchrun --standalone --nnodes=1 --nproc-per-node=$NUM_GPUS main.py \
    --do_train \
    --train_file AdvertiseGen/train.json \
    --validation_file AdvertiseGen/dev.json \
    --preprocessing_num_workers 10 \
    --prompt_column content \
    --response_column summary \
    --overwrite_cache \
    --model_name_or_path $model_path \
    --output_dir output/adgen-chatglm2-6b-pt-$PRE_SEQ_LEN-$LR \
    --overwrite_output_dir \
    --max_source_length 64 \
    --max_target_length 128 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 16 \
    --predict_with_generate \
    --max_steps 3000 \
    --logging_steps 10 \
    --save_steps 1000 \
    --learning_rate $LR \
    --pre_seq_len $PRE_SEQ_LEN
```
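For reference, the effective batch size implied by the training command above can be worked out as follows. This is a minimal sketch, assuming plain data-parallel training in which every GPU processes its own micro-batch; the variable names are illustrative and not part of the official script.

```python
# Values copied from the training command above.
num_gpus = 3                      # NUM_GPUS
per_device_train_batch_size = 1   # --per_device_train_batch_size
gradient_accumulation_steps = 16  # --gradient_accumulation_steps
max_steps = 3000                  # --max_steps

# Assumption: plain data parallelism, so each optimizer step consumes
# one micro-batch per GPU per accumulation step.
effective_batch_size = (
    num_gpus * per_device_train_batch_size * gradient_accumulation_steps
)
samples_seen = effective_batch_size * max_steps

print(f"effective batch size: {effective_batch_size}")
print(f"training samples consumed over {max_steps} steps: {samples_seen}")
```

Under that assumption, the learning rate of 2e-2 is applied once per accumulated batch, not once per single sample.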

```shell
# Evaluation script (CHECKPOINT and STEP must be set before running)
torchrun --standalone --nnodes=1 --nproc-per-node=$NUM_GPUS main.py \
    --do_predict \
    --validation_file AdvertiseGen/dev.json \
    --test_file AdvertiseGen/dev.json \
    --overwrite_cache \
    --prompt_column content \
    --response_column summary \
    --model_name_or_path $model_path \
    --ptuning_checkpoint ./output/$CHECKPOINT/checkpoint-$STEP \
    --output_dir ./output/$CHECKPOINT \
    --overwrite_output_dir \
    --max_source_length 64 \
    --max_target_length 64 \
    --per_device_eval_batch_size 1 \
    --predict_with_generate \
    --pre_seq_len $PRE_SEQ_LEN
```
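The evaluation command expects `$CHECKPOINT` and `$STEP` to point at a directory produced by the training run. As a rough sketch, the candidate checkpoint directories can be listed like this, assuming `$CHECKPOINT` is the directory name from `--output_dir` in the training command and that a `checkpoint-<step>` folder is written every `--save_steps` steps. The variable names are illustrative.

```python
import os

pre_seq_len = 128   # PRE_SEQ_LEN in the training script
lr = "2e-2"         # LR in the training script, kept as a string
save_steps = 1000   # --save_steps
max_steps = 3000    # --max_steps

# Directory name follows --output_dir in the training command.
checkpoint = f"adgen-chatglm2-6b-pt-{pre_seq_len}-{lr}"

# Assumption: a checkpoint-<step> folder is written every save_steps steps.
steps = list(range(save_steps, max_steps + 1, save_steps))
candidate_dirs = [
    os.path.join("output", checkpoint, f"checkpoint-{step}") for step in steps
]

# Report which of the expected directories are actually on disk.
for path in candidate_dirs:
    print(path, "(exists)" if os.path.isdir(path) else "(missing)")
```

If the directory passed to `--ptuning_checkpoint` is reported as missing, the evaluation run would not be loading the weights that were trained.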

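```python
import os

import torch
from transformers import AutoConfig, AutoModel, AutoTokenizer

# chatglm26b_PATH (base model) and CHECKPOINT_PATH (P-Tuning checkpoint
# directory) are not defined in the post and must be set before running.
tokenizer = AutoTokenizer.from_pretrained(chatglm26b_PATH, trust_remote_code=True)

# pre_seq_len must match the value used during training.
config = AutoConfig.from_pretrained(chatglm26b_PATH, trust_remote_code=True, pre_seq_len=128)
model = AutoModel.from_pretrained(chatglm26b_PATH, config=config, trust_remote_code=True)

# Load only the prefix encoder weights from the P-Tuning checkpoint.
prefix_state_dict = torch.load(os.path.join(CHECKPOINT_PATH, "pytorch_model.bin"))
new_prefix_state_dict = {}
for k, v in prefix_state_dict.items():
    if k.startswith("transformer.prefix_encoder."):
        new_prefix_state_dict[k[len("transformer.prefix_encoder."):]] = v
model.transformer.prefix_encoder.load_state_dict(new_prefix_state_dict)

# Run the base model in fp16 on the GPU, but keep the prefix encoder in fp32.
model = model.half().cuda()
model.transformer.prefix_encoder.float()
model = model.eval()

# The first turn is hard-coded to the greeting "你好" ("hello").
print("User: 你好\n")
response, history = model.chat(tokenizer, "你好", history=[])
print("ChatGLM-6B:\n", response)
print("\n------------------------------------------------\nUser:")

# Keep chatting until an empty line is entered.
line = input()
while line:
    response, history = model.chat(tokenizer, line, history=history)
    print("ChatGLM-6B:\n", response)
    print("\n------------------------------------------------\nUser:")
    line = input()
```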
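The key-renaming loop in the script above can be checked in isolation, without loading the model. The sketch below runs the same filter-and-strip logic on a plain dictionary. The helper name `extract_prefix_encoder_state` and the sample keys are hypothetical stand-ins, not part of the official code, and a real checkpoint would hold tensors instead of strings.

```python
PREFIX = "transformer.prefix_encoder."


def extract_prefix_encoder_state(state_dict, prefix=PREFIX):
    """Keep only keys under `prefix` and strip the prefix from their names."""
    return {
        key[len(prefix):]: value
        for key, value in state_dict.items()
        if key.startswith(prefix)
    }


# Hypothetical stand-in for the checkpoint; real values would be tensors.
fake_checkpoint = {
    "transformer.prefix_encoder.embedding.weight": "prefix weights",
    "transformer.some_other_module.weight": "unrelated weights",
}

stripped = extract_prefix_encoder_state(fake_checkpoint)
print(stripped)

# An empty result would mean the file holds no prefix encoder weights.
if not stripped:
    raise ValueError("no keys start with " + PREFIX)
```

Running the same check on the real `prefix_state_dict` is one way to confirm that the file at `CHECKPOINT_PATH` actually contains prefix encoder weights before they are loaded into the model.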