用ms-swift finetune Llama 3.1_8B_Instruct在ModelScope平台上下载Llam

Finetuning steps

数据准备

数据集生成
格式标准化：将数据集格式化为jsonl

模型加载与配置

在ModelScope平台上下载Llama 3.1 8B模型。

# 下载模型
import transformers
import torch
from modelscope import snapshot_download
model_id = snapshot_download("LLM-Research/Meta-Llama-3.1-8B-Instruct",cache_dir='/data/llama3.1')
print("model:",model_id)

# pipeline部署
pipeline=transformers.pipeline("text-generation",model=model_id,model_kwargs={"torch_dtype":torch.bfloat16},device_map="auto",)

# 推理
messages=[
    {
    'role': 'system',
    'content': 'You are a helpful assistant named "Pipo".',
  },
  {
    'role': 'user',
    'content': 'Can you tell me a story?',
  },]
  
output=pipeline(messages,max_new_tokens=128,)
print(outputs[0]["generated_text"][-1]

clone ms-swift git clone https://github.com/modelscope/ms-swift.git

微调训练

设置训练参数：根据任务需求，设置合适的训练轮次、批处理大小、优化器等参数。

#微调
CUDA_VISIBLE_DEVICES=0,1,2,3 swift sft --model_type llama3_1-8b-instruct  
--model_id_or_path   ./llama3.1/LLM-Research/Meta-Llama-3___1-8B-Instruct 
--sft_type lora 
--output_dir output-function-call   
--dataset  /home/yezh/codes/ms-swift/converted_dataset.jsonl 
--num_train_epochs 1 
--max_length 2048 --gradient_checkpointing true 
--batch_size 1 
--gradient_accumulation_steps 16 
--warmup_ratio 0.1 
--eval_steps 100 
--save_steps 100 
--save_total_limit -1 
--logging_steps 10

#合并参数
CUDA_VISIBLE_DEVICES=0 swift export  --ckpt_dir  /home/yezh/codes/ms-swift/output-self/llama3_1-8b-instruct/v3-20240906-193023/checkpoint-1237 --merge_lora true

#部署合并后的模型
CUDA_VISIBLE_DEVICES=0 swift infer --ckpt_dir  /home/yezh/codes/ms-swift/output-self/llama3_1-8b-instruct/v3-20240906-193023/checkpoint-1237-merged --infer_backend vllm --max_model_len 4096 

#在<<<中执行推理查看效果

然后就可以欣赏自己的垃圾模型啦~ ~~至少我的非常垃圾哈哈哈哈~~

参数解释

-   CUDA_VISIBLE_DEVICES=0：指定使用第0号GPU进行训练。
-   swift sft：调用ms-swift工具进行微调（sft）。
-   --model_type llama3_1-8b-instruct：指定模型类型为llama3_1-8b-instruct。
-   --model_id_or_path ./llama3.1/LLM-Research/Meta-Llama-3___1-8B-Instruct：指定模型的路径。
-   --sft_type lora：指定微调类型为LoRA（Low-Rank Adaptation）。
-   --output_dir output：指定输出目录为output。
-   --dataset classical-chinese-translate：指定使用的数据集为classical-chinese-translate。
-   --num_train_epochs 1：指定训练的轮数为1。
-   --max_length 2048：指定输入序列的最大长度为2048。
-   --gradient_checkpointing true：启用梯度检查点，减少显存占用。
-   --batch_size 1：指定批处理大小为1。
-   --gradient_accumulation_steps 16：指定梯度累积的步数为16，即每16步更新一次模型参数。
-   --warmup_ratio 0.1：指定学习率预热比例为0.1。
-   --eval_steps 100：每100步进行一次评估。
-   --save_steps 100：每100步保存一次模型。
-   --save_total_limit -1：保存的模型数量没有限制。
-   --logging_steps 10：每10步记录一次日志。