使用 OpenCompass 评测 InternLM2-Chat-7B 模型在 C-Eval 数据集上的性能
import torch
from modelscope import snapshot_download, AutoModel, AutoTokenizer
import os
# model_dir = snapshot_download('Shanghai_AI_Laboratory/internlm2-chat-7b', cache_dir='/root/model', revision='v1.0.3')
model_dir = snapshot_download('Shanghai_AI_Laboratory/internlm2-chat-7b', cache_dir='/root/model')
git clone https://gitee.com/hunan_ai_league_jayhust/opencompass
cd opencompass
cp /share/temp/datasets/OpenCompassData-core-20231110.zip /root/opencompass/
unzip OpenCompassData-core-20231110.zip
python tools/list_configs.py internlm2 ceval
结果如下:
+-----------------------+-----------------------------------------------------+
| Model | Config Path |
|-----------------------+-----------------------------------------------------|
| hf_internlm2_20b | configs/models/hf_internlm/hf_internlm2_20b.py |
| hf_internlm2_7b | configs/models/hf_internlm/hf_internlm2_7b.py |
| hf_internlm2_chat_20b | configs/models/hf_internlm/hf_internlm2_chat_20b.py |
| hf_internlm2_chat_7b | configs/models/hf_internlm/hf_internlm2_chat_7b.py |
+-----------------------+-----------------------------------------------------+
+----------------------------+------------------------------------------------------+
| Dataset | Config Path |
|----------------------------+------------------------------------------------------|
| ceval_clean_ppl | configs/datasets/ceval/ceval_clean_ppl.py |
| ceval_gen | configs/datasets/ceval/ceval_gen.py |
| ceval_gen_2daf24 | configs/datasets/ceval/ceval_gen_2daf24.py |
| ceval_gen_5f30c7 | configs/datasets/ceval/ceval_gen_5f30c7.py |
| ceval_ppl | configs/datasets/ceval/ceval_ppl.py |
| ceval_ppl_578f8d | configs/datasets/ceval/ceval_ppl_578f8d.py |
| ceval_ppl_93e5ce | configs/datasets/ceval/ceval_ppl_93e5ce.py |
| ceval_zero_shot_gen_bd40ef | configs/datasets/ceval/ceval_zero_shot_gen_bd40ef.py |
+----------------------------+------------------------------------------------------+
(先下载模型到本地:/root/model/Shanghai_AI_Laboratory/internlm2-chat-7b)
python run.py --datasets ceval_gen --hf-path /root/model/Shanghai_AI_Laboratory/internlm2-chat-7b/ --tokenizer-path /root/model/Shanghai_AI_Laboratory/internlm2-chat-7b/ --tokenizer-kwargs padding_side='left' truncation='left' trust_remote_code=True --model-kwargs trust_remote_code=True device_map='auto' --max-seq-len 2048 --max-out-len 16 --batch-size 4 --num-gpus 1 --debug