InternLM (书生) LLM Practical Camp, Season 3, Basics Island Level 4: Evaluating InternLM-1.8B with OpenCompass


Environment setup

conda create -n opencompass python=3.10
conda activate opencompass
conda install pytorch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 pytorch-cuda=12.1 -c pytorch -c nvidia -y
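Before moving on, it can help to confirm that the pinned packages actually landed in the active environment. The following is a minimal sketch, not part of the original walkthrough: the helper names installed_versions and cuda_available are hypothetical, and it assumes the packages are registered under the distribution names torch, torchvision and torchaudio.

```python
# Sketch: report installed versions of the packages pinned in the conda command.
from importlib import metadata


def installed_versions(packages):
    """Return {package: version string, or None if the package is missing}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def cuda_available():
    """Return torch.cuda.is_available(), or None if torch cannot be imported."""
    try:
        import torch
    except ImportError:
        return None
    return torch.cuda.is_available()


if __name__ == "__main__":
    pinned = ["torch", "torchvision", "torchaudio"]
    for name, version in installed_versions(pinned).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
    print("CUDA available:", cuda_available())
```

If any version differs from the one pinned above, or CUDA is reported as unavailable, it is probably worth fixing the environment before installing OpenCompass.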

Installing OpenCompass

cd /root
git clone -b 0.2.4 https://github.com/open-compass/opencompass
cd opencompass
pip install -e .


# Install dependencies
apt-get install cmake
pip install -r requirements.txt
pip install protobuf

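Configuration and execution

Edit /root/opencompass/configs/models/hf_internlm/hf_internlm2_1_8b.py, mainly to point path and tokenizer_path at the model's actual location on disk. The code before and after the change is shown below (the commented-out block is the original code). Note that the evaluation command further down passes --models hf_internlm2_chat_1_8b, so the same change presumably has to be present in the config file with that name.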

from opencompass.models import HuggingFaceCausalLM


# models = [
#     dict(
#         type=HuggingFaceCausalLM,
#         abbr='internlm2-1.8b-hf',
#         path="internlm/internlm2-1_8b",
#         tokenizer_path='internlm/internlm2-1_8b',
#         model_kwargs=dict(
#             trust_remote_code=True,
#             device_map='auto',
#         ),
#         tokenizer_kwargs=dict(
#             padding_side='left',
#             truncation_side='left',
#             use_fast=False,
#             trust_remote_code=True,
#         ),
#         max_out_len=100,
#         min_out_len=1,
#         max_seq_len=2048,
#         batch_size=8,
#         run_cfg=dict(num_gpus=1, num_procs=1),
#     )
# ]


models = [
    dict(
        type=HuggingFaceCausalLM,
        abbr='internlm2-1.8b-hf',
        path="/root/share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b",
        tokenizer_path='/root/share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b',
        model_kwargs=dict(
            trust_remote_code=True,
            device_map='auto',
        ),
        tokenizer_kwargs=dict(
            padding_side='left',
            truncation_side='left',
            use_fast=False,
            trust_remote_code=True,
        ),
        max_out_len=100,
        min_out_len=1,
        max_seq_len=2048,
        batch_size=8,
        run_cfg=dict(num_gpus=1, num_procs=1),
    )
]
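Since the only real change is the local model path, a quick check that the directory is usable can save a failed run. The sketch below is an illustration under assumptions: check_model_dir is a hypothetical helper, and it assumes a Hugging Face style directory that contains config.json plus at least one file whose name starts with "tokenizer".

```python
# Sketch: sanity-check a local Hugging Face model directory before evaluating.
from pathlib import Path


def check_model_dir(model_dir):
    """Return a list of problems; an empty list means the directory looks usable."""
    root = Path(model_dir)
    if not root.is_dir():
        return [f"{root} is not a directory"]
    problems = []
    if not (root / "config.json").is_file():
        problems.append("config.json is missing")
    # Assumption: tokenizer files start with "tokenizer" (e.g. tokenizer_config.json).
    if not any(p.name.startswith("tokenizer") for p in root.iterdir()):
        problems.append("no tokenizer* files found")
    return problems


if __name__ == "__main__":
    # Path taken from the modified config above.
    path = "/root/share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b"
    issues = check_model_dir(path)
    print("OK" if not issues else "\n".join(issues))
```

This only checks that the files exist; it does not load the model or verify the weights.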

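Then run the evaluation with the following command (--debug runs the tasks sequentially and prints their output to the terminal in real time):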

python run.py --datasets ceval_gen --models hf_internlm2_chat_1_8b --debug

Scores

After the evaluation finishes, it reports the following scores:

[Two screenshots of the score output: image.png, image.png]
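The scores can also be read back from disk instead of from the terminal. The sketch below rests on an assumption about the layout: that OpenCompass writes a summary CSV to outputs/default/<timestamp>/summary/ under the directory where run.py was launched. The helpers latest_summary and read_rows are hypothetical names, and the column names are whatever the CSV header contains.

```python
# Sketch: print the most recent OpenCompass summary table.
import csv
from pathlib import Path


def latest_summary(work_dir="outputs/default"):
    """Return the newest summary CSV under work_dir, or None if there is none."""
    # Assumption: results live in <work_dir>/<timestamp>/summary/*.csv.
    # Sorting by path is chronological when directory names are timestamps.
    candidates = sorted(Path(work_dir).glob("*/summary/*.csv"))
    return candidates[-1] if candidates else None


def read_rows(csv_path):
    """Return the CSV rows as a list of dicts keyed by the header row."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


if __name__ == "__main__":
    summary = latest_summary()
    if summary is None:
        print("No summary CSV found; has the evaluation finished?")
    else:
        for row in read_rows(summary):
            print(row)
```

Run it from /root/opencompass, or pass the actual output directory to latest_summary if a different work directory was used.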