Datasets for Evaluating Large Language Models
EmbodiedTech, 2024-07-22

MMLU: docs.confident-ai.com/docs/benchm…; arxiv.org/pdf/2009.03…
CMMLU: github.com/haonan-li/C…
CEVAL: github.com/hkust-nlp/c…