ZTE Corporation, Shenzhen, Guangdong 518057, China
[ "刘昆麟(1997- ),男,博士,现就职于中兴通讯股份有限公司,主要研究方向为电信领域的大语言模型、大模型安全等。" ]
[ "屈新纪(1999- ),男,现就职于中兴通讯股份有限公司,主要研究方向为大语言模型等。" ]
[ "谭芳(1976- ),男,中兴通讯股份有限公司高级工程师,主要研究方向为大数据、大语言模型等。" ]
[ "康红辉(1972- ),男,中兴通讯股份有限公司无线网络智能化首席架构师,主要研究方向为未来网智、自智网络等。" ]
[ "赵少伟(1974- ),男,中兴通讯股份有限公司副总裁,主要研究方向为无线虚拟化平台、人工智能等。" ]
[ "施嵘(1973- ),男,中兴通讯股份有限公司副总裁,主要研究方向为无线网络产品研发、云和人工智能基础设施等。" ]
Received: 2024-03-28; Revised: 2024-05-18; Published in print: 2024-06-20
Citation: LIU Kunlin, QU Xinji, TAN Fang, et al. Survey on large language models alignment research[J]. Telecommunications Science, 2024, 40(6): 173-194. DOI: 10.11959/j.issn.1000-0801.2024151.
With the rapid development of artificial intelligence technology, large language models have been widely applied in numerous fields. However, the potential of large language models to generate inaccurate, misleading, or even harmful content has raised concerns about their reliability, and adopting alignment techniques to ensure that the behavior of large language models is consistent with human values has become an urgent problem to address. Recent research progress on alignment techniques for large language models was surveyed. Common methods for collecting instruction data and human preference datasets were introduced, research on supervised tuning and alignment tuning was summarized, commonly used datasets and methods for model evaluation were discussed, and future research directions were outlined.
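As background for the abstract above, the two alignment-tuning formulations that recur throughout this literature can be sketched compactly in standard notation. The LaTeX below is a restatement of well-known published results (the KL-regularized RLHF objective popularized by Ouyang et al. and the direct preference optimization loss of Rafailov et al.), not a formulation taken from this survey itself. Here \(\pi_\theta\) is the policy being tuned, \(\pi_{\mathrm{ref}}\) the supervised-tuned reference model, \(r_\phi\) a learned reward model, \(\beta\) a regularization strength, and \((y_w, y_l)\) a preferred/dispreferred response pair.

% KL-regularized RLHF objective: maximize reward while keeping the
% tuned policy close to the supervised-tuned reference policy.
\[
\max_{\pi_\theta}\;
  \mathbb{E}_{x \sim \mathcal{D},\, y \sim \pi_\theta(\cdot \mid x)}
  \bigl[ r_\phi(x, y) \bigr]
  \;-\; \beta\, \mathbb{D}_{\mathrm{KL}}\!\bigl[ \pi_\theta(y \mid x) \,\big\|\, \pi_{\mathrm{ref}}(y \mid x) \bigr]
\]

% DPO loss: optimizes an equivalent objective directly on preference
% pairs, with no explicit reward model or reinforcement learning loop.
\[
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}}) =
  -\,\mathbb{E}_{(x,\, y_w,\, y_l) \sim \mathcal{D}}
  \Bigl[ \log \sigma \Bigl(
      \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
    - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
  \Bigr) \Bigr]
\]

The first formulation underlies the RLHF pipeline discussed in the survey (reward modeling followed by policy optimization, typically with PPO), while the second folds both stages into a single supervised-style loss over preference data.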