Enhancing LLM Performance: Efficacy, Fine-Tuning, and Inference Techniques
Peyman Passban, Andy Way, Mehdi Rezagholizadeh
Product Description
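This book is a pioneering exploration of the state-of-the-art techniques that drive large language models (LLMs) toward greater efficiency and scalability. Edited by three distinguished experts, Peyman Passban, Mehdi Rezagholizadeh, and Andy Way, it offers practical solutions to the growing challenges of training and deploying these massive models. The editors' extensive experience across academia, research, and industry provides deep insight into the tools and strategies needed to improve LLM performance while reducing computational demands.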
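More than a technical guide, the book bridges research and real-world application. Each chapter presents cutting-edge advances in inference optimization, model architecture, and fine-tuning techniques, all aimed at making LLMs more usable across domains. Readers will find extensive discussion of the practical aspects of implementing and deploying LLMs in real-world scenarios. As a comprehensive resource for researchers and industry professionals, it balances in-depth technical insight with practical, hands-on guidance. It is a go-to reference for students and researchers in computer science and related subfields, including machine learning and computational linguistics.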
About the Authors
Peyman Passban obtained his Ph.D. in 2018 from Dublin City University, Dublin, Ireland, where he also spent one year as a postdoctoral researcher working on neural machine translation models for low-resource and morphologically complex languages. He then moved into full-time industry research, holding positions such as Senior Researcher, Research Lead, and Director of Engineering at companies including Huawei Technologies and Amazon, among others. He has published more than 20 papers in venues such as ACL, AAAI, NAACL, EMNLP, COLING, and TALLIP, and is an active member of the research community: Organizer at LoResMT, Organizer and Area Chair at ENLSP, Committee Member at ACL and EMNLP, and Reviewer for the Springer Journal of Machine Translation, among other roles. His work extends beyond research labs into sectors including life sciences, conversational AI agents, and smart glasses.
Andy Way obtained a B.Sc. (Hons) in 1986, an M.Sc. in 1989, and his Ph.D. in 2001 from the University of Essex, Colchester, U.K. From 1988 to 1991, he worked at the University of Essex on the Eurotra MT project. He joined DCU in 1991, where he is a Full Professor. He received the 2015 DCU President's Research Award for Science and Engineering, and in 2019 he received the prestigious Award of Honour from the International Association for Machine Translation for his services to the community. Prof. Way co-founded the SFI-funded Centre CNGL in 2007 and the ADAPT Centre in 2015. He took a career break from 2011 to 2013 to work in the translation industry in the U.K. On his return to DCU in January 2014, he served as Deputy Director of CNGL and subsequently as Deputy Director and Co-Applicant of ADAPT.
Mehdi Rezagholizadeh obtained a B.Sc. in 2009 and an M.Sc. in 2011 from the University of Tehran, and a Ph.D. in 2016 from McGill University in Electrical and Computer Engineering (Centre for Intelligent Machines). He joined Huawei in January 2017, where his research has covered deep learning and NLP projects such as generative adversarial networks, neural machine translation, adversarial neural text generation, and efficient NLP for pre-trained language models. He is now a Principal Research Scientist and has led the NLP team of Huawei Noah's Ark Lab in Canada since 2018. He has more than eight years of industry experience in a broad range of roles, including Software Developer, Research Engineer, Research Scientist, and Team Leader. Mehdi has contributed to more than 15 patents and 50 published papers in top journals and conferences such as TACL, NeurIPS, AAAI, ACL, NAACL, EMNLP, EACL, Interspeech, and ICASSP.