References:
[1] BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding https://arxiv.org/abs/1810.04805
[2] RoBERTa: A Robustly Optimized BERT Pretraining Approach https://arxiv.org/abs/1907.11692
[3] ALBERT: A Lite BERT for Self-supervised Learning of Language Representations https://arxiv.org/abs/1909.11942
[4] Language Models are Unsupervised Multitask Learners https://d4mucfpksywv.cloudfront.net/better-language-models/language-models.pdf
[5] Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer https://arxiv.org/abs/1910.10683
[6] Turing-NLG: A 17-billion-parameter language model by Microsoft https://www.microsoft.com/en-us/research/blog/turing-nlg-a-17-billion-parameter-language-model-by-microsoft/
[7] Language Models are Few-Shot Learners https://arxiv.org/abs/2005.14165
[8] Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity https://arxiv.org/abs/2101.03961
[9] PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation https://arxiv.org/abs/2104.12369
[10] ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training for Language Understanding and Generation https://arxiv.org/abs/2112.12731
[11] PaLM: Scaling Language Modeling with Pathways https://arxiv.org/abs/2204.02311
[12] GLaM: Efficient Scaling of Language Models with Mixture-of-Experts https://arxiv.org/abs/2112.06905
[13] Train Big, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers https://arxiv.org/abs/2002.11794
[14] A Review of Sparse Expert Models in Deep Learning https://arxiv.org/abs/2209.01667
[15] RoFormer: Enhanced Transformer with Rotary Position Embedding https://arxiv.org/abs/2104.09864
[16] Talking-Heads Attention https://arxiv.org/abs/2003.02436
[17] GLU Variants Improve Transformer https://arxiv.org/abs/2002.05202
[18] Tencent AI Lab Releases the Intelligent Writing Assistant "Effidit" (文涌), Using Technology to Help Inspiration Flow https://mp.weixin.qq.com/s/b-kPSR3aFPKHpUnFv7gmeA
[19] Tencent's "Hunyuan" Large AI Model Tops Three Major CLUE Leaderboards, Breaking Multiple Industry Records http://ex.chinadaily.com.cn/exchange/partners/82/rss/channel/cn/columns/snl9a7/stories/WS628df605a3101c3ee7ad730e.html
— End —
@量子位 (QbitAI) · Tracking new developments in AI technology and products
If this resonated with you, feel free to upvote, follow, and share! վ'ᴗ' ի ❤