BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.

We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications. BERT is conceptually simple and empirically powerful. It obtains new state-of-the-art results on eleven natural language processing tasks, including pushing the GLUE score to 80.5% (7.7 percentage point absolute improvement), MultiNLI accuracy to 86.7% (4.6 percentage point absolute improvement), SQuAD v1.1 question answering Test F1 to 93.2 (1.5 point absolute improvement) and SQuAD v2.0 Test F1 to 83.1 (5.1 point absolute improvement).

References in zbMATH (referenced in 92 articles)

Showing results 1 to 20 of 92.
Sorted by year.

  1. Baskerville, Nicholas P.; Keating, Jonathan P.; Mezzadri, Francesco; Najnudel, Joseph: A spin glass model for the loss surfaces of generative adversarial networks (2022)
  2. Hettiarachchi, Hansi; Adedoyin-Olowe, Mariam; Bhogal, Jagdev; Gaber, Mohamed Medhat: Embed2detect: temporally clustered embedded words for event detection in social media (2022)
  3. Liu, Ruibo; Jia, Chenyan; Wei, Jason; Xu, Guangxuan; Vosoughi, Soroush: Quantifying and alleviating political bias in language models (2022)
  4. Li, Zuchao; Zhou, Junru; Zhao, Hai; Zhang, Zhisong; Li, Haonan; Ju, Yuqi: Neural character-level syntactic parsing for Chinese (2022)
  5. Loureiro, Daniel; Mário Jorge, Alípio; Camacho-Collados, Jose: LMMS reloaded: transformer-based sense embeddings for disambiguation and beyond (2022)
  6. Lu, Yaojie; Lin, Hongyu; Tang, Jialong; Han, Xianpei; Sun, Le: End-to-end neural event coreference resolution (2022)
  7. Alam, Md Tanvirul; Bhusal, Dipkamal; Park, Youngja; Rastogi, Nidhi: CyNER: a Python library for cybersecurity named entity recognition (2022) arXiv
  8. Ras, Gabrielle; Xie, Ning; van Gerven, Marcel; Doran, Derek: Explainable deep learning: a field guide for the uninitiated (2022)
  9. Ribeiro, Eugénio; Ribeiro, Ricardo; Martins de Matos, David: Automatic recognition of the general-purpose communicative functions defined by the ISO 24617-2 standard for dialog act annotation (2022)
  10. Tian, Xuedong; Wang, Jiameng; Wen, Yu; Ma, Hongyan: Multi-attribute scientific documents retrieval and ranking model based on GBDT and LR (2022)
  11. Zeng, Zhiyuan; Xiong, Deyi: Unsupervised and few-shot parsing from pretrained language models (2022)
  12. Bakhtin, Anton; Deng, Yuntian; Gross, Sam; Ott, Myle; Ranzato, Marc’Aurelio; Szlam, Arthur: Residual energy-based models for text (2021)
  13. Baskerville, Nicholas P.; Keating, Jonathan P.; Mezzadri, Francesco; Najnudel, Joseph: The loss surfaces of neural networks with general activation functions (2021)
  14. Chen, Jiaoyan; Hu, Pan; Jimenez-Ruiz, Ernesto; Holter, Ole Magnus; Antonyrajah, Denvar; Horrocks, Ian: OWL2Vec*: embedding of OWL ontologies (2021)
  15. Schröder, Christopher; Müller, Lydia; Niekler, Andreas; Potthast, Martin: Small-text: active learning for text classification in Python (2021) arXiv
  16. Ding, Xiaofeng; Yang, Hongfei; Chan, Raymond H.; Hu, Hui; Peng, Yaxin; Zeng, Tieyong: A new initialization method for neural networks with weight sharing (2021)
  17. Evans, Richard; Bošnjak, Matko; Buesing, Lars; Ellis, Kevin; Pfau, David; Kohli, Pushmeet; Sergot, Marek: Making sense of raw input (2021)
  18. Feinauer, Christoph; Lucibello, Carlo: Reconstruction of pairwise interactions using energy-based models (2021)
  19. Fitzpatrick, Trevor; Mues, Christophe: How can lenders prosper? Comparing machine learning approaches to identify profitable peer-to-peer loan investments (2021)
  20. Fresca, Stefania; Dedè, Luca; Manzoni, Andrea: A comprehensive deep learning-based approach to reduced order modeling of nonlinear time-dependent parametrized PDEs (2021)
