• GPT-3

  • Referenced in 18 articles [sw42135]
  • Generative Pre-trained Transformer 3 (GPT-3) (stylized GPT·3) is an autoregressive language model...
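GPT-3 itself is served through OpenAI's hosted API rather than public checkpoints, so the sketch below illustrates the same autoregressive decoding with the openly available GPT-2 as a stand-in, via the Hugging Face transformers library.

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# GPT-2 stands in for GPT-3 here: both are decoder-only autoregressive LMs.
tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

ids = tok("Autoregressive language models predict", return_tensors="pt").input_ids
out = model.generate(ids, max_length=40, do_sample=True, top_p=0.9,
                     pad_token_id=tok.eos_token_id)  # nucleus sampling
print(tok.decode(out[0]))
```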
  • DialoGPT

  • Referenced in 2 articles [sw42251]
  • response generation model, DialoGPT (dialogue generative pre-trained transformer). Trained on 147M conversation-like exchanges ... DialoGPT extends the Hugging Face PyTorch transformer to attain a performance close to human both ... generate more relevant, contentful and context-consistent responses than strong baseline systems. The pre-trained...
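Since DialoGPT builds on the Hugging Face PyTorch transformer stack, a single response turn can be sketched as follows; the microsoft/DialoGPT-medium checkpoint name follows the public model release.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")

# Dialogue turns are separated by the end-of-sequence token.
ids = tok.encode("Does money buy happiness?" + tok.eos_token, return_tensors="pt")
reply = model.generate(ids, max_length=100, pad_token_id=tok.eos_token_id)
print(tok.decode(reply[0, ids.shape[-1]:], skip_special_tokens=True))
```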
  • CodeBERT

  • Referenced in 5 articles [sw36366]
  • Languages. We present CodeBERT, a bimodal pre-trained model for programming language ... documentation generation, etc. We develop CodeBERT with Transformer-based neural architecture, and train it with ... hybrid objective function that incorporates the pre-training task of replaced token detection, which ... detect plausible alternatives sampled from generators. This enables us to utilize both bimodal data...
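A minimal sketch of embedding a natural-language/code pair with the released checkpoint, assuming the microsoft/codebert-base model on the Hugging Face hub:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("microsoft/codebert-base")
model = AutoModel.from_pretrained("microsoft/codebert-base")

nl = "return the larger of two numbers"
code = "def max2(a, b): return a if a > b else b"
inputs = tok(nl, code, return_tensors="pt")  # bimodal NL-PL input pair
with torch.no_grad():
    out = model(**inputs)
pair_embedding = out.last_hidden_state[:, 0]  # [CLS] vector for the pair
print(pair_embedding.shape)
```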
  • Colorization Transformer

  • Referenced in 2 articles [sw42457]
  • high resolution image. Sampling from the Colorization Transformer produces diverse colorings whose fidelity outperforms ... generated colorings over the ground truth. The code and pre-trained checkpoints for Colorization Transformer...
  • PEGASUS

  • Referenced in 3 articles [sw42120]
  • this work, we propose pre-training large Transformer-based encoder-decoder models on massive text ... removed/masked from an input document and are generated together as one output sequence from...
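The released checkpoints are available through Hugging Face transformers; a minimal summarization sketch, assuming the google/pegasus-xsum fine-tuned checkpoint:

```python
from transformers import PegasusForConditionalGeneration, PegasusTokenizer

name = "google/pegasus-xsum"
tok = PegasusTokenizer.from_pretrained(name)
model = PegasusForConditionalGeneration.from_pretrained(name)

doc = ("PEGASUS pre-trains an encoder-decoder by removing important sentences "
       "from a document and generating them as one output sequence.")
batch = tok([doc], truncation=True, padding="longest", return_tensors="pt")
ids = model.generate(**batch)
print(tok.batch_decode(ids, skip_special_tokens=True)[0])
```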
  • MeshTransformer

  • Referenced in 2 articles [sw42732]
  • relax the mesh topology and allow the transformer self-attention mechanism to freely attend between ... handling challenging situations like partial occlusions. METRO generates new state-of-the-art results ... methods on FreiHAND dataset. Code and pre-trained models are available at https://github.com/microsoft/MeshTransformer...
  • PyMT5

  • Referenced in 2 articles [sw40112]
  • method text-to-text transfer transformer, which is trained to translate between all pairs ... generation, PyMT5 outperforms similarly-sized auto-regressive language models (GPT2) which were English pre-trained...
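PyMT5's own checkpoints are not assumed to be publicly hosted, so the sketch below illustrates the same text-to-text transfer setup with the generic t5-small checkpoint; the task prefix is hypothetical, not PyMT5's actual input formatting.

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

tok = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Hypothetical prefix; PyMT5 trains on all pairings of method bodies and docstrings.
inp = tok("summarize: def add(a, b): return a + b", return_tensors="pt")
out = model.generate(**inp, max_length=32)
print(tok.decode(out[0], skip_special_tokens=True))
```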
  • UCI-ml

  • Referenced in 3444 articles [sw04074]
  • UC Irvine Machine Learning Repository. We currently maintain...
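Datasets in the repository are plain files that can be pulled directly; a minimal sketch using pandas and the long-standing URL of the classic Iris data file (column names added here for readability):

```python
import pandas as pd

url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
cols = ["sepal_length", "sepal_width", "petal_length", "petal_width", "class"]
iris = pd.read_csv(url, header=None, names=cols)
print(iris.head())
```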
  • JStatCom

  • Referenced in 108 articles [sw04873]
  • JStatCom is a software framework that makes it...
  • darch

  • Referenced in 321 articles [sw11086]
darch: Package for deep architectures and Restricted Boltzmann...
  • MNIST

  • Referenced in 295 articles [sw12859]
  • THE MNIST DATABASE of handwritten digits. The MNIST...
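The database ships as 60,000 training and 10,000 test images of 28x28 grayscale digits; a minimal loading sketch via the copy bundled with Keras:

```python
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
print(x_train.shape, y_train.shape)  # (60000, 28, 28) (60000,)
print(x_test.shape, y_test.shape)    # (10000, 28, 28) (10000,)
```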
  • Python

  • Referenced in 2164 articles [sw14460]
  • Python is a widely used high-level, general...
  • word2vec

  • Referenced in 200 articles [sw14978]
  • This tool provides an efficient implementation of the...
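The original tool is a C program; a minimal sketch of the same skip-gram training through the gensim reimplementation, on a toy corpus:

```python
from gensim.models import Word2Vec

sentences = [["the", "cat", "sat", "on", "the", "mat"],
             ["the", "dog", "sat", "on", "the", "rug"]]  # toy corpus
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)  # sg=1: skip-gram
print(model.wv.most_similar("cat", topn=2))
```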
  • TensorFlow

  • Referenced in 653 articles [sw15170]
  • TensorFlow™ is an open source software library for...
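TensorFlow expresses numerical computation as tensors flowing through operations; a minimal sketch:

```python
import tensorflow as tf

x = tf.constant([[1.0, 2.0], [3.0, 4.0]])  # 2x2 input tensor
w = tf.Variable(tf.ones([2, 1]))           # trainable 2x1 weights
y = tf.matmul(x, w)                        # dataflow op: matrix product
print(y.numpy())                           # [[3.], [7.]]
```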
  • ImageNet

  • Referenced in 695 articles [sw21105]
  • ImageNet is an image dataset organized according to...
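The dataset is organized per WordNet synset (one directory per class in the common on-disk layout), which torchvision's generic ImageFolder loader can consume; the path below is a placeholder for a local copy, since ImageNet must be downloaded separately under its license.

```python
from torchvision import datasets, transforms

tfm = transforms.Compose([transforms.Resize(256),
                          transforms.CenterCrop(224),
                          transforms.ToTensor()])
# Placeholder path to a locally downloaded copy.
train = datasets.ImageFolder("/path/to/imagenet/train", transform=tfm)
print(len(train.classes))  # number of synset classes found on disk
```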
  • Adam

  • Referenced in 948 articles [sw22205]
  • Adam: A Method for Stochastic Optimization. We introduce...
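The update rule maintains exponentially decaying averages of gradients and squared gradients with bias correction; a minimal NumPy sketch of one step, following Kingma and Ba's default hyperparameters:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for parameters theta at (1-indexed) step t."""
    m = beta1 * m + (1 - beta1) * grad        # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias-corrected moments
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```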
  • GloVe

  • Referenced in 100 articles [sw26211]
  • GloVe: Global Vectors for Word Representation. GloVe is...
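The pretrained vectors are distributed as plain text, one token per line followed by its vector components; a minimal loading sketch (the commented filename follows the public glove.6B release):

```python
import numpy as np

def load_glove(path):
    """Read a GloVe text file into a {token: vector} dict."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vectors

# vecs = load_glove("glove.6B.100d.txt")
# np.dot(vecs["king"], vecs["queen"])  # similarity after normalizing vectors
```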