Europarl: A Parallel Corpus for Statistical Machine Translation

We collected a corpus of parallel text in 11 languages from the proceedings of the European Parliament, which are published on the web. This corpus has found widespread use in the NLP community. Here, we focus on its acquisition and its application as training data for statistical machine translation (SMT). We trained SMT systems for 110 language pairs, and the results offer interesting clues about the challenges ahead.
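The figure of 110 language pairs follows from counting translation directions: each of the 11 languages can be a source for each of the other 10, giving 11 × 10 = 110 ordered pairs. A minimal sketch of that count is below. The language codes are an assumption added for illustration (the abstract states only the number of languages), and `directed_pairs` is a hypothetical helper name.

```python
from itertools import permutations

# Assumed ISO 639-1 codes for the 11 languages; the abstract gives only the count.
LANGUAGES = ["da", "de", "el", "en", "es", "fi", "fr", "it", "nl", "pt", "sv"]


def directed_pairs(languages):
    """Return every ordered (source, target) pair with source != target."""
    return list(permutations(languages, 2))


pairs = directed_pairs(LANGUAGES)
print(len(pairs))  # 11 * 10 = 110
```

Counting ordered pairs treats, for example, German to English and English to German as two separate systems, which is why the total is 110 rather than 55.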

References in zbMATH (referenced in 15 articles)

  1. Fan, Angela; Bhosale, Shruti; Schwenk, Holger; Ma, Zhiyi; El-Kishky, Ahmed; Goyal, Siddharth; Baines, Mandeep; Celebi, Onur; Wenzek, Guillaume; Chaudhary, Vishrav; Goyal, Naman; Birch, Tom; Liptchinsky, Vitaliy; Edunov, Sergey; Auli, Michael; Joulin, Armand: Beyond English-centric multilingual machine translation (2021)
  2. Song, Yangqiu; Upadhyay, Shyam; Peng, Haoruo; Mayhew, Stephen; Roth, Dan: Toward any-language zero-shot topic classification of textual documents (2019)
  3. Zhu, Hong; Zhang, Xiaowei; Chu, Delin; Liao, Li-Zhi: Nonconvex and nonsmooth optimization with generalized orthogonality constraints: an approximate augmented Lagrangian method (2017)
  4. Chandar, Sarath; Khapra, Mitesh M.; Larochelle, Hugo; Ravindran, Balaraman: Correlational neural networks (2016)
  5. Eldén, Lars; Merkel, Magnus; Ahrenberg, Lars; Fagerlund, Martin: Computing semantic clusters by semantic mirroring and spectral graph partitioning (2013)
  6. Sas, Jerzy; Żołnierek, Andrzej: Pipelined language model construction for Polish speech recognition (2013)
  7. Martínez-Gómez, Pascual; Sanchis-Trilles, Germán; Casacuberta, Francisco: Online adaptation strategies for statistical machine translation in post-editing scenarios (2012)
  8. Navigli, Roberto; Ponzetto, Simone Paolo: BabelNet: the automatic construction, evaluation and application of a wide-coverage multilingual semantic network (2012)
  9. Silvestre-Cerdà, Joan Albert; Andrés-Ferrer, Jesús; Civera, Jorge: Explicit length modelling for statistical machine translation (2012)
  10. Hardoon, David R.; Shawe-Taylor, John: Sparse canonical correlation analysis (2011)
  11. Tinsley, John; Way, Andy: Automatically generated parallel treebanks and their exploitability in machine translation (2009)
  12. Carl, Michael; Melero, Maite; Badia, Toni; Vandeghinste, Vincent; Dirix, Peter; Schuurman, Ineke; Markantonatou, Stella; Sofianopoulos, Sokratis; Vassiliou, Marina; Yannoutsou, Olga: METIS-II: Low resource machine translation (2008)
  13. Owczarzak, Karolina; van Genabith, Josef; Way, Andy: Evaluating machine translation with LFG dependencies (2008)
  14. Wu, Hua; Wang, Haifeng: Pivot language approach for phrase-based statistical machine translation (2008)
  15. Groves, Declan; Way, Andy: Hybrid data-driven models of machine translation (2005)