The Penn Treebank Project annotates naturally occurring text for linguistic structure. Most notably, we produce skeletal parses showing rough syntactic and semantic information: a bank of linguistic trees. We also annotate text with part-of-speech tags and, for the Switchboard corpus of telephone conversations, with dysfluency annotation. We are located in the LINC Laboratory of the Computer and Information Science Department at the University of Pennsylvania. All data produced by the Treebank is released through the Linguistic Data Consortium.
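
To make "skeletal parse" and "part-of-speech tag" concrete, here is a minimal Python sketch that reads one tree in the parenthesized bracket notation used for treebank parses and lists each word with its tag. The example sentence and the helper names `parse_bracketed` and `tagged_words` are illustrative assumptions, not taken from the Treebank data or from any Treebank tool.

```python
def parse_bracketed(text):
    """Parse a bracketed tree string into nested (label, children) tuples."""
    # Pad parentheses with spaces so a plain split() yields clean tokens.
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_node():
        nonlocal pos
        assert tokens[pos] == "(", "a node must start with '('"
        pos += 1
        label = tokens[pos]  # phrase label (e.g. NP) or POS tag (e.g. NN)
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())  # nested constituent
            else:
                children.append(tokens[pos])  # a word (leaf)
                pos += 1
        pos += 1  # consume the closing ")"
        return (label, children)

    return read_node()


def tagged_words(node):
    """Collect (word, tag) pairs from the leaves, left to right."""
    label, children = node
    pairs = []
    for child in children:
        if isinstance(child, str):
            pairs.append((child, label))
        else:
            pairs.extend(tagged_words(child))
    return pairs


# Illustrative sentence (an assumption, not drawn from the corpus).
EXAMPLE = "(S (NP (DT The) (NN dog)) (VP (VBD barked)) (. .))"
tree = parse_bracketed(EXAMPLE)
print(tagged_words(tree))
```

The nested tuples keep the phrase structure (S, NP, VP), while `tagged_words` flattens it to the word-level tags, which mirrors the two annotation layers described above.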

  1. Chen, Wenliang; Zhang, Min; Zhang, Yue; Duan, Xiangyu: Exploiting meta features for dependency parsing and part-of-speech tagging (2016)
  2. Deng, Ke; Bol, Peter K.; Li, Kate J.; Liu, Jun S.: On the unsupervised analysis of domain-specific Chinese texts (2016)
  3. Dhillon, Paramveer S.; Foster, Dean P.; Ungar, Lyle H.: Eigenwords: spectral word embeddings (2015)
  4. Martins, André F.T.; Figueiredo, Mário A.T.; Aguiar, Pedro M.Q.; Smith, Noah A.; Xing, Eric P.: $\mathrm{AD}^3$: alternating directions dual decomposition for MAP inference in graphical models (2015)
  5. Balle, Borja; Carreras, Xavier; Luque, Franco M.; Quattoni, Ariadna: Spectral learning of weighted automata. A forward-backward perspective (2014)
  6. Branco, António; Carvalheiro, Catarina; Costa, Francisco; Castro, Sérgio; Silva, João; Martins, Cláudia; Ramos, Joana: DeepBankPT and companion Portuguese treebanks in a multilingual collection of treebanks aligned with the Penn Treebank (2014)
  7. Deoskar, Tejaswini; Mylonakis, Markos; Sima’an, Khalil: Learning structural dependencies of words in the Zipfian tail (2014)
  8. Silva, Ana Paula; Silva, Arlindo; Rodrigues, Irene: An approach to the POS tagging problem using genetic algorithms (2014)
  9. Soutner, Daniel; Müller, Luděk: Continuous distributed representations of words as input of LSTM network language model (2014)
  10. Xu, Feiyu; Li, Hong; Zhang, Yi; Uszkoreit, Hans; Krause, Sebastian: Parse reranking for domain-adaptative relation extraction (2014)
  11. Yi, Youngmin; Lai, Chao-Yue; Petrov, Slav: Efficient parallel CKY parsing using GPUs (2014)
  12. Han, Aaron Li-Feng; Wong, Derek F.; Chao, Lidia S.; He, Liangye; Li, Shuo; Zhu, Ling: Phrase tagset mapping for French and English treebanks and its application in machine translation evaluation (2013)
  13. Zhou, Guo-Dong; Li, Pei-Feng: Improving syntactic parsing of Chinese with empty element recovery (2013)
  14. Chen, Wenliang; Kazama, Jun’ichi; Uchimoto, Kiyotaka; Torisawa, Kentaro: Exploiting subtrees in auto-parsed data to improve dependency parsing (2012)
  15. Leopold, Henrik; Smirnov, Sergey; Mendling, Jan: On the refactoring of activity labels in business process models (2012)
  16. Nadh, Kailash; Huyck, Christian: A neurocomputational approach to prepositional phrase attachment ambiguity resolution (2012)
  17. Severyn, Aliaksei; Moschitti, Alessandro: Fast support vector machines for convolution tree kernels (2012)
  18. Huang, Minhua; Haralick, Robert M.: Discovering text patterns by a new graphic model (2011)
  19. Nguyen, Dat Quoc; Nguyen, Dai Quoc; Pham, Son Bao; Pham, Dang Duc: Ripple down rules for part-of-speech tagging (2011)
  20. Ponzetto, Simone Paolo; Strube, Michael: Taxonomy induction based on a collaboratively built knowledge repository (2011)

