Declarative information extraction, Web crawling, and recursive wrapping with Lixto

Lixto is a system and method for the visual and interactive generation of wrappers for Web pages under the supervision of a human developer, for automatically extracting information from Web pages using such wrappers, and for translating the extracted content into XML. This paper describes some advanced features of Lixto, such as disjunctive pattern definitions, specialization rules, and Lixto’s capability of collecting and aggregating information from several linked Web pages.

References in zbMATH (referenced in 19 articles)

  1. Gottlob, Georg; Koch, Christoph; Pieris, Andreas: Logic, languages, and rules for web data extraction and reasoning over data (2017)
  2. Han, Wook-Shin; Kwak, Wooseong; Yu, Hwanjo; Lee, Jeong-Hoon; Kim, Min-Soo: Leveraging spatial join for robust tuple extraction from web pages (2014)
  3. Fazzinga, Bettina; Flesca, Sergio; Tagarelli, Andrea: Schema-based Web wrapping (2011)
  4. Schönberg, Christian; Weitl, Franz; Freitag, Burkhard: Verifying the consistency of web-based technical documentations (2011)
  5. Álvarez, Manuel; Pan, Alberto; Raposo, Juan; Bellas, Fernando; Cacheda, Fidel: Finding and extracting data records from web pages (2010)
  6. Björklund, Henrik; Martens, Wim; Schweikardt, Nicole; Schwentick, Thomas: Logik und Automaten: ein echtes Dreamteam (2010)
  7. Eiter, Thomas; Gottlob, Georg; Schwentick, Thomas: The model checking problem for prefix classes of second-order logic: a survey (2010)
  8. Li, Qing; Chen, Jing; Wu, Yipu: Algorithm for extracting loosely structured data records through digging strict patterns (2009)
  9. Braga, Daniele; Campi, Alessandro; Ceri, Stefano; Raffio, Alessandro: Joining the results of heterogeneous search engines (2008)
  10. Mukherjee, Saikat; Ramakrishnan, I. V.: Automated semantic analysis of schematic data (2008)
  11. Barbançon, Francois; Miranker, Daniel P.: SPHINX: Schema integration by example (2007)
  12. Carme, Julien; Gilleron, Rémi; Lemay, Aurélien; Niehren, Joachim: Interactive learning of node selecting tree transducer (2007)
  13. Gottlob, Georg; Koch, Christoph: A formal comparison of visual web wrapper generators (2006)
  14. Li, Zhao; Ng, Wee Keong; Sun, Aixin: Web data extraction based on structural similarity (2005)
  15. Tijerino, Yuri A.; Embley, David W.; Lonsdale, Deryle W.; Ding, Yihong; Nagy, George: Towards ontology generation from tables (2005)
  16. Gottlob, Georg; Koch, Christoph: Monadic Datalog and the expressive power of languages for web information extraction (2004)
  17. Meng, Xiaofeng; Lu, Hongjun; Wang, Haiyan; Gu, Mingzhe: Data extraction from the web based on pre-defined schema (2002)
  18. Baumgartner, Robert; Flesca, Sergio; Gottlob, Georg: Declarative information extraction, Web crawling, and recursive wrapping with Lixto (2001)
  19. Eiter, Thomas (ed.); Faber, Wolfgang (ed.); Truszczyński, Mirosław (ed.): Logic programming and nonmonotonic reasoning. 6th international conference, LPNMR 2001, Vienna, Austria, September 17–19, 2001. Proceedings (2001)