Automatic web information extraction in the ROADRUNNER system

Road Runner is a joint project of the Database Group of Università di Roma Tre and the Database Group of Università della Basilicata. The project investigates techniques for extracting data from HTML sites using automatically generated wrappers. Many Web-based applications today rely on wrappers to extract data from HTML pages. These wrappers, however, are usually coded by hand, which makes them difficult and labor-intensive to write and maintain. To automate both wrapper generation and data extraction, the Road Runner project aims to develop original techniques for generating wrappers automatically. A wrapper generation system has been implemented as a working prototype in Java and used to conduct a number of experiments on real-life, data-intensive Web sites. These experiments confirm the feasibility of the approach.

