VFML - a toolkit for mining high-speed time-changing data streams

Welcome to the VFML (Very Fast Machine Learning) toolkit for mining high-speed data streams and very large data sets. VFML is made up of three main components. The first is a collection of tools and APIs that help a user develop new learning algorithms. The second is a collection of implementations of important learning algorithms. The third is a collection of scalable learning algorithms that were developed by Pedro Domingos and Geoff Hulten (with the help of several other people; see Thanks).

VFML is written in standard C (and a bit of Python). It provides a series of tutorials and examples as well as extensive in-source documentation in JavaDoc format. VFML is distributed under a modified BSD license.

VFML provides code to help read and process training data and to gather sufficient statistics from it, along with ADTs for several important machine learning structures and various helper code. You can get an overview of what is provided by visiting the Core APIs and Utility APIs sections of the documentation.

VFML contains a series of tools for working with data sets: cleaning them, sampling them, and splitting them into train/test sets. It also has tools to help you experiment with learning algorithms. See the Other Tools documentation heading for more information.

VFML contains tools for learning decision trees, for learning the structure of belief nets (aka Bayesian networks), and for clustering. Much of this code is easy to modify or extend (several other researchers have benefited from the bnlearn program, for example), and much of it can scale to learning from very large data sets or from data streams. You can get an overview of all the learners by checking out the Learning Programs section.
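
To make the idea of gathering sufficient statistics from a stream more concrete, here is a minimal sketch in standard C. It is not VFML's API: the type and function names (Example, SuffStats, stats_init, stats_add) and the fixed sizes are hypothetical and chosen only for illustration. It assumes discrete attributes and shows one common form of sufficient statistics, per-class value counts, updated one example at a time so that no example needs to be stored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sizes for this sketch; not taken from VFML. */
#define NUM_ATTRS   2
#define NUM_VALUES  3
#define NUM_CLASSES 2

/* One training example: discrete attribute values plus a class label. */
typedef struct {
    int attrs[NUM_ATTRS];
    int label;
} Example;

/* Sufficient statistics: how often each (attribute, value) pair
 * co-occurs with each class, plus per-class and overall totals. */
typedef struct {
    unsigned long counts[NUM_ATTRS][NUM_VALUES][NUM_CLASSES];
    unsigned long class_totals[NUM_CLASSES];
    unsigned long seen;
} SuffStats;

/* Reset all counts to zero. */
void stats_init(SuffStats *s) {
    memset(s, 0, sizeof *s);
}

/* Fold one example into the counts; the example can then be discarded. */
void stats_add(SuffStats *s, const Example *e) {
    assert(e->label >= 0 && e->label < NUM_CLASSES);
    for (size_t a = 0; a < NUM_ATTRS; a++) {
        assert(e->attrs[a] >= 0 && e->attrs[a] < NUM_VALUES);
        s->counts[a][e->attrs[a]][e->label]++;
    }
    s->class_totals[e->label]++;
    s->seen++;
}
```

A caller would invoke stats_init once and then stats_add for each example as it arrives. Memory use depends only on the number of attributes, values, and classes, not on the length of the stream, which is the property that makes this style of bookkeeping suitable for very large data sets.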