Valmet: A new validation tool for assessing and improving 3D object segmentation

Extracting 3D structures from volumetric images such as MRI or CT is becoming a routine process for diagnosis based on quantitation, for radiotherapy planning, for surgical planning and image-guided intervention, for studying neurodevelopmental and neurodegenerative aspects of brain diseases, and for clinical drug trials. The key issues in segmenting anatomical objects from 3D medical images are validity and reliability. We have developed VALMET, a new tool for the validation and comparison of object segmentations. Its new features, not available in commercial and public-domain image processing packages, are a choice among different metrics for describing differences between segmentations, and the use of graphical overlay and 3D display for visual assessment of the locality and magnitude of segmentation variability. The input to the tool is an original 3D image (MRI, CT, ultrasound) and a series of segmentations generated by several human raters, by automatic (machine) methods, or by both. Quantitative evaluation includes the intra-class correlation of the resulting volumes and four different shape distance metrics: a) the percentage overlap of segmented structures, (R intersect S)/(R union S); b) a probabilistic overlap measure for non-binary segmentations; c) the mean/median absolute distances between object surfaces; and d) the maximum (Hausdorff) distance. All of these measures are calculated both for arbitrarily selected 2D cross-sections and for full 3D segmentations. Segmentation results are overlaid onto the original image data for visual comparison. A 3D graphical display of the segmented organ is color-coded according to the metric selected for measuring segmentation difference. The new tool is in routine use for intra- and inter-rater reliability studies and for testing novel automatic machine segmentation against a gold standard established by human experts.
Preliminary studies showed that the new tool could significantly improve the intra- and inter-rater reliability of hippocampus segmentation, achieving intra-class correlation coefficients higher than those published elsewhere.
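
To make the overlap and surface-distance metrics listed above concrete, the following is a minimal sketch in Python; it is not VALMET's implementation. It assumes that a binary segmentation is given as a set of integer voxel coordinates, that the object surface consists of voxels with at least one 6-connected neighbor outside the object, that distances are measured in voxel units, and that the two directed sets of surface distances are pooled before taking the mean, median, and maximum. All function names (`box`, `overlap`, `surface`, `directed_distances`, `surface_metrics`) are hypothetical and chosen for illustration only.

```python
from itertools import product
from math import dist
from statistics import mean, median

# 6-connected neighborhood used to decide which voxels lie on the surface
# (an assumption of this sketch, not a detail taken from the abstract).
NEIGHBORS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def box(x0, x1, y0, y1, z0, z1):
    """Toy segmentation: voxel set of an axis-aligned box (half-open ranges)."""
    return set(product(range(x0, x1), range(y0, y1), range(z0, z1)))


def overlap(r, s):
    """Overlap (R intersect S)/(R union S) of two binary segmentations."""
    union = r | s
    return len(r & s) / len(union) if union else 1.0


def surface(seg):
    """Voxels of seg that have at least one 6-neighbor outside seg."""
    return {
        (x, y, z)
        for (x, y, z) in seg
        if any((x + dx, y + dy, z + dz) not in seg for dx, dy, dz in NEIGHBORS)
    }


def directed_distances(r, s):
    """Distance from each surface voxel of r to the closest surface voxel of s.

    Brute force, so only suitable for small objects; both segmentations
    are assumed to be non-empty.
    """
    surf_r, surf_s = surface(r), surface(s)
    return [min(dist(a, b) for b in surf_s) for a in surf_r]


def surface_metrics(r, s):
    """Mean, median, and maximum (Hausdorff) surface distance, both directions pooled."""
    pooled = directed_distances(r, s) + directed_distances(s, r)
    return {"mean": mean(pooled), "median": median(pooled), "hausdorff": max(pooled)}


# Two 4x4x4 cubes, the second shifted by one voxel along x.
r = box(0, 4, 0, 4, 0, 4)
s = box(1, 5, 0, 4, 0, 4)
print(overlap(r, s))                       # 0.6
print(surface_metrics(r, s)["hausdorff"])  # 1.0
```

In this toy case the two cubes share 48 of 80 voxels, so the overlap is 0.6, while the largest surface-to-surface distance is a single voxel. The example illustrates why several metrics are useful side by side: the overlap ratio summarizes volumetric agreement, whereas the surface distances indicate how far apart the boundaries are at their worst and on average.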