Colt: Concept Lineage Tool for Data Flow Metadata Capture and Analysis

Most organizations are becoming increasingly data-driven, often processing data from many different sources to support critical business operations. Beyond the well-addressed challenge of storing and processing large volumes of data, financial institutions in particular face federal regulations that require a high level of accountability for the accuracy and lineage of this data. For companies like GE Capital, which maintain data across a globally interconnected network of thousands of systems, it is growing harder to capture an accurate picture of the data flowing between those systems. To address this problem, we designed and developed a concept lineage tool that allows organizational data flows to be modeled, visualized, and interactively explored. The tool has novel features that allow a data flow network to be contextualized in terms of business-specific metadata such as the concept, business, and product to which it applies. Key analysis features have been implemented, including the ability to trace the origin of particular datasets and to discover all systems holding data that meets user-defined criteria. The tool has been readily adopted by users at GE Capital and in a short time has become a business-critical application, with over 2,200 data systems and over 1,000 data flows captured.
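The two analysis features named in the abstract, tracing where a dataset originates and finding systems by metadata criteria, can be illustrated with a small directed-graph sketch. This is not Colt's implementation, which the abstract does not describe. The class `FlowGraph`, its methods, and the sample system names and metadata keys are all hypothetical, chosen only to make the idea concrete.

```python
from collections import defaultdict, deque


class FlowGraph:
    """Hypothetical in-memory model of systems and the data flows between them."""

    def __init__(self):
        self.metadata = {}                # system name -> metadata dict
        self.upstream = defaultdict(set)  # target system -> direct source systems

    def add_system(self, name, **metadata):
        # Metadata keys (concept, business, product) are illustrative only.
        self.metadata[name] = metadata

    def add_flow(self, source, target):
        # Record that data moves from `source` into `target`.
        self.upstream[target].add(source)

    def trace_origins(self, system):
        """Return every system that feeds `system`, directly or indirectly."""
        seen = set()
        queue = deque([system])
        while queue:
            current = queue.popleft()
            # .get avoids creating empty entries for systems with no sources.
            for source in self.upstream.get(current, ()):
                if source not in seen:
                    seen.add(source)
                    queue.append(source)
        return seen

    def find_systems(self, predicate):
        """Return the systems whose metadata satisfies a user-supplied predicate."""
        return {name for name, meta in self.metadata.items() if predicate(meta)}


# Made-up example data, not taken from GE Capital.
graph = FlowGraph()
graph.add_system("ledger", concept="loan balance", business="lending")
graph.add_system("warehouse", concept="loan balance", business="lending")
graph.add_system("report", concept="risk summary", business="risk")
graph.add_flow("ledger", "warehouse")
graph.add_flow("warehouse", "report")

origins = graph.trace_origins("report")
matches = graph.find_systems(lambda meta: meta.get("concept") == "loan balance")
```

The breadth-first walk visits each system at most once, so it terminates even if the flow network contains cycles. In that case a system on a cycle appears among its own origins.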
References in zbMATH (referenced in 1 article)
- Paul Cuddihy; Justin McHugh; Jenny Weisenberg Williams; Varish Mulwad; Kareem S. Aggour: SemTK: An Ontology-first, Open Source Semantic Toolkit for Managing and Querying Knowledge Graphs (2017) arXiv