The research focus “Ontology-guided MEdical Data Analysis” (OMEDA) comprises generic, ontology-based software systems for the acquisition, validation, and processing of complex medical research data, as well as medical data analytics in the broadest sense.

Domain Expert: The primary goal is not to develop new analysis algorithms, but to enable domain experts (medical researchers, clinical process analysts, QM managers, …), who are usually not IT or data-analytics experts, to extract knowledge from their data. Process models such as the Knowledge Discovery in Databases (KDD) process are examined and extended, and algorithms and methods are investigated and adapted on a technical level so that domain experts can use them in a simple yet safe way.
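One way to picture a KDD-style process made "simple but safe" is a staged pipeline in which prerequisites are checked between stages, so a non-expert receives an explicit error rather than a misleading result. The sketch below is illustrative only; the stage names follow the classic KDD process, and the record fields and checks are assumptions, not part of OMEDA:

```python
# Minimal sketch of a guided KDD-style pipeline: each stage is a plain
# function, and a prerequisite check runs between stages so that a
# non-expert user gets an explicit error instead of a silent empty result.
# Field names ("diagnosis", "value") are hypothetical example data.

def select(records):
    """Selection: keep only records relevant to the question."""
    return [r for r in records if r.get("diagnosis") is not None]

def preprocess(records):
    """Preprocessing: drop records with missing measurements."""
    return [r for r in records if r.get("value") is not None]

def transform(records):
    """Transformation: reduce each record to a feature tuple."""
    return [(r["diagnosis"], float(r["value"])) for r in records]

def run_pipeline(records):
    """Run the stages in order, validating the data between them."""
    for stage in (select, preprocess, transform):
        records = stage(records)
        if not records:  # checked prerequisite between stages
            raise ValueError(f"no data left after stage '{stage.__name__}'")
    return records
```

The point of the design is that validation lives in the pipeline, not in the user's head: the same guard runs no matter which analysis the domain expert configures.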

Ontology-Guided: Work here concentrates, on a technical level, on statistical analysis methods and the corresponding ontology-controlled user guidance, as well as on the mathematical problem of ontology-based distance calculation for complex structured data.
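As a rough illustration of what an ontology-based distance can look like, the sketch below measures the distance between two concepts as the number of edges on the path through their lowest common ancestor in a small, entirely hypothetical medical taxonomy. This is one of the simplest path-based measures, not the specific method used in OMEDA:

```python
# Minimal sketch: path-based distance between concepts in a small,
# hypothetical taxonomy given as child -> parent edges.
PARENT = {
    "viral_pneumonia": "pneumonia",
    "bacterial_pneumonia": "pneumonia",
    "pneumonia": "lung_disease",
    "asthma": "lung_disease",
    "lung_disease": "disease",
}

def ancestors(concept):
    """Return the chain of concepts from `concept` up to the root."""
    chain = [concept]
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain

def ontology_distance(a, b):
    """Edges on the path from a to b via their lowest common ancestor."""
    chain_a, chain_b = ancestors(a), ancestors(b)
    common = set(chain_a) & set(chain_b)
    # Depth of the lowest common ancestor, seen from each concept.
    return (min(chain_a.index(c) for c in common)
            + min(chain_b.index(c) for c in common))
```

For example, two siblings such as viral and bacterial pneumonia end up closer to each other (distance 2 via "pneumonia") than either is to asthma (distance 3 via "lung_disease"), which is exactly the kind of domain structure a purely numerical encoding would miss.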

Medical Data Analysis: Here, concepts are developed in the context of innovation-driven medicine and human-machine symbiosis, using ontologies to gain knowledge from large, complex, heterogeneous structured data sets (“Doctor-in-the-Loop”). The focus lies both on the preparation and processing of medical data and on its safe application and interpretation by medical researchers.

The goal of the “Doctor-in-the-Loop” concept is to let the domain knowledge of medical experts flow into the analysis process (e.g. via interactive machine learning or domain ontologies) and to return the results immediately to the domain experts, who can interpret this knowledge correctly and integrate it into their knowledge pool. Techniques such as Visual Analytics are used, since relationships and differences in the data can be grasped visually faster and more easily than from complex numerical results. Clustering algorithms are based on metrics and distance measures, with the intention of generating a meaningful representation of the data. The foundation for this lies in the clean preparation and validation of the collected data: reliable analysis results require both a good selection of features (feature generation) and verified prerequisites with regard to the data and the applied methods.
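Because such clustering is driven by the distance measure alone, any domain-specific (e.g. ontology-based) distance can be plugged in via a precomputed distance matrix. The following sketch shows this with a tiny single-linkage agglomerative clustering; the matrix values are made-up example data, and this is one simple linkage strategy among several:

```python
# Minimal sketch: agglomerative (single-linkage) clustering driven purely
# by a precomputed, symmetric distance matrix, so any domain-specific or
# ontology-based distance measure can be substituted.

def single_linkage(dist, n_clusters):
    """Merge the closest pair of clusters until n_clusters remain.

    dist: symmetric matrix (list of lists) of pairwise distances.
    Returns a list of clusters, each a set of item indices.
    """
    clusters = [{i} for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(dist[a][b] for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] |= clusters[j]  # merge j into i
        del clusters[j]
    return clusters

# Illustrative matrix: items 0,1 are close, items 2,3 are close,
# and the two groups are far apart.
D = [
    [0, 1, 9, 8],
    [1, 0, 8, 9],
    [9, 8, 0, 1],
    [8, 9, 1, 0],
]
```

Swapping in a different distance matrix changes the grouping without touching the algorithm, which is what makes the choice of metric the decisive modeling step.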