Cloud computing for genome analysis.
The main objective of the project, funded by the EU in February 2013 under the Marie Curie “Industry-Academia Partnerships and Pathways” initiative, is staff exchange, training and research in the field of comparative genomics based on cloud computing and high-performance computing.
RISC Software GmbH is part of an international consortium that also includes the University of Malaga (Spain), the Johannes Kepler University of Linz (Austria), Integromics (Spain), Hospital Carlos Haya (Spain), and the Leibniz Computing Centre (Germany).
In particular, the project aims to promote cooperation between academic and industrial partners through the exchange of staff.
Through its interdisciplinary approach, the project connects the life sciences as an application area with techniques from bioinformatics and cloud computing, among others. This is necessary because large amounts of data generated by modern genome sequencing techniques are to be processed. It is also the main motivation for developing new applications in this field, since existing software tools are not designed for processing complete genomes.
The key themes of the proposed work are the comparison and visualization of large-scale genome sequences, up to full genomes, and of phylogenetic trees. Another goal is the user-friendly presentation of the result data through visualization techniques for a variety of devices, such as tablet PCs or virtual reality environments.
The visualization techniques used include, for example, dot plots, which are created by plotting two genome sequences against each other on the axes of a graph and highlighting matching subsequences. To achieve these project objectives, a software solution is being developed, consisting of cloud computing and high-performance computing components, bioinformatics algorithms, as well as modules for data access and visualization on a variety of output devices. These modules can subsequently be assembled into complex workflows.
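The dot plot idea mentioned above can be illustrated with a minimal sketch. This is not the project's implementation: the function names `dot_plot` and `render`, the exact k-mer matching rule, and the default word length `k=3` are assumptions chosen for illustration only. One sequence is laid along the x-axis, the other along the y-axis, and a point is marked wherever both share an identical subsequence of length `k`.

```python
from collections import defaultdict


def dot_plot(seq_x, seq_y, k=3):
    """Return (i, j) pairs where seq_x[i:i+k] equals seq_y[j:j+k].

    Illustrative only: exact k-mer matching via a hash index on seq_y.
    """
    # Index every k-mer of the y-axis sequence by its start position.
    index = defaultdict(list)
    for j in range(len(seq_y) - k + 1):
        index[seq_y[j:j + k]].append(j)

    # Look up every k-mer of the x-axis sequence in that index.
    hits = []
    for i in range(len(seq_x) - k + 1):
        for j in index.get(seq_x[i:i + k], ()):
            hits.append((i, j))
    return hits


def render(seq_x, seq_y, hits):
    """Draw the hits as a text grid: one row per position in seq_y."""
    marked = set(hits)
    rows = []
    for j in range(len(seq_y)):
        rows.append(
            "".join("*" if (i, j) in marked else "." for i in range(len(seq_x)))
        )
    return "\n".join(rows)
```

Calling `render(a, b, dot_plot(a, b))` on two short strings shows similar regions as diagonal runs of `*`. For full genomes, a grid of this kind becomes far too large for a single machine's memory, which is in line with the motivation for cloud and high-performance computing given in the text.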
For the successful implementation of this project, RISC Software GmbH contributes in particular its competence in the areas of symbolic computation and cloud computing, in close cooperation with the RISC research institute, and will help develop the cloud computing and visualization components in close collaboration with the Leibniz Computing Centre. In addition, RISC Software GmbH is mainly responsible for public relations in the context of the Mr.SymBioMath project.
In preparation for this project, a data processing framework based on Hadoop (http://hadoop.apache.org) for the comparison of genome sequences was implemented by RISC Software GmbH to explore the technical possibilities in terms of data pre-processing, comparison methods and visualization options.