Success Story: The Estonian COVID-19 Data Portal and KoroGeno-EST

NCC presenting the success story

NCC Estonia coordinates HPC expertise at the national level. Our mission is to analyse, implement and coordinate all necessary activities and to offer services that cover end users' needs, from access to resources and technological consultancy to the provision of training courses for academia, public administrations and industry.

Industrial Organisations Involved:

The Estonian ELIXIR node is organised by the University of Tartu. Other partners are Tallinn University of Technology and the National Institute for Physics and Biophysics. KoroGeno-EST, funded by the European Union COVID-19 Pandemic Response, is a collaboration between the University of Tartu, Synlab and the Estonian Health Board.

Technical/scientific Challenge:

The aim of the KoroGeno-EST study is to sequence (that is, determine the composition of) and analyse the complete genomes of the SARS-CoV-2 viruses that have caused and are causing infections in Estonia, and to perform a molecular epidemiological analysis of them. Like living organisms in the wild, viruses can be divided into groups based on genetic similarity. Grouping with a certain degree of accuracy is needed to obtain a clear visual picture of the presence of variants of concern (VOCs). This problem involves very large-scale data analysis, which can be conducted only with high-performance computing resources and the relevant methods.
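The idea of dividing viruses into groups by genetic similarity can be illustrated with a minimal Python sketch. It is a toy example under stated assumptions, not the pipeline used in the study: the sequences are made-up short fragments assumed to be already aligned, similarity is measured by simply counting differing positions, and the function names and the `max_diff` threshold are hypothetical choices for illustration.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def group_by_similarity(seqs: dict[str, str], max_diff: int) -> list[set[str]]:
    """Single-linkage grouping: samples whose sequences differ at no more
    than max_diff positions end up in the same group (transitively)."""
    parent = {name: name for name in seqs}

    def find(name: str) -> str:
        # Follow parent links to the group representative, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    # Compare every pair of samples and merge the groups of similar ones.
    for a, b in combinations(seqs, 2):
        if hamming(seqs[a], seqs[b]) <= max_diff:
            parent[find(a)] = find(b)

    groups: dict[str, set[str]] = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Made-up toy fragments, not real SARS-CoV-2 data.
samples = {
    "sample_A": "ACGTACGTAC",
    "sample_B": "ACGTACGTAT",
    "sample_C": "TTGTACCTGG",
    "sample_D": "TTGTACCTGA",
}
groups = group_by_similarity(samples, max_diff=1)
```

The pairwise comparison grows quadratically with the number of samples, and real genomes are roughly 30,000 bases long, which hints at why analysis at national scale calls for HPC resources.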

Benefits for further research:

  • Data collected during the KoroGeno-EST study will help to bring together relevant datasets for sharing and analysis in an effort to accelerate coronavirus research.
  • The Estonian COVID-19 data portal enables researchers to upload, access and analyse COVID-19 related reference data and specialist datasets as part of the wider European COVID-19 Data Platform.

Solution:

Genome analysis of SARS-CoV-2 variants was performed on the University of Tartu HPC centre’s Galaxy instance. NCC Estonia supported the study by providing the necessary infrastructure together with in-depth support, code optimization and technical consultancy. The results of the study are available on the Estonian COVID-19 Data Portal. The sequence-based similarity of the samples collected in Estonia can be viewed in the Nextstrain Auspice application.

Scientific impact of this result/solution:

The Estonian COVID-19 Data Portal provides information, guidelines, tools and services that help researchers use Estonian and European infrastructures for data sharing. The portal is a national node of the European COVID-19 Data Portal and is operated by ELIXIR Estonia. NCC Estonia continuously helps with workflow optimization and data handling, and assists with process automation.

SUCCESS STORY HIGHLIGHTS:

  • Keywords: Genomics, COVID-19 Data Portal, Open data, Data sharing, HPC
  • Biotechnology/Bioinformatics
  • HPC, HPDA

Source: https://auspice.biit.cs.ut.ee/ncov/est

Contact:

Ülar Allas, ylar.allas@ut.ee

This project has received funding from the European High-Performance Computing Joint Undertaking (JU) under grant agreement No 951732. The JU receives support from the European Union’s Horizon 2020 research and innovation program and Germany, Bulgaria, Austria, Croatia, Cyprus, the Czech Republic, Denmark, Estonia, Finland, Greece, Hungary, Ireland, Italy, Lithuania, Latvia, Poland, Portugal, Romania, Slovenia, Spain, Sweden, the United Kingdom, France, the Netherlands, Belgium, Luxembourg, Slovakia, Norway, Switzerland, Turkey, Republic of North Macedonia, Iceland, Montenegro