Alexandre Bonvin explains how the WeNMR community builds on EOSC principles to support 12,000 structural biology researchers
What are the main research goals of your community?
WeNMR serves the structural biology community at large. Structural biology studies the functions and interactions of proteins, nucleic acids and other biomolecules using experimental methods such as X-ray crystallography, Nuclear Magnetic Resonance (NMR) or cryo-electron microscopy (cryo-EM). All these methods generate data that need to be processed, analysed and finally converted into three-dimensional (3D) structures (or models) of biomolecules using a variety of computational tools and techniques.
Gaining access to 3D structures of biomolecules, their dynamics, and their interactions with other molecules is key to a proper understanding of their function. It also allows you, for example, to rationalise the effect of disease-causing mutations, to engineer better molecules for material, health or food applications and to obtain a starting point for drug design to combat disease. As such, structural biology has a strong socio-economic impact on many application fields, from health to food to materials.
How many people are involved in WeNMR?
The WeNMR collaboration with EOSC-hub involves Utrecht University in the Netherlands, and the University of Florence and INFN Padova in Italy. They are the partners responsible for the operation, maintenance and further development of the WeNMR thematic services. Those services could not, however, be supported without the strong commitment of resource providers giving us access to grid, cloud and data storage resources. This support has been formalized by a Service Level Agreement with the EGI Federation. Our user community is, however, much larger: over the years, more than 12,000 users from more than 95 different countries have registered.
What are the services you provide, or want to provide, to this collaboration?
The WeNMR thematic services are a suite of web portals providing user-friendly access to complex computational workflows and tasks. These allow both inexperienced and experienced structural biologists to use state-of-the-art software for their data analysis while benefiting from the computational infrastructure provided through the EOSC-hub project. The services make use of high-throughput computing (HTC) resources, but some also use accelerated computing (GPGPU) resources on the grid and cloud computing. As a community we have always been proactive in using new technologies in collaboration with EGI and in the context of European projects. We have, for example, piloted the use of GPGPU resources on the grid using the INDIGO-DataCloud udocker solution. Two of our portals are actively using those resources.
What are the computational challenges?
We need to provide user-friendly tools that hide the complexity of grid/cloud computing from our users, and to ensure sufficient resources to operate them. WeNMR has a long history of using HTC resources under EGI. Maintaining the quality of our services and support, while continuously adapting and improving them (e.g. to make use of new compute models, or to facilitate their use through the implementation of single-sign-on mechanisms), is a constant challenge.
The EOSC is being set up to be Europe’s virtual environment for all researchers to store, manage, analyse and re-use data for research, innovation and educational purposes. How will you interact with this environment?
Within EOSC-hub we are planning to connect some of our portals to data repositories, such as the ones offered by EUDAT, in order to allow users to directly upload and/or download data and results. The data generated by the WeNMR services are, however, very specific to a user or application and not globally reusable by third parties. This is very different from, for example, sky images collected by telescopes. We do aim, however, to facilitate data deposition into public repositories where relevant.
What opportunities will the EOSC open for your community?
Hopefully the landscape of data and compute resources will become much more unified and transparent to our users. Ease of use is key here.
How do you imagine your field in ten years?
In ten years some of the computational approaches that currently require some level of expertise (and access to resources) will become commodities. The use of the EOSC and its associated HTC/HPC and data resources will become as natural to the new generation of researchers as using a smartphone. We know that some of our portals are already being actively used for educational purposes in various bachelor, master and more advanced courses. It is great to see and to know that we are contributing to the training of the next generation of scientists.
The WeNMR portals cover different areas of structural biology such as NMR structure calculations and data analysis (e.g. the AMBER, Xplor-NIH and FANTEN portals), the fitting of structural models into cryo-EM maps (PowerFit), the analysis of mass spectrometry cross-links (DisVis) or the integrative modelling of biomolecular complexes (HADDOCK). The tools are powered by high-throughput computing capabilities provided by the EGI Federation and enhanced with software components developed by the INDIGO-DataCloud project.