This is a website for an H2020 project which concluded in 2019 and established the core elements of EOSC. The project's results now live further in www.eosc-portal.eu and www.egi.eu
The massive streams of high-resolution Earth Observation (EO) data derived from the EU Copernicus Sentinel sensors have established Europe as the predominant spatial data provider for environmental monitoring applications. This data is made available under an open license with unprecedented frequency and spatial extent.
In principle, these data sources can inspire a wide range of science and monitoring applications from regional to continental scales. In practice, most of the innovation is being driven outside Europe, by large US IT companies. This leads to the unfortunate situation where EO science user communities need to rely on non-European platform suppliers for the Big Data Analytics they need to scale high-volume use of data streams.
There is world-class expertise in EO analytics in Europe. What is missing is a solution that provides core cloud services coupled to an online long-term archive of Sentinel data.
Europe needs:
A key requirement is a core computing and storage architecture based on principles tailored to handle very large data sets and fast user query response.
EOSC as a solution
The European Open Science Cloud has the potential to become a viable European alternative to Google Earth Engine for the scientific EO community. The federated resources provided through the EOSC-hub project can become the storage and computing infrastructure necessary to enable full scaling across Copernicus data inputs. As a platform, it should lead to a collation of the many European initiatives in EO software, by establishing a common interface to massively parallel server workflow handling applied to an optimally indexed data storage format.
Components for the client API to define workflow graphs can be adopted from existing open source frameworks (e.g. Jupyter notebooks, python and node.js). An interface with European open datasets would demonstrate immediately the advantage of the EOSC infrastructure in practical EO science applications. This impact can be enhanced by creating an open science data sharing environment. Facilitating executable data analysis “papers” that use EOSC as the common platform would boost reproducibility and scaling of EO science results.
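To make the idea of a client-side workflow graph more concrete, the following is a minimal Python sketch of how such a graph could be defined and executed in a notebook. It is an illustration under stated assumptions, not an EOSC or EOSC-hub API: the `Workflow` class, its `add`, `order` and `run` methods, and the toy red and near-infrared band values are all hypothetical. In a real platform the steps would be dispatched to parallel server-side workers rather than run locally.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class Workflow:
    """Hypothetical client-side workflow graph: named steps plus their dependencies."""

    def __init__(self):
        self.steps = {}  # step name -> callable
        self.deps = {}   # step name -> tuple of upstream step names

    def add(self, name, func, *deps):
        """Register a step and the names of the steps whose outputs it consumes."""
        self.steps[name] = func
        self.deps[name] = deps
        return self

    def order(self):
        """Return the step names so that every step follows its dependencies."""
        return list(TopologicalSorter(self.deps).static_order())

    def run(self):
        """Execute the steps locally in dependency order and collect their results."""
        results = {}
        for name in self.order():
            inputs = [results[d] for d in self.deps[name]]
            results[name] = self.steps[name](*inputs)
        return results


# Toy example: a vegetation index computed from two made-up band samples.
wf = Workflow()
wf.add("red", lambda: [0.1, 0.2])
wf.add("nir", lambda: [0.5, 0.6])
wf.add("ndvi", lambda r, n: [(b - a) / (b + a) for a, b in zip(r, n)], "red", "nir")
wf.add("mean", lambda v: sum(v) / len(v), "ndvi")

results = wf.run()
print(wf.order())
print(results["mean"])
```

Because the graph is declared separately from its execution, the same definition could in principle be shared alongside an executable "paper" and re-run by others against the same data.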
Copernicus sustainability
The capability of commercial providers such as Google and Amazon in the big data analytics domain poses a serious risk to the continuation of the successful Copernicus programme, aggravated by the fact that there are no concrete plans for Europe to maintain a full online archive of Sentinel data.
Here there is an opportunity to leverage the existing EOSC infrastructure and capture European EO expertise around it. ESA is already deleting the oldest Sentinel data holdings from the online archive, and while the deleted data can still be retrieved from ‘cold’ storage (e.g. tape archives), this limits applications requiring long time series. To the best of our knowledge there is no European initiative planning to host the long-term Sentinel data archive.
The EOSC could therefore be used to:
Any action taken in this regard should be in collaboration with and build on existing experiences, such as the Copernicus DIAS and other projects.
How it could be set up
So how could we accomplish this?
Conclusion
The federated nature of EOSC makes it a prominent candidate to address the long-term data storage and analytics challenges of the EU’s Copernicus Sentinel programme. By leveraging its expertise in other Big Data Analytics domains, it can extend its scope to serve the European Earth Observation science community.
This article was prepared with contributions from EuroGEOSS and the EC Joint Research Centre.