Data Cube, Source: EO Analytics research group, University of Salzburg
EO data cubes have become one of the most important big EO data midstream technologies because they simplify data access and facilitate information production. They also:
- reduce the entrance barrier for non-expert users because users do not need to know specific file structures or internal directory names.
- can be easily accessed over the Internet by client software.
- support deployment on scalable cloud systems, allowing EO data analysis at continental or even global scales.
- embrace time as an important dimension, so users can traverse through time much as they traverse through space in an image file. In purely file-based storage, by contrast, the temporal dimension is represented only by how files are named and organised.
- are tightly coupled with the provision of analysis-ready data (ARD), so that expert and non-expert users can access and use calibrated data directly in analysis without laborious pre-processing.
Compared to purely file-based approaches, an EO data cube is a new way of organising EO data and has become very popular in the last few years. Prior to these developments, EO data were simply saved and organised as images, similar to how people organise vacation pictures from their digital camera: as individual files located in directories on a hard drive. However, every pixel in a remote sensing image is associated with a unique position on Earth and a point in time, which together can be defined by spatio-temporal coordinates. Widely known software like Google Earth abstracts image file handling from user interactions. On such a digital map or globe, navigating to the exact geographic position of interest is sufficient for viewing relevant image data, which is one reason for its popularity.
In this case, the software translates the geographic position on the digital globe to the internal file structure, collects the images, and conveniently presents them as continuous maps. This is possible because every pixel is located at its geographic position. In the simplest case, the images can be thought of as a set of flat, two-dimensional layers, with time as a third dimension. Aligning the images consecutively on top of each other then creates an EO data cube. It is possible to interact with an EO data cube using spatial and temporal coordinates instead of file names because the data cube acts as a translation layer to the image files. Alternatively, the data in an EO data cube can be rearranged for specific access patterns, allowing fine-tuning to fulfil specific performance criteria.
A community-agreed definition of an EO data cube does not exist, although several suggestions have been made. In general, an EO data cube aims to provide easier access to EO data of large volume, variety, and velocity, and to abstract file handling and management away from users. An EO data cube is considered a multi-dimensional structure with at least one non-spatial dimension. It is referred to as a 'cube' even though it can have any number of dimensions.
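The idea of a data cube as a translation layer between coordinates and stored images can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class name TinyCube, the grid origin, the pixel size, and the tiny 2 x 2 "images" are hypothetical, and real EO data cube software is considerably more involved. The sketch only shows how spatial and temporal coordinates, rather than file names, can be used to look up a pixel value or a pixel's time series.

```python
from bisect import bisect_right
from datetime import date


class TinyCube:
    """Toy data cube: a time-ordered stack of equally sized 2-D grids.

    Hypothetical illustration only; not the API of any real data cube library.
    """

    def __init__(self, origin_x, origin_y, pixel_size):
        # Upper-left corner of the grid and the pixel edge length,
        # all in the same (assumed) coordinate units.
        self.origin_x = origin_x
        self.origin_y = origin_y
        self.pixel_size = pixel_size
        self.dates = []   # acquisition dates, kept sorted
        self.layers = []  # layers[i] is the 2-D grid acquired on dates[i]

    def add_image(self, acquired, grid):
        # Insert the image at the position that keeps the time axis ordered.
        i = bisect_right(self.dates, acquired)
        self.dates.insert(i, acquired)
        self.layers.insert(i, grid)

    def _to_index(self, x, y):
        # Translate a spatial coordinate to (row, column) in every layer.
        col = int((x - self.origin_x) // self.pixel_size)
        row = int((self.origin_y - y) // self.pixel_size)
        return row, col

    def value(self, x, y, when):
        # Look up one pixel by space and time; raises ValueError if no
        # image exists for the requested date.
        row, col = self._to_index(x, y)
        return self.layers[self.dates.index(when)][row][col]

    def time_series(self, x, y, start, end):
        # Traverse the time dimension for a single location.
        row, col = self._to_index(x, y)
        return [
            (d, layer[row][col])
            for d, layer in zip(self.dates, self.layers)
            if start <= d <= end
        ]


# Three made-up 2 x 2 images, deliberately added out of time order.
cube = TinyCube(origin_x=10.0, origin_y=50.0, pixel_size=1.0)
cube.add_image(date(2023, 2, 1), [[5, 6], [7, 8]])
cube.add_image(date(2023, 1, 1), [[1, 2], [3, 4]])
cube.add_image(date(2023, 3, 1), [[9, 10], [11, 12]])

single = cube.value(11.5, 49.5, date(2023, 2, 1))
series = cube.time_series(11.5, 49.5, date(2023, 1, 1), date(2023, 2, 1))
print(single)
print(series)
```

The caller never refers to a file or an array index: the coordinate (11.5, 49.5) and a date are enough. In this sketch each image is kept as a separate layer; a cube that is queried mostly for time series might instead store all values of one pixel next to each other, which is one example of rearranging the data for a specific access pattern.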