Big Earth data at your fingertips becomes a reality

June 24, 2015, CORDIS
Credit: EARTHSERVER

Pushing the boundaries of Big Earth Data processing, the EARTHSERVER project allows researchers to access and analyse multi-dimensional data from a wide range of sources.

The earth sciences, like geology, oceanography and astronomy, generate vast quantities of Big Data. Yet without the right tools, scientists either drown in this sea of Big Earth Data or it sits in an archive, barely used.

The vision of the EARTHSERVER project is to offer researchers 'Big Earth Data at your fingertips' so that they can access and manipulate enormous data sets with just a few mouse clicks.

'The project was the result of a 'push' and a 'pull',' says project coordinator Peter Baumann, Professor of Computer Science at Jacobs University in Bremen, Germany. 'On the demand side there was a need for new concepts to handle the wave of data crashing down on us. On the supply side we had a data cube technology that is well-suited to this domain.' A data cube is a three- (or higher) dimensional array of values, commonly used to describe a time series of image data.

Data cubes help researchers access and visualise data

EARTHSERVER built advanced data cubes and custom web portals to make it possible for researchers to extract and visualise earth sciences data as 3-D cubes, 2-D maps or 1-D diagrams. The British Geological Survey, for example, used EARTHSERVER technology to drill down through different layers of the earth in 3-D.
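The three kinds of view mentioned above can be illustrated with a toy example. The sketch below is not EARTHSERVER or rasdaman code: it uses plain Python lists, and the layout `cube[t][y][x]` and all function names are assumptions chosen for illustration only.

```python
# Hypothetical toy cube: cube[t][y][x] holds one value per time step and grid cell.
def make_cube(n_times, n_rows, n_cols):
    return [[[t * 100 + y * 10 + x for x in range(n_cols)]
             for y in range(n_rows)]
            for t in range(n_times)]


def map_at_time(cube, t):
    """2-D map: every grid cell at a single time step."""
    return [row[:] for row in cube[t]]


def series_at_cell(cube, y, x):
    """1-D diagram: a single grid cell followed through time."""
    return [time_slice[y][x] for time_slice in cube]


def subcube(cube, t_range, y_range, x_range):
    """Smaller 3-D cube: trim each axis to a half-open index range."""
    (t0, t1), (y0, y1), (x0, x1) = t_range, y_range, x_range
    return [[row[x0:x1] for row in time_slice[y0:y1]]
            for time_slice in cube[t0:t1]]


cube = make_cube(n_times=4, n_rows=3, n_cols=2)
print(map_at_time(cube, 1))        # [[100, 101], [110, 111], [120, 121]]
print(series_at_cell(cube, 2, 1))  # [21, 121, 221, 321]
```

In each case the user names a region of one logical cube rather than opening individual files, which is the simplification Professor Baumann describes.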

'For the user, data cubes hide the unnecessary complexity of the data,' says Professor Baumann. 'As a user, I don't want to see a million files: I want to see a few data cubes.'

The massive data sets in the earth sciences consist of sensor, image, simulation and statistics data, often with a time dimension. These data typically form regular or irregular grids of values with space/time coordinates. EARTHSERVER made these arrays available as data cubes.

Aside from ease-of-use, the data cubes also made it possible to integrate data from different disciplines, and scientists could combine measurement data with data generated from simulations.

Building on existing technologies

To handle Big Earth Data efficiently, EARTHSERVER needed to extend existing technologies and standards. The SQL database query language, for example, is oriented more towards manipulating alphanumeric data than multi-dimensional arrays.

To enable data cubes, the project was built upon rasdaman, a new type of database management system specialised in multi-dimensional gridded data, called rasters or arrays. Rasdaman enables the flexible, fast extraction of data from Big Earth Data arrays of any size.

'Essentially, we have married the SQL database language with image processing,' says Professor Baumann. 'This is now becoming part of the ISO SQL standard.'

In addition, the project has strongly influenced the Big Earth Data standards of the Open Geospatial Consortium and INSPIRE, the European Spatial Data Infrastructure.

EARTHSERVER's researchers also developed a 'semantic parallelisation' technology that sub-divides a single database query into multiple sub-queries. These are sent to other database servers for processing.

This method allows EARTHSERVER to distribute a single incoming query over more than 1 000 cloud nodes and answer queries on hundreds of terabytes in less than a second.
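The general idea of splitting one query into sub-queries and merging their answers can be sketched as follows. This is only an illustration of the split-and-merge pattern, not the project's 'semantic parallelisation' itself: local threads stand in for remote database servers, and every function name here is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(values, n_chunks):
    """Divide the data into roughly equal contiguous chunks, one per worker."""
    size = max(1, -(-len(values) // n_chunks))  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def partial_sum_count(chunk):
    """Sub-query: each worker returns only a small partial result."""
    return sum(chunk), len(chunk)


def distributed_mean(values, n_workers=4):
    """Fan a mean query out over workers, then merge the partial results."""
    if not values:
        raise ValueError("cannot take the mean of an empty data set")
    chunks = split_into_chunks(values, n_workers)
    # Threads are a stand-in for the remote servers described in the article.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum_count, chunks))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


values = list(range(1, 101))
print(distributed_mean(values))  # 50.5
```

Each worker sends back only a sum and a count, so the final merge step stays cheap however large the individual chunks are.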
