The new technologies needed for dealing with big data

Feb 20, 2014 by Paul Mccarthy, The Conversation
MongoDB co-founder and chairman Dwight Merriman still writes code. Credit: TechCrunch/Flickr

Much of the discussion of the so-called "Big Data revolution" has focused on the data itself and the exciting new applications it enables, from Google's self-driving cars to the better information systems for oyster farmers built by CSIRO and the University of Tasmania. Less attention has gone to the underpinning technologies and the talent driving them.

At the heart of the Big Data movement is a range of next-generation technologies that enable data to be amassed and analysed on a scale and at a speed hitherto unseen.

Global online services such as Google, Amazon and Facebook, which serve billions of people around the world in real time, are made possible by new technologies that divide tasks and files across banks of thousands of distributed computers.
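To give a feel for what dividing files across many machines can mean, here is a minimal sketch of one common idea, hash partitioning: each item is assigned to a machine according to a hash of its key. The function names `shard_for` and `partition` and the sample keys are invented for illustration, and this is not a description of how any of the companies named above actually do it; production systems use more elaborate schemes.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Pick a shard (machine) for a key from a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap it
    # into the range 0 .. num_shards - 1.
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(keys, num_shards):
    """Group keys into num_shards lists, one list per hypothetical machine."""
    shards = [[] for _ in range(num_shards)]
    for key in keys:
        shards[shard_for(key, num_shards)].append(key)
    return shards


if __name__ == "__main__":
    # Hypothetical file names spread over three machines.
    files = ["photo-001.jpg", "photo-002.jpg", "song-17.mp3", "video-9.mp4"]
    for index, shard in enumerate(partition(files, 3)):
        print(index, shard)
```

Because the hash depends only on the key, any machine can work out where an item lives without consulting a central directory.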

Storing the data

Traditional database technologies are built around tables of information, much like spreadsheets with rows and columns, together with a structured way of asking questions of those tables.

The structured way of asking a question of these data collections was originally named SEQUEL (Structured English Query Language), later shortened to SQL. The language was developed at IBM in the 1970s; Oracle was the first to commercialise it, in 1979, and it has served the company well, making it the undisputed king of database technology ever since.

If you are familiar with Excel, you'll be familiar with the type of information this kind of technology is suited to representing. Company accounts and marketing and sales figures over time are, of course, a perfect fit.
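As a concrete illustration of a table and a structured question, here is a minimal sketch using the `sqlite3` module that ships with Python. The `sales` table, its columns and its figures are all invented for the example.

```python
import sqlite3

# An in-memory database; the table and its figures are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, quarter TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("North", "Q1", 120.0),
        ("North", "Q2", 150.0),
        ("South", "Q1", 90.0),
        ("South", "Q2", 110.0),
    ],
)

# A structured question: total revenue per region, largest first.
totals = conn.execute(
    "SELECT region, SUM(revenue) AS total "
    "FROM sales GROUP BY region ORDER BY total DESC"
).fetchall()

print(totals)  # [('North', 270.0), ('South', 200.0)]
```

The question is expressed in terms of rows and columns, which is why this approach works so well for data that already looks like a spreadsheet.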

But there are other types of data that aren't so easily stored in this way, such as the relationships in a social network (Facebook), an index of documents stored on the web (Google), or large collections of digital music and video (Netflix).

Fortunately, there are ways to store information other than in tables, such as in trees, graphs, or lists with an index. Some of these approaches are much better suited to humungous data sets and to data sets that don't naturally fit into a series of tables.

The growing demand to store and analyse very large bodies of information, and information that is not readily suited to storing in tables (unstructured data), has led to a rapid growth in the popularity of these alternative types of database technologies.

Rising Tide. Credit: Google Trends.

Collectively they've become known as NoSQL technologies. Many of the leading technologies in this category are not developed by a single company, such as Oracle or Microsoft, but are instead open source: developed by an open network of companies, independent developers and contributors, akin to the way Wikipedia or Linux is developed.

Next-generation database technology

There are five key types of next-generation NoSQL data technologies. They are:

  1. Document Store: suitable for storing large collections of documents
  2. Wide Column Store: for very rapid access to structured or semi-structured data
  3. Search Engine: suitable for full-text indexing of documents
  4. Key-Value Store: suitable for rapid access to unstructured data
  5. Graph Database: suitable for storing graph-type data such as social networks
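The five categories differ mainly in the shape of the data they hold. The sketch below mimics each shape with ordinary Python dictionaries and sets. It is a toy illustration under that assumption, not the API of any real product, and every name and record in it is invented.

```python
# 1. Document store: each record is a self-contained, nested document.
documents = {
    "post-1": {
        "title": "Big data",
        "tags": ["nosql", "storage"],
        "author": {"name": "Ann"},
    },
}

# 2. Wide column store: rows are looked up by key, and each row may
#    carry its own set of columns.
wide_rows = {
    "user-1": {"name": "Ann", "city": "Sydney"},
    "user-2": {"name": "Bob", "last_login": "2014-02-20"},
}

# 3. Search engine: an inverted index maps each word to the documents
#    that contain it, so full-text lookups avoid scanning every text.
texts = {"d1": "big data tools", "d2": "graph data"}
inverted_index = {}
for doc_id, text in texts.items():
    for word in text.split():
        inverted_index.setdefault(word, set()).add(doc_id)

# 4. Key-value store: an opaque value fetched by its key.
cache = {"session:42": b"any bytes at all"}

# 5. Graph database: nodes plus the relationships between them,
#    held here as an adjacency list.
friends = {"Ann": {"Bob"}, "Bob": {"Ann", "Cy"}, "Cy": {"Bob"}}


def friends_of_friends(graph, person):
    """People two hops away, excluding the person and their direct friends."""
    direct = graph.get(person, set())
    two_hops = set()
    for friend in direct:
        two_hops |= graph.get(friend, set())
    return two_hops - direct - {person}


print(sorted(inverted_index["data"]))      # ['d1', 'd2']
print(friends_of_friends(friends, "Ann"))  # {'Cy'}
```

A question such as "friends of friends" is a short walk over the graph structure, whereas in a table-based design it would typically require joining a relationships table to itself.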

And the leading technologies in each of these categories respectively are:

Note that Apache Hadoop, which is also a leading technology, is not included in this list because it is a framework and file system rather than a database technology (though it can support many of these).

Where there's talent there's fire

By looking at the companies around the world that have the most employees with skills in each of these frontier technologies, we can get a unique insight into the organisations at the forefront of next-generation applications.

The table (above) looks at 40 leading global organisations that have the greatest number of specialists in each of the top five next-gen database technologies.

The more detailed country-by-country analysis has revealed that some organisations, such as Sky in London and Goldman Sachs in NYC, are leaders in the number of people they have with skills in these emerging areas.
