Big data analytics for dummies

June 28, 2018, CORDIS
Credit: dani3315, Shutterstock

Big Data is still very much an elite thing: only the most IT-savvy and wealthy businesses have a shot at scratching the surface of its potential. All this could be about to change thanks to a Big Data analytics platform developed under the TOREADOR project, which will automatically handle all major problems related to on-demand data preparation.

"Expectations of Big Data are very high, but the gap between ambition and execution is still large, especially for SMEs," Dr. Ernesto Damiani sighs. And he should know: since early 2016, Dr. Damiani has been leading a 10-strong consortium looking into the reasons for these mixed fortunes and the possible solutions.

If relatively few SMEs have incorporated Big Data analytics into their offerings or internal processes, it's mainly for two reasons. The first is a lack of competence in Big Data analytics, as Dr. Damiani explains. A company willing, for instance, to tailor its offerings to customer behaviour using a free app would have to resort to very expensive consultancy. It's currently the only way to map business goals to a class of data science and technology solutions.

"Concretely, the project brief could be something along the lines of 'collect the events generated by core customers' apps and use them to train a scalable random-forest multi-category classifier of their behaviour to be deployed on a public cloud service'," he says.

The second reason is the long roll-out time and, again, the prohibitive cost of Big Data campaigns even when the data science approach has already been identified. Together, these problems have been keeping SMEs and non-ICT-savvy businesses away from Big Data analytics, although they account for a substantial share of the EU manufacturing backbone.

The TOREADOR (TrustwOrthy model-awaRE Analytics Data platfORm) methodology and toolkit offer a solution to both problems: they automate and commoditise Big Data analytics, while making its tailoring to domain-specific customer requirements much easier than before.

The TOREADOR framework supports two automated transformations. The first starts from a machine-readable declarative model, which captures the data owner's goals, and produces a technology-independent, semantics-aware procedural model describing the computation to be carried out. The second builds on the procedural model to compute a technology-dependent deployment. That deployment can be executed on an Apache platform, at the customer's premises, on commercial cloud services like AWS, as Python code executable on the Azure platform, or as a Docker container.
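To make the two-step idea concrete, here is a minimal Python sketch of a declarative model being turned into a procedural model and then into a target-specific plan. It is an illustration only and not the TOREADOR toolkit's actual API: every class, function, field and template name below (DeclarativeModel, ProceduralStep, to_procedural, to_deployment, TARGET_TEMPLATES) is a hypothetical stand-in, and the real platform describes its procedural models in OWL-S rather than in Python objects.

```python
from dataclasses import dataclass, field
from typing import Dict, List


# Hypothetical declarative model: states WHAT the data owner wants, not how.
@dataclass(frozen=True)
class DeclarativeModel:
    goal: str                      # e.g. "classification"
    data_source: str               # e.g. "app_events"
    constraints: Dict[str, bool] = field(default_factory=dict)


# Hypothetical procedural step: one technology-independent unit of computation.
@dataclass(frozen=True)
class ProceduralStep:
    name: str
    operation: str


def to_procedural(model: DeclarativeModel) -> List[ProceduralStep]:
    """First transformation: goals -> technology-independent steps."""
    steps = [ProceduralStep("ingest", f"read:{model.data_source}")]
    if model.constraints.get("anonymise"):
        steps.append(ProceduralStep("prepare", "anonymise"))
    if model.goal == "classification":
        steps.append(ProceduralStep("train", "random_forest"))
        steps.append(ProceduralStep("predict", "classify"))
    else:
        raise ValueError(f"unsupported goal: {model.goal}")
    return steps


# Hypothetical per-target templates; a real compiler would emit full workflows.
TARGET_TEMPLATES = {
    "python": "{name}(op={op!r})",
    "container": "run-container {name} --op={op}",
}


def to_deployment(steps: List[ProceduralStep], target: str) -> List[str]:
    """Second transformation: procedural steps -> target-specific commands."""
    if target not in TARGET_TEMPLATES:
        raise ValueError(f"unknown target: {target}")
    template = TARGET_TEMPLATES[target]
    return [template.format(name=s.name, op=s.operation) for s in steps]


if __name__ == "__main__":
    model = DeclarativeModel("classification", "app_events", {"anonymise": True})
    for line in to_deployment(to_procedural(model), "python"):
        print(line)
```

The point of the split is that the same list of procedural steps can be handed to to_deployment with a different target, which mirrors how one procedural model can yield several technology-dependent deployments.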

"Our declarative models can interactively collect the business goals of Big Data campaigns and allow the TOREADOR toolkit to provide automatic advice on the feasibility of solutions. Our procedural models then provide an innovative description of the Big Data analytics computation in the OWL/S semantics-aware standards, and our compilers translate these procedural models into fully executable workflows or even into natively parallelised Python code. We're looking at an iterative development process, where non-IT-savvy users can quickly set up a campaign by generating a workflow executable on a public cloud service, and then – if needed – call in developers for generating self-contained Python code," Dr. Damiani explains.

Project partners have already identified four industrial pilots: predictive maintenance of aircraft engines, predictive management of solar power plants, analysis of business application logs, and clickstream analysis for e-commerce applications.

"The TOREADOR platform is available and has been deployed at the four pilot sites. It has also been made available as a free pre-release to selected members of the TOREADOR community, which is composed of European companies (several of them SMEs) recruited with the help of TAIGER (Spain), an innovative SME in the TOREADOR consortium. Details on these early adopters are available on our website. Besides, the TOREADOR methodology has been released to other European projects using Big Data campaigns like EVOTION," Dr. Damiani says.

The project is scheduled for completion at the end of 2018. Until then, the consortium intends to keep enlarging the catalogue of services available in the platform and provide examples of TOREADOR-enabled Big Data campaigns, including training and deployment of advanced machine learning models.
