September 19, 2016

Expressing the value of data science in an ROI framework

Data science is rapidly becoming woven into the fabric of organizations of all sizes and types, and is driving significant societal and economic impact. Organizations are increasingly data-driven, investing in infrastructure, people and processes to support their data science work.

In a recent paper published in EPJ Data Science, University of Notre Dame researchers study how organizations can quantify decision making in data science. Doctoral student Saurabh Nagrecha and his adviser, Nitesh Chawla, the Frank M. Freimann Professor of Computer Science and Engineering and director of iCeNSA, argue that data science is a process and present a way to quantify the value of data acquisition and modeling in a return on investment (ROI) framework.

"An ROI-based valuation means that organizations can budget for existing strategies better, readily compare vastly different data strategies, and in a budget-constrained environment even answer the tough questions like 'To achieve the desired outcomes, should I invest money in more data acquisition or more complex modeling or both?'" Chawla said. "We have developed an ROI-based modeling framework, called NPV model in this paper, that can begin to answer such questions."

The NPV (net present value) model enables users to translate a machine learning-based predictive model's performance over time from traditional empirical measures into dollar values by combining machine learning performance, data acquisition costs, operational costs, and investment parameters.

"Typically, success for machine learning models is expressed in accuracy, precision, recall, ROC and other such metrics," Chawla said. "Facets of costs should be incorporated in evaluation, when available, as false negatives might be more costly than false positives, for example. Our paper expands this cost-sensitive classification framework by incorporating costs to acquire external data, modeling costs and operational costs, all of which are essential for the real-world deployment of these machine learning models. Moreover, these predictions don't just happen at once, but instead occur over a timeline—where it is important to consider the time-based valuations under constraints."

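As a rough illustration of the idea Chawla describes, the Python sketch below converts one period's confusion-matrix counts into a dollar figure and then discounts a series of such figures with the textbook net present value formula. It is not the paper's formulation: the class and function names, the treatment of data, modeling and operating costs as flat per-period amounts, and every number are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PeriodOutcome:
    """Confusion-matrix counts for one period (hypothetical structure)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def period_cash_flow(outcome, value_tp, value_tn, cost_fp, cost_fn,
                     data_cost, modeling_cost, operating_cost):
    """Dollar value of one period: prediction benefits minus all costs."""
    benefit = outcome.tp * value_tp + outcome.tn * value_tn
    error_cost = outcome.fp * cost_fp + outcome.fn * cost_fn
    return benefit - error_cost - data_cost - modeling_cost - operating_cost


def net_present_value(cash_flows, discount_rate, initial_investment=0.0):
    """Textbook NPV: discount the flow of period t by (1 + rate) ** t."""
    discounted = sum(
        cf / (1 + discount_rate) ** t
        for t, cf in enumerate(cash_flows, start=1)
    )
    return discounted - initial_investment


# Illustrative numbers only; none of them come from the paper.
outcome = PeriodOutcome(tp=80, fp=10, tn=900, fn=10)
flow = period_cash_flow(
    outcome,
    value_tp=50.0, value_tn=0.0,   # assumed value of correct predictions
    cost_fp=5.0, cost_fn=100.0,    # a false negative is assumed costlier
    data_cost=500.0, modeling_cost=200.0, operating_cost=300.0,
)
print(f"Net value for one period: ${flow:,.2f}")
npv = net_present_value([flow] * 12, discount_rate=0.01)
print(f"NPV over 12 identical periods at 1% per period: ${npv:,.2f}")
```

Keeping the per-period valuation separate from the discounting step means the same cost assumptions can be reused while the time horizon or discount rate changes.
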
Chawla pointed out that a data-driven organization may make predictions on millions of instances of streaming data every day using an in-house predictive model. Such an organization has an idea of the cost of a correct prediction, a false positive and a false negative, as well as its operational costs, the cost of capital for the team, and so on. Using the NPV model, it can now ascribe a value to its entire data science operation and strategize for the future.

"If organizations want to investigate the possibility of tying in external data into their operations, they can use our technique, run it on their current data alongside their in-house data, and get the value of the new model," Nagrecha said. "If this new value, minus the switchover costs, is greater than that of the current model, then it means that over time, it is worth getting external data. Using the same process, they can evaluate competing bids for external data, multiple machine learning techniques, etc., on the same strategy board—all on the basis of their respective NPVs, and select the best ones given expected outcomes and budget."

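The comparison Nagrecha outlines can be sketched in a few lines of Python, assuming each strategy has already been reduced to a series of per-period dollar values. The function names, the strategy names, the budget filter on upfront cost, and all figures are hypothetical and are not taken from the paper.

```python
def net_present_value(cash_flows, discount_rate):
    """Textbook NPV: discount the flow of period t by (1 + rate) ** t."""
    return sum(
        cf / (1 + discount_rate) ** t
        for t, cf in enumerate(cash_flows, start=1)
    )


def worth_switching(current_flows, candidate_flows, discount_rate,
                    switchover_cost):
    """True if the candidate's NPV, net of switchover cost, beats the current one."""
    current_value = net_present_value(current_flows, discount_rate)
    candidate_value = (
        net_present_value(candidate_flows, discount_rate) - switchover_cost
    )
    return candidate_value > current_value


def rank_strategies(strategies, discount_rate, budget):
    """Rank strategies whose upfront cost fits the budget by NPV net of that cost."""
    scored = [
        (name, net_present_value(flows, discount_rate) - upfront)
        for name, (flows, upfront) in strategies.items()
        if upfront <= budget
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Illustrative strategies: name -> (per-period dollar values, upfront cost).
strategies = {
    "in-house only": ([1000.0] * 4, 0.0),
    "vendor A data": ([1400.0] * 4, 900.0),
    "vendor B data": ([1800.0] * 4, 3500.0),
}

for name, value in rank_strategies(strategies, discount_rate=0.02, budget=2000.0):
    print(f"{name}: ${value:,.2f}")
```

In this sketch a strategy whose upfront cost exceeds the budget is dropped before ranking; a real budgeting exercise might instead spread that cost over time or combine several strategies.
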
The team's approach is generally applicable to organizations that want to become more data-driven but have limited resources. The paper provides a strategy board that organizations can use to develop a budget and allocate resources across the activities of the data science process. It starts with answering a basic question, "How valuable is the external data that I can acquire today to my future operations?"

More information: Saurabh Nagrecha et al. Quantifying decision making for data science: from data acquisition to modeling, EPJ Data Science (2016). DOI: 10.1140/epjds/s13688-016-0089-x

Journal information: European Physical Journal Data Science

Provided by University of Notre Dame

Citation: Expressing the value of data science in an ROI framework (2016, September 19) retrieved 10 July 2024 from https://phys.org/news/2016-09-science-roi-framework.html

