December 29, 2015

Random additions efficiently anonymize large data sets

Balancing transparency and freedom of information against the right to privacy places high demands on data-handling methods. Until now, methods for anonymizing shared data sets have assumed a clear distinction between details that can be used to identify an individual (quasi-identifiers) and details that are deemed 'sensitive' and private, but this distinction does not always hold. Now Yuichi Sei and Akihiko Ohsuga from the University of Electro-Communications, alongside Takao Takenouchi from NEC Corporation in Japan, have devised an algorithm that efficiently anonymizes data sets without assuming this distinction.

The researchers use hospital lists as an example. A data set may include a name (a direct identifier), an address and age (quasi-identifiers) and sensitive information (a medical condition). Even if the name is withheld from each entry, someone using the data set could identify individuals from their age and address. In addition, anonymization should resist attempts to recover particulars by comparing two anonymized versions of the same data.
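The re-identification risk described above can be illustrated with a minimal sketch. The records, place names and the helper `unique_on_quasi_identifiers` are hypothetical and chosen only for illustration; they do not come from the researchers' work. The sketch flags every entry whose age and address combination occurs only once, since such an entry could be linked to a person by anyone who knows those two details.

```python
from collections import Counter

# Hypothetical hospital records with names already removed:
# (age, address, medical condition).
records = [
    (34, "Chofu", "flu"),
    (34, "Chofu", "asthma"),
    (51, "Mitaka", "diabetes"),
    (27, "Fuchu", "flu"),
]

def unique_on_quasi_identifiers(rows):
    """Return the rows whose (age, address) pair occurs exactly once."""
    counts = Counter((age, address) for age, address, _ in rows)
    return [row for row in rows if counts[(row[0], row[1])] == 1]

# Entries that remain identifiable even though no name is stored.
exposed = unique_on_quasi_identifiers(records)
print(exposed)
```

In this toy data set the two entries sharing the same age and address are not singled out, while the other two are.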

One approach to anonymizing data is to add noise to a histogram of the data set, which records the frequency of each possible value for each attribute. However, as Sei, Ohsuga and Takenouchi point out, this can greatly increase the quantity of the data. "Because almost all of the categories have only a few people in the histogram, the noise added to each category of the histogram has a heavy impact."
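The effect described in the quotation can be seen in a small simulation. This is only a sketch under stated assumptions: the article does not say which noise distribution is used, so Laplace noise with scale 1 is assumed here, and the category counts (2 people versus 2,000 people per category) are made up for illustration.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two exponential draws with mean `scale`
    # follows a Laplace distribution centered at 0 (assumed noise model).
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def noisy_histogram(counts, scale, rng):
    """Add independent noise to the count of every category."""
    return {key: value + laplace_noise(scale, rng) for key, value in counts.items()}

def mean_relative_error(counts, noisy):
    """Average of |noisy - true| / true over all categories."""
    return sum(abs(noisy[k] - v) / v for k, v in counts.items()) / len(counts)

rng = random.Random(0)
# Hypothetical histograms: sparse categories hold 2 people, dense ones 2000.
sparse = {f"category{i}": 2 for i in range(1000)}
dense = {f"category{i}": 2000 for i in range(1000)}

err_sparse = mean_relative_error(sparse, noisy_histogram(sparse, 1.0, rng))
err_dense = mean_relative_error(dense, noisy_histogram(dense, 1.0, rng))
print(err_sparse, err_dense)
```

The same amount of noise distorts the sparse histogram far more, in relative terms, than the dense one.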

The UEC and NEC Corporation researchers instead randomized the data set for each attribute and added random values to each entry. "Through simulations of real data sets, we prove that our proposed method can anonymize and reconstruct databases while keeping a high quality of data within a realistic period." The approach may be useful for anonymizing public records such as the census and electronic electoral votes.
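The article does not give the details of the researchers' algorithm, so the following is not their method. It is a generic randomize-then-reconstruct sketch in the same spirit, assuming a simple scheme in which each value is kept with probability `keep_prob` and otherwise replaced by a uniformly random value from the attribute's domain. The function names, the domain and the data are hypothetical.

```python
import random
from collections import Counter

def randomize(values, domain, keep_prob, rng):
    """Keep each value with probability keep_prob, else draw a random one."""
    return [v if rng.random() < keep_prob else rng.choice(domain) for v in values]

def reconstruct(randomized, domain, keep_prob):
    """Estimate the original value frequencies from the randomized data.

    Observed frequency = keep_prob * true + (1 - keep_prob) / len(domain),
    so the true frequency is recovered by inverting that relation.
    """
    n = len(randomized)
    observed = Counter(randomized)
    background = (1 - keep_prob) / len(domain)
    return {v: (observed[v] / n - background) / keep_prob for v in domain}

rng = random.Random(1)
domain = ["flu", "asthma", "diabetes", "none"]
# Hypothetical attribute column with true shares 0.60, 0.10, 0.05 and 0.25.
true_values = ["flu"] * 6000 + ["asthma"] * 1000 + ["diabetes"] * 500 + ["none"] * 2500

released = randomize(true_values, domain, keep_prob=0.5, rng=rng)
estimate = reconstruct(released, domain, keep_prob=0.5)
print(estimate)
```

No single released entry can be trusted to be genuine, yet the overall distribution can still be estimated closely when the data set is large.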

More information: "(l1, ..., lq)-diversity for anonymizing sensitive quasi-identifiers," 2015 IEEE Trustcom/BigDataSE/ISPA, pp. 596-603. DOI: 10.1109/Trustcom-BigDataSe-ISPA.2015.424

Provided by University of Electro-Communications

Citation: Random additions efficiently anonymize large data sets (2015, December 29) retrieved 24 April 2024 from https://phys.org/news/2015-12-random-additions-efficiently-anonymize-large.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.
