Researchers create an AI to help us make sense of privacy policies

March 1, 2018, Carnegie Mellon University

If you're anything like the average Internet user, you probably didn't spend the estimated 244 hours it would take to read every privacy policy for every website you visited last year. That's exactly why a team led by Carnegie Mellon University just launched an interactive website aimed at helping users make sense of their privacy on the web.

"We've combined crowdsourcing, machine learning, and techniques to extract annotations from privacy policies that help answer key questions that users often care about," says Norman Sadeh, the lead principal investigator on the Usable Privacy Policy Project, a School of Computer Science professor in Carnegie Mellon's Institute for Software Research, and a faculty member in the CyLab Security and Privacy Institute.

The team used artificial intelligence (AI) algorithms to crawl 7,000 of the most popular websites' privacy policies and identify those that contain language about data collection and use, third-party sharing, data retention, and user choice, among other issues. The project website enables people to navigate machine-annotated privacy policies and jump directly to statements of interest to them, including those often buried deep in the text of privacy policies.

The researchers' AI also evaluated each privacy policy for readability. For example, ABC News topped the rankings with a privacy policy written at a "College Graduate" reading level (Grade 26). Google's privacy policy was found to be written at a college reading level (Grade 14), the same as those of YouTube, Reddit and Amazon. Facebook's privacy policy was found to be a tad friendlier, written at a Grade 12 reading level.
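
The article does not say which readability formula the researchers used. As a rough illustration of how a grade level can be computed from text, the sketch below assumes the widely used Flesch-Kincaid grade formula and a crude vowel-group syllable counter; the function names and the two sample sentences are hypothetical and are not taken from the project.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): one syllable per run of vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text: str) -> float:
    """Estimate a U.S. school grade level with the Flesch-Kincaid formula."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


# Hypothetical sample text, written for this sketch only.
simple = "We collect your name. We do not sell it."
dense = (
    "Notwithstanding the foregoing, personally identifiable information "
    "may be disclosed to unaffiliated third parties for commercial purposes."
)

if __name__ == "__main__":
    print(f"simple: grade {flesch_kincaid_grade(simple):.1f}")
    print(f"dense:  grade {flesch_kincaid_grade(dense):.1f}")
```

Long sentences and long words both push the score up, which is why dense legal phrasing tends to land at college level or above.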

"We found that the text of the policies is often vague and ambiguous, and people tend to struggle to interpret and determine what personal information is collected, how it's used, and what other entities it's shared with," Sadeh says. "From a legal standpoint, this is problematic."

To "train" their AI, the team asked a group of law students to manually annotate 115 privacy policies. The AI learned from those annotations and then crawled the policies from over 7,000 of the most popular sites on the web.
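
The train-then-apply workflow described above can be sketched in a few lines. This is a minimal illustration only, assuming a small naive Bayes text classifier; the article does not describe the project's actual models, and the labelled sentences and category names below are hypothetical stand-ins for the law students' annotations.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical hand-labelled sentences (not from the project's data set).
ANNOTATED = [
    ("We collect your email address and location data.", "collection"),
    ("We gather information about the pages you visit.", "collection"),
    ("We share your information with advertising partners.", "third_party_sharing"),
    ("Your data may be disclosed to third parties.", "third_party_sharing"),
    ("We retain your records for two years.", "retention"),
    ("Data is stored until you delete your account.", "retention"),
    ("You can opt out of marketing emails at any time.", "user_choice"),
    ("You may choose to disable cookies in your browser.", "user_choice"),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Count words per label from (sentence, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for sentence, label in examples:
        tokens = tokenize(sentence)
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def predict(model, sentence):
    """Return the most likely label using add-one (Laplace) smoothing."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in tokenize(sentence):
            score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(ANNOTATED)

if __name__ == "__main__":
    for unseen in ["We may share data with third parties.",
                   "You can opt out at any time."]:
        print(unseen, "->", predict(model, unseen))
```

The same pattern scales in principle: a small set of human annotations trains the model, which then labels statements in policies nobody has annotated by hand.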

"While not perfect, our techniques are capable of automatically extracting a large number of privacy statements from the text of privacy policies," says Sadeh. "Eventually, the goal is to make this information available to users via a simple and intuitive browser plug-in that would provide users with personalized summaries highlighting those issues they are most likely to care about."

More information: explore.usableprivacy.org/

1 comment

Cusco
Mar 01, 2018
Seems to me that it would be much better to just create half a dozen standardized policies and web sites could pick "Standard Privacy Policy 3".

I'm reminded of a now-defunct webzine that changed its policy to say essentially, "By reading articles or participating in the discussions on the web site you hereby agree to hand over your first born child to the management of the site." After three months no one had noticed.
