Oh great, Facebook wants to know you're being sarcastic

Sep 26, 2013 by Matthew Higgs, The Conversation
“Cat! No, wait, dog! Oh I give up.” Credit: Travis S.

You might think social networks couldn't possibly gather more information on you than they already do. That in a world where your every move is tagged, flagged and logged, there is nothing more that could possibly be gleaned from your digital footprints. You'd be wrong.

It was revealed this week that Facebook has followed Google into the world of "deep learning", a new frontier in artificial intelligence (AI) that uses networks of computers to mimic the human brain to better understand the subtle nuances that make us tick.

Internet companies specialise in tailoring services according to what they believe we like. If you spend a lot of time looking at holidays then you'll be shown a lot of adverts about travel services. If you communicate with particular people more than others on Facebook, it will tailor your news feed so you see more of one friend and less of another.

But while this tailoring is remarkable in many respects, it is still clunky in others. Facebook can't detect sarcasm, for example. If you post on your wall that you "just can't wait for more news" about your long-estranged school friend's weight loss as they pursue yet another fad diet, it will take you at your word and send you "helpful" links to celebrity weight loss programmes. Deep learning is an attempt to tap into these more subtle aspects of human thought, building on what we have already achieved in machine learning.

Machine can haz lolcatz

The best known example of deep learning came out of Google X last year. A network of computer processors was let loose on YouTube and quickly learned to identify videos of cats without being told what a cat actually looks like.

Deep learning builds on a long history of machine learning. In this field of AI, the concepts you are trying to find often have labels, such as "picture of a cat" or "picture of a dog". These are objective labels that are based on socially accepted facts.

A cat is a cat and a dog is a dog, so when you explain to a machine what a cat looks like, it can search out pictures of them with ease. The idea behind many machine learning algorithms is that if you show the machine enough labelled data, it can learn how to identify the labels of future unlabelled data.
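The supervised recipe described above can be sketched in a few lines. This is a minimal illustration, not any system Facebook or Google uses: the "pictures" are invented two-number summaries (say, a whisker score and a snout score), and the classifier simply assigns each new example the label of the nearest group average.

```python
# A toy supervised learner: labelled examples in, predictions out.
# All feature values and labels here are invented for illustration.

def centroid(points):
    """Mean point of a list of 2-D feature vectors."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def train(labelled):
    """labelled maps each label to a list of feature vectors."""
    return {label: centroid(pts) for label, pts in labelled.items()}

def predict(model, point):
    """Assign the label whose centroid lies closest to the point."""
    def dist2(a, b):
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    return min(model, key=lambda label: dist2(model[label], point))

training_data = {
    "cat": [(0.9, 0.2), (0.8, 0.3), (1.0, 0.25)],
    "dog": [(0.2, 0.9), (0.3, 0.8), (0.25, 1.0)],
}
model = train(training_data)
print(predict(model, (0.85, 0.2)))  # a cat-like example -> "cat"
```

Real systems use far richer features and models, but the shape is the same: labelled data fixes the categories, and the machine learns a rule that generalises to unlabelled data.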

But human nature is more complex. Traits like sarcasm and fickle shifts in emotion and desires all make understanding us a lot harder than recognising a wet nose and whiskers in an online gallery.

We have to question how well an algorithm can learn in this setting. In machine learning, the performance of an algorithm is often judged using its prediction accuracy. The machine is tested by looking at how many cats and dogs it correctly classifies from a test set. But how do we measure accuracy in a more subjective context?

Traditional machine learning is often divided into two distinct types of method: supervised and unsupervised. Supervised methods focus on problems in which there are labels like "cat" or "dog" and work on the assumption that there is a "teacher" supervising the learning process, showing the computer what a cat is in the first place. In unsupervised learning, there are no such labels and the aim is to simply find interesting features and representations of the data.

Deep learning sits somewhere in between these two approaches and utilises "semi-supervised" methods. The main idea is that the information in unlabelled data can be used to leverage the information in the labelled data.

Suppose an unsupervised algorithm separates a set of pictures into two distinct groups, and is then told that one data point from one group is a cat and one data point from the other group is a dog. If the data set consists only of pictures of cats and dogs, then it is likely that the label on each of those labelled examples applies to its whole group. This is the fundamental idea in many deep learning applications.
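That cluster-then-label idea can be sketched directly. In this toy example (invented one-number features, nothing to do with any real Facebook data), a tiny two-means clustering groups the unlabelled points, and a single labelled example per group is then spread to every member of that group.

```python
# Semi-supervised sketch: cluster unlabelled data, then propagate
# one known label to each whole cluster. All values are invented.

def two_means(points, iters=10):
    """Tiny 2-means on one-dimensional feature values."""
    c0, c1 = min(points), max(points)
    for _ in range(iters):
        g0 = [p for p in points if abs(p - c0) <= abs(p - c1)]
        g1 = [p for p in points if abs(p - c0) > abs(p - c1)]
        c0 = sum(g0) / len(g0)
        c1 = sum(g1) / len(g1)
    return g0, g1

points = [0.1, 0.2, 0.15, 0.9, 0.8, 0.85]   # mostly unlabelled
hints = {0.15: "dog", 0.9: "cat"}            # one label per group

g0, g1 = two_means(points)
labels = {}
for group in (g0, g1):
    label = next(hints[p] for p in group if p in hints)
    for p in group:
        labels[p] = label

print(labels[0.2], labels[0.8])  # the hints have spread: dog cat
```

Two labelled examples have effectively labelled the entire data set, which is the leverage semi-supervised methods offer when labels are scarce.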

In the same way that different parts of the brain focus on different functions, a deep learning network can be partitioned into different areas of activity. Generally, the lower layers in a deep learning network focus on discovering the interesting features in unlabelled data, while the top layers connect these features to observable labelled data.

This is analogous to the way in which the brain receives and processes visual information through the eye. By learning the interesting features of input information, deep learning networks are able to make better predictions with fewer labels.
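The division of labour between layers can be caricatured in code. Here the "lower layer" is just centring and scaling learned from plentiful unlabelled data, and the "top layer" is a threshold fitted on the learned feature from only two labelled examples; a real deep network learns far richer features, but the staging is the same. All numbers are invented.

```python
# Layered sketch: an unsupervised feature stage feeding a small
# supervised top stage. Data and thresholds are invented.

def fit_features(unlabelled):
    """Unsupervised stage: learn a mean and spread from raw inputs."""
    mean = sum(unlabelled) / len(unlabelled)
    spread = max(abs(x - mean) for x in unlabelled) or 1.0
    return lambda x: (x - mean) / spread   # the learned feature

def fit_top(transform, labelled):
    """Supervised stage: a cut point on the learned feature."""
    dogs = [transform(x) for x, y in labelled if y == "dog"]
    cats = [transform(x) for x, y in labelled if y == "cat"]
    cut = (max(dogs) + min(cats)) / 2
    return lambda x: "cat" if transform(x) > cut else "dog"

unlabelled = [2.0, 2.5, 3.0, 8.0, 8.5, 9.0]  # lots of raw data
labelled = [(2.2, "dog"), (8.8, "cat")]      # very few labels

model = fit_top(fit_features(unlabelled), labelled)
print(model(2.8), model(8.2))  # -> dog cat
```

Because the feature stage was fitted on the unlabelled pool, two labels suffice to place the decision boundary sensibly, which is the "better predictions with fewer labels" claim in miniature.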

You and your feed

Facebook holds a huge amount of information, and a certain amount of it indicates the preferences of the user. Deep learning can leverage all this additional information to better predict what the user really wants.

Whether or not someone is being sarcastic in their posts depends inherently on the context of use. Who they are speaking to, the preceding comment, the topic of the thread, even the time of day, are all features that may matter in determining whether someone is being sarcastic. Processing all this data to capture such high-level concepts requires the sophisticated machinery of deep learning. Then, once these features have been learned, perhaps a simple variable such as the number of likes on your comment can be used to indicate how well it was received.

These methods are not just limited to Facebook. Anywhere we leave a digital footprint, machine learning methods can be applied to find the traits of our behaviour within the digital environment.

This does, however, open up ethical issues about how our data is collected and used. Companies are becoming better and better at accurately representing us and telling us what they think we want. They can only go so far, though, so if you don't like the way AI is heading, you can always beat the system by leading a more spontaneous life.
