November 28, 2013

Detecting Twitter users' gender, en francais

With 230 million users, Twitter has become a global force in social media. And not just in English.

Data miners have been hard at work trying to figure out the attributes of Twitter users, such as gender and age, that aren't explicitly revealed on Twitter feeds. That information could be hugely valuable to marketers, enabling them to target messages to their desired audience. Nearly all the research done so far, however, has focused on English-speaking users and English-language content.

Now, a McGill University research team has conducted one of the first studies designed to figure out the gender of Twitter users who primarily use languages other than English.

Among the key findings: using a detector based on French-language syntax, the researchers showed that it is very easy to classify the gender of Twitter users who write in French – and probably of those who write in other Romance languages. In particular, the researchers developed an algorithm that looks for masculine or feminine adjectives or past participles following the phrase "Je suis" ("I am"), or variants such as "je ne suis pas" ("I am not").

Based on this construction, the detector was able to determine the gender of users with 90% accuracy – significantly higher than the accuracy rates of 80% to 85% achieved by various algorithms that have been developed to analyze English-language content.

Because French adjectives and past participles have masculine and feminine forms that are often spelled differently, "you don't have to get too fancy" to develop an effective gender detector for tweets in the language, says Derek Ruths, a McGill computer-science professor who co-authored the study.
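The "je suis" construction described above can be illustrated with a short Python sketch. This is not the researchers' code: the word list, the regular expression and the majority-vote rule are all assumptions made for illustration, and the function names `gender_votes` and `guess_gender` are hypothetical. The sketch only checks the single word that directly follows the phrase, so a tweet such as "je suis très contente" would go uncounted.

```python
import re
from collections import Counter

# Tiny illustrative lexicon (an assumption, not the study's word list).
# Each entry maps a gender-marked adjective or past participle to "M" or "F".
GENDERED_FORMS = {
    "content": "M", "contente": "F",
    "heureux": "M", "heureuse": "F",
    "fatigué": "M", "fatiguée": "F",
    "allé": "M", "allée": "F",
    "prêt": "M", "prête": "F",
}

# Matches "je suis X" and negated variants such as "je ne suis pas X",
# capturing the word X that follows the construction.
JE_SUIS = re.compile(
    r"\bje\s+(?:ne\s+)?suis\s+(?:(?:pas|plus|jamais)\s+)?(\w+)",
    re.IGNORECASE,
)


def gender_votes(tweets):
    """Count masculine and feminine forms found after 'je suis'."""
    votes = Counter()
    for tweet in tweets:
        for match in JE_SUIS.finditer(tweet):
            gender = GENDERED_FORMS.get(match.group(1).lower())
            if gender:
                votes[gender] += 1
    return votes


def guess_gender(tweets):
    """Return 'M' or 'F' by majority vote, or None if there is no signal or a tie."""
    votes = gender_votes(tweets)
    if votes["M"] == votes["F"]:
        return None
    return "M" if votes["M"] > votes["F"] else "F"
```

For example, `guess_gender(["Je suis contente aujourd'hui", "je ne suis pas fatiguée"])` returns `"F"`, while a user whose tweets never contain the construction yields `None`. A real detector would need a much larger lexicon and some handling of intervening adverbs.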

Since most individuals include photos of themselves with their tweets, identifying male and female users might seem as simple as looking at the photos. But sorting through hundreds of millions of tweets is a task for computers, and "computers aren't good at looking at pictures," Ruths notes.

The McGill study was presented at a recent international conference in Seattle organized by the Association for Computational Linguistics. The paper also examines Twitter data sets for Japanese, Indonesian and Turkish. Japanese proved to be the toughest for inferring gender.

The results obtained for French show that some languages have features better suited for certain classification tasks. "Identifying and leveraging such features promises to be an interesting and effective direction for future work," adds McGill linguistics professor Morgan Sonderegger, who co-authored the paper with Ruths and computer-science undergraduate student Morgane Ciot.

More information: Link to the paper: www.derekruths.com/static/publ … rRuths_EMNLP2013.pdf
Link to the conference website: hum.csse.unimelb.edu.au/emnlp2013/

Provided by McGill University

Citation: Detecting Twitter users' gender, en francais (2013, November 28) retrieved 28 April 2024 from https://phys.org/news/2013-11-twitter-users-gender-en-francais.html

