September 6, 2012

Mining the blogosphere—Researchers develop tools that make sense of social media

Can a computer "read" an online blog and understand it? Several Concordia computer scientists are helping to get closer to that goal.

Leila Kosseim, associate professor in Concordia's Faculty of Engineering and Computer Science, and a recently graduated doctoral student, Shamima Mithun, have developed a system called BlogSum that has potentially vast applications. It allows an organization to pose a question and then find out how a large number of people talking online would respond. The system is capable of gauging things like consumer preferences and voter intentions by sorting through websites, examining real-life self-expression and conversation, and producing summaries that focus exclusively on the original question.

"Huge quantities of electronic texts have become easily available on the Internet, but people can be overwhelmed, and they need help to find the real content hiding in the mass of information," explains Kosseim, one of the lead researchers at Concordia's Computational Linguistics Laboratory (CLaC lab).

Analyzing informally written language poses unique challenges compared to analyzing, for example, a news article. Blogs, forums and the like contain opinions, emotions and speculations, not to mention spelling errors and poor grammar. A summarization tool must address two particular problems: question irrelevance (sentences that are not relevant to the main question) and discourse incoherence (sentences in which the intent of the writer is unclear).

BlogSum met these challenges with demonstrable efficiency. The researchers developed and tested their tool by examining a set of blogs and review sites. To crunch the data, BlogSum used "discourse relations": ways of filtering and ordering sentences into coherent summaries. BlogSum was measured against prior computational rankings and achieved mostly superior results. In addition, it was evaluated by human subjects, who also found it to be superior. Summaries produced by BlogSum reduced question irrelevance and discourse incoherence, successfully distilling large amounts of text into highly readable summaries.
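To give a rough feel for the question-irrelevance problem, the short Python sketch below filters sentences by how many content words they share with the question. It is an illustration only and is not BlogSum's method: the article says BlogSum relies on discourse relations, whereas this sketch swaps in a much simpler bag-of-words cosine similarity. The function names, the tiny stopword list and the sample sentences are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {
    "a", "an", "the", "is", "are", "was", "do", "does", "of", "to", "in",
    "and", "but", "it", "i", "my", "what", "how", "about", "think", "people",
}


def tokenize(text):
    """Lowercase the text and return its words with stopwords removed."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return [w for w in words if w not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def summarize(question, sentences, max_sentences=2):
    """Keep the sentences most similar to the question, in their original order.

    Sentences with no word overlap with the question are dropped entirely.
    This is simple lexical filtering, not discourse-relation analysis.
    """
    q_vec = Counter(tokenize(question))
    scored = [
        (cosine(q_vec, Counter(tokenize(s))), i, s)
        for i, s in enumerate(sentences)
    ]
    relevant = [t for t in scored if t[0] > 0.0]
    # Highest score first; ties broken by original position.
    top = sorted(relevant, key=lambda t: (-t[0], t[1]))[:max_sentences]
    # Restore original order so the summary reads naturally.
    return [s for _, _, s in sorted(top, key=lambda t: t[1])]


# Made-up sample data, not taken from the study.
question = "What do people think about the camera battery?"
sentences = [
    "The battery lasts two days, which is great.",
    "My cat knocked over a vase yesterday.",
    "I love the camera but the battery drains fast in the cold.",
    "Shipping was slow.",
]
summary = summarize(question, sentences)
print(summary)
```

With this sample input, the sentences about the cat and the shipping share no content words with the question and are filtered out. A word-overlap filter of this kind says nothing about discourse incoherence, the second problem named above, which is where discourse relations come in.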

This study is an example of Natural Language Processing (NLP), in which Concordia, through the CLaC lab, is a leader. NLP stands at the intersection of artificial intelligence and linguistics, seeking to enable computers to derive meaning from human language.

"The field of natural language processing is starting to become fundamental to computer science, with many everyday applications – making search engines find more relevant documents or making smart phones even smarter," explained Kosseim.

Provided by Concordia University

Citation: Mining the blogosphere—Researchers develop tools that make sense of social media (2012, September 6) retrieved 4 May 2024 from https://phys.org/news/2012-09-blogosphereresearchers-tools-social-media.html

