Data links quick fix

Feb 13, 2014

Software that can fix 90 percent of broken links in the web of data, assuming the resources are still on the site's server, has been developed by researchers in Iran. The details are reported this month in the International Journal of Web Engineering and Technology.

Everyone knows the frustration of following a link to an interesting web site only to be presented with an error page because the target page is no longer there. More frustrating, and with wider implications for science, healthcare, industry and other areas, is when machines communicate and expect to find specific resources that turn out to be missing or dislocated from their identifier. This can cause problems when a computer is processing large amounts of data in a financial or scientific analysis, for instance. If the resource is still on the servers, then it should be retrievable given a sufficiently effective algorithm that can recreate the missing links.

Computing engineers Mohammad Pourzaferani and Mohammad Ali Nematbakhsh of the University of Isfahan explain that previous efforts to address the issue of broken links in the web of data have focused on the destination point. This approach has two inherent limitations. First, it homes in on a single point of failure whereas there might be wider issues across a database. Second, it relies on knowledge of the destination data source.

The team has now introduced a method for fixing broken links that works from the source point of the links and discovers the new address of the digital entity that has become detached. Their method builds a superior and an inferior dataset, from which they construct an exclusive data graph for each entity. That graph can be monitored over time in order to identify changes and trap missing links as resources become detached.
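As a rough illustration of the idea, and not the researchers' own code, the sketch below records for each entity the entities that point to it ("superiors") and the entities it points to ("inferiors"), then compares two snapshots to spot entities that have gone missing. The triples, the "ex:" names and the function names are all hypothetical, and treating "no longer appears as a subject" as the sign of detachment is a simplifying assumption.

```python
from collections import defaultdict

# Toy snapshots of (subject, predicate, object) triples. All names are invented.
old_snapshot = [
    ("ex:Alice", "ex:knows", "ex:Bob"),
    ("ex:Bob", "ex:bornIn", "ex:Isfahan"),
]
new_snapshot = [
    ("ex:Alice", "ex:knows", "ex:Bob"),        # stale link: ex:Bob has moved
    ("ex:Robert", "ex:bornIn", "ex:Isfahan"),  # same entity under a new address
]


def build_neighbourhoods(triples):
    """Return (superiors, inferiors) dictionaries for every entity in the triples."""
    superiors = defaultdict(set)  # entity -> entities that link to it
    inferiors = defaultdict(set)  # entity -> entities it links to
    for subject, _predicate, obj in triples:
        inferiors[subject].add(obj)
        superiors[obj].add(subject)
    return dict(superiors), dict(inferiors)


def find_detached(old_triples, new_triples):
    """Entities described (used as a subject) in the old snapshot but not the new one."""
    old_subjects = {s for s, _, _ in old_triples}
    new_subjects = {s for s, _, _ in new_triples}
    return old_subjects - new_subjects


superiors, inferiors = build_neighbourhoods(old_snapshot)
print(find_detached(old_snapshot, new_snapshot))
```

Storing the neighbourhoods from the older snapshot is what makes a later repair possible: once "ex:Bob" disappears, its recorded superiors and inferiors are the only description of it that remains.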

"The proposed algorithm uses the fact that entities preserve their structure even after movement to another location. Therefore, the algorithm creates an exclusive graph structure for each entity," explains Pourzaferani. "This graph consists of two types of entity called 'Superior' and 'Inferior', which are entities that point to the detached entity and are pointed to by it, respectively. When the broken link is detected the algorithm starts its task to find the new location for the detached entity or the best similar candidate for it. To this end, the crawler controller module searches for the superiors of each entity in the inferior dataset, and vice versa. After some steps the search space is narrowed and the best candidate is chosen."
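The search Pourzaferani describes can be sketched in a few lines, under heavy assumptions. The example below narrows the candidates to entities in the new snapshot that share at least one superior or inferior with the detached entity, then ranks them by Jaccard overlap of their neighbourhoods. Jaccard similarity is a stand-in chosen for this sketch; the article does not say which similarity measure the published algorithm uses. The triples and all names are hypothetical.

```python
# Toy (subject, predicate, object) triples. All names are invented.
old_triples = [
    ("ex:Alice", "ex:knows", "ex:Bob"),
    ("ex:Carol", "ex:knows", "ex:Bob"),
    ("ex:Bob", "ex:bornIn", "ex:Isfahan"),
    ("ex:Bob", "ex:worksAt", "ex:UI"),
]
new_triples = [
    ("ex:Alice", "ex:knows", "ex:Bob"),      # stale link to the old address
    ("ex:Carol", "ex:knows", "ex:Robert"),   # already updated
    ("ex:Robert", "ex:bornIn", "ex:Isfahan"),
    ("ex:Robert", "ex:worksAt", "ex:UI"),
    ("ex:Dave", "ex:bornIn", "ex:Isfahan"),
]


def jaccard(a, b):
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def neighbourhood(entity, triples):
    """Return (superiors, inferiors) of one entity within a list of triples."""
    superiors = {s for s, _, o in triples if o == entity}
    inferiors = {o for s, _, o in triples if s == entity}
    return superiors, inferiors


def best_candidate(detached, old, new):
    """Pick the entity in `new` whose neighbourhood best matches the detached one."""
    old_sup, old_inf = neighbourhood(detached, old)
    # Narrow the search space: keep only entities that share a neighbour.
    candidates = {s for s, _, o in new if o in old_inf}
    candidates |= {o for s, _, o in new if s in old_sup}
    candidates.discard(detached)
    scored = []
    for candidate in candidates:
        new_sup, new_inf = neighbourhood(candidate, new)
        score = jaccard(old_sup, new_sup) + jaccard(old_inf, new_inf)
        scored.append((score, candidate))
    if not scored:
        return None, 0.0
    # Ties on score are broken by the candidate's name, purely for determinism.
    score, candidate = max(scored)
    return candidate, score


print(best_candidate("ex:Bob", old_triples, new_triples))
```

In this toy data "ex:Robert" wins because it keeps both of the detached entity's inferiors and one of its two superiors, whereas "ex:Dave" shares only a birthplace.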

The researchers tested the algorithm on two snapshots of DBpedia containing almost 300,000 person entities. Their algorithm identified almost 5,000 entities that had changed between the first snapshot and the second, which was recorded some time later. It went on to relocate 9 out of 10 of the broken links.

More information: "Repairing broken RDF links in the web of data" in Int. J. Web Engineering and Technology, 2013, 8, 395-411
