May 8, 2009 weblog
Wolfram Alpha 'Knowledge Engine' is Like a Modern Farmer's Almanac
(PhysOrg.com) -- Currently, there's a lot of hype and skepticism surrounding the latest "Google rival," a so-called search engine named Wolfram Alpha. In the near future, anyone with Internet access will be able to freely visit www.wolframalpha.com and investigate how it works. In a media webinar earlier today, Wolfram Alpha's creator, Stephen Wolfram, said the site should be launching "in a bit over a week." Wolfram also showed off his ambitious project by demonstrating a variety of search queries and answering questions from journalists.
First, to answer the question everyone is wondering about: no, Wolfram Alpha doesn't know everything, and in fact it's pretty easy to stump it. Unfortunately, many users will probably try to confuse Alpha and miss out on what it actually can do - which is provide tons of verified data and comparative information, prepared by experts and displayed freely and conveniently for the individual user.
Another big question is, will Alpha out-compete Google? Judging from the demonstration, the answer is definitely not - not any more than Wikipedia has cut into Google's territory, at least. Alpha is like a Wikipedia of pure, raw data, with almost no sentences of text, but full of charts, graphs, tables, maps, figures, etc. It may sound a bit nerdy, but on the whole the results are clearly displayed and easy to skim and interpret. As most people who use the Internet a lot know, there are times when you search Google, and times when you search Wikipedia or YouTube - and soon there will probably be times when you know that the type of information you want is more appropriate for an Alpha query.
As Wolfram explained, Wolfram Alpha is a "computable almanac." Like the famous Farmer's Almanac, it can display lots of statistics that sometimes seem irrelevant from a practical perspective but can be interesting to peruse and are often essential for research purposes. However, unlike a paper almanac, Wolfram Alpha doesn't simply look up stored data; it computes new data. It's not a search engine; it's a "computational knowledge engine." It deals with systematic knowledge, applying algorithms to existing data to generate new facts. In other words, it seems to be less human, and more computer.
Wolfram breaks the concept down into four pillars. First, data curation - data was taken from public and licensed sources, validated by human domain experts, and cited in a link at the bottom of each page called "sources." Second, computational algorithms - developers encoded methods, models, and equations in a form that can be computed, using 5-6 million lines of Mathematica code. Third, linguistic processing - mapping natural language words into symbolic representations that can be computed. Fourth, automated presentation - presenting the results in a way that's useful to people.
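To make the four pillars concrete, here is a toy sketch in Python. It is not how Wolfram Alpha is built (the article says the real system is written in Mathematica), and every name in it (CURATED_FACTORS, UNIT_ALIASES, parse, compute, present, answer) is hypothetical. The only data it uses are two unit conversion factors that are exact by international definition.

```python
import re

# Pillar 1 (data curation): facts entered by hand, each with a source note.
# Both factors are exact by definition; the table layout is hypothetical.
CURATED_FACTORS = {
    ("lb", "kg"): (0.45359237, "international avoirdupois pound definition"),
    ("in", "cm"): (2.54, "international inch definition"),
}

# Pillar 3 (linguistic processing): map loose wording onto canonical symbols.
UNIT_ALIASES = {
    "lb": "lb", "lbs": "lb", "pounds": "lb", "kg": "kg",
    "in": "in", "inches": "in", "cm": "cm",
}

FALLBACK = "Not sure what to do with your input."


def parse(query):
    """Turn text such as '160 lbs in kg' into a symbolic tuple, or None."""
    pattern = r"\s*(\d+(?:\.\d+)?)\s*([a-z]+)\s+(?:in|to)\s+([a-z]+)\s*"
    match = re.fullmatch(pattern, query.lower())
    if match is None:
        return None
    amount, src, dst = match.groups()
    if src not in UNIT_ALIASES or dst not in UNIT_ALIASES:
        return None
    return ("convert", float(amount), UNIT_ALIASES[src], UNIT_ALIASES[dst])


def compute(expr):
    """Pillar 2 (computational algorithms): evaluate the symbolic form."""
    _, amount, src, dst = expr
    if (src, dst) not in CURATED_FACTORS:
        return None
    factor, source = CURATED_FACTORS[(src, dst)]
    return amount * factor, dst, source


def present(query, result):
    """Pillar 4 (automated presentation): format the result for a reader."""
    value, unit, source = result
    return f"{query} = {value:.2f} {unit} (source: {source})"


def answer(query):
    """Run a query through all four stages, with a fallback when stumped."""
    expr = parse(query)
    result = compute(expr) if expr is not None else None
    return present(query, result) if result is not None else FALLBACK


print(answer("160 lbs in kg"))
print(answer("nutmeg production USA"))
```

The second query falls through to the fallback message because the toy parser finds nothing it can map to a symbolic form, which is roughly the failure mode the journalists ran into with the real engine.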
These four components are each very challenging, and, as Wolfram explained, far from complete. Alpha is a long-term project, and the company plans to keep working on it and improving it for years to come. Wolfram said that Alpha is "built on a Mathematica platform and on A New Kind of Science (ANKS) paradigm" (A New Kind of Science is Wolfram's tome on the nature of computation). In that paradigm, simple rules lead to great complexity, and the goal is that all knowledge might be represented by simple rules.
In today's webinar, Wolfram demonstrated many search queries on Alpha (although screenshots were not allowed). For example, "gdp france" generated plots and graphs, with each new set of data displayed in its own module, or "pod." Then, Wolfram typed "what is the gdp of france / spain" and the program computed a comparison of both countries, with overlapping data in the same chart. (Interestingly, this example is somewhat similar to Google's recently added ability to generate synthesized graphs for data such as unemployment rates in multiple cities.)
Other diverse and noteworthy examples included "president of argentina in 1943," which returned Ramon S. Castillo; "tide NYC 11/6/2020," which displayed a graph of the tide in New York City on that day; "15 flips 10 heads," which generated probability and distribution charts for such an event; "ISS," which displayed a map of the International Space Station's location in real time; "1.343495843," which computed possible formulas that yield that particular number; "a__n__g," which gave words that could fill in the blanks of a crossword puzzle (amending and atoning); "earthquakes," which generated a global map highlighting earthquake locations in the past 24 hours; "2 cups oj 1 slice cheddar cheese," which computed a synthesized nutrition label for the snack, including vitamins and minerals; and "running 4 mph 30 minutes age 40 male 5'8" 160 lbs," which presented charts of calories, fat burned, heart rate, and predicted race times for different distances.
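For a sense of the arithmetic behind a query like "15 flips 10 heads," the Python sketch below works out the numbers with the standard binomial formula. It assumes a fair coin and reads the query as "exactly 10 heads in 15 flips"; the article does not say how Alpha interprets the input or which method it uses, and the function names are made up for illustration.

```python
from math import comb


def prob_exact_heads(flips, heads, p=0.5):
    """Binomial probability of exactly `heads` heads in `flips` flips."""
    return comb(flips, heads) * p**heads * (1 - p) ** (flips - heads)


def distribution(flips, p=0.5):
    """Probability of each possible head count, from 0 up to `flips`."""
    return [prob_exact_heads(flips, k, p) for k in range(flips + 1)]


exact = prob_exact_heads(15, 10)       # 3003 / 32768
at_least = sum(distribution(15)[10:])  # 4944 / 32768

print(f"exactly 10 heads:  {exact:.4f}")     # 0.0916
print(f"10 or more heads:  {at_least:.4f}")  # 0.1509
```

The list returned by distribution(15) is the kind of data a distribution chart would plot, with one bar per possible number of heads.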
Just how easy is it to stump Alpha? Journalists' queries that confused the engine included "nutmeg production USA," "teenage pregnancy USA," "area of a soccer pitch" (although it could do "area of a football field"), and, not surprisingly, "what type of sunglasses was Justin Timberlake wearing at the Oscars" (though there was a brief entry on the singer himself). For such queries, users receive the answer "Wolfram Alpha isn't sure what to do with your input."
Wolfram emphasized that the project is a work in progress. Among other things, he hopes to move past an English-only version. When asked why he had kept it a secret for so long, he said that he had not been sure it was going to work. He said he had hoped to do "a nice, quiet, soft launch," but it seems more people were interested in the latest Mathematica-backed endeavor than he, by his own admission, had expected.
Wolfram added that, besides the free version, there will be a professional version for corporations that want to customize the program. While the free version will generate PDFs of the pages, the professional version could export data to spreadsheets, allow users to import their own data and use Alpha's computation abilities to analyze it, and allow users to store their own data and share it within the company. There will also be a method for people to contribute facts, which is not yet available.
© 2009 PhysOrg.com