(Phys.org) -- English majors might warm to the question of what they want to be when they graduate. Author? OK. Writer? Fine. Master Compiler? Hmm. "Master Compiler" is not a familiar career path to English majors, but it might describe the unique work of INSEAD professor Philip M. Parker. He has a patented system for algorithmically compiling data into book form. He has brought automatically generated books into the mainstream: Amazon lists more than 100,000 books attributed to Parker and more than 700,000 works for his company, ICON Group International. According to reports, a separate entity, EdgeMaven Media, also provides applications that let businesses create their own computer-generated content. Organizations pay for this service to compile data for their reports.
Reports put the time the system needs to compile an entire book on a subject at anywhere from 13 or 20 minutes to a few hours, depending on the topic. Parker's algorithms were designed to mimic the thought process an expert would follow when writing on a given topic. Parker's books have covered a range of subjects, from rare diseases to crossword puzzle books for learning foreign languages. A sample indicates the range: Webster's Slovak-English Thesaurus Dictionary; The 2007-2012 World Outlook for Wood Toilet Seats; The 2009-2014 World Outlook for 60-milligram Containers of Fromage Frais; Ellis-van Creveld Syndrome.
Making the system work involves databases of information, an interface for customizing a query about a topic, and templates into which the information is packaged. The system's database is filled with genre-relevant content. The templates are coded to reflect domain knowledge, so that the text reads as an expert in that particular field would write it. Like a human author, the system opens a word-processing document and creates the report: it reasons through the content, outputs one page after another, copyedits tables, applies content to formats using editorial rules, places headers and footers, and generates summary statements. It then saves the document and updates its table of contents.
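The steps described above (database, topic query, template, summary statement, table of contents) can be illustrated with a very small sketch. This is not Parker's patented system, whose internals are not described here; the data, the template wording, and the names `DATABASE`, `query` and `build_report` are all hypothetical, and the figures are invented placeholders.

```python
from string import Template

# Hypothetical in-memory "database" with invented placeholder figures.
DATABASE = {
    "widgets": [
        {"region": "Europe", "year": 2012, "demand": 80.0},
        {"region": "Asia", "year": 2012, "demand": 120.0},
    ],
}

# Hypothetical sentence template standing in for expert-written boilerplate.
SECTION_TEMPLATE = Template(
    "Demand in $region is estimated at $demand million USD in $year."
)


def query(topic):
    """Return the records stored for a topic (empty list if unknown)."""
    return DATABASE.get(topic.lower(), [])


def build_report(topic):
    """Assemble a plain-text report: title, contents, sections, summary."""
    records = sorted(query(topic), key=lambda r: r["region"])

    # Fill the template once per record to produce the body sections.
    sections = []
    for rec in records:
        body = SECTION_TEMPLATE.substitute(
            region=rec["region"],
            demand=f"{rec['demand']:.1f}",
            year=rec["year"],
        )
        sections.append((f"{rec['region']} Outlook", body))

    # Generate a summary statement from the same data.
    total = sum(r["demand"] for r in records)
    sections.append((
        "Summary",
        f"Total estimated demand across {len(records)} regions: "
        f"{total:.1f} million USD.",
    ))

    # Build the table of contents last, once all sections are known.
    toc = [f"{i}. {heading}" for i, (heading, _) in enumerate(sections, 1)]

    lines = [f"The World Outlook for {topic.title()}", "", "Contents", *toc, ""]
    for i, (heading, body) in enumerate(sections, 1):
        lines += [f"{i}. {heading}", body, ""]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_report("widgets"))
```

Building the table of contents only after every section exists mirrors the order given in the article, where the system updates the contents as a final step.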
Parker has noted that physicians welcome the rare-disease books, since many publishers would be reluctant to take on such titles. The automatically generated books are produced according to a method he patented in 2007. The abstract of the U.S. patent issued to him that year describes the system:
"The present invention provides for the automatic authoring, marketing, and or distributing of title material. A computer automatically authors material. The material is automatically formatted into a desired format, resulting in a title material. The title material may also be automatically distributed to a recipient. Meta material, marketing material, and control material are automatically authored and if desired, distributed to a recipient. Further, the title may be authored on demand, such that it may be in any desired language and with the latest version and content."
Parker has noted that the automated software concept need not be limited to written books but could be extended to a variety of media formats. He has, for example, been working on a video project involving an online dictionary. His interest also extends to media such as TV, for which he is exploring content-generation programs. Using 3-D animation and avatars chosen to be acceptable to people of different age ranges, such work would involve reverse engineering many of the formats we see in television and film and replacing human actors with 3-D characters.