A group of researchers affiliated with the University of São Paulo (USP) campus in Ribeirão Preto, Brazil, has used big data tools such as data mining and network analysis to develop a method to identify technological routes, trends and partnerships in any knowledge area. For this purpose, it collected information from patent databases around the world.
In addition to mapping the patents in an area, this novel method identifies the technological routes used by companies and universities in several countries, as well as emerging trends and networks of partnerships between companies and scientific institutions active in the sector.
Described in an article published in Nature Biotechnology, the method resulted from the postdoctoral research of biologist Cristiano Gonçalves Pereira at the University of São Paulo's Ribeirão Preto School of Economics, Administration and Accounting (FEARP-USP), supported by a scholarship from the São Paulo Research Foundation—FAPESP.
Pereira was supervised by Geciane Silveira Porto, a professor at FEARP-USP and coordinator of the Center for Research on Innovation, Technology Management and Competitiveness (InGTeC). The study also involved Virgínia Picanço-Castro and Dimas Tadeu Covas, respectively researcher and coordinator at the Center for Cell-Based Therapy (CTC), a Research, Innovation and Dissemination Center (RIDC).
The researchers used research on recombinant factor VIII as a practical example to develop the method. Recombinant factor VIII is a protein produced synthetically from human cells to treat patients who suffer from hemophilia A, a blood clotting disorder caused by a genetic mutation in the region of the X chromosome DNA that should produce factor VIII.
The development of studies on technological routes for recombinant factor VIII—as well as for solar energy and biofuels—was supported by FAPESP, which also funded the researchers' use of a database covering patent applications and grants processed by patent offices worldwide.
Advances in hemophilia research
Finding a way to prolong the action of recombinant factor VIII in the organism could considerably improve the lives of hemophiliacs. Combining the protein with new molecules could extend its half-life, which in turn would reduce the number of administrations a patient needs and the cost of treatment.
By analyzing relevant patents, InGTeC and CTC produced knowledge that allowed the researchers to find a molecule called XTEN, which is considered a highly promising means of extending the half-life of recombinant factor VIII.
Pereira explained why the group chose to map patents instead of scientific articles. "Patent databases are more robust and provide a more accurate picture of the state of the art in the field of interest," he said. "In addition, citations of scientific articles are a choice: the researcher decides which references to cite in a paper. In the case of patents, the applicant is required to cite all patents used to develop an innovation even if they belong to a competitor."
Among the innovative trends in research and development on recombinant factor VIII detected by the study is large-scale production with higher quality.
"The focus for much R&D in this field is on adding supplements to recombinant factor VIII so that it's more concentrated, robust and refined. We also observed technologies designed to reduce the protein's immunogenicity, meaning its ability to stimulate an immune response in the patient," Pereira said. Immune response is a problem to be solved because, although recombinant factor VIII is produced from human cells, it may be rejected by the patient's organism if the immune system identifies it as a foreign body and attacks it.
Data mining for the project used Derwent Innovation, a platform covering almost 100 patent offices for jurisdictions around the world, including Brazil.
The researchers found 3,424 patents relating to recombinant factor VIII for the period 1997-2016 (patents in this field typically last 20 years). The years 2017 and 2018 were not searched because of secrecy rules governing patent applications in this period.
Having completed the survey, the researchers also needed to read more than 3,400 documents, which would not have been possible without big data tools such as data mining.
They deployed methods such as text analysis and word frequency distribution, as well as social network analysis, which uses computational resources to study the interactions among the people, groups and organizations in any given field.
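As a rough illustration of word frequency analysis applied to patent titles, the Python sketch below counts terms over a full period and over a more recent window, then ranks terms by how much of their use falls in the recent window. It is not the researchers' code: the titles, the stopword list, the year windows and the function names `term_counts` and `emerging_terms` are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical (year, title) records; not taken from the study's dataset.
titles = [
    (1999, "Method for producing recombinant factor VIII in cell culture"),
    (2004, "Purification of recombinant factor VIII"),
    (2013, "Factor VIII fusion protein with extended half-life"),
    (2015, "Extended half-life factor VIII with reduced immunogenicity"),
    (2016, "Fusion protein for reduced immunogenicity"),
]

# Assumed stopword list; a real analysis would need a fuller one.
STOPWORDS = {"for", "of", "in", "with", "method", "the", "a", "and"}


def term_counts(records, start, end):
    """Count title terms for records whose year lies in [start, end]."""
    counts = Counter()
    for year, title in records:
        if start <= year <= end:
            words = re.findall(r"[a-z][a-z\-]+", title.lower())
            counts.update(w for w in words if w not in STOPWORDS)
    return counts


def emerging_terms(records, full=(1997, 2016), recent=(2012, 2016), min_recent=2):
    """Rank terms by the share of their occurrences that fall in the recent window."""
    all_counts = term_counts(records, *full)
    recent_counts = term_counts(records, *recent)
    shares = {
        term: recent_counts[term] / all_counts[term]
        for term in recent_counts
        if recent_counts[term] >= min_recent
    }
    return sorted(shares.items(), key=lambda kv: (-kv[1], kv[0]))


ranking = emerging_terms(titles)
for term, share in ranking:
    print(f"{term}: {share:.2f}")
```

In this toy data, terms that appear only in recent titles get a share of 1.0, while terms spread across the whole period score lower. The `min_recent` threshold is an arbitrary guard against terms that appear only once.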
"Our first concern was to see how cooperation occurs and map the cooperation networks," Pereira said. "We then identified patent assignees, who's cooperating with whom, interaction between universities, research institutions and companies, who's more or less influential in the network, who needs partnerships, and other factors."
The study showed, for instance, that the United States is the leading country in terms of patent production in this field; that there is a group of European companies that collaborate intensely with each other and are therefore more influential; and that Brazil does not feature in any cooperation networks, probably because collaboration among its companies and research institutions is too rare for the method to detect.
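The idea of a cooperation network can be sketched in a few lines of Python: assignees that appear together on a patent are linked, and a simple centrality score shows who is connected to the most partners. This is only a minimal sketch under stated assumptions. The patent records and assignee names are invented, and degree centrality stands in for whatever influence measures the study actually used.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records: each patent lists its co-assignees (invented names).
patents = [
    {"id": "P1", "assignees": ["Company A", "University X"]},
    {"id": "P2", "assignees": ["Company A", "Company B"]},
    {"id": "P3", "assignees": ["Company B", "University X", "Company A"]},
    {"id": "P4", "assignees": ["Company C"]},
]


def build_cooperation_network(records):
    """Return {frozenset({a, b}): number of patents shared by a and b}."""
    edges = defaultdict(int)
    for record in records:
        for a, b in combinations(sorted(set(record["assignees"])), 2):
            edges[frozenset((a, b))] += 1
    return dict(edges)


def degree_centrality(records, edges):
    """Fraction of the other assignees that each assignee has co-patented with."""
    nodes = {name for record in records for name in record["assignees"]}
    neighbours = {name: set() for name in nodes}
    for pair in edges:
        a, b = tuple(pair)
        neighbours[a].add(b)
        neighbours[b].add(a)
    denominator = max(len(nodes) - 1, 1)
    return {name: len(neighbours[name]) / denominator for name in nodes}


edges = build_cooperation_network(patents)
centrality = degree_centrality(patents, edges)
for name, score in sorted(centrality.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{name}: {score:.2f}")
```

An assignee that never shares a patent, like the invented "Company C" here, ends up with a score of zero and no links, which is how an isolated actor would look in such a map.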
The next step was an analysis of technological prospecting to understand emerging trends in recent years and to detect future trends.
"To do this, we studied patent citations and patent themes, analyzing titles and mining the most frequent terms in the last 20 years. We did the same analysis for the last five years to detect emerging trends," Pereira said.
The method generated two network maps, one for cooperation that identified nodes linking companies and research institutions, and another for patent citations. "The patent citation network map shows how one technology helps construct another and tracks the knowledge flows involved," noted the FAPESP-supported researcher.
Because innovations typically result from a combination of several patents, it is possible to conclude that the most frequently cited patent combination indicates a trend in technological routes.
To analyze citations and map technological routes, Pereira adapted a plugin developed by InGTeC to enable Gephi software to calculate the search path link count, a measure of how often each citation link is traversed by the search paths running through the network, which makes it possible to identify the most frequently used route.
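For readers curious about what a search path link count looks like in practice, the following Python sketch computes it on a tiny invented citation network. It follows one common formulation, in which every patent can be the start of a search path and every path ends at a patent that has not been cited further. It is not the InGTeC plugin and does not use Gephi; the patent labels and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical citation links as (cited, citing) pairs, so that each
# edge points in the direction of knowledge flow.
citations = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E")]


def topological_order(nodes, successors, predecessors):
    """Order nodes so that every cited patent precedes the patents citing it."""
    indegree = {n: len(predecessors[n]) for n in nodes}
    ready = sorted(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        node = ready.pop()
        order.append(node)
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(nodes):
        raise ValueError("citation network contains a cycle")
    return order


def search_path_link_count(links):
    """Return {(cited, citing): number of search paths crossing that link}."""
    links = sorted(set(links))
    successors, predecessors = defaultdict(list), defaultdict(list)
    nodes = set()
    for cited, citing in links:
        successors[cited].append(citing)
        predecessors[citing].append(cited)
        nodes.update((cited, citing))
    order = topological_order(nodes, successors, predecessors)

    # Paths arriving at a node, started at the node itself or at any ancestor.
    paths_in = {}
    for node in order:
        paths_in[node] = 1 + sum(paths_in[p] for p in predecessors[node])

    # Paths leaving a node and ending at a sink (a patent nobody cites yet).
    paths_out = {}
    for node in reversed(order):
        paths_out[node] = sum(paths_out[s] for s in successors[node]) or 1

    return {(u, v): paths_in[u] * paths_out[v] for u, v in links}


splc = search_path_link_count(citations)
for link, count in sorted(splc.items(), key=lambda kv: (-kv[1], kv[0])):
    print(link, count)
```

Links with the highest counts sit on the largest number of search paths, so following them from patent to patent gives one candidate for the dominant technological route in the toy network.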
The study did not result in automatically applicable software or in a specific platform, but according to Pereira, the method can be reproduced to search any field based on the explanations supplied in the article published in Nature Biotechnology.
InGTeC coordinator Silveira Porto stressed the growing importance given to patents in academia now that universities and research institutions are placing more emphasis on application numbers.
"Technological route mapping can help researchers identify emerging technologies in their fields," she said. "When they design a research project, they can use this knowledge to propose studies that are on the scientific and technological frontier."
Technological route mapping is also highly useful for companies. "The method can give a manager or director of research, development and innovation an overview of the emerging technologies in the company's sector or in sectors in which it's planning to invest. It can also be used to compare the stage of technological maturity reached by the company with the maturity of the companies that own the main technologies in the routes identified by the mapping exercise," Porto added.
"Managers can use this information to pursue technological partnerships and technology transfer, or even to acquire or invest in a startup. The method's contribution to corporate strategic intelligence is highly significant."
Cristiano Gonçalves Pereira et al, Patent mining and landscaping of emerging recombinant factor VIII through network analysis, Nature Biotechnology (2018). DOI: 10.1038/nbt.4178