Impact Assessment of Innovation Policy

4.4 Additional Data Sources

“Big data” can be a useful source of additional information for this project, complementing the available data sources. This is particularly so because researchers have a strong incentive to make their work widely known, partly as a way to seek future collaborations. Incorporating these data sources requires careful consideration of the representativeness of the samples obtained and adequate interpretation of the new data. Box A.2 summarises some of the most interesting data sources.

Box 1. Additional Data Sources

Platforms storing academic research are probably the most relevant initially. These include Google Scholar, a source that can complement the Scopus and Web of Science databases. Some of its features, such as the option for researchers to publish author pages, can help improve identification (e.g. by attributing author information correctly to a body of researchers). The database is also more comprehensive, as it includes not only publications but also working papers, some of which might be useful because they reflect more applied research. There are also other platforms (such as IDEAS and SSRN) and specific social networks for academics (such as ResearchGate.Net). These can help go beyond publication information: they provide data on article downloads, on networks of researchers, on collaborations, and so on.

One example of how linkages have been explored with web information is the attempt to use machine learning tools to identify hyperlinks on university pages as indicators of collaborations. Such analyses can be carried out at different scales (e.g. web sites, web pages, words in web pages, hyperlinks), but the tools still require improvement to be usable as valid alternatives. To date, building such networks has mostly been used to capture opinions, and in this area it has been related to the occurrence of university-industry terms, e.g. in Google Trends or Twitter debates. At this stage much of the analysis remains experimental.
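As a rough illustration of the hyperlink-based approach described above, the following Python sketch counts the external hosts that a university web page links to. It is only a simple link count, not the machine learning tools referred to in the text, and a link is at best a weak signal of collaboration. The page content, the domain names and the function name `external_link_counts` are hypothetical and serve only as an example.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects the href targets of all <a> tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def external_link_counts(html, own_domain):
    """Count hyperlinks per external host, ignoring links to own_domain."""
    collector = LinkCollector()
    collector.feed(html)
    counts = Counter()
    for link in collector.links:
        host = urlparse(link).netloc.lower()
        # Relative links have an empty host; internal links end with own_domain.
        if host and not host.endswith(own_domain):
            counts[host] += 1
    return counts


# Hypothetical page content, for illustration only.
sample_html = """
<a href="https://www.example-university.edu/about">About</a>
<a href="https://research.example-company.com/lab">Joint lab</a>
<a href="https://research.example-company.com/news">Partner news</a>
<a href="/contact">Contact</a>
"""

counts = external_link_counts(sample_html, "example-university.edu")
print(counts.most_common())
```

In practice, such counts would need to be aggregated over many pages and checked against known collaborations before being interpreted, in line with the caveats on representativeness noted earlier.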

Several services provide access to linked information about researchers, research institutions and their collaborations, including for specific fields of analysis (e.g. Thomson Reuters Research Analytics services, the Centre for Science and Technology Studies at the University of Leiden, etc.).