Project

The Global Structure of Knowledge Networks: Data, Models and Empirical Results

Applicant Lomi Alessandro
Number 167326
Funding scheme NRP 75 Big Data
Research institution Istituto di scienze computazionali (ICS) Facoltà di scienze economiche
Institution of higher education Università della Svizzera italiana - USI
Main discipline Science of management
Start/End 01.07.2017 - 31.08.2021
Approved amount 599'684.00

All Disciplines (3)

Discipline
Science of management
Information Technology
Sociology

Keywords (8)

Exponential Random Graph Models; Big data; Patent citations; Snowball sampling; Social networks; Dynamic topic modeling; Knowledge networks; Data Retrieval

Lay Summary (German)

Lead
Patents protect the right of individuals to exploit their inventions. They are also a kind of formalized knowledge that serves the development of new knowledge. This project examines patent citations to find out how patents can create the knowledge base for innovation.
Lay summary

The project is divided into three sub-projects that examine the origins, development and change of the global structure of knowledge networks. The first sub-project uses current database management techniques and technologies to build and manage what is perhaps the most comprehensive knowledge network worldwide. The second sub-project uses information retrieval algorithms and innovative modelling techniques to identify the sub-network that links Switzerland to the global knowledge networks. The third sub-project aims to extend currently available models and apply them to networks of any size, drawing on results from the most recent statistical models for social networks.

Patent citations make it possible to trace the links between different patents. In this way they also reveal the processes by which knowledge is created. By identifying and reframing the links between patent citations, we contribute to a better understanding of the global production and transformation of knowledge.

We want to build the largest and most comprehensive knowledge network. To do so, we merge separate patent data sets provided by the OECD with company data in order to link patents to firm-specific data. The new data sets will be used to develop and test innovative information retrieval techniques and network analysis models. These techniques and models will then be applied to large data sets with complex, changing network structures.

Switzerland consistently holds the top spot (most recently in 2016) in the Global Innovation Index, the ranking published by the World Intellectual Property Organization and other leading organizations. For Switzerland to remain the most innovative country in the world, it is important to understand how knowledge is created, linked and transformed.

Last update: 30.09.2017

Lay Summary (French)

Lead
Patents are legal instruments intended to protect inventors' rights to their inventions. They also represent a kind of formalized knowledge used in the production of other knowledge. This project analyses patent citations to determine how patents produce knowledge that can lead to innovation. The project is divided into three distinct but related sub-projects that examine the origins, development and changes in the global structure of knowledge networks. The first sub-project uses contemporary database management techniques and technologies to create and manage what is perhaps the largest knowledge network available.
Lay summary

The second sub-project applies information retrieval algorithms and innovative modelling techniques to identify the sub-network that links Switzerland to the global structure of knowledge networks. The third sub-project uses results from the latest generation of statistical models for social networks to extend currently available models and apply them to networks of any size.

Patent citations represent the link that any granted patent has to other patents. They thus constitute the traces left by the process of knowledge production. Restructuring and understanding patent citation networks helps to shed light on the global production and transformation of knowledge.

We plan to build the largest and most complete knowledge network currently available. We will do so by combining several separate data sets provided by the Organization for Economic Co-operation and Development (OECD) with information on legal entities, in order to link patents to company-specific information. We will use the new data sets to develop and test innovative information retrieval techniques and network analysis models. We will then apply these techniques and analytical models to large data sets with complex and changing network structures.

Switzerland consistently ranks first (most recently in 2016) in the Global Innovation Index published by the World Intellectual Property Organization and other leading institutions. Understanding how knowledge is produced, combined and transformed is key to maintaining Switzerland's global leadership in innovation.

Last update: 30.09.2017

Lay Summary (English)

Lead
Patents are legal instruments used to protect the right of individuals to exploit their inventions. Patents also contain formalized knowledge used in the production of other knowledge. This project analyses patent citations to determine how patents produce knowledge that may eventually lead to innovation.
Lay summary

Patent citations represent the link that any given patent has to other patents. As such, patent citations constitute a trail left by the process of knowledge production. Reframing and understanding patent citation networks helps to shed light on the global production and transformation of knowledge.

We plan to construct the largest and most complete knowledge network currently available. We will do so by merging separate patent data sets provided by the Organization for Economic Co-operation and Development (OECD) with information on corporate entities to link patents to company-specific information. We will use the new data sets to develop and test innovative information retrieval techniques and network analysis models. We will then apply these techniques and analytical models to large data sets with complex and evolving network structures.

 

The project is organized into three distinct but related sub-projects that examine the origins, development and change in the global structure of knowledge networks. The first sub-project employs contemporary database management techniques and technologies to create and manage what is possibly the largest knowledge network available. The second sub-project applies information-retrieval algorithms and innovative modelling techniques to identify the sub-network that links Switzerland to the global structure of knowledge networks. The third sub-project employs results from the latest generation of statistical models for social networks to scale up currently available models and apply them to networks of any size.

Switzerland consistently ranks first (most recently in 2016) in the Global Innovation Index published by the World Intellectual Property Organization and other leading entities. Understanding how knowledge is produced, combined and transformed is key to sustaining Switzerland’s global leadership in innovation.

Last update: 30.09.2017

Publications

Publication
CoolMomentum: a method for stochastic optimization by Langevin dynamics with simulated annealing
Borysenko Oleksandr, Byshkin Maksym (2021), CoolMomentum: a method for stochastic optimization by Langevin dynamics with simulated annealing, in Scientific Reports, 11(1), 10705.
Old is Not Always Gold: Early Identification of Milestone Patents Employing Network Flow Metrics
Chakraborty Manajit, Crestani Fabio (2021), Old is Not Always Gold: Early Identification of Milestone Patents Employing Network Flow Metrics, in Proceedings of the Swiss Text Analytics Conference 2021, Winterthur, Switzerland, CEUR-WS.org, online.
Patent citation network analysis: A perspective from descriptive statistics and ERGMs
Chakraborty Manajit, Byshkin Maksym, Crestani Fabio (2020), Patent citation network analysis: A perspective from descriptive statistics and ERGMs, in PLOS ONE, 15(12), e0241797.
Forecasting Patent Growth by Combining Time-Series Signals Using Covariance Patterns
Chakraborty Manajit, Bahrainian Seyed Ali, Crestani Fabio (2020), Forecasting Patent Growth by Combining Time-Series Signals Using Covariance Patterns, in Proceedings of the First Joint Conference of the Information Retrieval Communities in Europe, Samatan, Gers, CEUR-WS.org, online.
Exponential random graph model parameter estimation for very large directed networks
Stivala Alex, Robins Garry, Lomi Alessandro (2020), Exponential random graph model parameter estimation for very large directed networks, in PLoS ONE, 15(1), e0227804.
Fast Maximum Likelihood Estimation via Equilibrium Expectation for Large Network Data
Byshkin Maksym, Stivala Alex, Mira Antonietta, Robins Garry, Lomi Alessandro (2018), Fast Maximum Likelihood Estimation via Equilibrium Expectation for Large Network Data, in Scientific Reports, 8(1), 11509.

Collaboration

Group / person Country
Types of collaboration
Pennsylvania State University United States of America (North America)
- in-depth/constructive exchanges on approaches, methods or results
CSCS - Swiss National Supercomputing Center Switzerland (Europe)
- Research Infrastructure
University of Ljubljana - Department for theoretical computer science Slovenia (Europe)
- in-depth/constructive exchanges on approaches, methods or results
Microsoft research Great Britain and Northern Ireland (Europe)
- Research Infrastructure
- Industry/business/other use-inspired collaboration
National Science Center Kharkiv Institute of Physics and Technology Ukraine (Europe)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
Google Cloud Research United States of America (North America)
- Research Infrastructure
Swinburne University of Technology Australia (Oceania)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
USI, Institute of Computational Sciences Switzerland (Europe)
- in-depth/constructive exchanges on approaches, methods or results

Scientific events

Active participation

Title | Type of contribution | Title of article or contribution | Date | Place | Persons involved
Swiss Text Analytics Conference 2021 | Talk given at a conference | Old is Not Always Gold: Early Identification of Milestone Patents Employing Network Flow Metrics | 14.06.2021 | Winterthur, Switzerland | Chakraborty Manajit
Seminar, Università della Svizzera italiana, Faculty of Informatics | Individual talk | CoolMomentum: a method for stochastic optimization by Langevin dynamics with simulated annealing | 09.06.2021 | Lugano, virtual, Switzerland | Byshkin Maksym
Workshop on Bibliometric-enhanced Information Retrieval co-located with 43rd European Conference on Information Retrieval (ECIR 2021) | Talk given at a conference | PatentQuest: A User-Oriented Tool for Integrated Patent Search | 01.04.2021 | Lucca, Italy | Chakraborty Manajit
Fifth Annual Australian Social Network Analysis Conference (ASNAC 2020) | Talk given at a conference | The network structure of success: Evidence from an empirical study of European patents | 26.11.2020 | Perth, Australia | Stivala Alexander; Lomi Alessandro
Final Costnet conference | Talk given at a conference | A simple algorithm for scalable Monte Carlo inference | 23.09.2020 | Munich, Virtual, Germany | Lomi Alessandro; Byshkin Maksym
INSNA Sunbelt XXXIX conference | Talk given at a conference | ERGM parameter estimation of very large directed networks: implementation, example, and application to the geography of knowledge spillovers | 18.06.2019 | Montréal, Canada | Robins Garry; Lomi Alessandro; Stivala Alexander
Mitchell Centre seminar, University of Manchester | Individual talk | ERGM parameter estimation of very large directed networks: implementation, example, and application to the geography of knowledge spillovers | 03.04.2019 | Manchester, Great Britain and Northern Ireland | Stivala Alexander; Robins Garry; Lomi Alessandro
Data science seminar, Università della Svizzera italiana, Faculty of Informatics | Individual talk | ERGM parameter estimation of very large directed networks: implementation, example, and application to the geography of knowledge spillovers | 26.03.2019 | Lugano, Switzerland | Lomi Alessandro; Stivala Alexander; Robins Garry
Future social network studies workshop, ETH Zürich | Individual talk | ERGM parameter estimation of very large directed networks: implementation, example, and application to the geography of knowledge spillovers | 19.03.2019 | Zürich, Switzerland | Lomi Alessandro; Robins Garry; Stivala Alexander
Applied Machine Learning Days | Poster | A Simple Algorithm for Scalable Monte Carlo Inference | 26.01.2019 | Lausanne, Switzerland | Lomi Alessandro; Byshkin Maksym
INSNA Sunbelt XXXVIII conference | Talk given at a conference | Maximum Likelihood estimation via Equilibrium Expectation for large network data | 26.09.2018 | Utrecht, Netherlands | Stivala Alexander; Robins Garry; Mira Antonietta; Lomi Alessandro; Byshkin Maksym
INSNA Sunbelt XXXVIII conference | Talk given at a conference | Structure and geography of a hospital patient transfer network | 26.07.2018 | Utrecht, Netherlands | Byshkin Maksym; Stivala Alexander; Lomi Alessandro
International Conference on Computational Social Science (IC2S2) | Poster | A mixture model for the degree distribution and local dependencies | 12.07.2018 | Evanston, Illinois, United States of America | Stivala Alexander; Robins Garry
PASC18 | Poster | Modeling Biological Networks with Exponential Random Graph Models | 02.07.2018 | Basel, Switzerland | Lomi Alessandro; Byshkin Maksym; Robins Garry; Mira Antonietta; Stivala Alexander
13th International Conference in Monte Carlo & Quasi-Monte Carlo Methods in Scientific Computing | Talk given at a conference | Fast maximum likelihood estimation via equilibrium expectation for large network data | 01.07.2018 | Rennes, France | Mira Antonietta; Byshkin Maksym; Lomi Alessandro; Stivala Alexander
INSNA Sunbelt XXXVIII conference | Talk given at a conference | Maximum Likelihood estimation via Equilibrium Expectation for large network data | 26.06.2018 | Utrecht, Netherlands | Byshkin Maksym; Stivala Alexander; Mira Antonietta; Lomi Alessandro; Robins Garry
Swiss Numerics Day | Talk given at a conference | Fast maximum likelihood estimation via equilibrium expectation for large network data | 20.04.2018 | Zurich, Switzerland | Stivala Alexander; Lomi Alessandro; Byshkin Maksym
Australian Social Network Analysis Conference (ASNAC) 2017 | Talk given at a conference | Efficient Markov chain Monte Carlo Estimation of Exponential Random Graph Models | 28.11.2017 | Sydney, Australia | Byshkin Maksym; Stivala Alexander; Robins Garry; Mira Antonietta; Lomi Alessandro
Third European Conference on Social Networks (EUSN 2017) | Talk given at a conference | Efficient Markov chain Monte Carlo Estimation of Exponential Random Graph Models | 26.09.2017 | Mainz, Germany | Byshkin Maksym; Robins Garry; Lomi Alessandro; Stivala Alexander
International Conference on Computational Social Science (IC2S2) 2017 | Talk given at a conference | Efficient Markov chain Monte Carlo Estimation of Exponential Random Graph Models | 10.07.2017 | Cologne, Germany | Stivala Alexander; Byshkin Maksym; Robins Garry; Lomi Alessandro; Mira Antonietta
11th International Conference on Monte Carlo Methods and Applications (MCM 2017) | Talk given at a conference | Efficient Markov chain Monte Carlo Estimation of Exponential Random Graph Models | 03.07.2017 | Montréal, Canada | Byshkin Maksym; Stivala Alexander; Robins Garry; Lomi Alessandro; Mira Antonietta


Knowledge transfer events

Active participation

Title | Type of contribution | Date | Place | Persons involved
Match-Making in Big Data with Academia and Industry | Talk | 24.10.2020 | Online, Switzerland | Byshkin Maksym; Lomi Alessandro; Crestani Fabio


Associated projects

Number | Title | Start | Funding scheme
200778 | Advancing the applicability of exponential random graph models (ERGMs) for the analysis of social and other networks: Algorithms, implementation, and applications | 01.09.2021 | Project funding (Div. I-III)
192549 | The Dynamics of Innovation: latent space modelling of patent citations | 01.01.2021 | Project funding (Div. I-III)

Abstract

Recent research recognizes the social character of knowledge production: ideas are embedded in complex networks connecting them to other ideas. The large space spanned by knowledge networks is not homogeneous: certain areas are more likely than others to produce innovation, that is, unexpected combinations of existing ideas. Understanding how new knowledge is produced by recombining existing knowledge is the key to understanding and predicting technological, scientific and social innovation. This argument applies widely to the analysis of patent data, a clear example of how flows of knowledge exchange, transfer and sharing may become observable and amenable to quantitative analysis. Against this general background, the project starts by merging four separate large datasets made available by the Organization for Economic Co-operation and Development (OECD). By tracking patent citations, the data cover the last 35 years of formalized knowledge production worldwide. The outcome will take the form of a very large patent citation network containing detailed information on patents and inventors. The merged dataset will then be combined with information on corporate entities in order to link patents to firm-specific information. Financial data and data on international trade will also be merged into a single dataset. The dataset will be used to develop and test innovative information retrieval techniques and new network-analytic models for the analysis of very large datasets characterized by complex micro-relational structures. One objective of the project is to make the next generation of statistical models for the analysis of “big” network data available to the community of data scientists interested in analysing network structures at a very large scale. The project is organized into three distinct but related subprojects.
The first subproject (Constructing knowledge networks) employs contemporary database management techniques and technologies to create and manage data produced by the largest, and possibly the most comprehensive, knowledge network available. The second subproject (Developing new computational approaches to the analysis of knowledge structures: Exploring the role of Switzerland) applies innovative data retrieval algorithms and topic modeling techniques to identify and delineate the different subnetworks linking Switzerland to the global structure of knowledge networks, and to represent the evolutionary development of the knowledge structures in which Switzerland is embedded. The third subproject (Developing new computational statistics models for representing and analyzing large-scale knowledge networks) advances the latest generation of statistical models for social networks and scales up currently available models to make them applicable to the analysis of networks of arbitrary size. Together, the three subprojects articulate possible answers to questions about the origins, development and change in the global structure of knowledge networks. The project may be classified in Module 3 of NRP 75 (Applications) because it presents an application that may clearly benefit from the use of big data analytics. However, the project is unique in that it focuses on the scientific, rather than the immediately practical, contribution of big data analysis. As such, the project addresses issues of societal relevance (Module 2) in the context of an analysis of patent data by developing new computational techniques and inferential technologies for the analysis of big data (Module 1). The interdisciplinary nature of the project is demonstrated by its general relevance to NRP 75 “Big Data”. We believe that the project contributes significantly to building the bridge between technology and society through data that NRP 75 was designed to encourage.
The research will be carried out at the Interdisciplinary Institute of Data Science (IDIDS), directed by Alessandro Lomi, at the Università della Svizzera italiana (USI). The research team combines multidisciplinary competences spanning computer science, social network analysis, management science, economics, data science, and statistics. The research project will benefit from collaboration with the Institute of Computational Science (ICS) at USI, from the support of the Swiss National Supercomputing Centre (CSCS) in Lugano, and from the concrete support of Microsoft Research, a global commercial research company.
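
As a rough illustration of the data construction step described in the abstract (tracing patent citations and linking patents to firm-specific information), the following minimal Python sketch builds a small directed citation network and aggregates citations to the firm level. The toy records, the patent and firm identifiers, and all function names are hypothetical; the abstract does not specify the layout of the OECD tables or the tools the project used.

```python
from collections import defaultdict

# Hypothetical toy records: the real OECD tables and their column names
# are not described in the abstract, so these are assumptions.
citations = [  # (citing patent, cited patent)
    ("P3", "P1"),
    ("P3", "P2"),
    ("P4", "P1"),
    ("P4", "P3"),
]
patent_owner = {"P1": "FirmA", "P2": "FirmB", "P3": "FirmA", "P4": "FirmC"}


def build_citation_network(pairs):
    """Return a directed network as a dict: patent -> set of patents it cites."""
    cites = defaultdict(set)
    for citing, cited in pairs:
        cites[citing].add(cited)
        cites.setdefault(cited, set())  # keep patents that cite nothing
    return dict(cites)


def in_degree(network):
    """Count how many times each patent is cited."""
    counts = {node: 0 for node in network}
    for cited_set in network.values():
        for cited in cited_set:
            counts[cited] += 1
    return counts


def firm_level_flows(network, owner):
    """Aggregate patent citations into (citing firm, cited firm) counts."""
    flows = defaultdict(int)
    for citing, cited_set in network.items():
        for cited in cited_set:
            flows[(owner[citing], owner[cited])] += 1
    return dict(flows)


network = build_citation_network(citations)
print(in_degree(network))
print(firm_level_flows(network, patent_owner))
```

At the scale the abstract describes, a dictionary held in memory would presumably give way to a database or a dedicated graph library; the sketch only shows the shape of the join between citation pairs and firm information.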