Project

Next-Generation Network Analytics for Time Series Data

Applicant: Scholtes Ingo
Number: 176938
Funding scheme: SNSF Professorships
Research institution: Institut für Informatik, Universität Zürich
Institution of higher education: University of Zurich - ZH
Main discipline: Information Technology
Start/End: 01.08.2018 - 31.07.2022
Approved amount: 1'106'925.00

Keywords (13)

time series analysis; network analytics; data mining; business analytics; network analysis; data science; temporal network; graph analysis; network science; social network analysis; dynamic graph; graph mining; relational data mining

Lay Summary (translated from German)

Lead
Network analysis methods make an important contribution to our understanding of social, technical and biological systems. Applying them to time-stamped data, however, remains a challenge. In particular, existing methods ignore the temporal ordering of links, which is decisive for determining which actors in a network can influence each other: if, for example, Alice meets Bob before Bob meets Carol, a rumour can travel along a path from Alice via Bob to Carol. This path does not exist if we reverse the order of the interactions. The lack of network analysis methods that account for the influence of temporal interaction patterns on the resulting path structures limits our understanding of complex systems.

Goals of the research project

The goal of our project is to develop new methods for analysing time-stamped data on complex networks. In particular, we will develop new methods for (i) detecting cluster structures, (ii) identifying important nodes and (iii) visualising dynamic networks. These build on a new approach that uses statistical learning techniques to compute optimal models for path structures in dynamic networks. The methods developed will be made available to the public as free software.

Scientific and societal context of the research project

Our project provides new approaches for the analysis of time series data in science, industry and society. Example applications include social network analysis, the detection of functional clusters in biological networks and the analysis of systemic risks in networked infrastructures. The application of the methods developed in the project in industry and science is supported by a collaboration with the Swiss Data Science Center.

Last update: 17.05.2018

Publications

Publication
LaRock Timothy, Nanumyan Vahan, Scholtes Ingo, Casiraghi Giona, Eliassi-Rad Tina, Schweitzer Frank (2020), HYPA: Efficient Detection of Path Anomalies in Time Series Data on Networks, in Proceedings of the 2020 SIAM International Conference on Data Mining, Society for Industrial and Applied Mathematics, Philadelphia, PA.
Gote Christoph, Scholtes Ingo, Schweitzer Frank (2019), git2net: Mining Time-Stamped Co-Editing Networks from Large git Repositories, P-294, 259-260, Gesellschaft für Informatik, Bonn.
Gote Christoph, Scholtes Ingo, Schweitzer Frank (2019), git2net - Mining Time-Stamped Co-Editing Networks from Large git Repositories, in 2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR), Montreal, QC, Canada, IEEE, New York City.
Lambiotte Renaud, Rosvall Martin, Scholtes Ingo (2019), From networks to optimal higher-order models of complex systems, in Nature Physics, 15, 1-1.
Perri Vincenzo, Scholtes Ingo, HOTVis: Higher-Order Time-Aware Visualisation of Dynamic Graphs, in Graph Drawing and Network Visualization, 28th International Symposium, GD 2020, Victoria, British Columbia, Springer, Berlin, Heidelberg.

Datasets

git2net - An Open Source Package to Mine Time-Stamped Collaboration Networks from Large git Repositories

Author: Gote, Christoph; Scholtes, Ingo; Schweitzer, Frank
Publication date: 25.03.2019
Persistent Identifier (PID): 10.5281/zenodo.2587483
Repository: Zenodo
Abstract: Short tutorial, as well as supporting data and code to reproduce results for the article "git2net - An Open Source Package to Mine Time-Stamped Collaboration Networks from Large git Repositories" (https://arxiv.org/abs/1903.10180).

Collaboration

Group / person | Country
Types of collaboration

Swiss Data Science Center | Switzerland (Europe)
- in-depth/constructive exchanges on approaches, methods or results
- Industry/business/other use-inspired collaboration

genua GmbH | Germany (Europe)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
- Industry/business/other use-inspired collaboration

Network Science Institute/Northeastern University | United States of America (North America)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
- Exchange of personnel

RWTH Aachen / Leibniz-Institut für Sozialwissenschaften (GESIS) | Germany (Europe)
- in-depth/constructive exchanges on approaches, methods or results
- Publication

IceLab/Umea University | Sweden (Europe)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
- Exchange of personnel

Mathematical Institute/University of Oxford | Great Britain and Northern Ireland (Europe)
- in-depth/constructive exchanges on approaches, methods or results
- Publication

Scientific events

Active participation

Title | Type of contribution | Title of article or contribution | Date | Place | Persons involved
NetSci-X 2020 Tutorials | Talk given at a conference | Higher-Order Network Models for Temporal Network Data | 20.01.2020 | Tokyo, Japan | Hackl Jürgen
Interdisciplinary Symposium on Resilience and Performance of Networked Systems | Talk given at a conference | Networks in Space and Time | 16.01.2020 | Zürich, Switzerland | Scholtes Ingo
Workshop on Applied Machine Learning for Social and Environmental Problems, Universität Zürich | Talk given at a conference | Artificial urban infrastructure systems | 03.10.2019 | Zürich, Switzerland | Hackl Jürgen
INFORMATIK 2019 - Jahrestagung der Gesellschaft für Informatik | Talk given at a conference | git2net - An Open Source Package to Mine Time-Stamped Collaboration Networks from Large git Repositories | 25.09.2019 | Kassel, Germany | Scholtes Ingo
4th Symposium on Spatial Networks, University of Oxford | Talk given at a conference | Path data analysis for spatial-temporal data on networks | 19.09.2019 | Oxford, Great Britain and Northern Ireland | Hackl Jürgen
Antrittsvorlesung, Universität Zürich | Individual talk | Networks in Space and Time | 16.09.2019 | Zürich, Switzerland | Scholtes Ingo
NetSci Satellite HONS 2019 | Talk given at a conference | Detecting Path Anomalies in Time Series Data on Networks | 28.05.2019 | Burlington, Vermont, United States of America | Scholtes Ingo
NetSci Satellite HONS 2019 | Talk given at a conference | Multi-order network models based on path data | 28.05.2019 | Burlington, Vermont, United States of America | Scholtes Ingo
NetSci Satellite HONS 2019 | Talk given at a conference | Counting Causal Paths in Temporal Networks | 28.05.2019 | Burlington, Vermont, United States of America | Scholtes Ingo
NetSci 2019 | Poster | Fast Estimation of Causal Path Statistics in Temporal Networks | 27.05.2019 | Vermont, United States of America | Perri Vincenzo; Scholtes Ingo
MSR 2019 | Talk given at a conference | git2net - An Open Source Package to Mine Time-Stamped Collaboration Networks from Large git Repositories | 27.05.2019 | Montreal, Canada | Scholtes Ingo
NetSci 2019 | Talk given at a conference | git2net - An Open Source Package to Mine Time-Stamped Collaboration Networks from Large git Repositories | 27.05.2019 | Burlington, Vermont, United States of America | Scholtes Ingo
NetSci 2019 | Talk given at a conference | HOTVis: Higher-Order Time-Aware Visualisation of Dynamic Graphs | 27.05.2019 | Vermont, United States of America | Perri Vincenzo; Scholtes Ingo
Symposium Computational Social Science - Quo Vadis? | Talk given at a conference | Data, Networks, Social Science | 26.04.2019 | Zürich, Switzerland | Scholtes Ingo
Behavioral Studies Colloquium, ETHZ | Individual talk | Data Analytics in Social Organizations | 16.04.2019 | Zürich, Switzerland | Scholtes Ingo
Brownbag Lunch of Digital Society Initiative at UZH | Individual talk | Data Analytics in Social Organizations: Lessons from Software Engineering and Scientometrics | 21.03.2019 | Zürich, Switzerland | Scholtes Ingo
CRiSM Seminar Series | Individual talk | Optimal Network Models for Time Series Data | 14.02.2019 | Warwick, Great Britain and Northern Ireland | Scholtes Ingo
14th International Workshop on Mining and Learning with Graphs (MLG 2018) | Poster | When is a Network a Network? Multi-Order Graphical Model Selection in Pathways and Temporal Networks | 20.08.2018 | London, Great Britain and Northern Ireland | Scholtes Ingo


Self-organised

Title | Date | Place
Informatik.2019 - Data Science Track | 23.09.2019 | Kassel, Germany
Higher-Order Models in Network Science (HONS 2019) | 28.05.2019 | Burlington, Vermont, United States of America

Knowledge transfer events

Active participation

Title | Type of contribution | Date | Place | Persons involved
Workshop at advAIsor AG, Zürich | Talk | 12.07.2019 | Zürich, Switzerland | Scholtes Ingo; Perri Vincenzo; Petrovic Luka
Symposium "Machine learning and Artificial Intelligence", Neurochirurgie, Unispital Zürich | Talk | 25.04.2019 | Zürich, Switzerland | Scholtes Ingo
Internes Meeting mit Vertretern von Honda Research Institute an der UZH | Talk | 19.02.2019 | Zürich, Switzerland | Scholtes Ingo
Workshop at IBM Research | Talk | 11.10.2018 | Rüschlikon, Switzerland | Scholtes Ingo


Communication with the public

Communication | Title | Media | Place | Year
Media relations: print media, online media | Jahrtausend der Komplexität | ada | German-speaking Switzerland; International | 2020
New media (web, blogs, podcasts, news feeds etc.) | Soziologie und Informatik können viel voneinander lernen | Erium Podcast | German-speaking Switzerland; International | 2020
Media relations: print media, online media | Datenströme und wie sie analysiert werden können | Westdeutsche Zeitung | International | 2019
New media (web, blogs, podcasts, news feeds etc.) | Informatik und Soziologie: Computational Social Science im Aufschwung | GI Radar/GI Blog | International | 2019

Awards

Title | Year
Special Mention in Free and Open Source Software Award (FOSS), awarded at MSR 2019 (see https://2019.msrconf.org/track/msr-2019-FOSS-Award#-MSR-2019-winners) | 2019

Use-inspired outputs

Software

Name | Year
git2net | 2019
pathpy | 2019


Abstract

Network-based data mining techniques are an important foundation for data science applications in computer science, computational social science, economics and life sciences. They help us to extract knowledge from vast corpora of relational data that capture links or interactions between documents, humans, financial institutions, or genes. However, advances in data sensing and collection increasingly provide us with data that tell us not only who is linked to whom, but also when and in which order these links occur. The analysis of such time series data on networks is still a major challenge. A naive application of network analytic methods discards information on the timing and ordering of links, even though these determine the causal topology of dynamic networks. Empirical studies show that this leads to wrong results about the importance of nodes, cluster structures, and dynamical processes. It invalidates applications of current network analytic methods to time series data and limits our understanding of networked systems with dynamic topologies.

Filling this gap, my goal as an SNSF professor is to develop a new generation of network analytic methods for time series data on networks. We will take an interdisciplinary perspective that combines stochastic models of time series data with graph analytic, algebraic and information-theoretic methods. It enables us to reach three objectives, namely to (i) develop network analytic techniques for sequential relational data capturing pathways in networks, (ii) extend these techniques to time-stamped data on dynamic networks with high temporal resolution, and (iii) implement a software package for the analysis of time series data on networks.

A novelty of our approach is the use of multi-order graphical models, i.e. compact static summarisations of time series data that capture the causal topology of dynamic networks. This will help us to advance the theoretical foundation of data science and network analysis. It will lead to scalable, time-aware methods to find important nodes and detect cluster structures in massive streams of relational data. We validate and test them in time series data on software development teams made available by an industry partner. The final deliverable of our project is an Open Source software implementation of our methods. A collaboration with the Swiss Data Science Center will help us to disseminate it in the Swiss data science community. An international network of experts in network science, bioinformatics, computational epidemiology, computational social science, and machine learning will ensure the interdisciplinary impact of our project.
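
The point about link ordering can be illustrated with a minimal sketch. It is not taken from the project's software (pathpy or git2net); the function names `static_reachable` and `time_respecting_reachable` and the toy contact lists are hypothetical, and contacts are assumed to be undirected `(u, v, t)` triples. The sketch compares reachability in the time-aggregated network with reachability along paths whose links occur in strictly increasing time order.

```python
from collections import defaultdict


def static_reachable(events, source):
    """Nodes reachable from `source` when all timestamps are ignored."""
    adj = defaultdict(set)
    for u, v, _ in events:
        adj[u].add(v)
        adj[v].add(u)
    seen, stack = {source}, [source]
    while stack:
        node = stack.pop()
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def time_respecting_reachable(events, source):
    """Nodes reachable from `source` via links in strictly increasing time order."""
    # arrival[n] is the earliest time at which n can be reached.
    arrival = {source: float("-inf")}
    for u, v, t in sorted(events, key=lambda e: e[2]):
        for a, b in ((u, v), (v, u)):
            # A contact at time t only helps if `a` was reached strictly earlier.
            if a in arrival and arrival[a] < t and b not in arrival:
                arrival[b] = t
    return set(arrival)


# Hypothetical toy data: Alice meets Bob before Bob meets Carol, and the reverse.
forward = [("Alice", "Bob", 1), ("Bob", "Carol", 2)]
reverse = [("Bob", "Carol", 1), ("Alice", "Bob", 2)]

print(static_reachable(forward, "Alice") == static_reachable(reverse, "Alice"))
print(sorted(time_respecting_reachable(forward, "Alice")))
print(sorted(time_respecting_reachable(reverse, "Alice")))
```

Both contact sequences produce the same static network, so a method that ignores timestamps cannot distinguish them, while the time-respecting variant reaches Carol from Alice only in the first ordering.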