time series analysis; network analytics; data mining; business analytics; network analysis; data science; temporal network; graph analysis; network science; social network analysis; dynamic graph; graph mining; relational data mining
LaRock Timothy, Nanumyan Vahan, Scholtes Ingo, Casiraghi Giona, Eliassi-Rad Tina, Schweitzer Frank (2020), HYPA: Efficient Detection of Path Anomalies in Time Series Data on Networks, in Proceedings of the 2020 SIAM International Conference on Data Mining, Society for Industrial and Applied Mathematics, Philadelphia, PA.
Gote Christoph, Scholtes Ingo, Schweitzer Frank (2019), git2net: Mining Time-Stamped Co-Editing Networks from Large git Repositories, Gesellschaft für Informatik, Bonn, P-294, 259-260.
Gote Christoph, Scholtes Ingo, Schweitzer Frank (2019), git2net - Mining Time-Stamped Co-Editing Networks from Large git Repositories, in 2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR), Montreal, QC, Canada, IEEE, New York City.
Lambiotte Renaud, Rosvall Martin, Scholtes Ingo (2019), From networks to optimal higher-order models of complex systems, in Nature Physics, 15, 1-1.
Perri Vincenzo, Scholtes Ingo, HOTVis: Higher-Order Time-Aware Visualisation of Dynamic Graphs, in Graph Drawing and Network Visualization: 28th International Symposium, GD 2020, Victoria, British Columbia, Springer, Berlin, Heidelberg.
git2net - An Open Source Package to Mine Time-Stamped Collaboration Networks from Large git Repositories
Gote, Christoph; Scholtes, Ingo; Schweitzer, Frank
Short tutorial, as well as supporting data and code to reproduce results for the article "git2net - An Open Source Package to Mine Time-Stamped Collaboration Networks from Large git Repositories" (https://arxiv.org/abs/1903.10180).
Network-based data mining techniques are an important foundation for data science applications in computer science, computational social science, economics and the life sciences. They help us extract knowledge from vast corpora of relational data that capture links or interactions between documents, humans, financial institutions, or genes. However, advances in data sensing and collection increasingly provide us with data that tell us not only who is linked to whom, but also when and in which order these links occur. The analysis of such time series data on networks is still a major challenge. A naive application of network analytic methods discards information on the timing and ordering of links, even though this information determines the causal topology of dynamic networks. Empirical studies show that this leads to wrong results about the importance of nodes, cluster structures, and dynamical processes. It invalidates applications of current network analytic methods to time series data and limits our understanding of networked systems with dynamic topologies.
Filling this gap, my goal as an SNSF professor is to develop a new generation of network analytic methods for time series data on networks. We will take an interdisciplinary perspective that combines stochastic models of time series data with graph analytic, algebraic and information-theoretic methods. This enables us to reach three objectives, namely to (i) develop network analytic techniques for sequential relational data capturing pathways in networks, (ii) extend these techniques to time-stamped data on dynamic networks with high temporal resolution, and (iii) implement a software package for the analysis of time series data on networks.
A novelty of our approach is the use of multi-order graphical models, i.e. compact static summarisations of time series data that capture the causal topology of dynamic networks. This will help us to advance the theoretical foundation of data science and network analysis. It will lead to scalable, time-aware methods to find important nodes and detect cluster structures in massive streams of relational data. We validate and test them on time series data on software development teams made available by an industry partner.
The final deliverable of our project is an Open Source software implementation of our methods. A collaboration with the Swiss Data Science Center will help us to disseminate it in the Swiss data science community. An international network of experts in network science, bioinformatics, computational epidemiology, computational social science, and machine learning will ensure the interdisciplinary impact of our project.
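The abstract's claim that discarding the timing and ordering of links distorts the causal topology can be illustrated with a small Python sketch. It is not taken from the project's software. The function names and the toy edge list are hypothetical, and it assumes that a causal path requires strictly increasing time stamps with no limit on the waiting time between consecutive links.

```python
from collections import defaultdict, deque


def static_reachable(edges, source):
    """Nodes reachable from `source` when time stamps are ignored."""
    adjacency = defaultdict(set)
    for u, v, _t in edges:
        adjacency[u].add(v)
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen - {source}


def temporal_reachable(edges, source):
    """Nodes reachable from `source` via time-respecting paths.

    Assumption (hypothetical, for illustration): consecutive links on a
    path must have strictly increasing time stamps.
    """
    # Earliest known arrival time per node; the source is available
    # before any link occurs.
    arrival = {source: float("-inf")}
    for u, v, t in sorted(edges, key=lambda edge: edge[2]):
        # A link can only be used if we reached its start node earlier.
        if u in arrival and arrival[u] < t:
            # Edges are processed in time order, so the first arrival
            # recorded for a node is also the earliest one.
            arrival.setdefault(v, t)
    return set(arrival) - {source}


# Toy data: the link b -> c occurs BEFORE the link a -> b.
edges = [("b", "c", 1), ("a", "b", 2)]

static_result = static_reachable(edges, "a")
temporal_result = temporal_reachable(edges, "a")
print(sorted(static_result), sorted(temporal_result))
```

In the toy data the time-aggregated graph contains the path a, b, c, so a static analysis treats c as reachable from a. Because the link from b to c occurs before the link from a to b, no time-respecting path exists, and only b is reachable once the ordering of links is taken into account.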