Project

A Lego System for Transformation Inference

English title A Lego System for Transformation Inference
Applicant Hothorn Torsten
Number 184603
Funding scheme Project funding (Div. I-III)
Research institution Institut für Epidemiologie, Biostatistik und Prävention Universität Zürich
Institution of higher education University of Zurich - ZH
Main discipline Mathematics
Start/End 01.01.2020 - 31.12.2023
Approved amount 504'268.00

Keywords (5)

probabilistic forecasting; regression; survival analysis; conditional distributions; predictive distributions

Lay Summary (German)

Lead
Statistical methods form the backbone of empirical knowledge acquisition. The statistical approach reduces a complex reality, overlaid by random processes, to its essentials and thereby makes it accessible to human interpretation. The task is always to find an appropriate compromise between model complexity and interpretability.
Lay summary
(2)
As a compromise between parametric modelling approaches and
methods of statistical or machine learning, new transformation
models are developed, implemented, and evaluated. Model complexity
can be adjusted gradually so that, within the same model class, an
appropriate balance can be found between ease of interpretation and
good model fit, for example for forecasting purposes. The main goals
of this project are the development of transformation models for
practically important but so far little-studied types of measurements
(count data, ordinal scales, multivariate observations) using methods of
statistical and machine learning (in particular random forests and boosting).

(3)
The cookbook approach that dominates teaching and research in conveying and
applying statistical methods ("you have this; you take that") makes it harder
to understand the often directly intuitive basic ideas of statistical thinking.
Transformation models, a model class that is both simple and
highly flexible, have the potential to replace the recipes in the "Regression"
chapter of the statistics cookbook with a building-block system. Such a system
allows methods to be assembled that are tailored to the experimental question
at hand, which leads to better-fitting models and, in turn,
to scientific findings that are easier to understand and to
communicate.


Last update: 03.05.2019

Associated projects

Number  Title                                                       Start       Funding scheme
163456  Model-Based Recursive Partitioning for Stratified Medicine  01.01.2016  Project funding (Div. I-III)
177091  A Party for Model-based Forest Inference                    01.08.2017  Scientific Exchanges

Abstract

The world of enlightenment is under attack. Fake news receives attention while well-founded scientific findings are discredited or ignored. It is easy to blame populists for this challenge to open societies. This, however, ignores that the scientific community allowed its high standards to be tampered with. The recent reproducibility crisis helped (at least partially) to create a climate which allows alternative facts to be communicated and acted upon.
Empirical research relies on statistical epistemology. Statistics is the magnifying glass which allows us to disentangle relevant effects from noise and thus facts from fiction under appropriate error control. Almost from the beginning of academic statistics, the field has been subject to misconception, if not outright fraud ("How to Lie with Statistics" was published in 1954). In addition to poor design and conduct of experiments, misuse and misinterpretation of statistical models are problematic (Randall and Welser, 2018). On top of this, novel methods are mushrooming and, with more than 13'000 add-on packages available from the CRAN software repository, the statistically illiterate data analyst has many options to get it wrong but often only one to get it right.
The call is out to statisticians to provide better ways to get it right. The common cookbook-style attitude in statistical teaching and research hides the core ideas and concepts underlying statistical reasoning, fails to highlight the connections between different methods, and has the potential to create misconceptions. The cookbook only contains a limited number of recipes. To analyse experiments appropriately, it must be replaced by a box of Lego bricks, where the data analyst can build the method matching the experiment.
This research project aims to provide data analysts and statisticians alike with a novel Lego system implementing a comprehensive way of understanding, designing, analysing, and communicating empirical investigations. We suggest leveraging the unifying spirit of transformation models to address the challenge. These models help to understand the connections between many classics, such as the normal linear, the proportional hazards, or the proportional odds model. Transformation models consist of simple Lego bricks which can be reassembled in many different, problem-specific ways. There is no limit on the models we can build in this framework; very complex, novel, highly nonlinear interaction models for probabilistic forecasting as well as very simple models giving rise to classical rank tests can be understood in the same framework.
Based on recently published theoretical and computational core concepts in maximum-likelihood inference for conditional transformation models, this project develops a range of novel methods for specific types of experiments: transformation forests and boosting for probabilistic forecasts, novel models for discrete and multivariate responses, and various novel permutation score tests and corresponding confidence intervals for relevant parameters in complex designs. Open-source software will provide tools for building tailored analyses whose results will hopefully provide better answers to research questions.
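The connection among the classical models named in the abstract can be sketched in one formula. The notation below is not taken from the project text; it is an assumed, commonly used way of writing a linear transformation model, with a monotone increasing transformation function $h$, a regression coefficient vector $\beta$, and a chosen reference distribution function $F_Z$. The minus sign in front of $x^\top \beta$ is one possible convention, and some presentations use a plus sign instead.

```latex
% A linear transformation model (assumed notation, sign convention may differ):
\[
  \mathbb{P}(Y \le y \mid X = x) \;=\; F_Z\!\bigl(h(y) - x^\top \beta\bigr),
  \qquad h \text{ monotone increasing.}
\]
% Three choices of the "brick" F_Z recover three classical models:
\[
\begin{array}{lll}
  F_Z(z) = \Phi(z)
    & h(y) = (y - \alpha)/\sigma
    & \text{normal linear model, } Y = \alpha + x^\top(\sigma\beta) + \sigma\varepsilon,\;
      \varepsilon \sim N(0, 1) \\[4pt]
  F_Z(z) = 1 - \exp(-\exp(z))
    & h \text{ unspecified}
    & \text{proportional hazards: } \Lambda(y \mid x) = \exp\{h(y)\}\,\exp(-x^\top \beta) \\[4pt]
  F_Z(z) = \{1 + \exp(-z)\}^{-1}
    & h \text{ unspecified}
    & \text{proportional odds: } \operatorname{logit}\,\mathbb{P}(Y \le y \mid x) = h(y) - x^\top \beta
\end{array}
\]
```

Read this way, swapping $F_Z$, making $h$ more or less flexible, or replacing the linear term $x^\top \beta$ by a more complex function of $x$ corresponds to exchanging one brick while the rest of the model stays in place.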