Big data need big theory too
Coveney PV; Dougherty ER; Highfield RR
Philos Trans A Math Phys Eng Sci 2016[Nov]; 374(2080). PMID 27698035
The current interest in big data, machine learning and data analytics has
generated the widespread impression that such methods are capable of solving most
problems without the need for conventional scientific methods of inquiry.
Interest in these methods is intensifying, accelerated by the ease with which
digitized data can be acquired in virtually all fields of endeavour, from
science, healthcare and cybersecurity to economics, social sciences and the
humanities. In multiscale modelling, machine learning appears to provide a
shortcut to reveal correlations of arbitrary complexity between processes at the
atomic, molecular, meso- and macroscales. Here, we point out the weaknesses of
pure big data approaches with particular focus on biology and medicine, which
fail to provide conceptual accounts for the processes to which they are applied.
No matter their 'depth' and the sophistication of data-driven methods, such as
artificial neural nets, in the end they merely fit curves to existing data. Not
only do these methods invariably require far larger quantities of data than
anticipated by big data aficionados in order to produce statistically reliable
results, but they can also fail in circumstances beyond the range of the data
used to train them because they are not designed to model the structural
characteristics of the underlying system. We argue that it is vital to use theory
as a guide to experimental design for maximal efficiency of data collection and
to produce reliable predictive models and conceptual knowledge. Rather than
continuing to fund, pursue and promote 'blind' big data projects with massive
budgets, we call for more funding to be allocated to the elucidation of the
multiscale and stochastic processes controlling the behaviour of complex systems,
including those of life, medicine and healthcare. This article is part of the
themed issue 'Multiscale modelling at the physics-chemistry-biology interface'.
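
The abstract's claim that curve fitting can fail beyond the range of its training data can be illustrated with a toy case. The sketch below is not from the article: the quadratic "true" process, the straight-line model and all function names are assumptions chosen only for illustration. A line is fitted by least squares to data sampled on [0, 1], and its error is then compared inside and far outside that interval.

```python
# Illustrative sketch only: the quadratic "true" process and the straight-line
# model are assumptions chosen for this example, not taken from the article.

def true_process(x):
    """Hypothetical underlying system: a simple quadratic law."""
    return x * x


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Training data cover only the interval [0, 1].
train_x = [i / 10 for i in range(11)]
train_y = [true_process(x) for x in train_x]
slope, intercept = fit_line(train_x, train_y)


def predict(x):
    """Prediction of the fitted line, which knows nothing about the quadratic law."""
    return slope * x + intercept


# Worst error inside the training range versus the error far outside it.
in_range_error = max(abs(predict(x) - true_process(x)) for x in train_x)
extrapolation_error = abs(predict(3.0) - true_process(3.0))

print(f"max error on [0, 1]: {in_range_error:.3f}")
print(f"error at x = 3:      {extrapolation_error:.3f}")
```

Within the sampled interval the fitted line stays close to the data, but at x = 3 its error is many times larger, because the model does not encode the structure of the process that generated the data.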