Data science, big data and statistics Articles
Overview
published in
- TEST Journal
publication date
- April 2019
start page
- 289
end page
- 329
issue
- 2
volume
- 28
Digital Object Identifier (DOI)
International Standard Serial Number (ISSN)
- 1133-0686
Electronic International Standard Serial Number (EISSN)
- 1863-8260
abstract
- This article analyzes how Big Data is changing the way we learn from observations. We describe the changes in statistical methods in seven areas that have been shaped by the Big Data-rich environment: the emergence of new sources of information; visualization in high dimensions; multiple testing problems; analysis of heterogeneity; automatic model selection; estimation methods for sparse models; and merging network information with statistical models. Next, we compare the statistical approach with those in computer science and machine learning and argue that the convergence of different methodologies for data analysis will be the core of the new field of data science. Then, we present two examples of Big Data analysis in which several of the new tools discussed previously are applied, such as using network information or combining different sources of data. The article concludes with some final remarks.
Classification
subjects
- Statistics
keywords
- machine learning; sparse model selection; statistical learning; network analysis; multivariate data; time series