Confronting p-hacking: addressing p-value dependence on sample size

authors

  • GOMEZ DE MARISCAL, ESTIBALIZ
  • JAYATILAKA, HASINI
  • PHILLIP, JUDE M.
  • WIRTZ, DENIS

publication date

  • December 2019

start page

  • 1

end page

  • 38

abstract

  • The p-value is routinely compared with a threshold, commonly set to 0.05, to assess statistical null hypotheses. With a large enough dataset, this threshold is easily reached, whether by a single p-value or by its distribution. We prove that the p-value can alternatively be modeled as a continuous exponential function of the sample size. The function's decay can be used to analyze the data, assess the null hypothesis, and determine the minimum data size needed to reject it. An in-depth study of the model on three different experimental datasets reflects the broad scope of this approach in common data analysis and decision-making processes.
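
The idea in the abstract can be illustrated with a minimal sketch. It assumes, for illustration only, that the p-value follows p(n) = a·exp(-c·n) for sample size n; the record itself does not give the functional form. The function names `fit_exponential` and `min_sample_size` are hypothetical, and the log-linear least-squares fit is a simple stand-in, not necessarily the fitting procedure used in the article. The input points are synthetic, generated from known parameters, and are not data from the study.

```python
import math


def fit_exponential(sizes, p_values):
    """Fit p(n) = a * exp(-c * n) by least squares on ln p = ln a - c * n.

    Assumption: every p-value is strictly positive so the logarithm exists.
    Returns the pair (a, c).
    """
    xs = [float(n) for n in sizes]
    ys = [math.log(p) for p in p_values]
    k = len(xs)
    mean_x = sum(xs) / k
    mean_y = sum(ys) / k
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


def min_sample_size(a, c, alpha=0.05):
    """Smallest integer n with a * exp(-c * n) <= alpha.

    Returns None when the curve does not decay (c <= 0), since the
    threshold would then never be reached under this model.
    """
    if c <= 0:
        return None
    if a <= alpha:
        return 0
    return math.ceil(math.log(a / alpha) / c)


if __name__ == "__main__":
    # Synthetic points from a = 0.5, c = 0.02 (illustrative values only).
    sizes = [10, 20, 50, 100, 200]
    p_values = [0.5 * math.exp(-0.02 * n) for n in sizes]
    a, c = fit_exponential(sizes, p_values)
    print(round(a, 6), round(c, 6), min_sample_size(a, c))
```

Under these assumptions, a larger fitted decay rate c means the 0.05 threshold is crossed at a smaller sample size, while a decay rate near zero means the threshold is not reached at any practical n.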