Confronting p-hacking: addressing p-value dependence on sample size

The p-value is routinely compared with a fixed threshold, commonly set to 0.05, to test statistical null hypotheses. With a large enough dataset, this threshold is easily reached, whether one considers a single p-value or the distribution of p-values. We prove that the p-value can alternatively be modeled as a continuous exponential function of the sample size. The decay of this function can be used to analyze the data, assess the null hypothesis, and determine the minimum sample size needed to reject it. An in-depth study of the model on three different experimental datasets illustrates the broad applicability of this approach to common data analysis and decision-making processes.
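
To make the idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the exponential model takes the form p(n) = a·exp(−c·n), estimates the mean p-value at each sample size n by repeated random subsampling, and fits a and c by least squares on log p. The two-sample z-test with known variance, the simulated Gaussian data, and all function names are illustrative assumptions not taken from the text.

```python
import math
import random
from statistics import mean


def z_test_pvalue(x, y, sigma=1.0):
    """Two-sided p-value for H0: equal means (equal-size samples, known sigma; assumed test)."""
    n = len(x)
    z = (mean(x) - mean(y)) / (sigma * math.sqrt(2.0 / n))
    return math.erfc(abs(z) / math.sqrt(2.0))


def mean_pvalue_curve(pop_x, pop_y, sizes, repeats=200, rng=None):
    """Average p-value over `repeats` random subsamples for each sample size n."""
    rng = rng or random.Random(0)
    curve = []
    for n in sizes:
        ps = [z_test_pvalue(rng.sample(pop_x, n), rng.sample(pop_y, n))
              for _ in range(repeats)]
        curve.append(mean(ps))
    return curve


def fit_exponential(sizes, pvalues):
    """Fit p(n) = a * exp(-c * n) by linear least squares on log p; returns (a, c)."""
    logs = [math.log(p) for p in pvalues]
    n_bar, l_bar = mean(sizes), mean(logs)
    sxx = sum((n - n_bar) ** 2 for n in sizes)
    sxy = sum((n - n_bar) * (l - l_bar) for n, l in zip(sizes, logs))
    slope = sxy / sxx
    return math.exp(l_bar - slope * n_bar), -slope


def min_sample_size(a, c, alpha=0.05):
    """Smallest integer n >= 0 with a * exp(-c * n) <= alpha; None if the curve does not decay."""
    if c <= 0:
        return None
    return max(0, math.ceil(math.log(a / alpha) / c))


# Simulated example (assumed data): two groups whose means differ by 0.3 sigma.
rng = random.Random(42)
pop_x = [rng.gauss(0.0, 1.0) for _ in range(5000)]
pop_y = [rng.gauss(0.3, 1.0) for _ in range(5000)]
sizes = list(range(10, 101, 10))

curve = mean_pvalue_curve(pop_x, pop_y, sizes, rng=rng)
a, c = fit_exponential(sizes, curve)
print(f"a = {a:.3f}, c = {c:.4f}, minimum n for p <= 0.05: {min_sample_size(a, c)}")
```

Under this assumed form, a larger decay constant c means the fitted curve crosses the 0.05 threshold at a smaller n, while a decay close to zero means the threshold is reached only with very large samples, if at all.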
