A Markov chain representation of the multiple testing problem

publication date

  • February 2018

start page

  • 364

end page

  • 383

issue

  • 2

volume

  • 27

International Standard Serial Number (ISSN)

  • 0962-2802

Electronic International Standard Serial Number (EISSN)

  • 1477-0334

abstract

  • The problem of multiple hypothesis testing can be represented as a Markov process in which a new alternative hypothesis is accepted according to its evidence relative to the currently accepted one. This virtual, not formally observed, process provides the most probable set of non-null hypotheses given the data; it plays the same role as Markov chain Monte Carlo in approximating a posterior distribution. To apply this representation and obtain the posterior probabilities over all alternative hypotheses, it is enough to have, for each test, barely defined Bayes factors, e.g. Bayes factors obtained up to an unknown constant. Such Bayes factors may arise either from using default and improper priors or from calibrating p-values with respect to their corresponding Bayes factor lower bound. Both sources of evidence are used to form a Markov transition kernel on the space of hypotheses. The approach leads to easily interpretable results and involves very simple formulas suitable for analyzing large datasets such as those arising from gene expression data (microarray or RNA-seq experiments).
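
As an illustration of the p-value calibration mentioned in the abstract, the sketch below computes a commonly used lower bound on the Bayes factor in favour of the null hypothesis, -e · p · ln(p), valid for p < 1/e. That the article relies on exactly this bound is an assumption; the abstract does not state the formula, and the function name `bayes_factor_lower_bound` is hypothetical.

```python
import math


def bayes_factor_lower_bound(p: float) -> float:
    """Lower bound on the Bayes factor of the null against the alternative.

    Assumption: uses the -e * p * ln(p) calibration, which holds for
    0 < p < 1/e. For larger p the bound is uninformative, so 1.0 is returned.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if p >= 1.0 / math.e:
        return 1.0
    return -math.e * p * math.log(p)


if __name__ == "__main__":
    # Smaller p-values give smaller bounds, i.e. stronger evidence against the null.
    for p in (0.05, 0.01, 0.001):
        print(p, round(bayes_factor_lower_bound(p), 4))
```

Under this calibration a p-value of 0.05 corresponds to a bound of roughly 0.41, so the null is at most about 2.5 times less supported than the alternative.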

keywords

  • Bayes factors lower bounds; default Bayes; gene expression; improper priors; RNA-seq