Friday, October 04, 2024

Optimality of the likelihood ratio

From Wikipedia:
"the Neyman–Pearson lemma describes the existence and uniqueness of the likelihood ratio as a uniformly most powerful test in certain contexts."
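For concreteness, here is the standard form of the lemma (paraphrased from the usual textbook statement, not from the quoted sentence):

```latex
% For simple hypotheses H_0: theta = theta_0 vs H_1: theta = theta_1,
% the test that rejects H_0 when the likelihood ratio exceeds a cutoff,
\[
  \Lambda(x) \;=\; \frac{\mathcal{L}(\theta_1 \mid x)}{\mathcal{L}(\theta_0 \mid x)} \;>\; k,
  \qquad \Pr_{H_0}\!\left(\Lambda(X) > k\right) = \alpha,
\]
% is the most powerful test of size alpha.
```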
I recently came across its applications on two occasions:
  • showing that logistic regression is the optimal test for combining several predictors of a binary outcome (a derivation sketch follows further down).
  • showing that Bonferroni is the optimal test of the global null against a sparse alternative (only one alternative hypothesis is true among many true nulls), while Fisher's combination test is optimal against an alternative of small, distributed effects. In this example it becomes obvious that optimality depends on the likelihood under the alternative (a simulation sketch follows right after this list).
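A minimal simulation sketch of the second point, with parameter choices of my own (m = 100 independent one-sided z-tests; one shift of 4 standard deviations for the sparse case, shifts of 0.6 everywhere for the distributed case): Bonferroni (the min-p test) should win under sparsity, Fisher's combination under distributed effects.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
m, n_sim, alpha = 100, 5000, 0.05

def simulate(mu, k):
    """Power of Bonferroni (min-p) vs Fisher's combination test
    when k of the m hypotheses are non-null with mean shift mu."""
    z = rng.normal(size=(n_sim, m))
    z[:, :k] += mu
    p = stats.norm.sf(z)                      # one-sided p-values
    bonf = (p.min(axis=1) < alpha / m).mean() # reject if min p < alpha/m
    fisher_stat = -2 * np.log(p).sum(axis=1)  # ~ chi^2 with 2m df under the null
    fisher = (fisher_stat > stats.chi2.ppf(1 - alpha, df=2 * m)).mean()
    return bonf, fisher

# sparse alternative: one strong signal among many nulls
print("sparse:", simulate(mu=4.0, k=1))
# distributed alternative: many weak signals
print("dense :", simulate(mu=0.6, k=m))
```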

The first example is suspicious because I know that throwing all predictors into a linear model often does not obviously improve model performance. Relatedly, the link between odds ratios from individual predictors and overall performance is weak (from a publication by Pepe the following year; though the paper was mostly about that disconnect in univariate models, it covers the incremental value of markers near the end: a marker with a statistically significant OR can add little to the AUC).
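A toy version of that disconnect (my own illustration, not Pepe's analysis; the effect sizes and sample size are arbitrary): with a large sample, a marker whose odds ratio is clearly non-null can move the AUC barely at all.

```python
import numpy as np
import statsmodels.api as sm
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
n = 20000
x1 = rng.normal(size=n)                       # strong baseline predictor
x2 = rng.normal(size=n)                       # new marker, true OR per SD ~ exp(0.2)
p = 1 / (1 + np.exp(-(2.0 * x1 + 0.2 * x2)))
y = rng.binomial(1, p)

base = sm.Logit(y, sm.add_constant(x1)).fit(disp=0)
full = sm.Logit(y, sm.add_constant(np.c_[x1, x2])).fit(disp=0)

print("marker p-value:", full.pvalues[-1])    # highly significant OR
print("AUC base:", roc_auc_score(y, base.predict()))
print("AUC full:", roc_auc_score(y, full.predict()))  # barely higher
```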

Initially, I wondered whether this was because the Neyman–Pearson lemma was developed for simple hypotheses. A little investigation suggests there is no magic property of a simple hypothesis versus a composite one; the key requirement is that the likelihood ratio is monotone (from Wikipedia: "The Karlin-Rubin theorem extends the Neyman–Pearson lemma to settings involving composite hypotheses with monotone likelihood ratios.")
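For example 1, the link runs through the posterior log odds. A sketch of the standard argument, under the assumption that the class-conditional log likelihood ratio is linear in the predictors (e.g., Gaussians with a shared covariance):

```latex
% Posterior log odds decompose into the class-conditional log likelihood
% ratio plus a constant prior term:
\[
  \log \frac{P(Y=1 \mid X=x)}{P(Y=0 \mid X=x)}
  \;=\; \log \frac{f_1(x)}{f_0(x)} \;+\; \log \frac{P(Y=1)}{P(Y=0)},
\]
% where f_k(x) = f(x | Y=k). The logistic score is thus a monotone
% function of the likelihood ratio, so thresholding it gives the
% Neyman-Pearson test. If, e.g., X | Y=k ~ N(mu_k, Sigma) with shared
% Sigma, then
\[
  \log \frac{f_1(x)}{f_0(x)}
  \;=\; (\mu_1 - \mu_0)^{\top} \Sigma^{-1} x \;+\; c,
\]
% which is linear in x, so a linear logistic model is correctly
% specified and recovers the optimal combination of the predictors.
```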

Later, with the help of example 2, it became clear that the optimality is conditional on the model being correctly specified. There could be a setting where majority voting over the multiple predictors is the best predictor. If there are interactions, an SVM could work better than linear logistic regression... (a sketch below).
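A minimal sketch of the misspecified case (hypothetical data of my own construction): an XOR-style interaction makes each predictor useless marginally, so the linear logistic score does no better than chance while a kernel SVM recovers the signal.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
X = rng.normal(size=(4000, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)   # label depends only on the interaction
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

# linear model is misspecified here: no linear combination separates XOR
print("logistic:", LogisticRegression().fit(Xtr, ytr).score(Xte, yte))  # ~0.5
print("rbf svm :", SVC(kernel="rbf").fit(Xtr, ytr).score(Xte, yte))     # ~1.0
```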