Saturday, October 26, 2024

how to read CI

Notes from reading 'Inference by Eye'. Interpreting a CI figure requires knowing not only what is plotted (SE vs SD vs CI) but also the experimental design and analysis context (group means of independent samples vs pre-post means of repeated measures vs meta-analysis). It is important to understand which effect or comparison is of primary interest.

A CI is just one from an infinite sequence: if the experiment were repeated many times and a CI calculated each time, in the long run 95% of the CIs would include the true mean. Equivalently, a researcher who routinely reports 95% CIs can expect that, over a lifetime, about 95% of those intervals will capture the true mean. To interpret a CI: it is a range of plausible values for the mean; values outside the CI are relatively implausible.
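The long-run reading can be checked with a small simulation. This is only a sketch under assumed settings (normal data, true mean 10, SD 2, n = 20, all chosen for illustration); the t critical value 2.093 is the standard table value for 19 degrees of freedom, hard-coded so the example needs only the standard library.

```python
import random
import statistics

# Two-sided 95% critical value of Student's t with 19 df (valid for n = 20 only).
T_CRIT_DF19 = 2.093

def ci95(sample):
    """95% CI for the mean of a sample of size 20, using the t distribution."""
    n = len(sample)
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / n ** 0.5
    return mean - T_CRIT_DF19 * se, mean + T_CRIT_DF19 * se

def coverage(true_mean=10.0, sd=2.0, n=20, reps=5000, seed=1):
    """Fraction of simulated experiments whose CI contains the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        lo, hi = ci95(sample)
        hits += lo <= true_mean <= hi
    return hits / reps

cov = coverage()
print(f"fraction of intervals containing the true mean: {cov:.3f}")
```

The printed fraction should land close to 0.95. Any single interval either contains the true mean or does not; the 95% describes the procedure, not one interval.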

The margin of error (half the width of the CI) is the largest error of estimation we are likely to make.

For a comparison of two independent means, p<=.05 when the overlap of the 95% CIs is no more than about half the average margin of error (the half-width of a CI), that is, when the proportion overlap is about half or less. In addition, p<=.01 when the two CIs do not overlap. If the figure shows SE bars instead, the relationship between SE and 95% CI implies that p<=.05 when the gap between the SE bars is at least about the size of the average SE of the two groups. These rules do not work at all for paired data, because the width of the CI for the mean difference depends on the correlation between the pairs; a positive correlation reduces the width of the CI for the mean difference.
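The overlap rules can be sanity-checked with a little arithmetic. The sketch below is my own illustration, not taken from the paper: it assumes a large-sample normal approximation and equal SEs in the two groups, and the function name `two_sided_p` and the correlation value 0.8 are made up for the example.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()
Z975 = STD_NORMAL.inv_cdf(0.975)  # about 1.96

def two_sided_p(diff, se1, se2, r=0.0):
    """Large-sample two-sided p value for a difference between two means.

    r is the correlation between paired measurements (0 for independent groups).
    """
    se_diff = sqrt(se1 ** 2 + se2 ** 2 - 2 * r * se1 * se2)
    return 2 * (1 - STD_NORMAL.cdf(abs(diff) / se_diff))

se = 1.0          # assumed common standard error of each mean
moe = Z975 * se   # margin of error = half-width of each 95% CI

# Independent groups: the distance between means implied by each picture.
p_half_overlap = two_sided_p(2 * moe - 0.5 * moe, se, se)  # CIs overlap by half a margin
p_touching = two_sided_p(2 * moe, se, se)                  # CIs just touch
p_se_gap = two_sided_p(2 * se + 1 * se, se, se)            # SE bars separated by one SE

# Same picture (CIs overlapping by a full margin of error), independent vs paired.
p_indep = two_sided_p(1.0 * moe, se, se, r=0.0)
p_paired = two_sided_p(1.0 * moe, se, se, r=0.8)

for name, p in [("half overlap", p_half_overlap), ("touching", p_touching),
                ("SE gap of 1 SE", p_se_gap), ("full overlap, independent", p_indep),
                ("full overlap, paired r=0.8", p_paired)]:
    print(f"{name}: p = {p:.4f}")
```

Under these assumptions the first three pictures give p below .05, .01 and .05 respectively. The last two lines show why the rules fail for paired data: the same pair of heavily overlapping CIs is far from significant for independent groups, but clearly significant once the pairs are strongly positively correlated.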

Friday, October 04, 2024

optimality of Likelihood ratio

From wikipedia:
the Neyman–Pearson lemma describes the existence and uniqueness of the likelihood ratio as a uniformly most powerful test in certain contexts.
I recently came across its application on two occasions:
  • showing that logistic regression is the optimal way to combine several predictors for a binary outcome.
  • showing that Bonferroni is the optimal test of the global null against a sparse alternative (only one alternative hypothesis is true among a great many true nulls), while Fisher's combination test is optimal against an alternative of small, distributed effects. In this example it becomes obvious that optimality depends on the likelihood under the alternative.
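The second example is easy to see in a simulation. The sketch below rests on settings I chose for illustration: 50 one-sided z tests, a sparse alternative with a single mean of 4, and a distributed alternative with every mean at 0.4. Fisher's statistic is referred to a chi-square distribution with 2n degrees of freedom, whose tail has a closed form for even degrees of freedom, so only the standard library is needed.

```python
import math
import random

def upper_tail_p(z):
    """One-sided p value for a z statistic (erfc stays accurate in the far tail)."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def chi2_sf_even_df(x, n):
    """P(X > x) for a chi-square variable with 2n degrees of freedom."""
    half = x / 2
    term, total = 1.0, 1.0
    for k in range(1, n):
        term *= half / k
        total += term
    return math.exp(-half) * total

def bonferroni_rejects(pvals, alpha=0.05):
    return min(pvals) <= alpha / len(pvals)

def fisher_rejects(pvals, alpha=0.05):
    stat = -2 * sum(math.log(p) for p in pvals)
    return chi2_sf_even_df(stat, len(pvals)) <= alpha

def power(means, reps, rng):
    """Rejection rates of the global null for (Bonferroni, Fisher)."""
    bonf = fisher = 0
    for _ in range(reps):
        pvals = [upper_tail_p(rng.gauss(mu, 1)) for mu in means]
        bonf += bonferroni_rejects(pvals)
        fisher += fisher_rejects(pvals)
    return bonf / reps, fisher / reps

rng = random.Random(0)
n = 50
sparse = [4.0] + [0.0] * (n - 1)  # one strong signal, 49 true nulls
dense = [0.4] * n                 # many small effects

bonf_sparse, fisher_sparse = power(sparse, 2000, rng)
bonf_dense, fisher_dense = power(dense, 2000, rng)
print(f"sparse alternative: Bonferroni {bonf_sparse:.2f}, Fisher {fisher_sparse:.2f}")
print(f"dense alternative:  Bonferroni {bonf_dense:.2f}, Fisher {fisher_dense:.2f}")
```

With these settings Bonferroni should have clearly higher power under the sparse alternative and Fisher clearly higher power under the distributed one; which test wins depends entirely on which alternative generated the data.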

The first example is suspicious because I know that throwing all predictors into a linear model often does not noticeably improve model performance. By the way, the link between the odds ratios of individual predictors and overall performance is weak (from a publication by Pepe in the following year; the paper is mostly about this disconnect in univariate models and covers the incremental value of markers near the end: a marker with a statistically significant OR adds little to the AUC).
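A rough way to see the weak link between OR and AUC, under an assumption of my own rather than anything taken from the paper: suppose the marker is normal with the same variance in cases and controls. Then the log odds ratio per SD equals the standardized mean difference d, and the AUC is Phi(d / sqrt(2)). The helper name `auc_from_odds_ratio` is made up for this sketch.

```python
from math import log, sqrt
from statistics import NormalDist

def auc_from_odds_ratio(or_per_sd):
    """AUC implied by an odds ratio per SD under an equal-variance binormal marker."""
    d = log(or_per_sd)  # standardized mean difference between cases and controls
    return NormalDist().cdf(d / sqrt(2))

auc_or_1_5 = auc_from_odds_ratio(1.5)
auc_or_3 = auc_from_odds_ratio(3.0)
print(f"OR 1.5 per SD -> AUC {auc_or_1_5:.3f}")
print(f"OR 3.0 per SD -> AUC {auc_or_3:.3f}")
```

Under this assumption an odds ratio of 1.5 per SD, which is easily significant in a large study, corresponds to an AUC only a little above 0.6, and even an odds ratio of 3 stays below 0.8.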

Initially, I wondered whether this is because the Neyman–Pearson lemma was developed for simple hypotheses. A little investigation suggests that there is nothing magic about a simple hypothesis versus a composite one; the key property is that the likelihood ratio is monotone (from wiki: "The Karlin-Rubin theorem extends the Neyman–Pearson lemma to settings involving composite hypotheses with monotone likelihood ratios.")

Later, with the help of example 2, it became clear that the optimality is conditional on the model being correctly specified. There could be a setting where majority voting among the multiple predictors is the best predictor. If there are interactions, an SVM could work better than linear regression...
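A toy case of misspecification, built on assumptions of my own: two independent standard normal predictors and an outcome that is 1 exactly when they have the same sign (a pure interaction). For simplicity the linear score is fixed at x1 + x2 rather than fitted; by symmetry any linear combination should do no better than chance here, while the product x1 * x2 separates the classes perfectly.

```python
import random

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

rng = random.Random(0)
points = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2000)]
labels = [1 if x1 * x2 > 0 else 0 for x1, x2 in points]  # pure interaction

auc_linear = auc([x1 + x2 for x1, x2 in points], labels)   # additive score
auc_product = auc([x1 * x2 for x1, x2 in points], labels)  # interaction score
print(f"linear score AUC:  {auc_linear:.3f}")
print(f"product score AUC: {auc_product:.3f}")
```

The additive score should sit near 0.5 while the interaction score reaches 1.0, so the "optimal" combination under an additive model is useless when the true structure is an interaction.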