Tuesday, June 10, 2014

mean, median and mode

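This page explains that a mean minimizes the $\ell^2$ norm of the residual, $\min_{m_2} \sum_i (m_2-d_i)^2$;
a median minimizes its $\ell^1$ norm, $\min_{m_1} \sum_i \vert m_1-d_i\vert$; and a mode minimizes the zero "norm" of the residual, namely $\min_{m_0} \sum_i \vert m_0-d_i\vert^0$ with the convention $0^0=0$, so the sum counts the data points that differ from $m_0$. See the Wikipedia page about the median.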
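
A quick numerical check of the three claims, as a minimal sketch: the data set, the candidate grid and the helper names `loss` and `grid_minimizer` are made up for illustration, and a brute-force grid search stands in for a proper optimizer.

```python
import statistics

def loss(m, data, p):
    """Sum of |m - d|^p over the data, using the convention 0^0 = 0 when p = 0."""
    if p == 0:
        # The zero "norm" just counts the data points that differ from m.
        return sum(1 for d in data if d != m)
    return sum(abs(m - d) ** p for d in data)

def grid_minimizer(data, p, candidates):
    """Return the candidate value with the smallest loss (brute-force search)."""
    return min(candidates, key=lambda m: loss(m, data, p))

# Illustrative data only; the grid covers 0.0, 0.1, ..., 10.0.
data = [1, 2, 2, 3, 10]
candidates = [x / 10 for x in range(0, 101)]

m2 = grid_minimizer(data, 2, candidates)  # l2 minimizer, compare with the mean
m1 = grid_minimizer(data, 1, candidates)  # l1 minimizer, compare with the median
m0 = grid_minimizer(data, 0, candidates)  # l0 minimizer, compare with the mode

print(m2, statistics.mean(data))
print(m1, statistics.median(data))
print(m0, statistics.mode(data))
```

The outlier 10 pulls the $\ell^2$ minimizer upward but leaves the $\ell^1$ and $\ell^0$ minimizers where the bulk of the data sits.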

From here, it was further explained that:
Inder Jeet Taneja’s book draft has a nice survey of the results: if you fix the upper and lower boundary, and maximize entropy, you’ll get the uniform distribution. If you fix the mean and the expected L2 norm (d^2) between the mean and the distribution, maximizing the entropy you’ll get the Gaussian. If you fix the expected L1 norm (|d|) between the mean and the distribution, maximizing the entropy you’ll get the Laplace (also referred to as Double Exponential). Moreover, log(1+d^2) norm will yield the Cauchy distribution – a special case of the standard heavy-tailed Student distribution.
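
The Gaussian and Laplace claims can be sanity-checked with the standard closed-form differential entropies of the three distributions. This is only a comparison among these three families under a matched constraint, not a proof of the maximum-entropy property; the function names are made up for illustration.

```python
import math

def entropy_gaussian(sigma):
    """Differential entropy (nats) of a Gaussian with standard deviation sigma."""
    return 0.5 * math.log(2 * math.pi * math.e * sigma ** 2)

def entropy_laplace(b):
    """Differential entropy (nats) of a Laplace distribution with scale b."""
    return 1 + math.log(2 * b)

def entropy_uniform(width):
    """Differential entropy (nats) of a uniform distribution on an interval of this width."""
    return math.log(width)

def entropies_at_fixed_variance(var):
    # Match E[d^2] = var: Laplace has variance 2 b^2, uniform has width^2 / 12.
    return {
        "gaussian": entropy_gaussian(math.sqrt(var)),
        "laplace": entropy_laplace(math.sqrt(var / 2)),
        "uniform": entropy_uniform(math.sqrt(12 * var)),
    }

def entropies_at_fixed_abs_dev(mad):
    # Match E[|d|] = mad: Gaussian has sigma * sqrt(2/pi), Laplace has b, uniform has width / 4.
    return {
        "gaussian": entropy_gaussian(mad * math.sqrt(math.pi / 2)),
        "laplace": entropy_laplace(mad),
        "uniform": entropy_uniform(4 * mad),
    }

by_var = entropies_at_fixed_variance(1.0)
by_mad = entropies_at_fixed_abs_dev(1.0)
print(max(by_var, key=by_var.get))  # family with the largest entropy at equal E[d^2]
print(max(by_mad, key=by_mad.get))  # family with the largest entropy at equal E[|d|]
```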

Thursday, June 05, 2014

check points when reviewing a genetic screening report

  1. title and footnote, ensuring they describe the analysis population, the outcome variable and the class of genetic markers; 
  2. eyeball examples:
    • 1 example of an X-chromosome SNP
    • 1 example of an autosomal SNP with only 2 genotypes
    • 1 example of a top association 
    • 1 example of a random association
  3. use the Excel output to check the value ranges for each column, paying attention to
    • extreme values
    • empty cells
    • characters indicating missing values: -, NA, 0
  4. Cosmetic issues
    • decimal places
    • check line ends for truncated characters
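
The column checks in step 3 can be partly automated. A minimal sketch, assuming the Excel output has been exported to CSV; the column names, the sample rows and the helper `summarize_columns` are hypothetical, not part of any real report.

```python
import csv
import io

# Markers listed in the checklist as possibly meaning "missing".
MISSING_MARKERS = {"-", "NA", "0"}

def summarize_columns(rows):
    """Per column: numeric min/max, count of empty cells, count of missing markers."""
    summary = {}
    for row in rows:
        for col, raw in row.items():
            s = summary.setdefault(
                col, {"min": None, "max": None, "empty": 0, "missing_marker": 0}
            )
            value = (raw or "").strip()
            if value == "":
                s["empty"] += 1
                continue
            if value in MISSING_MARKERS:
                s["missing_marker"] += 1
                continue
            try:
                number = float(value)
            except ValueError:
                continue  # non-numeric text such as a SNP id or "X"
            s["min"] = number if s["min"] is None else min(s["min"], number)
            s["max"] = number if s["max"] is None else max(s["max"], number)
    return summary

# Hypothetical export, only to show the shape of the output.
sample = io.StringIO(
    "snp,chr,p_value,odds_ratio\n"
    "rs_example_1,X,0.04,1.2\n"
    "rs_example_2,7,,0.9\n"
    "rs_example_3,12,NA,35.0\n"
    "rs_example_4,3,1e-9,-\n"
)
report = summarize_columns(list(csv.DictReader(sample)))
for col, stats in report.items():
    print(col, stats)
```

The min and max per column make extreme values easy to spot, but whether a literal 0 is a real value or a missing-data marker still has to be judged column by column.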