Dots

Thursday, January 26, 2012

Hello, Bugs world!

Here is my first Bayesian code modified from the sample code for the Chapter 3 of Bayesian Adaptive Methods for Clinical Trials. The R code depends on BRugs that interfaces with Openbugs (manual is here).

IBS alternative

The publicly available IBS (Identity by State) macro in SAS is quite slow, and has a bug that IBS is not adjusted for call rate. I made an alternative implementation which is about 1/2 or 1/3 faster. However, SAS apparently handles the standard correlation calculations much much faster, and resulting Pearson correlation, Kendall correlation and Spearman correlation may give similar measurements to IBS (code is here, data include about 1000 markers clustered in candidate genes in 400 subjects, primary Caucasians). Therefor I guess in certain Kernel application, we can use usual correlation to replace IBS.

Thursday, January 12, 2012

64bit perl debugger

It turns out Active Perl does not support 64bit debugger in 64 bit windows under PDK. From here:

The PDK's debugger was written for 32-bit Windows only. It's not available on ANY other platform, hence the 64-bit Windows PDK does not include a debugger. This has been documented in the release notes since the first 64-bit Windows PDK appeared (PDK 8.0).

There are no plans to update the PDK debugger, since a viable (and better) alternative exists in Komodo IDE. If you want a debugger for 64-bit, the multi-platform, multi-language Komodo IDE debugger is our suggested choice.

But still we can install 32 bit Active Perl for windows and use an older version of PDK (version 6 works).

sample code - MMRM

the following code implements the analyses suggested at the lower right corner of page 10 (i.e., page 312) of this paper for genetic predictors.

pairs() and more

Getting genetics done has a piece of code to make a more informative pairwise scatter plot than pairs()

# panel.smooth function is built in.
# panel.cor puts correlation in upper panels, size proportional to correlation
panel.cor <- function(x, y, digits=2, prefix="", cex.cor, ...)
{
    usr <- par("usr"); on.exit(par(usr))
    par(usr = c(0, 1, 0, 1))
    r <- abs(cor(x, y))
    txt <- format(c(r, 0.123456789), digits=digits)[1]
    txt <- paste(prefix, txt, sep="")
    if(missing(cex.cor)) cex.cor <- 0.8/strwidth(txt)
    text(0.5, 0.5, txt, cex = cex.cor * r)
}
 
# Plot #2: same as above, but add loess smoother in lower and correlation in upper
pairs(~Sepal.Length+Sepal.Width+Petal.Length+Petal.Width,

data=iris,lower.panel=panel.smooth, upper.panel=panel.cor, 
      pch=20, main="Iris Scatterplot Matrix")

From the comments, we see alternatives are pairs.panels() from package psych and chart.Correlation() from package PerformanceAnalytics.

Saturday, December 24, 2011

cgm to ps

http://technologytales.com/tag/ralcgm/
RALCGM is a handy command line tool that can covert from CGM to Postscript on its own without any need for ImageMagick at all.

Monday, October 17, 2011

NGS reading list (Oct 2011)

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2581791/
http://cs.calstatela.edu/wiki/images/5/5f/Sequence_census_methods_for_functional_genomics.pdf
http://raindancetech.com/documents/Next-Generation-Sequencing_Nature-Genetics_Jan-2010.pdf

Wednesday, October 05, 2011

23 Sure-Fire Ways To Improve Your Presentations

from here
BEFORE YOU START

test linear hypothesis in proc phreg

The idea of LSmean becomes so integrated part of statistical reporting that we feel so clumsy in proc phreg from sas 9.1.3 or earlier, where only numerical predictors are allowed. In the following test, a few examples of TEST statement for some common linear hypotheses are presented. The results have been comfirmed using proc phreg in sas 9.2, where a CLASS statement is available. Assuming a genotype effect with 3 levels (AA, AB and BB) and a treatment effect with 2 levels (0, 1) (and it seems p-value for the genetic effect across treatment arm will be different if the 2 treatment levels are codes as 1 and 2), a Cox proportional hazards model of survival time as a function of genotype, treatment and genotype-by-treatment interaction is fitted.

Comparison of general and generalized linear models

wiki

Friday, September 09, 2011

conditional exact logistic regression

Conditional Logistic Regression:

Taking the stratification into account by "conditioning out" (and not estimating) the stratum-specific intercepts gives consistent and asymptotically normal MLEs for the slope coefficients. If your nuisance parameters are not just stratum-specific intercepts, you can perform an exact conditional logistic regression.

The likelihood function to be maximized feels really like that in proc phreg with a discrete option. Each item (for a stratum) in the likelihood function is the conditional probability of observing subject 1 to h who actually had an event conditioning on h event happened (for any h subjects) in the stratum. The concept of sufficient statistic is not explicitly. Conditional logistic regression is called with the strata statement. More examples are from here and here.

Exact Conditional Logistic Regression
Exact conditional inference is based on generating the conditional distribution for the sufficient statistics of the parameters of interest. This distribution is called the permutation or exact conditional distribution. One example from sas suggests that we move from conditional regression to exact conditional regression when "you believe the data set is too small or too sparse for the usual asymptotics to hold.":

by "sparse", they probably refer to the problem of complete separation of quasi-complete separation, and the exact method is one way to handle this, especially when the firth option is only available in sas 9.1.3;
by "usual asymptotics", they probably refer to the issue of large-sample asymptotic normality when we do not have a large sample size compared to the number of parameters to estimate.

Thursday, August 25, 2011

%plotit macro

a good document with graphic examples are here.
this one is more technical.

Friday, August 19, 2011

use %scan function to cycle through a list of items

There are many examples on line. But I have difficulty to break from the loop following them in one SAS system.

model selection for mmrm

it seems glmselect cannot handle this. See a proposed procedure here.

Tuesday, August 02, 2011

Geometric Statistics, geometric CV, intra-subject variation

found from onbiostatistics and here. Additional discussion can be found here.
also for a log normal variable x, there are

y=log(x)~N(mu, sigma**2)
E(x)=exp(mu+sigma**2/2) and
sd(x)=exp(mu+sigma**2/2)*sqrt(exp(sigma**2)-1)

cv(x)=sqrt(exp(sigma**2)-1)

Tuesday, July 05, 2011

left join A and B may still lose obs in A

Imaging data set A has the demog information, and data set B has multiple outcome variable and covariate information. B has less subjects than A. To create data set C which is merged from A and B and contains all subjects in A (even the subject is not in B), the following sql statement would be fine

proc sql,
create table c (drop=subjid) as;
select a.*, b.* from a left join b (rename=(pat_id=subjid))
on a.pat_id=b.subjid;
quit;

However, since B has multiple outcome variables, there can be an intent to filter out one outcome variable like the following

proc sql,
create table c (drop=subjid) as;
select a.*, b.* from a left join b (rename=(pat_id=subjid))
on a.pat_id=b.subjid
where b.outcomeName='outcome A';
quit;

This is wrong, because the field of outcomeName will be missing (and filtered out) for subjects in A but not in B. The solution can be either to filter in the separate data step, or the following sql statement

proc sql,
create table c (drop=subjid) as;
select a.*, b.* from a left join b (rename=(pat_id=subjid) where=(outcomeName='outcome A'))
on a.pat_id=b.subjid;

quit;

Thursday, January 26, 2012

Monday, January 16, 2012

Thursday, January 12, 2012

Tuesday, January 10, 2012

Saturday, December 24, 2011

Monday, October 17, 2011

Wednesday, October 05, 2011

Thursday, September 22, 2011

Tuesday, September 20, 2011

Friday, September 09, 2011

Thursday, August 25, 2011

Friday, August 19, 2011

Monday, August 15, 2011

Tuesday, August 02, 2011

Tuesday, July 05, 2011

Sunday, June 12, 2011

Friday, May 20, 2011

Thursday, May 19, 2011

Labels

Followers

Blog Archive