Monday, October 11, 2010

run regression in R

Gelman: I really hate to think that there are people out there running regressions in R and not using display() and coefplot() to look at the output.

Wednesday, October 06, 2010

Metric MDS starting from eigen()

This is an exercise to work out the details of MDS, or more specifically, how the coordinates used in plotting are computed. More explanations can be found here.
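As a rough sketch of where the plotting coordinates come from: classical (metric) MDS squares the distances, double-centers them, and scales the leading eigenvectors by the square roots of their eigenvalues. The Python below is only an illustration under stated assumptions, not the original R exercise. It assumes Euclidean distances, uses plain power iteration with deflation as a stand-in for R's eigen(), and the function name classical_mds and the rectangle data are made up for the example.

```python
import math
import random


def classical_mds(dist, k=2, iters=500, seed=0):
    """Classical MDS: return k coordinates per point from a distance matrix."""
    n = len(dist)
    # Squared distances, then double centering: B = -1/2 * J * D^2 * J.
    d2 = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in d2]
    grand_mean = sum(row_mean) / n
    b = [[-0.5 * (d2[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]

    rng = random.Random(seed)
    coords = [[0.0] * k for _ in range(n)]
    for axis in range(k):
        # Power iteration for the leading eigenvector of the current B
        # (an assumption standing in for a full eigen decomposition).
        v = [rng.gauss(0, 1) for _ in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue for the unit vector v.
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(v[i] * bv[i] for i in range(n))
        if lam <= 1e-12:
            break  # no positive eigenvalue left; remaining axes stay at zero
        # Coordinates on this axis: eigenvector scaled by sqrt(eigenvalue).
        for i in range(n):
            coords[i][axis] = math.sqrt(lam) * v[i]
        # Deflate so the next pass finds the next eigenvector.
        for i in range(n):
            for j in range(n):
                b[i][j] -= lam * v[i] * v[j]
    return coords


# Hypothetical data: the corners of a 4 x 2 rectangle.
points = [(0, 0), (4, 0), (4, 2), (0, 2)]
dist = [[math.dist(p, q) for q in points] for p in points]
coords = classical_mds(dist, k=2)
```

The recovered coordinates reproduce the input distances but are only defined up to rotation, reflection, and a shift of the origin, which is why an MDS plot can look flipped relative to the original configuration.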

geometric interpretation of vector operations

here

Tuesday, October 05, 2010

average heterozygosity

From "Ascertainment bias in studies of human genome-wide polymorphism":
A simple comparison of the HapMap and Perlegen genotype data was done by considering the 5682 windows of 500 kb across the entire genome and, for each window, tallying the SFS and calculating summary statistics such as average heterozygosity for each population and FST for each population pair and for the trio of samples.

The average uncorrected heterozygosity within the three population groups for the HapMap data were 0.281, 0.247, and 0.268 for the Yoruban, Chinese, and European samples. The corresponding figures for the uncorrected Perlegen data are 0.251, 0.211, and 0.229 for the African American, Chinese, and European samples.

The histograms look like this.
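The excerpt does not say exactly how the per-window average heterozygosity was computed. A minimal sketch, assuming biallelic SNPs and the usual expected heterozygosity 2p(1-p) with no sample-size correction, might look like the Python below. The allele frequencies are made up for illustration.

```python
def average_heterozygosity(allele_freqs):
    """Mean expected heterozygosity 2p(1-p) over biallelic SNPs.

    allele_freqs: frequency p of one allele at each SNP in a window.
    Assumption: no small-sample correction is applied.
    """
    if not allele_freqs:
        raise ValueError("need at least one SNP")
    return sum(2 * p * (1 - p) for p in allele_freqs) / len(allele_freqs)


# Hypothetical allele frequencies for the SNPs in one 500 kb window.
window_freqs = [0.5, 0.1, 0.25, 0.4]
h = average_heterozygosity(window_freqs)
```

Because 2p(1-p) peaks at p = 0.5, a SNP panel enriched for common variants will show higher average heterozygosity, which is the kind of effect ascertainment bias produces.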

Monday, October 04, 2010

2D plotting in SAS

This example shows a regression plot with prediction limits for individual values (CLI) and confidence limits for the mean (CLM).
proc sgplot data=sashelp.class;
  reg x=height y=weight / CLM CLI;
run;

Tuesday, September 28, 2010

R inferno

Common mistakes in R programming
The R Inferno

Wednesday, September 22, 2010

Critical Chain Project Management

In CCPM two durations are estimated for each task: an aggressive duration based on how long the task would take given full focus on the task and no problems, and a “safe” duration given full focus and typical variation with each task. The differences between aggressive and “safe” durations for each critical task contribute to a pooled “project buffer” which is adjusted for the project as a whole. The end of the project buffer is the team’s “commit date” and the buffer protects the project from uncertainty.
Managers and leadership need to provide clear project and task priorities and a work environment that enables single-task focus, so that each task can be completed quickly and with high quality.
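The buffer arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the text does not say how the pooled buffer is "adjusted for the project as a whole", so the adjustment factor here is a hypothetical parameter, and the task durations are invented.

```python
def project_buffer(tasks, adjustment=1.0):
    """Pooled project buffer from (aggressive, safe) duration pairs.

    tasks: list of (aggressive, safe) durations for critical tasks.
    adjustment: hypothetical scaling applied to the pooled differences.
    """
    for aggressive, safe in tasks:
        if safe < aggressive:
            raise ValueError("safe duration must not be shorter than aggressive")
    pooled = sum(safe - aggressive for aggressive, safe in tasks)
    return pooled * adjustment


def commit_day(tasks, adjustment=1.0):
    """Aggressive durations laid end to end, followed by the project buffer."""
    return sum(aggressive for aggressive, _ in tasks) + project_buffer(tasks, adjustment)


# Hypothetical critical tasks, durations in days: (aggressive, safe).
tasks = [(3, 5), (4, 8), (2, 3)]
buffer_days = project_buffer(tasks)
commit = commit_day(tasks)
```

The point of pooling is that individual tasks are scheduled at their aggressive durations, while the safety margin lives in one shared buffer at the end of the project.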

Tuesday, September 21, 2010

meta analysis

Jadad scale to measure the methodological quality of a clinical trial
tool: CMA
publication standards: QUOROM (eg) and MOOSE (eg)

proc mixed can be used in meta analysis.

Wednesday, September 15, 2010

histogram alternatives

Beanplot

hist() + rug(): rug() adds a one-dimensional scatter plot below the histogram

For discrete data: barplot(table(a)), where a is a discrete vector, or barplot(tabulate(a)) when a contains only positive integers.

ecdf (Empirical CDF) summarizes the data into something like a smooth CDF line while graphing all the data points.
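To make the ECDF idea concrete, here is a small Python sketch (not the R ecdf() function itself) that computes the step heights from raw data. The function name ecdf and the sample vector are chosen for illustration only. The intermediate counts are the same tallies that table(a) would feed to barplot().

```python
from collections import Counter


def ecdf(values):
    """Return (xs, ys): sorted distinct values and the fraction of data <= each."""
    if not values:
        raise ValueError("need at least one observation")
    n = len(values)
    counts = Counter(values)  # per-value tallies, as in a frequency table
    xs = sorted(counts)
    ys = []
    running = 0
    for x in xs:
        running += counts[x]
        ys.append(running / n)
    return xs, ys


# Hypothetical discrete sample.
a = [3, 1, 2, 3, 3, 1]
xs, ys = ecdf(a)
```

Every observation contributes a jump of 1/n, so no binning choice is needed, which is the main appeal over a histogram.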

dhist in ggplot2

more discussion from Gelman here and also here

Wednesday, August 25, 2010

Pseudoautosomal regions gene nomenclature

http://www.genenames.org/genefamily/par.php

main effect of a continuous variable

In both proc mixed and proc glimmix (see the code example below), the "Solution for Fixed Effects" table produced by the /SOLUTION option for the continuous variable 'binary' does not estimate the main/marginal effect of binary changing from 0 to 1. This is because of the interaction term between binary and visit: the coefficient reported for binary alone is its effect only where the interaction terms are zero. To find the main/marginal effect, we can code the variable 'binary' as a class/categorical variable and request its LSMEANS.
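The arithmetic behind this can be illustrated without SAS. The Python sketch below is not the original proc mixed code. It assumes visit is a categorical effect with reference-level coding, that the averaged effect weights each visit equally (as LSMEANS does by default), and it uses invented coefficient values.

```python
def effect_by_visit(b_binary, b_interaction):
    """Effect of binary going 0 -> 1 at each visit: b_binary + interaction term."""
    return {visit: b_binary + b for visit, b in b_interaction.items()}


def averaged_effect(b_binary, b_interaction):
    """Effect of binary averaged over visits with equal weights."""
    effects = effect_by_visit(b_binary, b_interaction)
    return sum(effects.values()) / len(effects)


# Hypothetical fixed-effect solutions; visit3 is the reference level,
# so its interaction coefficient is 0.
b_binary = 1.0
b_interaction = {"visit1": 0.5, "visit2": 1.5, "visit3": 0.0}

effects = effect_by_visit(b_binary, b_interaction)
marginal = averaged_effect(b_binary, b_interaction)
```

With these numbers the coefficient for binary (1.0) equals the effect at the reference visit only, while the visit-averaged effect is larger because the other visits carry positive interaction terms.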

Wednesday, June 23, 2010

distances between vector elements

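# squared difference between two elements (vectorized, as outer() requires)
distance <- function(x, y) (x - y)^2
# matrix of squared differences between every pair of elements of vector A
outer(A, A, distance)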
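For comparison, the same pairwise table can be sketched in Python with a small helper that mimics what outer() does here. The helper names and the sample vector are made up for illustration. Note that the function returns squared differences, so take a square root if actual distances are wanted.

```python
def outer(xs, ys, fn):
    """Apply fn to every (x, y) pair; row i corresponds to xs[i], column j to ys[j]."""
    return [[fn(x, y) for y in ys] for x in xs]


def squared_difference(x, y):
    """Squared difference between two numbers."""
    return (x - y) ** 2


# Hypothetical vector.
A = [1, 4, 6]
d = outer(A, A, squared_difference)
```

The result is symmetric with zeros on the diagonal, as expected for a matrix of squared distances between the elements of one vector.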

Friday, June 18, 2010

sas missing data categories

http://studysas.blogspot.com/2010/04/special-missing-values.html

Thursday, May 20, 2010

SAS command line

"C:\Program Files\SAS\SAS 9.1\sas" 1.sas -log "1.log.txt" -print "1.result.txt"

SAS IO

/*===========================
export;
===========================*/

Thursday, May 13, 2010

matching program

Case control matching, probably implementing the idea of propensity scores
R
MatchIt
Matching
optmatch optmatch presentation1; optmatch presentation2;

SAS
gmatch, vmatch, dist
A SUGI paper, 165-29: Performing a 1:N Case-Control Match on Propensity Score

Paper: On the Estimation and Use of Propensity Scores in Case-Control and Case-Cohort Studies
[using] cases plus controls in a case-control study... should give consistent estimates of the true propensity score under the null hypothesis, but not otherwise.
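As a rough idea of what a 1:N match on propensity score involves, here is a minimal greedy nearest-neighbor sketch in Python. It is not the algorithm from any of the packages, macros, or papers listed above. The function name, the caliper value, the round-robin order, and the scores are all assumptions made for illustration, and the propensity scores are taken as already estimated.

```python
def greedy_match(cases, controls, n_per_case=1, caliper=0.1):
    """Greedy 1:N matching on propensity score, without replacement.

    cases, controls: dicts mapping subject id -> propensity score.
    Each round gives every case at most one more control, so all cases
    get a first match before any case gets a second.
    """
    available = dict(controls)
    matches = {case_id: [] for case_id in cases}
    for _ in range(n_per_case):
        for case_id, score in cases.items():
            if not available:
                break
            # Closest remaining control on the propensity score.
            best = min(available, key=lambda c: abs(available[c] - score))
            if abs(available[best] - score) <= caliper:
                matches[case_id].append(best)
                del available[best]
    return matches


# Hypothetical propensity scores.
cases = {"c1": 0.30, "c2": 0.70}
controls = {"k1": 0.28, "k2": 0.35, "k3": 0.72, "k4": 0.95}
matches = greedy_match(cases, controls, n_per_case=2, caliper=0.1)
```

Greedy matching depends on the order in which cases are processed. Optimal matching, as in optmatch, instead minimizes the total distance over all pairs.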

Tuesday, May 11, 2010

good clinical trial simulation practices

http://cdds.ucsf.edu/research/sddgpreport.php#_Toc457223476

link function for proc logistic

SAS uses the following options to explicitly specify whether the endpoint is ordinal or nominal

Monday, May 10, 2010

A Draft Sequence of the Neandertal Genome.

http://www.researchblogging.org/post/gotourl/id/213933
http://www.researchblogging.org/post/gotourl/id/213689
http://www.researchblogging.org/post/gotourl/id/213509