Dots: R tricks

Tricks for graphics, the apply family, etc.

Logit transformation: qlogis(x)

Inverse logit transformation: plogis(x)
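
A quick way to convince yourself that the two are inverses of each other. This is only a sketch; the probabilities are made-up values for illustration.

```r
# Probabilities to transform (illustrative values)
p <- c(0.1, 0.5, 0.9)

# Logit: log(p / (1 - p))
z <- qlogis(p)

# Inverse logit: 1 / (1 + exp(-z)) recovers the probabilities
p_back <- plogis(z)

all.equal(z, log(p / (1 - p)))   # TRUE
all.equal(p_back, p)             # TRUE
```
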
R color tricks

Transparent colors in R

It can be as simple as:
> plot( rnorm(1000), rnorm(1000), col="#0000ff22", pch=16,cex=3)

Where the color is "#RRGGBBAA" and the AA portion is the opacity/transparency (00 is fully transparent, FF is fully opaque). Of course, this only works if you are using a graphics device that supports transparency. The windows device under R 2.6.0 does (but not any previous versions), the Cairo device does, the PDF device does (as long as you set the correct options), and I think the default device on Macs does. I don't know which others do.
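
If you would rather not write the hex string by hand, the same kind of color can be built from its components. A small sketch using rgb(), col2rgb() and adjustcolor() from grDevices; the 0.5 alpha in the last line is an arbitrary value picked for illustration.

```r
# Build the same semi-transparent blue as "#0000ff22" from its components.
# 0x22 is 34 in decimal, i.e. an alpha of 34/255 (about 13% opaque).
col_rgb <- rgb(0, 0, 255, alpha = 34, maxColorValue = 255)
col_rgb                                   # "#0000FF22"

# Decompose a hex color back into red, green, blue and alpha (0-255)
parts <- col2rgb("#0000ff22", alpha = TRUE)
parts["alpha", 1]                         # 34

# adjustcolor() adds transparency to a named color
col_adj <- adjustcolor("blue", alpha.f = 0.5)
```

Any of these strings can be passed as col= to plot() in the same way as the literal above.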

Suppose I want to check the version of 'xcms' I am using:

installed.packages()['xcms',]
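
A shorter route is packageVersion(), which returns just the version. In this sketch "stats" stands in for 'xcms' (an assumption, so that it runs on any R installation); swap in the package you care about.

```r
# "stats" stands in for 'xcms' so the sketch runs anywhere
pkg <- "stats"

# packageVersion() returns a comparable version object
v <- packageVersion(pkg)
v >= "2.0.0"                      # version objects compare against strings

# The same information from the installed.packages() matrix
installed.packages()[pkg, "Version"]
```
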

ANOVA using R

http://personality-project.org/r/r.anova.html

multiple graphic windows

1) Click history-->record, then to play back, click previous or next under history
2) windows() (Windows only; dev.new() opens a new device on any platform)

add Greek symbol to a graph

xlab=expression(pi)
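
The same idea extends to subscripts and to labels that contain a computed value. A sketch only: the data, the variable mu and the label text are made up, and the pdf(NULL) line is there just so the code runs without opening a window.

```r
# Null PDF device so the sketch runs without opening a window;
# drop this line (and dev.off()) in an interactive session
pdf(NULL)

x <- seq(0, 2 * pi, length.out = 100)
plot(x, sin(x), type = "l",
     xlab = expression(theta),                 # a single Greek letter
     ylab = expression(sin(theta)),
     main = expression(alpha + beta[1] * x))   # Greek letters with a subscript

# bquote() substitutes a computed value into the label
mu <- 2.5
lab <- bquote(mu == .(mu))
mtext(lab)

dev.off()
```
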

match() and merge() give similar results when the key has no duplicated observations
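
A small sketch of what "similar" means here, using two made-up data frames that share a key called id (all names and values are invented for illustration):

```r
# Two small tables sharing a key with no duplicates (made-up data)
a <- data.frame(id = c(3, 1, 2), x = c("c", "a", "b"))
b <- data.frame(id = c(1, 2, 3), y = c(10, 20, 30))

# match(): find each a$id in b$id and pull the matching y, keeping a's row order
a_matched <- a
a_matched$y <- b$y[match(a$id, b$id)]

# merge(): join on id; the result is sorted by the key
a_merged <- merge(a, b, by = "id")

# Same rows once both are ordered by id
a_sorted <- a_matched[order(a_matched$id), ]
all.equal(a_sorted$y, a_merged$y)   # TRUE
```

With duplicated keys the two diverge: match() returns only the first matching position, while merge() returns every matching combination of rows.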

apply functions from stackoverflow

apply - When you want to apply a function to the rows or columns of a matrix (and higher-dimensional analogues).

# Two dimensional matrix
M <- matrix(seq(1,16), 4, 4)

# apply min to rows
apply(M, 1, min)
[1] 1 2 3 4

# apply max to columns
apply(M, 2, max)
[1]  4  8 12 16

# 3 dimensional array
M <- array( seq(32), dim = c(4,4,2))

# Apply sum across each M[*, , ] - i.e. sum over the 2nd and 3rd dimensions
apply(M, 1, sum)
# Result is one-dimensional
[1] 120 128 136 144

# Apply sum across each M[*, *, ] - i.e. sum over the 3rd dimension
apply(M, c(1,2), sum)
# Result is two-dimensional
     [,1] [,2] [,3] [,4]
[1,]   18   26   34   42
[2,]   20   28   36   44
[3,]   22   30   38   46
[4,]   24   32   40   48

If you want row/column means or sums for a 2D matrix, be sure to investigate the highly optimized, lightning-quick colMeans, rowMeans, colSums, rowSums.
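
As a sanity check that the dedicated functions agree with apply(), here is a sketch on the same 4 x 4 matrix used above:

```r
M <- matrix(1:16, 4, 4)

# Dedicated functions give the same numbers as the apply() versions
all.equal(rowSums(M),  apply(M, 1, sum))    # TRUE
all.equal(colMeans(M), apply(M, 2, mean))   # TRUE

rowSums(M)    # 28 32 36 40
colMeans(M)   # 2.5 6.5 10.5 14.5
```
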

lapply - When you want to apply a function to each element of a list in turn and get a list back.
This is the workhorse of many of the other *apply functions. Peel back their code and you will often find lapply underneath.

   x <- list(a = 1, b = 1:3, c = 10:100) 
   lapply(x, FUN = length) 
   $a 
   [1] 1
   $b 
   [1] 3
   $c 
   [1] 91

   lapply(x, FUN = sum) 
   $a 
   [1] 1
   $b 
   [1] 6
   $c 
   [1] 5005

sapply - When you want to apply a function to each element of a list in turn, but you want a vector back, rather than a list.
If you find yourself typing unlist(lapply(...)), stop and consider sapply.
```
   x <- list(a = 1, b = 1:3, c = 10:100)
   #Compare with above; a named vector, not a list 
   sapply(x, FUN = length)  
   a  b  c   
   1  3 91

   sapply(x, FUN = sum)   
   a    b    c    
   1    6 5005 
```
In more advanced uses of sapply it will attempt to coerce the result to a multi-dimensional array, if appropriate. For example, if our function returns vectors of the same length, sapply will use them as columns of a matrix:
```
   sapply(1:5,function(x) rnorm(3,x))
```
If our function returns a 2 dimensional matrix, sapply will do essentially the same thing, treating each returned matrix as a single long vector:
```
   sapply(1:5,function(x) matrix(x,2,2))
```
Unless we specify simplify = "array", in which case it will use the individual matrices to build a multi-dimensional array:
```
   sapply(1:5,function(x) matrix(x,2,2), simplify = "array")
```
Each of these behaviors is of course contingent on our function returning vectors or matrices of the same length or dimension.
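
To see the shapes involved, here is a sketch that checks dim() for each of the cases above. It uses rep() instead of rnorm() so that the results are deterministic; the last case is an extra illustration of what happens when the lengths differ.

```r
# Each call returns a length-3 vector: result is a 3 x 5 matrix
m1 <- sapply(1:5, function(x) rep(x, 3))
dim(m1)   # 3 5

# Each call returns a 2 x 2 matrix, flattened into a column of length 4
m2 <- sapply(1:5, function(x) matrix(x, 2, 2))
dim(m2)   # 4 5

# simplify = "array" keeps the 2 x 2 shape and stacks along a third dimension
m3 <- sapply(1:5, function(x) matrix(x, 2, 2), simplify = "array")
dim(m3)   # 2 2 5

# Results of differing lengths cannot be simplified, so a list comes back
r <- sapply(1:3, function(x) seq_len(x))
is.list(r)   # TRUE
```
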

vapply - When you want to use sapply but perhaps need to squeeze some more speed out of your code.
For vapply, you basically give R an example of what sort of thing your function will return, which can save some time coercing returned values to fit in a single atomic vector.

x <- list(a = 1, b = 1:3, c = 10:100)
#Note that since the advantage here is mainly speed, this
# example is only for illustration. We're telling R that
# everything returned by length() should be an integer of 
# length 1. 
vapply(x, FUN = length, FUN.VALUE = 0L) 
a  b  c  
1  3 91
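
Speed aside, declaring the return type also makes the result predictable. A sketch of two cases where sapply and vapply behave differently (the inputs are made up for illustration):

```r
# With empty input, sapply() returns an empty list, not a vector
sapply(list(), length)                 # list()

# vapply() always returns the declared type
vapply(list(), length, integer(1))     # integer(0)

# A result that does not match FUN.VALUE is an error, not a silent surprise
x <- list(a = 1, b = 1:3)
res <- tryCatch(
  vapply(x, function(v) v, numeric(1)),   # b has length 3, not 1
  error = function(e) "error"
)
res   # "error"
```
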

mapply - For when you have several data structures (e.g. vectors, lists) and you want to apply a function to the 1st elements of each, and then the 2nd elements of each, etc., coercing the result to a vector/array as in sapply.
This is multivariate in the sense that your function must accept multiple arguments.
```
#Sums the 1st elements, the 2nd elements, etc. 
mapply(sum, 1:5, 1:5, 1:5) 
[1]  3  6  9 12 15
#To do rep(1,4), rep(2,3), etc.
mapply(rep, 1:4, 4:1)   
[[1]]
[1] 1 1 1 1

[[2]]
[1] 2 2 2

[[3]]
[1] 3 3

[[4]]
[1] 4
```
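
A few related options that tend to come up with mapply, shown as a sketch (the inputs are arbitrary): MoreArgs for arguments that stay constant, SIMPLIFY = FALSE to always get a list, and Map() as a shorthand for the latter.

```r
# Constant extra arguments go in MoreArgs rather than being recycled
labels <- mapply(paste, c("a", "b"), c("x", "y"), MoreArgs = list(sep = "-"))
unname(labels)   # "a-x" "b-y"

# SIMPLIFY = FALSE always returns a list, even when simplification is possible
mapply(function(x, y) x * y, 1:3, 4:6, SIMPLIFY = FALSE)

# Map() is a thin wrapper with the function first and no simplification
prods <- Map(function(x, y) x * y, 1:3, 4:6)
unlist(prods)    # 4 10 18
```
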

rapply - For when you want to apply a function to each element of a nested list structure, recursively.
To give you some idea of how uncommon rapply is, I forgot about it when first posting this answer! Obviously, I'm sure many people use it, but YMMV. rapply is best illustrated with a user-defined function to apply:

#Append ! to string, otherwise increment
myFun <- function(x){
    if (is.character(x)){
        return(paste(x,"!",sep=""))
    }
    else{
        return(x + 1)
    }
}

#A nested list structure
l <- list(a = list(a1 = "Boo", b1 = 2, c1 = "Eeek"), 
          b = 3, c = "Yikes", 
          d = list(a2 = 1, b2 = list(a3 = "Hey", b3 = 5)))


#Result is named vector, coerced to character           
rapply(l,myFun)

#Result is a nested list like l, with values altered
rapply(l, myFun, how = "replace")
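
rapply can also restrict itself to leaves of a given class through its classes argument, which avoids the if/else inside the function. A sketch on the same nested list (repeated here so the block runs on its own):

```r
# Same nested list as above, repeated so the sketch is self-contained
l <- list(a = list(a1 = "Boo", b1 = 2, c1 = "Eeek"),
          b = 3, c = "Yikes",
          d = list(a2 = 1, b2 = list(a3 = "Hey", b3 = 5)))

# Double only the numeric leaves; with how = "unlist" the other leaves are dropped
nums <- rapply(l, function(x) x * 2, classes = "numeric", how = "unlist")
unname(nums)   # 4 6 2 10

# Upper-case only the character leaves; how = "replace" leaves the rest untouched
up <- rapply(l, toupper, classes = "character", how = "replace")
up$a$a1        # "BOO"
up$b           # 3
```
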

tapply - For when you want to apply a function to subsets of a vector and the subsets are defined by some other vector, usually a factor.
The black sheep of the *apply family, of sorts. The help file's use of the phrase "ragged array" can be a bit confusing, but it is actually quite simple.
A vector:
```
   x <- 1:20
```
A factor (of the same length!) defining groups:
```
   y <- factor(rep(letters[1:5], each = 4))
```
Add up the values in x within each subgroup defined by y:
```
   tapply(x, y, sum)  
    a  b  c  d  e  
   10 26 42 58 74 
```
More complex examples can be handled where the subgroups are defined by the unique combinations of a list of several factors. tapply is similar in spirit to the split-apply-combine functions that are common in R (aggregate, by, ave, ddply, etc.). Hence its black sheep status.
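
A sketch of the two-factor case, plus the same one-factor sums done by hand with split() and sapply(). The vectors x, g1 and g2 are made up for illustration.

```r
x  <- 1:8
g1 <- rep(c("a", "b"), each = 4)    # a a a a b b b b
g2 <- rep(c("u", "v"), times = 4)   # u v u v u v u v

# Two grouping vectors: the result is a table with one cell per combination
tab <- tapply(x, list(g1, g2), sum)
tab["a", "u"]   # 1 + 3 = 4
tab["b", "v"]   # 6 + 8 = 14

# The same one-factor sums via an explicit split-apply-combine
by_hand <- sapply(split(x, g1), sum)
all.equal(as.vector(tapply(x, g1, sum)), unname(by_hand))   # TRUE
```
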

Wednesday, September 27, 2006
