Saturday, March 19, 2011

R - Good programming practice

  1. Instead of x[ind, ], use x[ind, , drop = FALSE]
    from here: R has a flaw in how it behaves when you subscript a matrix and the result has length 1 in one (or more) dimensions. For example, if a = array(0, dim = c(1, 2, 4)), then a[, , 1] is no longer a matrix but a vector, and dim(a[, , 1]) is NULL. This can cause all sorts of mysterious bugs. Adding drop = FALSE often prevents this unpleasant behavior: if b = matrix(0, 2, 2), dim(b[, 1, drop = FALSE]) is c(2, 1), while dim(b[, 1]) is NULL. drop = FALSE works well for 2-dimensional matrices, but for 3-dimensional arrays it does not give a matrix back, because it keeps every dimension: if a = array(0, dim = c(1, 2, 4)), dim(a[, , 1, drop = FALSE]) is c(1, 2, 1) instead of c(1, 2).
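  2. Subsetting is tricky because of NAs: for data frames (and vectors), use subset(x, ...) instead of x[...]. An NA in a logical index produces an NA element (or an all-NA row), whereas subset() treats NA as FALSE.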
  3. Use '1:n' only when you know that n is positive, since 1:0 is c(1, 0) and not an empty sequence. Instead of 1:length(obj), use seq(along.with = obj), or the shortcuts seq_along(obj) and seq_len(n).
  4. Do not grow objects.
    Replace
    rmat <- NULL
    for (i in seq_len(n)) {
      rmat <- rbind(rmat, long.computation(i, ...))
    }
    with
    rmat <- matrix(0, n, k)   # preallocate; k is the length of each result
    for (i in seq_len(n)) {
      rmat[i, ] <- long.computation(i, ...)
    }
  5. Use with(data, ...) and do not attach() data frames.
  6. Use TRUE and FALSE, not 'T' and 'F'! T and F are ordinary variables that can be reassigned; TRUE and FALSE are reserved words.
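  7. Know the difference between '|' vs '||' and '&' vs '&&': the single forms work element-wise on vectors, while the double forms evaluate one condition and short-circuit. Inside if ( .... ), almost always use '||' and '&&'!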
  8. Use which.max(), ..., findInterval()
  9. Use mat[i, j] <- if (A) A.expression else B.expression
    instead of
    if (A) mat[i, j] <- A.expression
    else   mat[i, j] <- B.expression
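
Several of the tips above are easiest to see by running them. The following is only a rough sketch in base R; the object names (b, x, obj, d, v and the *_result variables) are made up for illustration and do not come from the original list.

```r
# Tip 1: drop = FALSE keeps the matrix structure when a single row is selected.
b <- matrix(1:4, nrow = 2, ncol = 2)
dropped_dim <- dim(b[1, ])                 # NULL: the result collapsed to a vector
kept_dim    <- dim(b[1, , drop = FALSE])   # c(1, 2): still a matrix

# Tip 2: an NA in a logical index yields an NA element; subset() skips it.
x <- c(5, NA, 12)
bracket_result <- x[x > 6]          # c(NA, 12)
subset_result  <- subset(x, x > 6)  # 12

# Tip 3: 1:n counts downwards when n is 0; seq_along() gives an empty sequence.
obj <- numeric(0)
colon_result <- 1:length(obj)   # c(1, 0), so a loop body would run twice
along_result <- seq_along(obj)  # integer(0), so a loop body would not run

# Tip 5: with() evaluates an expression inside a data frame without attach().
d <- data.frame(a = 1:3, b = 4:6)
with_result <- with(d, sum(a * b))  # 1*4 + 2*5 + 3*6 = 32

# Tip 6: T is an ordinary variable and can be masked; TRUE cannot.
masked_T <- local({
  T <- 0      # legal, and silently changes what T means in this scope
  isTRUE(T)   # FALSE
})

# Tip 7: & is element-wise; && tests one condition and short-circuits.
elementwise <- c(TRUE, FALSE) & c(TRUE, TRUE)  # c(TRUE, FALSE)
v <- NULL
guarded <- !is.null(v) && v[1] > 0  # FALSE; the right-hand side is never evaluated

# Tip 8: which.max() and findInterval() replace hand-written search loops.
max_pos   <- which.max(c(3, 9, 2))                     # 2
intervals <- findInterval(c(1.5, 3.2), c(0, 1, 2, 3))  # c(2, 4)
```

Tip 3 is the one that bites most often in practice: a loop written as for (i in 1:length(obj)) runs twice on an empty object, with i = 1 and then i = 0.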
