Tuesday, June 21, 2011

R: removing columns in bulk (xts, matrix, data frame, etc.)

I've an R object, z, with a number of intermediate-working columns I want to delete. In this case it is a quantmod object, but the solution I will show applies for xts and zoo objects, and also for matrices (which is what xts, etc. are under the surface). It also applies to data frames.

(By the way, I use = for assignment, not <-, in the below.) Here is the long-winded approach; setting a column to NULL removes it:
  z$close1=NULL
  z$close2=NULL
  z$close3=NULL
  z$close4=NULL
  z$close5=NULL
  z$open1=NULL
  z$open2=NULL
  z$open3=NULL
  z$open4=NULL
  z$open5=NULL
However using a string for the column name won't work. In other words, these don't work:
  z['open1']=NULL   #This one does work for data frames though.
  z[['open1']]=NULL
The solution I eventually worked out is:
  z=z[, setdiff(colnames(z),c('close1','close2','close3','close4','close5','open1','open2','open3','open4','open5')) ]
colnames(z) gets the current list of columns.
I then use setdiff() to subtract from that the list of columns I want to remove.
The z[,...] syntax means take a copy with just these columns (keeping all rows).

If you don't understand my motivation, I should first say I had more like 30 fields, not just the 10 shown in the example above. But also I like doing things programmatically, as it can avoid introducing typos. For instance, the above solution could be rewritten as:
  #Remove the openN and closeN columns
  z=z[, setdiff(colnames(z),c(paste('close',1:5,sep=''),paste('open',1:5,sep='') )) ]

2 comments:

Anonymous said...

Awesome!

Anonymous said...

Awesome! Really helps clean up code, and I was thinking this wouldn't be possible without converting to and from a dataframe.