Monday, October 15, 2012

Various R Tips

#1   How do you validate that a commandline argument only uses allowed values?


Say we have a commandline R script where the first argument is allowed to be a csv list, but that csv list can only contain 'A', 'B' or 'C'. E.g. "myscript.R A" is valid, as is "myscript.R C,B", as is "myscript.R A,C,A,B,C", but not "myscript.R X" or "myscript.R A,B,X,C,B,A".

    my_list=strsplit(argv[1],',')[[1]]
    if(! all (my_list %in% c('A','B','C') ) ) quit()

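The first line is the idiom for cracking open a csv list and getting a vector of character strings. (It assumes argv already holds the script's arguments, e.g. from commandArgs(trailingOnly=TRUE).) The [[1]] at the end looks like noise, but it is needed: strsplit() returns a list with one element per input string, and [[1]] pulls out the character vector for our single argument.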
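To see what the [[1]] is doing, here is a small sketch you can paste into an R session. The input string is made up for illustration and is not part of the script above.

```r
# strsplit() returns a list: one character vector per input string.
parts <- strsplit("A,C,B", ",")
is.list(parts)        # TRUE
length(parts)         # 1, because we passed in a single string

# [[1]] pulls out the character vector for the first (and only) input.
my_list <- parts[[1]]
my_list               # "A" "C" "B"
```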

The middle part of the second line is the following, with allowed_values standing in for c('A','B','C'):
    my_list %in% allowed_values

It returns a vector of logical values the same length as my_list. What we want is for all the elements to be TRUE, which means every item in our csv list is in the allowed_values vector; that is what all() checks. If there is even a single FALSE (meaning the user gave at least one bad value in the csv list) then we fail, and in this example we call quit(). (See the next tip for something less crude than calling quit() with no explanation!)
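The check can be tried out interactively without a commandline at all. The sketch below wraps the two lines in a helper; the name is_valid_csv is hypothetical (it is not in the script above), and the allowed values are passed as a default argument purely for illustration.

```r
# Hypothetical helper (not from the original script): returns TRUE only when
# every item in the csv string is one of the allowed values.
is_valid_csv <- function(csv, allowed_values = c('A', 'B', 'C')) {
  my_list <- strsplit(csv, ',')[[1]]
  all(my_list %in% allowed_values)
}

# %in% gives one logical per item on its left-hand side:
c('A', 'X', 'B') %in% c('A', 'B', 'C')   # TRUE FALSE TRUE

is_valid_csv('A,C,A,B,C')                # TRUE
is_valid_csv('A,B,X,C,B,A')              # FALSE
```

In the script itself you would call something like this on argv[1] and quit when it returns FALSE.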