Coming from experience in C++, PHP, JavaScript, Python, Java (the list goes on), I use = in my R code. Except in one case, which I'll come to in a moment. I've used = in almost all my R-related blog posts and StackOverflow questions and answers, and no-one has ever taken offence (AFAIK), shunned me (AFAIK), or done an edit to change them all to <-. So it appears to be acceptable.
Downsides
But there are three downsides I can think of, one important, one social and one bogus.
The first downside is R also uses the single equals sign for a named parameter assignment in a function call. This generally doesn't matter because using assignment in parameters is bad form in C-like languages, so I don't get confused. Just about the only time it ever matters to me is timing code. I *have* to write:
timing = system.time( x <- do_calculation(a=1,b=2) )
If I write the following then x won't get assigned, because R treats x= as a named argument to system.time rather than as an assignment:
timing = system.time( x = do_calculation(a=1,b=2) )
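Here is a minimal sketch of the two forms side by side. It assumes a stand-in do_calculation that just adds its arguments (the real one is not shown in this post). In the R versions I'm aware of, the second form stops with an "unused argument" error instead of failing silently. Wrapping the expression in braces is one way to keep using = inside the call.

```r
# Hypothetical stand-in for the real calculation.
do_calculation = function(a, b) a + b

# Works: <- is always an assignment, even inside a function call.
timing = system.time( x <- do_calculation(a=1, b=2) )
print(x)

# Does not work: x= is read as a named argument to system.time.
# tryCatch captures the error message so the script keeps running.
result = tryCatch(
  system.time( y = do_calculation(a=1, b=2) ),
  error = function(e) conditionMessage(e)
)
print(result)

# Works: inside braces, = is an ordinary assignment again.
timing2 = system.time({ z = do_calculation(a=1, b=2) })
print(z)
```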
The second downside is the R community considers <- to be standard. So all packages use it, all books use it. If you want to be part of the in-crowd you have to use it too.
The third downside is that it is easier to use search-and-replace to convert all your "<-" to "=" than it is to go the other way. But this one is bogus: the system.time example above shows why you cannot convert "<-" to "=" completely automatically anyway.
Upsides
Are there any upsides to preferring = to compensate? Yes, though they'll sound petty to people who believe using <- is the Only Way. First it is one less keystroke. Second, in comparisons, this does not work:
if(x<-5)cat("x is less than minus five\n")
R reads x<-5 as "assign 5 to x", not as "x is less than -5". So you must put a space between < and a negative number in R; it is not just a style thing as it is in other languages.
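A small sketch of the difference, using only base R. The quote() calls at the end ask the parser which operator it saw in each spelling.

```r
# With a space this is a comparison: is x less than -5?
x = -10
if (x < -5) cat("x is less than minus five\n")

# Without the space R sees the assignment operator, i.e. x <- 5.
# The condition is then the assigned value 5, which counts as TRUE.
x = 0
if (x<-5) cat("this prints regardless, and x has been overwritten\n")
print(x)

# Ask the parser which operator each spelling uses.
op_nospace = as.character(quote(x<-5)[[1]])
op_space   = as.character(quote(x < -5)[[1]])
print(op_nospace)
print(op_space)
```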
The third reason is that when I do use <- it communicates intent: I'm deliberately doing an assignment to a variable in the parameter list of a function call. As that is considered bad form in most languages, I like how it stands out.
I've saved the biggest upside for last: familiarity to programmers coming from any of the other C-like languages. (R is a C-like language too.)
Comparisons
You thought I'd mention ==, and how it can be confused with =? That potential confusion exists in all C-like languages, and we just know to look out for it. And, anyway, it still exists in R:
f(x=1) vs. f(x==1)
When I write one of those, did I mean the other?
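To make that concrete, here is a sketch with a made-up f (it just multiplies its argument by ten; it is purely for illustration). Both calls are valid R, and neither produces a warning.

```r
# Hypothetical function, for illustration only.
f = function(x) x * 10

x = 5

# Named argument: passes the value 1 as x.
r1 = f(x=1)

# Comparison: passes the result of x == 1 (FALSE here), which
# is treated as 0 in the multiplication.
r2 = f(x==1)

print(r1)
print(r2)
```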
Language Design
If I were designing R from scratch, how would I do it differently? I love how I can assign to named parameters in R - it is perhaps the most beautiful feature of the language. But it is an overload of the = sign, and one not found in other C-like languages, so I'd be tempted to change it to use ":" (it looks a bit like object notation in Javascript, but without the curly brackets). So the above example would become:
timing = system.time( x = do_calculation(a:1,b:2) )
Notice how I can use "x=" safely, because it now has no other meaning. Also notice how this helps with the f(x=1) vs. f(x==1) confusion too. But I'm contradicting my comment above, about liking the way <- stands out. So maybe I want it to look like this:
timing = system.time( x <- do_calculation(a:1,b:2) )
Now a lint tool can warn about use of a single equals sign in a function call, because there should never be one. Hhhmmm, I'm not convinced but it is something to chew on.
Do you have any thoughts or constructive criticism? Please add a comment.
2 comments:
Obviously the : character is already bound. Up until recently with S4 slots, you could have gotten away with @, which sorta makes sense:
xyplot(y~x, data@mydata).
I agree that overloading the = operator is really annoying, and that miswriting == as = is a surprisingly frequent cause of bugs, at least in my own experience. I guess I just don't really *get* why people make such a big deal of it, except for the fact that it is annoying, and so by religiously picking *one* way the cognitive dissonance is removed.
Do you know of any evidence of Chambers commenting on this particular quirk of the language?
Regarding my suggestion to use colon to specify a named parameter, I was just reminded that this is how it is done in C#4, see http://weblogs.asp.net/scottgu/archive/2010/04/02/optional-parameters-and-named-arguments-in-c-4-and-a-cool-scenario-w-asp-net-mvc-2.aspx
(I'd forgotten this Most Excellent feature: previous C# projects I've done had been limited to C#3.5.)
However I don't think even C#4 has that other cool feature of R, which is that "..." acts like a parameter. So you can pass "..." straight on to another function, and then you find yourself writing stuff like:
f=function(a,b,...,verbose=F){}
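A sketch of what that looks like with a body filled in. The inner function g and its sep parameter are invented for the example. Note that any parameter declared after "..." (verbose here) has to be given by its full name when calling f.

```r
# Hypothetical inner function that receives whatever f passes on.
g = function(x, sep=" ") paste("g got", x, sep=sep)

f = function(a, b, ..., verbose=F){
  if(verbose) cat("f: a =", a, "b =", b, "\n")
  # Everything f did not name itself is forwarded untouched.
  g(a + b, ...)
}

out1 = f(1, 2)             # g uses its default sep
out2 = f(1, 2, sep=":")    # sep travels through "..." to g
out3 = f(1, 2, verbose=T)  # verbose is consumed by f, not passed on
print(out1)
print(out2)
```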