
Monday, July 4, 2011

R: foreach parallelization

I was experimenting with the foreach and doMC packages in R, which make your code multi-threaded. It actually took a bit of refactoring to use foreach(), as there seems to be no shared memory between the threads - they each took their own copy of the in-scope variables. So I had to return my results as a one-row data frame from each iteration, and combine them when the foreach loop had finished.
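The pattern described above can be sketched roughly as follows. This is a minimal illustration rather than my actual script: `simulate_one` and its columns are hypothetical stand-ins for the real per-iteration work, and the core count is arbitrary. It assumes the foreach and doMC packages are installed (doMC only works on Unix-like systems).

```r
library(foreach)
library(doMC)

registerDoMC(cores = 2)  # arbitrary core count, for illustration only

# Hypothetical stand-in for the real work done in one iteration.
# It returns its result as a one-row data frame.
simulate_one <- function(i) {
  data.frame(iteration = i, value = sqrt(i))
}

# Each worker has its own copy of the in-scope variables, so assigning to an
# outer variable inside the loop body would not be visible afterwards.
# Instead, every iteration returns a row and .combine = rbind stacks them.
results <- foreach(i = 1:10, .combine = rbind) %dopar% {
  simulate_one(i)
}

print(results)
```

Swapping `%dopar%` for `%do%` runs the same loop sequentially, which is a handy way to get a baseline.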

Here are my results; the user CPU column was giving me strange numbers, so I'm just going to list total time spent (wall clock time). I have three different loops, so here are the base timings (running the foreach loop in sequential mode, without doMC loaded) for each:
Loop 1: 4.33    Loop 2: 7.72    Loop 3: 16.14
My CPU has 4 cores, but the OS sees it as 8 virtual cores. Here are my results for 1, 2, 4 and 8 threads:
Threads   Loop 1        Loop 2        Loop 3         vs. linear
  1       4.36          7.90          16.20
  2       2.50 [2.18]   4.30 [3.95]    8.80 [8.10]   [8-15% slower]
  4       1.46 [1.09]   2.50 [1.98]    5.06 [4.05]   [25-35% slower]
  8       1.32 [0.55]   2.30 [0.99]    4.30 [2.02]   [110-140% slower]

All times are in seconds, and this loop represents most of the time spent in my script, so while the results are a long way from linear, they represent a useful speed-up. The numbers in square brackets show the times I would have got with linear improvement, i.e. the 1-thread time divided by the number of threads.
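For anyone who wants to check the bracketed numbers, here is a small sketch of the arithmetic in base R. The timings are copied from the table above; the variable names are just for illustration. Because it works from unrounded ideal times, the percentages may differ by a point or two from the ranges I quoted.

```r
# Wall clock times in seconds, copied from the table above.
# Rows are thread counts (1, 2, 4, 8); columns are the three loops.
threads  <- c(1, 2, 4, 8)
measured <- rbind(
  c(4.36, 7.90, 16.20),
  c(2.50, 4.30,  8.80),
  c(1.46, 2.50,  5.06),
  c(1.32, 2.30,  4.30)
)

# Ideal (linear) time: the 1-thread time divided by the thread count.
ideal <- outer(1 / threads, measured[1, ])

# How much slower the measured time is than the ideal time, in percent.
overhead_pct <- 100 * (measured / ideal - 1)

# Parallel efficiency: 1 means perfectly linear speed-up.
efficiency <- ideal / measured

print(round(ideal, 2))
print(round(overhead_pct))
print(round(efficiency, 2))
```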

By the way, my foreach loop had 200 to 250 iterations. The above results tell me that when each foreach loop iteration does more work we get better efficiency: the third loop, which takes longest per iteration, comes closest to linear. This is fairly coarse parallelization, which suggests to me that there is lots of room for improvement in the doMC code.
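One way to test that idea is to time the same loop with different amounts of work per iteration. The sketch below is not my real workload: `busy_work` and `time_loop` are hypothetical helpers, and the work sizes and core counts are arbitrary choices. It assumes foreach and doMC are installed.

```r
library(foreach)
library(doMC)

# Hypothetical stand-in for one iteration; n controls how much work it does.
busy_work <- function(n) sum(sqrt(seq_len(n)))

# Run a 200-iteration foreach loop on the given number of cores and
# return the elapsed (wall clock) time in seconds.
time_loop <- function(cores, n) {
  registerDoMC(cores = cores)
  timing <- system.time(
    foreach(i = 1:200, .combine = c) %dopar% busy_work(n)
  )
  timing[["elapsed"]]
}

# Compare light and heavy iterations across a few core counts.
for (n in c(1e4, 1e6)) {
  for (cores in c(1, 2, 4)) {
    cat(sprintf("work=%g cores=%d elapsed=%.2fs\n",
                n, cores, time_loop(cores, n)))
  }
}
```

If the fixed cost of starting workers dominates, the light iterations should show much poorer scaling than the heavy ones.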
UPDATE: When running with top, and cores set to 4, I notice 4 new instances of R appear each time it hits the foreach loop, and then they disappear after. I.e. it appears to be using processes, not threads, and creating them on-demand! No wonder my results indicate so much overhead!
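A quick way to confirm this without watching top is to have each iteration report the ID of the process it ran in. This is a minimal sketch assuming foreach and doMC are installed on a Unix-like system; the iteration count is arbitrary.

```r
library(foreach)
library(doMC)

registerDoMC(cores = 4)

# PID of the R session that launches the loop.
parent_pid <- Sys.getpid()

# Each iteration returns the PID of the R process that executed it.
worker_pids <- foreach(i = 1:8, .combine = c) %dopar% Sys.getpid()

print(parent_pid)
print(unique(worker_pids))
```

If the work were done by threads inside one process, every iteration would report the parent's PID. Seeing several different PIDs, none of them the parent's, matches what top showed.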