Playing around with PLINQ and IO-bound tasks
I recently downloaded Visual Studio 2010 beta, and took the chance to play with PLINQ. PLINQ, for those of you in the dark ages of .Net Framework 2.0, is parallel LINQ – an extension to the famous query language that makes it easy to write parallel code (essential to programming in the 21th century, in the age of the many-core).
A code sample, as usual, is the best demonstration:
public static int CountPrimes(IEnumerable<int> input) { return input.AsParallel().Where(IsPrime).Count(); } private static bool IsPrime(int n) { for (int i = 2; i*i < n; ++i) if (n % i == 0) return false; return true; } |
This code sample, regardless of using an inefficient primality test, is fully parallel. PLINQ will utilize all your cores when running the above code, and I didn’t have to use any locks, queues, threadpools or any of the more complex tools of the trade. Just tell PLINQ “AsParallel()”, and it works.
I hit some gotcha when I tried to compare the parallel performance with the sequential one. Do you spot the problem in the following code?
public static void CountPrimesTest(IEnumerable<int> input) { // parallel benchmark var timer = new Stopwatch(); timer.Start(); CountPrimes(input.AsParallel()); timer.Stop(); Console.WriteLine("Counted primes in parallel took " + timer.Elapsed); // sequential benchmark timer = new Stopwatch(); timer.Start(); CountPrimes(input); timer.Stop(); Console.WriteLine("Counted primes sequentially took " + timer.Elapsed); } |
This is all fine and dandy when the task at hand is CPU bound, but works pretty miserabbly when your task is IO bound, like downloading a bunch of web pages. Next, I simulated some IO-bound tasks (I used Sleep() to emulate IO – basically not using a lot of CPU for every task):
[ThreadStatic] private static Random _random; public static List<string> FindInterestingDomains(IEnumerable<string> urls) { // select all the domains of the interesting URLs return urls.AsParallel().Where(SexFilter). Select(url => new Uri(url).Host).ToList(); } public static bool SexFilter(string url) { if (_random == null) _random = new Random(); // simulate a download Thread.Sleep(1000); var html = "<html>" + _random.Next() + "</html>"; return html.Contains("69"); } |
Testing this with a list of 10 URLs took 5 seconds, meaning LINQ again spun only two cores, which is the number of cores on my machine. This really sucks for IO bound tasks, because most of the time the threads are idle, waiting on IO. Let’s see if we can speed this up:
// Use WithDegreeOfParallelism to specify the number of threads to run return urls.AsParallel().WithDegreeOfParallelism(10).Where(SexFilter). Select(url => new Uri(url).Host).ToList(); |
This appeared not to work at first, because WithDegreeOfParallelism is just a recommendation or upper bound. You can ask PLINQ nicely to run with ten threads, but it will only allocate two if it so chooses. This is yet another example of C# being more magical than Java – compared to Java’s rich ExecutorService, PLINQ offers less fine grained control.
However, further testing revealed the damage is not so horrible. This is what happened when I put the above code in a while(true):
Tested 10 URLs in 00:00:05.0576333 Tested 10 URLs in 00:00:03.0018617 Tested 10 URLs in 00:00:03.0013939 Tested 10 URLs in 00:00:03.0013175 Tested 10 URLs in 00:00:04.0018983 Tested 10 URLs in 00:00:03.0024044 Tested 10 URLs in 00:00:01.0004407 Tested 10 URLs in 00:00:01.0007645 Tested 10 URLs in 00:00:01.0007280 Tested 10 URLs in 00:00:01.0003358 Tested 10 URLs in 00:00:01.0003347 Tested 10 URLs in 00:00:01.0002470
After some trial and error, PLINQ found that the optimal number of threads needed to run this task under its concurrency guidelines is ten. I imagine that if at some point in the future the optimal number of threads change, it will adapt.
P.S.
If you found this interesting, wait till you read about DryadLINQ – it’s LINQ taken to the extreme, run over a cluster of computers.