Skip to main content

Parallel LINQ (PLINQ) - Intro

.Net 4.0 supports parallel LINQ or PLINQ, PLINQ is a parallel implementation of LINQ.
PLINQ has the same characteristics has LINQ, in that it executes queries in a differed manner.
However, the main difference is that with PLINQ, your data source gets partitioned and each chunk is processed by different worker threads (taking into account the number of processor cores that you have) , making your query execute much faster in certain occasions.

Running a query in parallel is just a matter of calling the AsParallel() method of the data source, this will return a ParallelQuery<T> and your query will execute parallel.

Let's look a code sample...

var query = from num in source.AsParallel()
where num % 3 == 0
select ProcessNumber(num);

Now, when this query is iterated our a foreach loop or when you call ToList() etc..the query will be run in different worker threads.

Although, you have parallelized your query execution, if you want to do something with that result within a loop, then that processing will happen serially although you query executed in a parallel way.

You can achieve this parallelism by running the loop using a Parallel.ForEach() or you can use the ForAll method like this....

var query = from num in source.AsParallel()
where num % 3 == 0
select ProcessNumber(num);


query.ForAll
( x => { /*Do Something*/ } );

In the code above the query will run in parallel as well as the result will be processed parallely.

Running LINQ queries in parallel is does not always gives you best performance, this is basically due to the fact that the initialization and partitioning outwits the cost of actually running the query in parallel.
Hence, it's necessary for you to compare which option is best LINQ or PLINQ.
MSDN documents that PLINQ will first see if the query can be run in parallel, then sees the cost of running this query in parallel vs sequentially, if the cost of running this in parallel is more then running it sequentially, then the runtime will run this query in a sequential manner.
I tried it out, but could not actually see the difference :)

Another good option that you might want to run your query if you are thinking of running it in a ForAll method is running the parallel query with a ParallelMergeOptions.
By default, although the query executes parallely, the runtime would have to merge the results from different worker threads into one single result if your running the query over a foreach loop or doing a ToList(), this sometimes causes a partial buffering.

However if you are iterating the query over a ForAll() you can take the benefit of not buffering the record and processing it once the result return from the thread without buffering...here is a code sample on how to do this...

var query = from num in source.AsParallel().WithMergeOptions(ParallelMergeOptions.NotBuffered)
where num % 3 == 0
select ProcessNumber(num);


query.ForAll
( x => { /*Do Something*/ } );

Although using a ForAll() consumes the items when it returns from the thread, I saw some noticeable difference when running the query with a ParallelMergeOption.

Comments

Popular posts from this blog

Hosting WCF services on IIS or Windows Services?

There came one of those questions from the client whether to use II7 hosting or windows service hosting for WCF services. I tried recollecting a few points and thought of writing it down. WCF applications can be hosted in 2 main ways - In a Windows service - On IIS 7 and above When WCF was first released, IIS 6 did not support hosting WCF applications that support Non-HTTP communication like Net.TCP or Net.MSMQ and developers had to rely on hosting these services on Windows Services. With the release of IIS 7, it was possible to deploy these Non-Http based applications also on IIS 7. Following are the benefits of using IIS 7 to host WCF applications Less development effort Hosting on Windows service, mandates the creating of a Windows service installer project on windows service and writing code to instantiate the service, whereas the service could just be hosted on IIS by creating an application on IIS, no further development is needed, just the service implementa

The maximum nametable character count quota (16384) has been exceeded

Some of our services were growing and the other day it hit the quote, I could not update the service references, nor was I able to run the WCFTest client. An error is diplayed saying " The maximum nametable character count quota (16384) has been exceeded " The problem was with the mex endpoint, where the XML that was sent was too much for the client to handle, this can be fixed by do the following. Just paste the lines below within the configuration section of the devenve.exe.config and the svcutil.exe.config files found at the locations C:\Program Files\Microsoft Visual Studio 9.0\Common7\IDE , C:\Program Files\Microsoft SDKs\Windows\v6.0A\bin Restart IIS and you are done. The detailed error that you get is the following : Error: Cannot obtain Metadata from net.tcp://localhost:8731/ Services/SecurityManager/mex If this is a Windows (R) Communication Foundation service to which you have access, please check that you have enabled metadata publishing at the specified address. F

Finalization, Know What it Costs

This is a post about object finalization in .NET. Finalization is not as inexpensive as we think, it increases the pressure put on GC.All objects that need finalization are moved into a finalizable queue and the actual finalization happens in a separate thread. Because the objects full state may be needed, the object itself and all the object it points to are promoted to the next generation (this is needed so that GC does not clean these objects off in the current round), and these objects are cleaned up only after the following GC. Due to this reason, resources that need to be released should be wrapped in as small a finalizable object as possible, for instance if your class needs a reference to an unmanaged resource, then you should wrap the unmanaged resource in a separate finalizable class and make that a member of your class and furthermore the parent should be a non-finalizable class. This approach will assure that only the wrapped class (class that contains the unmanaged resourc