Parallel Programming, PLINQ and Globalization

I’m going to start with a simple code snippet which sorts an array of strings using LINQ.

IEnumerable<string> line = new[] {"Z","A","Ä"};
var result = line.OrderBy(letter => letter);
Console.WriteLine("{0}", string.Join(" ", result));

The result might look like this:

A Ä Z

… or not. It depends on the culture of the thread the sort runs on. String ordering is culture-aware (unlike char ordering, which is culture-invariant), so if we switch to, for instance, one of the Norwegian cultures by adding the line Thread.CurrentThread.CurrentCulture = CultureInfo.GetCultureInfo("nn-NO"); before sorting, we get the following output instead:

A Z Ä
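Put together, the culture switch might look like the sketch below. It assumes a compiler that accepts top-level statements; the try/finally that restores the original culture is my own addition, not part of the snippet above. Note that OrderBy is deferred, so the culture has to be in place when the sequence is enumerated, not merely when the query is declared.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Threading;

string[] letters = { "Z", "A", "Ä" };
string[] sorted;

// Remember the current culture so it can be restored afterwards (illustrative addition).
CultureInfo original = Thread.CurrentThread.CurrentCulture;
try
{
    Thread.CurrentThread.CurrentCulture = CultureInfo.GetCultureInfo("nn-NO");

    // ToArray() forces the deferred sort to run now, while nn-NO is active.
    sorted = letters.OrderBy(letter => letter).ToArray();
}
finally
{
    Thread.CurrentThread.CurrentCulture = original;
}

Console.WriteLine("{0}", string.Join(" ", sorted));
```

With Norwegian culture data available this should print the second ordering shown above.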

Next, I extended the snippet to create 4 arrays and sort each of them in parallel.

Thread.CurrentThread.CurrentCulture = CultureInfo.GetCultureInfo("nn-NO");
Console.WriteLine("Main thread-{0} \t Culture-'{1}'", Thread.CurrentThread.ManagedThreadId, Thread.CurrentThread.CurrentCulture);
Console.WriteLine(new string('-', 80));

List<string[]> list = new List<string[]>();
for (int i = 0; i < 4; i++) // four arrays, matching the four output lines below
{
    list.Add(new[] { "Ä", "A", "Z" });
}

var result =
    list
        .Select(
            line => line
                .OrderBy(letter => letter));

Parallel.ForEach(result,
    line =>
        Console.WriteLine(
            "Thread-{0} \t Culture-'{1}' \t {2}",
            Thread.CurrentThread.ManagedThreadId,
            Thread.CurrentThread.CurrentCulture,
            string.Join(" ", line)));

Console.WriteLine();
Console.WriteLine("Press any key to quit");
Console.ReadKey();

The result looks like this:

Main thread-1    Culture-'nn-NO'
------------------------------------------------

Thread-1         Culture-'nn-NO'         A Z Ä
Thread-5         Culture-'de-DE'         A Ä Z
Thread-3         Culture-'de-DE'         A Ä Z
Thread-4         Culture-'de-DE'         A Ä Z

Press any key to quit

The sort order on the Thread-1 line (A Z Ä) differs from the order on the other three lines (A Ä Z). The sorting was split across 4 threads: the main thread and 3 new ones.

All three new threads got my system's default culture (de-DE), not the culture I had set manually on the main thread.

Culture is a property of the executing thread. When a thread is started, its culture is initially determined by calling GetUserDefaultLCID from the Windows API. I know of no way to change this default. See the CultureInfo.CurrentCulture property on MSDN.
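For what it's worth, .NET Framework 4.5 and later expose CultureInfo.DefaultThreadCurrentCulture, which sets the default culture for threads in the current application domain that have not had a culture assigned explicitly. The sketch below is a minimal illustration of that property, not something I used in the tests above; it assumes top-level statements and that "nn-NO" culture data is available on the machine.

```csharp
using System;
using System.Globalization;
using System.Threading;

CultureInfo norwegian = CultureInfo.GetCultureInfo("nn-NO");

// Applies to threads that have no explicitly assigned culture.
CultureInfo.DefaultThreadCurrentCulture = norwegian;

string workerCulture = "";
var worker = new Thread(() =>
{
    // Record the culture this new thread starts with.
    workerCulture = Thread.CurrentThread.CurrentCulture.Name;
});
worker.Start();
worker.Join();

Console.WriteLine("Worker thread culture: '{0}'", workerCulture);
```

This is a process-wide switch, so it also affects code that expects the system default culture. Passing the culture explicitly remains the more targeted option.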

You get the same result with PLINQ syntax:

    list
        .AsParallel()
        .Select(
            line => line
                .OrderBy(letter => letter))
        .ForAll(
            line =>
                Console.WriteLine(
                    "Thread-{0} \t Culture-'{1}' \t {2}",
                    Thread.CurrentThread.ManagedThreadId,
                    Thread.CurrentThread.CurrentCulture,
                    string.Join(" ", line)));

The same query without parallel execution delivers consistent output: all four sequences are sorted in the same order.
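One reason the sequential version stays consistent is deferred execution: OrderBy does not sort anything until the sequence is enumerated, and in the parallel examples that enumeration happens inside string.Join on whichever thread runs the delegate. As a sketch (assuming top-level statements; the printed bag is a hypothetical helper that only collects the output for inspection), forcing the sort on the main thread with ToArray() makes every line follow the main thread's culture:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

Thread.CurrentThread.CurrentCulture = CultureInfo.GetCultureInfo("nn-NO");

var list = new List<string[]>();
for (int i = 0; i < 4; i++)
{
    list.Add(new[] { "Ä", "A", "Z" });
}

// ToArray()/ToList() run the sort right here, on the main thread,
// so every line is ordered under the main thread's culture.
List<string[]> sortedLines = list
    .Select(line => line.OrderBy(letter => letter).ToArray())
    .ToList();

// The parallel loop now only prints data that is already sorted.
var printed = new ConcurrentBag<string>();
Parallel.ForEach(sortedLines, line =>
{
    string text = string.Join(" ", line);
    printed.Add(text);
    Console.WriteLine("Thread-{0} \t {1}", Thread.CurrentThread.ManagedThreadId, text);
});
```

The price is that the sorting itself is no longer parallel, so this is a workaround rather than a fix.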

The solution is to pass an explicit culture-aware comparer to the OrderBy method.

// The second argument is ignoreCase; false makes the comparer case-sensitive.
var norwegianComparer = StringComparer.Create(CultureInfo.GetCultureInfo("nn-NO"), false);
list
    .AsParallel()
    .Select(
        line => line
            .OrderBy(letter => letter, norwegianComparer))
    .ForAll(
        line =>
            Console.WriteLine(
                "Thread-{0} \t Culture-'{1}' \t {2}",
                Thread.CurrentThread.ManagedThreadId,
                Thread.CurrentThread.CurrentCulture,
                string.Join(" ", line)));

But what about legacy foreach and LINQ code, which can be parallelized by simply replacing a single line with Parallel.ForEach() or adding AsParallel()? The result might be unpredictable and hard to track down. If I were the author of .NET or PLINQ, I would carry the main thread's culture over to the child threads: the data comes from the main thread, the split-up happens implicitly, and in most cases the results are merged back into the main thread to be consumed there.

Similar issues might occur in queries that use any culture-aware operation, for instance DateTime formatting and parsing.
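For DateTime, the same idea applies: hand the culture to ToString and ParseExact instead of relying on the worker thread's culture. The following is only a sketch under my own assumptions (top-level statements, made-up sample dates, the standard "d" short date format); AsOrdered() is there just to keep the results in input order.

```csharp
using System;
using System.Globalization;
using System.Linq;

CultureInfo norwegian = CultureInfo.GetCultureInfo("nn-NO");
DateTime[] dates = { new DateTime(2012, 1, 31), new DateTime(2012, 12, 24) };

// Format on worker threads, but always with the explicitly passed culture.
string[] formatted = dates
    .AsParallel()
    .AsOrdered()
    .Select(date => date.ToString("d", norwegian))
    .ToArray();

// Parse back with the same format and the same culture.
DateTime[] parsed = formatted
    .AsParallel()
    .AsOrdered()
    .Select(text => DateTime.ParseExact(text, "d", norwegian))
    .ToArray();

Console.WriteLine(string.Join(", ", formatted));
```

Because both directions use the same explicit culture, the round trip does not depend on which thread picks up which element.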

So if you are targeting systems with different regional settings, it is a good idea to pass a CultureInfo or culture-specific objects (such as comparers) into every PLINQ query and parallel call.
