Properly Comparing Strings with Globalization and Performance in .NET

In Microsoft .NET there are many ways to compare strings. In most of the code I analyze, I see it done one of these two ways:

bool result = email1 == email2;
bool result = email1.Equals(email2);

Is this the best way to compare strings? The quick answer is no. While this works, it doesn’t take localization and globalization into consideration. I’ve seen many developers convert the strings to lowercase or uppercase first, which hurts performance and might not produce the results they expect. So, let us see how to properly compare strings while thinking about globalization and performance.
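To make the lowercasing concern concrete, here is a minimal sketch. The class name, method names, and email values are hypothetical and only for illustration. It contrasts the ToLower() approach, which can allocate temporary strings and follows the current culture's casing rules, with string.Equals() and an explicit StringComparison value.

```csharp
using System;

public static class EmailComparison
{
    // Common but wasteful: ToLower() can allocate a new string for each operand,
    // its casing rules depend on the current culture, and it throws if a is null.
    public static bool EqualsViaToLower(string a, string b)
    {
        return a.ToLower() == b.ToLower();
    }

    // States the comparison rules explicitly, creates no temporary strings,
    // and the static Equals method accepts null arguments.
    public static bool EqualsIgnoreCase(string a, string b)
    {
        return string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        string email1 = "Dave@Example.com";
        string email2 = "dave@example.com";

        Console.WriteLine(email1 == email2);                 // False: == is ordinal and case-sensitive
        Console.WriteLine(email1.Equals(email2));            // False: same rules as ==
        Console.WriteLine(EqualsIgnoreCase(email1, email2)); // True
    }
}
```

Which StringComparison value is right depends on the data: ordinal rules suit identifiers and protocol-like values, while culture-sensitive rules suit text the user reads.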

Here is how Wikipedia defines localization and globalization:

In computing, internationalization and localization are means of adapting computer software to different languages, regional peculiarities and technical requirements of a target locale. Internationalization is the process of designing a software application so that it can be adapted to various languages and regions without engineering changes. Localization is the process of adapting internationalized software for a specific region or language by translating text and adding locale-specific components.

I would say that over 90% of the code projects that I analyze when a company hires me do not take this into account. In this global economy that we live in, this must be built into every line of code, especially code dealing with strings that the user ends up seeing. Project managers and developers who don’t do this from the beginning of the project are naïve and will end up costing the company a lot of money later. Changing a project to handle multiple languages and locales later will be very painful and costly, and will delay the project for many months… even for small projects. I’ve been through this process many times since the ’90s.

Let’s set a baseline for performance that I will refer to later. Here is the performance when using == or Equals() to compare two strings.

[Figure: Code Perf-String Equals]

As you can see, these two ways of comparing strings are very close in performance.

String.Compare()

Using string.Compare() for globalization works well and, based on the benchmarking I’ve done for my book about code performance, it is the best for performance too. Before I show you an example, I need to show the different string comparison globalization choices that can be used.

CurrentCulture: Compare strings using culture-sensitive sort rules and the current culture.
CurrentCultureIgnoreCase: Compare strings using culture-sensitive sort rules, the current culture, and ignoring the case of the strings being compared.
InvariantCulture: Compare strings using culture-sensitive sort rules and the invariant culture.
InvariantCultureIgnoreCase: Compare strings using culture-sensitive sort rules, the invariant culture, and ignoring the case of the strings being compared.
Ordinal: Compare strings using ordinal (binary) sort rules.
OrdinalIgnoreCase: Compare strings using ordinal (binary) sort rules and ignoring the case of the strings being compared.
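The six choices above are the members of the StringComparison enumeration. As a rough sketch (the class and helper names are made up for this example), the following program runs the same pair of strings through every member so the effect of the IgnoreCase variants is easy to see. I have deliberately left out expected output for the culture-sensitive members, since those depend on the machine's culture settings.

```csharp
using System;

public static class ComparisonOptionsDemo
{
    // Returns true when string.Compare() reports the two strings as equal
    // under the given comparison rules.
    public static bool AreEqual(string a, string b, StringComparison rules)
    {
        return string.Compare(a, b, rules) == 0;
    }

    public static void Main()
    {
        string a = "Email";
        string b = "email";

        // Enum.GetValues returns every member of the StringComparison enumeration.
        foreach (StringComparison rules in Enum.GetValues(typeof(StringComparison)))
        {
            Console.WriteLine($"{rules,-28} {AreEqual(a, b, rules)}");
        }
    }
}
```

For this particular pair, which differs only in the case of one ASCII letter, the IgnoreCase members should report a match and the case-sensitive members should not.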

I would say that for most of the strings I’m comparing, I use either CurrentCultureIgnoreCase or InvariantCultureIgnoreCase. Here is an example:

bool result = string.Compare(email1,
              email2,
              StringComparison.CurrentCultureIgnoreCase) == 0;

string.Compare() has many overloads, which makes it very flexible for a variety of comparison tasks, including sorting.
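As an illustration of that flexibility, here is a small sketch that uses two of the overloads: the (string, string, StringComparison) form as a sort comparison, and the substring form that takes start indexes and a length. The class name, method names, and sample values are hypothetical.

```csharp
using System;

public static class CompareOverloadsDemo
{
    // Returns a sorted copy of the input, ordered by string.Compare()
    // with the comparison rules the caller chooses.
    public static string[] SortWith(string[] values, StringComparison rules)
    {
        string[] copy = (string[])values.Clone();
        Array.Sort(copy, (x, y) => string.Compare(x, y, rules));
        return copy;
    }

    // Compares only the part after '@' in two email addresses, ignoring case.
    // If an address has no '@', IndexOf returns -1 and the whole string is used.
    public static bool SameDomain(string email1, string email2)
    {
        int start1 = email1.IndexOf('@') + 1;
        int start2 = email2.IndexOf('@') + 1;

        // length is the maximum number of characters to compare from each start index.
        int length = Math.Max(email1.Length - start1, email2.Length - start2);

        return string.Compare(email1, start1, email2, start2, length,
                              StringComparison.OrdinalIgnoreCase) == 0;
    }

    public static void Main()
    {
        string[] names = { "bravo", "Alpha", "Charlie", "delta" };

        // Culture-sensitive ordering is usually what you want for text shown to users.
        Console.WriteLine(string.Join(", ", SortWith(names, StringComparison.CurrentCultureIgnoreCase)));

        Console.WriteLine(SameDomain("dave@Example.com", "someone@example.COM"));
    }
}
```

The substring overload avoids creating temporary strings with Substring() just to compare part of a value.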

Performance

Now, let’s look at the performance of string.Compare().

[Figure: Code Perf-String Compare-Globalization]

As you can see from these benchmark tests, using Ordinal is close to the speed of using == or Equals(). I hope that the .NET team continues to work on performance since this type of comparison is used a lot in applications.
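If you want to get a rough feel for these differences on your own machine, a simple timing loop is one place to start. The sketch below is not the benchmark harness behind the results above; it is a minimal Stopwatch-based example with made-up names and sample values, and a dedicated benchmarking library will give far more trustworthy numbers.

```csharp
using System;
using System.Diagnostics;

public static class CompareTiming
{
    // Runs the comparison repeatedly and returns the elapsed time in milliseconds.
    public static double Measure(Func<string, string, bool> compare, string a, string b, int iterations)
    {
        bool sink = false;

        // Warm-up pass so that JIT compilation is not part of the measurement.
        for (int i = 0; i < 1000; i++) { sink ^= compare(a, b); }

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) { sink ^= compare(a, b); }
        watch.Stop();

        // Use the accumulated result so the loop body is not treated as dead code.
        GC.KeepAlive(sink);
        return watch.Elapsed.TotalMilliseconds;
    }

    public static void Main()
    {
        string email1 = "user@example.com";
        string email2 = "User@Example.com";
        const int iterations = 1_000_000;

        Console.WriteLine("==                       : " +
            Measure((a, b) => a == b, email1, email2, iterations) + " ms");
        Console.WriteLine("Ordinal                  : " +
            Measure((a, b) => string.Compare(a, b, StringComparison.Ordinal) == 0, email1, email2, iterations) + " ms");
        Console.WriteLine("CurrentCultureIgnoreCase : " +
            Measure((a, b) => string.Compare(a, b, StringComparison.CurrentCultureIgnoreCase) == 0, email1, email2, iterations) + " ms");
    }
}
```

Timings from a loop like this vary with the runtime version, the build configuration, and the strings being compared, so treat them as a hint only.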

Summary

The takeaway from this article is that even though using Compare() with a string comparison option is less performant than one of the other methods mentioned, it’s very important to code this way for globalization. I highly recommend benchmarking your code to see what works best for your project and requirements.

You can pick up a copy of my code performance book by going here: http://bit.ly/dotNetDaveBooks. I have also written articles about code performance on my blog: http://bit.ly/dotNetTipsPerf.

If you have any comments or questions, please make them below.
