How to use HashSet in C#

Take advantage of high-performance HashSet collections for storing unique elements to speed up searches in your applications.

A HashSet is an optimized collection of unordered, unique elements that provides fast lookups and high-performance set operations. The HashSet class was introduced in .NET Framework 3.5 and is part of the System.Collections.Generic namespace. This article discusses how to work with HashSets in C#.

To work with the code examples provided in this article, you should have Visual Studio 2019 installed on your system. If you don’t already have a copy, you can download Visual Studio 2019 from Microsoft.

Create a .NET Core console application project in Visual Studio

First off, let’s create a .NET Core Console Application project in Visual Studio. Assuming Visual Studio 2019 is installed on your system, follow the steps outlined below.

  1. Launch the Visual Studio IDE.
  2. Click on “Create new project.”
  3. In the “Create new project” window, select “Console App (.NET Core)” from the list of templates displayed.
  4. Click Next.
  5. In the “Configure your new project” window shown next, specify the name and location for the new project.
  6. Click Create.

This will create a new .NET Core console application project in Visual Studio 2019. We’ll use this project to work with HashSet in the subsequent sections of this article.

What is a HashSet?

A HashSet, represented by the HashSet<T> class in the System.Collections.Generic namespace, is a high-performance, unordered collection of unique elements. A HashSet is therefore not sorted and contains no duplicate elements. It also doesn’t support access by index; you can only enumerate its elements. A HashSet is typically used for high-performance operations on a set of unique data.

The HashSet<T> class implements several interfaces as shown below:

public class HashSet<T> : System.Collections.Generic.ICollection<T>,
System.Collections.Generic.IEnumerable<T>,
System.Collections.Generic.IReadOnlyCollection<T>,
System.Collections.Generic.ISet<T>,
System.Runtime.Serialization.IDeserializationCallback,
System.Runtime.Serialization.ISerializable

Because HashSet stores its elements in a hash table, searching for an element is fast: a lookup takes constant time on average, regardless of the size of the collection. Note that when T is a reference type or a nullable value type, a HashSet can store a single null value. So, HashSet is a good choice when you want a collection of unique elements that can be searched quickly.

Search for an item in a HashSet in C#

To search for an item in a HashSet, you can use the Contains method, as shown in the code snippet below:

static void Main(string[] args)
{
    HashSet<string> hashSet = new HashSet<string>();
    hashSet.Add("A");
    hashSet.Add("B");
    hashSet.Add("C");
    hashSet.Add("D");
    if (hashSet.Contains("D"))
        Console.WriteLine("The required element is available.");
    else
        Console.WriteLine("The required element isn’t available.");
    Console.ReadKey();
}

HashSet elements are always unique

If you attempt to insert a duplicate element into a HashSet, it is simply ignored and no runtime exception is thrown. The following code snippet illustrates this.

static void Main(string[] args)
{
   HashSet<string> hashSet = new HashSet<string>();
   hashSet.Add("A");
   hashSet.Add("B");
   hashSet.Add("C");
   hashSet.Add("D");
   hashSet.Add("D");
   Console.WriteLine("The number of elements is: {0}", hashSet.Count);
   Console.ReadKey();
}

When you execute the program, it reports that the number of elements is 4, because the second "D" was ignored. The output is shown in Figure 1.

Figure 1.
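The Add method also reports whether the element was actually inserted: it returns true for a new element and false for a duplicate. The following is a minimal sketch of how that return value could be used to count the rejected duplicates. The class name AddReturnValueDemo and the helper CountDuplicates are hypothetical names chosen for this illustration.

```csharp
using System;
using System.Collections.Generic;

class AddReturnValueDemo
{
    // Hypothetical helper: adds each value to a new HashSet and
    // returns how many values were rejected as duplicates.
    public static int CountDuplicates(IEnumerable<string> values)
    {
        var seen = new HashSet<string>();
        int duplicates = 0;
        foreach (string value in values)
        {
            // Add returns false when the element is already present.
            if (!seen.Add(value))
                duplicates++;
        }
        return duplicates;
    }

    static void Main()
    {
        string[] letters = { "A", "B", "C", "D", "D" };
        Console.WriteLine("Duplicates ignored: {0}", CountDuplicates(letters));
    }
}
```

With the five values used here, the second "D" is the only element that Add rejects.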

Now consider the following code snippet that illustrates how duplicate elements are eliminated:

string[] cities = new string[] {
    "Delhi",
    "Kolkata",
    "New York",
    "London",
    "Tokyo",
    "Washington",
    "Tokyo"
};
HashSet<string> hashSet = new HashSet<string>(cities);
foreach (var city in hashSet)
{
    Console.WriteLine(city);
}

When you execute the above program, the duplicate "Tokyo" entry is dropped, so each city name is printed only once, as shown in Figure 2.

Figure 2.

Remove elements from a HashSet in C#

To remove an item from a HashSet you should call the Remove method. The syntax of the Remove method is given below.

public bool Remove (T item);

If the item is found in the collection, the Remove method removes it from the HashSet and returns true. If the item is not found, the method returns false and the HashSet is left unchanged.

The code snippet given below illustrates how you can use the Remove method to remove an item from a HashSet.

string item = "D";
if(hashSet.Contains(item))
{
   hashSet.Remove(item);
}

To remove all items from a HashSet you can use the Clear method.
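Because Remove itself reports whether it found the item, a separate Contains check is not strictly required. The sketch below uses the return value directly and then empties the set with Clear. The class name RemoveDemo and the helper RemoveAndReport are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Generic;

class RemoveDemo
{
    // Hypothetical helper: removes the item and prints what happened.
    public static bool RemoveAndReport(HashSet<string> set, string item)
    {
        // Remove returns true only if the item was present.
        bool removed = set.Remove(item);
        Console.WriteLine(removed ? "Removed \"{0}\"." : "\"{0}\" was not in the set.", item);
        return removed;
    }

    static void Main()
    {
        var hashSet = new HashSet<string> { "A", "B", "C", "D" };

        RemoveAndReport(hashSet, "D"); // "D" is present, so it is removed
        RemoveAndReport(hashSet, "Z"); // "Z" is absent, so nothing changes

        // Clear removes every remaining element.
        hashSet.Clear();
        Console.WriteLine("Count after Clear: {0}", hashSet.Count);
    }
}
```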

Use HashSet set operation methods in C#

HashSet has a number of important methods for set operations such as IntersectWith, UnionWith, IsProperSubsetOf, ExceptWith, and SymmetricExceptWith.

IsProperSubsetOf

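The IsProperSubsetOf method determines whether a HashSet instance is a proper subset of a collection, that is, whether the collection contains every element of the HashSet plus at least one element that the HashSet lacks. This is illustrated in the code snippet given below.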

HashSet<string> setA = new HashSet<string>() { "A", "B", "C", "D" };
HashSet<string> setB = new HashSet<string>() { "A", "B", "C", "X" };
HashSet<string> setC = new HashSet<string>() { "A", "B", "C", "D", "E" };
if (setA.IsProperSubsetOf(setC))
   Console.WriteLine("setA is a proper subset of setC.");
if (!setA.IsProperSubsetOf(setB))
   Console.WriteLine("setA is not a proper subset of setB.");

When you execute the above program, both conditions are satisfied, so both messages appear in the console window.

Figure 3.
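The difference between a subset and a proper subset shows up when two sets hold exactly the same elements: each is a subset of the other, but neither is a proper subset. The following sketch, offered as an illustration only, classifies the relationship between two sets with the SetEquals and IsProperSubsetOf methods. The class name SubsetDemo and the helper Describe are hypothetical.

```csharp
using System;
using System.Collections.Generic;

class SubsetDemo
{
    // Hypothetical helper: describes how 'first' relates to 'second'.
    public static string Describe(HashSet<string> first, HashSet<string> second)
    {
        if (first.SetEquals(second))
            return "equal";           // same elements, so not a proper subset
        if (first.IsProperSubsetOf(second))
            return "proper subset";   // second has everything in first, and more
        return "not a subset";
    }

    static void Main()
    {
        var setA = new HashSet<string> { "A", "B", "C", "D" };
        var setB = new HashSet<string> { "A", "B", "C", "X" };
        var setC = new HashSet<string> { "A", "B", "C", "D", "E" };
        var copyOfA = new HashSet<string>(setA);

        Console.WriteLine("setA vs setC: {0}", Describe(setA, setC));
        Console.WriteLine("setA vs setB: {0}", Describe(setA, setB));
        Console.WriteLine("setA vs copyOfA: {0}", Describe(setA, copyOfA));
    }
}
```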

UnionWith

The UnionWith method performs a set union, as illustrated in the code snippet given below.

HashSet<string> setA = new HashSet<string>() { "A", "B", "C", "D", "E" };
HashSet<string> setB = new HashSet<string>() { "A", "B", "C", "X", "Y" };
setA.UnionWith(setB);
foreach(string str in setA)
{
   Console.WriteLine(str);
}

When you execute the above piece of code, the elements of setB that are not already in setA are added to setA. So setA will now include "A", "B", "C", "D", "E", "X", and "Y".

IntersectWith 

The IntersectWith method modifies a HashSet so that it contains only the elements that are also present in another collection, i.e., the intersection of the two. Here’s an example to understand this.

HashSet<string> setA = new HashSet<string>() { "A", "B", "C", "D", "E" };
HashSet<string> setB = new HashSet<string>() { "A", "X", "C", "Y"};
setA.IntersectWith(setB);
foreach (string str in setA)
{
    Console.WriteLine(str);
}

When you run the above program, only the elements common to the two HashSets, "A" and "C", will be displayed in the console window, as shown in Figure 4.

Figure 4.

ExceptWith

The ExceptWith method represents mathematical set subtraction and is an O(n) operation, where n is the number of elements in the collection passed as the argument. Assume you have two HashSets, setA and setB, and you specify the following statement:

setA.ExceptWith(setB);

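This removes from setA every element that is also present in setB, leaving only the elements of setA that are not in setB. Note that ExceptWith modifies setA in place and does not return a value. Let’s understand this with another example. Consider the code snippet given below.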

HashSet<string> setA = new HashSet<string>() { "A", "B", "C", "D", "E" };
HashSet<string> setB = new HashSet<string>() { "A", "X", "C", "Y" };
setA.ExceptWith(setB);
foreach (string str in setA)
{
   Console.WriteLine(str);
}

When you execute the above program, the elements "B", "D", and "E" will be printed at the console window as shown in Figure 5.

Figure 5.
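ExceptWith changes the set it is called on. If the original set must stay intact, one option is to copy it first and subtract from the copy. The sketch below assumes that approach. The class name ExceptWithDemo and the helper Difference are hypothetical names, not part of the HashSet API.

```csharp
using System;
using System.Collections.Generic;

class ExceptWithDemo
{
    // Hypothetical helper: returns the elements of 'first' that are not in
    // 'second' without modifying 'first'.
    public static HashSet<string> Difference(HashSet<string> first, IEnumerable<string> second)
    {
        // Copy the set (keeping its comparer) so the original is untouched.
        var result = new HashSet<string>(first, first.Comparer);
        result.ExceptWith(second);
        return result;
    }

    static void Main()
    {
        var setA = new HashSet<string> { "A", "B", "C", "D", "E" };
        var setB = new HashSet<string> { "A", "X", "C", "Y" };

        HashSet<string> difference = Difference(setA, setB);

        Console.WriteLine("setA still has {0} elements.", setA.Count);
        Console.WriteLine("The difference has {0} elements.", difference.Count);
    }
}
```

Here setA keeps all five of its elements, while the returned set holds only "B", "D", and "E".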

SymmetricExceptWith 

The SymmetricExceptWith method modifies a HashSet so that it contains only the elements that are present in one of the two HashSets but not in both. Consider the following code snippet that illustrates this.

HashSet<string> setA = new HashSet<string>() { "A", "B", "C", "D", "E" };
HashSet<string> setB = new HashSet<string>() { "A", "X", "C", "Y" };
setA.SymmetricExceptWith(setB);
foreach (string str in setA)
{
  Console.WriteLine(str);
}

When you execute the above code, only the elements that are present in setA but not in setB, and the elements that are present in setB but not in setA, will be displayed in the console window: "B", "D", "E", "X", and "Y", as shown in Figure 6.

Figure 6.

While searching for an element in an unsorted array or List is an O(n) operation, where n represents the number of elements in the collection, checking whether a HashSet contains a particular element is an O(1) operation on average. This makes HashSet a good choice for fast searches and for performing set operations. You can use a List if you would like to store a collection of items in a certain order, and maybe include duplicates as well.

Copyright © 2020 IDG Communications, Inc.