- RISE FROM THE ASHES – Sanjib Nandi - November 1, 2021
- साथ में मेरे दोस्त खड़े थे - August 1, 2021
- BenchmarkDotNet: Advanced Features - June 20, 2021
Code used in this article is available here on github for download and for your experiment.
It has always been confusion among developers that when to use which collection. Sometimes we don’t know about all the available collections and their purpose and consequently use them wrongly and end up with performance issues or rework.
By the end of this article, your all queries and doubts will be gone and next time you will have fair idea that which collection will be most appropriate in the given scenario. I admit it is a long article and still, I didn’t split it as in doing that momentum would have gone. At the same time, you can finish it quickly.
Outline of the Article –
First, We will see the decision tree about choosing the right collection and after
This self-explanatory decision tree gives a comprehensive guide to use the right collection in the given scenario. PDF version of this is available here.
How to navigate in the Decision Tree (For Example)
Scenario – We need a generic collection where we can access the items using key and index both and all items should be sorted by default.
Solution – Start -> Only String Storage -> NO -> Access in Sequence -> NO -> Access by Index (only) -> NO -> Access by Key (only) -> NO -> Access by Key or Index or both -> YES -> Default Sorting needed -> YES and the right collection is SortedList<TKey, TValue>. Similar technique can be applied for any situation.
Now, We will explore more all categories and respective collections
Category #1: Storing Strings only
1.1: StringCollection –
Namespace – System.Collections.Specialized
Basic characteristics –
- Only string ca be added to this collection
- It allows duplicates values to be inserted.
- We can store NULL values as well.
- String comparisons are case sensitive.
- Values can also be accessed by zero based indexing.
- Values are stored in the order of insertion.
Code Example –
public static void LearnStringCollection()
{
Console.WriteLine("From LearnStringCollection Method");
System.Collections.Specialized.StringCollection strCollection
= new System.Collections.Specialized.StringCollection();
String[] arString = new String[] { "ATUL", "THERAN" };
strCollection.AddRange(arString);
strCollection.Add("atul"); //Adding same in another case
strCollection.Add("THERAN"); //Adding duplicate
foreach (var item in strCollection)
{
Console.WriteLine(item);
}
for (int i = 0; i < strCollection.Count; i++)
{
Console.WriteLine($"Value at index # {i} is {strCollection[i]}");
}
}
OUTPUT
Read more on MSDN
1.2: StringDictionary
Namespace – System.Collections.Specialized
Basic Characteristics
- Key implements the hash table.
- Stores only string values in Key Value pair.
- Key can’t be null or duplicate while value can be duplicate.
- It may have one key as String.Empty or “”.
- Key is case insensitive and stored in lower case.
- Values are not shown in the order they are stored, while they are shown in the order of the hashed value of the key. (See the example)
Code Example –
public static void LearnStringDictionary()
{
Console.WriteLine("From LearnStringDictionary Method");
System.Collections.Specialized.StringDictionary strDict = new System.Collections.Specialized.StringDictionary();
strDict.Add("Title Case", "Atul");
strDict.Add("Lower Case", "atul");
strDict.Add(string.Empty, "ATUL");
foreach (System.Collections.DictionaryEntry item in strDict)
Console.WriteLine(" {0,-25} {1}", item.Key, item.Value);
}
Output –
Read more on MSDN
Category #2: Access in Sequence
2.1: Last in First Out (LIFO)
Stack – it comes with two version. Stack class and Stack generic class
Namespace – System.Collections
Characteristics
- Basically, it has 3 main operations
- Push – adding a new item to Stack
- Pop – gets the newest item from the stack and delete it
- Peek – Next item to be deleted (For stack, this is the newest item added in the stack)
- Stack class adds all items in object type while Queue generic class becomes specific to the type specified in the declaration.
Code example
static void LearnStackCollection()
{
Console.WriteLine("from LearnStackCollection Method");
Stack<string> stack = new Stack<string>();
stack.Push("Atul"); // Adding first item
stack.Push("Kumar"); // Adding second item
stack.Push("Sharma"); // Adding third item
stack.Pop(); // Pop
Console.WriteLine($"Peek value ={stack.Peek()}"); //top most item but no delete
foreach (var item in stack)
{
Console.WriteLine(item.ToString());
}
}
Output
Read more on Stack and Generic Stack
2.2: First in First Out (FIFO)
Queue – it comes in two version. Queue class and Queue generic class
Namespace – System.Collections
Characteristics
- Basically, it has 3 main operations
- Enqueue – adding a new item to queue
- Dequeue – gets the oldest item from the queue and delete it
- Peek – Next item to be deleted (For Queue, it is the oldest item at the moment)
- Queue class adds all items in object type while Queue generic class becomes specific to the type specified in declaration.
Code example
static void LearnQueueCollection()
{
Console.WriteLine("from LearnQueueCollection Method");
Queue<string> queue = new Queue<string>();
queue.Enqueue("Atul"); // Adding first item
queue.Enqueue("Kumar"); //Adding second item
queue.Enqueue("Sharma"); //Adding third item
queue.Dequeue(); // Dequeue and delete from the collection
Console.WriteLine($"Peek value ={queue.Peek()}"); //returns the old item but doesn't delete it
foreach (var item in queue)
{
Console.WriteLine(item.ToString());
}
}
Output
Read more on Queue and Generic Queue.
2.3: First to Last or Last to First
Linkedlist<T> – It represent the doubly linkedlist
Namespace System.Collections.Generic
Characteristics
- LinkedList<T> accepts null as a valid Value property for reference types and allows duplicate values
- It is doubly LinkedList so all nodes will have next and previous.
- If the LinkedList<T> is empty, the First and Last properties contain null.
Code Example
static void LearnLinkedList()
{
Console.WriteLine("from LearnLinkedList Method");
//NOTE - No Knowledge to Tom Cruise Mission Impossible Series wont affect your learning
List<string> movies = new List<string>() {"MI-1", "MI-2","MI-3","MI-Rougue Nation",};
LinkedList<string> movieSeries = new LinkedList<string>(movies);
Console.WriteLine("Initial List");
foreach (var item in movieSeries)
{
Console.WriteLine(item);
}
// Tasks to do - I have missed Ghost Protocol at # 4 and So far the last movie # 6 Fallout in the series
LinkedListNode<string> currentMovie = movieSeries.FindLast("MI-Rougue Nation");
movieSeries.AddBefore(currentMovie, "MI-Ghost Protocol");
movieSeries.AddAfter(currentMovie, "MI-Fallout");
Console.WriteLine("Corrected List");
foreach (var item in movieSeries)
{
Console.WriteLine(item);
}
}
Output
Read More – MSDN
Category #3: Access by Index
3.1: StringCollection – We already covered in Category 1 i.e. 1.1
3.2: Array
Namespace – System
Basic Characteristics –
- Array is not part of System.Collection but it implements IList.
- It can store same type of items.
- It needs to declare the size with declaration.
- Increasing the size of an array can be costly.
- It works on zero based index.
- By Default, maximum size of an array can be 2 GB though it can increased even more.
Code Example
static void LearnTraditonalArray()
{
Console.WriteLine("from LearnTraditonalArray Method");
string[] writers = new string[5] { "Atul", "Tuba", "Anonymous", "Theran", "Jay" };
for (int i = 0; i < writers.Length; i++)
{
Console.WriteLine($"{writers[i]} joined at number # {i + 1}");
}
}
Output
Read more on MSDN
3.3: ArrayList
Namespace – System.Collection
Basic Characteristics –
- It can store any type of item in it.
- Capacity defines the count of items it can store without resizing.
- All items are converted in to object before storing.
- While fetching, they need to be converted back to the specific type from object.
- Boxing unboxing is major reason for performance pitfall of array list.
Code Example
static void LearnArrayList()
{
Console.WriteLine("from LearnArrayList Method");
ArrayList arInfo = new ArrayList();
arInfo.Add("Till now, Atul has written "); // Adding string
arInfo.Add(10); //Adding integer
arInfo.Add(" Articles");
Console.WriteLine($"Currently, It has {arInfo.Count} items and it has Capacity of {arInfo.Capacity}"); //Adding string
for (int i = 0; i < arInfo.Count; i++)
{
Console.Write(arInfo[i].ToString());
}
Console.WriteLine();
}
Output
Read More on MSDN
3.4: List<T>
Namespace – System.Collections.Generic
Characteristics
- This is generic version of ArrayList with more features and flexibilities
- It can only store the defined type.
- By default, they are not guaranteed to be sorted. Need to explicitly call sort method.
- Capacity defines the count of items it can store without resizing.
- It allows duplicates.
- It also accepts Null value for refences type
Code Example
static void LearnGenericList()
{
Console.WriteLine("from LearnGenericList Method");
List<int> iCounts = new List<int>();
iCounts.Add(1);
iCounts.Add(12);
iCounts.Add(13);
Console.WriteLine($"Capacity {iCounts.Capacity}" );
for (int i = 0; i < iCounts.Count; i++)
{
Console.WriteLine(iCounts[i]);
}
}
Output
More Study – MSDN
Category #4: Access by Key
4.1: Hash table –
Namespace – System.Collection
Characteristics –
- Each element in Key Value pair is stored as DictionaryEntry object.
- Data is searched and retrieved based on the hash code of the key, not the order they are inserted.
- Key object must be immutable if it is used as a key.
- Key and Value must be object type, it means boxing and unboxing of added items happens.
Code Example –
static void LearnHashTable()
{
Console.WriteLine("from LearnHashTable Method");
Hashtable htData = new Hashtable();
htData.Add("Name", "Atul"); //Inserting string
htData.Add("Age", 100); // inserting integer
htData.Add("IsSeniorCitizen", true); // Inserting boolean
htData["Country"] = "USA"; //Another way to inserting the data
foreach (DictionaryEntry item in htData)
{
Console.WriteLine($"Value for Key {item.Key} is {item.Value}");
}
}
Output –
Read More on MSDN
4.2: Sorted List – It will be covered in Category # 5 i.e. 5.4
4.3: ListDictionary –
NameSpace – System.Collections.Specialized
Basic Characteristics –
- It is faster than Hashtable as it doesn’t generate the hash code of the key.
- Faster and well suited for search with key.
- It is not appropriate for the large amount of data.
- Here Key – value pair are stores as DictionaryEntry object.
- It also accepts Key and Value to be object type, hence boxing and unboxing overhead exists.
Code Example-
static void LearnListDictionary()
{
Console.WriteLine("from LearnListDictionary Method");
System.Collections.Specialized.ListDictionary lstData = new System.Collections.Specialized.ListDictionary();
lstData.Add("Name", "Atul"); //Inserting string
lstData.Add("Age", 100); // inserting age as integer, but not correct
lstData.Add("IsSeniorCitizen", true); // Inserting boolean
lstData["Country"] = "USA"; //Another way to inserting the data
foreach (DictionaryEntry item in lstData)
{
Console.WriteLine($"Value for Key {item.Key} is {item.Value}");
}
}
Output –
Read More on MSDN
4.4: StringDictionary – Already covered in Category # 1 i.e. 1.2
4.5: Dictionary<TKey,TValue>
Namespace – System.Collections.Generic
Basic Characteristics
- It is the generic version of ListDictionary.
- Key is hash coded as it is implemented as Hash Table, different from ListDictionary.
- All keys should be unique and not null.
- If keys are string, they will be case sensitive.
- It doesn’t allow to add any other type than declared.
- Key/Values pairs are stored as KeyValuePair object.
Code Example
static void LearnGenericDictionary()
{
Console.WriteLine("from LearnGenericDictionary Method");
Dictionary<string, int> dictExperiences = new Dictionary<string, int>();
dictExperiences.Add("Atul", 12);
dictExperiences.Add("Theran", 16);
dictExperiences.Add("Tuba", 8);
dictExperiences.Add("tuba", 5); //Keys are Case sensitive So it will be OK
foreach (var item in dictExperiences)
{
Console.WriteLine($"{item.Key} has {item.Value} years on experience");
}
}
Output
Read More on MSDN
4.6: SortedDictionary<TKey,TValue>
Namespace – System.Collections.Generic
Basic Characteristics
- It is same as Dictionary.
- Items are sorted by the key.
Code Example
static void LearnSortedDictionary()
{
Console.WriteLine("from LearnSortedDictionary Method");
SortedDictionary<int, string> srtMembers =
new SortedDictionary<int, string>();
srtMembers.Add(1, "Theran");
srtMembers.Add(4, "Tuba");
srtMembers.Add(2, "Jay");
srtMembers.Add(3, "Atul");
//Items are inserted in different order (1,4,2,3) but sorted in 1,2,3,4
foreach (var item in srtMembers)
{
Console.WriteLine($"Member # {item.Key} is {item.Value}");
}
}
Output
Read more on MSDN
Category #5: Access by Key or Index or Both
5.1: NameValueCollection
Namespace – System.Collections.Specialized
Characteristics –
- It can store strings only as key and Value pair.
- It allows duplicate keys and stores the value together.
- No Particular ordering or sorting guaranteed.
- It accepts the null as a key or as a value.
Code Example –
static void LearnNameValueCollection()
{
Console.WriteLine("from LearnNameValueCollection Method");
NameValueCollection nvCityCountry = new NameValueCollection();
nvCityCountry.Add("IND", "Mumbai");
nvCityCountry.Add("USA", "New York");
nvCityCountry.Add("USA", "Chicago"); //Duplicate key and see the output
nvCityCountry.Add("USA", "Chicago"); // Duplicate key and value both
Console.WriteLine("Access By index");
for (int i = 0; i < nvCityCountry.Count; i++)
{
Console.WriteLine(" [{0}] {1,-10} {2}", i, nvCityCountry.GetKey(i), nvCityCountry.Get(i));
}
Console.WriteLine("Access By Key");
foreach (string item in nvCityCountry.AllKeys)
{
Console.WriteLine(" {0,-10} {1}", item, nvCityCountry[item]);
}
}
Output –
Read more here at MSDN
5.2: SortedList <TKey, TValue>
Namespace – System.Collections.Generic
Characteristics
- It has all same properties like SortedDictionary<TKey, TValue>
- SortedList uses less memory than SortedDictionary.
- SortedDictionary has faster insertion and removal operations for unsorted data, O(log n) as opposed to O(n) for SortedList.
- If the list is populated all at once from sorted data, SortedList is faster than SortedDictionary.
Code Example
static void LearnSortedListGeneric()
{
Console.WriteLine("from LearnSortedListGeneric Method");
SortedDictionary<int, string> classRoll =
new SortedDictionary<int, string>();
classRoll.Add(1, "Atul");
classRoll.Add(4, "Jay");
classRoll.Add(3, "Theran");
classRoll.Add(2, "Tuba");
//By default it sorts
foreach (var item in classRoll)
{
Console.WriteLine(item.Value);
}
//Access by Index
Console.WriteLine(classRoll[2]);
//Access by Key
foreach (KeyValuePair<int, string> kvp in classRoll)
{
Console.WriteLine("Key = {0}, Value = {1}",
kvp.Key, kvp.Value);
}
}
Output
More study – MSDN
5.3: NameObjectCollectionBase
Namespace – System.Collections.Specialized
Characteristics –
- This is an abstract class so can be used directly like other collections.
- Key is stored as string while Value can be any object which is accessible as Key or Index.
- Underlying structure for this class is Hash Table.
Code Example , output and further details are available Here on MSDN
5.4: KeyedCollection<TKey, TValue>
Namespace – System.Collections.ObjectModel
Characteristics
- Like NameObjectCollectionBase, is this also an abstract class.
- It is generic version of NameObjectCollectionBase.
- Unlike other dictionary collection, in KeyedCollection values are NOT stored in Key value pair rather they are stored as one value where key is embedded in to value.
Code Example , output and further details are available Here on MSDN
I hope after this long read, I could be able to explain and give fair idea of which collection to use in which scenario. What do you feel, Please, share your feedback. Code used in this article is available here on github for download and for your experiment.