Friday, November 6, 2015

Generating an Array of Consecutive Intergers in C#

Recently I had to generate an array of consecutive integers from 0 to n-1 for a given number n. I found three ways of doing this in C#, some of which are more elegant and succinct than others. Now I would like to find out which of them is the most efficient and why.

The three methods are:

  1. Using a conventional loop
  2. Using a clever variant of the LINQ Select query
  3. Using the Range method of the Enumerable class
1. The first method uses a plain loop in a straightforward way.

public static int[] PlainLoop(int n)
{
    //create an array of n integers
    var arr = new int[n];
    // in a loop set each element of the array to be equal to its index
    for (int i = 0; i < n; i++) arr[i] = i;
    //return results
    return arr;
}

We instantiate an array of n integers, all elements by default being 0. Then we loop through the array and make each element to be equal to its index in the array. So the element with index 0 is 0, the element with index 1 is 1, and so on until we come to the last element, which has index n-1. So basically the problem is reduced to returning the array of indices of an array of length n!

2. Then I thought, why not implement this idea as a one-liner, using LINQ (Language Integrated Query), and specifically the LINQ Select query with index:

public static int[] LinqSelect(int n)
{
    //create an array of n integers, 
    //then use Linq Select with Index, to get the indexes of the array, 
    //and create an array based on this select query
    return new int[n].Select((x, ind) => ind).ToArray();
}

So again we create an array of n integers, then select the indices of the array into a separate array. This looks pretty nice and quite straightforward.

3. Another method, which I think has been designed specifically for this purpose, is to use the static Range method of the System.Linq.Enumerable class. This gives us a perfect one-liner:

public static int[] EnumerableRange(int n)
{
    //use the Range query of the Enumerable class
    //and create an array based on this select query 
    return Enumerable.Range(0, n).ToArray();
}

The Range method generates a sequence of integers, whereby you can specify the number to start with (in our case 0) and how many numbers you want (in our case n).

So far so good. Now let's look at the performance of these three methods, by executing them in a profiler. I created a unit test that exercises each of these methods with n equal to 1 million:

[TestMethod]
public void GenArrayOfConsInts_PerfTest()
{
    int n = 1000000;
    var arr1 = GenConsecInts.PlainLoop(n);
    var arr2 = GenConsecInts.LinqSelect(n);
    var arr3 = GenConsecInts.EnumerableRange(n);
}

In the Visual Studio Text Explorer, we can run this test through the performance profiler:


Let us examine the call tree trace of the profiler:


We see that the plain loop method is the most efficient one taking only 2.67% of the total processing time. Then comes the Enumerable.Range method with 9.86% (almost 4 times slower), and then comes the LINQ Select method with the staggering 87.44% (about 30 times as slow as the plain loops). We also notice that a select in this method is called 1,000,000 times, for each array element taking quite some time to execute.

So the conclusion is pretty clear, using plain loops in this way is very fast, Enumerable.Range is also OK, and in addition very elegant (!), but LinqSelect is way too slow. The question is of course: why is that?

I will update this post when I figure out the why!

1 comment:

  1. Wow. I really like this informative post. Thank you so much for sharing.

    Buy Custom Website

    ReplyDelete