Lazy LINQ and Enumerable Objects

Wednesday, October 1, 2008

Someone asked me why LINQ operators return an IEnumerable<T> instead of something more useful, like a List<T>. In other words, in the following code:

List<Book> books = new List<Book>();
// ...
IEnumerable<Book> filteredBooks = 
    books.Where(book => book.Title.StartsWith("R"));

... we started with a List<Book>, so why isn’t the Where operator smart enough to return a new List<Book>, or modify the existing list by removing books that don’t match the Where condition?

Let’s talk about modifying the original list.

I hope you’ll agree that it would be odd for a query to modify a data source. Imagine sending a SELECT statement to a database and finding out later that your SELECT removed all but one record from a table. Although the Where operator is just a method call that could change the underlying list of books, it’s better to return something new and leave the original list intact. You won’t find any LINQ operators that modify their input, and this behavior produces many benefits. One obvious benefit is that you can think of the above code as a query, and it won’t surprise you by removing books from your original list.
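Here is a small sketch of that point. The Book class below is a hypothetical stand-in with only a Title property (the real Book type isn't shown in this post), and the sample titles are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Book type used in this post.
public class Book
{
    public string Title { get; set; }
}

public static class QueryLeavesSourceAlone
{
    // Sample data, invented for this example.
    public static List<Book> CreateBooks()
    {
        return new List<Book>
        {
            new Book { Title = "Refactoring" },
            new Book { Title = "Code Complete" },
            new Book { Title = "Rapid Development" }
        };
    }

    public static void Main()
    {
        List<Book> books = CreateBooks();

        IEnumerable<Book> filteredBooks =
            books.Where(book => book.Title.StartsWith("R"));

        // Enumerating the query yields only the matching books...
        Console.WriteLine(filteredBooks.Count()); // 2

        // ...while the original list still holds every book.
        Console.WriteLine(books.Count);           // 3
    }
}
```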

What about creating a new List<T>?

It turns out that creating a new list can be quite expensive, and the expense is often wasted because we frequently don’t need a List<T> returned at all. Think about the number of lists created in the following query (if each operator created a new list).

var filteredBooks =
        books.Where(book => book.Title.StartsWith("R"))
             .OrderBy(book => book.Published.Year)
             .Select(book => book.Title);

Here we would have three lists created (one each by the Where, OrderBy, and Select operators). We only needed a single list of titles as the result, but since the operators cannot modify their underlying data source (for the reasons outlined above), each would be forced to create a new list, and most of that work would be wasted as two of the lists are immediately discarded.

Imagine if you only needed to count the number of books whose title starts with the letter R – in that case you wouldn’t want any of these lists being created and destroyed when you computed the result. It would all be wasted work.

Laziness Isn’t Always Bad

One of the principles of LINQ is to be lazy. A LINQ query won’t do any work unless you force the query to do the work. Even when a query does perform work – it does the least amount of work possible. If you wanted a List<Book> as a result, you’d have to force LINQ to create the list:

List<Book> filteredBooks =
               books.Where(book => book.Title.StartsWith("R"))
                    .ToList();
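One way to watch the laziness happen is to count how often the Where condition actually runs. This is only a sketch: the PredicateCalls counter and the StartsWithR helper are names invented for the example, and the titles are sample data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecution
{
    // Hypothetical counter: how many times the Where condition has run.
    public static int PredicateCalls;

    public static bool StartsWithR(string title)
    {
        PredicateCalls++;
        return title.StartsWith("R");
    }

    public static void Main()
    {
        var titles = new List<string> { "Refactoring", "Code Complete", "Rapid Development" };

        // Building the query runs nothing; Where only remembers what to do.
        IEnumerable<string> query = titles.Where(title => StartsWithR(title));
        Console.WriteLine(PredicateCalls); // 0

        // ToList forces enumeration, so the condition runs once per title.
        List<string> result = query.ToList();
        Console.WriteLine(PredicateCalls); // 3
        Console.WriteLine(result.Count);   // 2
    }
}
```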

 

Instead of lists, LINQ works with a beautifully pure abstraction called IEnumerable<T>. It’s defined like so:

public interface IEnumerable<T> : IEnumerable
{
    IEnumerator<T> GetEnumerator();
}

The only thing you can do with an IEnumerable<T> is ask for an enumerator. An enumerator is something that knows how to visit each item in a collection. Some languages call these enumerator things “iterators”, because they iterate over a collection of objects, returning each object and moving to the next.
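To make the enumerator idea concrete, here is a sketch that walks a sequence by hand with MoveNext and Current, which is roughly what a foreach loop does for you. The CountItems helper is a name made up for this example.

```csharp
using System;
using System.Collections.Generic;

public static class ManualEnumeration
{
    // Walks any sequence by hand and reports how many items it visited.
    public static int CountItems<T>(IEnumerable<T> source)
    {
        int count = 0;
        using (IEnumerator<T> enumerator = source.GetEnumerator())
        {
            // MoveNext advances to the next item and returns false when none remain.
            while (enumerator.MoveNext())
            {
                T current = enumerator.Current; // the item at the current position
                Console.WriteLine(current);
                count++;
            }
        }
        return count;
    }

    public static void Main()
    {
        var titles = new List<string> { "Refactoring", "Code Complete" };
        Console.WriteLine(CountItems(titles)); // prints both titles, then 2
    }
}
```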

The in-memory LINQ operators, like Where, OrderBy, and Select, all work on inputs that implement IEnumerable<T>. That means the operators work on anything that can be enumerated over in a one-by-one fashion. The beauty is that there are so many data sources that these operators can work on, because IEnumerable<T> has such simple demands. Arrays are enumerable, lists are enumerable, dictionaries, trees, stacks, queues, files in a directory, elements in an XML document - all enumerable. Even a simple string is enumerable, since it is composed of a sequence of individual characters.
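As a rough illustration, the same Where call can run against an array, a dictionary, and a string. The CountMatches helper and the sample values below are assumptions made up for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ManySources
{
    // Hypothetical helper: accepts anything enumerable, whatever its concrete type.
    public static int CountMatches<T>(IEnumerable<T> source, Func<T, bool> predicate)
    {
        return source.Where(predicate).Count();
    }

    public static void Main()
    {
        int[] numbers = { 5, 12, 8 };
        var copies = new Dictionary<string, int>
        {
            { "Refactoring", 2 },
            { "Code Complete", 1 }
        };
        string text = "Lazy LINQ";

        // An array of integers.
        Console.WriteLine(CountMatches(numbers, n => n > 6));               // 2

        // A dictionary enumerates as key/value pairs.
        Console.WriteLine(CountMatches(copies, pair => pair.Value > 1));    // 1

        // A string enumerates as individual characters.
        Console.WriteLine(CountMatches(text, c => char.IsUpper(c)));        // 5
    }
}
```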

These same operators return IEnumerable<T>, because it’s the lowest common denominator for everything you’d ever need from a query. Plus, it’s lazy. You want to count the results? The Count operator will get the enumerator and move through the results, one by one, to sum the total number of items. You want to make a concrete list from the results? The ToList operator will get the enumerator and move through the results, one by one, adding each item to a new list it creates. Do you want just the first item in the results? Then the enumerator does just a little bit of work to find that first item. In most cases it does not need to iterate the entire collection to find the first item. Enumerators are lazy, too.
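A quick way to see how little work First needs is to count how many items the Where condition looks at before First has its answer. This is a sketch only: the ItemsExamined counter, the FirstStartingWithR helper, and the titles are all invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FirstStopsEarly
{
    // Hypothetical counter: how many titles the Where condition has examined.
    public static int ItemsExamined;

    public static string FirstStartingWithR(IEnumerable<string> titles)
    {
        return titles
            .Where(title => { ItemsExamined++; return title.StartsWith("R"); })
            .First();
    }

    public static void Main()
    {
        var titles = new List<string>
        {
            "Code Complete", "Refactoring", "Rapid Development", "Peopleware"
        };

        Console.WriteLine(FirstStartingWithR(titles)); // Refactoring

        // Enumeration stopped at the first match; the last two titles were never examined.
        Console.WriteLine(ItemsExamined);              // 2
    }
}
```

Not every operator can stop early. OrderBy, for example, has to read its entire input before it can hand back even the first item, which is why the savings apply in most cases and not all of them.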

The important point is that the enumerator itself doesn’t perform any useful work. It’s you, or the other LINQ operators, that use the enumerator to iterate through the result and produce something meaningful. In the odd case that you never need to look at the result, no enumeration work is performed at all. No lists are created. Pure laziness!

Summary

The beauty of IEnumerable<T> is that it only says “you can get something to enumerate this”. To return something that offers the possibility of enumeration is very little work. And no work is needed unless you actually count the results, create a list from the results, or bind the results to a control for display.

The interface IEnumerable<T> is so wonderfully lazy it inspired me to write a short, short story. If you came to read about LINQ, skip the story as the words are entirely uninteresting and mostly devoid of meaning. 

The Lazy Leopard and the List

The scientist approached the big cat with a notepad and a pencil in her hands. She was worried, of course. The cat was a predator, and likely to be hungry at this hour of the day. “I need to know”, she asked the cat, “do you keep a list of things to do each day?”

The cat stirred. He was a snow leopard with dark rosettes blotted onto his thick, cream-colored fur. The big cat’s eyes were only half open, but he turned and focused them on her.

“I don’t keep a list, dear lady”, he said, followed by a rumbling yawn. “I keep an enumeration”.

“An enumeration?”, she asked.

“Yes, an enumeration”, he replied. “Lists are like the gold bracelet on your wrist, dear lady. Very tangible – very concrete things, lists are. Keeping a list of everything I might want to do is a burden and chore. I’d need to carry your paper, and your pencil”, he said, with his eyes focusing on her hands.

The scientist’s pencil raced across her notebook as she transcribed every word the leopard spoke. She glanced at him as he began to stare, and instinctively pulled herself a little further away.

The leopard continued. “With a list I’d have to add things, and remove things, and constantly reorder the things I want to do. Too much work”, he said, shaking his head. “Do you know what I can do with an enumeration?”, he asked.

She paused at the leopard’s question, and pushed her hair back - she wore glasses when she worked. After some thought, she asked, “Enumerate it?”

“Yes, dear lady”, said the leopard. “I can enumerate it. I enumerate the possibilities one by one, and find the perfect fit for this moment in my life. If I’m thirsty, I’ll find water. If I’m sleepy, I’ll find a place to sleep.” He tilted his head slightly to the right. “If I’m hungry, I’ll find food”, he said.

She finished writing the leopard’s last words and glanced up. Was that a tooth showing? Was he hungry now?

The cat started speaking again.

“One of the wonderful things about enumerations is they theoretically last forever. Lists have a beginning and an end – an Omega for every Alpha. With enumerations, you can keep asking for the next thing, over and over and over again. I ask for them when I’m ready to do something. If I’m tired of doing, they’ll still be there tomorrow. You might say it’s unpredictable behavior, I say I’m just being lazy. Either way, I can’t help it, it’s in my genes”. His soft voice trailed off with a tired tone.

“You intend to live forever?”, she asked. The leopard snarled. Or smiled. She couldn’t quite tell.

“No, dear lady”, he said. “I said the enumeration theoretically lasts forever. One day I’m sure my enumeration will run out of things to give me, or maybe I’ll just be too tired to ask for the next thing, so I’ll sleep forever. I don’t know how it ends. Maybe I should ask you.”

She looked at him again. She felt uneasy now, being here with a leopard. He seemed nice enough, as leopards go, and he certainly gave her interesting topics for research, but he was still a leopard. A carnivore. He was not a beast to be trifled with. She could never let her guard down again.

“I don’t know how it ends either”, she said, and closed her notepad. She tucked her pencil behind her ear, backed away from the cage, and left the leopard alone with his enumeration.


Comments
Rabidtommy Thursday, October 2, 2008
What a touching story *wipes tear*
meisinger Friday, October 10, 2008
the title of the picture should be: "the lazy scream"

nice post
nice art
nice

Gokhan Demir Wednesday, October 22, 2008
like the story :) thank you, really very good post !
by K. Scott Allen