Demystifying Yield return in C#

Although some programming language concepts are not new, I think it is crucial to revisit and share knowledge about specific features of C# to help developers who have recently started their careers. Even experienced developers may not be familiar with some fairly standard subjects, such as the correct use of async/await.

In this short article, I’d like to talk about yield return in C#, a topic that is not that clear to many developers. First of all, here is an example of the kind of code we usually find in day-to-day development tasks:

using System;
using System.Collections.Generic;

class Program
{
    static void Main(string[] args)
    {
        SyncFlightTickets();
    }

    static void SyncFlightTickets()
    {
        var tickets = GetFlightTickets();

        foreach (var ticket in tickets)
        {
            Console.WriteLine($"Ticket {ticket.Id}");
        }
    }

    // Builds the complete list in memory before returning it.
    static IEnumerable<FlightTicket> GetFlightTickets()
    {
        List<FlightTicket> flightTickets = new List<FlightTicket>();

        for (int i = 0; i < 20000; i++)
        {
            flightTickets.Add(new FlightTicket()
            {
                Company = "Test",
                Id = i,
                Passenger = $"Passenger test {i}"
            });
        }

        return flightTickets;
    }
}

class FlightTicket
{
    public int Id { get; set; }
    public string Passenger { get; set; }
    public string Company { get; set; }
}

This console application has a method called SyncFlightTickets that performs the following operations:

Gets the list of flight tickets (20,000 tickets)
After getting the list, loops through each ticket and prints it to the console

This is a pretty standard way to implement everyday systems:

Retrieve a list from a database or service
Loop through the items and perform some operation on each one

I’d say almost every system a developer has worked on contains something similar. But is there any problem with this implementation? It depends on the number of items in the list and on what is done with each item inside the loop. When we are talking about millions of records and heavy processing applied to each record, building the whole list first can be really problematic in terms of memory usage and performance in general. In that case, what would be the alternative? Using “yield return” is one good option, although it is not the only one.

Refactored to use “yield”, our code looks as follows:

static IEnumerable<FlightTicket> GetFlightTickets()
{
    for (int i = 0; i < 20000; i++)
    {
        yield return new FlightTicket()
        {
            Company = "Test",
            Id = i,
            Passenger = $"Passenger test {i}"
        };
    }
}

Note that an instance of List<FlightTicket> is no longer created at the beginning of the method. Instead, the yield return statement is used, and it returns a single FlightTicket object at a time. What does that mean?

The “tickets” object will not contain the 20,000 tickets. Each time the foreach statement moves to its next iteration, the GetFlightTickets method returns the next ticket for that iteration, as you can see in the following image:
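The same behavior can also be observed without a debugger. The following is a minimal sketch, not part of the original example: the DeferredDemo class, the GetTicketIds method and the Log list are hypothetical names introduced only for illustration. It records the order in which things happen, showing that calling the method does not run its body and that producing and consuming alternate.

```csharp
using System;
using System.Collections.Generic;

static class DeferredDemo
{
    // Hypothetical helper: records events in the order they happen.
    public static readonly List<string> Log = new List<string>();

    static IEnumerable<int> GetTicketIds()
    {
        Log.Add("GetTicketIds started");
        for (int i = 0; i < 3; i++)
        {
            Log.Add($"producing {i}");
            yield return i;
        }
    }

    public static List<string> Run()
    {
        Log.Clear();
        var ids = GetTicketIds();   // the method body has not started running yet
        Log.Add("after the call");

        foreach (var id in ids)     // each iteration resumes GetTicketIds
        {
            Log.Add($"consuming {id}");
        }
        return Log;
    }

    static void Main()
    {
        foreach (var line in Run())
        {
            Console.WriteLine(line);
        }
    }
}
```

In this sketch, "after the call" is logged before "GetTicketIds started", and each "producing" line is followed directly by the matching "consuming" line.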

At first, the “tickets” variable does not hold the actual items. However, if we keep debugging in the foreach loop, something magical will happen:

The breakpoint jumped from the SyncFlightTickets method to the yield return statement in the other method (GetFlightTickets), which returns a single ticket. This means that for each iteration of the loop in the calling method (SyncFlightTickets), GetFlightTickets returns only the current item, one by one. Only one item is kept in memory at a time, no matter how many items the full collection would have, as long as the caller does not hold on to the earlier ones.
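The jump between the two methods is easier to understand when the foreach loop is written out by hand. The sketch below is a rough approximation of what foreach does, not the exact code the compiler generates; EnumeratorDemo, GetPassengers and ReadAll are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;

static class EnumeratorDemo
{
    static IEnumerable<string> GetPassengers()
    {
        yield return "Passenger test 0";
        yield return "Passenger test 1";
    }

    public static List<string> ReadAll()
    {
        var result = new List<string>();

        // Roughly what a foreach loop over GetPassengers() does for us.
        using (IEnumerator<string> enumerator = GetPassengers().GetEnumerator())
        {
            // MoveNext runs the method up to the next yield return;
            // it returns false once the method finishes.
            while (enumerator.MoveNext())
            {
                // Current is the value handed over by that yield return.
                result.Add(enumerator.Current);
            }
        }

        return result;
    }

    static void Main()
    {
        foreach (var passenger in ReadAll())
        {
            Console.WriteLine(passenger);
        }
    }
}
```

Each call to MoveNext is the point where the debugger steps into the iterator method, which is why the breakpoint appears to bounce between caller and callee. The results are collected into a list here only so they can be printed afterwards.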

If we take a look at the tickets variable after the second iteration, this is the result:

The tickets variable exposes only one item, the ticket with Id “1”, which is the second item because the Ids start from “0” in our example. That’s fantastic. We should use this more often in our applications when we have to loop through a large number of items.
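Two practical consequences follow from this one-at-a-time behavior. The sketch below is an illustration under stated assumptions: EarlyExitDemo, GetTicketIds and the Created counter are hypothetical names, and integers stand in for FlightTicket objects. It shows that leaving the loop early stops the production of further items, and that looping over the same result twice runs the method body twice.

```csharp
using System;
using System.Collections.Generic;

static class EarlyExitDemo
{
    // Hypothetical counter: how many items the method actually produced.
    public static int Created;

    static IEnumerable<int> GetTicketIds()
    {
        for (int i = 0; i < 20000; i++)
        {
            Created++;
            yield return i;
        }
    }

    public static int StopAfterThree()
    {
        Created = 0;
        foreach (var id in GetTicketIds())
        {
            if (id == 2) break;   // leave the loop after the third item
        }
        return Created;           // only the items requested so far were produced
    }

    public static int EnumerateTwice()
    {
        Created = 0;
        var ids = GetTicketIds();
        foreach (var id in ids) { }
        foreach (var id in ids) { }   // the method body runs again from the start
        return Created;
    }

    static void Main()
    {
        Console.WriteLine(StopAfterThree());   // 3
        Console.WriteLine(EnumerateTwice());   // 40000
    }
}
```

If the same items are needed more than once and producing them is expensive, materializing them once, for example with ToList() from System.Linq, may be the better trade-off, at the cost of holding them all in memory again.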

Thank you for reading this quick article until the end. If you have any questions or comments, please leave them here. I’ll be glad to chat about them. I’m glad to announce that my first book has been published. It is a hands-on deep dive into the most common Design Patterns used in .NET applications. The book contains hundreds of code samples and explanations based on real-world scenarios. It also has many examples on Object-Oriented Programming and SOLID principles, and covers the whole path to getting familiar with .NET 5 and C#.

Source: Medium - Alexandre Malavasi

The Tech Platform
