A generator function

A generator function creates and returns an iterator. This way of creating an iterator is often simpler than the way we did it previously.

Documentation

  1. Generators in the Python Tutorial
  2. Generators in the Functional Programming HOWTO
  3. generator function and generator iterator in the Python Glossary
  4. yield statement
  5. StopIteration exception

A function that prints a range of floats

Let’s build an iterable that’s like a range, but that can iterate through fractions as well as whole numbers.

The difference end-start is the distance from start to end. The fraction i/n ranges from 0 to 1 as the for loop iterates. Therefore (end-start) * i/n is the distance from the start to the next number to be printed. The function floatRange returns None.

def floatRange(start, end, n):
    "Print n+1 equally spaced numbers, from start to end inclusive."
    assert isinstance(n, int) and n > 0

    for i in range(n + 1):
        print(start + (end - start) * i / n)


floatRange(0.0, 1.0, 10)
0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0

A function that yields a range of floats

A function that contains a yield statement is called a generator function. For example, the following floatRange is a generator function.

It looks like the generator function executes its for loop and then returns nothing, since it has no return statement. (More precisely, it looks like the generator function returns None.) But the generator function does return a value. Moreover, the generator function returns this value immediately, before executing any of its body, including the for loop. The value is a certain kind of iterator called a generator iterator.

#Exactly the same function, except that I changed print to yield.

def floatRange(start, end, n):
    "Yield n+1 equally spaced numbers, from start to end inclusive."
    assert isinstance(n, int) and n > 0

    for i in range(n + 1):
        yield start + (end - start) * i / n


it = floatRange(0.0, 1.0, 10)   #it is a generator iterator.
print(f"type(it) = {type(it)}")

#Demonstrate that it is an iterator.

print(it.__next__())
print(it.__next__())
print(it.__next__())
print(it.__next__())
print(it.__next__())
print(it.__next__())
print(it.__next__())
print(it.__next__())
print(it.__next__())
print(it.__next__())
print(it.__next__())
print(it.__next__()) #This call to __next__ raises a StopIteration exception.
type(it) = <class 'generator'>
0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
Traceback (most recent call last):
  File "/Users/mark/python/junk2.py", line 25, in <module>
    print(it.__next__()) #This call to __next__ raises a StopIteration exception.
StopIteration
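
To see that calling a generator function returns at once, without running any of the function's body, here is a minimal sketch. The name noisyRange and its print call are illustrative assumptions, not part of the program above.

```python
def noisyRange(n):
    "Yield the integers 0 through n-1, announcing when the body starts to run."
    print("The body has started running.")
    for i in range(n):
        yield i


it = noisyRange(3)      #Prints nothing: the body has not started yet.
print("Created the generator iterator.")

first = next(it)        #Now the body runs, up to the first yield.
print(f"first = {first}")
```

The built-in function next(it) calls it.__next__() for us. One consequence of this delayed start is that the assert in floatRange is not checked when floatRange is called, but only when __next__ is called for the first time.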

Now that we have demonstrated that the generator function floatRange returns an iterator, we can give this return value to a for loop.

def floatRange(start, end, n):
    "Yield n+1 equally spaced numbers, from start to end inclusive."
    assert isinstance(n, int) and n > 0

    for i in range(n + 1):
        yield start + (end - start) * i / n


for f in floatRange(0.0, 1.0, 10):
    print(f)
0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
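
The for loop above does roughly the following behind the scenes: it asks for an iterator, calls next until StopIteration is raised, and then stops quietly. This is a simplified sketch of that behavior, not the interpreter's actual implementation; the list results is an illustrative assumption added so the values can be inspected afterwards.

```python
def floatRange(start, end, n):
    "Yield n+1 equally spaced numbers, from start to end inclusive."
    assert isinstance(n, int) and n > 0

    for i in range(n + 1):
        yield start + (end - start) * i / n


results = []
it = iter(floatRange(0.0, 1.0, 4))   #iter returns the generator iterator itself.

while True:
    try:
        f = next(it)                 #Same as it.__next__()
    except StopIteration:
        break                        #The iterator is exhausted: leave the loop.
    results.append(f)
    print(f)
```

This is why the for loop never shows us the StopIteration traceback that we saw when we called __next__ by hand.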

So it’s very easy to make your own generator function. Just follow these two steps:

  1. Write a function such as the original floatRange that prints the desired sequence of values, and then returns.
  2. Change each call to print in the function to a yield statement that yields the value that was printed.
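
As a small worked example of the two steps, here is a sketch with a different sequence. The names printCountdown and countdown are hypothetical, chosen only for this illustration.

```python
#Step 1: a function that prints the desired sequence of values.

def printCountdown(n):
    "Print the integers from n down to 1."
    for i in range(n, 0, -1):
        print(i)


#Step 2: the same function, with print changed to yield.

def countdown(n):
    "Yield the integers from n down to 1."
    for i in range(n, 0, -1):
        yield i


printCountdown(3)              #Prints 3, 2, 1 and returns None.
values = list(countdown(3))    #Collect the yielded values into a list.
print(values)
```

Passing the generator iterator to list is a convenient way to see every value it yields, at the cost of creating the list that the generator otherwise avoids.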

Eager versus lazy

The function eagerFloatRange creates and returns a new list. This new list is completely created up front, before we begin to go around the bottom for loop.

def eagerFloatRange(start, end, n):
    "Return a list of n+1 equally spaced floats, from start to end inclusive."
    assert isinstance(n, int) and n > 0

    myList = []
    for i in range(n + 1):
        print(f"About to append {start + (end - start) * i / n}")
        myList.append(start + (end - start) * i / n)

    return myList


for f in eagerFloatRange(0.0, 1.0, 10):
    print(f"f = {f}")
About to append 0.0
About to append 0.1
About to append 0.2
About to append 0.3
About to append 0.4
About to append 0.5
About to append 0.6
About to append 0.7
About to append 0.8
About to append 0.9
About to append 1.0
f = 0.0
f = 0.1
f = 0.2
f = 0.3
f = 0.4
f = 0.5
f = 0.6
f = 0.7
f = 0.8
f = 0.9
f = 1.0

The following function floatRange does not create any list. It creates and returns a generator iterator, which then goes around the function's for loop very reluctantly. It gets stuck each time it hits the yield statement, and remains stuck there until the for loop at the bottom of the program calls __next__ again.

def floatRange(start, end, n):
    "Yield n+1 equally spaced numbers, from start to end inclusive."
    assert isinstance(n, int) and n > 0

    for i in range(n + 1):
        print(f"About to yield {start + (end - start) * i / n}")
        yield start + (end - start) * i / n


for f in floatRange(0.0, 1.0, 10):
    print(f"f = {f}")
About to yield 0.0
f = 0.0
About to yield 0.1
f = 0.1
About to yield 0.2
f = 0.2
About to yield 0.3
f = 0.3
About to yield 0.4
f = 0.4
About to yield 0.5
f = 0.5
About to yield 0.6
f = 0.6
About to yield 0.7
f = 0.7
About to yield 0.8
f = 0.8
About to yield 0.9
f = 0.9
About to yield 1.0
f = 1.0

A generator function saves memory by not creating a list. And if we break out of the for loop (the for f in floatRange loop) prematurely with a break statement, it saves time too: the generator iterator will compute only the values that are actually used by the for loop.
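
The following sketch illustrates the effect of breaking out early. The list named computed and the cutoff 0.25 are assumptions added for the demonstration; computed records every value the generator actually calculates.

```python
computed = []   #Records every value the generator actually computes.

def floatRange(start, end, n):
    "Yield n+1 equally spaced numbers, from start to end inclusive."
    assert isinstance(n, int) and n > 0

    for i in range(n + 1):
        value = start + (end - start) * i / n
        computed.append(value)
        yield value


used = []
for f in floatRange(0.0, 1.0, 10):
    if f > 0.25:
        break           #Leave the loop early; the remaining values are never computed.
    used.append(f)

print(f"used {len(used)} values, computed {len(computed)} values")
```

Only four of the eleven possible values are computed: the three that were used, plus the one that triggered the break.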

Package the generator function in a module

Instead of inventing the name floatRange, let’s give the generator the same name as the existing range. To make it possible for floatdemo.py to use both ranges, we give the generator the last name float by putting it in a module named float.

One complication with having two functions with the same first name: float.py had to import builtins so that it could call the built-in function range (in line 24) as well as the range we created (in line 31).

float.py
floatdemo.py

0
1
2
3
4
5
6
7
8
9
10

0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0

.7 is in the float.range.
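
The role of builtins can be sketched in a single file. This is not the contents of float.py, which are not reproduced here; it is a minimal illustration, under that assumption, of how a function named range can still reach the built-in range.

```python
import builtins

def range(start, end, n):
    "Yield n+1 equally spaced numbers, from start to end inclusive."
    assert isinstance(n, int) and n > 0

    #Inside this file the bare name range refers to this generator function,
    #so ask the builtins module for the original range explicitly.
    for i in builtins.range(n + 1):
        yield start + (end - start) * i / n


values = list(range(0.0, 1.0, 4))
print(values)
print(list(builtins.range(3)))
```

A similar caution applies to the module name: in a file that executes import float, the name float refers to the module, so the built-in float type would have to be reached as builtins.float.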

Things to try

  1. Another generator:

    week.py
    weekdemo.py

    Tue 2019-12-31
    Wed 2020-01-01
    Thu 2020-01-02
    Fri 2020-01-03
    Sat 2020-01-04
    Sun 2020-01-05
    Mon 2020-01-06
    Tue 2020-01-07
    
  2. Another generator:

    inspection.py

    """
    useinspection.py
    
    Loop through the sequence of restaurant inspections yielded by
    inspection.inspections.
    """
    
    import sys
    import inspection  #the inspection.py file I wrote
    import datetime
    import textwrap
    
    #The loop variable is named record so that it does not hide the inspection module.
    for record in inspection.inspections(41_320_866):   #Wo Hop, 17 Mott Street
        assert isinstance(record, list)   #just for documentation
        assert len(record) == 26
        assert int(record[0]) == 41_320_866
    
        #Convert the inspection date field into an object of class datetime.date
        #so we can print it nicely.  (%-d is not supported on Windows.)
        d = datetime.datetime.strptime(record[8], "%m/%d/%Y").date()
        print(record[1], d.strftime("%A, %B %-d, %Y"))
    
        #Split the violation description field into lines of at most 80 characters.
        lines = textwrap.wrap(record[11], width = 80)
        for line in lines:
            print(line)
    
        print()
    
    sys.exit(0)
    
    WO HOP 17 Monday, June 19, 2017
    Cold food item held above 41º F (smoked fish and reduced oxygen packaged foods
    above 38 ºF) except during necessary preparation.
    
    WO HOP 17 Monday, June 19, 2017
    Non-food contact surface improperly constructed. Unacceptable material used.
    Non-food contact surface or equipment improperly maintained and/or not properly
    sealed, raised, spaced or movable to allow accessibility for cleaning on all
    sides, above and underneath the unit.
    
    WO HOP 17 Friday, May 25, 2018
    Food not cooked to required minimum temperature.
    
    WO HOP 17 Friday, May 25, 2018
    Non-food contact surface improperly constructed. Unacceptable material used.
    Non-food contact surface or equipment improperly maintained and/or not properly
    sealed, raised, spaced or movable to allow accessibility for cleaning on all
    sides, above and underneath the unit.
    
    WO HOP 17 Thursday, May 23, 2019
    Food contact surface not properly maintained.
    
    WO HOP 17 Thursday, May 23, 2019
    Non-food contact surface improperly constructed. Unacceptable material used.
    Non-food contact surface or equipment improperly maintained and/or not properly
    sealed, raised, spaced or movable to allow accessibility for cleaning on all
    sides, above and underneath the unit.
    

    Try another CAMIS, e.g. 50036220 for Roof at Park South.

    ROOF AT PARK SOUTH Wednesday, May 23, 2018
    Cold food item held above 41º F (smoked fish and reduced oxygen packaged foods
    above 38 ºF) except during necessary preparation.
    
    ROOF AT PARK SOUTH Wednesday, May 23, 2018
    Food not protected from potential source of contamination during storage,
    preparation, transportation, display or service.
    
    ROOF AT PARK SOUTH Wednesday, May 23, 2018
    Sanitized equipment or utensil, including in-use food dispensing utensil,
    improperly used or stored.
    
    ROOF AT PARK SOUTH Friday, September 21, 2018
    Food worker does not use proper utensil to eliminate bare hand contact with food
    that will not receive adequate additional heat treatment.
    
    ROOF AT PARK SOUTH Friday, September 21, 2018
    Non-food contact surface improperly constructed. Unacceptable material used.
    Non-food contact surface or equipment improperly maintained and/or not properly
    sealed, raised, spaced or movable to allow accessibility for cleaning on all
    sides, above and underneath the unit.
    
    ROOF AT PARK SOUTH Saturday, May 25, 2019
    Cold food item held above 41º F (smoked fish and reduced oxygen packaged foods
    above 38 ºF) except during necessary preparation.
    
    ROOF AT PARK SOUTH Saturday, May 25, 2019
    Non-food contact surface improperly constructed. Unacceptable material used.
    Non-food contact surface or equipment improperly maintained and/or not properly
    sealed, raised, spaced or movable to allow accessibility for cleaning on all
    sides, above and underneath the unit.
    
    

    Unfortunately, each time we call inspection.inspections, we have to wait for the CSV file to be downloaded from data.cityofnewyork.us. Make a smarter inspection.inspections that downloads the CSV file only the first time it is called, and stores the records in a list of approximately 400,000 items, each of which is a list of 26 items (the length asserted in useinspection.py above). Subsequent calls to inspection.inspections should get their data from this list.
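
One possible shape for the caching described above is sketched here. Everything in it is an assumption made for illustration: the real inspection.py is not shown, the function _download is a stand-in that returns three short hard-coded records instead of downloading the CSV file, and the counter downloads exists only to show how many times the stand-in was called.

```python
_records = None      #Module-level cache; None means "not downloaded yet".
downloads = 0        #Counts the simulated downloads, for demonstration only.

def _download():
    "Stand-in for the real download: return a small hard-coded list of records."
    global downloads
    downloads += 1
    return [
        ["41320866", "WO HOP 17"],
        ["50036220", "ROOF AT PARK SOUTH"],
        ["41320866", "WO HOP 17"],
    ]

def inspections(camis):
    "Yield each record whose first field is the given CAMIS number."
    global _records
    if _records is None:          #Only the first use pays for the download.
        _records = _download()

    for record in _records:
        if int(record[0]) == camis:
            yield record


first = list(inspections(41_320_866))
second = list(inspections(50_036_220))
print(len(first), len(second), downloads)
```

Because inspections is a generator function, the check of the cache runs when the first value is requested, not when inspections is called.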

  3. float.range is a function, but float.range(0.0, 1.0, 10) is a generator. (float.range(0.0, 1.0, 10) is the value returned by the float.range function when given those three arguments.)
    import float
    
    print("type(float.range) =", type(float.range))
    print("type(float.range(0.0, 1.0, 10)) =", type(float.range(0.0, 1.0, 10)))
    
    type(float.range) = <class 'function'>
    type(float.range(0.0, 1.0, 10)) = <class 'generator'>
    
  4. See how yield can create a context manager here.
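
For the last item, here is a minimal sketch of a context manager built from a generator function with the standard library decorator contextlib.contextmanager. The names announce and events are hypothetical, chosen only for this illustration.

```python
import contextlib

events = []   #Records the order in which things happen.

@contextlib.contextmanager
def announce(name):
    "Record a message before and after the body of the with statement."
    events.append(f"enter {name}")
    try:
        yield name            #The body of the with statement runs here.
    finally:
        events.append(f"exit {name}")


with announce("demo") as value:
    events.append(f"inside {value}")

print(events)
```

The code before the yield runs when the with statement is entered, the yielded value is bound to the name after as, and the code in the finally clause runs when the with statement is left, even if its body raises an exception.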