Iterators

Documentation

  1. Iterators in the Python Tutorial
  2. iterator and iterable in the Python Glossary
  3. Iterators in the Functional Programming HOWTO
  4. Iterator types in the Python Standard Library
    1. __iter__
    2. __next__
  5. Emulating container types in the Python Language Reference
    1. __iter__
  6. Standard Library functions and exception:
    1. iter
    2. next
    3. StopIteration

Iterables

A for loop can loop through values of many types:

s = "hello"

for c in s:                  #s is an object of class str
    print(c)

myList = [20, 10, 30]

for item in myList:          #myList is an object of class list
    print(item)

myDict = {
    "yes":   "maybe",        #In the real world, the real meaning of "yes" is "maybe".
    "maybe": "no"
}

for key in myDict:           #myDict is an object of class dict
    print(key, myDict[key])

r = range(10) #The object returned by the range function is of class range.

for i in r:
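    print(i)

import sys

#The object returned by the open function is of class io.TextIOWrapper,
#assuming that the file is a text file.
#filename is assumed to hold the name of a text file.

try:
    lines = open(filename)
except OSError as error:     #for example, FileNotFoundError or PermissionError
    print(error, file = sys.stderr)
    sys.exit(1)

for line in lines:
    print(line, end = "")    #line already ends with a newline.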

lines.close()

import sys
import urllib.error
import urllib.request

url = "http://oit2.scps.nyu.edu/~meretzkm/python/string/romeo.txt"

try:
    lines = urllib.request.urlopen(url)
except urllib.error.URLError as error:
    print(error, file = sys.stderr)
    sys.exit(1)

for line in lines:   #line is a sequence of bytes.
    try:
        s = line.decode("utf-8") #Convert sequence of bytes to string of characters.
    except UnicodeError as error:
        print(error, file = sys.stderr)
        sys.exit(1)

    print(s, end = "")           #s already ends with a newline.

lines.close()

A value that you can loop through with a for loop is called an iterable, although I think a better name would be a forable.

#Test if a value is iterable.

infile = open(filename)

try:
    it = iter(infile)
except TypeError:
    print("The value is not iterable.")
else:
    print("The value is iterable.")
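
The same test can be wrapped in a small helper and tried on several values. This is only a minimal sketch: the name is_iterable is made up for illustration and is not part of the Python Standard Library.

```python
def is_iterable(value):
    "Return True if the iter function accepts value, False otherwise."
    try:
        iter(value)
    except TypeError:
        return False
    return True

#The first four values are iterable; the last three are not.
for value in ["hello", [20, 10, 30], {"yes": "maybe"}, range(10), 10, 3.14, None]:
    print(repr(value), is_iterable(value))
```

An int, a float, and None have no __iter__ method, so iter raises TypeError for them.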

for loop

The following for loop iterates through a range of ints. We will build a loop that iterates through a range of fractions.

for i in range(10):
    print(i)

0
1
2
3
4
5
6
7
8
9

What the for loop really does with the above range(10)

The following call to the iter function returns an object of class range_iterator. A range_iterator is an example of an iterator. Since the iter function returned a valid iterator, we say that the range object is iterable.

it = iter(range(10))

while True:
    try:
        i = next(it)
    except StopIteration:
        break

    print(i)

The iter function does its work by calling the __iter__ method. The next function does its work by calling the __next__ method. The above code therefore does the following.

it = range(10).__iter__()

while True:
    try:
        i = it.__next__()
    except StopIteration:
        break

    print(i)
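
To watch the two functions work one step at a time, the following sketch calls next by hand instead of in a loop. It uses only the built-in iter, next, and range; the variable names are arbitrary.

```python
it = iter(range(3))
print(type(it).__name__)    #the class of the iterator

first = next(it)
second = next(it)
third = next(it)
print(first, second, third)

try:
    next(it)                #Nothing is left, so next raises StopIteration.
except StopIteration:
    exhausted = True
else:
    exhausted = False

print(exhausted)
```

The fourth call to next is the one that raises StopIteration, which is exactly the signal the while loop above uses to break.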

Create our own range

The following MyRange is an iterable because it has an __iter__ method. The following MyRange_iterator is an iterator because it has a __next__ method.

This simple MyRange requires all three arguments (start, stop, step), and can only count upwards. I should have error checked the arguments of the methods to make sure that start, stop, step are ints, and to make sure that stop ≥ start, etc.

class MyRange(object):
    def __init__(self, start, stop, step):
        self.start = start
        self.stop = stop
        self.step = step

    def __iter__(self):
        return MyRange_iterator(self.start, self.stop, self.step)


class MyRange_iterator(object):
    def __init__(self, start, stop, step):
        self.i = start
        self.stop = stop
        self.step = step

    def __next__(self):   #takes no arguments except self
        if self.i >= self.stop:
            raise StopIteration
        nextValue = self.i
        self.i += self.step
        return nextValue


for i in MyRange(0, 10, 1):
    print(i)

print()

if 7 in MyRange(0, 10, 1):
    print("7 is in the range.")

0
1
2
3
4
5
6
7
8
9

7 is in the range.
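
MyRange has no __contains__ method, so the in operator above falls back on iteration: it gets an iterator and compares each item with 7 until it finds a match or the iterator raises StopIteration. A rough sketch of that fallback follows. The function name contains is made up for illustration, and the built-in range stands in for MyRange to keep the example short.

```python
def contains(iterable, target):
    "Search the iterable item by item, as the in operator does when there is no __contains__."
    it = iter(iterable)
    while True:
        try:
            item = next(it)
        except StopIteration:
            return False        #ran out of items without finding target
        if item is target or item == target:
            return True         #stop at the first match

print(contains(range(0, 10, 1), 7))
print(contains(range(0, 10, 1), 10))
```

Because the search stops at the first match, a successful test does not have to visit every item.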

Two loops at the same time

The outer for loop creates an iterator for this range. Then, each time the body of the outer loop runs, the inner for loop creates another iterator for the same range. Since each iterator contains its own i, the first iterator keeps its place while the second iterator runs through the ints from 0 to 9 inclusive.

r = MyRange(0, 10, 1)

for outer in r:
    print(f"{outer} is just one of the items in the range ", end = "")
    for inner in r:
        print(inner, end = " ")
    print()

0 is just one of the items in the range 0 1 2 3 4 5 6 7 8 9
1 is just one of the items in the range 0 1 2 3 4 5 6 7 8 9
2 is just one of the items in the range 0 1 2 3 4 5 6 7 8 9
3 is just one of the items in the range 0 1 2 3 4 5 6 7 8 9
4 is just one of the items in the range 0 1 2 3 4 5 6 7 8 9
5 is just one of the items in the range 0 1 2 3 4 5 6 7 8 9
6 is just one of the items in the range 0 1 2 3 4 5 6 7 8 9
7 is just one of the items in the range 0 1 2 3 4 5 6 7 8 9
8 is just one of the items in the range 0 1 2 3 4 5 6 7 8 9
9 is just one of the items in the range 0 1 2 3 4 5 6 7 8 9
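
The independence of the two iterators can also be seen by advancing them by hand. This is a minimal sketch that uses the built-in range in place of MyRange so that the example stands alone; the variable names are arbitrary.

```python
r = range(0, 10, 1)

outer = iter(r)     #first iterator
inner = iter(r)     #second iterator for the same range

fromOuter = [next(outer)]                             #outer yields 0
fromInner = [next(inner), next(inner), next(inner)]   #inner yields 0, 1, 2
fromOuter.append(next(outer))                         #outer continues with 1

print(fromOuter)
print(fromInner)
```

Advancing the second iterator three times did not disturb the first one, which is why the nested loops above work.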

Pass an iterator to a for loop

We already know that this works:

for i in MyRange(0, 10, 1):
    print(i)

Let’s make this work too:

it = iter(MyRange(0, 10, 1))

for i in it:
    print(i)

Traceback (most recent call last):
  File "/Users/myname/python/myprog.py", line 26, in <module>
    for i in it:
TypeError: 'MyRange_iterator' object is not iterable

Simply add the following trivial __iter__ method to class MyRange_iterator. This makes every MyRange_iterator iterable: when the for loop asks it for an iterator, it returns itself.

class MyRange(object):
    def __init__(self, start, stop, step):
        self.start = start
        self.stop = stop
        self.step = step

    def __iter__(self):
        return MyRange_iterator(self.start, self.stop, self.step)


class MyRange_iterator(object):
    def __init__(self, start, stop, step):
        self.i = start
        self.stop = stop
        self.step = step

    def __iter__(self):
        return self

    def __next__(self):   #takes no arguments except self
        if self.i >= self.stop:
            raise StopIteration
        nextValue = self.i
        self.i += self.step
        return nextValue


it = iter(MyRange(0, 10, 1))

for i in it:                 #now works, because it.__iter__() returns it
    print(i)
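
One consequence of an iterator returning itself is that it can be looped through only once, while the iterable that created it can be looped through again and again. The following minimal sketch shows this with a list and its iterator; the same should hold for MyRange and MyRange_iterator as defined above.

```python
myList = [20, 10, 30]
it = iter(myList)

print(iter(it) is it)        #An iterator's __iter__ returns the iterator itself.

firstPass = list(it)         #consumes every item
secondPass = list(it)        #it is already exhausted, so nothing is left
print(firstPass)
print(secondPass)

thirdPass = list(myList)     #The list makes a fresh iterator each time.
print(thirdPass)
```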

A range of fractions

self.end - self.start is the distance from start to end.
self.i / self.n is the fraction of that distance that has already been covered.
It is a fraction in the range from 0 to 1 inclusive.

class FloatRange(object):
    "A range of n+1 equally spaced floats."

    def __init__(self, start, end, n):
        assert isinstance(n, int)
        self.start = start
        self.end = end
        self.n = n

    def __iter__(self):
        return FloatRange_iterator(self.start, self.end, self.n)


class FloatRange_iterator(object):
    def __init__(self, start, end, n):
        self.start = start
        self.end = end
        self.n = n
        self.i = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.i >= self.n + 1:
            raise StopIteration
        result = self.start + (self.end - self.start) * self.i / self.n
        self.i += 1
        return result


for f in FloatRange(0.0, 1.0, 10):
    print(f)

print()

if .7 in FloatRange(0.0, 1.0, 10):
    print(".7 is in the range.")

0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0

.7 is in the range.
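
The in test above depends on exact equality between floats. It succeeds for .7 here, but float arithmetic is not always exact, so a test for another value might fail unexpectedly. A more forgiving test can compare with math.isclose. The following is a sketch under assumptions: the function name approximately_in is made up for illustration, and a plain list of floats stands in for FloatRange to keep the example short.

```python
import math

def approximately_in(target, iterable):
    "Return True if some item of the iterable is very close to target."
    for item in iterable:
        if math.isclose(item, target):   #default relative tolerance
            return True
    return False

values = [0.1 + 0.2, 0.5]    #0.1 + 0.2 is not exactly equal to 0.3

print(0.3 in values)
print(approximately_in(0.3, values))
```

The first test fails and the second succeeds, because 0.1 + 0.2 differs from 0.3 only in its last binary digit.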

Define the new classes in a separate module.

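The following module lets the user write float.range(0.0, 1.0, 10) instead of FloatRange(0.0, 1.0, 10). Be aware that after import float, the name float in the importing program refers to this module rather than to the built-in class float.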

"""
This module is float.py.
"""

class range(object):
    "A range of n+1 equally spaced floats, from start to end inclusive."

    def __init__(self, start, end, n):
        if not isinstance(start, (int, float)):
            raise TypeError(f"start must be int or float, not {type(start)}")
        if not isinstance(end, (int, float)):
            raise TypeError(f"end must be int or float, not {type(end)}")
        if end <= start:
            raise ValueError("end must be > start")
        if not isinstance(n, int):
            raise TypeError(f"n must be int, not {type(n)}")
        if n <= 0:
            raise ValueError(f"n must be positive, not {n}")
        self.start = start
        self.end = end
        self.n = n

    def __iter__(self):
        return iterator(self.start, self.end, self.n)


class iterator(object):
    def __init__(self, start, end, n):
        self.start = start
        self.end = end
        self.n = n
        self.i = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.i >= self.n + 1:
            raise StopIteration
        result = self.start + (self.end - self.start) * self.i / self.n
        self.i += 1
        return result


if __name__ == "__main__":
    import sys
    for f in range(0.0, 1.0, 10):   #the range we just defined here in float.py
        print(f)
    sys.exit(0)

Test out the module before you try to import it.

python3 -m float

import sys
import float

for i in range(10):                 #the range in the Python Standard Library
    print(i)

print()

for f in float.range(0.0, 1.0, 10): #the range we defined in float.py
    print(f)

sys.exit(0)

0
1
2
3
4
5
6
7
8
9

0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0

Things to try

  1. Instead of implementing this float.range class as shown above, can you use one of Python’s more exotic iterators such as itertools.count?
  2. Another iterable:

    week2.py         Please rename this file week.py.

    import sys
    import datetime
    import week
    
    startingDate = datetime.date(2019, 12, 31) #startingDate is an object of class datetime.date.
    
    for d in week.range(startingDate):         #d is an object of class datetime.date.
        print(d.strftime("%a %Y-%m-%d"))
    
    sys.exit(0)
    
    Tue 2019-12-31
    Wed 2020-01-01
    Thu 2020-01-02
    Fri 2020-01-03
    Sat 2020-01-04
    Sun 2020-01-05
    Mon 2020-01-06
    Tue 2020-01-07
    
  3. Create an iterable for looping through the lines read from a text file. The iterable should open and close the file.
    for line in lines(filename):
        print(line)
    
  4. Create an iterable for looping through the words in a string.
    """
    This module is words.py
    """
    
    class Words:
        "Iterate through the whitespace-separated words in a string."
        def __init__(self, s):
            if not isinstance(s, str):
                raise TypeError(f"s must be of type str, not {type(s)}")
            self.s = s
    
        def __iter__(self):
            return Words_iterator(self.s)
    
    
    class Words_iterator:
        def __init__(self, s):
            self.words = s.split()
    
        def __iter__(self):
            return self
    
        def __next__(self):
            if len(self.words) == 0:
                raise StopIteration
            return self.words.pop(0) #Remove and return the first item in the list.
    
    
    if __name__ == "__main__":
        import sys
        s = "We hold these truths to be self-evident"
        for word in Words(s):
            print(word)
        sys.exit(0)
    
    import sys
    import words
    
    s = "We hold these truths to be self-evident"
    
    for word in words.Words(s):
        print(word)
    
    sys.exit(0)
    
    We
    hold
    these
    truths
    to
    be
    self-evident