4.4. Loops

Albert Einstein is supposed to have once said: “The definition of insanity is doing the same thing over and over again and expecting different results”.

But a very important kind of computation works like this: To compute x, do the same thing many times, each time expecting a slightly different result. The last time, the result is x. This important – in fact, indispensable – kind of computation is called looping. More formally, since it involves executing the same statements over and over again, it’s called iteration or just iterating. [1] Thus, if Einstein has the definition right (and who’s going to argue with Albert Einstein?), then computation is the height of insanity. (well, at least looping is, and looping is an indispensable part of computing many things)

Now how does this work, if it does? The answer is that in looping what we do has the effect of changing the computational context. As a result, each time, the same action has a different result,

An example: Consider adding together a sum of numbers. Let’s call the first step Step 0. Here’s Step 0:

x = 0
L = [1,2,3,4]

After Step 0, each step in the computation will look like this:

x = x + L[0]
L = L[1:]

The key idea is that each time we do this we change the value of x and L; each time we add a number to x , the value of x changes. Ultimately, the value of x will be the sum we want to compute. We could write this as follows:

   x = x + L[0]
   L = L[1:]

Where the idea is: just keep doing this over and over. As a result, we would have:

\text{step} & x & \text{L}\\
0 & 0 &  [1,2,3,4]\\
1 & 1 &  [2,3,4]\\
2 & 3 &  [3,4]\\
3 & 6 &  [4]\\
4 & 10 & []

The insight: breaking down a problem into a series of simple repetitive steps greatly increases the power of our problem solving tools. A great deal of programming has to do with deciding how to formulate your problem as a series of repeated actions, each performed in a slightly changed context.

This is the idea of a loop.

4.4.1. While loops

The example above used a block of code we could call a repeat block. Python doesn’t actually have a repeat block. One issue is that this kind of block doesnt tell you how to stop. Getting the right answer in the example above depended on knowing the exact length of L. The computation ended at Step 4 because L was exactly 4 elements long. But suppose we had gone on to a step 5

\text{step} & x & \text{L}\\
0 & 0 &  [1,2,3,4]\\
1 & 1 &  [2,3,4]\\
2 & 3 &  [3,4]\\
3 & 6 &  [4]\\
4 & 10 & []\\
5 & \text{IndexError} &  \\

That is, when we try to do:

x = L[0]

on the empty list, we get an IndexError. Thus, we want a construction that implements repeating, but includes a condition telling us when to stop. This is the idea of a while loop. One way to write the computation above in standard Python is:

1>>> x = 0
2>>> L = [1,2,3,4]
3>>> while len(L) > 0:
4      x = x + L[0]
5      L = L[1:]

What follows the Python keyword while is a test (as in an if clause). As long as that test succeeds, we keep looping. On each iteration, L necessarily gets smaller. When L is empty, its length is 0, the test for re-entering the loop fails, and we stop. At that point, x has the value we want:

>>> x

4.4.2. For loops

Next we introduce the kind of loop we will use most often: looping through a container using a for loop.

Note that in order for while loop used above to work, we had to keep re-setting the value of L to make L shorter. But all we were trying to do was to loop through the contents of a container, doing the same thing to each element. Python provides a much more natural idiom for this, the for loop. Using a for loop the example above becomes:

>>> x = 0
>>> L = [1,2,3,4]
>>> for y in L:
      x = x + y

Here is another example that shows each step:

>>> L = [3,2,1,4]
>>> for x in L:
       print x

This results in the following output:


So x is successively set to each member of the list, in the order in which they occur in the list, and each time its value is printed. We call x the loop variable. The rules for what can be a loop variable are the same as they are for variables in general. All the names that can be variables can be loop variables. And just as pairs of variables can occur on the left hand side of an assignment:

(x,y) = (2,3)

so pairs of variables can be loop vars:

>>> L = [('a',3),('b',2),('c',1),('d',4)]
>>> for (let,num) in L:
       print let, num

This results in the following output:

a 3
b 2
c 1
d 4

The same principles works for all containers, lists, tuples, strings, sets. For example:

>>> S = set([3,2,1,4])
>>> for x in S:
        print x

1 2 3 4

Note that the order of the elements in the list used to construct S has no effect on how S is printed out. In general, the order in which the elements of a set will be iterated through is unpredictable, and should not be relied on.

A for loop will also work with a dictionary, but in this case, the loop variable will be set to the keys of the dictionary. Again, this happens in no particular order:

1>>> dd = dict(b=0,a=1,d=2,c=3)
2>>> for x in dd:
3        print x

Note that this is consistent with how we use in as an operator in dictionaries:

>>> x in dd

tests whether x is a key of dd.

>>> b in dd
>>> 0 in dd

Suppose we want to print out both the keys and the values of the dictionary. One way to do that is this:

1>>> dd = dict(b=0,a=1,d=2,c=3)
2>>> for x in dd:
3        print x, dd[x]
5a 1
6c 3
7b 0
8d 2

A more convenient way is this:

1>>> dd = dict(b=0,a=1,d=2,c=3)
2>>> for (key,val) in dd.items():
3        print key, val
5a 1
6c 3
7b 0
8d 2

This works because items is a method on dictionaries that produces a list of pairs of the keys and values:

>>> items_list = dd.items()
>>> items_list
[('a', 1), ('c', 3), ('b', 0), ('d', 2)]

That is, items_list is a list of tuples (each of length 2).

4.4.3. Filtering with for loops

One of the basic uses of loops is to collect the elements of a list that satisfy some condition. So we combine a Boolean test with a for loop.

For example let’s collect the positive numbers from a list of numbers:

1>>> nums = [0.3, -.15, 18, -7, 212.1, 0]
2>>> result = []
3>>> for x in nums:
4       if x > 0:
5          result.append(x)
6>>> result
7[0.3, 18, 212.1]

4.4.4. Nested loops

One of the most important uses of loops is to consider all possible pairings of different sequences.

Consider the following example, taken from PeterNorvig’s sudoku page. The idea is to represent some facts about sudoku puzzles. These are Norvig’s naming conventions for the 81 squares in a sudoku puzzle:

\text{rows} & \text{A, B, C, D, E, F, G, H} \\
\text{cols} & \text{1, 2, 3, 4, 5, 6, 7, 8, 9}

He names squares as follows:

 1 A1 A2 A3| A4 A5 A6| A7 A8 A9
 2 B1 B2 B3| B4 B5 B6| B7 B8 B9
 3 C1 C2 C3| C4 C5 C6| C7 C8 C9
 5 D1 D2 D3| D4 D5 D6| D7 D8 D9
 6 E1 E2 E3| E4 E5 E6| E7 E8 E9
 7 F1 F2 F3| F4 F5 F6| F7 F8 F9
 9 G1 G2 G3| G4 G5 G6| G7 G8 G9
10 H1 H2 H3| H4 H5 H6| H7 H8 H9
11 I1 I2 I3| I4 I5 I6| I7 I8 I9

Given these conventions, the set of possible squares is computed simply by pairing each of the rows with each of the columns. This can be done with a double loop:

 1>>> rows     = 'ABCDEFGHI'
 2>>> cols     = '123456789'
 3>>> squares = []
 4>>> for row in rows:
 5       for col in cols:
 6           squares.append( row + col)
 7>>> squares
 8['A1', 'A2', 'A3', 'A4', 'A5', 'A6', 'A7', 'A8', 'A9',
 9 'B1', 'B2', 'B3', 'B4', 'B5', 'B6', 'B7', 'B8', 'B9',
10 'C1', 'C2', 'C3', 'C4', 'C5', 'C6', 'C7', 'C8', 'C9',
11 'D1', 'D2', 'D3', 'D4', 'D5', 'D6', 'D7', 'D8', 'D9',
12 'E1', 'E2', 'E3', 'E4', 'E5', 'E6', 'E7', 'E8', 'E9',
13 'F1', 'F2', 'F3', 'F4', 'F5', 'F6', 'F7', 'F8', 'F9',
14 'G1', 'G2', 'G3', 'G4', 'G5', 'G6', 'G7', 'G8', 'G9',
15 'H1', 'H2', 'H3', 'H4', 'H5', 'H6', 'H7', 'H8', 'H9',
16 'I1', 'I2', 'I3', 'I4', 'I5', 'I6', 'I7', 'I8', 'I9']

We return to Norvig’s Sudoku code in future examples.

4.4.5. A better example of a while loop

As we saw when we got to for loops, if what we’re doing involves iterating through the contents of a container, a for loop is the right way to go. But while loops remain indispensable. They are the right tool when the number of times we have to iterate is governed by some logical condition. Suppose we have a game in which the user picks a number between 0 and 100 and the computer has to guess it. For each guess, the user must respond in one of three ways: either “correct” or “higher” or “lower”. The game ends only after a correct answer. Here is the code:

 1correct = False
 2low,high = (0,100)
 3while not correct:
 4   guess = low + (high - low)/2
 5   msg = 'Is it {0}?\n'.format(guess)
 6   response = raw_input(msg).lower()
 7   if response.startswith('correct'):
 8      print 'I won!'
 9      correct = True
10   elif response.startswith('lower'):
11      high = guess
12   elif response.startswith('higher'):
13      low = guess
14   else:
15      print 'I dont understand your response. Let me try again.'

Let’s take this line by line:

  1. Set the correct variable to False, the value that keeps the game going. We will flip it to True when the game is over.

  2. We keep track of the highest lower bound and the lowest upper bound to keep the narrowing the range of possible guesses. We start with the initial conditions of the game: low=0, high=100.

  3. Enter the while if correct is False.

  4. Line 4 computes a guess that is half way between the currently known low and high.

  5. Line 5 sets the variable msg to the string we print out to the user announcing our guess. The 0 is a slot in the msg string that will be filled in by the user’s guess and Line 5 fills that slot in with the current guess.

  6. The Python builtin raw_input prints out a message to the user and returns the user’s response, which is considered to have ended when the user types [Enter].

  7. The branches in the if-clause beginning on this line, check that the response is valid and reset the low/high boundaries appropriately.

  8. Crucially, in the case where the guess is correct, the name correct is set to :True, so that the test for re-entering the loop fails; as desired sinc ethe game is now over, the loop will be exited.

4.4.6. Shortcuts for updating var values

Consider the code for summing the elements of L again:

>>> x = 0
>>> L = [1,2,3,4]
>>> for y in L:
      x = x + y

The kind of thing being done in this line is very common:

x = x + L[0]

We are continuously resetting the value of x with an addition. Python allows this to be more compactly written simply as:

x += L[0]

This is almost equivalent to the line above (we discuss an importat difference below); in particular. it is an error if x is undefined. So we still need to initialize x to 0, and the while loop example from above becomes a bit shortened:

1>>> x = 0
2>>> L = [1,2,3,4]
3>>> while len(L) > 0:
4      x += L[0]
5      L = L[1:]

There are a number of shortcut assignment operators Parallel to +=:

\multicolumn{3}{c|}{\text{Shortcut}} & \text{Meaning}\\
\rm{x} \,&\, {\scriptstyle +}\!\!= \,&\, \rm{y}  & \rm{x} = \rm{x} + \rm{y}\\
\rm{x} \,&\, {\scriptstyle -}\!\!= \,&\, \rm{y}  & \rm{x} = \rm{x} - \rm{y}\\
\rm{x} \,&\, {\scriptstyle *}\!\!= \,&\, \rm{y}  & \rm{x} = \rm{x} * \rm{y}\\
\rm{x} \,&\, {\scriptstyle /}\!\!= \,&\, \rm{y}  & \rm{x} = \rm{x} / \rm{y}\\
\rm{x} \,&\, {\scriptstyle \%}\!\!= \,&\, \rm{y}  & \rm{x} = \rm{x} \% \rm{y}\\
\rm{x} \,&\, {\scriptstyle **}\!\!= \,&\, \rm{y}  & \rm{x} = \rm{x} \,*\!*\, \rm{y}\\
\rm{x} \,&\, {\scriptstyle /\!/}\!\!= \,&\, \rm{y}  & \rm{x} = \rm{x}\,/\!/\,  \rm{y}\\
x \,&\, {\scriptstyle -}\!\!= \,&\, y  & x = x - y\\
x\,&\, { *}\!\!= \,&\,y  & x = x * y \\
x\,&\, {\scriptstyle /}\!\!= \,&\,y  &  x = x/y \\
x\,&\, {\scriptstyle \%}\!\!= \,&\,y   & x = x\,{\scriptstyle \%}\, y & \text{\# Recall that \% is the remainder after division}\\
x \,&\,{ **}\!\!=\,&\, y  &  x = x\,*\!*\,y \\
x \,&\,{\scriptstyle /\!/}\!\!=\,&\, y &  x = x\,/\!/\,y & \text{\# Recall that $/\!/$ is division rounded down to an int}

4.4.7. Tuple assignments and iterating through pairs

The elements in the container in a for loop have to be appropriate for whatever is in loop var position. Thus using a tuple in loop var position when the container is not a list of tuples results in an error:

1>>> L = [3,2,1,4]
2>>> for (let,num) in L:
3       print let, num
4Traceback (most recent call last):
5  ...
6TypeError: 'int' object is not iterable

The error message may seem opaque, but it actually explains what Python is trying to do. Let x be the first thing in the container, L. When you place a tuple of names in the loop variable position, Python does a little loop within a loop: It tries to iterate through the tuple of names and through x in parallel, setting each name in the tuple of names to the corresponding element in x. But if x isn’t a container, that’s an error; and in this case it isn’t! Another way to get the same kind of error is just to try to loop through an integer, as in the following silly code snippet:

1>>> for x in 1:
2       print x
3Traceback (most recent call last):
4    ....
5TypeError: 'int' object is not iterable

This is also the error that occurs if we try to assign something that isn’t a container to a tuple of variables:

>>> (x,y) = 1
Traceback (most recent call last):
TypeError: 'int' object is not iterable

Note that the item on the right in a tuple assignment doesn’t have to be a tuple: Any container with the right number of elements will do, even if it’s not a sequence. The following piece of code is a little strange, but quite legal:

 1>>> S = set([1,2,4,3])
 2>>> (x,y,w,z) = S
 3>>> x
 5>>> y
 7>>> w
 9>>> z

Since the contents of S don’t come in any guaranteed order, we have no way of knowing what x, y, w and z individually are. All we know is that each will be a distinct element of S.

4.4.8. Loop keywords: continue and break

A common occurrence in looping through the contents of a container is that you only want to process a subset of the contents. For example, you are searching for an element with a particular property and you want to stop when you find it. Or you want to skip elements with certain properties. For cases like these, you break and continue, two Python keywords that are only useful inside loops.

For example, suppose you have the following word list:

>>> word_list = ['flow','repent','arch','singing','applesauce', 'ring']

and suppose you want to search through the list to find the first word that ends in “ing” and set the variable result to be that word. The following code does the job:

>>> for word in word_list:
       if word.endswith("ing"):
          result = word

The assignment statement result = word sets the variable result to the word you want, so that you can access it outside the loop, and the break statement terminates the loop:

>>> result

If we omitted the break statement, how would this piece of code differ? Would the value of result necessarily be the same?:

for word in word_list:
    if word.endswith("ing"):
       result = word

The answer is that the word stored in result would now be the last word in the list ending in “ing”:

1>>> for word in word_list:
2        if word.endswith("ing"):
3           result = word
4>>> result

Even if there is only one word ending in “ing”, the second loop (lines 8-10) does unnecessary work, since all the words in the list are examined, even after a suitable word is found.

The continue keyword is used for skipping elements in a container, while continuing the loop. This key word is most useful when the conditions for skipping something are simple and the conditions for processing are complicated. Then use of a continue can make the code more readable. For example, suppose we want to split a list into words beginning with letters in the first half of the alphabet and those beginning with letters in the second half, omitting uppercase words. We’re building two lists, then:

 1>>> word_list = ['box','ring','Smith','fighting']
 2>>> first_half,second_half = ([],[])
 3>>> for word in word_list:
 4        if word.istitle():
 5           continue
 6        if word[0] < 'm':
 7            first_half.append(word)
 8        else:
 9            second_half.append(word)
10>>> first_half
11['box', 'fighting']
12>>> second_half

Notice the < operator is used here to implement alphabetizing. Used on strings it refers to the most common way to order strings; used on numbers, it refers to the natural ordering of number.

Note that both continue and break raise syntax errors if used outside a loop:

>>> continue
SyntaxError: 'continue' not properly in loop