4.5. List comprehension

A list comprehension is a compact way to write a loop whose purpose is to build a list.

4.5.1. Basic comprehensions

Frequently we go through a loop with the goal of building a list of results. Such a loop looks like this:

result_list = []
for x in sequence:
    result_list.append(transform(x))

A list comprehension lets you do all this in one line:

result_list = [transform(x) for x in sequence]

This means: Set result_list to a list that consists of the result of applying transform to each x in sequence.

So for example, to add 2 to every member of the list M:

>>> M = [2, 5, 6, 7, 12]
>>> result = [x + 2 for x in M]
>>> result
[4, 7, 8, 9, 14]
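
To make the equivalence concrete, the sketch below builds the same list both ways and compares the results. The function transform is a hypothetical stand-in (it simply squares its argument), and the names loop_result and comp_result are introduced here only for the comparison.

```python
def transform(x):
    # Hypothetical transformation, chosen only for illustration.
    return x * x

sequence = [1, 2, 3, 4]

# Explicit loop: start with an empty list and append one result per item.
loop_result = []
for x in sequence:
    loop_result.append(transform(x))

# Comprehension: the same computation as a single expression.
comp_result = [transform(x) for x in sequence]

print(loop_result == comp_result)  # True
print(comp_result)                 # [1, 4, 9, 16]
```

Either form gives the same list; the comprehension just removes the bookkeeping of creating the empty list and appending to it.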

4.5.2. Conditional comprehensions

Python also lets you add a condition to a list comprehension, so that only the members satisfying the condition contribute to the result. We introduced filtering with for loops earlier; the simplest cases of filtering can be written as list comprehensions. Recall the following for loop, which filters the non-positive numbers out of a list of numbers:

>>> nums = [0.3, -.15, 18, -7, 212.1, 0]
>>> result = []
>>> for x in nums:
...     if x > 0:
...         result.append(x)
...
>>> result
[0.3, 18, 212.1]

As a list comprehension this becomes:

>>> result = [x for x in nums if x > 0]
>>> result
[0.3, 18, 212.1]

As a second example, let’s add 2 to only the even members of the list M used above:

>>> M = [2, 5, 6, 7, 12]
>>> result = [x + 2 for x in M if x % 2 == 0]
>>> result
[4, 8, 14]
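
Written as an explicit loop, this second example corresponds roughly to the sketch below. The if clause at the end of the comprehension plays the role of the if statement inside the loop, and the expression at the front (x + 2) is what gets appended. The names loop_result and comp_result are introduced here only for the comparison.

```python
M = [2, 5, 6, 7, 12]

# Explicit loop: test each member, then transform the ones that pass.
loop_result = []
for x in M:
    if x % 2 == 0:               # keep only the even members
        loop_result.append(x + 2)

# The same filter-then-transform step as a comprehension.
comp_result = [x + 2 for x in M if x % 2 == 0]

print(loop_result)                 # [4, 8, 14]
print(loop_result == comp_result)  # True
```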

4.5.3. Nested comprehensions

You can get the effect of nested loops with list comprehensions too.

Let’s return to the example of representing Sudoku squares. We computed the possible squares in a Sudoku puzzle above with a nested for loop. A list comprehension with two for clauses does the same job more concisely; the first for clause acts as the outer loop and the second as the inner loop:

rows    = 'ABCDEFGHI'
cols    = '123456789'
squares = [r+c for r in rows for c in cols]
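
As a check on how the two for clauses are ordered, the following sketch spells out the nested loop that the comprehension stands for and confirms that both produce the same 81 squares. The name squares_loop is introduced here only for the comparison.

```python
rows = 'ABCDEFGHI'
cols = '123456789'

# Nested loops: the first for clause of the comprehension is the outer loop,
# the second for clause is the inner loop.
squares_loop = []
for r in rows:
    for c in cols:
        squares_loop.append(r + c)

squares = [r + c for r in rows for c in cols]

print(squares == squares_loop)  # True
print(len(squares))             # 81
print(squares[:3])              # ['A1', 'A2', 'A3']
print(squares[-1])              # I9
```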

4.5.4. Examples from Norvig’s sudoku solver

For the next example, let’s continue considering Sudoku puzzles.

A unit of a square S is a collection of nine squares containing S (a column, a row, or a box) in which no two squares may hold the same value. The peers of S are the squares in its units, excluding S itself. Every square S has exactly 3 units and 20 peers. For example, here are the units of C2:

Col: A2 B2 C2 D2 E2 F2 G2 H2 I2

Row: C1 C2 C3 C4 C5 C6 C7 C8 C9

Box:

A1 A2 A3
B1 B2 B3
C1 C2 C3

The set of peers of C2 is the union of these three units, minus C2 itself. Let’s collect all the units. From them we’ll build a dictionary mapping each square to its units.

Column units:

rows     = 'ABCDEFGHI'
cols     = '123456789'
col_units = [[r+c for r in rows] for c in cols]

This gives:

>>> col_units
[['A1', 'B1', 'C1', 'D1', 'E1', 'F1', 'G1', 'H1', 'I1'],
 ['A2', 'B2', 'C2', 'D2', 'E2', 'F2', 'G2', 'H2', 'I2'],
 ['A3', 'B3', 'C3', 'D3', 'E3', 'F3', 'G3', 'H3', 'I3'],
 ['A4', 'B4', 'C4', 'D4', 'E4', 'F4', 'G4', 'H4', 'I4'],
 ['A5', 'B5', 'C5', 'D5', 'E5', 'F5', 'G5', 'H5', 'I5'],
 ['A6', 'B6', 'C6', 'D6', 'E6', 'F6', 'G6', 'H6', 'I6'],
 ['A7', 'B7', 'C7', 'D7', 'E7', 'F7', 'G7', 'H7', 'I7'],
 ['A8', 'B8', 'C8', 'D8', 'E8', 'F8', 'G8', 'H8', 'I8'],
 ['A9', 'B9', 'C9', 'D9', 'E9', 'F9', 'G9', 'H9', 'I9']]

Row units:

row_units = [[r+c for c in cols] for r in rows]
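
The placement of the brackets matters here. The sketch below uses a deliberately shortened grid (the three-letter rows and two-digit cols are illustrative values, not the real Sudoku strings) to contrast a single comprehension with two for clauses, which yields one flat list, with a comprehension nested inside another, which yields a list of lists. The names flat and nested are introduced only for this comparison.

```python
rows = 'ABC'   # shortened for illustration only
cols = '12'    # shortened for illustration only

# Two for clauses in one comprehension: a single flat list.
flat = [r + c for r in rows for c in cols]

# A comprehension inside a comprehension: one inner list per row.
nested = [[r + c for c in cols] for r in rows]

print(flat)    # ['A1', 'A2', 'B1', 'B2', 'C1', 'C2']
print(nested)  # [['A1', 'A2'], ['B1', 'B2'], ['C1', 'C2']]
```

The unit lists need the second form, because each unit has to remain a separate group of squares.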

The box units are the trickiest. You want a nested iteration that doesn't involve all the rows and all the columns, just a group of three of each at a time. The code looks like this, but it is definitely getting hard to read:

box_units = [[l+n for l in lets for n in nums]
             for lets in ('ABC','DEF','GHI')
             for nums in ('123','456','789')]

The way to read this is to start with one value for the loop variable lets, and pair it with one value for the loop variable nums. The first value for lets will be 'ABC'. The first value for nums will be '123'. We then execute:

[l+n for l in lets for n in nums]

with those values for lets and nums, that is:

[l+n for l in 'ABC' for n in '123']

This produces:

['A1', 'A2', 'A3', 'B1', 'B2', 'B3', 'C1', 'C2', 'C3'],

which is the first valid box unit. Each pairing of a lets value with a nums value gives us a different valid box unit. With 9 possible pairings of lets and nums values, we get all 9 box units, that is:

[['A1', 'A2', 'A3', 'B1', 'B2', 'B3', 'C1', 'C2', 'C3'],
 ['A4', 'A5', 'A6', 'B4', 'B5', 'B6', 'C4', 'C5', 'C6'],
 ['A7', 'A8', 'A9', 'B7', 'B8', 'B9', 'C7', 'C8', 'C9'],
 ['D1', 'D2', 'D3', 'E1', 'E2', 'E3', 'F1', 'F2', 'F3'],
 ['D4', 'D5', 'D6', 'E4', 'E5', 'E6', 'F4', 'F5', 'F6'],
 ['D7', 'D8', 'D9', 'E7', 'E8', 'E9', 'F7', 'F8', 'F9'],
 ['G1', 'G2', 'G3', 'H1', 'H2', 'H3', 'I1', 'I2', 'I3'],
 ['G4', 'G5', 'G6', 'H4', 'H5', 'H6', 'I4', 'I5', 'I6'],
 ['G7', 'G8', 'G9', 'H7', 'H8', 'H9', 'I7', 'I8', 'I9']]
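
One way to convince yourself of this reading is to unpack the comprehension into four explicit loops. The sketch below does that and checks that the result matches; the names box_units_loop and box are introduced here only for the comparison.

```python
# The two outer for clauses pick a group of row letters and a group of digits.
# The inner comprehension then builds one box from that pairing.
box_units_loop = []
for lets in ('ABC', 'DEF', 'GHI'):
    for nums in ('123', '456', '789'):
        box = []
        for l in lets:
            for n in nums:
                box.append(l + n)
        box_units_loop.append(box)

box_units = [[l + n for l in lets for n in nums]
             for lets in ('ABC', 'DEF', 'GHI')
             for nums in ('123', '456', '789')]

print(box_units == box_units_loop)  # True
print(len(box_units))               # 9
print(box_units[1])
# ['A4', 'A5', 'A6', 'B4', 'B5', 'B6', 'C4', 'C5', 'C6']
```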

We will rewrite the box_units comprehension once we have introduced functions, because that will make it much more readable.

For now we construct the list of all units:

unitlist = col_units + row_units + box_units

Observe that each unit is a container of squares. To associate a square S with its units, we just loop through all the units to find those that contain S. The following code snippet builds a dictionary associating each square with its 3 units:

units = dict([(s, [u for u in unitlist if s in u])
              for s in squares])

Recall that dict can be used to build a dictionary from a list of (key, value) pairs. The following list comprehension builds the list of (square, units) pairs passed to dict above:

[(s, [u for u in unitlist if s in u]) for s in squares]

Note how this works: we build a list of (square, units) pairs in which each square is paired with the list of every unit it occurs in.
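
Putting the pieces of this section together, the following sketch rebuilds everything from scratch and checks the claim that every square has exactly 3 units and 20 peers. The peers dictionary is an addition of this sketch: it is built directly from the definition of peers given earlier and is not code taken from the text above.

```python
rows = 'ABCDEFGHI'
cols = '123456789'
squares = [r + c for r in rows for c in cols]

col_units = [[r + c for r in rows] for c in cols]
row_units = [[r + c for c in cols] for r in rows]
box_units = [[l + n for l in lets for n in nums]
             for lets in ('ABC', 'DEF', 'GHI')
             for nums in ('123', '456', '789')]
unitlist = col_units + row_units + box_units

# Map each square to the list of units that contain it.
units = dict([(s, [u for u in unitlist if s in u]) for s in squares])

# Assumption of this sketch: the peers of s are all squares in any of its
# units, except s itself. set() removes squares shared between units.
peers = dict([(s, set([sq for u in units[s] for sq in u if sq != s]))
              for s in squares])

print(len(unitlist))                              # 27
print(all(len(units[s]) == 3 for s in squares))   # True
print(all(len(peers[s]) == 20 for s in squares))  # True
print(units['C2'][0])
# ['A2', 'B2', 'C2', 'D2', 'E2', 'F2', 'G2', 'H2', 'I2']
```

The count of 20 comes from 8 other squares in the column, 8 in the row, and 4 squares in the box that lie in neither that column nor that row.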

4.5.5. Set comprehension

Consider another simple example of list comprehension:

>>> L = [x for x in 'abracadabra' if x < 'e']
>>> L
['a', 'b', 'a', 'c', 'a', 'd', 'a', 'b', 'a']

This collects every character in the string “abracadabra” that comes before “e” in the alphabet, duplicates included, so the length of L tells us how many such characters there are. But suppose we only want to know which of the characters “abcd” occur in a string, and we don’t care how many instances there are. Then we might remove duplicates. We could do this by turning L into a set:

>>> S = set(L)
>>> S
set(['a', 'c', 'b', 'd'])

But Python offers a more efficient and more direct route: set comprehension. Just write a comprehension expression using { } to specify that a set is desired:

>>> S = {x for x in 'abracadabra' if x < 'e'}
>>> S
set(['a', 'c', 'b', 'd'])
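
As a quick check, the sketch below confirms that the set comprehension and the set built from the list hold the same members. Because sets are unordered, the order in which a set is displayed can vary; sorted is used here only to get a stable display. The variable name word is introduced for this example.

```python
word = 'abracadabra'

L = [x for x in word if x < 'e']   # list: duplicates are kept
S = {x for x in word if x < 'e'}   # set: duplicates are dropped

print(S == set(L))     # True
print(sorted(S))       # ['a', 'b', 'c', 'd']
print(len(L), len(S))  # 9 4
```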