3.4.2. Strings
We have already been introduced to strings as a basic data type. Now we take a look at them again from a different point of view. Strings are containers. This means you can look at their insides and do things like check whether the first character is capitalized and whether the third character is “e”.
3.4.2.1. Indexing strings, string slices
To get at the inner components of strings, Python uses the same syntax and operators as it does for lists. The Pythonic conception is that both lists and strings belong to a ‘super’ data type, sequences. Sequence types are containers that contain elements in a particular order, so indexing by number makes sense for all sequences:
>>> X = 'dogs'
>>> X[0]
'd'
>>> X[1]
'o'
>>> X[-1]
's'
The following raises an IndexError, as it would with a 4-element list:
>>> X[4]
...
IndexError: string index out of range
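As a small sketch of the indexing rules above (the variable names are only illustrative), a negative index i picks out the same character as len(X) + i, and an out-of-range index can be caught like any other exception:

```python
X = 'dogs'

# A negative index counts from the right end: X[-1] is X[len(X) - 1].
last = X[-1]
same_last = X[len(X) - 1]
print(last, same_last)        # s s

# Indexing past the end raises IndexError, exactly as with a list.
try:
    X[4]
    caught = False
except IndexError as err:
    caught = True
    print('IndexError:', err)
```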
Strings can also be one element long:
>>> Y = 'd'
Note
Unlike C, there is no special type for characters in Python. Characters are just one-element strings.
And they can be empty, just as lists can:
>>> Z = ''
As with lists, you can check the contents of strings. So:
>>> 'd' in X
True
>>> 'do' in X
True
>>> 'dg' in X
False
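One difference from lists is worth a quick sketch. For strings, in tests for a contiguous substring of any length, whereas for lists it tests for a single element. The list letters below is a hypothetical stand-in built from the same characters:

```python
X = 'dogs'
letters = list(X)            # ['d', 'o', 'g', 's']

# For a string, `in` looks for a contiguous substring.
print('do' in X)             # True
print('dg' in X)             # False: 'd' and 'g' are not adjacent

# For a list, `in` looks for one element equal to the left operand.
print('d' in letters)        # True
print('do' in letters)       # False: no single element equals 'do'
```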
Python also provides easy access to subsequences of a string, just as it does for lists. The following examples illustrate how to make such references:
>>> X[0:2] # string of 1st and 2nd characters
'do'
>>> X[:-1] # string excluding last character
'dog'
References to subsequences of a string are called slices.
Guido van Rossum says: “The best way to remember how slices work is to think of the indices as pointing between characters, with the left edge of the first character numbered 0. Then the right edge of the last character of a string of n characters has index n”:
 +---+---+---+---+---+
 | H | e | l | p | A |
 +---+---+---+---+---+
 0   1   2   3   4   5
-5  -4  -3  -2  -1
The first row of numbers gives the position of the indices 0…5 in the string; the second row gives the corresponding negative indices. The slice from i to j consists of all characters between the edges labeled i and j, respectively.
For nonnegative indices, the length of a slice is the difference of the indices, if both are within bounds. For example, with word = 'HelpA', the length of word[1:3] is 2.
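The edge-numbering picture can be checked directly. The following is a minimal sketch using the string from the diagram, stored in a variable named word for illustration. It also shows one behavior not covered above: unlike plain indexing, a slice that runs out of bounds is not an error.

```python
word = 'HelpA'

# Characters between edge 1 and edge 3.
middle = word[1:3]
print(middle)                # el
print(len(middle))           # 2, i.e. 3 - 1

# Omitted indices default to the two ends of the string,
# so these two slices always add up to the whole string.
print(word[:2] + word[2:])   # HelpA

# Negative indices count edges from the right.
print(word[-2:])             # pA

# An out-of-range slice is cut off at the end of the string
# instead of raising IndexError.
print(word[3:100])           # pA
print(repr(word[10:]))       # ''
```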
The built-in function len() returns the length of a string:
>>> s = 'supercalifragilisticexpialidocious'
>>> len(s)
34
Strings can also be concatenated into longer sequences, just as lists can:
>>> X + Y
'dogsd'
Using the name of the type as a function gives us a way of MAKING strings, just as it did with lists:
>>> One = str(1)
One is no longer an int:
>>> One
'1'
>>> I = int(str(1))
>>> I
1
And as with lists, calling the type with no arguments produces the empty string:
>>> Empty = str()
>>> Empty
''
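A common reason to call str() is concatenation, since + does not mix strings and numbers. The sketch below uses made-up variable names (count, label, total) to show the conversion in both directions:

```python
count = 3

# 'dogs: ' + count would raise TypeError, because + does not mix str and int.
label = 'dogs: ' + str(count)
print(label)                  # dogs: 3

# int() goes the other way, turning a string of digits back into a number.
total = int('3') + 4
print(total)                  # 7

# str() works on other types as well.
print(str(2.5))               # 2.5
print(str([1, 2]))            # [1, 2]
```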
There is one thing that can be done with lists that canNOT be done with strings, namely assigning a value to an element:
>>> 'spin'[2] = 'a'
...
TypeError: 'str' object does not support item assignment
This can be fixed by avoiding the assignment, or by making it on a mutable sequence, such as a list, that contains the same information.
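Both workarounds can be sketched in a few lines. The variable names here are illustrative only. The first option builds a new string from slices; the second copies the characters into a list, assigns there, and glues the result back together with join():

```python
word = 'spin'

# Option 1: avoid the assignment by building a new string from slices.
spun = word[:2] + 'a' + word[3:]
print(spun)                   # span

# Option 2: copy the characters into a list, which is mutable,
# assign there, then join the pieces back into a string.
chars = list(word)            # ['s', 'p', 'i', 'n']
chars[2] = 'a'
spun_again = ''.join(chars)
print(spun_again)             # span

# The original string is untouched in both cases.
print(word)                   # spin
```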
See also
Section Mutability (advanced).
3.4.2.2. String methods
In all the following examples, S is a string. This is just a sample. See the official Python docs for the complete list of string methods. Or just type help(str) at the Python prompt!
S.capitalize()
Return a string just like S, except that its first character is capitalized and the rest are lowercased. If S already has that form, the result is identical to S.
S.count(x)
Return the number of times x appears in the string S.
S.index(x)
Return the index in S of the first substring that is identical to x. It is an error (a ValueError) if there is no such substring.
S.find(t)
Return index of first instance of t in S, or -1 if not found
S.rfind(t)
Return index of last instance of t in S, or -1 if not found
S.join(Seq)
Combine the strings of Seq into a single string using S as the glue:
>>> ' '.join(['See', 'John', 'run'])
'See John run'
S.replace(x,y)
Return a string in which every instance of the substring x in S is replaced with y:
>>> X = 'abracadabra'
>>> X.replace('dab','bad')
'abracabadra'
>>> X.replace('a','b')
'bbrbcbdbbrb'
S.split(t)
Split S into a list wherever t is found. If t is not supplied, split wherever whitespace is found.
S.splitlines()
Split S into a list of strings, one per line.
S.strip()
Return a copy of S without leading or trailing whitespace.
S.title()
Return a string just like S in which all words are capitalized:
>>> 'los anGeles'.title()
'Los Angeles'
S.istitle()
Return True if every word in S starts with a capital letter and is otherwise lowercase. Otherwise, return False:
>>> 'los anGeles'.istitle()
False
>>> 'Los AnGeles'.istitle()
False
>>> 'Los Angeles'.istitle()
True
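Several of the methods above are easiest to understand side by side. The following is a short sketch on made-up sample strings (S and raw are illustrative, not from the text), contrasting find() with index() and showing split(), join(), strip() and splitlines() working together:

```python
S = 'the cat sat on the mat'

# find() and index() agree when the substring is present...
print(S.find('the'), S.index('the'))    # 0 0
print(S.rfind('the'))                   # 15
print(S.count('at'))                    # 3

# ...but differ when it is absent: find() returns -1, index() raises.
print(S.find('dog'))                    # -1
try:
    S.index('dog')
    raised = False
except ValueError as err:
    raised = True
    print('ValueError:', err)

# Because S uses single spaces, joining the pieces with ' ' rebuilds it.
words = S.split()
print(words)          # ['the', 'cat', 'sat', 'on', 'the', 'mat']
print(' '.join(words) == S)             # True

# strip() and splitlines() are handy for cleaning up raw text.
raw = '  first line\nsecond line  \n'
lines = [line.strip() for line in raw.splitlines()]
print(lines)                            # ['first line', 'second line']

# Here, the result of title() passes the istitle() test.
titled = S.title()
print(titled)                           # The Cat Sat On The Mat
print(titled.istitle())                 # True
```

Which of find() and index() to use is a matter of taste: find() suits code that checks the result with an if, while index() suits code where a missing substring really is an error.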