One reason we store items in a
list
is so we can sort them.
sort
method
of class
list
sorted
built-in
function
lambda
keyword in the
Python
Language Reference
By default
(i.e., in the absence of a
key function),
the
sort
method
sorts
int
s
and
floats
s
into increasing numeric order.
That’s because the
<
operator,
when placed between two numbers
(a < b
),
returns
True
if the two numbers are in increasing numeric order.
476 1066 1492 1776 1812 1945 1984 2001 2019
reverse
is a
keyword
argument
of the
sort
function,
just as
sep
and
end
are keyword arguments of the
print
function.
See
Ascending
and Descending.
years.sort(reverse = True)
2019 2001 1984 1945 1812 1776 1492 1066 476
score
function returns the distance between 1900 and the given number
n
;
the
absolute value
ensures that this distance is never negative.
Each number is given a score by being passed to
score
.
The numbers are then sorted in order of their scores.
Since the scoring function returns an
int
,
the scores are
int
s.
And since the scores are
int
s,
the items in the
list
are sorted in increasing numeric order of their scores.
My
score
function is an example of a
key function.
def score(year):
"Return the distance between the year and 1900."
return abs(year - 1900)
for year in years:
print(f"The score of {year:4} is {score(year):4}.")
print() #Skip a line.
years.sort(key = score)
The score of 1492 is 408. The score of 1776 is 124. The score of 1066 is 834. The score of 1812 is 88. The score of 1984 is 84. The score of 2001 is 101. The score of 476 is 1424. The score of 2019 is 119. The score of 1945 is 45. 1945 1984 1812 2001 2019 1776 1492 1066 476
score
function contains only one statement (the
return
statement)
and is called from only one place (the
sort
function).
We can therefore replace it with a
lambda function
(i.e., a nameless function) that returns the same thing.
You don’t have to give a name to a lambda function,
and, in a lambda function,
you don’t even have to write the words
def
and
return
.
#Sort in order of increasing distance from 1900. years.sort(key = lambda year: abs(year - 1900))
1945 1984 1812 2001 2019 1776 1492 1066 476
#Sort in order of decreasing distance from 1900. years.sort(key = lambda year: abs(year - 1900), reverse = True)
476 1066 1492 1776 2019 2001 1812 1984 1945
str
ing
before we can call
count
.
That’s because a number does not have a
count
function, but a
str
does.
#Sort in order of increasing number of zeros. years.sort(key = lambda year: str(year).count("0"))
1492 1776 1812 1984 476 1945 1066 2019 2001
#Sort in order of the sum of the digits. #Create a list of 4 or 3 ints with a list comprehension. years.sort(key = lambda year: sum([int(c) for c in str(year)]))
2001 (sum is 3) 1812 2019 1066 1492 476 1945 1776 1984 (sum is 22)
#Sort in order of the rightmost digit. years.sort(key = lambda year: year % 10)
2001 1492 1812 1984 1945 1776 1066 476 2019
#Put the even numbers first, the odd numbers last. #Note that within each of these two groups, the numbers are not necessarily in #increasing order. We'll fix this in the next example. years.sort(key = lambda year: year % 2)
1492 1776 1066 1812 1984 476 2001 2019 1945
Each score in the following example is a
list
of two items.
A pair of two-item lists are compared by comparing their first items,
breaking ties by comparing their second items.
See the second bullet in
“Lexicographical comparison”
in
Value
comparisons.
Later, we’ll change the
[
square
brackets]
to
(
parentheses)
,
i.e., the score will be a two-item
tuple
instead of a two-item
list
.
#Put the even numbers first, the odd numbers last. #Within each of these two groups, sort the numbers into increasing order. years.sort(key = lambda year: [year % 2, year])
476 1066 1492 1776 1812 1984 1945 2001 2019
#Put the even numbers first, the odd numbers last. #Within each of these two groups, sort the numbers into decreasing order. years.sort(key = lambda year: [year % 2, -year])
1984 1812 1776 1492 1066 476 2019 2001 1945
By default
(i.e., in the absence of a
key function),
the
sort
method
sorts
str
ings
into case-sensitive alphabetical order.
That’s because the
<
operator,
when placed between two
str
ings
(a < b
),
returns
True
if the two
str
ings
are in case-sensitive alphabetical order.
dog dragon hare horse monkey ox pig rat rooster sheep snake tiger
str
ings
in reverse alphabetical order.
animals.sort(reverse = True)
tiger snake sheep rooster rat pig ox monkey horse hare dragon dog
str
ings
in order of increasing (or decreasing) length.
Since the scoring function returns an
int
,
the scores are
int
s.
And since the scores are
int
s,
the items in the
list
are sorted in increasing numeric order of their scores.
animals.sort(key = lambda animal: len(animal)) #unnecessarily complicated
#Simpler way to do the same thing: #use the built-in function len directly as the scoring function. animals.sort(key = len)
ox dog pig rat hare tiger snake horse sheep monkey dragon rooster
animals.sort(key = len, reverse = True)
rooster monkey dragon tiger snake horse sheep hare dog pig rat ox
words = [ "boy", "Cow", "Apple", "Boy", "apple", "cow" ] words.sort() #Alphabetical order is case-sensitive by default. for word in words: print(word)
Apple Boy Cow apple boy cow
"A"
,
"B"
,
"C"
,
"D"
,
"F"
.
Let’s sort the strings in case-insensitive alphabetical order.
Since this scoring function returns a
str
ing,
the scores are
str
ings.
And since the scores are
str
ings,
the items in the
list
are sorted in case-sensitive alphabetical order of their scores.
(Since the scores are all one case (lower),
this case-sensitivity doesn’t matter.
In the following code,
upper
would work just as well as
lower
.)
words = [ "boy", "Cow", "Apple", "Boy", "apple", "cow" ] words.sort(key = lambda word: word.lower()) #case-insensitive alphabetical order #Another way to make a lambda function that simply calls one method. Remember to import operator. #words.sort(key = operator.methodcaller("lower")) for word in words: print(word)
Apple apple boy Boy Cow cow
In the above red and yellow groups, the capitalized word precedes the uncapitalized one. But in the above orange group, the the uncapitalized word precedes the capitalized one. We’ll fix this in the next exercise.
words = [ "boy", "Cow", "Apple", "Boy", "apple", "cow" ] words.sort(key = lambda word: [word.lower(), word]) for word in words: print(word)
Within the red group, the lines are now in case-insensitive alphabetical order (i.e., uppercase first). Ditto for the other groups.
Apple apple Boy boy Cow cow
list
of two items.
A pair of two-item lists are compared by comparing their first items,
breaking ties by comparing their second items.
def score(animal): return [len(animal), animal.count("e")] animals.sort(key = score)
animals.sort(key = lambda animal: [len(animal), animal.count("e")])
ox dog pig rat hare tiger snake horse sheep dragon monkey rooster
Sort by length. Break ties by number of e’s. Break further ties by alphabetical order.
animals.sort(key = lambda animal: [len(animal), animal.count("e"), animal])
ox dog pig rat hare horse snake tiger sheep dragon monkey rooster
persons = [ "John Adam Doe", "John Barry Smith", "John Adam Smith", "Jane Ann Doe", "Jane Ann Smith", "John Smith" ] def score(person): names = person.split() if len(names) == 2: #no middle name names.insert(1, "") #Give them an empty middle name. return [names[2], names[0], names[1]] #last, first, middle persons.sort(key = score) for i, person in enumerate(persons, start = 1): print(f"{i:2} {person}")
1 Jane Ann Doe 2 John Adam Doe 3 Jane Ann Smith 4 John Smith 5 John Adam Smith 6 John Barry Smith
list
of
str
ings
in alphabetical order.
Print the original and new orders side by side.
We must make a copy of the
list
and then
sort
the copy.
But the following code does not make a copy of the
list
.
animals = [ "monkey", # 0 "rooster", # 1 "dog", # 2 "pig", # 3 "rat", # 4 "ox", # 5 "tiger", # 6 "hare", # 7 "dragon", # 8 "snake", # 9 "horse", #10 "sheep" #11 ] sortedAnimals = animals #Try (unsuccessfully) to make a copy of the list. sortedAnimals.sort() for animal, sortedAnimal in zip(animals, sortedAnimals): #Right pad the left column with spaces. print(f"{animal:7} {sortedAnimal}")
dog dog dragon dragon hare hare horse horse monkey monkey ox ox pig pig rat rat rooster rooster sheep sheep snake snake tiger tiger
Here’s what went wrong.
The Python script has only one list,
and originally we could
refer to (i.e., access) that list only by using the variable
animals
.
Instead of making a copy of that
list
,
the above
assignment
statement
sortedAnimals = animals
merely made it possible to refer to that
list
by using either one
of the variables
sortedAnimals
or
animals
.
But there is still only one
list
.
So how can we create a copy of a
list
?
If a variable contains a plain old
number,
the number can be copied simply by writing an
equal
sign:
i = 10 j = i #Create a copy of the number.
If a variable refers to a
list
,
we have to do more than just write an equal sign in order to copy the
list
.
We also have to call the
copy
function that belongs to the
list
.
animals = [ "monkey", # 0 "rooster", # 1 "dog", # 2 "pig", # 3 "rat", # 4 "ox", # 5 "tiger", # 6 "hare", # 7 "dragon", # 8 "snake", # 9 "horse", #10 "sheep" #11 ] #Create a copy of the list. sortedAnimals = animals.copy() #or sortedAnimals = animals[:] sortedAnimals.sort() #Now we can sort without disturbing the original list. for animal, sortedAnimal in zip(animals, sortedAnimals): #Right pad the left column with spaces. print(f"{animal:7} {sortedAnimal}")
monkey dog rooster dragon dog hare pig horse rat monkey ox ox tiger pig hare rat dragon rooster snake sheep horse snake sheep tiger
list
at the same time.
The following call to the
sorted
built-in function
creates a new
list
named
sortedAnimals
and sorts this new
list
,
without making any change to the original
list
animals
.
Unlike the
sort
function,
sorted
is an orphan: it does not belong to anything.
In other words,
there is no dot in front of its name.
animals = [ "monkey", # 0 "rooster", # 1 "dog", # 2 "pig", # 3 "rat", # 4 "ox", # 5 "tiger", # 6 "hare", # 7 "dragon", # 8 "snake", # 9 "horse", #10 "sheep" #11 ] sortedAnimals = sorted(animals) assert len(sortedAnimals) == len(animals) #just to explain what's going on for animal, sortedAnimal in zip(animals, sortedAnimals): print(f"{animal:7} {sortedAnimal}")
monkey dog rooster dragon dog hare pig horse rat monkey ox ox tiger pig hare rat dragon rooster snake sheep horse snake sheep tiger
Some of the cards are single-digit numbers,
some are double-digit,
and some are non-numeric,
so I thought the simplest way to write the
score
function was to put all possible cards into the
list
named
increasing
.
If
index
doesn’t find what it’s looking for,
it will raise a
ValueError
exception,
which means we should now put our call to
sort
between a
try
and an
except
.
(Later we’ll write the
score
function more simply with a
dict
.)
import sys cards = [ "3", "J", "10", "7", "Q" ] #a list of 13 strings increasing = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"] def score(card): "Return an int in the range 0 to 12 inclusive, or raise ValueError." return increasing.index(card) try: cards.sort(key = score) except ValueError as error: print(error, file = sys.stderr) sys.exit(1) for card in cards: print(card)
3 7 10 J Q
If the
list
of
cards
contains an illegal string such as
"X"
,
sort
will raise a
ValueError
because
index
will raise a
ValueError
.
The
ValueError
carries information about what went wrong.
This information will be copied into the variable
error
and printed by the
print
under the keyword
except
:
'X' is not in listA simpler way to do the same thing:
cards.sort(key = lambda s: increasing.index(s))
days
in chronological order,
starting with Sunday and ending with Saturday.
days = [ "Wednesday", "Monday", "Tuesday", "Thursday", "Saturday", "Monday" ] chronologicalOrder = [ "Sunday", #0 "Monday", #1 "Tuesday", #2 "Wednesday", #3 "Thursday", #4 "Friday", #5 "Saturday" #6 ] days.sort(key = lambda day: chronologicalOrder.index(day)) for day in days: print(day)
Monday Monday Tuesday Wednesday Thursday Saturday
states
in alphabetical order,
except that “Miscellaneous”
comes last.
How could you make it come first?
states = [ "New YorK", "New Jersey", "Connecticut", "Miscellaneous", "Pennsylvania", "Massachusetts" ] states.sort(key = lambda state: "ZZZZZ" if state == "Miscellaneous" else state) for state in states: print(state)
Connecticut Massachusetts New Jersey New YorK Pennsylvania Miscellaneous
years = [ "1776", "1492", "1968", "1945", "2019", "1066" ] years.sort() #alphabetical sort, because years is a list of strings. for year in years: print(year)
1066 1492 1776 1945 1968 2019
The trick fails if the integers don’t all have the same length:
years = [ "1776", "1492", "1968", "476", "1945", "2019", "1066" ] years.sort() #alphabetical sort, because years is a list of strings. for year in years: print(year)
1066 1492 1776 1945 1968 2019 476
Two ways to get the correct order if the integers do not all have the same number of digits:
years = [ "1776", "1492", "1968", "476", "1945", "2019", "1066" ] years.sort(key = lambda year: f"{year:>4}") #Left-pad the shorter strings with spaces. years.sort(key = lambda year: int(year)) #Perform a numerical sort. for year in years: print(year)
476 1066 1492 1776 1945 1968 2019
list
of addresses by increasing zip code.
Assume that the fourth field on each line is always the zip code,
and assume that the zip code is always exactly five digits.
Since all the zip codes are
str
ings
with the same number of characters,
alphabetical order is the same as numerical order.
addresses = [ "350 Fifth Avenue,New York,NY,10118", #Empire State Building "1600 Pennsylvania Avenue,Washington,DC,20500", #White House "89 East 42nd Street,New York,NY,10017", #Grand Central Terminal "405 Lexington Avenue,New York,NY,10174", #Chrysler Building "1000 Fifth Avenue,New York,NY,10028", #Metropolitan Museum of Art "1491 Mill Run Road,Mill Run,PA,15464" #Fallingwater ] def score(address): "Return the address's zip code." fields = address.split(",") #fields is a list of 4 strings return fields[3] #fields[3] is a string of 5 characters that are digits addresses.sort(key = score) for address in addresses: print(address)
89 East 42nd Street,New York,NY,10017 1000 Fifth Avenue,New York,NY,10028 350 Fifth Avenue,New York,NY,10118 405 Lexington Avenue,New York,NY,10174 1491 Mill Run Road,Mill Run,PA,15464 1600 Pennsylvania Avenue,Washington,DC,20500A simpler way to do the same thing:
addresses.sort(key = lambda address: address.split(",")[3]) #by zip code