In this section we will review some of the key Python data structures and describe suitable uses for them. This is not meant to be a complete introduction to Python, but only a refresher of some key ideas, with an emphasis on how information may be stored in Python and the consequences of different storage decisions.
Regardless of the programming language of choice, there are some common types used when representing data. They come in two flavors:
Chances are you are already quite familiar with lists.
We can go through the elements of a list in order with a for-in loop. Or we can access a specific location if we know its index.
Create a list with a literal like [2, 5, 6] or by starting with an empty list like list().
Use x[2] to get at a value in the list (indices start at 0).
Use append to add an element to the end of a list.
As a simple example, we could have a list containing the names of all teams in our conference (data obtained from http://www.heartlandconf.org/):
teams = ["HANOVER", "MOUNT ST. JOSEPH", "ROSE-HULMAN",
         "TRANSYLVANIA", "ANDERSON", "BLUFFTON",
         "FRANKLIN", "EARLHAM", "MANCHESTER", "DEFIANCE"]
teams[2]     # Rose-Hulman
teams[2:4]   # A "slice": entries 2 and 3 (the end index 4 is excluded)
for team in teams:
    print(team)
# Creating a new empty list then copying the elements over
# This is meant as an example. Do NOT copy lists this way.
# Use teams.copy() instead.
teamsCopy = list()
for team in teams:
    teamsCopy.append(team)
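The list operations described above can be tried out in a short, self-contained sketch. The list scores and its values are made up for illustration and are not part of the conference data.

```python
# Build a list from a literal, then grow it with append.
scores = [2, 5, 6]
scores.append(9)           # scores is now [2, 5, 6, 9]

first = scores[0]          # indices start at 0, so this is 2
middle = scores[1:3]       # slice: indices 1 and 2; index 3 is excluded

# copy() creates a new list, so changing the copy leaves the original alone.
scores_copy = scores.copy()
scores_copy.append(100)

print(scores)
print(scores_copy)
print(middle)
```

Note that a slice never includes the element at its end index, which is why scores[1:3] has two elements rather than three.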
You may not have used tuples before, but they are quite useful for putting together heterogeneous but coupled elements.
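Create a tuple by listing its values in parentheses, like t=("HANOVER", 14, 4), and access an element by its index, like t[0].
t = ("HANOVER", 14, 4)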
t[0] # <-- "HANOVER"
(name, wins, losses) = t # multiple assignment
name, wins, losses = t # also works
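One consequence of storing data in a tuple rather than a list is that a tuple cannot be changed once it is created. The sketch below illustrates this under the assumption that we want to record one more win for a team; the variable names record and updated are chosen only for this example.

```python
record = ("HANOVER", 14, 4)

# Multiple assignment unpacks the tuple into separate names.
name, wins, losses = record
games_played = wins + losses

# Tuples are immutable: assigning to an element raises a TypeError.
try:
    record[1] = 15
except TypeError as err:
    print("Cannot modify a tuple:", err)

# To "change" a value we build a new tuple instead.
updated = (name, wins + 1, losses)
print(updated)
```

If the values need to change often, a list or a dictionary may be a better fit than a tuple.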
As a longer example, we could combine lists and tuples. We may have a list of tuples and process it in a for-in loop with multiple assignment:
results = [
    ("HANOVER", 14, 4), ("MOUNT ST. JOSEPH", 14, 4), ("ROSE-HULMAN", 14, 4),
    ("TRANSYLVANIA", 12, 6), ("ANDERSON", 9, 9), ("BLUFFTON", 8, 10),
    ("FRANKLIN", 7, 11), ("EARLHAM", 6, 12), ("MANCHESTER", 4, 14),
    ("DEFIANCE", 2, 16)
]
for team, wins, losses in results:
    print("%s: %d Wins, %d Losses" % (team, wins, losses))
# Alternative: pass each tuple directly to the format string.
# The loop variable is called "result" to avoid shadowing the built-in tuple type.
for result in results:
    print("%s: %d Wins, %d Losses" % result)
Notice the expression "%s: %d Wins, %d Losses" % result in the call to print. This is the use of the percent operator for string formatting. It has the general form:
string % values
Here values is a tuple whose length matches the number of placeholders in the string. Read more about it and other formats in the input/output section of the Python tutorial.
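For comparison, here is a small sketch that produces the same text with the percent operator and with two of the other formatting styles covered in the Python tutorial: the str.format method and formatted string literals (f-strings). The variable names are chosen only for this example.

```python
team, wins, losses = ("HANOVER", 14, 4)

# Percent operator: placeholders are filled from the tuple, in order.
with_percent = "%s: %d Wins, %d Losses" % (team, wins, losses)

# str.format: each {} is replaced by the corresponding argument.
with_format = "{}: {} Wins, {} Losses".format(team, wins, losses)

# f-string (Python 3.6+): expressions are written inside the braces.
with_fstring = f"{team}: {wins} Wins, {losses} Losses"

print(with_percent)
print(with_format)
print(with_fstring)
```

All three lines print the same text; which style to use is largely a matter of readability.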
A dictionary, also often referred to as a map, is a structure that associates "keys" with "values".
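Create a dictionary with a literal like { key1: value1, key2: value2 } or start with an empty one using dict().
Access the value stored under a key with x["foo"].
Test whether a key is present with key in dict.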
For example, we used a tuple earlier to represent the information of a team and their win and loss record. We could instead have used a dictionary:
team = { "name": "HANOVER", "wins": 14, "losses": 4 }
team["name"] # <-- "HANOVER"
# Iterate over the keys:
for key in team:
    print(key, team[key])
# Iterate over key-value pairs:
for key, value in team.items():
    print(key, value)
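The sketch below shows a few more dictionary operations that often come up: testing for a key, reading a key that may be missing, and adding a new entry. The "ties" key is hypothetical and is used only to illustrate a key that is not present at first.

```python
team = {"name": "HANOVER", "wins": 14, "losses": 4}

# Membership tests look at the keys, not the values.
has_wins = "wins" in team      # True
has_ties = "ties" in team      # False

# team["ties"] would raise a KeyError here; get() returns a default instead.
ties = team.get("ties", 0)

# Assigning to a new key adds it to the dictionary.
team["ties"] = ties
print(team)
```

Using get with a default avoids having to check for the key before every lookup.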
As an example, we could store the teams we worked with earlier in a dictionary, indexed by the team names:
results = {
    "HANOVER": ("HANOVER", 14, 4),
    "MOUNT ST. JOSEPH": ("MOUNT ST. JOSEPH", 14, 4),
    "ROSE-HULMAN": ("ROSE-HULMAN", 14, 4),
    "TRANSYLVANIA": ("TRANSYLVANIA", 12, 6),
    "ANDERSON": ("ANDERSON", 9, 9),
    "BLUFFTON": ("BLUFFTON", 8, 10),
    "FRANKLIN": ("FRANKLIN", 7, 11),
    "EARLHAM": ("EARLHAM", 6, 12),
    "MANCHESTER": ("MANCHESTER", 4, 14),
    "DEFIANCE": ("DEFIANCE", 2, 16)
}
results["HANOVER"] # <-- ("HANOVER", 14, 4)
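To illustrate how a dictionary of tuples might be used, here is a minimal sketch that looks up one team by name and then collects the teams with a winning record. It uses an abbreviated copy of the results dictionary so that it can run on its own; winning_teams is a name introduced only for this example.

```python
# Abbreviated version of the results dictionary from the text.
results = {
    "HANOVER": ("HANOVER", 14, 4),
    "ANDERSON": ("ANDERSON", 9, 9),
    "DEFIANCE": ("DEFIANCE", 2, 16),
}

# Look up one team directly by its key, then unpack the tuple.
name, wins, losses = results["HANOVER"]
print("%s: %d Wins, %d Losses" % (name, wins, losses))

# Collect the names of teams with more wins than losses.
winning_teams = []
for team_name, record in results.items():
    _, team_wins, team_losses = record
    if team_wins > team_losses:
        winning_teams.append(team_name)

print(winning_teams)
```

The lookup by name does not require scanning the whole collection, which is the main advantage over the list of tuples used earlier. The cost is that the team name is stored twice, once as the key and once inside the tuple.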
As a motivating example, we will consider the following problem: We have stored the text from a certain book in a text file. For our example we will use a transcript of A Tale of Two Cities by Charles Dickens, which can be found in this file provided by the "E-Texts" website. We would like to process it in a way that would facilitate answering some questions. For example:
Activity: Discuss how you would represent the text of this book in Python to facilitate answering these questions, using the data structures described above. Present at least three different approaches and discuss advantages and disadvantages of each. Some approaches might make some of the questions easier but other questions harder.