+, -, *, /, >, and < can be used with the above and work largely as you'd expect.
So far we've worked with single values: numbers, strings, and booleans. But Python also supports more complex data types, sometimes called data structures.
The two most common data structures are lists and dictionaries.
As you might expect, a list is an ordered collection of things.
Lists are represented using brackets ([]).
# A list of integers
numbers = [1, 2, 3]
numbers
[1, 2, 3]
# A list of strings
strings = ['abc', 'def']
strings
['abc', 'def']
Lists are highly flexible. They can contain heterogeneous data (i.e. strings, booleans, and numbers can all be in the same list) and lists can even contain other lists!
combo = ['a', 'b', 3, 4]
combo_2 = [True, 'True', 1, 1.0]
# Note that the last element of the list is another list!
nested_list = [1, 2, 3, [4, 5]]
nested_list
[1, 2, 3, [4, 5]]
Individual elements of a list can be accessed by specifying a location in brackets. This is called indexing.
Beware: Python is zero-indexed, so the first element is element 0!
letters = ['a', 'b', 'c']
letters[0]
'a'
letters[2]
'c'
Specifying an invalid location will raise an error.
letters[4]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[23], line 1
----> 1 letters[4]

IndexError: list index out of range
Caution!
Most programming languages are zero-indexed, so a list with 3 elements has valid locations [0, 1, 2]. This means there is no element #3 in a 3-element list! Trying to access it will cause an out-of-range error. This is a common mistake for those new to programming (and sometimes it bites the veterans too).
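One way to keep the valid locations straight is to ask the list how long it is. The sketch below assumes the built-in len function, which has not been introduced above; the variable names count and last are only illustrative.

```python
letters = ['a', 'b', 'c']

# len() reports how many elements the list holds.
count = len(letters)          # 3

# Valid locations run from 0 up to count - 1.
last = letters[count - 1]     # 'c'

# letters[count] would be one past the end and raise an IndexError.
```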
Not only can you read individual elements using indexing; you can also overwrite elements.
greek = ['alpha', 'beta', 'delta']
greek[2] = 'gamma'
greek
['alpha', 'beta', 'gamma']
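Indexing also reaches into lists that sit inside other lists. As a small sketch reusing nested_list from earlier (inner and first_inner are illustrative names), a second pair of brackets indexes the inner list:

```python
nested_list = [1, 2, 3, [4, 5]]

# Location 3 holds the inner list.
inner = nested_list[3]

# A second index picks an element out of the inner list.
first_inner = nested_list[3][0]    # 4

# Overwriting works the same way, one level down.
nested_list[3][1] = 50
# nested_list is now [1, 2, 3, [4, 50]]
```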
If given a negative number as an index, Python counts backward from the end of the list.
greek
['alpha', 'beta', 'gamma']
greek[-1]
'gamma'
greek[-3]
'alpha'
greek[-4]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[32], line 1
----> 1 greek[-4]

IndexError: list index out of range
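It may help to think of a negative index as shorthand for counting from the length of the list. The sketch below assumes the built-in len function; n, same_last, and same_first are illustrative names.

```python
greek = ['alpha', 'beta', 'gamma']
n = len(greek)    # 3

# greek[-k] refers to the same element as greek[n - k].
same_last = greek[-1] == greek[n - 1]     # True, both are 'gamma'
same_first = greek[-3] == greek[n - 3]    # True, both are 'alpha'

# greek[-4] would correspond to greek[-1 - 3], which falls before the start.
```

This is why -3 is the furthest back a 3-element list can go.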
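You can also select a range of elements at once by giving a start and an end location separated by a colon. This is called slicing.
letters = [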
'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z',
]
# What are the first 10 letters?
letters[0:10]
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
# 6th-10th letters
letters[5:10]
['f', 'g', 'h', 'i', 'j']
# The last 5 letters
letters[21:26]
['v', 'w', 'x', 'y', 'z']
You can omit either part of the slice index if you want your slice to go all the way to one end of the list.
letters[:2]
['a', 'b']
letters[-2:]
['y', 'z']
Caution!
The starting index of a slice is inclusive but the ending index is exclusive.
l = [0, 1, 2, 3, 4]
l[1:3]
[1, 2]
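The exclusive end has some convenient consequences, sketched below with a shortened letters list. The variable names are illustrative, and len and list concatenation with + are assumed rather than taken from the text above.

```python
letters = ['a', 'b', 'c', 'd', 'e']

# Because the end is exclusive, a slice start:stop holds stop - start elements.
middle = letters[1:4]          # ['b', 'c', 'd']
middle_length = len(middle)    # 3, i.e. 4 - 1

# Splitting at the same index gives two pieces with no overlap and no gap.
head = letters[:2]             # ['a', 'b']
tail = letters[2:]             # ['c', 'd', 'e']
rejoined = head + tail         # same contents as letters

# Unlike indexing, a slice that runs past the end does not raise an error.
past_end = letters[3:100]      # ['d', 'e']
```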
Dictionaries are collections of key-value pairs. Think of a real dictionary -- you look up a word (a key) to find its definition (a value). Any given key can have only one value.
This concept has many names depending on language: map, associative array, dictionary, and more.
In Python, dictionaries are represented with curly braces. Colons separate a key from its value, and (like lists) commas delimit elements.
ethan = {
'name': 'Ethan',
'employer': 'ReviewTrackers',
'number_of_pets': 0,
'lives_in_ohio': False,
}
ethan
{'name': 'Ethan', 'employer': 'ReviewTrackers', 'number_of_pets': 0, 'lives_in_ohio': False}
gus = {
'name': 'Gus',
'employer': '84.51˚',
'number_of_pets': 1,
'lives_in_ohio': False,
}
jay = {
'name': 'Jay',
'employer': '84.51˚',
'number_of_pets': 4,
'lives_in_ohio': True,
}
Values can be looked up and set by passing a key in brackets.
ethan['number_of_pets']
0
gus['employer']
'84.51˚'
gus['employer'] = 'Eighty Four Fifty One'
gus
{'name': 'Gus', 'employer': 'Eighty Four Fifty One', 'number_of_pets': 1, 'lives_in_ohio': False}
Dictionaries, like lists, are very flexible. Keys are generally strings (though some other types are allowed), and values can be anything -- including lists or other dictionaries!
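As a rough illustration of values that are themselves containers, the record below is entirely made up (person, its keys, and its values are hypothetical). Lookups chain together just like indexing into nested lists.

```python
# Hypothetical record; the names and values are invented for illustration.
person = {
    'name': 'Pat',
    'pets': ['Rex', 'Whiskers'],                        # a list as a value
    'address': {'city': 'Cincinnati', 'state': 'OH'},   # a dictionary as a value
}

# Look up the key first, then index into whatever comes back.
first_pet = person['pets'][0]          # 'Rex'
state = person['address']['state']     # 'OH'

# Nested values can be overwritten the same way.
person['address']['city'] = 'Dayton'
```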
ceos['Apple'] = 'Tim Cook'. Try to add a few more keys to this starter dictionary. For example, Bob Iger is the CEO of Disney.
ceos = {
'Apple': 'Tim Cook',
'Microsoft': 'Satya Nadella',
}
How might you approach #2 if you needed to look up both the CEO and the CFO? What data structure would you use? There are several possible solutions.
in keyword
Using lists as containers checks the elements of the list...
l = ['a', 'e', 'i', 'o', 'u']
'u' in l
True
'b' in l
False
Using dictionaries as containers checks the keys, not the values...
d = {
'ethan': 0,
'gus': 1,
'jay': 4,
}
'gus' in d
True
'bob' in d
False
d = {
'ethan': 0,
'gus': 1,
'jay': 4,
}
'ethan' in d
True
0 in d
False
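If the goal is to check the values instead, one option is the dictionary's values method, which is not covered in the text above. A minimal sketch with illustrative variable names:

```python
d = {'ethan': 0, 'gus': 1, 'jay': 4}

key_present = 'ethan' in d         # True: 'ethan' is a key
value_as_key = 0 in d              # False: 0 is a value, not a key

# .values() gives a view of the values, which also works with `in`.
value_present = 0 in d.values()    # True
```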
Lists and dictionaries are mutable: their contents can be changed after they are created.
l = ['a', 'e', 'i', 'o', 'u']
l[0] = 'A'
l.append('y')
l
['A', 'e', 'i', 'o', 'u', 'y']
d = {
'ethan': 0,
'gus': 1,
'jay': 4,
}
d['jay'] = 100
d['brad'] = 1
d
{'ethan': 0, 'gus': 1, 'jay': 100, 'brad': 1}
So what does immutability look like?
# Tuples, which we haven't discussed yet, are immutable
t = ('a', 'e', 'i', 'o', 'u')
t[0] = 'A'
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[24], line 1
----> 1 t[0] = 'A'

TypeError: 'tuple' object does not support item assignment
t.append('y')
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[25], line 1
----> 1 t.append('y')

AttributeError: 'tuple' object has no attribute 'append'
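Immutability does not mean you are stuck with the data; it means changes produce a new object rather than altering the existing one. The sketch below shows two common approaches (t_with_y, as_list, and t_capitalized are illustrative names, and the list and tuple conversion functions are assumed rather than introduced above).

```python
t = ('a', 'e', 'i', 'o', 'u')

# Build a new tuple by joining two tuples.
# The trailing comma makes ('y',) a one-element tuple.
t_with_y = t + ('y',)

# Or convert to a list, change it, and convert back.
as_list = list(t)
as_list[0] = 'A'
t_capitalized = tuple(as_list)

# The original tuple t is untouched throughout.
```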
How can you determine the type of a variable?
Pass it to the type function (we'll talk more about functions later).
type("hello")
str
x = 5
type(x)
int
jay
{'name': 'Jay', 'employer': '84.51˚', 'number_of_pets': 4, 'lives_in_ohio': True}
type(jay)
dict
type(jay['lives_in_ohio'])
bool
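The same check works on containers and on the elements inside them. A short sketch using the numbers list from earlier, with illustrative variable names:

```python
numbers = [1, 2, 3]

list_type = type(numbers)          # list
element_type = type(numbers[0])    # int

# 1 and 1.0 compare as equal, but they are different types.
int_type = type(1)                 # int
float_type = type(1.0)             # float
```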
While we won't spend much time on these, there are many more types of data in Python.
tuple -- Like a list, but immutable. Good for storing data in which order is meaningful, like records of relational data.
# SELECT name, email, suite_number FROM office_managers;
mgrs = [
('W.B. Jones', 'wbj@wbjhvac.com', 110),
('Bob Vance', 'bob.vance@vancerefrigeration.com', 210),
('Michael Scott', 'mscott@dundermifflin.com', 200),
('Bill Cress', 'bill.cress@cresstools.com', 302),
('Paul Faust', 'pfaust@disasterkits.com', 310),
]
bob = mgrs[1]
bob[1]
'bob.vance@vancerefrigeration.com'
bob[2] = 111
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[39], line 1
----> 1 bob[2] = 111

TypeError: 'tuple' object does not support item assignment
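Because each position in a record like this has a fixed meaning, a tuple can be split into one variable per field. This is usually called unpacking and is not covered in the text above. In this sketch the variable names simply mirror the columns in the query comment.

```python
bob = ('Bob Vance', 'bob.vance@vancerefrigeration.com', 210)

# One variable per position, in the same order as the fields.
name, email, suite_number = bob
```

After this, name, email, and suite_number hold the three fields, and bob itself is unchanged.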
set -- An unordered collection of items in which each item can appear at most once. Very performant for checking whether an item is present. Mutable.
primes = {2, 3, 5, 7}
primes
{2, 3, 5, 7}
primes.add(11)
primes
{2, 3, 5, 7, 11}
primes.add(5)
primes
{2, 3, 5, 7, 11}
print(5 in primes)
print(9 in primes)
True
False
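A common use of the "at most once" rule is removing duplicates from a list. The data below is made up for illustration, and the set, len, and sorted functions are assumed rather than introduced above.

```python
# Hypothetical data with repeats.
visits = ['gus', 'jay', 'gus', 'ethan', 'jay', 'gus']

# Converting to a set collapses the duplicates.
unique_visitors = set(visits)
unique_count = len(unique_visitors)    # 3

# Sets are unordered, so sort if a predictable order is needed.
ordered = sorted(unique_visitors)      # ['ethan', 'gus', 'jay']
```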
bytes, bytearray -- Raw arrays of bytes that can be decoded into a string. The latter is mutable and the former is not.
complex, Decimal, Fraction -- Numeric types for specialized use cases. If you need these, you probably already know.
DataFrame -- A tabular dataset, provided by the pandas library rather than by Python itself. We'll cover this in later sections.
Are there any questions before we move on?