5.1. Dictionaries
All of the compound data types we have studied in detail so far — strings, lists, and tuples — are sequential collections. This means that the items in the collection are ordered from left to right and they use integers as indices to access the values they contain.
Dictionaries and sets are Python’s two built-in associative data types. The dictionary associates one value with another, whereas the set associates a value with membership in some collection or group. In this chapter, we will introduce these two data structures and discuss their application in data wrangling and analysis.
5.1.1. Mapping One Value to Another with Dictionaries
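The dictionary is a mapping type. A map is an associative collection: values are looked up by key rather than by position. (Since Python 3.7, dictionaries remember the order in which keys were inserted, but that order plays no role in lookup.) The association, or mapping, is from a key, which can be any immutable type (more precisely, any hashable object), to a value, which can be any Python data object. As an example, we will create a dictionary to translate English words into Spanish. For this dictionary, the keys are strings and the values are also strings.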
We can create a dictionary by writing key-value pairs inside curly braces, with a colon (:) between each key and its value and a comma (,) between pairs.
In [1]: eng2sp = {'three': 'tres', 'one': 'uno', 'two': 'dos'}
In [2]: eng2sp
Out[2]: {'one': 'uno', 'three': 'tres', 'two': 'dos'}
It doesn’t matter what order we write the pairs. The values in a dictionary are accessed with keys, not with indices, so there is no need to care about ordering.
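As a quick check of the point about ordering, the sketch below builds the same mapping twice with the pairs written in different orders and compares the results. The names eng2sp_reordered and points are introduced here only for illustration. It also illustrates the restriction on keys mentioned earlier: a tuple is accepted as a key, while a list is rejected.

```python
# Two dictionaries with the same pairs, written in different orders.
eng2sp = {'three': 'tres', 'one': 'uno', 'two': 'dos'}
eng2sp_reordered = {'one': 'uno', 'two': 'dos', 'three': 'tres'}

# Equality compares the key-value pairs, not the order they were written in.
same_mapping = (eng2sp == eng2sp_reordered)
print(same_mapping)

# A tuple can be used as a key; a list cannot, because lists are mutable.
points = {(0, 0): 'origin'}
try:
    points[[1, 1]] = 'not allowed'
    list_key_accepted = True
except TypeError:
    list_key_accepted = False
print(list_key_accepted)
```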
Here is how we use a key to look up the corresponding value.
In [3]: eng2sp = {'three': 'tres', 'one': 'uno', 'two': 'dos'}
In [4]: eng2sp['two']
Out[4]: 'dos'
The key 'two' yields the value 'dos'.
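Looking up a key that is not in the dictionary does not return an empty value; it raises a KeyError. The following is a minimal sketch using the same eng2sp dictionary and a made-up missing key, 'four'. The in operator offers a way to test for a key before looking it up.

```python
eng2sp = {'three': 'tres', 'one': 'uno', 'two': 'dos'}

# A key that is present returns its value.
found = eng2sp['two']
print(found)

# The in operator tests whether a key is present.
has_four = 'four' in eng2sp
print(has_four)

# A key that is absent raises KeyError.
try:
    eng2sp['four']
    raised = False
except KeyError:
    raised = True
print(raised)
```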
You may recall that we introduced the get function from the toolz module in the chapter on sequences. A functional alternative to the square-bracket operator (name[key]) is to apply that same get function to a dictionary.
In [5]: from toolz import get
In [6]: get('two', eng2sp)
Out[6]: 'dos'
Using the get function has a couple of advantages. First, we can use a common API for getting values from both sequences and dictionaries.
In [7]: L = [1,2,3]
In [8]: get(1, L)
Out[8]: 2
In [9]: D = {1:"a", 2:"b"}
In [10]: get(1, D)
Out[10]: 'a'
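One function can serve both kinds of collection because lists and dictionaries both support the name[key] form. The sketch below is a simplified stand-in written for illustration. The name simple_get is hypothetical, this is not how toolz itself is implemented, and it handles only a single index or key.

```python
def simple_get(key, collection):
    """Return collection[key] for any object that supports square-bracket lookup."""
    return collection[key]

L = [1, 2, 3]
D = {1: "a", 2: "b"}

from_list = simple_get(1, L)   # position 1 in the list
from_dict = simple_get(1, D)   # key 1 in the dictionary
print(from_list, from_dict)
```

The same call shape works in both cases; only the meaning of the first argument differs (a position for the list, a key for the dictionary).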
Second, get lets us retrieve the values for several keys at once by passing a list of keys; the results come back as a tuple.
In [11]: get([2,1], L)
Out[11]: (3, 2)
In [12]: get(['one', 'two'], eng2sp)