3.6. References to Sequences

In the subsections that follow, we explore how Python implements lists. Along the way we define and illustrate the idea of an alias, which helps explain some of the more confusing behaviors of mutable objects such as lists.

3.6.1. Objects and References

If we execute these assignment statements,

a = "banana"
b = "banana"

we know that a and b will refer to a string with the letters "banana". But we don’t know yet whether they point to the same string.

There are two possible ways the Python interpreter could arrange its internal states:

In one case, a and b refer to two different string objects that have the same value. In the second case, they refer to the same object. Remember that an object is something a variable can refer to.

We already know that every object has a unique identifier, which the id function returns. We can also test whether two names refer to the same object using the is operator. The is operator returns True if the two references are to the same object; in other words, if the references are the same. Try our example from above.

In [1]: a = "banana"

In [2]: b = "banana"

In [3]: a is b
Out[3]: True

The answer is True. This tells us that both a and b refer to the same object, and that it is the second of the two reference diagrams that describes the relationship. Because strings are immutable, Python is free to make two names with the same string value refer to a single object, and it does so here. This sharing is an optimization made by the interpreter rather than a guarantee of the language, so use == when you need to compare string values.
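As a minimal sketch of the difference between comparing values and comparing identities, the example below adds a third string that is assembled at run time. The name c and the use of the join method are assumptions made for illustration; they are not part of the original example. The result of the final is test depends on the interpreter, so no expected output is shown for it.

```python
a = "banana"
b = "banana"

# Assemble an equal string at run time, so there is no literal
# for the interpreter to reuse.
c = "".join(["ban", "ana"])

print(a == b, a == c)   # True True: all three have the same value
print(a is c)           # identity: implementation-dependent, do not rely on it
```

The point is only that == answers the question we usually care about for strings, whichever objects the interpreter chooses to share.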

This is not the case with lists. Consider the following example. Here, a and b refer to two different lists, each of which happens to have the same element values.

In [4]: a = [81, 82, 83]

In [5]: b = [81, 82, 83]

In [6]: a is b
Out[6]: False

In [7]: a == b
Out[7]: True

The reference diagram for this example looks like this:

Reference diagram for equal different lists

a and b have the same value but do not refer to the same object.

There is one other important thing to notice about this reference diagram. The variable a is a reference to a collection of references. Those references in turn refer to the integer objects in the list. In other words, a list is a collection of references to objects. Interestingly, even though a and b are two different lists (two different collections of references), the integer object 81 is shared by both. Like strings, integers are immutable, so Python is free to let every reference share the same object; the standard interpreter does this for small integers such as these.

Here is the example in codelens. Pay particular attention to the id values.

(chp09_istrace)
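For readers without the interactive trace, here is a rough equivalent that prints the id values directly. It is a sketch, not the codelens example itself. The actual numbers printed by id differ from run to run, and whether the elements are shared depends on the interpreter, so those lines carry no expected output.

```python
a = [81, 82, 83]
b = [81, 82, 83]

# Two distinct list objects: different identities, equal values.
print(id(a), id(b))
print(a is b)        # False
print(a == b)        # True

# Whether the two lists share their element objects is up to the
# interpreter; CPython happens to cache small integers.
print(a[0] is b[0])
```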

3.6.2. Aliasing

Since variables refer to objects, if we assign one variable to another, both variables refer to the same object:

In [8]: a = [81, 82, 83]

In [9]: b = a

In [10]: a is b
Out[10]: True

In this case, the reference diagram looks like this:

State snapshot for multiple references (aliases) to a list

Because the same list has two different names, a and b, we say that it is aliased. Changes made with one alias affect the other. In the codelens example below, you can see that a and b refer to the same list after executing the assignment statement b = a.

(chp09_is3)

Although this behavior can be useful, it is sometimes unexpected or undesirable. In general, it is safer to avoid aliasing when you are working with mutable objects. Of course, for immutable objects, there’s no problem. That’s why Python is free to alias strings and integers when it sees an opportunity to economize.
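A short sketch of why aliasing a mutable object can surprise us: a change made through b is visible through a, because there is only one list. The replacement value 5 is an arbitrary choice for illustration.

```python
a = [81, 82, 83]
b = a            # b is an alias: no new list is created

b[0] = 5         # change the list through one of its names
print(a)         # [5, 82, 83]
print(a is b)    # True
```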

Check your understanding

rec-5-36: What is printed by the following statements?

alist = [4, 2, 8, 6, 5]
blist = alist
blist[3] = 999
print(alist)

(A) [4, 2, 8, 6, 5]
blist is not a copy of alist; it is a reference to the list alist refers to.
(B) [4, 2, 8, 999, 5]
Yes, since alist and blist both reference the same list, changes to one also change the other.

3.6.3. Cloning Lists

If we want to modify a list and also keep a copy of the original, we need to be able to make a copy of the list itself, not just the reference. This process is sometimes called cloning, to avoid the ambiguity of the word copy.

The easiest way to clone a list is to use the slice operator, as in b = a[:].

Taking any slice of a creates a new list. In this case the slice happens to consist of the whole list.

(chp09_is4)

Now we are free to make changes to b without worrying about a. Again, we can clearly see in codelens that a and b are entirely different list objects.
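The cloning step can be sketched as follows. The list values and the change to b[0] are illustrative assumptions. The names c and d are also added here only to show two other standard ways of making the same kind of copy, the list constructor and the copy method.

```python
a = [81, 82, 83]
b = a[:]         # a slice of the whole list is a new list object

b[0] = 5         # change the clone only
print(a)         # [81, 82, 83]
print(b)         # [5, 82, 83]
print(a is b)    # False

# Two other standard ways to make the same kind of copy.
c = list(a)
d = a.copy()
```

All three forms copy only the top-level collection of references. If the list contained other lists, those inner lists would still be shared between the original and the clone.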

3.6.4. Repetition and References

We have already seen the repetition operator working on strings as well as lists. For example,

In [11]: origlist = [45, 76, 34, 55]

In [12]: origlist * 3
Out[12]: [45, 76, 34, 55, 45, 76, 34, 55, 45, 76, 34, 55]

With a list, the repetition operator creates copies of the references. Although this may seem simple enough, when we allow a list to refer to another list, a subtle problem can arise.

Consider the following extension of the previous example.

In [13]: origlist = [45, 76, 34, 55]

In [14]: origlist * 3
Out[14]: [45, 76, 34, 55, 45, 76, 34, 55, 45, 76, 34, 55]

In [15]: newlist = [origlist] * 3

In [16]: newlist
Out[16]: [[45, 76, 34, 55], [45, 76, 34, 55], [45, 76, 34, 55]]

newlist is a list of three references to origlist that were created by the repetition operator. The reference diagram is shown below.

Now, what happens if we modify a value in origlist?

In [17]: origlist = [45, 76, 34, 55]

In [18]: newlist = [origlist] * 3

In [19]: newlist
Out[19]: [[45, 76, 34, 55], [45, 76, 34, 55], [45, 76, 34, 55]]

In [20]: origlist[1] = 99

In [21]: newlist
Out[21]: [[45, 99, 34, 55], [45, 99, 34, 55], [45, 99, 34, 55]]

newlist shows the change in three places. This can easily be seen by noting that in the reference diagram, there is only one origlist, so any changes to it appear in all three references from newlist.

Here is the same example in codelens. Step through the code paying particular attention to the result of executing the assignment statement origlist[1] = 99.

(reprefstep)
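If we want three inner lists that do not change together, one possible approach is to clone origlist once for each position instead of repeating a single reference. This is a sketch under that assumption; the names shared and independent are hypothetical and do not appear in the example above.

```python
origlist = [45, 76, 34, 55]

shared = [origlist] * 3                         # three references to one list
independent = [origlist[:] for _ in range(3)]   # three separate clones

origlist[1] = 99
print(shared)       # [[45, 99, 34, 55], [45, 99, 34, 55], [45, 99, 34, 55]]
print(independent)  # [[45, 76, 34, 55], [45, 76, 34, 55], [45, 76, 34, 55]]

print(shared[0] is shared[1])            # True
print(independent[0] is independent[1])  # False
```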

Check your understanding

rec-5-37: What is printed by the following statements?

alist = [4, 2, 8, 6, 5]
blist = alist * 2
blist[3] = 999
print(alist)

(A) [4, 2, 8, 999, 5, 4, 2, 8, 6, 5]
The statement is print(alist), not print(blist).
(B) [4, 2, 8, 999, 5]
blist is changed, not alist.
(C) [4, 2, 8, 6, 5]
Yes, alist was unchanged by the assignment statement. blist was a copy of the references in alist.

rec-5-38: What is printed by the following statements?

alist = [4, 2, 8, 6, 5]
blist = [alist] * 2
alist[3] = 999
print(blist)

(A) [4, 2, 8, 999, 5, 4, 2, 8, 999, 5]
[alist] * 2 creates a list containing two references to alist, so the result is a nested list.
(B) [[4, 2, 8, 999, 5], [4, 2, 8, 999, 5]]
Yes, blist contains two references, both to alist.
(C) [4, 2, 8, 6, 5]
The statement is print(blist), and blist is a list containing two references to alist.
(D) [[4, 2, 8, 999, 5], [4, 2, 8, 6, 5]]
blist contains two references, both to alist, so changes to alist appear both times.
