Variables are labels, not boxes. Take the following code for example:
a = [1, 2, 3]
b = a  # b is bound to the same list object as a
a.append(4)
print(b)  # [1, 2, 3, 4]
If you imagine that variables are like boxes, you cannot make sense of assignment in Python. For an assignment, you must always read the right-hand side first: that’s where the object is created or retrieved. After that, the variable on the left is bound to the object, like a label stuck to it. Just forget about the boxes.
Since variables are mere labels, nothing prevents an object from having several labels assigned to it. When that happens, you have aliasing.
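The "right-hand side first" rule can be observed directly. The sketch below uses a hypothetical Gizmo class (not part of the text above) that prints its id when an instance is created. It suggests that the object exists before any name is bound to it, and that a failed expression leaves the name on the left unbound.

```python
class Gizmo:
    """Hypothetical class used only to observe when an object is created."""

    def __init__(self):
        print(f'Gizmo id: {id(self)}')


x = Gizmo()  # the object is created first, then the label x is attached

try:
    # A second Gizmo is created (its id is printed), but multiplying it
    # raises TypeError, so the label y is never attached to anything.
    y = Gizmo() * 10
except TypeError as exc:
    print('TypeError:', exc)

y_bound = 'y' in dir()
print(y_bound)  # False
```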
Identity, equality and aliases
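charles = {'name': 'Charles', 'born': 1932}
levis = charles  # levis is an alias for charles
print(levis is charles)  # True
print(id(levis), id(charles))  # the same integer, printed twice
levis['balance'] = 950
print(charles)  # {'name': 'Charles', 'born': 1932, 'balance': 950}
alex = {'name': 'Charles', 'born': 1932, 'balance': 950}
print(alex == charles)  # True: equal contents
print(alex is not charles)  # True: distinct objects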
charles and levis refer to the same object, while alex is bound to a separate object with equal contents. We say that levis and charles are aliases.
An object's id is guaranteed to be a unique integer among objects that exist at the same time, and it will not change during the life of the object. In practice, we rarely use the id() function. Identity checks are most often done with the is operator, not by comparing ids.
The == operator compares the values of objects and appears more frequently than the is operator in Python code.
The is operator is faster than ==, because it cannot be overloaded, so Python does not have to find and invoke special methods to evaluate it.
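To make the difference concrete, here is a minimal sketch with a hypothetical AlwaysEqual class that overloads == through the __eq__ special method. The result of == is whatever that method returns, while is only compares identities and consults no method at all.

```python
class AlwaysEqual:
    """Hypothetical class whose instances claim to equal anything."""

    def __eq__(self, other):
        return True


a = AlwaysEqual()
b = AlwaysEqual()

print(a == b)   # True: Python calls AlwaysEqual.__eq__
print(a == 42)  # True: same special method, arbitrary answer
print(a is b)   # False: two distinct objects
print(a is a)   # True

# A very common identity check in practice is against the None singleton.
value = None
print(value is None)  # True
```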
Are tuples immutable?
Tuples, like most Python collections - lists, dicts, sets etc. - hold references to objects. If the referenced items are mutable, they may change even if the tuple itself does not.
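t1 = (1, 2, [3, 4])
t2 = (1, 2, [3, 4])
print(t1 == t2)  # True
print(id(t1[-1]))  # id of the list held in the last position of t1
t1[-1].append(5)  # mutate that list in place
print(t1)  # (1, 2, [3, 4, 5])
print(id(t1[-1]))  # same id as before: the tuple still holds the same list
print(t1 == t2)  # False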
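What is immutable in a tuple is the set of references it holds, not the objects behind them. The following sketch (the names t and inner are illustrative) contrasts rebinding a tuple slot, which fails, with mutating the referenced list, which succeeds.

```python
t = (1, 2, [3, 4])
inner = t[-1]

try:
    t[-1] = [3, 4, 5]  # rebinding a tuple slot is not allowed
except TypeError as exc:
    print('TypeError:', exc)

inner.append(5)        # mutating the referenced list is allowed
print(t)               # (1, 2, [3, 4, 5])
print(t[-1] is inner)  # True: the tuple still holds the same list object
```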
The distinction between equality and identity has further implications when you need to copy an object. A copy is an equal object with a different id. But if an object contains other objects, it becomes more complicated.
Copies are shallow by default
The easiest way to copy a list is to use the built-in constructor for the type itself.
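l1 = [3, [4, 5], [6, 7, 8]]
l2 = list(l1)  # list(l1) creates a copy of l1
print(l2)  # [3, [4, 5], [6, 7, 8]]
print(l1 == l2)  # True: the copies are equal
print(l1 is l2)  # False: but they are two different objects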
However, using the constructor or [:] produces a shallow copy, i.e. the outermost container is duplicated, but the copy is filled with references to the same items held by the original container.
If there are mutable items, this can lead to unpleasant surprises.
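l1 = [3, [4, 5], [6, 7, 8]]
l2 = list(l1)  # shallow copy: l2 shares the inner lists with l1
l1.append(9)  # affects l1 only
l1[1].remove(5)  # mutates a shared inner list, so l2 sees it too
print('l1 ->', l1)  # l1 -> [3, [4], [6, 7, 8], 9]
print('l2 ->', l2)  # l2 -> [3, [4], [6, 7, 8]]
l2[1] += [22, 33]  # += extends a list in place, so l1[1] changes as well
l2[2] += [44, 55]  # same here: l1[2] changes as well
print('l1 ->', l1)  # l1 -> [3, [4, 22, 33], [6, 7, 8, 44, 55], 9]
print('l2 ->', l2)  # l2 -> [3, [4, 22, 33], [6, 7, 8, 44, 55]]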
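The sharing can be checked with the is operator. This is a small sketch, with the names by_constructor and by_slice chosen here for illustration: both ways of copying produce a new outer list whose inner lists are the very same objects, while rebinding a slot in the copy leaves the original untouched.

```python
l1 = [3, [4, 5], [6, 7, 8]]
by_constructor = list(l1)
by_slice = l1[:]

for label, clone in (('list(l1)', by_constructor), ('l1[:]', by_slice)):
    # The outer list is a new object...
    print(label, 'is l1:', clone is l1)  # False
    # ...but the inner lists are shared with the original.
    shared = all(a is b for a, b in zip(l1[1:], clone[1:]))
    print(label, 'shares inner lists:', shared)  # True

# Rebinding a slot of the copy does not touch the original outer list.
by_slice[1] = ['replaced']
print(l1[1])  # [4, 5]
```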
Deep copies of arbitrary objects
Sometimes we need to make deep copies, i.e. duplicates that do not share references to embedded objects.
The copy module provides the deepcopy and copy functions, which return deep and shallow copies of arbitrary objects.
class Bus:
    def __init__(self, passengers=None):
        if passengers is None:
            self.passengers = []
        else:
            # Copy the argument so the bus does not alias the caller's list
            self.passengers = list(passengers)

    def pick(self, name):
        self.passengers.append(name)

    def drop(self, name):
        self.passengers.remove(name)
import copy
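bus1 = Bus(['Alice', 'Bill', 'Claire', 'David'])
bus2 = copy.copy(bus1)  # shallow copy: shares the passengers list with bus1
bus3 = copy.deepcopy(bus1)  # deep copy: gets its own passengers list
print(id(bus1), id(bus2), id(bus3))  # three distinct Bus instances
bus1.drop('Bill')
print(bus2.passengers)  # ['Alice', 'Claire', 'David']
print(bus3.passengers)  # ['Alice', 'Bill', 'Claire', 'David']
print(id(bus1.passengers), id(bus2.passengers), id(bus3.passengers))  # first two ids are equal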
If you want to control the behavior of both copy and deepcopy, implement the __copy__() and __deepcopy__() special methods.
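As an illustration, here is one way these hooks might look on a variant of the Bus class. The copying policy is an assumption made for this sketch, not something the text prescribes: even a shallow copy gets its own passenger list, and the deep copy registers itself in the memo dictionary that copy.deepcopy passes in.

```python
import copy


class Bus:
    """Variant of the Bus class with hypothetical copy hooks added."""

    def __init__(self, passengers=None):
        self.passengers = [] if passengers is None else list(passengers)

    def __copy__(self):
        # Called by copy.copy(). Assumed policy for this sketch: the copy
        # gets its own passenger list, because __init__ copies its argument.
        return type(self)(self.passengers)

    def __deepcopy__(self, memo):
        # Called by copy.deepcopy(). memo maps id(original) to its copy and
        # is passed along so shared or cyclic references are copied once.
        clone = type(self)()
        memo[id(self)] = clone
        clone.passengers = copy.deepcopy(self.passengers, memo)
        return clone


bus1 = Bus(['Alice', 'Bill'])
bus2 = copy.copy(bus1)
bus3 = copy.deepcopy(bus1)

bus1.passengers.remove('Bill')
print(bus2.passengers)  # ['Alice', 'Bill']
print(bus3.passengers)  # ['Alice', 'Bill']
```

With these hooks in place, dropping a passenger from bus1 no longer affects bus2, unlike the default shallow copy.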