3.2 Representation Invariants
We now know how to define a class that bundles together related pieces of data and includes methods that operate on that data. These methods provide services to client code, and if we write them sensibly, the client code can be sure that any instances it creates will always be in a sensible state. For instance, we can make sure that no data is missing by writing an initializer that creates and initializes every instance attribute. And if, say, one instance attribute must always be greater than another (because that is a rule in the domain of our program), we can ensure that the initializer and all of the methods will never violate that rule.
Let’s return to our Twitter example to consider what writing the methods “sensibly” entails.
Documenting rules with representation invariants
Twitter imposes a 280-character limit on tweets.
If we want our code to be consistent with this rule,
we must both document it and make sure that every method of the class enforces the rule.
First, let’s formalize the notion of “rule”.
A representation invariant is a property of the instance attributes that every instance of a class must satisfy.
For example, we can say that a representation invariant for our Tweet class is that the content attribute is always at most 280 characters long.
We document representation invariants in the docstring of a class, underneath its attributes.
While we could write these representation invariants in English, we often prefer concrete Python code expressions that evaluate to True or False, as such expressions are unambiguous and can be checked directly in our program.
from datetime import date


class Tweet:
    """A tweet, like in Twitter.

    Attributes:
        userid: the id of the user who wrote the tweet.
        created_at: the date the tweet was written.
        content: the contents of the tweet.
        likes: the number of likes this tweet has received.

    Representation Invariants:
        - len(self.content) <= 280
    """
    # Attribute types
    userid: str
    created_at: date
    content: str
    likes: int
Even though this is a new definition, we have seen representation invariants already:
every instance attribute type annotation is a representation invariant!
For example, the annotation content: str means that the content of a tweet must always be a string.
Enforcing representation invariants
Even though documenting representation invariants is essential, documentation alone is not enough. As the author of a class, you have the responsibility of ensuring that each method is consistent with the representation invariants, in the following two ways:
At the beginning of the method body (i.e., right when the method is called), you can always assume that all of the representation invariants are satisfied.
At the end of the method (i.e., right before the method returns), it is your responsibility to ensure that all of the representation invariants are satisfied.
That is, each representation invariant is both a precondition and postcondition of every method in a class. You are free to temporarily violate the representation invariants during the body of the method (and will often do so while mutating the object), as long as by the end of the method, all of the invariants are restored.
The initializer method is an exception: it does not have any preconditions on the attributes (since they haven’t even been created yet), but it must initialize the attributes so that they satisfy every representation invariant.
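To make the "temporarily violate, then restore" idea concrete, here is a minimal sketch. The Interval class and its shift method are hypothetical and are not part of the Twitter code; they are chosen only because the invariant relates two attributes, so updating them one at a time can briefly break it.

```python
class Interval:
    """A closed interval of integers (hypothetical example).

    Attributes:
        low: the lower bound.
        high: the upper bound.

    Representation Invariants:
        - self.low <= self.high
    """
    low: int
    high: int

    def __init__(self, low: int, high: int) -> None:
        """Initialize a new interval.

        Preconditions:
            - low <= high
        """
        # No invariant can be assumed here: the attributes do not exist yet.
        self.low = low
        self.high = high

    def shift(self, amount: int) -> None:
        """Move both bounds by amount (which may be negative)."""
        # On entry, we may assume self.low <= self.high.
        self.low += amount
        # At this point the invariant may be false
        # (e.g., low was 0, high was 2, and amount is 5).
        self.high += amount
        # On exit the invariant holds again: both bounds moved equally.
```

Between the two assignments in shift the object can be in an invalid state, which is acceptable because no other code observes it until the method returns.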
In our Twitter code, what method(s) may require modification in order to ensure that our representation invariant (len(self.content) <= 280) is enforced?
Currently, the initializer allows the user to create a Tweet object with any message they want, including one that exceeds the limit.
There are a variety of strategies that we can take for enforcing our representation invariant.
One approach is to process the initializer arguments so that the instance attributes are initialized to allowed values. For example, we might truncate a tweet message that’s too long:
class Tweet:
    def __init__(self, who: str, when: date, what: str) -> None:
        """Initialize a new Tweet.

        If <what> is longer than 280 chars, only the first 280 chars are stored.

        >>> t = Tweet('Rukhsana', date(2017, 9, 16), 'Hey!')
        >>> t.userid
        'Rukhsana'
        >>> t.created_at
        datetime.date(2017, 9, 16)
        >>> t.content
        'Hey!'
        >>> t.likes
        0
        """
        self.userid = who
        self.created_at = when
        self.content = what[:280]
        self.likes = 0
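The doctest above only exercises a short message. As a quick check of the truncating behaviour itself, the following sketch repeats the initializer in a trimmed-down Tweet class and creates one tweet that is too long; the variable names long_tweet and short_tweet are illustrative only.

```python
from datetime import date


class Tweet:
    """A tweet, like in Twitter (trimmed to what this demonstration needs)."""
    userid: str
    created_at: date
    content: str
    likes: int

    def __init__(self, who: str, when: date, what: str) -> None:
        """Initialize a new Tweet, storing at most the first 280 characters."""
        self.userid = who
        self.created_at = when
        # Slicing is safe for short strings too: 'Hey!'[:280] is just 'Hey!'.
        self.content = what[:280]
        self.likes = 0


long_tweet = Tweet('David', date(2023, 5, 10), 'David' * 100)  # 500 characters
short_tweet = Tweet('Rukhsana', date(2017, 9, 16), 'Hey!')

print(len(long_tweet.content))  # 280
print(short_tweet.content)      # Hey!
```

Note that truncation keeps the invariant true but silently discards part of the client's message, which is why the docstring has to say so.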
Another approach is to not change the code at all, but instead specify a precondition on the initializer:
class Tweet:
    def __init__(self, who: str, when: date, what: str) -> None:
        """Initialize a new Tweet.

        Preconditions:
            - len(what) <= 280

        >>> t = Tweet('Rukhsana', date(2017, 9, 16), 'Hey!')
        >>> t.userid
        'Rukhsana'
        >>> t.created_at
        datetime.date(2017, 9, 16)
        >>> t.content
        'Hey!'
        >>> t.likes
        0
        """
        self.userid = who
        self.created_at = when
        self.content = what
        self.likes = 0
As we discussed in 1.3 The Function Design Recipe, a precondition is something that we assume to be true about the function’s input. In the context of this section, we’re saying, “The representation invariant will be enforced by our initializer assuming that the client code satisfies our preconditions.” On the other hand, if this precondition is not satisfied, we aren’t making any promise about what the method will do (and in particular, whether it will enforce the representation invariants).
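The following sketch illustrates what "no promise" can look like in practice. It reuses the precondition-only initializer and then plays the role of client code that ignores the precondition; the names bad and invariant_holds are illustrative only.

```python
from datetime import date


class Tweet:
    def __init__(self, who: str, when: date, what: str) -> None:
        """Initialize a new Tweet.

        Preconditions:
            - len(what) <= 280
        """
        self.userid = who
        self.created_at = when
        self.content = what  # stored as given; the length is not checked
        self.likes = 0


# Client code that ignores the precondition: no error is raised here...
bad = Tweet('David', date(2023, 5, 10), 'David' * 100)

# ...but the representation invariant is now silently false.
invariant_holds = len(bad.content) <= 280
print(invariant_holds)  # False
```

Nothing in plain Python stops this call, so with the precondition approach the responsibility for keeping the invariant true is shared with the client.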
Checking representation invariants automatically with python_ta
PythonTA supports checking all representation invariants, just like it does with preconditions!
Let’s add a check_contracts decorator to our Tweet example, but use our original initializer that doesn’t check the length of the content.
from datetime import date

from python_ta.contracts import check_contracts


@check_contracts
class Tweet:
    """A tweet, like in Twitter.

    Attributes:
        userid: the id of the user who wrote the tweet.
        created_at: the date the tweet was written.
        content: the contents of the tweet.
        likes: the number of likes this tweet has received.

    Representation Invariants:
        - len(self.content) <= 280
    """
    # Attribute types
    userid: str
    created_at: date
    content: str
    likes: int

    def __init__(self, who: str, when: date, what: str) -> None:
        """Initialize a new Tweet."""
        self.userid = who
        self.created_at = when
        self.content = what
        self.likes = 0
Now, we’ll obtain an error whenever we attempt to create a Tweet value with invalid attributes.
>>> Tweet('David', date(2023, 5, 10), 'David' * 100)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
    ...
AssertionError: "Tweet" representation invariant "len(self.content) <= 280" was violated for instance attributes {userid: 'David', created_at: datetime.date(2023, 5, 10), content: 'DavidDavidDa...idDavidDavid', likes: 0}
Notes about using check_contracts with classes:
python_ta is strict with the header “Representation Invariants:”. In particular, both “Representation” and “Invariants” must be capitalized (and spelled correctly), and must be followed by a colon. Please watch out for this, as otherwise any representation invariants you add will not be checked!
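To get a feel for what such a check amounts to, here is a hand-written sketch that tests the invariant with a plain assert at the end of the initializer. This is only an illustration under our own assumptions: the helper _check_invariants is hypothetical, and this is not how python_ta implements check_contracts.

```python
from datetime import date


class Tweet:
    """A tweet, like in Twitter.

    Representation Invariants:
        - len(self.content) <= 280
    """
    userid: str
    created_at: date
    content: str
    likes: int

    def __init__(self, who: str, when: date, what: str) -> None:
        """Initialize a new Tweet."""
        self.userid = who
        self.created_at = when
        self.content = what
        self.likes = 0
        self._check_invariants()  # hypothetical helper, written by hand

    def _check_invariants(self) -> None:
        """Raise an AssertionError if a representation invariant is violated."""
        assert len(self.content) <= 280, 'len(self.content) <= 280 was violated'


try:
    Tweet('David', date(2023, 5, 10), 'David' * 100)
    outcome = 'created'
except AssertionError as error:
    outcome = f'rejected: {error}'

print(outcome)  # rejected: len(self.content) <= 280 was violated
```

A manual check like this has to be called from every method that mutates the object, which is easy to forget; having a tool read the invariants from the docstring avoids that bookkeeping.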
Another example: non-negativity constraints
Look again at the attributes of Tweet.
Another obvious representation invariant is that likes must be at least 0; our type annotation likes: int allows for negative integers, after all.
Do any methods need to change so that we can ensure this is always true?
We need to check the initializer and any other method that mutates self.likes.
First, the initializer sets likes to 0, which satisfies this invariant.
The method Tweet.like adds to the likes attribute, which would seem safe, but what if the client code passes a negative number?
Again, we are faced with a choice on how to handle this.
We could impose a precondition that Tweet.like be called with n >= 0.
Or, we could allow negative numbers as input, but set self.likes = 0 if its value falls below 0.
Or, we could refuse to add a negative number and simply return (i.e., do nothing) in this case.
All of these options change the method’s behaviour, and so whatever we choose, we would need to update the method’s documentation!
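The three options can be sketched side by side. Tweet.like is not shown in this section, so we assume it takes an integer n and adds it to self.likes. The three method names below are hypothetical and exist only so the variants can sit in one class; in real code we would pick one and call it like.

```python
class Tweet:
    """A tweet, like in Twitter (only the likes attribute is modelled here).

    Representation Invariants:
        - self.likes >= 0
    """
    likes: int

    def __init__(self) -> None:
        """Initialize a new Tweet with no likes."""
        self.likes = 0

    def like_with_precondition(self, n: int) -> None:
        """Record n likes.

        Preconditions:
            - n >= 0
        """
        # Option 1: the client promises n >= 0, so no check is needed.
        self.likes += n

    def like_clamped(self, n: int) -> None:
        """Record n likes; if the total would fall below 0, store 0 instead."""
        # Option 2: accept any n, then repair the value if necessary.
        self.likes = max(self.likes + n, 0)

    def like_ignoring_negatives(self, n: int) -> None:
        """Record n likes; do nothing if n is negative."""
        # Option 3: refuse negative input and leave the object unchanged.
        if n < 0:
            return
        self.likes += n
```

Each docstring states a different contract, which is the point made above: the choice changes what client code may rely on, so it must be documented.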
Client code can violate representation invariants also
We’ve now learned how to write a class that declares and enforces appropriate representation invariants. We guarantee that whenever client code creates new instances of our class, and calls methods on them (obeying any preconditions we specify), our representation invariants will always be satisfied.
Sadly, even being vigilant in implementing our methods doesn’t fully prevent client code from violating representation invariants—we’ll see why in the next section.