Making the Most of Polymorphism with the Liskov Substitution Principle
October 04, 2018
In part 2 of the SOLID series, we reviewed how to use the Open/Closed Principle (OCP) to write more maintainable code. In short, we learned that, according to the OCP, objects should be “open for extension” but “closed to modification”. In other words, you should not have to change old code in order to implement new behavior. Rather, you should extend behavior by adding new abstractions, leaving old code untouched (and therefore avoiding cascading breakage).
Abstraction is the key to writing OCP-adherent code—any given object should be unaware of how its partner objects are implemented. One way to do this is through the use of interfaces, which are a kind of contract between objects that guarantees implementation of certain functionality. However, interfaces aren’t always an appropriate solution, particularly when the involved objects have a clear hierarchical parent-child relationship with one another. In such cases, the use of inheritance and polymorphism is probably a better bet. But without an interface contract, how can you guarantee that the objects you’re interacting with will all have the same behavior? To do so, you must ensure that as far as the function using it is concerned, a given object type and all of its subtypes can be used interchangeably. In other words, you have to adhere to the third of the SOLID principles, the Liskov Substitution Principle (LSP).
A Quick Refresher on SOLID
SOLID is an acronym for a set of five software development principles, which if followed, are intended to help developers create flexible and clean code. The five principles are:
- The Single Responsibility Principle—Classes should have a single responsibility and thus only a single reason to change.
- The Open/Closed Principle—Classes and other entities should be open for extension but closed for modification.
- The Liskov Substitution Principle—Objects should be replaceable by their subtypes.
- The Interface Segregation Principle—Interfaces should be client specific rather than general.
- The Dependency Inversion Principle—Depend on abstractions rather than concretions.
The Liskov Substitution Principle
In object-oriented design, a common technique for creating objects with like behavior is the use of super- and sub-types. A supertype is defined with some set of characteristics that all of its subtypes then inherit. In turn, subtypes may then choose to override the supertype’s implementation of some behavior, thus allowing for behavior differentiation through polymorphism. This is an extremely powerful technique; however, it raises the question of what exactly makes one object a subtype of another. Is it enough for a particular object to inherit from another? In 1987, Barbara Liskov proposed an answer to this question, arguing that an object should only be considered a subtype of another object if it is interchangeable with its parent object so far as any interacting function is concerned. Liskov and co-author Jeannette Wing further clarified this idea in their 1994 paper, A Behavioral Notion of Subtyping [1], in which they set out a requirement for constraining the behavior of subtypes:
Subtype Requirement: Let 𝝓(x) be a property provable about objects x of type T. Then 𝝓(y) should be true for objects y of type S where S is a subtype of T.
This is perhaps too academic of a definition for our purposes, but it hints at something important: if a parent object has some necessarily provable attribute, then its subtypes must have the same provable attribute. In his development of the SOLID principles, Robert C. Martin took this definition a step further by trying to restate it in a way that was more meaningful for day-to-day software development [2]. In Martin’s definition, the LSP is stated as follows:
Functions that use pointers or references to base classes must be able to use objects of derived classes without knowing it.
In other words, LSP-adherent design is about implicit contracts between derived classes and functions that use their parent classes. A derived class (or in Liskov’s terminology, a subtype) must behave in a manner that does not break a function that uses the derived class’ parent class. This idea of contracts in classes is closely related to Bertrand Meyer’s idea of Design by Contract, which roughly states that methods of classes should declare pre-conditions that must be true for a method to execute and post-conditions that are guaranteed to be true after the method executes [3]. In LSP terms, the validity of pre- and post-conditions is guaranteed by adherence to the following standards:
- A derived object cannot expect users to obey a stronger pre-condition than its parent object expects.
- A derived object may not guarantee a weaker post-condition than its parent object guarantees.
- Derived objects must accept anything that a base class could accept and have behaviors/outputs that do not violate the constraints of the base class.
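The first two rules can be sketched in a few lines of Ruby. The Label classes and the shout helper below are hypothetical and exist only for this illustration; the contracts are written as comments because Ruby does not enforce them.

```ruby
# Hypothetical classes, used only to illustrate the rules above.
class Label
  # Pre-condition: text is any String, including an empty one.
  # Post-condition: always returns a String.
  def render(text)
    "[#{text}]"
  end
end

# Stronger pre-condition: rejects input that the parent accepts.
class StrictLabel < Label
  def render(text)
    raise ArgumentError, "text must not be empty" if text.empty?
    super
  end
end

# Weaker post-condition: may return nil where the parent returns a String.
class LazyLabel < Label
  def render(text)
    text.empty? ? nil : super
  end
end

# A caller written against Label's contract only.
def shout(label, text)
  label.render(text).upcase
end

# Reports which error, if any, a given label causes in the caller.
def try_shout(label, text)
  shout(label, text)
rescue ArgumentError, NoMethodError => e
  "broken: #{e.class}"
end

puts try_shout(Label.new, "")        # the parent honors its contract
puts try_shout(StrictLabel.new, "")  # fails on input the parent accepts
puts try_shout(LazyLabel.new, "")    # fails on output the parent never produces
```

Both subclasses inherit from Label, yet neither can stand in for it: the caller did nothing wrong by the parent’s contract and still breaks.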
In studying principles such as the LSP it’s easy to lose sight of why they matter and what they are really saying. The complex language and apparent dogma of such principles have a habit of overshadowing real-world considerations. But once you strip out the academic / technical language, the LSP is really just saying that subtypes should not break the contracts set by their parent types. In practical terms, this means that if a given function uses some object, then you should be able to replace that object with one of its subtypes without anything breaking.
As for why this is a good practice, the answer is that failure to adhere to the LSP quickly raises problems as a codebase expands. Without LSP adherence, changes to a program are likely to have unexpected consequences and/or require opening a previously closed class. On the other hand, following the LSP allows easy extension of program behavior because subclasses can be inserted into working code without causing undesired outcomes.
But… It’s Working Fine Already
One of the things that makes the application of SOLID principles difficult is that programs with flawed design may, at first, be working just fine. Let’s look at one such program.
class Laboratory
  attr_accessor :scientists

  def initialize(scientists, experiments)
    @scientists = scientists
    @experiments = experiments
  end

  def run_all_experiments
    @scientists.each do |scientist|
      @experiments.each do |experiment|
        if scientist.class == MadScientist
          sabotage = [true, false].sample
          scientist.run_experiment(experiment, sabotage)
        else
          scientist.run_experiment(experiment)
        end
      end
    end
  end
end

class Experiment
  attr_accessor :title

  def initialize(title)
    @title = title
  end
end

class Scientist
  attr_accessor :name

  def initialize(name)
    @name = name
  end

  def run_experiment(experiment)
    puts "#{name} is now running the #{experiment.title} experiment."
  end
end

class MadScientist < Scientist
  def run_experiment(experiment, sabotage)
    if sabotage
      puts "#{name} is now sabotaging the #{experiment.title} experiment!"
    else
      puts "#{name} is now running the #{experiment.title} experiment."
    end
  end
end

chemistry_experiment = Experiment.new("chemistry")
physics_experiment = Experiment.new("physics")
biology_experiment = Experiment.new("biology")
experiments = [chemistry_experiment, physics_experiment, biology_experiment]

marie_curie = Scientist.new("Marie Curie")
niels_bohr = Scientist.new("Niels Bohr")
hubert_farnsworth = MadScientist.new("Hubert Farnsworth")
scientists = [marie_curie, niels_bohr, hubert_farnsworth]

lab = Laboratory.new(scientists, experiments)
lab.run_all_experiments
Here we have a program to handle the operations of a scientific laboratory. It has a Laboratory class, which houses a set of scientists and experiments and has the ability to run all the experiments on its docket. The Laboratory does this by iterating over each of its scientists and having them each run every experiment (presumably for reproducibility and peer-review purposes). Separately, the program has an Experiment class and a Scientist class, which are used to instantiate experiments and scientists respectively. Note how the Laboratory#run_all_experiments method depends on each scientist having a run_experiment method, to which it passes the current experiment object. Finally, and critically for our purposes, the program has a MadScientist class, which is a subclass of Scientist. The MadScientist class overrides its parent’s run_experiment method and requires a boolean sabotage parameter in addition to the existing experiment parameter. MadScientist objects use the sabotage parameter to, unsurprisingly, decide whether to sabotage the current experiment.
When we run this program, the output works just fine. Our scientists marie_curie and niels_bohr dutifully carry out their experiments while our mad scientist, hubert_farnsworth, randomly sabotages experiments. However, on closer examination, it’s clear that we have some design problems. The MadScientist subclass requires an additional parameter to execute its run_experiment method. Put another way, the MadScientist#run_experiment method has stricter preconditions than its parent Scientist#run_experiment method. This is a clear violation of the LSP. As a result, in order for Laboratory#run_all_experiments to execute without any errors, it has to check the type of the current scientist and pass in different parameters depending on whether the scientist is mad or not. With only two scientist types, this isn’t such a big deal, but if we extend program behavior by adding new subtypes, this problem will only get worse.
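To see why the type check is load-bearing, here is a small sketch of what happens when a caller treats every scientist as a plain Scientist. The classes are copied from the program above; the run_as_scientist helper is a hypothetical addition that stands in for any code written against the parent class.

```ruby
class Experiment
  attr_accessor :title

  def initialize(title)
    @title = title
  end
end

class Scientist
  attr_accessor :name

  def initialize(name)
    @name = name
  end

  def run_experiment(experiment)
    puts "#{name} is now running the #{experiment.title} experiment."
  end
end

class MadScientist < Scientist
  def run_experiment(experiment, sabotage)
    if sabotage
      puts "#{name} is now sabotaging the #{experiment.title} experiment!"
    else
      puts "#{name} is now running the #{experiment.title} experiment."
    end
  end
end

# Hypothetical helper: calls run_experiment the way Scientist defines it
# and reports whether the call succeeded.
def run_as_scientist(scientist, experiment)
  scientist.run_experiment(experiment)
  "ran without error"
rescue ArgumentError => e
  "ArgumentError: #{e.message}"
end

experiment = Experiment.new("chemistry")
puts run_as_scientist(Scientist.new("Marie Curie"), experiment)
puts run_as_scientist(MadScientist.new("Hubert Farnsworth"), experiment)
```

The second call raises an ArgumentError because MadScientist#run_experiment demands an argument the parent never asked for. The subclass cannot be substituted for its parent, so every caller has to know about it.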
Extending Behavior with Proper Subtypes
Writing SOLID code is in many ways an exercise in defensive prediction. It’s not always possible to know where your program requirements will eventually go, and yet you must prepare to accept new requirements anyway. In order to defend against future problems, one must predict possible areas for extension. The easiest way to do this is to use abstraction early on. In the case of our laboratory program, should the Laboratory class really know or care whether its scientists are mad or not? As soon as we knew mad scientists were a possibility, we should have stopped to consider how their behavior might differ from that of regular scientists. Both can run experiments, but for one to be a true subtype of the other, they must do so using the same inputs.
Consider two flawed approaches to fixing this problem:
- We could simply pass all scientists (mad or not) a sabotage argument in the hope that only the mad ones would use it. However, this would require that we change the signature of the Scientist#run_experiment method so that it takes in a sabotage argument lest we raise an ArgumentError. This is problematic because the Scientist objects would never actually use this argument, so passing it to them introduces unnecessary constraints and opportunity for errors.
- We could make MadScientist its own base class rather than having it inherit from Scientist; however, this would eliminate the polymorphism benefits that we get from the use of subclasses. In this case, Laboratory would still have to check whether a given staff member was a Scientist or MadScientist and pass arguments accordingly.
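To make the first flawed approach concrete, here is a minimal sketch of what it might look like. It assumes the sabotage flag is added to the parent as a required parameter, and the methods return their messages instead of printing them purely so the result is easy to inspect; neither detail comes from the original program.

```ruby
class Experiment
  attr_accessor :title

  def initialize(title)
    @title = title
  end
end

class Scientist
  attr_accessor :name

  def initialize(name)
    @name = name
  end

  # The parent accepts a sabotage flag only so that its signature
  # matches MadScientist; the flag is never read.
  def run_experiment(experiment, sabotage)
    "#{name} is now running the #{experiment.title} experiment."
  end
end

class MadScientist < Scientist
  def run_experiment(experiment, sabotage)
    if sabotage
      "#{name} is now sabotaging the #{experiment.title} experiment!"
    else
      super
    end
  end
end

experiment = Experiment.new("physics")

# The same argument means something for one type and nothing for the other.
puts Scientist.new("Niels Bohr").run_experiment(experiment, true)
puts MadScientist.new("Hubert Farnsworth").run_experiment(experiment, true)
```

The signatures now agree, so the ArgumentError is gone, but the caller must still invent a sabotage value for scientists who will silently ignore it.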
In an ideal world, our Laboratory should be able to run its experiments using any type of scientist given to it without care for how each of them implements their run_experiment method. In our current implementation, the randomized sabotage argument is generated in the Laboratory, but really this is an implementation detail relevant only to those scientists who are mad. Should not the decision to sabotage be the mad scientist’s responsibility rather than the laboratory’s? Let’s see what that would look like.
class Laboratory
  attr_accessor :scientists

  def initialize(scientists, experiments)
    @scientists = scientists
    @experiments = experiments
  end

  def run_all_experiments
    @scientists.each do |scientist|
      @experiments.each do |experiment|
        scientist.run_experiment(experiment)
      end
    end
  end
end

class MadScientist < Scientist
  def run_experiment(experiment)
    if sabotage?
      puts "#{name} is now sabotaging the #{experiment.title} experiment!"
    else
      puts "#{name} is now running the #{experiment.title} experiment."
    end
  end

  private

  def sabotage?
    [true, false].sample
  end
end
In this version of the program, the Laboratory#run_all_experiments method no longer does any type checking. It simply passes each experiment to its scientists in turn without concern for whether they are mad. We have also removed the sabotage parameter from the signature of the MadScientist implementation of run_experiment, meaning that its pre-conditions now match those of its parent class, Scientist. Rather than rely on a passed-in argument, the MadScientist now has a private sabotage? method, which it calls inside its run_experiment method and applies the result accordingly. When we run the program, we get the same results as in the first version.
Updating our program required very few changes, but it did require some careful thought about what it is for one object to be a subtype of another and where responsibility for behavior differentiation should lie. Even with these small changes, though, our program has become significantly more flexible: we can now add as many derived classes of Scientist as we like, each with its own implementation of run_experiment, so long as they all adhere to the contract set by the parent Scientist class.
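As a sketch of that flexibility, the example below adds a new subclass and runs it through the refactored Laboratory without touching the Laboratory at all. The MeticulousScientist class and the sample data are hypothetical additions for illustration; the other classes are repeated from above so the example runs on its own.

```ruby
class Laboratory
  attr_accessor :scientists

  def initialize(scientists, experiments)
    @scientists = scientists
    @experiments = experiments
  end

  def run_all_experiments
    @scientists.each do |scientist|
      @experiments.each do |experiment|
        scientist.run_experiment(experiment)
      end
    end
  end
end

class Experiment
  attr_accessor :title

  def initialize(title)
    @title = title
  end
end

class Scientist
  attr_accessor :name

  def initialize(name)
    @name = name
  end

  def run_experiment(experiment)
    puts "#{name} is now running the #{experiment.title} experiment."
  end
end

# Hypothetical new subtype: same signature as the parent, extra behavior.
class MeticulousScientist < Scientist
  def run_experiment(experiment)
    super
    puts "#{name} is double-checking the #{experiment.title} results."
  end
end

experiments = [Experiment.new("chemistry"), Experiment.new("physics")]
scientists = [Scientist.new("Marie Curie"), MeticulousScientist.new("Rosalind Franklin")]

# Laboratory is unchanged; it neither knows nor cares which subtype it holds.
Laboratory.new(scientists, experiments).run_all_experiments
```

Because MeticulousScientist#run_experiment takes the same single argument as its parent, the new behavior arrives purely by extension, which is what the Open/Closed Principle asks for.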
TL;DR
The third of the SOLID principles, the Liskov Substitution Principle (LSP), states that a subtype of a given object must be interchangeable with its parent so far as any functions that rely on the parent are concerned. This principle is closely related to the concept of Design by Contract, which describes the use of pre- and post-conditions for any class’ methods. With the LSP, a subtype may neither define pre-conditions that are stronger than those of its parent, nor define post-conditions that are weaker than those of its parent. In practice, this means that any function making use of a particular object may also make use of that object’s subtypes without any adverse effects. By adhering to the LSP, which is inherently necessary if adhering to the related Open/Closed Principle, it is easier to produce flexible, maintainable, and ultimately reusable code.
References
[1] Paper: A Behavioral Notion of Subtyping; Liskov, Barbara, and Wing, Jeannette
[2] Paper: The Liskov Substitution Principle; Martin, Robert C.
[3] Paper: Applying “Design by Contract”; Meyer, Bertrand
[4] Wikipedia: Liskov Substitution Principle
We’re officially more than half-way through our exploration of the SOLID principles. Stay tuned for upcoming articles on the Interface Segregation Principle and the Dependency Inversion Principle.
Note: This article was originally published on Medium.