Making the Most of Polymorphism with the Liskov Substitution Principle

October 04, 2018

In part 2 of the SOLID series, we reviewed how to use the Open/Closed Principle (OCP) to write more maintainable code. In short, we learned that, according to the OCP, objects should be “open for extension” but “closed for modification”. In other words, you should not have to change old code in order to implement new behavior. Rather, you should extend behavior by adding new abstractions, leaving old code untouched (and thereby avoiding cascading breakage).

Abstraction is the key to writing OCP-adherent code—any given object should be unaware of how its partner objects are implemented. One way to do this is through the use of interfaces, which are a kind of contract between objects that guarantees implementation of certain functionality. However, interfaces aren’t always an appropriate solution, particularly when the involved objects have a clear hierarchical parent-child relationship with one another. In such cases, the use of inheritance and polymorphism is probably a better bet. But without an interface contract, how can you guarantee that the objects you’re interacting with will all have the same behavior? To do so, you must ensure that as far as the function using it is concerned, a given object type and all of its subtypes can be used interchangeably. In other words, you have to adhere to the third of the SOLID principles, the Liskov Substitution Principle (LSP).

A Quick Refresher on SOLID

SOLID is an acronym for a set of five software development principles that, if followed, are intended to help developers create flexible and clean code. The five principles are:

The Single Responsibility Principle—Classes should have a single responsibility and thus only a single reason to change.
The Open/Closed Principle—Classes and other entities should be open for extension but closed for modification.
The Liskov Substitution Principle—Objects should be replaceable by their subtypes.
The Interface Segregation Principle—Interfaces should be client specific rather than general.
The Dependency Inversion Principle—Depend on abstractions rather than concretions.

The Liskov Substitution Principle

In object-oriented design, a common technique for creating objects with like behavior is the use of super- and sub-types. A supertype is defined with some set of characteristics that all of its subtypes then inherit. In turn, subtypes may choose to override the supertype’s implementation of some behavior, thus allowing for behavior differentiation through polymorphism. This is an extremely powerful technique; however, it raises the question of what exactly makes one object a subtype of another. Is it enough for a particular object to inherit from another? In 1987, Barbara Liskov proposed an answer to this question, arguing that an object should only be considered a subtype of another object if it is interchangeable with its parent object so far as any interacting function is concerned. Liskov and co-author Jeannette Wing further clarified this idea in their 1994 paper, A Behavioral Notion of Subtyping [1], in which they set out a requirement for constraining the behavior of subtypes:

Subtype Requirement: Let 𝝓(x) be a property provable about objects x of type T. Then 𝝓(y) should be true for objects y of type S where S is a subtype of T.

This is perhaps too academic a definition for our purposes, but it hints at something important: if some property can be proven about a parent object, then the same property must hold for its subtypes. In his development of the SOLID principles, Robert C. Martin took this definition a step further by trying to restate it in a way that was more meaningful for day-to-day software development [2]. In Martin’s definition, the LSP is stated as follows:

Functions that use pointers or references to base classes must be able to use objects of derived classes without knowing it.

In other words, LSP-adherent design is about implicit contracts between derived classes and functions that use their parent classes. A derived class (or in Liskov’s terminology, a subtype) must behave in a manner that does not break a function that uses the derived class’ parent class. This idea of contracts in classes is closely related to Bertrand Meyer’s idea of Design by Contract, which roughly states that methods of classes should declare pre-conditions that must be true for a method to execute and post-conditions that are guaranteed to be true after the method executes [3]. In LSP terms, the validity of pre- and post-conditions is guaranteed by adherence to the following standards:

A derived object cannot expect users to obey a stronger pre-condition than its parent object expects.
A derived object may not guarantee a weaker post-condition than its parent object guarantees.
Derived objects must accept anything that a base class could accept and have behaviors/outputs that do not violate the constraints of the base class.
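
As a rough illustration of the first two rules, here is a minimal sketch built on hypothetical Sampler classes. They are not part of the laboratory program discussed later and exist only for this example. PickySampler strengthens a pre-condition, LeakySampler weakens a post-condition, and a client written purely against the parent’s contract is handed each of them in turn.

```ruby
# Hypothetical classes for illustration only.
class Sampler
  # Pre-condition: volume_ml is any non-negative number.
  # Post-condition: returns the drawn volume as a Float.
  def draw(volume_ml)
    raise ArgumentError, "volume must be non-negative" if volume_ml.negative?
    volume_ml.to_f
  end
end

class PickySampler < Sampler
  def draw(volume_ml)
    # Stronger pre-condition: the parent accepts any number, this one only Integers.
    raise ArgumentError, "volume must be an Integer" unless volume_ml.is_a?(Integer)
    super
  end
end

class LeakySampler < Sampler
  def draw(volume_ml)
    # Weaker post-condition: may return nil where the parent always returns a Float.
    return nil if volume_ml.zero?
    super
  end
end

# A client that relies only on the contract documented by Sampler.
def total_drawn(sampler, volumes)
  volumes.sum { |volume| sampler.draw(volume) }
end

volumes = [1.5, 0, 2]

[Sampler.new, PickySampler.new, LeakySampler.new].each do |sampler|
  begin
    puts "#{sampler.class}: drew #{total_drawn(sampler, volumes)} ml"
  rescue ArgumentError, TypeError => e
    puts "#{sampler.class}: broke the client (#{e.class})"
  end
end
```

Only the plain Sampler completes. The other two raise inside a client that did nothing wrong by the parent’s contract, which is exactly the kind of breakage the rules above are meant to prevent.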

In studying principles such as the LSP it’s easy to lose sight of why they matter and what they are really saying. The complex language and apparent dogma of such principles have a habit of overshadowing real-world considerations. But once you strip out the academic / technical language, the LSP is really just saying that subtypes should not break the contracts set by their parent types. In practical terms, this means that if a given function uses some object, then you should be able to replace that object with one of its subtypes without anything breaking.


As for why this is a good practice, the answer is that failure to adhere to the LSP quickly raises problems as a codebase expands. Without LSP adherence, changes to a program are likely to have unexpected consequences and/or require opening a previously closed class. On the other hand, following the LSP allows easy extension of program behavior because subclasses can be inserted into working code without causing undesired outcomes.

But… It’s Working Fine Already

One of the things that makes the application of SOLID principles difficult is that programs with flawed design may, at first, be working just fine. Let’s look at one such program.

class Laboratory
  attr_accessor :scientists

  def initialize(scientists, experiments)
    @scientists = scientists
    @experiments = experiments
  end

  def run_all_experiments
    @scientists.each do |scientist|
      @experiments.each do |experiment|
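        # Branching on the concrete class is needed only because
        # MadScientist#run_experiment takes an extra argument.
        if scientist.class == MadScientist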
          sabotage = [true, false].sample
          scientist.run_experiment(experiment, sabotage)
        else
          scientist.run_experiment(experiment)
        end
      end
    end
  end
end

class Experiment
  attr_accessor :title

  def initialize(title)
    @title = title
  end
end

class Scientist
  attr_accessor :name

  def initialize(name)
    @name = name
  end

  def run_experiment(experiment)
    puts "#{name} is now running the #{experiment.title} experiment."
  end
end

class MadScientist < Scientist
  # Overrides the parent method but requires a second argument.
  def run_experiment(experiment, sabotage)
    if sabotage
      puts "#{name} is now sabotaging the #{experiment.title} experiment!"
    else
      puts "#{name} is now running the #{experiment.title} experiment."
    end
  end
end

chemistry_experiment = Experiment.new("chemistry")
physics_experiment = Experiment.new("physics")
biology_experiment = Experiment.new("biology")
experiments = [chemistry_experiment, physics_experiment, biology_experiment]

marie_curie = Scientist.new("Marie Curie")
niels_bohr = Scientist.new("Niels Bohr")
hubert_farnsworth = MadScientist.new("Hubert Farnsworth")
scientists = [marie_curie, niels_bohr, hubert_farnsworth]

lab = Laboratory.new(scientists, experiments)
lab.run_all_experiments

Here we have a program to handle the operations of a scientific laboratory. It has a Laboratory class, which houses a set of scientists and experiments and has the ability to run all the experiments on its docket. The Laboratory does this by iterating over each of its scientists and having them each run every experiment (presumably for reproducibility and peer-review purposes). Separately, the program has an Experiment class and a Scientist class, which are used to instantiate experiments and scientists respectively. Note how the Laboratory#run_all_experiments method depends on each scientist having a run_experiment method, to which it passes the current experiment object. Finally, and critically for our purposes, the program has a MadScientist class, which is a subclass of Scientist. The MadScientist class overrides its parent’s run_experiment method and requires a boolean sabotage parameter in addition to the existing experiment parameter. MadScientist objects use the sabotage parameter to, unsurprisingly, decide whether to sabotage the current experiment.

When we run this program, it works just fine. Our scientists marie_curie and niels_bohr dutifully carry out their experiments while our mad scientist, hubert_farnsworth, randomly sabotages experiments. However, on closer examination, it’s clear that we have some design problems. The MadScientist subclass requires an additional parameter to execute its run_experiment method. Put another way, the MadScientist#run_experiment method has stricter pre-conditions than its parent Scientist#run_experiment method. This is a clear violation of the LSP. As a result, in order for Laboratory#run_all_experiments to execute without any errors, it has to check the type of the current scientist and pass in different parameters depending on whether the scientist is mad or not. With only two scientist types, this isn’t such a big deal, but every new subtype with its own signature would add another branch to that check, so the problem will only get worse as we extend the program.
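
To see why the type check is load-bearing, consider a minimal sketch in which a caller knows only the Scientist contract. The classes are trimmed copies of the ones above; the run_with helper is hypothetical and stands in for any code that treats every scientist alike.

```ruby
class Experiment
  attr_reader :title

  def initialize(title)
    @title = title
  end
end

class Scientist
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def run_experiment(experiment)
    puts "#{name} is now running the #{experiment.title} experiment."
  end
end

class MadScientist < Scientist
  # Same signature problem as in the program above: two required arguments.
  def run_experiment(experiment, sabotage)
    if sabotage
      puts "#{name} is now sabotaging the #{experiment.title} experiment!"
    else
      puts "#{name} is now running the #{experiment.title} experiment."
    end
  end
end

# Hypothetical client: it relies on the parent's one-argument contract.
def run_with(scientist, experiment)
  scientist.run_experiment(experiment)
end

experiment = Experiment.new("chemistry")
run_with(Scientist.new("Marie Curie"), experiment)

begin
  # Substituting the subtype breaks the client.
  run_with(MadScientist.new("Hubert Farnsworth"), experiment)
rescue ArgumentError => e
  puts "Substitution failed: #{e.message}"
end
```

The second call raises an ArgumentError because the subtype cannot be called the way its parent can. The type check in Laboratory exists solely to paper over that mismatch.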

Extending Behavior with Proper Subtypes

Writing SOLID code is in many ways an exercise in defensive prediction. It’s not always possible to know where your program requirements will eventually go, and yet you must prepare to accept new requirements anyway. In order to defend against future problems, one must predict possible areas for extension. The easiest way to do this is to use abstraction early on. In the case of our laboratory program, should the Laboratory class really know or care whether its scientists are mad or not? As soon as we knew mad scientists were a possibility, we should have stopped to consider how their behavior might differ from that of regular scientists. Both can run experiments, but for one to be a true subtype of the other, they must do so using the same inputs.

Consider two flawed approaches to fixing this problem:

We could simply pass all scientists (mad or not) a sabotage argument in the hope that only the mad ones would use it. However, this would require that we change the signature of the Scientist#run_experiment method so that it takes in a sabotage argument lest we raise an ArgumentError. This is problematic because the Scientist objects would never actually use this argument so passing it to them introduces unnecessary constraints and opportunity for errors.
We could make MadScientist its own base class rather than having it inherit from Scientist; however, this would eliminate the polymorphism benefits that we get from the use of subclasses. In this case, Laboratory would still have to check whether a given staff member was a Scientist or MadScientist and pass arguments accordingly.
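
The first of these approaches might look roughly like the sketch below. It is a trimmed, assumed rendering of the idea rather than code from the original program: the signatures now match, but Scientist carries a parameter it never reads and the caller still has to invent a value for it.

```ruby
class Experiment
  attr_reader :title

  def initialize(title)
    @title = title
  end
end

class Scientist
  attr_reader :name

  def initialize(name)
    @name = name
  end

  # `sabotage` is accepted only to keep the signatures aligned; it is never used.
  def run_experiment(experiment, sabotage)
    puts "#{name} is now running the #{experiment.title} experiment."
  end
end

class MadScientist < Scientist
  def run_experiment(experiment, sabotage)
    if sabotage
      puts "#{name} is now sabotaging the #{experiment.title} experiment!"
    else
      puts "#{name} is now running the #{experiment.title} experiment."
    end
  end
end

experiment = Experiment.new("physics")
scientists = [Scientist.new("Niels Bohr"), MadScientist.new("Hubert Farnsworth")]

scientists.each do |scientist|
  # The caller still owns the sabotage decision, even for scientists who ignore it.
  scientist.run_experiment(experiment, [true, false].sample)
end
```

The type check is gone, but the sabotage decision has leaked into every caller and into a class that has no use for it.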

In an ideal world, our Laboratory should be able to run its experiments using any type of scientist given to it without care for how each of them implements their run_experiment method. In our current implementation, the randomized sabotage argument is generated in the Laboratory, but really this is an implementation detail relevant only to those scientists who are mad. Should not the decision to sabotage be the mad scientist’s responsibility rather than the laboratory’s? Let’s see what that would look like.

class Laboratory
  attr_accessor :scientists

  def initialize(scientists, experiments)
    @scientists = scientists
    @experiments = experiments
  end

  def run_all_experiments
    @scientists.each do |scientist|
      @experiments.each do |experiment|
        scientist.run_experiment(experiment)
      end
    end
  end
end

class MadScientist < Scientist
  def run_experiment(experiment)
    if sabotage?
      puts "#{name} is now sabotaging the #{experiment.title} experiment!"
    else
      puts "#{name} is now running the #{experiment.title} experiment."
    end
  end

  private

  def sabotage?
    [true, false].sample
  end
end

In this version of the program, the Laboratory#run_all_experiments method no longer does any type checking. It simply passes each experiment to its scientists in turn without concern for whether they are mad. We have also removed the sabotage parameter from the signature of the MadScientist implementation of run_experiment, meaning that its pre-conditions now match those of its parent class, Scientist. Rather than rely on a passed-in argument, the MadScientist now has a private sabotage? method, which it calls inside its run_experiment method and applies the result accordingly. When we run the program, we get the same results as in the first version.

Updating our program required very few changes, but it did require some careful thought about what it means for one object to be a subtype of another and where responsibility for behavior differentiation should lie. Even with these small changes, though, our program has become significantly more flexible: we can now add as many derived classes of Scientist as we like, each with its own implementation of run_experiment, so long as they all adhere to the contract set by the parent Scientist class.
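
As a sketch of that flexibility, the example below adds a hypothetical SkepticalScientist subtype (an invented name, not part of the original program) to the refactored design. Laboratory is reproduced unchanged from the second listing, and it runs the new subtype without any modification because the one-argument contract is preserved.

```ruby
class Experiment
  attr_reader :title

  def initialize(title)
    @title = title
  end
end

class Scientist
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def run_experiment(experiment)
    puts "#{name} is now running the #{experiment.title} experiment."
  end
end

# Same Laboratory as in the refactored program: no type checks.
class Laboratory
  def initialize(scientists, experiments)
    @scientists = scientists
    @experiments = experiments
  end

  def run_all_experiments
    @scientists.each do |scientist|
      @experiments.each do |experiment|
        scientist.run_experiment(experiment)
      end
    end
  end
end

# Hypothetical new subtype: different behavior, same signature as the parent.
class SkepticalScientist < Scientist
  def run_experiment(experiment)
    super
    puts "#{name} is repeating the #{experiment.title} experiment to confirm the result."
  end
end

experiments = [Experiment.new("chemistry"), Experiment.new("physics")]
scientists = [Scientist.new("Marie Curie"), SkepticalScientist.new("Niels Bohr")]

Laboratory.new(scientists, experiments).run_all_experiments
```

Because SkepticalScientist accepts exactly what Scientist accepts, the new behavior arrives purely by extension, which is the Open/Closed Principle and the LSP working together.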

TL;DR

The third of the SOLID principles, the Liskov Substitution Principle (LSP), states that a subtype of a given object must be interchangeable with its parent so far as any functions that rely on the parent are concerned. This principle is closely related to the concept of Design by Contract, which describes the use of pre- and post-conditions for any class’ methods. With the LSP, a subtype may neither define pre-conditions that are stronger than those of its parent, nor define post-conditions that are weaker than those of its parent. In practice, this means that any function making use of a particular object may also make use of that object’s subtypes without any adverse effects. By adhering to the LSP, which is inherently necessary if adhering to the related Open/Closed Principle, it is easier to produce flexible, maintainable, and ultimately reusable code.

References

We’re officially more than half-way through our exploration of the SOLID principles. Stay tuned for upcoming articles on the Interface Segregation Principle and the Dependency Inversion Principle.

Note: This article was originally published on Medium.

Severin Perez