Polymorphism in Elixir: Protocols vs. Behaviours

Ah, behaviours and protocols - those mysterious beasts. As a newcomer to Elixir you don’t really need to deal with them, and you’ll be far too busy wrapping your mind around syntax, functional programming and processes to care much anyway.

Once you’ve got the basics down though, it’s important to understand how these two approaches work together. They are the foundation of polymorphism in Elixir, and datatype polymorphism is one of the major advantages Elixir holds over its cousin, Erlang.

Polymorphism: the condition of occurring in several different forms. (Polymorphism as defined by the magic Google dictionary)

Here are the major differences between protocols and behaviours in Elixir:

| Protocols | Behaviours |
| --- | --- |
| Apply to data structures | Apply to modules |
| Specify new implementations of existing functions for new datatypes | Specify a public spec/contract for modules to implement |
| “Here’s my datatype, it can X” | “Here’s my module, it implements Y” |
| Exclusive to Elixir | Provided by the Erlang runtime |

Let’s dive into some examples.

Protocols

We’re going to create a new protocol called Nameable. This protocol defines some useful functions for dealing with things that have one or more names.

defprotocol Nameable do
  # The typespec is optional but useful for implementers of the protocol.
  @type name :: String.t

  @spec first_name(t) :: name
  def first_name(nameable)

  @spec last_name(t) :: name
  def last_name(nameable)

  @spec full_name(t) :: name
  def full_name(nameable)

  @spec number_of_names(t) :: integer
  def number_of_names(nameable)
end

Now we’re going to create a struct called %Person{} that implements Nameable. The beauty of protocols is that we didn’t need to know anything about %Person{} when we defined the protocol. The protocol is extensible: you can define it once, and any number of unknown future data structures can implement it.

defmodule Person do
  defstruct [:names]
end

defimpl Nameable, for: Person do
  def first_name(person), do: List.first(person.names)
  def last_name(person),  do: List.last(person.names)
  def full_name(person),  do: Enum.join(person.names, " ")
  def number_of_names(person), do: Enum.count(person.names)
end

We can call these functions on the Nameable protocol just as we would on any ordinary module. We can supply as an argument anything that implements Nameable.

iex > linus = %Person{names: ~w(Linus Torvalds)}
=> %Person{names: ["Linus", "Torvalds"]}
iex > Nameable.full_name(linus)
=> "Linus Torvalds"
iex > eric = %Person{names: ~w(Eric S Raymond)}
=> %Person{names: ["Eric", "S", "Raymond"]}
iex > Nameable.number_of_names(eric)
=> 3

Now let’s say we want to include dogs in our application. Dogs can have a name too, but they only have one.

defmodule Dog do
  defstruct [:name]
end

What happens if we try to pass this struct to our Nameable module?

iex > my_dog = %Dog{name: "Max"}
=> %Dog{name: "Max"}
iex > Nameable.full_name(my_dog)
** (Protocol.UndefinedError) protocol Nameable not implemented for %Dog{name: "Max"}

Kaboom! Each struct requires its own implementation of a protocol. So before we can use it, we have to implement the Nameable protocol for our %Dog{} struct.

defimpl Nameable, for: Dog do
  def first_name(dog), do: dog.name
  def last_name(_dog), do: "" # Dogs don't have a last name
  def full_name(dog),  do: dog.name
  def number_of_names(_dog), do: 1 # Dogs always only have one name
end

Let’s see if that works.

iex > my_dog = %Dog{name: "Max"}
=> %Dog{name: "Max"}
iex > Nameable.full_name(my_dog)
=> "Max"

Hooray! Now we can pass either a %Dog{} or a %Person{} to Nameable.full_name and it will work. We can go on to create as many other structs as we like and write implementations for those to work with Nameable, or write implementations for existing Elixir data structures like lists or maps.
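To make that last point concrete, here is a minimal sketch of Nameable implemented for two built-in types. The protocol is trimmed to two functions so the snippet stands alone, and the decision to treat a string as whitespace-separated names is an assumption made purely for illustration.

```elixir
# A trimmed-down copy of the protocol so this snippet runs on its own.
defprotocol Nameable do
  @spec full_name(t) :: String.t()
  def full_name(nameable)

  @spec number_of_names(t) :: integer
  def number_of_names(nameable)
end

# Treat a plain list as a list of names.
defimpl Nameable, for: List do
  def full_name(names), do: Enum.join(names, " ")
  def number_of_names(names), do: length(names)
end

# Treat a string as whitespace-separated names (an illustrative assumption).
# Elixir strings are binaries, which dispatch to the BitString implementation.
defimpl Nameable, for: BitString do
  def full_name(name), do: name
  def number_of_names(name), do: name |> String.split() |> length()
end

IO.puts(Nameable.full_name(["Grace", "Hopper"]))      # Grace Hopper
IO.inspect(Nameable.number_of_names("Ada Lovelace"))  # 2
```

Note that neither List nor BitString had to be modified or even reopened; the implementations live entirely alongside the protocol.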

You might wonder at this point what happens if you specify an implementation but don’t actually write all (or any) of the required functions. In this case Elixir will emit a compiler warning, and if you try to call one of the missing functions you’ll see an undefined function error.
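Here is a small sketch of that failure mode. The %Cat{} struct is hypothetical, and the protocol is cut down to two functions so the snippet stands alone. Compiling it should print a warning about the missing last_name/1, and the call is wrapped in try/rescue only so that the script keeps running.

```elixir
defprotocol Nameable do
  def first_name(nameable)
  def last_name(nameable)
end

# Hypothetical struct used only for this example.
defmodule Cat do
  defstruct [:name]
end

# Incomplete on purpose: last_name/1 is missing, so the compiler warns here.
defimpl Nameable, for: Cat do
  def first_name(cat), do: cat.name
end

tom = %Cat{name: "Tom"}

# The implemented function works as usual.
IO.puts(Nameable.first_name(tom))

# The missing one fails at call time with an UndefinedFunctionError.
result =
  try do
    Nameable.last_name(tom)
  rescue
    e in UndefinedFunctionError -> {:error, e.function, e.arity}
  end

IO.inspect(result)
```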

So much for protocols. Now what about behaviours?

Behaviours

Where protocols allow us to mix and match new data structures, behaviours let us mix and match modules. In other words, behaviours are pluggable backends, or interfaces.

Let’s imagine we want to be able to serialize some Elixir data into various formats. We might want to serialize it into JSON, XML, native code literals or some other unknown format. It would be nice if the code that wants to serialize the data didn’t need to know the implementation of every serializer, and could instead deal with an idealized abstract version of it. This way we can hide the implementation detail from the calling code and make our backends pluggable.

# we want to be able to call all serializers in the same way
SomeArbitrarySerializer.serialize(my_data)

For this to work, the caller of this abstract interface and the implementer of each specific backend need to agree on a “contract”: a clear interface guaranteeing that certain function calls (callbacks) will always work. That’s where behaviours come in.

Here’s an example:

# Any implementation of this behaviour _must_ define the `serialize` callback.
defmodule Serializer do
  @callback serialize(any) :: String.t
end

It’s important to understand that this has not yet defined any function called Serializer.serialize/1. Instead, @callback is a module attribute that the Elixir compiler treats specially. Behind the scenes it calls Kernel.Typespec.defcallback/1 to register the callback type, then stores the callback in a way that the Erlang virtual machine can understand.
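One way to see this for yourself is to ask the compiled module what it actually contains. The sketch below relies on behaviour_info/1, the function the Erlang runtime uses to look up a behaviour's callbacks, and on the built-in function_exported?/3.

```elixir
defmodule Serializer do
  @callback serialize(any) :: String.t()
end

# No Serializer.serialize/1 function was defined...
has_function = function_exported?(Serializer, :serialize, 1)
IO.inspect(has_function) # false

# ...but the module records which callbacks implementers must provide.
callbacks = Serializer.behaviour_info(:callbacks)
IO.inspect(callbacks) # [serialize: 1]
```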

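Any modules that implement the Serializer behaviour must do two things:

1. Declare that they implement the behaviour, using @behaviour Serializer.
2. Define every callback the behaviour requires, in this case serialize/1.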

With this guarantee, we can write any module we like that implements the Serializer behaviour and we know it will work with all code that expects a Serializer.

# ❌ incorrect implementation of Serializer
defmodule JSONSerializer do
  @behaviour Serializer
end

# => warning: undefined behaviour function serialize/1 (for behaviour Serializer)

Oops! We forgot to implement the required serialize/1 function. Luckily the Elixir compiler has our back - if we try to compile the code above, we see a warning message.

Let’s try that again:

# ✅ correct implementation of Serializer
defmodule JSONSerializer do
  @behaviour Serializer

  def serialize(term) do
    # do_some_magic/1 stands in for the real encoding logic; it is not defined here.
    do_some_magic(term)
  end
end

Great. It worked!
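To show the "pluggable" part end to end, here is a minimal runnable sketch with two toy backends and a caller that only depends on the contract. LiteralSerializer, KeyValueSerializer and Exporter are hypothetical names invented for this example, and neither backend produces real JSON or XML.

```elixir
defmodule Serializer do
  @callback serialize(any) :: String.t()
end

# Hypothetical backend: renders the term as an Elixir code literal.
defmodule LiteralSerializer do
  @behaviour Serializer

  def serialize(term), do: inspect(term)
end

# Hypothetical backend: renders a keyword list as "key=value" pairs.
defmodule KeyValueSerializer do
  @behaviour Serializer

  def serialize(pairs) do
    Enum.map_join(pairs, "&", fn {key, value} -> "#{key}=#{value}" end)
  end
end

# Hypothetical caller: it knows nothing about the backends except that
# they honour the Serializer contract.
defmodule Exporter do
  def export(data, serializer), do: serializer.serialize(data)
end

data = [name: "Max", legs: 4]
IO.puts(Exporter.export(data, LiteralSerializer))  # [name: "Max", legs: 4]
IO.puts(Exporter.export(data, KeyValueSerializer)) # name=Max&legs=4
```

Swapping the backend is just a matter of passing a different module; Exporter never changes.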

@behaviour might seem like a magic incantation, but it’s really just another ordinary module attribute that’s passed through to Erlang, which has its own callbacks/behaviour implementation.

This pattern can be used in combination with defoverridable to create default implementations with customisable behaviour. Let’s take a look at how this works in the case of GenServer.

defmodule GenServer do
  defmacro __using__(_) do
    quote do
      @behaviour GenServer
      def init(...) do ... end
      def terminate(..., ...) do ... end
      def code_change(..., ..., ...) do ... end
      defoverridable init: 1, terminate: 2, code_change: 3
    end
  end
end

defmodule MyStatefulModule do
  use GenServer

  # We can optionally override the default implementations of init/1,
  # terminate/2 etc here
end

When we create a new module and use GenServer, our module conforms to the behaviour of GenServer and also gets default implementations to boot. Kind of like inheriting from an abstract class in Ruby, but more explicit.
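The GenServer listing above is abbreviated, so here is the same pattern as a small runnable sketch on a made-up behaviour. Greeter and its callbacks are hypothetical and exist only to show how use, @behaviour and defoverridable fit together.

```elixir
defmodule Greeter do
  @callback greeting() :: String.t()
  @callback greet(String.t()) :: String.t()

  defmacro __using__(_opts) do
    quote do
      @behaviour Greeter

      # Default implementations injected into the module that calls `use`.
      def greeting, do: "Hello"
      def greet(name), do: "#{greeting()}, #{name}!"

      # Allow the using module to replace either default.
      defoverridable greeting: 0, greet: 1
    end
  end
end

defmodule DefaultGreeter do
  use Greeter
end

defmodule PirateGreeter do
  use Greeter

  # Overrides only greeting/0; greet/1 keeps its default implementation.
  def greeting, do: "Ahoy"
end

IO.puts(DefaultGreeter.greet("Max")) # Hello, Max!
IO.puts(PirateGreeter.greet("Max"))  # Ahoy, Max!
```

Because the default greet/1 calls greeting/0 locally, overriding just greeting/0 is enough to change the output.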

So there you have it! Protocols and Behaviours in Elixir.

BONUS: Proposed extension to behaviours in Elixir 1.5

The pattern shown in the GenServer example above is so common that a new syntax has been proposed to make it easier to use:

defmodule GenServer do
  defmacro __using__(_) do
    quote do
      @behaviour GenServer
      def init(...) do ... end
      def terminate(..., ...) do ... end
      def code_change(..., ..., ...) do ... end
      defoverridable GenServer ### PROPOSED SYNTAX FOR ELIXIR 1.5 ###
    end
  end
end

defmodule MyStatefulModule do
  use GenServer

  @impl GenServer ### PROPOSED SYNTAX FOR ELIXIR 1.5 ###
  # This declaration causes the Elixir compiler to check that the handle_call/3
  # function is a part of the GenServer behaviour, and warns if not. It's a
  # helpful tool to aid developers in correctly implementing behaviours.
  def handle_call(message, from, state) do
    ...
  end
end
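For a version you can actually run, here is a minimal sketch of @impl applied to the Serializer behaviour from earlier. It assumes an Elixir release that includes the proposed @impl attribute, and LiteralSerializer is a hypothetical module name.

```elixir
defmodule Serializer do
  @callback serialize(any) :: String.t()
end

defmodule LiteralSerializer do
  @behaviour Serializer

  # @impl marks the next function as a callback of Serializer. The compiler
  # warns if serialize/1 is not actually part of that behaviour.
  @impl Serializer
  def serialize(term), do: inspect(term)
end

IO.puts(LiteralSerializer.serialize(%{ok: true})) # %{ok: true}
```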

I’m currently working on implementing this new syntax for Elixir 1.5. If you’re interested, you can follow my progress here.