Pattern matching on Bitstrings

Deankinyua

1 month ago

Assume that you have a long string of text that you want to display to the user. You want to display a maximum of 60 characters and add an ellipsis if and only if the text is longer than 60 characters.

One approach would be to count the number of characters in the input string and, if it is greater than 60, use something like String.slice/3 to trim it. Another approach would be:

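  @spec trim_description(String.t()) :: String.t()
  # Matches only when at least one byte follows the first 60, so a string of exactly 60 bytes gets no ellipsis.
  def trim_description(<<description::binary-size(60), _next, _rest::binary>>), do: description <> "..."
  # Anything of 60 bytes or fewer is returned unchanged.
  def trim_description(description), do: description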
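
For comparison, the first approach (count, then slice) could be sketched roughly as follows. The module name TrimByLength and the function name trim_by_length/1 are hypothetical, chosen only for this illustration. Note that String.length/1 and String.slice/3 count graphemes rather than bytes.

```elixir
defmodule TrimByLength do
  # Maximum number of characters (graphemes) to display.
  @max_length 60

  @spec trim_by_length(String.t()) :: String.t()
  def trim_by_length(description) do
    # String.length/1 has to walk the whole string to count its graphemes.
    if String.length(description) > @max_length do
      String.slice(description, 0, @max_length) <> "..."
    else
      description
    end
  end
end
```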

Understanding the above approach requires some background knowledge of how strings are represented in Elixir.

A string in Elixir is represented as a binary. Somewhere in the Elixir documentation you are going to read this:

# A string is a UTF-8 encoded binary
is_binary("hello")
# => true

Binaries, on the other hand, are bitstrings whose number of bits is divisible by 8.

“Bit” + “strings” - A sequence of bits (nothing complicated).

If what we said above is true, then a bitstring:

  • with 64 bits is a binary
  • with 8 bits is a binary
  • with 16 bits is a binary
  • with 49 bits is not a binary (not divisible by 8)
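
These cases can be checked directly in IEx. The sketch below builds two of the bitstrings from a zero-valued integer segment (an arbitrary choice for illustration) and inspects them with the built-in bit_size/1, is_binary/1 and is_bitstring/1 functions.

```elixir
# A 64-bit bitstring: 64 is divisible by 8, so it is also a binary.
sixty_four = <<0::size(64)>>
# A 49-bit bitstring: 49 is not divisible by 8, so it is not a binary.
forty_nine = <<0::size(49)>>

IO.inspect(bit_size(sixty_four))      # 64
IO.inspect(is_binary(sixty_four))     # true
IO.inspect(bit_size(forty_nine))      # 49
IO.inspect(is_binary(forty_nine))     # false
IO.inspect(is_bitstring(forty_nine))  # true
```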

Now that we have the background knowledge, we can look at how all of this is written in code.

A bitstring is denoted with the <<>> syntax.

<<head::binary-size(2), rest::binary>> = <<0, 1, 2, 3>> 

The :: operator attaches a type and a size to a segment. For an integer segment, the number after :: is the size in bits:

<<42::8>> - size is 8 bits.
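
As a small sketch of how bit sizes behave, the snippet below packs two 4-bit integers into one byte and unpacks them again with a pattern. The variable names are only illustrative.

```elixir
# An integer segment defaults to 8 bits, so both forms give the same one-byte binary.
one_byte = <<42::8>>
IO.inspect(one_byte == <<42>>)

# Two 4-bit segments pack into a single byte: 0001 followed by 0010.
packed = <<1::4, 2::4>>
IO.inspect(packed == <<0b00010010>>)

# The same sizes can be used in a pattern to split the byte back into its halves.
<<high::4, low::4>> = packed
IO.inspect({high, low})
```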

binary-size(n) specifies a size of n bytes (1 byte = 8 bits):

<<head::binary-size(2), rest::binary>> = <<0, 1, 2, 3>>
# head is <<0, 1>> (2 bytes, i.e. 16 bits)
# rest is <<2, 3>> (whatever remains after head)

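whereas a plain ::binary segment, with no size, matches all of the remaining bytes and is used when the size is not known in advance. It must be the last segment in the pattern.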

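Hence the trim_description function truncates a long description to its first 60 bytes and appends an ellipsis. Note that the limit is counted in bytes, not characters: the two agree for ASCII text, but a multi-byte UTF-8 character that straddles the 60-byte boundary would be cut in half. This version is likely cheaper than the first approach, since it does not need to walk the string to count its characters before deciding whether to add an ellipsis.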
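
Because binary-size counts bytes rather than characters, it is worth seeing how the two differ for non-ASCII text. The following is a minimal sketch with a made-up sample string; it uses only byte_size/1, String.length/1 and String.valid?/1.

```elixir
# "\u00E9" is the letter e with an acute accent, which takes 2 bytes in UTF-8.
text = "\u00E9abc"

IO.inspect(String.length(text))  # 4 characters
IO.inspect(byte_size(text))      # 5 bytes

# Taking a single byte cuts the 2-byte character in half.
<<first_byte::binary-size(1), _rest::binary>> = text
IO.inspect(String.valid?(first_byte))  # false: no longer valid UTF-8
```

If the input may contain such characters, a grapheme-aware function such as String.slice/3 is the safer choice, at the cost of traversing the string.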