Advent of code 2021: Day 16- 11 minutes read - 2184 words
For day 16 Advent of Code 2021, it was all about monadic parser combinators (whatever they are)! Just like Day 15, this one took me a lot longer to complete than I had wanted, this time it was all about learning Haskell’s approach to parsing text.
According to the story, we had to decode a message in a custom binary format, the Buoyancy Interchange Transmission System (BITS). BITS is a message format that encodes both literal values and operations.
Building the parser
The first part of the problem was to decode the hexadecimal input to a stream of bits. I felt I cheated a bit here, because I just went for simplicity:
So this would just take a list of characters (
0 - F) with the nibbles they represent. The bind operator (think
flatMap in scala) is just the ticket there.
Now that I had to parse the input, I turned to monadic parser combinators. That sounds complicated, but is actually making parsing very easy (once I had understood the concept). Essentially, it allows you to build little parsing building blocks and because these parsing blocks are monads, they can be combined with monadic operators. I’m going to try to explain by building up the blocks.
First of all, we have the basic element of the parser, we’re looking at parsing 0s and 1s:
This will parse either a
0 or a
1. To test it:
The output of the parser is an
Either - which is
Left for an error condition or
Right for a result.
This basic building block can then be used to build up more complex parsers:
To define the first type of packet - we’ve got a 3-bit field for the “version”, and another 3-bit field which is always
set to 4 (“100”). Then a series of 5-bit fields. First there are many of those fields starting with
by a payload of a 4-bit nibble. Finally, there’s 5-bit field starting with
'0' and another 4-bit nibble. All these
payload fields are combined and then turn into a numeric literal.
As a way of an example, when parsing the string
We’ve got the version (V), type (T) and two groups (A and B):
Before I continue on the other packets, I’d just like to take a detour on how I’ve modelled the packets in Haskell:
- This defines a
Packetdata structure which can either be
Litliteral, which has a version and a value (both integers)
Opoperator, which has a version, a type and a list of packets nested within it
deriving (Eq, Show)implements the ability to print or compare the data
Now we just need to parse the operator packets - but it is not as simple as that. Operator packets
can have other subpackets inside them. To determine how many packets there are, they can be specified either by the
number of subpackets, or the length (in bits) of all the subpackets. To check which is which, there is a bit that
indicates this. If the first bit is
'1', then the following 11 bits specify the number of packets:
The Haskell do notation is really useful here (this is the equivalent of the scala for-comprehension), and essentially makes monadic binds easier to read. To explain the above a bit further:
- the parser tries to read the character
1- if that is not present, then
tryensure that the character is not consumed.
- following that (by using the
>>combinator), it parses 11 binary characters and puts the output into
- then it parses
lenpackets using the
p_packetparser (see further down)
For the other type of operator packet parsing - where the first bit is
'0', then the following 15 bits specify the
length of the payload.
This does the following:
p_positionparser takes a position and returns a parse failure if the position is less than
p_op_by_lenparser first reads in the character
0- if it doesn’t match it fails - and then 15 binary characters. That makes up the
- then we get the current position and add the column value to
- then we use the
manyTillparser combinator to parse
p_positionparser gets a match.
Finally, we parse a whole
We get the version and the type id - followed by the packet payload. Now this can be either
p_op_by_num - and the
<|> operator allows to choose between them. If the first one succeeds, the second one is not
Lastly, we define how to parse a packet:
A packet can either be a literal or an operand.
For part one, all that was asked was to add up the version ids in each of the packets:
Again, the Haskell pattern matching shines through here. If we have a
Lit packet, then we just return the version
integer, if we have an
Op packet, then we return the version and add up the results of recursively calling the
For part two, we had to evaluate the operators. Depending on the operator, the result would be a different arithmetic
operation. As it turned out, with the parsing completed, the hard part was already done. To evaluate, I just needed
something slightly more complicated than the
sumVersions from part one:
So to evaluate an operation, we need to evaluate each of the types of packet.
- For a
Litliteral, the value was just whatever was parsed in.
- For an
Opoperator, the result would depend on the type id. Using pattern matching and a helper function, I am able to express this quite succinctly.
evaluateAlltakes a function that takes a list of integers as input and a single integer as output, it also takes a list of packets. It then evaluates each packet and feeds that to the function.
- So operation
1sums up all the value using the built in function
- Same with
For the last remaining operators - they are boolean operators that only work on lists of size two:
boolOp is a function that takes two parameters. First, a function that compares two integers. Then a list of
integers and it returns an integer. By using currying, we turn
boolOp (<) into a function that is
[Int] -> Int
with the first parameter filled in by
< (operators in Haskell are just functions - calling
a < b is just syntactic
sugar for the function call
(<) a b). That way I can express any boolean operation easily. For this puzzle, I just
To get the result of this puzzle, I just combined the parsing with the evaluating:
When I initially implemented the first part, I was a little unfocused when reading about the subpackets in the operator. I skimmed over the fact that the two different types were talking about two different lengths. Number of packets or number of bits in the payload.
My first implementation of the
Op parser was
Basically, I would read in the version and the type, the would read in the length - which can be defined as either a 11 bit length field or a 15 bit length field - then completely ignore the length and read in as many sub packets as I could.
Now this worked for part 1. But it was a fluke. When it was reading in subpackets, it did not stop reading until it couldn’t read anymore packets. That meant that when operator packets were inside other operator packets, some that packets that belonged to other operators would be associated with the wrong “parent”. But because the answer for part 1 was just the sum of all the version values it did not matter whether a packet was associated with the right operator.
For part 2, that mattered a lot. It was only then that I came up with the idea of using the
Another fun gotcha was that I initially forgot to add the
return to the parsers when I was using do notation. This
meant that the parsers just kept giving me compile errors.
return is a bit of a funny one when you initially encounter
it coming from imperative programming. It is NOT a return statement. It is a function that wraps the value in the
monad being used in the do notation.
The type signature of return is:
But missing that off meant that instead of dealing with a
Parser String, the function was returning a
String - and
that understandably didn’t make the compiler happy.
Rest of the solution on GitHub
I found this introduction to parsing in Haskell absolutely fascinating. Before doing this exercise I probably would have reached for regular expressions or writing functions to operate on the source string, but I found that the code is more succinct and clear than the specification of the issue. Furthermore, non of these constructs are extensions to the language or parsing specific, all is achieved by using monads. And this is what is making me appreciate the simplicity and beauty of Haskell even more.
And no, I’m not going to do a post trying to explain what monads are! Not yet anyway.Tags functional haskell advent-of-code
If you'd like to find more of my writing, why not follow me on Twitter or Mastodon?