Chris Biscardi

Advent of Code 2020 in Rust day 4: parser error locations with nom

In day 02 we wrote a simple parser using nom.

day-02-parser.rs
rust
fn password_line(input: &str) -> IResult<&str, PasswordLine> {
    let (input, lower) = digit1(input)?;
    let (input, _) = tag("-")(input)?;
    let (input, upper) = digit1(input)?;
    let (input, _) = tag(" ")(input)?;
    let (input, parsed_character) = anychar(input)?;
    let (input, _) = tag(": ")(input)?;
    let (input, password) = alpha1(input)?;
    Ok((
        input,
        PasswordLine {
            lower: lower.parse::<u8>().unwrap(),
            upper: upper.parse::<u8>().unwrap(),
            character: parsed_character,
            password,
        },
    ))
}

In day 04, we write a more complicated parser that handles a variety of attributes appearing in any order.

nom_locate and LocatedSpan

I went back through day 4's parser and replaced the &str input with nom_locate's LocatedSpan. This makes positional data (offset and line) available for tokens at any point in the parse. I want to use this information in a custom error type for invalid documents, although in Advent of Code the input is usually not malformed.

span-example.rs
rust
fn cid(input: Span) -> IResult<Span, PassportParse> {
    let (input, _) = tag("cid:")(input)?;
    let (input, _) = digit1(input)?;
    Ok((input, CID(())))
}
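The custom error type mentioned above is not shown in this post. As a rough, std-only sketch of what such a type might carry, the following defines a hypothetical `LocatedError` and a hypothetical `error_at` helper. Neither name comes from nom or nom_locate. With a real LocatedSpan, the line and offset would presumably come from its `location_line()` and `location_offset()` methods rather than from counting newlines by hand.

```rust
use std::fmt;

/// Hypothetical error type for illustration; not part of nom or nom_locate.
#[derive(Debug, PartialEq)]
struct LocatedError {
    /// 1-based line number of the failure.
    line: u32,
    /// Byte offset of the failure from the start of the input.
    offset: usize,
    /// Human-readable description of what went wrong.
    message: String,
}

impl fmt::Display for LocatedError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "line {}, offset {}: {}", self.line, self.offset, self.message)
    }
}

impl std::error::Error for LocatedError {}

/// Hypothetical helper: build an error for byte `offset` in `source`,
/// deriving the line by counting the newlines that precede it.
fn error_at(source: &str, offset: usize, message: &str) -> LocatedError {
    let newlines = source
        .as_bytes()
        .iter()
        .take(offset)
        .filter(|b| **b == b'\n')
        .count();
    LocatedError {
        line: 1 + newlines as u32,
        offset,
        message: message.to_string(),
    }
}
```

Calling `error_at("byr:1937\niyr:20x7", 15, "expected digit")` points at the `x` on the second line, so the error can tell the user where the document went wrong instead of only that it did.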

Spans acquired via nom_locate's position parser have empty fragments:

rust
{
    offset: 21,
    line: 1,
    fragment: "",
    extra: (),
}

Whereas if we dbg! the input spans, their fragments contain the remaining input:

rust
Span {
    offset: 46,
    line: 2,
    fragment: "\npid:545766238 ecl:hzl\neyr:2022",
    extra: (),
}
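To make the difference between the two kinds of span concrete, here is a std-only sketch. `MiniSpan` is a hypothetical stand-in for LocatedSpan, not nom_locate's implementation, and it assumes ASCII input so that byte counts fall on character boundaries. `advance` models what a parser hands on as the remaining input, while `position` models a zero-length marker at the current location.

```rust
/// Hypothetical, simplified stand-in for nom_locate's LocatedSpan.
#[derive(Debug, Clone, Copy, PartialEq)]
struct MiniSpan<'a> {
    /// Byte offset from the start of the original input.
    offset: usize,
    /// 1-based line number.
    line: u32,
    /// The text this span still covers.
    fragment: &'a str,
}

impl<'a> MiniSpan<'a> {
    /// Wrap a whole input: offset 0, line 1, fragment is everything.
    fn new(input: &'a str) -> Self {
        MiniSpan { offset: 0, line: 1, fragment: input }
    }

    /// Consume `n` bytes and return the span for the remaining input.
    /// Assumes `n` is within bounds and on a character boundary.
    fn advance(self, n: usize) -> Self {
        let (consumed, rest) = self.fragment.split_at(n);
        let newlines = consumed.bytes().filter(|b| *b == b'\n').count() as u32;
        MiniSpan {
            offset: self.offset + n,
            line: self.line + newlines,
            fragment: rest,
        }
    }

    /// A marker for "where we are now": same offset and line, empty fragment.
    fn position(self) -> Self {
        MiniSpan { fragment: "", ..self }
    }
}
```

In this sketch both spans report the same offset and line. Only the input span still holds the unparsed text, which mirrors the two dbg! outputs above.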

Sharing Functionality

One nice property of parser combinators is that we can build our own parsers and reuse them. This example shows a year parser that implements the parsers for byr, iyr, and eyr, which all have similar requirements: a prefix followed by a year within a given range.

composing-parsers.rs
rust
fn year<'a>(
    prefix: &str,
    lower: usize,
    upper: usize,
    input: Span<'a>,
) -> IResult<Span<'a>, usize> {
    let (input, _) = tag(prefix)(input)?;
    let (input, year) = digit1(input)?;
    match year.parse::<usize>() {
        Ok(digits) => {
            if digits >= lower && digits <= upper {
                Ok((input, digits))
            } else {
                Err(nom::Err::Error(nom::error::Error {
                    input,
                    code: nom::error::ErrorKind::Digit,
                }))
            }
        }
        _ => Err(nom::Err::Error(nom::error::Error {
            input,
            code: nom::error::ErrorKind::Digit,
        })),
    }
}
fn byr(input: Span) -> IResult<Span, PassportParse> {
    year("byr:", 1920, 2002, input).map(|(i, r)| (i, BYR(r)))
}
fn iyr(input: Span) -> IResult<Span, PassportParse> {
    year("iyr:", 2010, 2020, input).map(|(i, r)| (i, IYR(r)))
}
fn eyr(input: Span) -> IResult<Span, PassportParse> {
    year("eyr:", 2020, 2030, input).map(|(i, r)| (i, EYR(r)))
}

We can .map over each result to turn the parsed year into the PassportParse type, which alt requires because every branch must return the same output type.
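The reason for the shared output type can be sketched without nom. The following std-only version swaps `IResult` for a plain `Option` and hand-rolls an `alt`-like function. `Field`, `Parsed`, and `field` are hypothetical names for illustration, and `Field` is a trimmed-down stand-in for PassportParse.

```rust
/// Hypothetical, trimmed-down stand-in for the post's PassportParse enum.
#[derive(Debug, PartialEq)]
enum Field {
    Byr(usize),
    Iyr(usize),
}

/// Std-only parser result: the remaining input plus a value, or None on failure.
type Parsed<'a, T> = Option<(&'a str, T)>;

/// Parse `prefix` followed by digits whose value lies in `lower..=upper`.
fn year<'a>(prefix: &str, lower: usize, upper: usize, input: &'a str) -> Parsed<'a, usize> {
    let rest = input.strip_prefix(prefix)?;
    // Find where the run of leading ASCII digits ends.
    let end = rest.find(|c: char| !c.is_ascii_digit()).unwrap_or(rest.len());
    let value = rest[..end].parse::<usize>().ok()?;
    if (lower..=upper).contains(&value) {
        Some((&rest[end..], value))
    } else {
        None
    }
}

fn byr(input: &str) -> Parsed<'_, Field> {
    year("byr:", 1920, 2002, input).map(|(rest, y)| (rest, Field::Byr(y)))
}

fn iyr(input: &str) -> Parsed<'_, Field> {
    year("iyr:", 2010, 2020, input).map(|(rest, y)| (rest, Field::Iyr(y)))
}

/// Try each parser in order and return the first success, in the spirit of alt.
/// The array only type-checks because every parser returns `Field`.
fn field(input: &str) -> Parsed<'_, Field> {
    let parsers: [for<'a> fn(&'a str) -> Parsed<'a, Field>; 2] = [byr, iyr];
    parsers.iter().find_map(|parse| parse(input))
}
```

If `byr` returned a bare `usize` and `iyr` returned a `Field`, the array in `field` would not compile. Wrapping each year in an enum variant via `.map` is what lets the branches sit side by side.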

Performance

The parser runs in about 580 us.

| example | lower bound | best guess | upper bound |
| ------- | ----------- | ---------- | ----------- |
| nom     | 578.63 us   | 581.37 us  | 584.33 us   |