Chris Biscardi

Advent of Code 2020 in Rust day 4: parser error locations with nom

In day 02 we wrote a simpler parser using nom:

day-02-parser.rs
rust
use nom::{
    bytes::complete::tag,
    character::complete::{alpha1, anychar, digit1},
    IResult,
};

// Parses a line of the form `<lower>-<upper> <char>: <password>`.
fn password_line(input: &str) -> IResult<&str, PasswordLine> {
    let (input, lower) = digit1(input)?;
    let (input, _) = tag("-")(input)?;
    let (input, upper) = digit1(input)?;
    let (input, _) = tag(" ")(input)?;
    let (input, parsed_character) = anychar(input)?;
    let (input, _) = tag(": ")(input)?;
    let (input, password) = alpha1(input)?;
    Ok((
        input,
        PasswordLine {
            lower: lower.parse::<u8>().unwrap(),
            upper: upper.parse::<u8>().unwrap(),
            character: parsed_character,
            password,
        },
    ))
}

In day 04, we write a more complicated parser that can handle a variety of attributes appearing in any order.

nom_locate and LocatedSpan

I went back through day 4's parser and introduced nom_locate's LocatedSpan to replace the &str input. This makes positional data (offset and line) available for tokens at any point in the parse. I want to use this information in a custom error type for invalid documents (although in Advent of Code the input is usually not malformed).
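As a rough sketch of what such an error type might carry, the following uses only the standard library rather than nom_locate's own types. `LocatedError` and `locate` are hypothetical names, not part of nom or nom_locate, and this is not the error type from the actual solution. `locate` derives a 1-based line number from a byte offset by counting the newlines before it.

```rust
use std::fmt;

/// Hypothetical error type (not part of nom or nom_locate): records where
/// in the input a field failed to parse.
#[derive(Debug, PartialEq)]
struct LocatedError {
    line: u32,     // 1-based line number
    offset: usize, // byte offset from the start of the input
    message: String,
}

impl fmt::Display for LocatedError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "line {}, offset {}: {}", self.line, self.offset, self.message)
    }
}

impl std::error::Error for LocatedError {}

/// Build an error for byte `offset` in `source`, counting the newlines
/// that come before it to get the line number.
fn locate(source: &str, offset: usize, message: &str) -> LocatedError {
    let newlines = source.bytes().take(offset).filter(|&b| b == b'\n').count();
    LocatedError {
        line: newlines as u32 + 1,
        offset,
        message: message.to_string(),
    }
}
```

With a located span, the line and offset would come straight from the span instead of being recomputed from the source text as `locate` does here.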

span-example.rs
rust
fn cid(input: Span) -> IResult<Span, PassportParse> {
    let (input, _) = tag("cid:")(input)?;
    let (input, _) = digit1(input)?;
    Ok((input, CID(())))
}

Spans acquired via position have empty fragments:

rust
{
    offset: 21,
    line: 1,
    fragment: "",
    extra: (),
}

Whereas if we dbg! the input spans, they contain the remaining input as their fragment:

rust
Span {
    offset: 46,
    line: 2,
    fragment: "\npid:545766238 ecl:hzl\neyr:2022",
    extra: (),
}
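The difference between the two kinds of span can be mimicked with a small hand-rolled stand-in. This is a simplified sketch using only the standard library, not nom_locate's actual implementation: the `Span` struct and its `take` and `position` methods are assumptions made for illustration, and the `extra` field is left out.

```rust
/// Simplified stand-in for a located span (an assumption for illustration,
/// not nom_locate's actual implementation).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Span<'a> {
    offset: usize,     // bytes consumed before this span
    line: u32,         // 1-based line on which the span starts
    fragment: &'a str, // the text the span covers
}

impl<'a> Span<'a> {
    fn new(input: &'a str) -> Self {
        Span { offset: 0, line: 1, fragment: input }
    }

    /// Split off the first `n` bytes and return `(rest, taken)`, mirroring
    /// nom's "remaining input first" convention. Panics if `n` is out of
    /// range or not on a character boundary.
    fn take(self, n: usize) -> (Span<'a>, Span<'a>) {
        let (head, tail) = self.fragment.split_at(n);
        let newlines = head.bytes().filter(|&b| b == b'\n').count() as u32;
        let taken = Span { fragment: head, ..self };
        let rest = Span {
            offset: self.offset + n,
            line: self.line + newlines,
            fragment: tail,
        };
        (rest, taken)
    }

    /// A position marker: same offset and line, but an empty fragment.
    fn position(self) -> Span<'a> {
        Span { fragment: "", ..self }
    }
}
```

In this sketch, taking the first 7 bytes of `Span::new("cid:88\npid:545766238")` leaves a remainder at offset 7 on line 2 whose fragment is still `"pid:545766238"`, while calling `position()` on that remainder keeps the offset and line but reports an empty fragment.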

Sharing Functionality

One nice aspect of parser combinators is that we can build our own parsers and reuse them. This example shows a year parser that is used to implement the parsers for byr, iyr, and eyr, which differ only in their prefix and valid range.

composing-parsers.rs
rust
// Parses `<prefix><digits>` and accepts the year only if it lies in lower..=upper.
fn year<'a>(prefix: &str, lower: usize, upper: usize, input: Span<'a>) -> IResult<Span<'a>, usize> {
    let (input, _) = tag(prefix)(input)?;
    let (input, year) = digit1(input)?;
    match year.parse::<usize>() {
        Ok(digits) if (lower..=upper).contains(&digits) => Ok((input, digits)),
        // An unparsable number and an out-of-range year produce the same error.
        _ => Err(nom::Err::Error(nom::error::Error {
            input,
            code: nom::error::ErrorKind::Digit,
        })),
    }
}

fn byr(input: Span) -> IResult<Span, PassportParse> {
    year("byr:", 1920, 2002, input).map(|(i, r)| (i, BYR(r)))
}

fn iyr(input: Span) -> IResult<Span, PassportParse> {
    year("iyr:", 2010, 2020, input).map(|(i, r)| (i, IYR(r)))
}

fn eyr(input: Span) -> IResult<Span, PassportParse> {
    year("eyr:", 2020, 2030, input).map(|(i, r)| (i, EYR(r)))
}

We can .map over each result to wrap it in the PassportParse type, which alt needs because every alternative must produce the same output type.
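To see why the shared output type matters, here is a minimal sketch that swaps nom for hand-rolled parsers over plain &str, using only the standard library. `ParseResult`, `Parser`, and `first_of` are hypothetical names introduced for illustration. `first_of` plays the role of alt here but is not nom's implementation, and the enum is cut down to two variants.

```rust
/// Every field parser must produce this one type so the parsers can be
/// tried as interchangeable alternatives.
#[derive(Debug, PartialEq)]
enum PassportParse {
    BYR(usize),
    IYR(usize),
}

/// Hand-rolled result: (remaining input, parsed value) or an error message.
type ParseResult<'a, T> = Result<(&'a str, T), String>;

/// All alternatives share this exact function type.
type Parser = for<'x> fn(&'x str) -> ParseResult<'x, PassportParse>;

/// Simplified year parser on plain &str (a stand-in for the nom version).
fn year<'a>(prefix: &str, lower: usize, upper: usize, input: &'a str) -> ParseResult<'a, usize> {
    let rest = input
        .strip_prefix(prefix)
        .ok_or_else(|| format!("expected {}", prefix))?;
    // Take the leading run of ASCII digits.
    let end = rest.find(|c: char| !c.is_ascii_digit()).unwrap_or(rest.len());
    let digits = rest[..end]
        .parse::<usize>()
        .map_err(|_| "expected digits".to_string())?;
    if (lower..=upper).contains(&digits) {
        Ok((&rest[end..], digits))
    } else {
        Err(format!("{} is outside {}..={}", digits, lower, upper))
    }
}

// `.map` wraps the bare usize in the shared enum, as in the nom version.
fn byr(input: &str) -> ParseResult<'_, PassportParse> {
    year("byr:", 1920, 2002, input).map(|(i, r)| (i, PassportParse::BYR(r)))
}

fn iyr(input: &str) -> ParseResult<'_, PassportParse> {
    year("iyr:", 2010, 2020, input).map(|(i, r)| (i, PassportParse::IYR(r)))
}

/// Minimal alt-like helper: return the result of the first parser that succeeds.
fn first_of<'a>(parsers: &[Parser], input: &'a str) -> ParseResult<'a, PassportParse> {
    for parser in parsers {
        if let Ok(ok) = parser(input) {
            return Ok(ok);
        }
    }
    Err("no parser matched".to_string())
}
```

Without the `.map` calls, `byr` and `iyr` would both return a bare usize and the caller could not tell which field had matched. Wrapping each result in its own variant keeps that information while still letting the parsers sit side by side in one slice.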

Performance

The runtime of the parser is about 580 us.

| example | lower bound | best guess | upper bound |
| ------- | ----------- | ---------- | ----------- |
| nom     | 578.63 us   | 581.37 us  | 584.33 us   |