Nom is a parser combinator library for Rust. We can use it to write a Rust implementation of MDX, starting with headings.
Our goal is to parse the following MDX file (which in this case is no different from a Markdown file).
In main.rs we'll use a couple of nom functions and a custom error which acts pretty much like the original.
At the top of the file we'll define our data structure. This is what we're going to parse the MDX into. In this case it's an ATXHeading struct (ATX heading is the name of one type of heading in CommonMark). The struct holds a reference to a [u8] with a lifetime annotation, but that's not super important. We could have also used
We'll start with a couple of parsers for hashes and spaces. Nom uses macros quite heavily, although in 5.0 you can also write parsers as functions, as we'll see in a moment. The named! macro takes the identifier in its first argument (for example, spaces) and builds the macros in its second argument into a parser with that name, so we can use parsers such as spaces later.
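To make it concrete what these two small parsers recognize, here's a rough sketch written as plain functions with no nom dependency. It is not the named! version from the post: take_while1 is a hypothetical helper, and returning an Option is a simplification of nom's result type. The shape is the same though: take some input, hand back the leftover input plus whatever matched.

```rust
// Dependency-free sketch: plain functions standing in for the macro-built
// parsers. `take_while1` is a hypothetical helper, not a call into nom.

/// Splits `input` at the first character that fails `pred`.
/// Returns None when nothing matches, so an empty match counts as failure.
pub fn take_while1(input: &str, pred: impl Fn(char) -> bool) -> Option<(&str, &str)> {
    let end = input.find(|c: char| !pred(c)).unwrap_or(input.len());
    if end == 0 {
        None
    } else {
        // (leftover input, matched text)
        Some((&input[end..], &input[..end]))
    }
}

/// Matches one or more '#' characters.
pub fn hashes(input: &str) -> Option<(&str, &str)> {
    take_while1(input, |c| c == '#')
}

/// Matches one or more ' ' characters.
pub fn spaces(input: &str) -> Option<(&str, &str)> {
    take_while1(input, |c| c == ' ')
}
```

Calling hashes on "## Hello" gives back the leftover " Hello" together with the matched "##", and spaces can then be run on that leftover input. Chaining small parsers like this is the idea that the combinators build on.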
Then we write a few function-based parsers that operate on strings and return an IResult. IResult is an important type to get to know because it's used everywhere, and getting its type arguments right matters. While the current return type for these parsers is an IResult<&str, &str> with two type arguments (the input and output types), later we'll see that we can also pass a third to specify the error type.
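If the two-versus-three type argument thing feels abstract, here's a small stand-in for the shape of IResult, again without depending on nom. ParseResult, ErrorKind, MyError, tag_hash and tag_hash_custom are all hypothetical names made up for this sketch. The real IResult also wraps the error in nom's own Err enum, which is left out here to keep things short.

```rust
// Sketch of a nom-style result type. Everything here is a stand-in and
// none of it comes from nom itself.

#[derive(Debug, PartialEq)]
pub enum ErrorKind {
    Tag,
}

// Input type, output type, and an error type that has a default.
// With two arguments the error falls back to (input, ErrorKind).
pub type ParseResult<I, O, E = (I, ErrorKind)> = Result<(I, O), E>;

/// Two type arguments: the default error type is used.
pub fn tag_hash(input: &str) -> ParseResult<&str, &str> {
    if input.starts_with('#') {
        // (leftover input, matched text)
        Ok((&input[1..], &input[..1]))
    } else {
        Err((input, ErrorKind::Tag))
    }
}

#[derive(Debug, PartialEq)]
pub struct MyError(pub String);

/// Three type arguments: the third one swaps in our own error type.
pub fn tag_hash_custom(input: &str) -> ParseResult<&str, &str, MyError> {
    tag_hash(input).map_err(|(rest, kind)| MyError(format!("{:?} at {:?}", kind, rest)))
}
```

The first two arguments describe the success case (what's left over and what was produced), and the optional third describes the failure case.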
The meat of our setup is atx_heading, which uses the parsers we defined earlier to parse values out and return either a tuple of the leftover input and the ATXHeading struct, or an error. We use .map_err to convert the errors from those parsers into our custom error type. That lets us return our own error when a heading has more than 6 hashes, which means the line should be a paragraph instead. Our heading parser doesn't care about paragraphs; it only needs to fail, and the paragraph parser will live somewhere else in our program.
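Here's a minimal sketch of that flow using only the standard library, so it runs without nom. The field names on ATXHeading (level and value), the HeadingError variants, NoMatch and take_while1 are assumptions for illustration and may not match the real program.

```rust
// Std-only sketch of the heading parser's control flow. Field names, error
// variants and helpers are hypothetical, not the post's actual code.

#[derive(Debug, PartialEq)]
pub struct ATXHeading<'a> {
    pub level: usize,
    pub value: &'a [u8],
}

#[derive(Debug, PartialEq)]
pub enum HeadingError {
    /// The low-level parsers did not match at all.
    NotAHeading,
    /// More than 6 hashes: this line should be parsed as a paragraph.
    TooManyHashes(usize),
}

/// Error type of the low-level helper, converted with map_err below.
#[derive(Debug, PartialEq)]
pub struct NoMatch;

/// Matches one or more characters that satisfy `pred`.
pub fn take_while1(input: &str, pred: impl Fn(char) -> bool) -> Result<(&str, &str), NoMatch> {
    let end = input.find(|c: char| !pred(c)).unwrap_or(input.len());
    if end == 0 {
        Err(NoMatch)
    } else {
        Ok((&input[end..], &input[..end]))
    }
}

pub fn atx_heading(input: &str) -> Result<(&str, ATXHeading<'_>), HeadingError> {
    // Convert the helper's error into our own error type.
    let (rest, hashes) =
        take_while1(input, |c| c == '#').map_err(|_| HeadingError::NotAHeading)?;
    if hashes.len() > 6 {
        return Err(HeadingError::TooManyHashes(hashes.len()));
    }
    let (rest, _) = take_while1(rest, |c| c == ' ').map_err(|_| HeadingError::NotAHeading)?;
    // Everything up to the end of the line is the heading text.
    let end = rest.find('\n').unwrap_or(rest.len());
    let (value, rest) = rest.split_at(end);
    Ok((rest, ATXHeading { level: hashes.len(), value: value.as_bytes() }))
}
```

With this sketch, a line with seven hashes comes back as Err(HeadingError::TooManyHashes(7)), which a caller could treat as the signal to try a paragraph parser instead.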
Finally, here's a test that asserts that we can parse an MDX string into the ATXHeading struct.
Note that this is not a fully spec-compliant parser (we noted TODOs in the program comments), but it will work for headings written in the specific form shown here. Can you flesh this out to parse the rest of the ATX heading in the spec? This is part of my work on the MDX Rust implementation, so by the time you read this there may be a more sophisticated parser for headings waiting for you there.