MacroKata

Welcome to MacroKata, a set of exercises which you can use to learn how to write macros in Rust. When completing each task, there are three goals:

  • Get your code to compile without warnings or errors.
  • Get your code to "work correctly" (i.e. produce the same output).
  • Importantly, generate the same code as the sample solution does.

You should complete the kata in order, as they increase in difficulty, and depend on previous kata.

This set of exercises is written for people who have already spent some time programming in Rust. Before completing this, work through a Rust tutorial and build some small programs yourself.

Getting Started

Clone this repository:

$ git clone https://www.github.com/tfpk/macrokata/

You will also need to install the Rust "nightly" toolchain, so that we can show expanded macros:

$ rustup toolchain install nightly

Next, install cargo-expand:

$ cargo install cargo-expand

Build the main binary provided with this repo:

$ cargo build --bin macrokata

You can find the first kata (my_first_macro) inside exercises/01_my_first_macro. Read the first chapter of the book and get started by editing the main.rs file.

To compare your expanded code to the "goal", use the test subcommand:

$ cargo run -- test 01_my_first_macro

You can run your own code as follows:

$ cargo run --bin 01_my_first_macro

How To Learn About Procedural Macros

I was originally planning to expand macrokata into discussing procedural macros as well. As I was researching that, I found dtolnay's superlative Proc Macro Workshop. Jon Gjengset's video on proc-macros is also a phenomenal resource (despite its length).

I've put my attempt to write something like that on hold because I think the above is better in every way. Do file an issue if there's something that we could do here to complement that workshop though.

Exercise 1: My First Macro

Welcome to this introduction to Rust's Macro system. To complete each exercise (including this one), you should:

  • Read this file to understand the theory being tested, and what task you will be asked to complete.
  • Try and complete the main.rs file.
  • Test to see if your macro creates the same code we have, using cargo run -- test 01_my_first_macro.
  • Run your code, using cargo run --bin 01_my_first_macro, to see what it does.

What are Macros?

Rust's macros are a way of using code to generate code before compilation. Because the generation happens before the compiler does anything, you are given much more flexibility in what you can write.

This allows you to break many of the syntax rules Rust imposes on you. For example, Rust does not allow "variadic" functions: functions with variable numbers of arguments. This makes a println function impossible -- it would have to take any number of arguments (println("hello") and println("{}", 123), for example).

Rust gets around this rule by using a println! macro. Before println! is compiled, Rust rewrites the macro into a function which takes a single array of arguments. That way, even though it looks to you like there are multiple arguments, once it's compiled there's always just one array.
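As a small illustration of that flexibility, the sketch below uses format!, a standard-library sibling of println! that takes the same kind of arguments but returns a String instead of printing. It is used here only so that the results can be checked with assertions.

```rust
fn main() {
    // One macro name, three different argument counts.
    let no_args = format!("hello");
    let one_arg = format!("{}", 123);
    let two_args = format!("{} and {}", 1, "two");

    assert_eq!(no_args, "hello");
    assert_eq!(one_arg, "123");
    assert_eq!(two_args, "1 and two");
}
```

A regular Rust function could not accept all three of these calls under one name.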

Macros can range from simple (e.g. reducing duplicated code) to complex (e.g. implementing HTML parsing inside of Rust). This guide aims to build you up from the simple to the complex.

As mentioned, you've already used macros: println! for example, is a macro. vec![] is as well. Macros always have a name. To run a macro, call its name with a bang (!) afterwards, and then brackets (any of (), [] or {}) containing arguments.

In other words, to run the macro my_macro, you'd say my_macro!() or my_macro![] or my_macro!{}.

Macro Rules vs. Procedural Macros

Rust has two macro systems, but this guide will only focus on one. macro_rules! is a special language to describe how to transform code into valid Rust code: this is the system we will focus on. Procedural macros (proc-macros) are a method of writing a Rust function which transforms an input piece of Rust code into an output piece.

Proc Macros are useful, but complex, and not the subject of this guide. You can read more about them here.

How do I create one?

The simplest form of macro looks like this:

macro_rules! my_macro {
    () => {
        3
    }
}

fn main() {
    let _value = my_macro!();
}

The macro_rules! instructs the compiler that there is a new macro you are defining. It is followed by the name of the macro, my_macro. The next line specifies a "rule". Inside the normal brackets is a "matcher" -- some text (formally, we refer to the text as "tokens") -- which Rust will use to decide which rule to execute. Inside the curly brackets is a "transcriber", which is what Rust will replace my_macro!() with.

So, my_macro!() will be replaced by 3.
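To tie this back to the three bracket styles mentioned earlier, here is a minimal sketch that invokes the same my_macro with each of them. The variable names round, square and curly are just for illustration.

```rust
macro_rules! my_macro {
    () => {
        3
    };
}

fn main() {
    // All three bracket styles invoke the same macro and match the same rule.
    let round = my_macro!();
    let square = my_macro![];
    let curly = my_macro! {};

    assert_eq!(round + square + curly, 9);
}
```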

Exercise 1: My First Macro

Your task is to write a macro named show_output!() which calls the show_output() function.

You may not edit the main function, but it should eventually look like the following:

fn main() {
    show_output()
}

Exercise 2: Numbers

As a reminder, to complete this exercise:

  • Read this file to understand the theory being tested, and what task you will be asked to complete.
  • Try and complete the main.rs file.
  • Test to see if your macro creates the same code we have, using cargo run -- test 02_numbers.
  • Run your code, using cargo run --bin 02_numbers, to see what it does.

Macros With Arguments

Macros would be pretty useless if you couldn't modify their behaviour based on input from the programmer. To this end, let's see how we can vary what our macro does.

The simplest way of doing this is to have our macro behave differently if different tokens are placed inside the matcher. As a reminder, the matcher is the bit in each rule before the =>.

Below we see a macro which will replace itself with true if the letter t is inside the brackets, and with false if the letter f is.

macro_rules! torf {
    (t) => {
        true
    };
    (f) => {
        false
    };
}
fn main() {
    let _true = torf!(t);
    let _false = torf!(f);
}

You'll note the syntax has changed slightly: we've gone from having one of the () => {} blocks (which is called a rule) to having two. Macros try to find the first rule that matches, and replace the macro invocation with the contents of that rule's transcriber block.

Macros are very similar to a match statement because they find the first match and take action based on that; but it's important to note that you're not matching on variables, you're matching on tokens.
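The difference between matching on variables and matching on tokens can be seen in a small sketch. It reuses the torf macro from above and adds a local variable named t, which is purely an illustration.

```rust
macro_rules! torf {
    (t) => {
        true
    };
    (f) => {
        false
    };
}

fn main() {
    // A variable named `t` exists and holds `false`, but the macro never
    // looks at its value: it only sees the token `t` and picks the first rule.
    let t = false;

    assert!(torf!(t));
    assert!(!t);
    assert!(!torf!(f));
}
```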

But what is a "token"?

Up until now, we've spoken about "tokens" without explaining what we mean, further than a handwavy "it's text".

When Rust code is compiled, one of the first steps of parsing is turning bytes of text into a "token tree", which is a data-structure representing the text-fragments of a line of code (so (3 + (4 + 5)) becomes a token tree containing 3, + and another token tree containing 4, + and 5).

This means that macro matchers aren't restricted to matching exact text, and that they preserve brackets when matching things.

As you've seen above, macros let you capture all the tokens inside their brackets, and then modify the code they write back out based on those tokens. This ability to react to different pieces of code without them having been fully compiled lets us create powerful extensions to the Rust language, using your own syntax.
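To see that brackets really are kept together as one unit, here is a rough sketch. It uses the tt fragment specifier, which is only introduced properly in a later exercise; for now, read $a:tt as "capture exactly one token tree". The macro name tree_count is made up for this example.

```rust
macro_rules! tree_count {
    // Matches exactly three token trees.
    ($a:tt $b:tt $c:tt) => {
        3
    };
    // Matches exactly five token trees.
    ($a:tt $b:tt $c:tt $d:tt $e:tt) => {
        5
    };
}

fn main() {
    // `3`, `+` and the bracketed group `(4 + 5)` are three token trees.
    assert_eq!(tree_count!(3 + (4 + 5)), 3);

    // Without the brackets there are five: `3`, `+`, `4`, `+`, `5`.
    assert_eq!(tree_count!(3 + 4 + 5), 5);
}
```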

Further advanced reading about what tokens are can be found here.

Exercise 2: Numbers

Your task is to create a macro called num which replaces the words one, two and three with the relevant numbers.

You may not edit the main function, but it should eventually look like the following:

fn main() {
    print_result(1 + 2 + 3);
}

Exercise 3: Literal Metavariables

In the last exercise, we saw how we could change the behaviour of a macro based on text inside the brackets. This is great, but it's basically an if statement on the text inside the brackets: it's very simplistic.

Now we will introduce the concept of a "metavariable". Metavariables capture a particular part of the text inside the macro's brackets, and let you reuse it.

The syntax for a metavariable is simple. To explain the syntax, see the example below:

macro_rules! do_thing {
    (print $metavar:literal) => {
        println!("{}", $metavar)
    };
}

The $metavar:literal is saying that you're capturing any literal (which is something like 'a', or 3, or "hello"), and naming it metavar. Then, $metavar inside the println! is saying to "fill in" that space with whatever metavar is.

For an invocation like

macro_rules! do_thing {
    (print $metavar:literal) => {
        println!("{}", $metavar)
    };
}

fn main() {
    do_thing!(print 3);
}

Rust understands that metavar means 3. So, when doing substitution, it starts by writing

println!("{}", $metavar);

and then substitutes 3 for $metavar:

fn main() {
    println!("{}", 3);
}

But what about types?

You might be wondering why we haven't said anything about the type of the literal. It turns out that the type doesn't matter during macro expansion. Rather than needing the type, Rust just needs to know what sort of syntax to expect. If you provide a variable name where a literal is needed, Rust will throw an error. If you need a string literal but provide a char literal, Rust will happily expand the code. It'll throw an error later on in the compilation process, as if you had written the expanded code yourself.
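As a quick sketch of this, the hypothetical twice macro below accepts any literal, whatever its type, because the literal specifier only describes the syntax.

```rust
macro_rules! twice {
    ($x:literal) => {
        ($x, $x)
    };
}

fn main() {
    // An integer, a char and a string literal all match `$x:literal`.
    assert_eq!(twice!(3), (3, 3));
    assert_eq!(twice!('a'), ('a', 'a'));
    assert_eq!(twice!("hello"), ("hello", "hello"));

    // `twice!(some_variable)` would be rejected during macro expansion,
    // because an identifier is not a literal.
}
```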

Why do these examples avoid using macros?

The example above uses the println! macro inside the do_thing macro. Rust is totally fine with this! However, macrokata tries to avoid (as much as possible) using macros we didn't define inside the main function. The reason for this is that, if we did use println! you would see its expansion as well. That could be confusing, since

print("some text")

is much easier to read than

    {
        ::std::io::_print(
            ::core::fmt::Arguments::new_v1(
                &["some text"],
                &[],
            ),
        );
    };

Exercise 3: Literal Metavariables

Your task is to create a macro which can perform two small bits of math:

  • The syntax math!(3 plus 5) should expand to 3 + 5, where 3 and 5 could be any literal.
  • The syntax math!(square 2) should expand to 2 * 2, where 2 could be any literal.

You may not edit the main function, but it should eventually look like the following:

fn main() {
    print_result(3 + 5);
    print_result(2 * 2);
}

Exercise 4: Expression Metavariables

We can now capture fragments of Rust code that are literals, but there are other fragments of Rust code which can be captured in metavariables. In general, every metavariable is of the form $<NAME>:<FRAGSPEC>. <NAME> is replaced with the name of the metavariable, but <FRAGSPEC> is more interesting. It stands for "Fragment Specifier", and it tells Rust what sort of fragment of Rust code you intend to match. We've already seen literal, but another common fragment specifier is expr, which allows you to capture any Rust expression (for example, (3 * 5) or function_call() + CONSTANT).

Using this specifier is nearly identical to using the literal fragment specifier: $x:expr indicates a metavariable, which is an expression, named x.

It's also worth mentioning the fragment specifier stmt, which is similar to expr, but allows Rust statements too, like let statements.
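Here is a minimal sketch of an expr metavariable in use. The names double, add_one and OFFSET are invented for this example and are not part of the exercise.

```rust
macro_rules! double {
    // `$x` is pasted in twice, so the expression is evaluated twice.
    ($x:expr) => {
        $x + $x
    };
}

fn add_one(n: i32) -> i32 {
    n + 1
}

const OFFSET: i32 = 10;

fn main() {
    // Any expression is accepted, not just a literal.
    assert_eq!(double!(3 * 5), 30);
    assert_eq!(double!(add_one(1) + OFFSET), 24);
}
```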

Macros and the Precedence of Operators

Macros do affect the order of operations. The expression 3 * math!(4, plus, 2) expands to 3 * (4 + 2). This is not clearly outlined anywhere (that I can find), and a previous version of this guide incorrectly stated the opposite.

You can check this behaviour by seeing the following:

macro_rules! math {
    () => { 3 + 4 }
}

fn main() {
    let math_result = 2 * math!();
   
    // 2 * (3 + 4) == 14
    assert_eq!(math_result, 14);
    
    // (2 * 3) + 4 == 10
    assert_ne!(math_result, 10);
}

"Follow-set Ambiguity Rules"

The Rust parser needs to have some way of knowing where a metavariable ends. If it didn't, matchers like $first:expr $second:expr would be confusing to parse in some circumstances. For example, how would you parse a * b * c * d? Would first be a, and second be * b * c * d? Or would first be a * b * c, and second be * d?

To avoid this problem entirely, Rust has a set of rules called the "follow-set ambiguity rules". These tell you which tokens are allowed to follow a metavariable (and which aren't).

For literals, this rule is simple: anything can follow a literal metavariable.

For expr (and its friend stmt) the rules are much more restrictive: they can only be followed by => or , or ;.

This means that building a matcher like

macro_rules! broken_macro {
    ($a:expr please) => { $a }
}

fn main() {
    // Fails to compile!
    let value = broken_macro!(3 + 5 please);
}

will give you this compiler error:

error: `$a:expr` is followed by `please`, which is not allowed for `expr` fragments
 --> broken_macro.rs:2:14
  |
2 |     ($a:expr please) => { $a }
  |              ^^^^^^ not allowed after `expr` fragments
  |
  = note: allowed there are: `=>`, `,` or `;`

As we encounter more expression types, we'll make sure to mention their follow-set rules, but this page in the Rust reference has a comprehensive list of the rules for each fragment specifier type.
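One way to make the broken matcher above legal is to put one of the allowed tokens between the expression and the word that follows it. The sketch below uses a comma; the name polite_macro is just for illustration.

```rust
macro_rules! polite_macro {
    // `,` is in the follow set of `expr`, so this matcher is accepted.
    ($a:expr, please) => {
        $a
    };
}

fn main() {
    let value = polite_macro!(3 + 5, please);
    assert_eq!(value, 8);
}
```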

Exercise 4: Expression Metavariables

In this task, you will be completing a similar task to the previous one. Last time, your macro should have worked with any literal, but now we would like a macro which works with any expression.

  • The syntax math!(3, plus, (5 + 6)) should expand to 3 + (5 + 6), where 3 and (5 + 6) could be any expression.
  • The syntax math!(square my_expression) should expand to my_expression * my_expression, where my_expression could be any expression.

You may not edit the main function, but it should eventually look like the following:

fn main() {
    let var = 5;
    print_result((2 * 3) + var);
    print_result(var * var);
}

Exercise 5: A More Complex Example

In this task, we'll be implementing code to make the following syntax possible:

fn main() {
    for_2d!(row <i32> in 1..5, col <i32> in 2..7, {
        // code
    });
}

Ignoring extra curly braces, this code should translate to

fn main() {
    for row in 1..5 {
        let row: i32 = row;
        for col in 2..7 {
            let col: i32 = col;
            // code
        }
    }
}

Note that the names of the variables may change (i.e. they could be row and col, or x and y, or something else).

To complete this task, there are more fragment specifiers you will need to know about:

  • ident: an "identifier", like a variable name. Can be followed by anything.
  • block: a "block expression" (curly braces, and their contents). Can be followed by anything.
  • ty: a type. Can only be followed by =>, ,, =, |, ;, :, >, >>, [, {, as, where, or a block metavariable.
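To get a feel for these three specifiers before tackling the exercise, here is a small sketch that uses all of them at once. The typed_let macro and its syntax are invented for this example; it is not the solution to the exercise.

```rust
macro_rules! typed_let {
    // `$t:ty` is followed by `=`, which is in the follow set of `ty`.
    ($name:ident : $t:ty = $body:block) => {
        let $name: $t = $body;
    };
}

fn main() {
    // Expands to: let answer: u8 = { 40 + 2 };
    typed_let!(answer: u8 = { 40 + 2 });
    assert_eq!(answer, 42);
}
```

Because the identifier answer is written at the call site, the variable it names is visible after the macro has expanded.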

As a reminder, you may not edit the main function, but it should eventually look like the following:

fn main() {
    for row in 1..5 {
        let row: i32 = row;
        for col in 2..7 {
            let col: i32 = col;
            { (Coordinate { x: col, y: row }).show() }
        }
    }
    let values = [1, 3, 5];
    for x in values {
        let x: u16 = x;
        for y in values {
            let y: u16 = y;
            {
                (Coordinate {
                    x: x.into(),
                    y: y.into(),
                })
                    .show()
            }
        }
    }
}

Exercise 6: Repetition

Hopefully, you're now feeling pretty confident with metavariables. One of the first justifications we gave for macros was their ability to simulate "variadic" functions (functions which have a variable number of arguments). In this exercise, we'll have a look at how you can implement them yourself.

A simple approach might be to write a rule for each number of arguments. For example, one might write

macro_rules! listing_literals {
    (the $e1:literal) => {
        {
            let mut my_vec = Vec::new();
            my_vec.push($e1);
            my_vec
        }
    };
    (the $e1:literal and the $e2:literal) => {
        {
            let mut my_vec = Vec::new();
            my_vec.push($e1);
            my_vec.push($e2);
            my_vec
        }
    };
    (the $e1:literal and the $e2:literal and the $e3:literal) => {
        {
            let mut my_vec = Vec::new();
            my_vec.push($e1);
            my_vec.push($e2);
            my_vec.push($e3);
            my_vec
        }
    }
}

fn main() {
    let vec: Vec<&str> = listing_literals!(the "lion" and the "witch" and the "wardrobe");
    assert_eq!(vec, vec!["lion", "witch", "wardrobe"]);
    let vec: Vec<i32> = listing_literals!(the 9 and the 5);
    assert_eq!(vec, vec![9, 5]);
}

This is very clunky, and involves a large amount of repeated code. Imagine doing this for 10 arguments! What if we could say that we want a variable number of a particular pattern? That would let us say "give me any number of $e:expr tokens, and I'll tell you what to do with them".

Macro repetitions let us do just that. They consist of three things:

  • A group of tokens that we want to match repeatedly.
  • Optionally, a separator token (which tells the parser what to look for between each match).
  • Either +, * or ?, which says how many times to expect a match. + means "at least once". * means "any number of times, including 0 times". ? means "either 0 times, or 1 time".
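As a rough sketch of how * and ? behave, the hypothetical total macro below accepts zero or more comma-separated expressions, followed by an optional trailing comma. The transcriber side uses a repetition too; that half is explained later in this chapter.

```rust
macro_rules! total {
    // `$(...),*` matches zero or more comma-separated expressions.
    // `$(,)?` matches an optional trailing comma.
    ($($x:expr),* $(,)?) => {
        0 $(+ $x)*
    };
}

fn main() {
    assert_eq!(total!(), 0); // `*` allows zero matches
    assert_eq!(total!(1, 2, 3), 6);
    assert_eq!(total!(4, 5,), 9); // `?` allows the trailing comma
}
```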

Let's look at an example of a macro repetition, to parse the exact macro we showed above.

The matcher we would use for this is $(the $my_literal:literal)and+. To break that down:

  • $( means that we're starting a repetition.
  • Inside the brackets, the $my_literal:literal is the pattern we're matching. We'll match the exact text "the", and then a literal token.
  • The ) means that we're done describing the pattern to match.
  • The and is optional, but it is the "separator": a token you can use to separate multiple repetitions. Commonly it's , to comma-separate things.
  • Here, we use +, which means the repetition must happen at least once. * would have worked just as well if we were okay with an empty Vec.

What's now left is to use the matched values. To do this, the rule would be something like:

($(the $my_literal:literal)and+) => {
    {
        let mut my_vec = Vec::new();
        $(my_vec.push($my_literal));+;
        my_vec
    }
}

The line $(my_vec.push($my_literal));+; is nearly identical to the repetition we saw above, but to break it down:

  • $( tells us that we're starting a repetition.
  • my_vec.push($my_literal) is the code that will be transcribed. $my_literal will be replaced with each of the literals specified in the matcher.
  • The ) means that we're done describing the code that will be transcribed.
  • The ; means we're separating these lines with semicolons. Note that if you want, this could also be empty (to indicate they should be joined without anything in the middle).
  • The + ends the repetition.
  • The ; adds a final semicolon after the expansion of everything.

So this will expand into the same code we saw above!

It's worth noting that we've used an extra set of curly braces in our transcriber. This is because if you don't put the code in a block, the code will look like let whatever = let mut my_vec = Vec::new();, which doesn't make sense.

If you put the code in a curly brace, then the right-hand side of the = sign will be a block which returns my_vec.
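Putting the matcher and transcriber from this section together gives a complete macro. This is a sketch assembled from the fragments above, checked against the same invocations as the clunky version.

```rust
macro_rules! listing_literals {
    ($(the $my_literal:literal)and+) => {
        {
            let mut my_vec = Vec::new();
            // One `push` per matched literal, separated by semicolons.
            $(my_vec.push($my_literal));+;
            my_vec
        }
    };
}

fn main() {
    let vec: Vec<&str> = listing_literals!(the "lion" and the "witch" and the "wardrobe");
    assert_eq!(vec, vec!["lion", "witch", "wardrobe"]);

    let vec: Vec<i32> = listing_literals!(the 9 and the 5);
    assert_eq!(vec, vec![9, 5]);
}
```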

Exercise 6: Repetition

In this task, you will be creating an if_any! macro. If any of the first arguments are true, it should execute the block which is the last argument.

You may not edit the main function, but once you have completed the exercise, your if_any! macro should expand to look like the following:

fn main() {
    if false || 0 == 1 || true {
        print_success();
    }
}

Exercise 7: More Repetition

This exercise is going to also cover writing repetitions, but now involving more than one metavariable. Don't worry: the syntax is the exact same as what you've seen before.

Before you start, let's just quickly cover the different ways you can use a metavariable within a repetition.

Multiple Metavariables in One Repetition

You can indicate that two metavariables should be used in a single repetition.

For example, ( $($i:ident is $e:expr),+ ) would match my_macro!(pi is 3.14, tau is 6.28). You would end up with $i having matched pi and tau; and $e having matched 3.14 and 6.28.

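Any repetition in the transcriber can use $i, or $e, or both within the same repetition. So a transcriber could be $(let $i = $e;)+ or let product = 1.0 $(* $e)+;. Note that the second example cannot be written as $($e)*+ with * as the separator, because a separator may not be one of the repetition operators (+, * or ?).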
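Here is a minimal sketch of two metavariables sharing one repetition. The macro name define_all and the variables width and height are made up for this example.

```rust
macro_rules! define_all {
    ($($i:ident is $e:expr),+) => {
        // `$i` and `$e` were captured together, so they are expanded together.
        $(let $i = $e;)+
    };
}

fn main() {
    // Expands to: let width = 3; let height = 4 + 1;
    define_all!(width is 3, height is 4 + 1);

    assert_eq!(width, 3);
    assert_eq!(height, 5);
}
```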

One Metavariable Each, For Two Repetitions

Alternatively, you could specify two different repetitions, each containing their own metavariable. For example, this program will construct two vecs.

macro_rules! two_vecs {
    ($($vec1:expr),+; $($vec2:expr),+) => {
        {
            let mut vec1 = Vec::new();
            $(vec1.push($vec1);)+
            let mut vec2 = Vec::new();
            $(vec2.push($vec2);)+

            (vec1, vec2)
        }
    }
}

fn main() {
    let vecs = two_vecs!(1, 2, 3; 'a', 'b');
}

Importantly, with the above example, you have to be careful about using $vec1 and $vec2 in the same repetition within the transcriber. It is a compiler error to use two metavariables captured a different number of times in the same repetition.

To quote the reference:

Each repetition in the transcriber must contain at least one metavariable to decide how many times to expand it. If multiple metavariables appear in the same repetition, they must be bound to the same number of fragments. For instance, ( $( $i:ident ),* ; $( $j:ident ),* ) => (( $( ($i,$j) ),* )) must bind the same number of $i fragments as $j fragments. This means that invoking the macro with (a, b, c; d, e, f) is legal and expands to ((a,d), (b,e), (c,f)), but (a, b, c; d, e) is illegal because it does not have the same number.
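The rule quoted above can be tried out with a small sketch. The zip_pairs macro is a made-up variation on the reference's example, using expressions rather than identifiers so that the result can be checked.

```rust
macro_rules! zip_pairs {
    ($($i:expr),* ; $($j:expr),*) => {
        // `$i` and `$j` come from different repetitions in the matcher,
        // so they must have been captured the same number of times.
        [ $( ($i, $j) ),* ]
    };
}

fn main() {
    let pairs = zip_pairs!(1, 2, 3; 'a', 'b', 'c');
    assert_eq!(pairs, [(1, 'a'), (2, 'b'), (3, 'c')]);

    // `zip_pairs!(1, 2, 3; 'a', 'b')` would fail to compile, because the
    // two repetitions have different lengths.
}
```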

Exercise 7: More Repetition

In this task, you will be creating a hashmap macro. It should consist of comma-separated pairs, of the form literal => expr. This should construct an empty HashMap and insert the relevant key-value pairs.

You may not edit the main function, but it should eventually look like the following:

fn main() {
    let value = "my_string";
    let my_hashmap = {
        let mut hm = HashMap::new();
        hm.insert("hash", "map");
        hm.insert("Key", value);
        hm
    };
    print_hashmap(&my_hashmap);
}

Exercise 8: Nested Repetition

In this exercise, you will need to use nested repetition. That's where you write a repetition inside another one. For example, ( $( $( $val:expr ),+ );+ ) would let you specify at least one value, and separate the values with either ; or ,.

The only oddity about nested repetition is that you must ensure that you use metavariables in a context where it's clear you're only referring to one of them. In other words, the $val metavariable in the last paragraph must be used within a nested repetition.
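As a sketch of what that looks like in practice, the hypothetical rows macro below uses the matcher from above and mirrors its nesting in the transcriber: the outer repetition builds one inner Vec per ;-separated group.

```rust
macro_rules! rows {
    ( $( $( $val:expr ),+ );+ ) => {
        // `$val` sits two repetitions deep in the matcher, so it is used
        // two repetitions deep in the transcriber as well.
        vec![ $( vec![ $( $val ),+ ] ),+ ]
    };
}

fn main() {
    let grid = rows!(1, 2; 3; 4, 5);
    assert_eq!(grid, vec![vec![1, 2], vec![3], vec![4, 5]]);
}
```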

Exercise 8: Nested Repetition

In this task, you will be building a macro to load a data structure with an adjacency list from a graph. As a refresher, graphs are data structures that describe how different nodes are connected.

Each node will be a literal, and you will be specifying, for each node, which nodes it connects to. For example,

graph!{
    1 -> (2, 3, 4, 5);
    2 -> (1, 3);
    3 -> (2);
    4 -> ();
    5 -> (1, 2, 3);
}

should get translated into a Vec containing the pairs (1, 2), (1, 3), ... (2, 1), ... (5, 3).

You may not edit the main function, but it should eventually look like the following:

#[allow(clippy::vec_init_then_push)]
fn main() {
    let my_graph = {
        let mut vec = Vec::new();
        vec.push((1, 2));
        vec.push((1, 3));
        vec.push((1, 4));
        vec.push((1, 5));
        vec.push((2, 1));
        vec.push((2, 3));
        vec.push((3, 2));
        vec.push((5, 1));
        vec.push((5, 2));
        vec.push((5, 3));
        vec
    };
    print_vec(&my_graph);
}

Exercise 9: Ambiguity and Ordering

Up until this point, we've mostly been dealing with macros with a single rule. We saw earlier that macros can require more than one rule, but so far we've never had ambiguity in which rule should be followed.

There are, however, multiple circumstances where rules could have ambiguity, so it's important to understand how macros deal with that ambiguity.

The following is adapted from the rust documentation on macros:

  • When a macro is invoked (i.e. someone writes my_macro!()), the compiler looks for a macro with that name, and tries each rule in turn.

  • To try a rule, it reads through each token in the invocation in turn. There are three possibilities:

    1. The token found matches the matcher. In this case, it keeps parsing the next token. If there are no tokens left, and the matcher is complete, then the rule matches.
    2. The token found does not match the matcher. In this case, Rust tries the next rule. If there are no rules left, an error is raised as the macro cannot be expanded.
    3. The rule is ambiguous. In other words, it's not clear from just this token what to do. If this happens, this is an error.
  • If it finds a rule that matches the tokens inside the brackets, it starts transcribing. Once a match is found, no more rules are examined.

Let's have a look at some examples:

macro_rules! ambiguity {
    ($($i:ident)* $j:ident) => { };
}

fn main() {
    ambiguity!(error);
}

This example fails because Rust is not able to determine what $j should be just by looking at the current token. If Rust could look forward, it would see that $j must be followed by a ), but it cannot, so it causes an error.

macro_rules! ordering {
    ($j:expr) => { "This was an expression" };
    ($j:literal) => { "This was a literal" };
}

fn main() {
    let expr1 = ordering!('a');  // => "This was an expression".
    let expr2 = ordering!(3 + 5);  // => "This was an expression".
}

This example shows how Rust macros can behave strangely due to ordering rules: even though literal is a much stricter condition than expr, because literals are exprs, the first rule will always match.
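A common way around this is to list the stricter rule first. The sketch below is the same ordering macro with its two rules swapped, so that the expr rule only acts as a fallback.

```rust
macro_rules! ordering {
    // The stricter rule comes first, so literals are caught here.
    ($j:literal) => {
        "This was a literal"
    };
    // Anything else that parses as an expression falls through to here.
    ($j:expr) => {
        "This was an expression"
    };
}

fn main() {
    assert_eq!(ordering!('a'), "This was a literal");
    assert_eq!(ordering!(3 + 5), "This was an expression");
}
```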

Exercise 9: Ambiguity and Ordering

This task is a little bit different to previous tasks: we have given you a partially functional macro already, along with some invocations of that macro.

You should adjust the macro's rules and syntax to make sure that you achieve the correct behaviour without any ambiguity.

  • sum!() should sum together two or more expressions.
  • get_number_type!() should determine what sort of Rust syntax is being used: a positive literal, a negative literal, a block, or an expression.

You may not edit the main function, but it should eventually look like the following:

fn main() {
    NumberType::PositiveNumber(5).show();
    NumberType::NegativeNumber(-5).show();
    #[allow(clippy::let_and_return)]
    NumberType::UnknownBecauseBlock({
            let x = 6;
            x
        })
        .show();
    NumberType::UnknownBecauseExpr(1 + 2 + 3 + 4).show();
    NumberType::UnknownBecauseExpr(3 + 5 - 1).show();
}

Exercise 10: Macros Calling Macros

We briefly mentioned in a previous exercise that macros are able to call other macros. In this exercise we will look at a brief example of that. Before we do, there are three small notes we should mention.

Useful built-in macros

There are two useful macros which the standard library provides - stringify!() and concat!(). Both of them produce static string slices, made up of tokens.

The stringify! macro takes tokens and turns them into a &str that textually represents what those tokens are. For example, stringify!(1 + 1) will become "1 + 1".

The concat! macro takes a comma-separated list of literals, and creates a &str which concatenates them. For example, concat!("test", true, 99) becomes "testtrue99".
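Both examples can be checked directly. The sketch below also nests stringify! inside concat!; the identifiers left and right are arbitrary tokens chosen for illustration.

```rust
fn main() {
    // `stringify!` turns its tokens into a string slice.
    assert_eq!(stringify!(1 + 1), "1 + 1");

    // `concat!` joins literals of different kinds into one string slice.
    assert_eq!(concat!("test", true, 99), "testtrue99");

    // The two can be combined.
    assert_eq!(
        concat!(stringify!(left), "_", stringify!(right)),
        "left_right"
    );
}
```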

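It's useful to know that these two macros treat a macro inside their parameters differently. concat! expands macros in its arguments first, so concat!(stringify!(a), "b") becomes "ab". stringify! does not expand its input: if test!() expanded to 1 + 1, then stringify!(test!()) would still produce a string containing the invocation itself, not "1 + 1".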

The tt fragment specifier

An important fragment specifier which we have not yet discussed is tt. This captures a "Token Tree", which is any single token, or a group of tokens inside brackets. This is the most flexible fragment specifier, because it imposes no meaning on what the captured tokens might be. For example:

macro_rules! stringify_number {
    (one) => {"1"};
    (two) => {"2"};
    ($tokens:tt) => { stringify!($tokens)};
}

fn main() {
    stringify_number!(one); // is "1"
    stringify_number!(while); // is "while"
    stringify_number!(bing_bang_boom); // is "bing_bang_boom"
}

It's really important to keep in mind with tt metavariables that you must ensure that anything after them can be unambiguously parsed.

In other words, the repetition $($thing:tt)* (ending with *, + or ?) must be the last fragment in the matcher. Since anything can be a token tree, Rust could not know what to accept after that repetition.

To avoid this issue, you can either match a single tt, and make the user wrap multiple tokens inside brackets, or you can specify a delimiter for your match (i.e. $($thing:tt),+, since two token trees not separated by a , could not match).
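The second option can be sketched as follows. The count_trees macro is invented for this example: it takes comma-separated token trees and reports how many it was given.

```rust
macro_rules! count_trees {
    // Each `$thing` is a single token tree; commas separate them.
    ($($thing:tt),+) => {
        [ $( stringify!($thing) ),+ ].len()
    };
}

fn main() {
    // A bracketed group counts as one token tree.
    assert_eq!(count_trees!(a, (b c d), 3), 3);

    // Keywords, string literals and empty brackets are token trees too.
    assert_eq!(count_trees!(while, [1 2], "x", {}), 4);
}
```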

Restrictions on "Forwarding Macros"

There is one important restriction when calling a macro using another macro.

When forwarding a matched fragment to another macro-by-example, matchers in the second macro will be passed an AST of the fragment type, which cannot be matched on except as a fragment of that type. The second macro can't use literal tokens to match the fragments in the matcher, only a fragment specifier of the same type. The ident, lifetime, and tt fragment types are an exception, and can be matched by literal tokens. The following illustrates this restriction:

macro_rules! foo {
    ($l:expr) => { bar!($l); }
// ERROR:               ^^ no rules expected this token in macro call
}

macro_rules! bar {
    (3) => {}
}

fn main() {
    foo!(3);
}

The following illustrates how tokens can be directly matched after matching a tt fragment:

// compiles OK
macro_rules! foo {
    ($l:tt) => { bar!($l); }
}

macro_rules! bar {
    (3) => {}
}

fn main() {
    foo!(3);
}

Exercise 10: Macros Calling Macros

In this exercise, you have already been provided with a macro called digit, which maps the identifiers zero through nine to a &str with their numeric value.

Your task is to write a macro called number!() which takes at least one of the identifiers zero through nine, and converts them to a string containing numbers.

For example, number!(one two three) should expand to "123".

Note: previously exercise 10 was about making a hashmap. The exercise has changed, but the old code is still available in the archive/ directory. It will be removed on the next update of this book.

Exercise 11: Macro Recursion

This exercise is a sort of culmination of everything you've learned so far about macros.

To complete it, you'll need to note one important fact: macros can recurse into themselves.

This allows very powerful expansions. As a simple example:


enum LinkedList {
    Node(i32, Box<LinkedList>),
    Empty
}

macro_rules! linked_list {
    () => {
        LinkedList::Empty
    };
    ($expr:expr $(, $exprs:expr)*) => {
        LinkedList::Node($expr, Box::new(linked_list!($($exprs),*)))
    }
}

fn main() {
    let my_list = linked_list!(3, 4, 5);
}

The above example is very typical. The first rule is the "base case": an empty list of tokens implies an empty linked list.

The second rule always matches one expression first (expr). This allows us to refer to it on its own, in this case to create the Node. The rest of the expressions (exprs) are stored in a repetition, and all we'll do with them is recurse into linked_list!(). If there's no expressions left, that call to linked_list!() will give back Empty, otherwise it'll repeat the same process.
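The same "base case plus recursive case" shape works for other jobs. As a further sketch, the hypothetical count macro below counts the token trees it is given by peeling one off the front each time it recurses.

```rust
macro_rules! count {
    // Base case: nothing left to count.
    () => {
        0usize
    };
    // Recursive case: count the head, then recurse on the tail.
    ($head:tt $($tail:tt)*) => {
        1usize + count!($($tail)*)
    };
}

fn main() {
    assert_eq!(count!(), 0);
    assert_eq!(count!(a b c), 3);

    // A bracketed group is a single token tree, so this is 2, not 4.
    assert_eq!(count!((x y z) w), 2);
}
```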

While macro recursion is incredibly powerful, it is also slow. As a result, there is a limit to the amount of recursion you are allowed to do. In rustc, the limit is 128, but you can configure it with #![recursion_limit = "256"] as a crate-level attribute.
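
As a second, smaller illustration of the same base-case/recursive-case shape, here is a sketch of a macro that counts the token trees it is given. The name count! is an assumption made for this example and is not part of the exercises:

```rust
// Hypothetical macro: counts how many token trees it was given.
macro_rules! count {
    // Base case: no tokens left.
    () => { 0usize };
    // Recursive case: peel off one token tree and count the rest.
    ($head:tt $($tail:tt)*) => { 1usize + count!($($tail)*) };
}

fn main() {
    let n = count!(a b c d);
    println!("{n}"); // prints 4
}
```

Each token adds one level of recursion, so an input with more tokens than the recursion limit allows would fail to expand unless the limit is raised.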

Exercise 11: Currying

Before you complete the exercise, let's briefly discuss a concept called "currying". If you're already familiar with the concept, perhaps from your own experience of functional programming, you can skip the next two paragraphs.

In most imperative languages, the syntax to call a function with multiple arguments is function(arg1, arg2, arg3). If you do not provide all the arguments, that is an error. In many functional languages, however, the syntax for function calls is more akin to function(arg1)(arg2)(arg3). The advantage of this notation is that if you specify fewer than the required number of arguments, it's not an error: you get back a function that takes the rest of the arguments. A function that behaves this way is said to be "curried" (named after Haskell Curry, a famous mathematician).

A good example of this is a curried add function. In regular Rust, we'd say add is move |a, b| a + b. If we curried that function, we'd instead have move |a| move |b| a + b. What this means is that we can write let add_1 = add(1);, and we now have a function which will add 1 to anything.
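
The difference can be sketched in plain Rust without any macros. The function names add and curried_add are chosen for this illustration only:

```rust
// Uncurried: both arguments must be supplied at once.
fn add(a: i32, b: i32) -> i32 {
    a + b
}

// Curried: takes the first argument and returns a closure awaiting the second.
fn curried_add(a: i32) -> impl Fn(i32) -> i32 {
    move |b| a + b
}

fn main() {
    // The closure form from the text, with type annotations to help inference.
    let add_closure = move |a: i32| move |b: i32| a + b;
    let add_1 = add_closure(1);

    println!("{}", add_1(41)); // prints 42
    println!("{}", curried_add(2)(3)); // prints 5
    println!("{}", add(2, 3)); // prints 5
}
```

Note that add_1 is an ordinary value: it can be stored, passed around, and called as many times as needed.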

In this exercise, you will build a macro which helps you understand currying, and build a curried function in Rust. The syntax for this macro will be curry!((a: i32) => (b: i32) => _, {a + b}). Each pair of ident: ty is an argument, and the last _ indicates that the compiler will infer the return type. The block provided last is, of course, the computation we want to do after receiving all the arguments.

At each step of the currying process, you should call the macro print_curried_argument. This takes in a value (which, for the purposes of the exercise, you can assume will always be Copy) and prints out the value that was provided as that argument.

Exercise 12: Hygiene

To quote the reference:

By default, all identifiers referred to in a macro are expanded as-is, and are looked up at the macro's invocation site. This can lead to issues if a macro refers to an item (i.e. function/struct/enum/etc.) or macro which isn't in scope at the invocation site. To alleviate this, the $crate metavariable can be used at the start of a path to force lookup to occur inside the crate defining the macro.

Here is an example to illustrate (again, taken from the reference linked above):

// Definitions in the `helper_macro` crate.
#[macro_export]
macro_rules! helped {
    /*
    () => { helper!() }
         // ^^^^^^ This might lead to an error due to 'helper' not being in scope.
    */
    () => { $crate::helper!() }
}

#[macro_export]
macro_rules! helper {
    () => { () }
}

// Usage in another crate.
// Note that `helper_macro::helper` is not imported!

use helper_macro::helped;

fn unit() {
    helped!();
}

In other words, every function, struct, or macro that a macro uses must be in scope at its call site, not at the place where it is defined. This is fine if the macro is used within a single file, but it complicates things once the macro is exported.

The $crate metavariable lets you refer to things that are in the crate the macro was defined in (as opposed to the crate the macro was called in). If the macro was defined in crate foo and used in crate bar, then a reference to a struct Widget such as Widget::new() creates a bar::Widget (and if one doesn't exist, you'll get an error). If you instead write $crate::Widget::new(), then you're always talking about foo::Widget, no matter what crate you're in.
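
The same effect can be observed inside a single crate, where $crate resolves to the crate root. The sketch below assumes a module named other standing in for the calling crate, and the Widget types and their origin field are hypothetical:

```rust
// Hypothetical type at the crate root; `$crate::Widget` always names this one.
pub struct Widget {
    pub origin: &'static str,
}

impl Widget {
    pub fn new() -> Self {
        Widget { origin: "crate root" }
    }
}

// Plain path: `Widget` is looked up wherever the macro is invoked.
macro_rules! local_widget {
    () => { Widget::new() };
}

// `$crate` path: always resolves to the `Widget` at the root of the defining crate.
macro_rules! root_widget {
    () => { $crate::Widget::new() };
}

mod other {
    // An unrelated `Widget` that happens to be in scope at the call sites below.
    pub struct Widget {
        pub origin: &'static str,
    }

    impl Widget {
        pub fn new() -> Self {
            Widget { origin: "mod other" }
        }
    }

    pub fn local_origin() -> &'static str {
        local_widget!().origin
    }

    pub fn root_origin() -> &'static str {
        root_widget!().origin
    }
}

fn main() {
    println!("{}", other::local_origin()); // prints "mod other"
    println!("{}", other::root_origin()); // prints "crate root"
}
```

Both macros are invoked from the same module, yet only the one using $crate is unaffected by which Widget happens to be in scope there.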

A footnote on how expanding macros into text is misleading

(Based on the disclaimer for the brilliant cargo-expand)

Be aware that macro expansion to text is a lossy process. That means that the expanded code we show in these kata should be used as a debugging aid only. There should be no expectation that the expanded code can be compiled successfully, nor that if it compiles then it behaves the same as the original code. In these kata, we try to avoid these issues as far as possible.

For instance, answer = 3 when compiled ordinarily by Rust, but the expanded code, when compiled, would set answer = 4.

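fn main() {
    // First binding of `x`: the only one visible where the macro is defined.
    let x = 1;

    macro_rules! first_x {
        () => { x }
    }

    // Shadows `x`, but hygiene keeps `first_x!()` pointing at the first binding.
    let x = 2;

    // 2 + 1 = 3 as compiled; the expanded text reads `x + x`, which would give 4.
    let answer = x + first_x!();
    assert_eq!(answer, 3);
}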

Refer to The Little Book Of Rust Macros for more on the considerations around macro hygiene.

Exercise 12

Exercise 12 consists of a file containing multiple modules. Fix the code so that the macro works correctly in all invocations.

Note that you will need to use the $crate metavariable.

The subjects of scoping, importing and exporting are well covered by the Rust Reference.

While practical examples of these may be useful, this first version of macrokata does not include exercises for them. If you plan on using macros in a larger project, we suggest reading the above reference.

Extra Reading

There are two excellent resources for further reading on Rust's macro system: