MacroKata
Welcome to MacroKata, a set of exercises which you can use to learn how to write macros in Rust. When completing each task, there are three goals:
- Get your code to compile without warnings or errors.
- Get your code to "work correctly" (i.e. produce the same output)
- Importantly, generate the same code as what the sample solution does.
You should complete the kata in order, as they increase in difficulty, and depend on previous kata.
This set of exercises is written for people who have already spent some time programming in Rust. Before completing this, work through a Rust tutorial and build some small programs yourself.
Getting Started
Clone this repository:
$ git clone https://www.github.com/tfpk/macrokata/
You will also need to install the Rust "nightly" toolchain, so that we can show expanded macros:
$ rustup toolchain install nightly
Next, install cargo-expand:
$ cargo install cargo-expand
Build the main binary provided with this repo:
$ cargo build --bin macrokata
You can find the first kata (my_first_macro) inside exercises/01_my_first_macro.
Read the first chapter of the book and get started by editing the main.rs file.
To compare your expanded code to the "goal", use the test subcommand:
$ cargo run -- test 01_my_first_macro
You can run your own code as follows:
$ cargo run --bin 01_my_first_macro
How To Learn About Procedural Macros
I was originally planning to expand macrokata to cover procedural macros as well. While researching that, I found dtolnay's superlative Proc Macro Workshop. Jon Gjengset's video on proc-macros is also a phenomenal resource (despite its length).
I've put my attempt to write something like that on hold because I think the above is better in every way. Do file an issue if there's something that we could do here to complement that workshop though.
Exercise 1: My First Macro
Welcome to this introduction to Rust's Macro system. To complete each exercise (including this one), you should:
- Read this file to understand the theory being tested, and what task you will be asked to complete.
- Try to complete the main.rs file.
- Test to see if your macro creates the same code we have, using cargo run -- test 01_my_first_macro.
- Run your code, using cargo run --bin 01_my_first_macro, to see what it does.
What are Macros?
Rust's macros are a way of using code to generate code before compilation. Because the generation happens before the compiler does anything, you are given much more flexibility in what you can write.
This allows you to break many of the syntax rules Rust imposes on you. For example, Rust does not allow "variadic" functions: functions with variable numbers of arguments. This makes a println function impossible -- it would have to take any number of arguments (println("hello") and println("{}", 123), for example).
Rust gets around this rule by using a println! macro. Before println! is compiled, Rust rewrites the macro into a function which takes a single array of arguments. That way, even though it looks to you like there are multiple arguments, once it's compiled there's always just one array.
Macros can range from simple (e.g. reducing duplicated code) to complex (e.g. implementing HTML parsing inside of Rust). This guide aims to build you up from the simple to the complex.
As mentioned, you've already used macros: println!, for example, is a macro. vec![] is as well. Macros always have a name. To run a macro, call its name with a bang (!) afterwards, and then brackets (any of (), [] or {}) containing arguments.
In other words, to run the macro my_macro, you'd say my_macro!() or my_macro![] or my_macro!{}.
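As a quick check of the three bracket styles, here is a minimal sketch using the standard vec! macro. The helper function bracket_styles is made up for this illustration and is not part of the exercises.

```rust
// `bracket_styles` is a hypothetical helper used only for this illustration.
fn bracket_styles() -> (Vec<i32>, Vec<i32>, Vec<i32>) {
    // The same macro, invoked with each of the three bracket styles.
    let square = vec![1, 2, 3];
    let round = vec!(1, 2, 3);
    let curly = vec! {1, 2, 3};
    (square, round, curly)
}

fn main() {
    let (square, round, curly) = bracket_styles();
    // All three invocations build the same vector.
    assert_eq!(square, round);
    assert_eq!(round, curly);
}
```

By convention, vec! is written with square brackets and println! with round ones, but the compiler accepts any of the three.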
Macro Rules vs. Procedural Macros
Rust has two macro systems, but this guide will only focus on one.
- macro_rules! is a special language for describing how to transform code into valid Rust code: this is the system we will focus on.
- Procedural macros (proc-macros) are a method of writing a Rust function which transforms an input piece of Rust code into an output piece.
Proc Macros are useful, but complex, and not the subject of this guide. You can read more about them here.
How do I create one?
The simplest form of macro looks like this:
macro_rules! my_macro {
    () => { 3 }
}

fn main() {
    let _value = my_macro!();
}
macro_rules! tells the compiler that you are defining a new macro. It is followed by the name of the macro, my_macro. The next line specifies a "rule". Inside the normal brackets is a "matcher" -- some text (formally, we refer to the text as "tokens") -- which Rust will use to decide which rule to execute. Inside the curly brackets is a "transcriber", which is what Rust will replace my_macro!() with.
So, my_macro!() will be replaced by 3.
Exercise 1: My First Macro
Your task is to write a macro named show_output!()
which calls the
show_output()
function.
You may not edit the main
function, but it should eventually look like the
following:
fn main() {
show_output()
}
Exercise 2: Numbers
As a reminder, to complete this exercise:
- Read this file to understand the theory being tested, and what task you will be asked to complete.
- Try to complete the main.rs file.
- Test to see if your macro creates the same code we have, using cargo run -- test 02_numbers.
- Run your code, using cargo run --bin 02_numbers, to see what it does.
Macros With Arguments
Macros would be pretty useless if you couldn't modify their behaviour based on input from the programmer. To this end, let's see how we can vary what our macro does.
The simplest way of doing this is to have our macro behave differently depending on which tokens appear inside the brackets, by giving each case its own matcher. As a reminder, the matcher is the bit in each rule before the =>.
Below we see a macro which will replace itself with true if the letter t is inside the brackets, and with false if the letter f is.
macro_rules! torf {
    (t) => { true };
    (f) => { false };
}

fn main() {
    let _true = torf!(t);
    let _false = torf!(f);
}
You'll note the syntax has changed slightly: we've gone from having one of the () => {} blocks (which is called a rule) to having two. Macros try to find the first rule that matches, and replace the macro invocation with the contents of that rule's transcriber block.
Macros are very similar to a match statement because they find the first match and take action based on that, but it's important to note that you're not matching on variables, you're matching on tokens.
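To make the "tokens, not variables" point concrete, here is a small sketch that reuses the torf macro from above. The local variable t is an addition for this illustration only.

```rust
macro_rules! torf {
    (t) => { true };
    (f) => { false };
}

fn main() {
    // A local variable that happens to be named `t` and holds `false`...
    #[allow(unused_variables)]
    let t = false;
    // ...has no effect on the macro: the matcher sees the token `t`,
    // never the value stored in the variable.
    assert!(torf!(t));
    assert!(!torf!(f));
}
```

Any other token, such as torf!(x), matches neither rule and is rejected at compile time.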
But what is a "token"?
Up until now, we've spoken about "tokens" without explaining what we mean, further than a handwavy "it's text".
When Rust code is compiled, one of the first steps of parsing is turning bytes of text into a "token tree", which is a data-structure representing the text-fragments of a line of code (so (3 + (4 + 5)) becomes a token tree containing 3, + and another token tree containing 4, + and 5).
This means that macro matchers aren't restricted to matching exact text, and that they preserve brackets when matching things.
As you've seen above, macros let you capture all the tokens inside their brackets, and then modify the code they write back out based on those tokens. This ability to react to different pieces of code without them having been fully compiled lets us create powerful extensions to the Rust language, using your own syntax.
Further advanced reading about what tokens are can be found here.
Exercise 2: Numbers
Your task is to create a macro called num
which replaces the words one
, two
and three
with the relevant numbers.
You may not edit the main
function, but it should eventually look like the
following:
fn main() {
print_result(1 + 2 + 3);
}
Exercise 3: Literal Metavariables
In the last exercise, we saw how we could change the behaviour of a macro based on text inside the brackets. This is great, but it's basically an if statement on the text inside the brackets: it's very simplistic.
Now we will introduce the concept of a "metavariable". Metavariables capture a particular part of the text inside the macro's brackets, and let you reuse it.
The syntax for a metavariable is simple. To explain the syntax, see the example below:
macro_rules! do_thing {
(print $metavar:literal) => {
println!("{}", $metavar)
};
}
The $metavar:literal is saying that you're capturing any literal (which is something like 'a', or 3, or "hello"), and naming it metavar. Then, $metavar inside the println! is saying to "fill in" that space with whatever metavar is.
For an invocation like
macro_rules! do_thing {
    (print $metavar:literal) => {
        println!("{}", $metavar)
    };
}

fn main() {
    do_thing!(print 3);
}
Rust understands that metavar means 3. So, when doing substitution, it starts by writing println!("{}", $metavar); and then substitutes 3 for $metavar:
fn main() {
    println!("{}", 3);
}
But what about types?
You might be wondering why we haven't said anything about the type of the literal. It turns out that the type doesn't matter during macro expansion. Rather than needing the type, Rust just needs to know what sort of syntax to expect. If you tried to provide a variable name, and you needed a literal, Rust will throw an error. If you needed a string literal, and you provided a char literal, then Rust will happily expand the code. It'll throw an error later on in the compilation process, as if you had written the expanded code.
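A short sketch of this behaviour follows. The macro name debug_text is hypothetical; it accepts any literal, whatever its type, because the matcher only checks the syntax.

```rust
// `debug_text` is a made-up macro for this illustration.
macro_rules! debug_text {
    ($value:literal) => {
        // The literal's type is only checked after expansion.
        format!("{:?}", $value)
    };
}

fn main() {
    // An integer, a string and a char are all accepted by `literal`.
    assert_eq!(debug_text!(3), "3");
    assert_eq!(debug_text!("hi"), "\"hi\"");
    assert_eq!(debug_text!('a'), "'a'");
    // debug_text!(some_variable) would be rejected by the matcher,
    // because an identifier is not a literal.
}
```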
Why do these examples avoid using macros?
The example above uses the println! macro inside the do_thing macro. Rust is totally fine with this! However, macrokata tries to avoid (as much as possible) using macros we didn't define inside the main function. The reason for this is that, if we did use println!, you would see its expansion as well. That could be confusing, since
print("some text")
is much easier to read than
{
::std::io::_print(
::core::fmt::Arguments::new_v1(
&["some text"],
&[],
),
);
};
Exercise 3: Literal Meta-Variables
Your task is to create a macro which can perform two small bits of math:
- The syntax math!(3 plus 5) should expand to 3 + 5, where 3 and 5 could be any literal.
- The syntax math!(square 2) should expand to 2 * 2, where 2 could be any literal.
You may not edit the main
function, but it should eventually look like the
following:
fn main() {
print_result(3 + 5);
print_result(2 * 2);
}
Exercise 4: Expression Metavariables
We can now capture fragments of Rust code that are literals, but there are other fragments of Rust code which can be captured in metavariables. In general, every metavariable is of the form $<NAME>:<FRAGSPEC>. <NAME> is replaced with the name of the metavariable, but FRAGSPEC is more interesting. It means "Fragment Specifier", and it tells Rust what sort of fragment of Rust code you intend to match. We've already seen literal, but another common fragment specifier is expr, which allows you to capture any Rust expression (for example, (3 * 5) or function_call() + CONSTANT).
Using this specifier is nearly identical to using the literal fragment specifier: $x:expr indicates a metavariable, which is an expression, named x.
It's also worth mentioning the fragment specifier stmt, which is similar to expr, but allows Rust statements too, like let statements.
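As a rough illustration of the expr specifier, the sketch below defines a hypothetical double macro (not one of the exercise macros) and calls it with a literal, an arithmetic expression and a function call.

```rust
// `double` and `three` are made-up names for this illustration.
macro_rules! double {
    ($x:expr) => {
        $x * 2
    };
}

fn three() -> i32 {
    3
}

fn main() {
    // A literal is also an expression.
    assert_eq!(double!(4), 8);
    // The captured expression is substituted as a single unit,
    // so this is (1 + 2) * 2, not 1 + (2 * 2).
    assert_eq!(double!(1 + 2), 6);
    // Function calls are expressions too.
    assert_eq!(double!(three() + 1), 8);
}
```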
Macros and the Precedence of Operators
Macros do affect the order of operations. The expression 3 * math!(4, plus, 2) expands to 3 * (4 + 2). This is not clearly outlined anywhere (that I can find), and a previous version of this guide incorrectly stated the opposite.
You can check this behaviour with the following program:
macro_rules! math {
    () => { 3 + 4 }
}

fn main() {
    let math_result = 2 * math!();
    // 2 * (3 + 4) == 14
    assert_eq!(math_result, 14);
    // (2 * 3) + 4 == 10
    assert_ne!(math_result, 10);
}
"Follow-set Ambiguity Rules"
The Rust parser needs to have some way of knowing where a metavariable ends. If it didn't, expressions like $first:expr $second:expr would be confusing to parse in some circumstances. For example, how would you parse a * b * c * d? Would first be a, and second be *b * c * d? Or would first be a * b * c, and second be * d?
To avoid this problem entirely, Rust has a set of rules called the "follow-set ambiguity rules". These tell you which tokens are allowed to follow a metavariable (and which aren't).
For literals, this rule is simple: anything can follow a literal metavariable.
For expr (and its friend stmt) the rules are much more restrictive: they can only be followed by => or , or ;.
This means that building a matcher like
macro_rules! broken_macro {
($a:expr please) => { $a }
}
fn main() {
// Fails to compile!
let value = broken_macro!(3 + 5 please);
}
will give you this compiler error:
error: `$a:expr` is followed by `please`, which is not allowed for `expr` fragments
--> broken_macro.rs:2:14
|
2 | ($a:expr please) => { $a }
| ^^^^^^ not allowed after `expr` fragments
|
= note: allowed there are: `=>`, `,` or `;`
As we encounter more expression types, we'll make sure to mention their follow-set rules, but this page in the Rust reference has a comprehensive list of the rules for each fragment specifier type.
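One way to satisfy the follow-set rule is to put an allowed token, such as a comma, between the expression and the keyword. The sketch below uses a hypothetical polite macro to show this; it is an illustration rather than part of the exercise.

```rust
// `polite` is a made-up macro for this illustration.
macro_rules! polite {
    // `,` is in the follow set of `expr`, so this matcher is accepted.
    ($a:expr, please) => {
        $a
    };
}

fn main() {
    let value = polite!(3 + 5, please);
    assert_eq!(value, 8);
}
```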
Exercise 4: Expression Variables
In this task, you will be completing a similar task to the previous one. Last time, your macro should have worked with any literal, but now we would like a macro which works with any expression.
- The syntax math!(3, plus, (5 + 6)) should expand to 3 + (5 + 6), where 3 and (5 + 6) could be any expression.
- The syntax math!(square my_expression) should expand to my_expression * my_expression, where my_expression could be any expression.
You may not edit the main
function, but it should eventually look like the
following:
fn main() {
let var = 5;
print_result((2 * 3) + var);
print_result(var * var);
}
Exercise 5: A More Complex Example
In this task, we'll be implementing code to make the following syntax possible:
fn main() {
for_2d!(row <i32> in 1..5, col <i32> in 2..7, {
// code
});
}
Ignoring extra curly braces, this code should translate to
fn main() {
    for row in 1..5 {
        let row: i32 = row;
        for col in 2..7 {
            let col: i32 = col;
            // code
        }
    }
}
Note that the names of the variables may change (i.e. they could be row and col, or x and y, or something else).
To complete this task, there are more fragment specifiers you will need to know about:
- ident: an "identifier", like a variable name. ident metavariables can be followed by anything.
- block: a "block expression" (curly braces, and their contents). Can be followed by anything.
- ty: a type. Can only be followed by =>, ,, =, |, ;, :, >, >>, [, {, as, where, or a block metavariable.
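To show the three specifiers working together without giving away the exercise, here is a sketch of an unrelated, hypothetical macro named make_fn. It takes a function name (ident), a return type (ty) and a body (block), and generates a function from them.

```rust
// `make_fn` and `forty_two` are made-up names for this illustration.
macro_rules! make_fn {
    // `ident` and `block` can be followed by anything; `ty` is followed
    // by `,` here, which is in its follow set.
    ($name:ident, $ret:ty, $body:block) => {
        fn $name() -> $ret $body
    };
}

// Expands to: fn forty_two() -> i32 { 40 + 2 }
make_fn!(forty_two, i32, { 40 + 2 });

fn main() {
    assert_eq!(forty_two(), 42);
}
```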
As a reminder, you may not edit the main
function, but it should eventually
look like the following:
fn main() {
for row in 1..5 {
let row: i32 = row;
for col in 2..7 {
let col: i32 = col;
{ (Coordinate { x: col, y: row }).show() }
}
}
let values = [1, 3, 5];
for x in values {
let x: u16 = x;
for y in values {
let y: u16 = y;
{
(Coordinate {
x: x.into(),
y: y.into(),
})
.show()
}
}
}
}
Exercise 6: Repetition
Hopefully, you're now feeling pretty confident with metavariables. One of the first justifications we gave for macros was their ability to simulate "variadic" functions (functions which have a variable number of arguments). In this exercise, we'll have a look at how you can implement them yourself.
A simple approach might be to write a rule for each number of arguments. For example, one might write
macro_rules! listing_literals {
    (the $e1:literal) => {
        {
            let mut my_vec = Vec::new();
            my_vec.push($e1);
            my_vec
        }
    };
    (the $e1:literal and the $e2:literal) => {
        {
            let mut my_vec = Vec::new();
            my_vec.push($e1);
            my_vec.push($e2);
            my_vec
        }
    };
    (the $e1:literal and the $e2:literal and the $e3:literal) => {
        {
            let mut my_vec = Vec::new();
            my_vec.push($e1);
            my_vec.push($e2);
            my_vec.push($e3);
            my_vec
        }
    }
}

fn main() {
    let vec: Vec<&str> = listing_literals!(the "lion" and the "witch" and the "wardrobe");
    assert_eq!(vec, vec!["lion", "witch", "wardrobe"]);
    let vec: Vec<i32> = listing_literals!(the 9 and the 5);
    assert_eq!(vec, vec![9, 5]);
}
This is very clunky, and involves a large amount of repeated code. Imagine doing this for 10 arguments! What if we could say that we want a variable number of a particular pattern? That would let us say "give me any number of $e:expr fragments, and I'll tell you what to do with them".
Macro repetitions let us do just that. They consist of three things:
- A group of tokens that we want to match repeatedly.
- Optionally, a separator token (which tells the parser what to look for between each match).
- Either +, * or ?, which says how many times to expect a match. + means "at least once". * means "any number of times, including 0 times". ? means "either 0 times, or 1 time".
Let's look at an example of a macro repetition, to parse the exact macro we showed above.
The matcher we would use for this is $(the $my_literal:literal)and+.
To break that down:
- $( means that we're starting a repetition.
- Inside the brackets, the $my_literal:literal is the pattern we're matching. We'll match the exact text "the", and then a literal token.
- The ) means that we're done describing the pattern to match.
- The and is optional, but it is the "separator": a token you can use to separate multiple repetitions. Commonly it's , to comma-separate things.
- Here, we use +, which means the repetition must happen at least once. * would have worked just as well if we were okay with an empty Vec.
What's now left is to use the matched values. To do this, the rule would be something like:
($(the $my_literal:literal)and+) => {
{
let mut my_vec = Vec::new();
$(my_vec.push($my_literal));+;
my_vec
}
}
The line $(my_vec.push($my_literal));+; is nearly identical to the repetition we saw above, but to break it down:
- $( tells us that we're starting a repetition.
- my_vec.push($my_literal) is the code that will be transcribed. $my_literal will be replaced with each of the literals specified in the matcher.
- The ) means that we're done describing the code that will be transcribed.
- The ; means we're separating these lines with semicolons. Note that if you want, this could also be empty (to indicate they should be joined without anything in the middle).
- The + ends the repetition.
- The ; adds a final semicolon after the expansion of everything.
So this will expand into the same code we saw above!
It's worth noting that we've used an extra set of curly braces in our transcriber. This is because if you don't put the code in a block, the code will look like let whatever = let mut my_vec = Vec::new();, which doesn't make sense. If you put the code in a curly brace, then the right-hand side of the = sign will be a block which returns my_vec.
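Putting the matcher and transcriber fragments from this section together gives the complete macro below. This is a sketch assembled from the pieces discussed above; it replaces the three hand-written rules with a single rule.

```rust
macro_rules! listing_literals {
    // One or more `the <literal>` groups, separated by `and`.
    ($(the $my_literal:literal)and+) => {
        {
            let mut my_vec = Vec::new();
            // One `push` per matched literal, joined by `;`.
            $(my_vec.push($my_literal));+;
            my_vec
        }
    };
}

fn main() {
    let vec: Vec<&str> = listing_literals!(the "lion" and the "witch" and the "wardrobe");
    assert_eq!(vec, vec!["lion", "witch", "wardrobe"]);
    let vec: Vec<i32> = listing_literals!(the 9 and the 5);
    assert_eq!(vec, vec![9, 5]);
}
```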
Exercise 6: Repetition
In this task, you will be creating an if_any!
macro. If any of the first arguments are true,
it should execute the block which is the last argument.
You may not edit the main
function, but once you have completed the exercise, your if_any!
macro should expand to look like the
following:
fn main() {
if false || 0 == 1 || true {
print_success();
}
}
Exercise 7: More Repetition
This exercise is going to also cover writing repetitions, but now involving more than one metavariable. Don't worry: the syntax is the exact same as what you've seen before.
Before you start, let's just quickly cover the different ways you can use a metavariable within a repetition.
Multiple Metavariables in One Repetition
You can indicate that two metavariables should be used in a single repetition.
For example, ( $($i:ident is $e:expr),+ ) would match my_macro!(pi is 3.14, tau is 6.28).
You would end up with $i having matched pi and tau; and $e having matched 3.14 and 6.28.
Any repetition in the transcriber can use $i, or $e, or both within the same repetition.
So a transcriber could be $(let $i = $e;)+; or let product = $($e)*+.
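The sketch below uses both metavariables, first together in one repetition and then $i alone in a second one. The macro name total and the names in the invocation are made up for this illustration.

```rust
// `total` is a made-up macro for this illustration.
macro_rules! total {
    ( $($i:ident is $e:expr),+ ) => {
        {
            // `$i` and `$e` used together: one `let` per matched pair.
            $(let $i = $e;)+
            // `$i` used alone: add every bound name to the running sum.
            0 $(+ $i)+
        }
    };
}

fn main() {
    // Expands to { let apples = 3; let pears = 6; 0 + apples + pears }
    let sum = total!(apples is 3, pears is 6);
    assert_eq!(sum, 9);
}
```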
One Metavariable Each, For Two Repetitions
Alternatively, you could specify two different repetitions, each containing their own metavariable. For example, this program will construct two vecs.
macro_rules! two_vecs {
    ($($vec1:expr),+; $($vec2:expr),+) => {
        {
            let mut vec1 = Vec::new();
            $(vec1.push($vec1);)+
            let mut vec2 = Vec::new();
            $(vec2.push($vec2);)+
            (vec1, vec2)
        }
    }
}

fn main() {
    let vecs = two_vecs!(1, 2, 3; 'a', 'b');
}
Importantly, with the above example, you have to be careful about using $vec1 and $vec2 in the same repetition within the transcriber. It is a compiler error to use two metavariables captured a different number of times in the same repetition.
To quote the reference:
Each repetition in the transcriber must contain at least one metavariable to decide how many times to expand it. If multiple metavariables appear in the same repetition, they must be bound to the same number of fragments. For instance, ( $( $i:ident ),* ; $( $j:ident ),* ) => (( $( ($i,$j) ),* )) must bind the same number of $i fragments as $j fragments. This means that invoking the macro with (a, b, c; d, e, f) is legal and expands to ((a,d), (b,e), (c,f)), but (a, b, c; d, e) is illegal because it does not have the same number.
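A runnable variation on the quoted rule is sketched below. It uses expr instead of ident so that the result can be checked at run time; the macro name zip_pairs is an assumption made for this illustration.

```rust
// `zip_pairs` is a made-up macro for this illustration.
macro_rules! zip_pairs {
    ( $( $left:expr ),* ; $( $right:expr ),* ) => {
        // `$left` and `$right` share one repetition, so both lists
        // must contain the same number of expressions.
        vec![ $( ($left, $right) ),* ]
    };
}

fn main() {
    let pairs = zip_pairs!(1, 2, 3; 'a', 'b', 'c');
    assert_eq!(pairs, vec![(1, 'a'), (2, 'b'), (3, 'c')]);
    // zip_pairs!(1, 2, 3; 'a', 'b') would fail to compile,
    // because the two lists have different lengths.
}
```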
Exercise 7: More Repetition
In this task, you will be creating a hashmap macro. It should consist of comma-separated pairs, of the form literal => expr. This should construct an empty HashMap and insert the relevant key-value pairs.
You may not edit the main
function, but it should eventually look like the
following:
fn main() {
let value = "my_string";
let my_hashmap = {
let mut hm = HashMap::new();
hm.insert("hash", "map");
hm.insert("Key", value);
hm
};
print_hashmap(&my_hashmap);
}
Exercise 8: Nested Repetition
In this exercise, you will need to use nested repetition. That's where you write a repetition inside another one. For example, ( $( $( $val:expr ),+ );+ ) would let you specify one or more values, separated by either ; or ,.
The only oddity about nested repetition is that you must ensure that you use metavariables in a context where it's clear you're only referring to one of them. In other words, the $val metavariable in the last paragraph must be used within a nested repetition.
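As an illustration of that matcher in use, the sketch below builds a vector of rows. The macro name grid is hypothetical, and the example is deliberately different from the graph exercise that follows.

```rust
// `grid` is a made-up macro for this illustration.
macro_rules! grid {
    ( $( $( $val:expr ),+ );+ ) => {
        // The outer repetition produces one inner `vec!` per `;`-separated row;
        // `$val` only appears inside the inner repetition.
        vec![ $( vec![ $( $val ),+ ] ),+ ]
    };
}

fn main() {
    let rows = grid!(1, 2, 3; 4, 5);
    assert_eq!(rows, vec![vec![1, 2, 3], vec![4, 5]]);
}
```

Note that rows may have different lengths: each inner repetition is counted separately for each iteration of the outer one.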
Exercise 8: Nested Repetition
In this task, you will be building a macro to load a data structure with an adjacency list from a graph. As a refresher, graphs are data structures that describe how different nodes are connected.
Each node will be a literal, and you will be specifying, for each node, which nodes it connects to. For example,
graph!{
1 -> (2, 3, 4, 5);
2 -> (1, 3);
3 -> (2);
4 -> ();
5 -> (1, 2, 3);
}
should get translated into a Vec
containing the pairs (1, 2)
, (1, 3)
, ... (2, 1)
, ... (5, 3)
.
You may not edit the main
function, but it should eventually look like the
following:
#[allow(clippy::vec_init_then_push)]
fn main() {
let my_graph = {
let mut vec = Vec::new();
vec.push((1, 2));
vec.push((1, 3));
vec.push((1, 4));
vec.push((1, 5));
vec.push((2, 1));
vec.push((2, 3));
vec.push((3, 2));
vec.push((5, 1));
vec.push((5, 2));
vec.push((5, 3));
vec
};
print_vec(&my_graph);
}
Exercise 9: Ambiguity and Ordering
Up until this point, we've mostly been dealing with macros with a single rule. We saw earlier that macros can require more than one rule, but so far we've never had ambiguity in which rule should be followed.
There are, however, multiple circumstances where rules could have ambiguity, so it's important to understand how macros deal with that ambiguity.
The following is adapted from the Rust documentation on macros:
- When a macro is invoked (i.e. someone writes my_macro!()), the compiler looks for a macro with that name, and tries each rule in turn.
- To try a rule, it reads through the tokens of the invocation one at a time. There are three possibilities:
  - The token found matches the matcher. In this case, it keeps parsing the next token. If there are no tokens left, and the matcher is complete, then the rule matches.
  - The token found does not match the matcher. In this case, Rust tries the next rule. If there are no rules left, an error is raised as the macro cannot be expanded.
  - The rule is ambiguous. In other words, it's not clear from just this token what to do. If this happens, this is an error.
- If it finds a rule that matches the tokens inside the brackets, it starts transcribing. Once a match is found, no more rules are examined.
Let's have a look at some examples:
macro_rules! ambiguity {
($($i:ident)* $j:ident) => { };
}
fn main() {
ambiguity!(error);
}
This example fails because Rust is not able to determine what $j should be just by looking at the current token. If Rust could look forward, it would see that $j must be followed by a ), but it cannot, so it causes an error.
macro_rules! ordering {
    ($j:expr) => { "This was an expression" };
    ($j:literal) => { "This was a literal" };
}

fn main() {
    let expr1 = ordering!('a'); // => "This was an expression".
    let expr2 = ordering!(3 + 5); // => "This was an expression".
}
This example shows a case where Rust macros can behave strangely due to ordering rules: even though literal is a much stricter condition than expr, because literals are exprs, the first rule will always match.
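A common remedy is to list the stricter rule first. The sketch below is the same macro with its two rules swapped; this is offered as one possible fix, not necessarily the approach the exercise solution takes.

```rust
macro_rules! ordering {
    // The stricter rule comes first, so literals are caught here...
    ($j:literal) => { "This was a literal" };
    // ...and everything else falls through to the general rule.
    ($j:expr) => { "This was an expression" };
}

fn main() {
    assert_eq!(ordering!('a'), "This was a literal");
    assert_eq!(ordering!(3 + 5), "This was an expression");
}
```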
Exercise 9: Ambiguity and Ordering
This task is a little bit different to previous tasks: we have given you a partially functional macro already, along with some invocations of that macro.
You should adjust the macro's rules and syntax to make sure that you achieve the correct behaviour without any ambiguity.
- sum!() should sum two or more expressions together.
- get_number_type!() should determine what sort of Rust syntax is being used: a positive literal, a negative literal, a block, or an expression.
You may not edit the main
function, but it should eventually look like the
following:
fn main() {
NumberType::PositiveNumber(5).show();
NumberType::NegativeNumber(-5).show();
#[allow(clippy::let_and_return)]
NumberType::UnknownBecauseBlock({
let x = 6;
x
})
.show();
NumberType::UnknownBecauseExpr(1 + 2 + 3 + 4).show();
NumberType::UnknownBecauseExpr(3 + 5 - 1).show();
}
Exercise 10: Macros Calling Macros
We briefly mentioned in a previous exercise that macros are able to call other macros. In this exercise we will look at a brief example of that. Before we do, there are three small notes we should mention.
Useful built-in macros
There are two useful macros which the standard library provides: stringify!() and concat!(). Both of them produce static string slices, made up of tokens.
The stringify! macro takes tokens and turns them into a &str that textually represents what those tokens are. For example, stringify!(1 + 1) will become "1 + 1".
The concat! macro takes a comma-separated list of literals, and creates a &str which concatenates them. For example, concat!("test", true, 99) becomes "testtrue99".
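It's useful to know that these two macros treat a macro invocation in their parameters differently. concat! expands macros that appear in its arguments first, so if test!() expanded to "a", then concat!(test!(), "b") would become "ab". stringify! does not: its argument is turned into text without being expanded, so stringify!(test!()) is "test!()", even if test!() would expand to 1 + 1.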
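The two documented examples can be checked directly, along with a macro call nested inside concat!. In the sketch below, greeting is a hypothetical helper macro introduced only for this illustration.

```rust
// `greeting` is a made-up macro for this illustration.
macro_rules! greeting {
    () => {
        "hello"
    };
}

fn main() {
    // Tokens are turned into their textual form.
    assert_eq!(stringify!(1 + 1), "1 + 1");
    // Literals of different kinds are joined into a single &str.
    assert_eq!(concat!("test", true, 99), "testtrue99");
    // A macro call inside concat! is expanded before concatenation.
    assert_eq!(concat!(greeting!(), ", world"), "hello, world");
}
```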
The tt fragment specifier
An important fragment specifier which we have not, as of yet, discussed, is tt. This captures a "Token Tree", which is any token, or a group of tokens inside brackets. This is the most flexible fragment specifier, because it imposes no meaning on what the captured tokens might be. For example:
macro_rules! stringify_number {
    (one) => {"1"};
    (two) => {"2"};
    ($tokens:tt) => { stringify!($tokens)};
}

fn main() {
    stringify_number!(one); // is "1"
    stringify_number!(while); // is "while"
    stringify_number!(bing_bang_boom); // is "bing_bang_boom"
}
It's really important to keep in mind with tt metavariables that you must ensure that anything after them can be unambiguously parsed. In other words, a repetition like $($thing:tt)* (ending with *, + or ?) must be the last fragment in the matcher. Since anything can be a token tree, Rust could not know what to accept after that repetition.
To avoid this issue, you can either match a single tt, and make the user wrap multiple tokens inside brackets, or you can specify a delimiter for your match (i.e. $($thing:tt),+, since two token trees not separated by a , could not match).
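The delimiter approach can be sketched as follows. The macro name stringify_all is made up for this illustration; it accepts one or more comma-separated token trees and turns each into a string.

```rust
// `stringify_all` is a made-up macro for this illustration.
macro_rules! stringify_all {
    // Each `tt` is exactly one token tree; the commas separate them.
    ($($thing:tt),+) => {
        vec![ $( stringify!($thing) ),+ ]
    };
}

fn main() {
    // An identifier, a keyword and a literal are each a single token tree.
    let names = stringify_all!(one, while, 42);
    assert_eq!(names, vec!["one", "while", "42"]);
}
```

A bracketed group such as (a b c) would also count as a single token tree in this macro, which is how a caller can pass several tokens in one position.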
Restrictions on "Forwarding Macros"
There is one important restriction when calling a macro using another macro.
When forwarding a matched fragment to another macro-by-example, matchers in the second macro will be passed an AST of the fragment type, which cannot be matched on except as a fragment of that type. The second macro can't use literal tokens to match the fragments in the matcher, only a fragment specifier of the same type. The ident, lifetime, and tt fragment types are an exception, and can be matched by literal tokens. The following illustrates this restriction:
macro_rules! foo {
($l:expr) => { bar!($l); }
// ERROR: ^^ no rules expected this token in macro call
}
macro_rules! bar {
(3) => {}
}
fn main() {
foo!(3);
}
The following illustrates how tokens can be directly matched after matching a tt fragment:
// compiles OK
macro_rules! foo {
    ($l:tt) => { bar!($l); }
}

macro_rules! bar {
    (3) => {}
}

fn main() {
    foo!(3);
}
Exercise 10: Macros Calling Macros
In this exercise, you have already been provided with a macro called digit, which maps the identifiers zero through nine to a &str with their numeric value.
Your task is to write a macro called number!() which takes at least one of the identifiers zero through nine, and converts them to a string containing numbers.
For example, number!(one two three) should expand to "123".
Note: previously exercise 10 was about making a hashmap. The exercise has changed, but the old code is still available in the archive/ directory. It will be removed on the next update of this book.
Exercise 11: Macro Recursion
This exercise is a sort of culmination of everything you've learned so far about macros.
To complete it, you'll need to note one important fact: macros can recurse into themselves.
This allows very powerful expansions. As a simple example:
enum LinkedList {
    Node(i32, Box<LinkedList>),
    Empty
}

macro_rules! linked_list {
    () => {
        LinkedList::Empty
    };
    ($expr:expr $(, $exprs:expr)*) => {
        LinkedList::Node($expr, Box::new(linked_list!($($exprs),*)))
    }
}

fn main() {
    let my_list = linked_list!(3, 4, 5);
}
The above example is very typical. The first rule is the "base case": an empty list of tokens implies an empty linked list.
The second rule always matches one expression first (expr). This allows us to refer to it on its own, in this case to create the Node. The rest of the expressions (exprs) are stored in a repetition, and all we'll do with them is recurse into linked_list!(). If there are no expressions left, that call to linked_list!() will give back Empty, otherwise it'll repeat the same process.
While macro recursion is incredibly powerful, it is also slow. As a result, there is a limit to the amount of recursion you are allowed to do. In rustc, the limit is 128, but you can configure it with #![recursion_limit = "256"] as a crate-level attribute.
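Another small sketch of the same base-case-plus-recursive-case shape is a macro that counts its arguments. The name count is an assumption for this illustration. Each recursive call strips one token tree, so very long inputs would eventually hit the recursion limit described above.

```rust
// `count` is a made-up macro for this illustration.
macro_rules! count {
    // Base case: no tokens left.
    () => {
        0usize
    };
    // Recursive case: peel off one token tree and count the rest.
    ($head:tt $($tail:tt)*) => {
        1usize + count!($($tail)*)
    };
}

fn main() {
    assert_eq!(count!(), 0);
    assert_eq!(count!(a b c), 3);
}
```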
Exercise 11: Currying
Before you complete the exercise, let's briefly discuss a concept called "currying". If you're already familiar with the concept, perhaps from your own experience of functional programming, you can skip the next two paragraphs.
In most imperative languages, the syntax to call a function with multiple arguments is function(arg1, arg2, arg3). If you do not provide all the arguments, that is an error. In many functional languages, however, the syntax for function calls is more akin to function(arg1)(arg2)(arg3). The advantage of this notation is that if you specify fewer than the required number of arguments, it's not an error: you get back a function that takes the rest of the arguments. A function that behaves this way is said to be "curried" (named after Haskell Curry, a famous mathematician).
A good example of this is a curried add function. In regular Rust, we'd say add is move |a, b| a + b. If we curried that function, we'd instead have move |a| move |b| a + b.
What this means is that we can write let add_1 = add(1);, and we now have a function which will add 1 to anything.
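Written out by hand, without any macro, the curried add could look like the sketch below. The function name curried_add is made up for this illustration; the closure inside main is the form given in the text, with type annotations added.

```rust
// `curried_add` is a made-up name for this illustration.
// It takes the first argument and returns a function awaiting the second.
fn curried_add(a: i32) -> impl Fn(i32) -> i32 {
    move |b| a + b
}

fn main() {
    // Supplying one argument gives back a function, not an error.
    let add_1 = curried_add(1);
    assert_eq!(add_1(41), 42);
    // Supplying both arguments in sequence gives the final result.
    assert_eq!(curried_add(2)(3), 5);

    // The same idea written as nested closures.
    let add = move |a: i32| move |b: i32| a + b;
    assert_eq!(add(10)(5), 15);
}
```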
In this exercise, you will build a macro which helps you understand currying, and build a curried function in Rust. The syntax for this macro will be curry!((a: i32) => (b: i32) => _, {a + b}). Each pair of ident: ty is an argument, and the last _ indicates that the compiler will infer the return type. The block provided last is, of course, the computation we want to do after receiving all the arguments.
At each step of the currying process, you should call the macro print_curried_argument. This takes in a value (which, for the purposes of the exercise, you can assume will always be Copy). It will print out the value that you have been provided as an argument.
Exercise 12: Hygiene
To quote the reference:
By default, all identifiers referred to in a macro are expanded as-is, and are looked up at the macro's invocation site. This can lead to issues if a macro refers to an item (i.e. function/struct/enum/etc.) or macro which isn't in scope at the invocation site. To alleviate this, the $crate metavariable can be used at the start of a path to force lookup to occur inside the crate defining the macro.
Here is an example to illustrate (again, taken from the reference linked above):
// Definitions in the `helper_macro` crate.
#[macro_export]
macro_rules! helped {
/*
() => { helper!() }
// ^^^^^^ This might lead to an error due to 'helper' not being in scope.
*/
() => { $crate::helper!() }
}
#[macro_export]
macro_rules! helper {
() => { () }
}
// Usage in another crate.
// Note that `helper_macro::helper` is not imported!
use helper_macro::helped;
fn unit() {
helped!();
}
In other words, this means that a macro needs to have all the functions/structs/macros it uses in scope at its call site, not at the place where it is defined. This is fine if the macro is used within a single file, but if a macro is exported, then it makes things complicated.
The $crate metavariable lets you refer to things that are in the crate the macro was defined in (as opposed to the crate the macro was called in). If the macro was defined in crate foo, and used in crate bar, then a reference to a struct Widget like: Widget::new() is creating a new bar::Widget (and if one doesn't exist, you'll get an error). If you called $crate::Widget::new(), then you're always talking about foo::Widget, no matter what crate you're in.
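The same mechanism can be observed inside a single crate, where $crate simply names the crate root. The sketch below assumes it is compiled as the root file of a crate; the module and macro names (helpers, double_it, elsewhere) are made up for this illustration.

```rust
// All names here are hypothetical; the file is assumed to be the crate root.
pub mod helpers {
    pub fn double(x: i32) -> i32 {
        x * 2
    }
}

macro_rules! double_it {
    ($x:expr) => {
        // The path starts at the defining crate's root, so it resolves
        // even where `helpers` has not been imported.
        $crate::helpers::double($x)
    };
}

mod elsewhere {
    // Note: no `use crate::helpers;` in this module.
    pub fn run() -> i32 {
        double_it!(21)
    }
}

fn main() {
    assert_eq!(elsewhere::run(), 42);
}
```

Had the transcriber been written as helpers::double($x), the call inside elsewhere would fail to resolve, because helpers is not in scope in that module.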
A footnote on how expanding macros into text is misleading
(Based on the disclaimer for the brilliant cargo-expand)
Be aware that macro expansion to text is a lossy process. That means that the expanded code we show in these kata should be used as a debugging aid only. There should be no expectation that the expanded code can be compiled successfully, nor that if it compiles then it behaves the same as the original code. In these kata, we try to avoid these issues as far as possible.
For instance, in the program below, answer = 3 when compiled ordinarily by Rust, but the expanded code, when compiled, would set answer = 4.
fn main() {
    let x = 1;
    macro_rules! first_x {
        () => { x }
    }
    let x = 2;
    let answer = x + first_x!();
}
Refer to The Little Book Of Rust Macros for more on the considerations around macro hygiene.
Exercise 12
Exercise 12 consists of a file containing multiple modules. Fix the code so that the macro works correctly in all invocations.
Note that you will need to use the $crate metavariable.
The subjects of scoping, importing and exporting are well covered by the Rust Reference. While practical examples of these may be useful, this first version of macrokata does not include exercises for them. If you plan on using macros in a larger project, we suggest reading the above reference.
Extra Reading
There are two excellent resources for further reading on Rust's macro system:
- The Rust Reference,
- The Little Book of Rust Macros (note that an older version exists, so make sure you're reading this up-to-date one!)