Article Roadmap: From JSON parsing to bindgen to Declarative and Procedural Macros and how to debug them
Safer, Simpler Embedded Rust with Apache Mynewt on STM32 Blue Pill
Declarative and Procedural Macros (plus bindgen and tips for Visual Studio Code) to protect Embedded Rust coders from stumbling into embedded traps
I’m named MyNewt not Mynewt because my handler is a crazy coder named Lup and this lizard doesn’t speak on behalf of the wonderful Apache Mynewt team
What’s great about Apache Mynewt? Today Mynewt runs on many microcontroller platforms with preemptive multitasking. (So it won’t choke when reading sensors and transmitting data simultaneously.)
Mynewt has drivers for many sensors (like BME280), networks (ESP8266, nRF24L01, …) and protocols (CoAP, CBOR, …).
But Mynewt was built with C, which has its problems…
The C code appeared in an earlier article about Mynewt on STM32 Blue Pill
The new Rust code is here. Based on the previous article on Rust and Mynewt…
Declarative Macros in Rust
Declarative Macros in Rust have the form ( pattern ) => { substitution }
Here’s a simple Declarative Macro add_88!() that returns its argument plus 88…
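A minimal sketch of such a macro, assuming a single rule that takes one expression (the helper function add_88_demo() is only there for illustration):

```rust
// Sketch of add_88!(): one rule of the form ( pattern ) => { substitution }
macro_rules! add_88 {
    // Match a single Rust expression and bind it to $e
    ($e:expr) => {
        $e + 88
    };
}

// Hypothetical helper: the macro call below is replaced by `1 + 88` at compile time
fn add_88_demo() -> i32 {
    add_88!(1)
}
```

Because $e is matched as a complete expression, add_88!(1 << 2) adds 88 to the whole of 1 << 2. No extra parentheses are needed, unlike a C macro.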
The Rust macro accepts a single parameter $e, which should be a valid Rust Expression (denoted by expr). In C we would code the macro as…
#define add_88(e) ((e) + 88)
The Rust and C macro definitions are similar, except that Rust insists on knowing whether the parameter will be an expression (expr), identifier (ident), type (ty), statement (stmt), code block (block)… (Here’s the whole list)
We may provide multiple patterns like this to create an “overloaded function”…
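As a rough sketch (the two-parameter rule is a made-up example, not taken from the project), an “overloaded” add_88!() could look like this. The rules are tried from top to bottom and the first matching pattern wins:

```rust
// Hypothetical "overloaded" add_88!(): one macro, two patterns
macro_rules! add_88 {
    // One expression: add 88 to it
    ($e:expr) => {
        $e + 88
    };
    // Two expressions separated by a comma: add both, then add 88
    ($a:expr, $b:expr) => {
        $a + $b + 88
    };
}
```

So add_88!(1) uses the first rule and add_88!(1, 2) uses the second rule.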
Complex Pattern Matching in Declarative Macros
But wait… Rust macros can do so much more because of pattern matching! Here’s a tiny snippet from a Rust macro that parses JSON code (adapted from the serde_json library)…
https://github.com/lupyuen/stm32bluepill-mynewt-sensor/blob/rust-safe/src/mynewt/macros.rs#L105-L120
The pattern looks complicated…
(@$enc:ident @object $obj:ident [$($key:tt)+] ($value:expr) , $($rest:tt)*)
The parse!() macro is meant to be called with these parameters…
parse!( @json @object context ["device"] ("010203") , (omitted) )
Which is an intermediate step that’s called when encoding our CoAP message in JSON: coap!(@json { "device": "010203", … })
The macro parameters are matched against the pattern like this…
Matching the parse!() macro parameters with the macro pattern
What’s a Tag? Think of a Tag as an enum: an option that specifies how the macro should behave. For the first parameter of the macro we accept the tag “@json” to indicate that the macro should encode sensor data in JSON format, or “@cbor” for CBOR format. Tags may be used to implement internal rules.
What’s a Token Tree? It’s either a single Rust token, or a group of tokens enclosed in matching brackets: ( ), [ ] or { }. Here are three examples of Token Trees: x, (x + y), { println!("hi"); }.
The + and * operators should be familiar if you have used Regular Expressions. So $($key:tt)+ means the $key placeholder will be matched to one or more Token Trees (denoted by tt).
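To get a feel for Token Trees and the + operator, here’s a small sketch. The macro count_tts!() is a made-up example (it’s not part of our coap!() macro) that counts how many Token Trees it was given:

```rust
// Hypothetical macro: matches one or more Token Trees and counts them
macro_rules! count_tts {
    // Base case: exactly one Token Tree
    ($one:tt) => {
        1usize
    };
    // Recursive case: peel off the first Token Tree, then count the rest
    ($first:tt $($rest:tt)+) => {
        1usize + count_tts!($($rest)+)
    };
}
```

A bracketed group counts as one Token Tree, so count_tts!(x (x + y) { println!("hi"); }) sees three Token Trees. Without the brackets, x + y is three separate Token Trees.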
In the source code of the macro there are a couple of interesting things…
1️⃣ d!(…) is a simple macro we created to dump the parameters of the macro. Useful for debugging.
2️⃣ coap_item_str!(…) is a macro that will generate the JSON or CBOR code for encoding the key: value entry. The encoding format depends on the first parameter.
3️⃣ The macro is recursive! Once the rule has encoded the key: value entry, it continues to encode the rest of the input JSON ($rest) by calling itself!
parse!( @$enc @object $obj () ($($rest)*) ($($rest)*) )
That’s how our coap!() macro (complete source code here) recursively parses a JSON document and emits the CoAP encoding. Which makes Rust Declarative Macros very powerful for parsing many types of code recursively. Even Domain Specific Languages! Check out the details in the Little Book of Rust Macros.
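The same ideas (tags, internal rules, recursion) fit into a much smaller macro. The sketch below is not our coap!() macro: encode!() is a made-up example that collects text into a Vec, and the @plain tag is a hypothetical stand-in for @cbor, since real CBOR encoding is binary.

```rust
// Hypothetical mini-parser: encodes `"key": value, ...` entries as text
macro_rules! encode {
    // Internal rule: encode one entry in JSON format
    (@json @item $out:ident $key:tt $value:expr) => {
        $out.push(format!("\"{}\":{:?}", $key, $value));
    };
    // Internal rule: encode one entry in plain `key=value` format
    (@plain @item $out:ident $key:tt $value:expr) => {
        $out.push(format!("{}={:?}", $key, $value));
    };
    // Internal rule: nothing left to parse, so stop recursing
    (@$enc:ident @object $out:ident) => {};
    // Internal rule: the last entry, without a trailing comma
    (@$enc:ident @object $out:ident $key:tt : $value:expr) => {
        encode!(@$enc @item $out $key $value);
    };
    // Internal rule: encode the first entry, then call ourselves on the rest
    (@$enc:ident @object $out:ident $key:tt : $value:expr , $($rest:tt)*) => {
        encode!(@$enc @item $out $key $value);
        encode!(@$enc @object $out $($rest)*);
    };
    // Entry point: encode!(@json { ... }) or encode!(@plain { ... })
    (@$enc:ident { $($body:tt)* }) => {{
        let mut out: Vec<String> = Vec::new();
        // `out` is passed along explicitly so every rule refers to the same variable
        encode!(@$enc @object out $($body)*);
        out
    }};
}
```

Calling encode!(@json { "device": "010203", "t": 25 }) walks through the entries one at a time, just like the recursive rule described above.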
🤔 You may think… What’s the cost of embedding Rust code inside a C platform like Mynewt? Do Rust macros inject a lot more code to make them run on a C platform?
Not at all! The macros are expanded at compile time, so no macro-processing code ends up in the firmware. And the Rust compiler is incredibly clever at pattern matching and type inference. So these macros should take up roughly the same RAM, ROM and CPU resources as the older C code! While keeping the code clear and simple!
Two Legged Problems
Import C Functions into Rust with bindgen
We could import a C function into Rust by coding the Rust binding manually (as explained earlier). But writing the Rust bindings by hand for the entire Mynewt API is way too tedious.
Fortunately we have the bindgen tool: It reads a Mynewt C function declaration like this (os_task_init() is the Mynewt function for creating background tasks)…
And produces the Rust binding like this…
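To show the general shape of a binding without guessing at the generated Mynewt code, here’s a hand-written sketch for a different function: strlen() from the C standard library. The wrapper c_strlen() is a made-up helper for illustration.

```rust
use std::os::raw::c_char;

// Hand-written Rust binding for the C declaration
//   size_t strlen(const char *s);
// bindgen generates declarations of this shape from C header files.
// (`unsafe extern` needs Rust 1.82 or later; older toolchains write `extern "C"`)
unsafe extern "C" {
    fn strlen(s: *const c_char) -> usize;
}

// Hypothetical helper: calling the imported function is unsafe, because Rust
// can't verify that the pointer refers to a null-terminated string
fn c_strlen(bytes: &[u8]) -> usize {
    assert_eq!(bytes.last(), Some(&0u8), "string must end with a null byte");
    unsafe { strlen(bytes.as_ptr() as *const c_char) }
}
```

Note how the binding only declares the function. The function body stays in C and gets linked in later.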
We have a script that generates Rust bindings for Mynewt Core API, Sensor API, JSON API and CBOR API.
There’s still manual tweaking required… In the script we see many whitelisted and blacklisted C functions and types. These were carefully chosen to avoid duplicates across the different APIs.
Rust is self-documenting, so the Rust bindings for Mynewt automatically have documentation. Check out the Mynewt API Documentation for Rust…
💎 Normally bindgen reads an entire C header file to generate Rust bindings for all functions declared in the file. But Mynewt uses many include folders that will totally confuse bindgen. That’s why the script passes the options -CC -E -dD to gcc to create a C file that has all the include files (for that specific API) concatenated into one long source file. Which works great with bindgen!
The Restless bindgen Horde
Creating the safe wrapper in Rust doesn’t need a lot of manual coding… Procedural Macros in Rust can do the job for us automatically! First let’s understand what a Procedural Macro can do…
Create a Procedural Macro with syn
Here’s a simple Procedural Macro named out!() (shown in the pic above) that expands…
out!( NETWORK_TASK )
into…
unsafe { &mut NETWORK_TASK }
…because to a C coder starting Rust coding, unsafe &mut looks doubly intimidating, so we use a macro to declare that NETWORK_TASK will be modified by the external function, like this: out!(NETWORK_TASK)
From https://github.com/lupyuen/stm32bluepill-mynewt-sensor/blob/rust-safe/macros/src/lib.rs#L279-L292
Unlike Declarative Macros, Procedural Macros are coded like a real Rust function. A Procedural Macro simply transforms a stream of Rust source code tokens (TokenStream) into the expanded tokens, that will be fed back into the Rust compiler after expanding the macro.
parse_macro_input!() is provided by the syn library for parsing streams of Rust source code tokens. We can parse any Rust code with it: Expressions, Statement Blocks, Struct Definitions, Extern Declarations, … Here we are parsing the input as a Rust Identifier, like a variable or constant name.
We use format!() to create the desired code. Then we parse the code into a TokenStream and pass it back to the compiler.
This strn!() macro is similar… It expands
strn!( "network" )
into…
&Strn::new( b"network\0" )
The macro code looks similar, except that it parses the input as a Literal String (LitStr) instead of an Identifier. (And the expansion is different of course.) Why did we create strn!()?
C vs Rust Strings: Null vs Non-Null Terminated
In the world of Embedded C, strings are simple fixed arrays of ASCII bytes, terminated by a null 00 byte. Rust strings are more complicated: a Rust String is a Vector of bytes (holding UTF-8 text). A Rust Vector is resizable, so the vector uses an internal counter to remember its length. Strings in Rust are not terminated by the null byte.
This causes problems when Embedded Rust coders call Mynewt APIs, drivers and other C functions…
1️⃣ They may use Rust strings as though they were C strings, omitting the terminating null byte (and causing the C code to crash)
2️⃣ Or they may create many temporary Rust strings while appending the null byte before calling the C functions.
The solution we have chosen is a simplistic one: we create a new type Strn that represents a null-terminated string that will never be modified. It contains a fixed slice (array) of bytes that always ends with null. Strn verifies this when setting and getting the string.
The wrappers for Mynewt APIs only accept Strn for incoming strings, instead of the default *const c_char.
How do we create an Strn? By calling the strn!() macro we have just seen…
strn!( "network" )
Which expands into…
&Strn::new( b"network\0" )
This creates an Strn from the Rust Byte String b"network\0". C strings behave more like Rust Byte Strings, but unfortunately Byte Strings are harder to manipulate without a helper like Strn.
No more missing terminating nulls… No more temporary copies of strings just for adding nulls… We have made Mynewt APIs safer, simpler and more efficient for Rust!
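Here’s a simplified sketch of the idea. It is not the actual Strn implementation: the struct layout and method names are assumptions, and the strn!() below is a Declarative stand-in for the Procedural Macro so that the sketch fits in a single file.

```rust
/// Simplified stand-in for Strn: a byte slice that always ends with a null byte
pub struct Strn {
    bytes: &'static [u8],
}

impl Strn {
    /// Wrap a Byte String, verifying that it is null-terminated
    pub fn new(bytes: &'static [u8]) -> Strn {
        assert_eq!(bytes.last(), Some(&0u8), "Strn must end with a null byte");
        Strn { bytes }
    }

    /// Pointer that may be passed to a C function expecting `const char *`
    pub fn as_ptr(&self) -> *const u8 {
        self.bytes.as_ptr()
    }

    /// Length in bytes, excluding the terminating null
    pub fn len(&self) -> usize {
        self.bytes.len() - 1
    }
}

/// Declarative stand-in for strn!(): the null byte is appended at compile
/// time, so no temporary string is created when the program runs
macro_rules! strn {
    ($s:expr) => {
        &Strn::new(concat!($s, "\0").as_bytes())
    };
}
```

With this sketch, strn!("network") yields a reference to an Strn of length 7, and Strn::new(b"oops") stops with an error because the null byte is missing.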
Compose a Procedural Macro with quote!{}
Checking result codes returned by C functions can be very tedious and we forget to check them sometimes. Let’s look at this macro run!() that was used in our sample code to transform a bunch of C function calls (for encoding CoAP CBOR messages)…
…into this code with proper error checking, via check_result()…
How should we generate the expanded code in the run!() macro? We could have used the format!() method shown earlier…
But for generating a block of code, calling the quote!{} macro (from the quote library) looks cleaner…
From https://github.com/lupyuen/stm32bluepill-mynewt-sensor/blob/rust-safe/macros/src/lib.rs#L395-L402
Note that we use #stmts as a placeholder to tell the macro to substitute the value of stmts into actual expanded tokens.
We also used quote!{} in our macro like this…
From https://github.com/lupyuen/stm32bluepill-mynewt-sensor/blob/rust-safe/macros/src/lib.rs#L371-L377
quote!{} is used here like a template, injecting stmt_tokens into a chunk of Rust code. Super useful for generating Rust code… in a Rust program!
Match Code Patterns with syn
Procedural Macros are more powerful than Declarative Macros because they can analyse the source code tokens (with the syn library) and expand them differently depending on the context.
Our run!() macro is picky: it only watches out for calls to functions named cbor_encode_... and wraps each such call with an error handler (inserting the namespace tinycbor): let res = tinycbor::cbor_encode_...
It doesn’t disturb other statements like let encoder = ... because wrapping this with an error handler would be so wrong. So our macro needs to…
1️⃣ Match a Statement…
2️⃣ That contains a Function Call…
3️⃣ That looks like cbor_encode_...
That’s easy to do with the syn parser and Rust pattern matching…
From https://github.com/lupyuen/stm32bluepill-mynewt-sensor/blob/rust-safe/macros/src/lib.rs#L358-L371
Rust has incredibly powerful enums that will let us match against patterns like Semi(expr, ...) and Call(expr). This works really well with the syn parser for creating powerful source code transformations.
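The nested matching can be tried out without syn. In the sketch below, Stmt and Expr are toy enums (they are NOT the real syn types), transform() is a made-up function, and the exact text of the generated error check is an assumption:

```rust
// Toy syntax tree, loosely modelled on the shapes mentioned above
enum Expr {
    // A Function Call: function name and argument names
    Call(String, Vec<String>),
    // Any other expression, kept as text
    Other(String),
}

enum Stmt {
    // An expression followed by a semicolon
    Semi(Expr),
    // A `let` binding, kept as text
    Local(String),
}

// Wrap calls to cbor_encode_... with an error check, leave other statements alone
fn transform(stmt: &Stmt) -> String {
    match stmt {
        // 1. Match a Statement 2. that contains a Function Call 3. named cbor_encode_...
        Stmt::Semi(Expr::Call(name, args)) if name.starts_with("cbor_encode_") => {
            format!(
                "let res = tinycbor::{}({}); check_result(res)?;",
                name,
                args.join(", ")
            )
        }
        // Any other Function Call is passed through unchanged
        Stmt::Semi(Expr::Call(name, args)) => format!("{}({});", name, args.join(", ")),
        // Other expressions and `let` statements are passed through unchanged
        Stmt::Semi(Expr::Other(text)) => format!("{};", text),
        Stmt::Local(text) => format!("{};", text),
    }
}
```

A single match arm checks all three conditions at once, which is what makes enums so handy for transforming source code.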
The complete run macro with calls to syn and quote is shown below. Although the run macro is meant for calling Embedded C functions, the macro applies a layer of error checking on top of the function calls, in a way that feels natural to the Rust coder. Safer coding, made possible with Rust’s Procedural Macros!
Evolving the #[safe_wrap] macro
Put Everything Together: #[safe_wrap] macro
Remember the Safe Wrapper for Mynewt Functions? Now we can explain how the wrapper was constructed automatically with Procedural Macros.
To create the wrapper, we apply the #[safe_wrap] attribute to the Rust binding created by bindgen…
#[safe_wrap] is a Procedural Macro defined here…
From https://github.com/lupyuen/stm32bluepill-mynewt-sensor/blob/rust-safe/macros/src/lib.rs#L24-L107
The macro calls parse_macro_input!() from the syn library. The input to the macro is parsed as an extern function declaration (denoted by ItemForeignMod).
From https://github.com/lupyuen/stm32bluepill-mynewt-sensor/blob/rust-safe/macros/src/lib.rs#L24-L107
Next, we inspect each parameter of the extern function and call transform_arg_list() to transform each parameter into three forms…
1️⃣ Wrapper Declaration: How the parameter type is exposed via the wrapped function. For example, *mut for output pointers looks odd to C coders, so we rename it as Out<…>, which clearly states the intent (output pointer). *const c_char for input strings is renamed to &Strn, since the Strn type validates that the string is null-terminated. (Again, Strn signifies the intent.)
2️⃣ Validation Statement: To validate each parameter if needed, like verifying that all input strings are null-terminated. (We use the Strn type to perform the string validation.)
3️⃣ Call Expression: Inside the wrapper, use this type cast expression to call the C function. Rust is stricter than C about types, so we use type casting to handle tiny discrepancies like i32 vs u32 (signed vs unsigned integers).
From https://github.com/lupyuen/stm32bluepill-mynewt-sensor/blob/rust-safe/macros/src/lib.rs#L24-L107
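To see how the three forms fit together, here’s a hand-written sketch of what a wrapper could look like. Everything in it is hypothetical: raw_sensor_open() is a plain Rust function standing in for a bindgen binding (so the sketch compiles on its own), and the Strn and Out<…> definitions are simplified assumptions, not the project’s real types.

```rust
// Simplified stand-in for the Strn type (hypothetical layout)
pub struct Strn {
    bytes: &'static [u8],
}

impl Strn {
    pub fn new(bytes: &'static [u8]) -> Strn {
        Strn { bytes }
    }
    // Used by the Validation Statement: the string must end with a null byte
    pub fn validate(&self) {
        assert_eq!(self.bytes.last(), Some(&0u8), "string is not null-terminated");
    }
    pub fn as_ptr(&self) -> *const u8 {
        self.bytes.as_ptr()
    }
}

// Hypothetical alias: an output pointer exposed as a mutable reference
pub type Out<'a, T> = &'a mut T;

// Stand-in for a binding generated by bindgen. In a real project this would
// be an extern declaration for a C function.
unsafe fn raw_sensor_open(name: *const u8, timeout: u32, handle: *mut u32) -> i32 {
    if name.is_null() || handle.is_null() {
        return -1;
    }
    unsafe { *handle = 42 + timeout };
    0
}

// 1. Wrapper Declaration: &Strn and Out<u32> instead of raw pointers
pub fn sensor_open(name: &Strn, timeout: i32, handle: Out<u32>) -> Result<(), i32> {
    // 2. Validation Statement: check the incoming string
    name.validate();
    // 3. Call Expression: cast each parameter to the type expected by the C side
    let res = unsafe { raw_sensor_open(name.as_ptr(), timeout as u32, handle as *mut u32) };
    if res == 0 { Ok(()) } else { Err(res) }
}
```

The caller of sensor_open() never writes unsafe and never touches a raw pointer.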
Finally we use the quote! macro to combine the three forms into the expanded output below. The quote! macro clearly shows the structure of the output, thus I highly recommend quote! for composing Procedural Macros.
We use quote_spanned! instead of quote! in some spots…
What’s a Span? When the Rust Compiler sends a stream of tokens to our macro, it also transmits the byte location of each token in the source file. So when it hits a compilation error, the compiler can display the exact source code that caused the error.
Our macro is synthesising source code — creating new source code based on the original source code. So it’s possible that our synthesised source code will have errors. When that happens, we should tell the Rust Compiler the precise location of the original code.
That’s why it’s important for our macros to preserve the Span information. We create three forms of each parameter, but they are placed into different sections. By using quote_spanned! we preserve the original Span of each parameter. And the Rust Coder gets meaningful, relevant error messages.
Overview of the #[safe_wrap] attribute
Result, Ok and Err in Embedded Rust
What are Result<>, Ok() and Err() in Rust?
In Rust, it’s customary for functions to return Result<…> instead of an error code like C. That’s why our Safe Wrapper is declared as…
MynewtResult<…> (derived from Result<…>) is a Generic Type that contains either…
1️⃣ Ok(result) in which result is an optional result value
2️⃣ Err(error_code) in which error_code is the Mynewt system error code
How do we indicate the type of the result value? Through the function definition…
fn my_function() -> MynewtResult<i32>
…means that my_function() returns an integer result value or an error code.
fn my_function() -> MynewtResult<()>
…means that my_function() returns no result value (a.k.a. void) but it may return an error code.
To return a result value, we return Ok(value) or if there’s nothing to be returned, Ok(())
To return an error, we return a standard Mynewt error code wrapped with Err() like this: Err(MynewtError::SYS_EAGAIN)
What happens when we call a function that returns an error? For example…
See the strange ? dangling at the end of the task_init() function call?
It returns any errors immediately when they occur. The function exits early without executing the rest of the function.
So Result<…>, Ok(…), Err(…) and ? really help to make Mynewt error handling so much easier.
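Here’s a small sketch that can be run on a desktop to watch ? in action. The MynewtError and MynewtResult definitions are simplified stand-ins (the real ones live in the Mynewt wrappers), and task_init(), read_sensor() and start() are made-up functions:

```rust
// Simplified stand-in for the Mynewt error codes
#[allow(non_camel_case_types, dead_code)]
#[derive(Debug, PartialEq)]
enum MynewtError {
    SYS_EAGAIN,
    SYS_EINVAL,
}

// Simplified stand-in: a Result whose error is always a MynewtError
type MynewtResult<T> = Result<T, MynewtError>;

// Returns no result value, but may return an error code
fn task_init(priority: i32) -> MynewtResult<()> {
    if priority < 0 {
        return Err(MynewtError::SYS_EINVAL);
    }
    Ok(())
}

// Returns an integer result value or an error code
fn read_sensor(ready: bool) -> MynewtResult<i32> {
    if !ready {
        return Err(MynewtError::SYS_EAGAIN);
    }
    Ok(25)
}

fn start(priority: i32, ready: bool) -> MynewtResult<i32> {
    // If task_init() returns Err(…), `?` returns that error to our caller right away
    task_init(priority)?;
    // Otherwise `?` unwraps the Ok(…) value and we carry on
    let value = read_sensor(ready)?;
    Ok(value + 1)
}
```

Calling start(-1, true) returns Err(MynewtError::SYS_EINVAL) without ever calling read_sensor().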
Debug Rust Macros with Visual Studio Code
Visual Studio Code is a great way to debug Rust Declarative and Procedural Macros. Just follow these steps…
1️⃣ Install Visual Studio Code. Install rustup according to the instructions at rustup.rs
2️⃣ Select the Nightly Build of the Rust Compiler…
rustup default nightly
rustup update
3️⃣ For Windows: Install the Remote WSL Extension for Visual Studio Code so that the Rust build runs in the Linux (Ubuntu) environment, which has fewer problems. Otherwise you’ll have to install Rust twice according to these instructions.
4️⃣ Install the Rust Language Support Extension for Visual Studio Code. If it won’t install properly, check these instructions.
5️⃣ Install the Task Runner Extension for Visual Studio Code. This lets you click on build tasks easily in the Task Runner pane at lower left.
6️⃣ In Visual Studio Code, click View → Command Palette → Git Clone. Enter https://github.com/lupyuen/test-rust-macros and select a local folder. This Rust project contains demo macros used in the next section.
7️⃣ When prompted, open the cloned repository and open the workspace
8️⃣ In the Workspace pane, open the file src/main.rs to view the demo macros. We’ll be using the demo macros next…
Here’s a video of the installation steps. Click CC to view the instructions…
Watch Macro Expansion in Visual Studio Code
To see how simple macros are expanded, use trace_macros!() like this…
Mouse over the macro name, like add_88 (located in the main() function)
The macro expansion appears in a pop-up
Click Peek Problem if the expansion is long
For complex macros, use this method to view expanded macros in the Rust build log…
1️⃣ In Visual Studio Code, browse the Workspace and open the file .cargo/config
2️⃣ Uncomment the first option…
"-Z", "unstable-options", "--pretty", "expanded",
3️⃣ Click Terminal → Run Task → cargo build
The expanded macro appears in the cargo build log…
Expanded macro in the cargo build log
4️⃣ With the Task Runner Extension, we may also click cargo build in the Task Runner pane at lower left
Here’s a video of the macro expansion. Click CC to view the instructions…
If you’re not using Visual Studio Code, run this in the command line to see the expanded macros…
cargo rustc -- -Z unstable-options --pretty expanded
Macro Hygiene in Rust
More about Macro Hygiene
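Macro Hygiene can be tried out in a few lines. This is a sketch, not the code from the demo project: the macro names are borrowed from the salty soup story, but the macro bodies and the soup() function are assumptions.

```rust
// Contexts match: `salt` is passed in by the caller, so the macro and the
// caller are talking about the same variable
macro_rules! make_good_soup {
    ($salt:ident) => {
        $salt + 1
    };
}

// A `let` inside the macro lives in the macro's own context…
macro_rules! add_secret_salt {
    ($e:expr) => {{
        let salt = 100; // …so it doesn't clash with the caller's `salt`
        $e + salt
    }};
}

fn soup() -> (i32, i32) {
    let salt = 1;
    // First value: caller's salt + 1
    // Second value: caller's salt + the macro's own salt
    (make_good_soup!(salt), add_secret_salt!(salt))
}

// Contexts don't match: a macro defined here at the top level, like the ones
// above, can't see a local variable that it didn't receive as a parameter…
//   macro_rules! make_bad_soup { () => { salt + 1 }; }
//   let salt = 1; make_bad_soup!(); // error: cannot find value `salt` in this scope
```

Even though both variables are named salt, the Rust Compiler keeps them apart because they belong to different contexts.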
Debug Macro Hygiene in Visual Studio Code
The Rust Compiler can show us information about the salt variables and which context they belong to…
1️⃣ In Visual Studio Code, browse the Workspace and open the file .cargo/config
2️⃣ Uncomment the second option…
"-Z", "unstable-options", "--pretty", "expanded,hygiene",
Check that the first option is commented out.
3️⃣ Click Terminal → Run Task → cargo build
The expanded macro with context information appears in the cargo build log. Here’s what it means…
Rust Compiler displaying the context of every variable
Here’s a video of macro hygiene in Visual Studio Code. Click CC to view the instructions…
If you’re not using Visual Studio Code, run this in the command line to see the hygiene information…
cargo rustc -- -Z unstable-options --pretty expanded,hygiene
Macro Context: Embedded vs Desktop
🤔 Macro programming is also known as Metaprogramming… Like “Inception”, it plays weird tricks with your mind, as you code a program within a program… And you ask yourself: “Which level am I at right now? What am I really coding for?”
We are actually stacked on TWO “Inception” Layers right now…
1️⃣ We’re coding an Embedded Rust program for STM32 Blue Pill. Cargo.toml is located here.
2️⃣ We’re coding a Procedural Rust Macro, which transforms the source code of the Embedded Rust program. Cargo.toml is located here.
They run in different contexts…
So we may make the mistake of adding Rust Standard Library features to our Embedded Rust program. Which will produce Rust Compiler errors.
As we code, always think carefully… Which level are we coding on?
Looking for more Embedded Rust on STM32 Blue Pill? Check this out…
The Lizard Rests
Check out the latest article on Visual Embedded Rust…
Appendix: Index Of Images
This section is not meant for humans; it’s for web crawlers to index the text content of the images
“Hello I’m MyNewt. I’m an Embedded OS that runs on Bare Metal”
“Yep I’m open source. Fully fluffy inside.”
“Many types of microcontrollers. I run on Super Blue Pill too.”
“Why run Mynewt instead of Embedded Rust on bare metal?”
(Thanks to RedMart for delivering the bare metal cans on Sunday)
“What’s wrong with C? The code gets messy when we do something simple… Like sending a JSON message over CoAP, using macros…”
“It’s almost 2020. Why do coders suffer like this?”
“Here’s the solution in Rust… Simple Clean Rust Code”
“Just that the code inside the curly brackets {…} isn’t really Rust, it’s JSON”
“How does Rust support alien languages?”
“Answer: Rust supports them thru Declarative Macros”
Encoding Format: @json or @cbor
Parser State: @object means that we are parsing an object
Encoding Context: Required for macro hygiene. Key of JSON entry / Value of JSON entry. Remaining JSON to be parsed
“My two-legged friends are now thinking…”
“What if I need to call some code in Embedded C?”
“What if my sensor or network drivers run only on C?”
“The entire Mynewt API is in C… How do we call that from Rust?”
“Rust will let you call any C function, even C functions defined in Mynewt!”
“All you need to provide: the Rust bindings for the functions”
“Which the bindgen tool can generate automatically”
“This C declaration… is transformed into this Rust declaration by bindgen”
“With bindgen we can import all the C functions from Mynewt into Rust… In a single click!”
“But is the imported horde all set to run on BARE METAL?”
“See how we are rolling on top of Bare Metal? This is highly UNSAFE!”
“We could tumble down anytime and damage the Bare Metal. Or ourselves”
“Calling an imported C function from Rust is UNSAFE. Looks ugly too!”
“In my fantasy, calling a Mynewt function from Rust would be so safe and easy. Like this…”
“We should call the Mynewt function thru a SAFE wrapper…”
“Wrapper checks incoming strings for null termination”
“Wrapper checks that output pointers are valid”
“The SAFE wrapper protects Bare Metal from any damage. SAFER and SIMPLER coding… the power of Rust!”
“Sometimes working with Rust feels like meeting a Food Safety Inspector…”
“HEY YOU!!!”
“Yes… Inspector?”
“You are violating the rules of Safety and Hygiene… NO LIZARDS ALLOWED!!!”
“I can explain…”
“NO LIZARDS ALLOWED!!!”
“What’s Hygiene? Hygiene is keeping your bathroom clean. Hygiene is keeping your pet lizard’s cage clean”
“Hygiene means Never do sneaky things to fool the Rust Compiler… Like pretending to be something else with the same name”
“This is very odd but 100% true… (Ask your Food Safety Inspector): Declarative Macros respect Hygiene, Procedural Macros DO NOT respect Hygiene”
“Pretend you’re running a Restaurant Business (for humans, not lizards). A Food Safety Inspector walks into your restaurant and orders the soup. The Inspector makes a complaint…”
“There’s no salt in the soup”
“Sorry Inspector, I’m really sure there’s salt in your Tomato Soup”
“There’s no salt in the soup”
“I wish I could show it to you… But there’s really SALT IN YOUR SOUP!”
“There’s no salt in the soup”
“I am putting MY FINGER IN THE SOUP… There’s REALLY SALT IN THE SOUP!!! Why do you KEEP SAYING THAT???”
“There’s no salt in the soup”
“As you bash your head against the menu, you realise one thing… The menu reads TOMATO SOUP with SEA SALT… But you served the Inspector TOMATO SOUP with TABLE SALT”
“The Inspector was right… There was no SEA SALT in the soup. And SEA SALT is not the same as TABLE SALT, even though we refer to them by the same name: SALT”
“Unfortunately for Rust macro coders, this Hygiene problem is very real… The salt that’s inside the macro here… Is NOT in the same context as the salt here…”
“Because the contexts are different, the Rust Compiler stops us from using the same salt from inside and outside the macro…”
“The solution: Pass the salt as a parameter. This forces the Rust Compiler to treat the two salts as the same”
“In make_bad_soup(): Contexts don’t match. Rust Compiler fails with Hygiene Error”
“In make_good_soup(): Contexts match. Rust Compiler is happy!”
“Rust has so many goodies for Embedded Coders…”
“Do we really want to keep on coding Embedded C forever?”
“Some of the Arduino and C code I’ve seen… Made me fall off the tree!”
“As you sleep tonight, dream of the safe, clear, simple code that you’ll be writing… In Embedded Rust”
“I’m not really green. FOOLED YA!”