# Nanopb: New features in nanopb 0.4

## What's new in nanopb 0.4

Long in the making, nanopb 0.4 has seen some wide-reaching improvements
in reaction to the development of the rest of the protobuf ecosystem.
This document showcases features that are not immediately visible, but
that you may want to take advantage of.

A lot of effort has been spent on retaining backwards and forwards
compatibility with previous nanopb versions. For a list of breaking
changes, see the [migration document](migration.html).

### New field descriptor format

The basic design of nanopb has always been that the information about
messages is stored in a compact descriptor format, which is iterated at
runtime. Initially it was very tightly tied to the encoder and decoder
logic.

In nanopb-0.3.0 the field iteration logic was separated into
`pb_common.c`. Already at that point it was clear that the old format
was getting too limited, but it wasn't extended at that time.

Now in 0.4, the descriptor format has been completely decoupled from the
encoder and decoder logic, and redesigned to meet new demands.
Previously each field was stored as a `pb_field_t` struct, which was
between 8 and 32 bytes in size, depending on compilation options and
platform. Now information about fields is stored as a variable-length
sequence of `uint32_t` data words. There are 1, 2, 4 and 8 word formats,
with the 8 word format containing plenty of space for future
extensibility.

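As a rough illustration of how a variable-length word sequence can be walked, here is a minimal self-contained sketch. It assumes that the two lowest bits of each descriptor's first word select its length as `1 << n` words. That encoding and all `demo_` names are assumptions made for this example; the real layout should be checked in `pb_common.c`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: the two lowest bits of a descriptor's first
 * word select its size as 1 << n words (n = 0..3 gives 1, 2, 4 or 8). */
static size_t demo_descriptor_words(uint32_t first_word)
{
    return (size_t)1 << (first_word & 3u);
}

/* Count the descriptors packed back to back in `words`.
 * Returns 0 if the array is empty or if the last descriptor would run
 * past the end of the array. */
size_t demo_count_fields(const uint32_t *words, size_t word_count)
{
    size_t pos = 0;
    size_t fields = 0;

    assert(words != NULL || word_count == 0);

    while (pos < word_count)
    {
        size_t len = demo_descriptor_words(words[pos]);
        if (len > word_count - pos)
            return 0; /* truncated descriptor */
        pos += len;   /* skip to the first word of the next descriptor */
        fields++;
    }
    return fields;
}
```

Because each descriptor announces its own length, small fields pay for one word only, while the iteration code stays a simple loop.
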
One benefit of the variable-length format is that most messages now take
less storage space. Most fields use 2 words, while simple fields in
small messages require only 1 word. The benefit is larger if the code
previously required the `PB_FIELD_16BIT` or `PB_FIELD_32BIT` options. In
the `AllTypes` test case, 0.3 had a data size of 1008 bytes in the
8-bit configuration and 1408 bytes in the 16-bit configuration. The new
format in 0.4 takes 896 bytes for either of these, a saving of about
11% and 36% respectively.

In addition, the new decoupling has allowed moving most of the field
descriptor data into flash on Harvard architectures, such as AVR.
Previously nanopb was quite RAM-heavy on AVR, which cannot put normal
constants in flash like most other platforms do.

### Python packaging for generator

The nanopb generator is now available as a Python package, installable
using the `pip` package manager. This reduces the need for binary
packages: if you already have Python installed, you can just
`pip install nanopb` and have the generator available on the path as
`nanopb_generator`.

The generator can also take advantage of the Python-based `protoc`
available in the `grpcio-tools` Python package. If you also install that,
there is no longer a need to have a binary `protoc` available.

### Generator now automatically calls protoc

Initially, the nanopb generator was used in two steps: first calling
`protoc` to parse the `.proto` file into the `.pb` binary
format, and then calling `nanopb_generator.py` to output the
`.pb.h` and `.pb.c` files.

Nanopb 0.2.3 added support for running as a `protoc` plugin, which
allowed single-step generation using the `--nanopb_out` parameter. However,
the plugin mode has two complications: passing options to the nanopb
generator itself becomes more difficult, and the generator does not know
the actual path of the input files. The second limitation has been
particularly problematic for locating `.options` files.

Both of these older methods still work and will remain supported.
However, `nanopb_generator` can now also take `.proto` files
directly, and it will transparently call `protoc` in the background.

### Callbacks bound by function name

Since its very beginnings, nanopb has supported field callbacks to allow
processing structures that are larger than what could fit in memory at
once. So far the callback functions have been stored in the message
structure in a `pb_callback_t` struct.

Storing function pointers along with user data is somewhat risky from a
security point of view. In addition, it has caused problems with `oneof`
fields, which reuse the same storage space for multiple submessages. Because
there is no separate area for each submessage, there is no space to
store the callback pointers either.

Nanopb-0.4.0 introduces callbacks that are referenced by the function
name instead of setting the pointers separately. This should work well
for most applications that have a single callback function for each
message type. For more complex needs, `pb_callback_t` will also remain
supported.

Function name callbacks also allow specifying custom data types for
inclusion in the message structure. For example, you could have a
`MyObject*` pointer along with other message fields, and then process
that object in a custom way in your callback.

This feature is demonstrated in the
[tests/oneof_callback](https://github.com/nanopb/nanopb/tree/master/tests/oneof_callback) test case and the
[examples/network_server](https://github.com/nanopb/nanopb/tree/master/examples/network_server) example.

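The idea can be modelled without any nanopb headers. The sketch below is not the nanopb API: `DemoMessage`, `demo_field_t`, `DemoMessage_callback` and `demo_decode_payload` are hypothetical stand-ins, and the real callback signature should be taken from `pb.h` and the linked examples. It only shows the two points made above: the callback is reached by its name, and the field's storage holds a custom `MyObject*` rather than function pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user type kept directly in the message structure. */
typedef struct {
    size_t bytes_seen;
} MyObject;

/* Stand-in for a generated message struct: the callback field holds a
 * MyObject* instead of a pb_callback_t with function pointers. */
typedef struct {
    int32_t id;
    MyObject *payload;
} DemoMessage;

enum { DemoMessage_payload_tag = 2 };

/* Stand-in for the field information a decoder passes to a callback. */
typedef struct {
    uint32_t tag;
    void *pData; /* address of the field's storage inside the message */
} demo_field_t;

/* The callback is found by its name at link time; nothing in DemoMessage
 * points to it. */
bool DemoMessage_callback(const demo_field_t *field,
                          const uint8_t *data, size_t len)
{
    assert(field != NULL);
    (void)data; /* a real callback would parse these bytes */

    if (field->tag == DemoMessage_payload_tag)
    {
        MyObject *obj = *(MyObject **)field->pData;
        if (obj == NULL)
            return false;
        obj->bytes_seen += len;
    }
    return true;
}

/* Stand-in for the decoder reaching the callback field. */
bool demo_decode_payload(DemoMessage *msg, const uint8_t *data, size_t len)
{
    demo_field_t field = { DemoMessage_payload_tag, &msg->payload };
    return DemoMessage_callback(&field, data, len);
}
```

Since the message structure carries only data, received bytes can never overwrite a function pointer, which is the security concern mentioned above.
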
### Message level callback for oneofs

As mentioned above, callbacks inside submessages inside oneofs have been
problematic to use. To make using `pb_callback_t`-style callbacks there
possible, a new generator option `submsg_callback` was added.

Setting this option to true will cause a new message level callback to
be added before the `which_field` of the oneof. This callback will be
called when the submessage tag number is known, but before the actual
message is decoded. The callback can either choose to set callback
pointers inside the submessage, or just completely decode the submessage
there and then. If any unread data remains after the callback returns,
normal submessage decoding will continue.

There is an example of this in the [tests/oneof_callback](https://github.com/nanopb/nanopb/tree/master/tests/oneof_callback) test case.

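The ordering described in this section can be sketched with a small self-contained model. Nothing here is nanopb code: `DemoOuter`, `cb_choice`, `demo_decode_oneof` and the other names are hypothetical, and the generated member names in a real `.pb.h` may differ. The model only shows that the message level callback runs after the tag is known and before the submessage is decoded, so it can install a field callback in the freshly selected union member.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a pb_callback_t-style field: function pointer plus argument. */
typedef struct {
    bool (*decode)(const uint8_t *data, size_t len, void *arg);
    void *arg;
} demo_callback_t;

typedef struct { demo_callback_t text; } DemoSubA; /* has a callback field */
typedef struct { int32_t value; } DemoSubB;

typedef struct DemoOuter DemoOuter;
struct DemoOuter {
    /* Message level callback, placed before the which_ member. */
    bool (*cb_choice)(DemoOuter *msg, void *arg);
    void *cb_arg;
    uint32_t which_choice;
    union { DemoSubA a; DemoSubB b; } choice;
};

enum { DemoOuter_a_tag = 1, DemoOuter_b_tag = 2 };

/* Stand-in decoder step for one oneof member. */
bool demo_decode_oneof(DemoOuter *msg, uint32_t tag,
                       const uint8_t *data, size_t len)
{
    assert(msg != NULL);

    /* The union storage is reused, so the selected member starts empty. */
    if (tag == DemoOuter_a_tag)
        msg->choice.a = (DemoSubA){ { NULL, NULL } };
    else
        msg->choice.b = (DemoSubB){ 0 };
    msg->which_choice = tag;

    /* 1. Tag is known, submessage not decoded yet: let the user prepare it. */
    if (msg->cb_choice != NULL && !msg->cb_choice(msg, msg->cb_arg))
        return false;

    /* 2. Normal decoding continues and uses whatever the callback set up. */
    if (tag == DemoOuter_a_tag && msg->choice.a.text.decode != NULL)
        return msg->choice.a.text.decode(data, len, msg->choice.a.text.arg);
    return true;
}

static bool demo_count_bytes(const uint8_t *data, size_t len, void *arg)
{
    (void)data;
    *(size_t *)arg += len;
    return true;
}

/* User callback: installs the field callback once the member is known. */
bool demo_on_choice(DemoOuter *msg, void *arg)
{
    if (msg->which_choice == DemoOuter_a_tag)
    {
        msg->choice.a.text.decode = demo_count_bytes;
        msg->choice.a.text.arg = arg;
    }
    return true;
}
```
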
### Binding message types to custom structures

It is often said that good C code is chock full of macros. Or maybe I
got it wrong. But since nanopb 0.2, the field descriptor generation has
relied heavily on macros. This allows it to automatically adapt to
differences in type alignment on different platforms, and to decouple
the Python generation logic from how the message descriptors are
implemented on the C side.

Now in 0.4.0, I've made the macros even more abstract. Time will tell
whether this was as great an idea as I think it is, but now the
complete list of fields in each message is available in the `.pb.h` file.
This allows a kind of metaprogramming using [X-macros]().

One feature that this can be used for is binding the message descriptor
to a custom structure or C++ class type. You could have a bunch of other
fields in the structure and even the datatypes can be different to an
extent, and nanopb will automatically detect the size and position of
each field. The generated `.pb.c` files now just have calls of
`PB_BIND(msgname, structname, width)`. Adding a similar
call to your own code will bind the message to your own structure.

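For readers unfamiliar with the X-macro technique, here is a minimal self-contained sketch of it. The macro and type names (`DEMO_POINT_FIELDLIST`, `MyPoint`, `demo_field_info_t`) are invented for this example and are not the macros nanopb generates. The sketch shows how a single field list can be expanded against a user structure so that offsets and sizes are picked up by the compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical field list, in the spirit of what a generated header could
 * provide: one X(type, name) entry per field. */
#define DEMO_POINT_FIELDLIST(X) \
    X(int32_t, x)               \
    X(int32_t, y)               \
    X(uint8_t, flags)

/* A user structure with an extra member and a different member order. */
typedef struct {
    void *owner;      /* application data, not a message field */
    int32_t x;
    uint8_t flags;
    int32_t y;
} MyPoint;

typedef struct {
    const char *name;
    size_t offset;
    size_t size;
} demo_field_info_t;

/* Expand the list once per field; offsetof and sizeof are evaluated against
 * whichever structure is named, so the layout is detected automatically. */
#define DEMO_FIELD_INFO(structname, type, name) \
    { #name, offsetof(structname, name), sizeof(((structname *)0)->name) },
#define DEMO_X_MYPOINT(type, name) DEMO_FIELD_INFO(MyPoint, type, name)

static const demo_field_info_t MyPoint_fields[] = {
    DEMO_POINT_FIELDLIST(DEMO_X_MYPOINT)
};

size_t demo_mypoint_field_count(void)
{
    return sizeof(MyPoint_fields) / sizeof(MyPoint_fields[0]);
}

const demo_field_info_t *demo_mypoint_field(size_t index)
{
    assert(index < demo_mypoint_field_count());
    return &MyPoint_fields[index];
}
```

The table follows the order of the field list, not the order of the members in `MyPoint`, which is why extra members and a different layout do not matter.
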
### UTF-8 validation

The protobuf format specifies that strings should consist of valid UTF-8.
Previously nanopb has not enforced this, requiring extra
care in the user code. Now optional UTF-8 validation is available with
the compilation option `PB_VALIDATE_UTF8`.

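To make concrete what such a check involves, here is a minimal self-contained validator. It is an illustrative sketch, not the code nanopb uses when `PB_VALIDATE_UTF8` is defined, and `demo_utf8_is_valid` is a name made up for this example. It rejects truncated sequences, stray continuation bytes, overlong encodings, UTF-16 surrogates and values above U+10FFFF.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if the first `len` bytes of `s` are well-formed UTF-8. */
bool demo_utf8_is_valid(const uint8_t *s, size_t len)
{
    size_t i = 0;

    assert(s != NULL || len == 0);

    while (i < len)
    {
        uint8_t b = s[i];
        size_t extra;  /* number of continuation bytes expected */
        uint32_t cp;   /* decoded code point */
        uint32_t min;  /* smallest code point allowed for this length */

        if (b < 0x80)                { extra = 0; cp = b;        min = 0; }
        else if ((b & 0xE0) == 0xC0) { extra = 1; cp = b & 0x1F; min = 0x80; }
        else if ((b & 0xF0) == 0xE0) { extra = 2; cp = b & 0x0F; min = 0x800; }
        else if ((b & 0xF8) == 0xF0) { extra = 3; cp = b & 0x07; min = 0x10000; }
        else return false;  /* stray continuation or invalid lead byte */

        if (extra > len - i - 1)
            return false;   /* sequence truncated */

        for (size_t k = 1; k <= extra; k++)
        {
            if ((s[i + k] & 0xC0) != 0x80)
                return false;  /* not a continuation byte */
            cp = (cp << 6) | (uint32_t)(s[i + k] & 0x3F);
        }

        if (cp < min)                     return false; /* overlong encoding */
        if (cp > 0x10FFFF)                return false; /* beyond Unicode range */
        if (cp >= 0xD800 && cp <= 0xDFFF) return false; /* UTF-16 surrogate */

        i += extra + 1;
    }
    return true;
}
```
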
### Double to float conversion

Some platforms such as `AVR` do not support the `double`
datatype, instead making it an alias for `float`. This has resulted in
problems when trying to process message types containing `double` fields
generated on other machines. Previously there was only an example showing
how to manually perform the conversion between `double` and
`float`.

Now that example is integrated as an optional feature in nanopb core. By
defining `PB_CONVERT_DOUBLE_FLOAT`, the required conversion between 32-
and 64-bit floating point formats happens automatically on decoding and
encoding.

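The kind of conversion involved can be sketched on raw bit patterns, which is what a platform without a real 64-bit `double` has to work with. The function below is a simplified illustration and not the code nanopb runs when `PB_CONVERT_DOUBLE_FLOAT` is defined; the name `demo_double_bits_to_float_bits` is hypothetical. It assumes IEEE 754 binary64 input, truncates the mantissa instead of rounding, and turns values too small for a normal `float` into signed zero.

```c
#include <assert.h>
#include <stdint.h>

/* Convert an IEEE 754 binary64 bit pattern to a binary32 bit pattern.
 * Simplifications in this sketch: the mantissa is truncated rather than
 * rounded, and results too small for a normal float become signed zero. */
uint32_t demo_double_bits_to_float_bits(uint64_t d)
{
    uint32_t sign     = (uint32_t)(d >> 63) << 31;
    int32_t  exponent = (int32_t)((d >> 52) & 0x7FF);
    uint64_t mantissa = d & 0x000FFFFFFFFFFFFFull;   /* low 52 bits */
    uint32_t mant32   = (uint32_t)(mantissa >> 29);  /* keep top 23 bits */

    if (exponent == 0x7FF)
    {
        /* Infinity or NaN; keep a NaN a NaN even if its top bits were zero. */
        if (mantissa != 0 && mant32 == 0)
            mant32 = 1;
        return sign | 0x7F800000u | mant32;
    }

    exponent = exponent - 1023 + 127;  /* change the exponent bias */

    if (exponent >= 0xFF)
        return sign | 0x7F800000u;     /* too large: infinity */
    if (exponent <= 0)
        return sign;                   /* too small (or zero): signed zero */

    assert(exponent > 0 && exponent < 0xFF);
    return sign | ((uint32_t)exponent << 23) | mant32;
}
```

The reverse direction widens the exponent and pads the mantissa with zero bits, so it loses no information for normal values.
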
### Improved testing

Testing on embedded platforms has been integrated into the continuous
testing environment. Now all of the 80+ test cases are automatically run
on STM32 and AVR targets. Previously only a few specialized test cases
were manually tested on embedded systems.

The nanopb fuzzer has also been integrated into Google's [OSSFuzz](https://google.github.io/oss-fuzz/)
platform, giving a huge boost in the CPU power available for randomized
testing.