xdrgen - Linux Kernel XDR code generator

Introduction
------------

SunRPC programs are typically specified using a language defined by
RFC 4506. In fact, all IETF-published NFS specifications provide a
description of the specified protocol using this language.

Since the 1990s, user space consumers of SunRPC have had access to
a tool that could read such XDR specifications and then generate C
code that implements the RPC portions of that protocol. This tool is
called rpcgen.

This RPC-level code handles input directly from the network, and
thus needs a high degree of memory safety and sanity checking to
help ensure proper levels of security. Bugs in this code can have
significant impact on security and performance.

However, it is code that is repetitive and tedious to write by hand.

The C code generated by rpcgen makes extensive use of the facilities
of the user space TI-RPC library and libc. Furthermore, the dialect
of the generated code is very traditional K&R C.

The Linux kernel's implementations of SunRPC-based protocols hand-roll
their XDR code. There are two main reasons for this:

1. libtirpc (and its predecessors) operate only in user space. The
   kernel's RPC implementation and its API are significantly
   different from libtirpc.

2. rpcgen-generated code is believed to be less efficient than code
   that is hand-written.

These days, gcc and its kin are capable of optimizing code better
than human authors. There are only a few instances where writing
XDR code by hand will make a measurable performance difference.

In addition, the current hand-written code in the Linux kernel is
difficult to audit, and it is hard to prove that it implements
exactly what is in the protocol specification.

In order to accrue the benefits of machine-generated XDR code in the
kernel, a tool is needed that will output C code that works against
the kernel's SunRPC implementation rather than libtirpc.

Enter xdrgen.


Dependencies
------------

These dependencies are typically packaged by Linux distributions:

- python3
- python3-lark
- python3-jinja2

These dependencies are available via PyPI:

- pip install 'lark[interegular]'


XDR Specifications
------------------

When adding a new protocol implementation to the kernel, the XDR
specification can be derived by feeding a .txt copy of the RFC to
the script located in tools/net/sunrpc/extract.sh.

    $ extract.sh < rfc0001.txt > new2.x

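As a rough illustration of what such an extraction step involves,
here is a minimal Python sketch. It assumes the RFC marks each line of
embedded XDR with a leading "///" sentinel, a convention used by
several NFS RFCs. The function name extract_xdr is hypothetical, and
the sketch is not a transcription of extract.sh itself.

```python
def extract_xdr(rfc_text: str) -> str:
    """Collect lines marked with a leading '///' sentinel (assumed convention).

    The sentinel and one following space are removed. A bare '///'
    line becomes an empty line in the output.
    """
    out = []
    for line in rfc_text.splitlines():
        stripped = line.lstrip(" ")
        if not stripped.startswith("///"):
            continue  # ordinary RFC prose, not part of the specification
        body = stripped[3:]
        out.append(body[1:] if body.startswith(" ") else body)
    return "".join(text + "\n" for text in out)


if __name__ == "__main__":
    sample = "Prose.\n   /// enum color {\n   ///     RED = 0\n   /// };\n"
    print(extract_xdr(sample), end="")
```

The output of such a step is what gets saved as the .x file.
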
Operation
---------

Once a .x file is available, use xdrgen to generate source and
header files containing an implementation of XDR encoding and
decoding functions for the specified protocol.

    $ ./xdrgen definitions new2.x > include/linux/sunrpc/xdrgen/new2.h
    $ ./xdrgen declarations new2.x > new2xdr_gen.h

and

    $ ./xdrgen source new2.x > new2xdr_gen.c

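When several protocols are generated, the three invocations above
can be driven from a small helper. The following Python sketch only
builds the command lines. The helper names are hypothetical, and it
assumes that the output file names follow the same pattern as the
"new2" example for any protocol name, which is an assumption rather
than something xdrgen enforces.

```python
def xdrgen_commands(spec_path: str, name: str) -> list:
    """Return (argv, output file) pairs mirroring the commands shown above.

    The output naming pattern is generalized from the "new2" example
    and is an assumption of this sketch.
    """
    return [
        (["./xdrgen", "definitions", spec_path],
         f"include/linux/sunrpc/xdrgen/{name}.h"),
        (["./xdrgen", "declarations", spec_path], f"{name}xdr_gen.h"),
        (["./xdrgen", "source", spec_path], f"{name}xdr_gen.c"),
    ]


def as_shell(commands: list) -> list:
    """Render each pair as a shell command with output redirection."""
    return [f"{' '.join(argv)} > {out}" for argv, out in commands]


if __name__ == "__main__":
    for line in as_shell(xdrgen_commands("new2.x", "new2")):
        print("$", line)
```

Each argv list could be passed to subprocess.run() with stdout
redirected to the paired output file.
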
The files are ready to use for a server-side protocol implementation,
or may be used as a guide for implementing these routines by hand.

By default, the only comments added to this code are kdoc comments
that appear directly in front of the public per-procedure APIs. For
deeper introspection, specifying the "--annotate" flag will insert
additional comments in the generated code to help readers match the
generated code to specific parts of the XDR specification.

Because the generated code is targeted for the Linux kernel, it
is tagged with a GPLv2-only license.

The xdrgen tool can also provide lexical and syntax checking of
an XDR specification:

    $ ./xdrgen lint xdr/new.x


How It Works
------------

xdrgen does not use machine learning to generate source code. The
translation is entirely deterministic.

RFC 4506 Section 6 contains a BNF grammar of the XDR specification
language. The grammar has been adapted for use by the Python Lark
module.

The xdr.ebnf file in this directory contains the grammar used to
parse XDR specifications. xdrgen configures Lark using the grammar
in xdr.ebnf. Lark parses the target XDR specification using this
grammar, creating a parse tree.

xdrgen then transforms the parse tree into an abstract syntax tree.
This tree is passed to a series of code generators.

The generators are implemented as Python classes residing in the
generators/ directory. Each generator emits code created from Jinja2
templates stored in the templates/ directory.

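The parse, transform, and generate stages can be pictured with a toy
pipeline. The sketch below is not xdrgen's code. To stay within the
Python standard library, it swaps in a regular expression for the Lark
grammar and string.Template for Jinja2, and it handles only XDR enums.
The names EnumDef, parse_enums, and generate_definition are
hypothetical.

```python
import re
from dataclasses import dataclass
from string import Template


@dataclass
class EnumDef:
    """A toy AST node for one XDR enum definition."""
    name: str
    members: list  # list of (identifier, value) pairs


# Stand-in for the grammar: matches "enum <name> { A = 1, B = 2 };"
ENUM_RE = re.compile(r"enum\s+(\w+)\s*\{([^}]*)\}\s*;")

# Stand-in for a Jinja2 template stored in templates/
ENUM_TEMPLATE = Template("enum $name {\n$body\n};\n")


def parse_enums(spec: str) -> list:
    """Parse the specification text and build AST nodes in source order."""
    nodes = []
    for match in ENUM_RE.finditer(spec):
        members = []
        for item in match.group(2).split(","):
            ident, value = (part.strip() for part in item.split("="))
            members.append((ident, int(value, 0)))
        nodes.append(EnumDef(match.group(1), members))
    return nodes


def generate_definition(node: EnumDef) -> str:
    """Toy generator: render one AST node through the template."""
    body = "\n".join(f"\t{ident} = {value}," for ident, value in node.members)
    return ENUM_TEMPLATE.substitute(name=node.name, body=body)


if __name__ == "__main__":
    for node in parse_enums("enum color { RED = 0, GREEN = 1 };"):
        print(generate_definition(node), end="")
```

Keeping the AST separate from the templates is what allows several
generators to emit different output from the same tree.
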
Source code for the definitions is generated in the same order in
which they appear in the specification to ensure the generated code
compiles. This conforms with the behavior of rpcgen.

xdrgen assumes that the generated source code is further compiled by
a compiler that can optimize in a number of ways, including:

 - Unused functions are discarded (i.e., not added to the executable)

 - Aggressive function inlining removes unnecessary stack frames

 - Single-arm switch statements are replaced by a single conditional
   branch

And so on.


Pragmas
-------

Pragma directives specify exceptions to the normal generation of
encoding and decoding functions. The directives described below are
"big_endian", "exclude", "header", and "public".

Pragma big_endian
------ ----------

    pragma big_endian <enum> ;

For variables that might contain only a small number of values, it
is more efficient to avoid the byte-swap when encoding or decoding
on little-endian machines. Such is often the case with error status
codes. For example:

    pragma big_endian nfsstat3;

In this case, when generating an XDR struct or union containing a
field of type "nfsstat3", xdrgen will make the type of that field
"__be32" instead of "enum nfsstat3". XDR unions then switch on the
non-byte-swapped value of that field.

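The idea behind switching on the non-byte-swapped value can be shown
outside of C. In the Python sketch below, each status constant is
converted to its big-endian wire form once, and the raw field is then
compared against those constants without decoding it. The function
names are hypothetical. NFS3_OK and NFS3ERR_PERM carry their NFSv3
values, and the sketch illustrates the concept rather than the C code
that xdrgen emits.

```python
NFS3_OK = 0
NFS3ERR_PERM = 1


def to_wire(value: int) -> bytes:
    """Encode a 32-bit status code in big-endian (XDR) byte order."""
    return value.to_bytes(4, "big")


# Converted once, much as a C compiler folds a byte-swapped constant.
WIRE_NFS3_OK = to_wire(NFS3_OK)
WIRE_NFS3ERR_PERM = to_wire(NFS3ERR_PERM)


def classify_decoded(field: bytes) -> str:
    """Conventional approach: decode the field, then compare."""
    status = int.from_bytes(field, "big")
    return "ok" if status == NFS3_OK else "error"


def classify_wire(field: bytes) -> str:
    """big_endian approach: compare the raw wire bytes directly."""
    return "ok" if field == WIRE_NFS3_OK else "error"


if __name__ == "__main__":
    for raw in (WIRE_NFS3_OK, WIRE_NFS3ERR_PERM):
        print(raw.hex(), classify_decoded(raw), classify_wire(raw))
```

Both functions reach the same answer. The second one never converts
the received field, which is the saving the pragma is after.
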
Pragma exclude
------ -------

    pragma exclude <RPC procedure> ;

In some cases, a procedure encoder or decoder function might need
special processing that cannot be automatically generated. The
automatically-generated functions might conflict or interfere with
the hand-rolled function. To avoid editing the generated source code
by hand, a pragma can specify that the procedure's encoder and
decoder functions are not included in the generated header and
source.

For example:

    pragma exclude NFSPROC3_READDIRPLUS;

This excludes the decoder function for the READDIRPLUS argument and
the encoder function for the READDIRPLUS result.

Note that because data item encoder and decoder functions are
defined "static __maybe_unused", subsequent compilation
automatically excludes data item encoder and decoder functions that
are used only by excluded procedures.

Pragma header
------ ------

    pragma header <string> ;

Provide a name to use for the header file. For example:

    pragma header nlm4;

Adds

    #include "nlm4xdr_gen.h"

to the generated source file.

Pragma public
------ ------

    pragma public <XDR data item> ;

Normally XDR encoder and decoder functions are "static". In case an
implementer wants to call these functions from other source code,
they can add a public pragma in the input .x file to indicate a set
of functions that should get a prototype in the generated header and
whose definitions should not be declared static.

For example:

    pragma public nfsstat3;

Adds these prototypes in the generated header:

    bool xdrgen_decode_nfsstat3(struct xdr_stream *xdr, enum nfsstat3 *ptr);
    bool xdrgen_encode_nfsstat3(struct xdr_stream *xdr, enum nfsstat3 value);

And, in the generated source code, both of these functions appear
without the "static __maybe_unused" modifiers.


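The naming pattern visible in the nfsstat3 prototypes can be
reproduced with a few lines of Python. This sketch assumes the data
item is an XDR enum and simply generalizes the example above. The
function name public_prototypes is hypothetical, and prototypes for
other kinds of data items may take a different shape.

```python
def public_prototypes(enum_name: str) -> list:
    """Build decoder and encoder prototypes for a public XDR enum.

    The pattern is inferred from the nfsstat3 example and is an
    assumption for any other type.
    """
    return [
        f"bool xdrgen_decode_{enum_name}"
        f"(struct xdr_stream *xdr, enum {enum_name} *ptr);",
        f"bool xdrgen_encode_{enum_name}"
        f"(struct xdr_stream *xdr, enum {enum_name} value);",
    ]


if __name__ == "__main__":
    for prototype in public_prototypes("nfsstat3"):
        print(prototype)
```

Note the asymmetry in the example prototypes: the decoder receives a
pointer to fill in, while the encoder takes the enum by value.
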
Future Work
-----------

Finish implementing XDR pointer and list types.

Generate client-side procedure functions

Expand the README into a user guide similar to rpcgen(1)

Add more pragma directives:

  * @pages -- use xdr_read/write_pages() for the specified opaque
              field
  * @skip -- do not decode, but rather skip, the specified argument
             field

Enable something like a #include to dynamically insert the content
of other specification files

Properly support line-by-line pass-through via the "%" decorator

Build a unit test suite for verifying translation of XDR language
into compilable code

Add a command-line option to insert trace_printk call sites in the
generated source code, for improved (temporary) observability

Generate kernel Rust code as well as C code