Tag Archives: bazel

DART now in bazel_deps

A previous simulator I had built for Super Mega Microbot was based on the “DART” robotics toolkit.  It is a C++ library with python bindings that includes kinematics, dynamics, and graphical rendering capabilities under a BSD license.  I wanted to use some of its dynamics capabilities for future gait work on the quad A0, and eventually re-incorporate its simulation capabilities, so I integrated a subset of it into mjbots/bazel_deps.

STM32G4 for mbed

While working on the next revision of the moteus controller, I started by bringing up a software toolchain on a NUCLEO-G474RE board.  Unfortunately, even the most recent mbed 5.14 doesn’t yet support that processor.  Thus, I have a half-baked solution which pulls in the ST CUBE sources for that controller into the mbed tree as a patch.

https://github.com/mjbots/rules_mbed/blob/master/tools/workspace/mbed/stm32g4.patch

The patch only implements wrappers for the things I care about, so it isn’t a complete solution.  Since I am not really using many mbed libraries in any project anymore, that isn’t a whole lot: right now it is just the one function that sets up the board, a linker script, and the pin mappings.  I will probably eventually make a rules_stm32 and ditch mbed entirely, but for now that is more work than it is worth.


rpi_bazel updated to clang 7.0

When I initially created rpi_bazel, I set it up to use a host provided clang.  I decided to update that so that the rpi_bazel rules themselves download a binary version of an arbitrary clang.  This lets you decouple the version of clang from what is available in any given Linux distribution and improves the reproducibility of builds, since you are no longer dependent on whatever PPA you used to grab clang from.

To make that work in a cross compilation environment, the rules also cross compile libcxx and libcxxabi.  However, since bazel’s support for C++ toolchains is still in flux, for now binaries must explicitly depend upon the C++ standard library if they want it.  On the plus side, that now makes it trivial to fully support C++17 on the raspberry pi.  It should also be easy to update this to clang 8, although I haven’t gotten around to doing so yet.
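In practice that just means one extra entry in deps.  A minimal sketch, assuming a hypothetical “@libcxx//:libcxx” label (the actual label is whatever the rpi_bazel rules expose):

cc_binary(
    name = "hello",
    srcs = ["hello.cc"],
    # The toolchain does not yet implicitly link the C++ standard
    # library, so the binary names it explicitly.  "@libcxx//:libcxx"
    # is only a placeholder label here.
    deps = ["@libcxx//:libcxx"],
)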

bazel for gstreamer – plan

After OpenCV, the other major dependency of the mjmech software, and the one still necessary to complete the raspberry pi 3b+ bazel build setup, was gstreamer.  Unlike the previous dependencies, this one is a doozy — gstreamer has an enormous transitive dependency set.  Additionally, we needed to use its ffmpeg wrappers, which brings in even more dependencies.

In this post, I’ll just try to map out the dependencies that we ended up actually needing, so that they can be tackled one by one.

gstreamer core

The core gstreamer base actually has a minimal dependency set: really only glib.  glib in turn has a small dependency set of its own: just libffi, pcre, and zlib.

gst-plugins-base

Most of the functionality within the gstreamer ecosystem is contained within “plugins”, i.e. modules of software which implement a constrained interface and can be installed into media pipelines. All plugins use the common routines defined within gst-plugins-base, and a few low level plugins are defined there as well. This begins the dependency explosion, as some of these base plugins start to call into non-trivial pieces of the free desktop graphical stack. Direct dependencies here include libx11 and libxv from x.org, and pango.

In turn those packages have a large fanout. libx11 and libxv depend upon videoproto, libxext, xorgproto, libxcb, libxau, libxdmcp, xtrans, and xcb-proto. pango depends upon freetype, harfbuzz, fribidi, cairo, bzip2, pixman, fontconfig, util-linux, and expat.

gst-plugins-good and gst-plugins-bad

mjmech doesn’t rely on any of the plugins in gst-plugins-good, so nothing was necessary there! We do use some from gst-plugins-bad, but those have no additional dependencies.

gst-plugins-ugly

Here, mjmech uses the x264 plugin, which in turn depends on the x264 external library.

gst-libav

These are the plugins which wrap ffmpeg (née libav, née ffmpeg). ffmpeg in turn additionally depends on nasm to assemble for x86.

Next steps

Now that we’ve enumerated all the dependent packages, we can start to work on packaging them. First up are those necessary for gstreamer core, so libffi, pcre, and zlib.
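Each of them will get the usual tools/workspace treatment: a repository.bzl to fetch the tarball and a package.BUILD to build it.  For something as small as zlib, the BUILD half is probably little more than the following sketch (the glob patterns are illustrative, not the final rules):

package(default_visibility = ["//visibility:public"])

cc_library(
    name = "zlib",
    # zlib keeps all of its sources and headers at the top level.
    srcs = glob(["*.c"]),
    hdrs = glob(["*.h"]),
    includes = ["."],
)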

bazel for opencv

The next level of difficulty in bazel-ifying packages for mjmech was opencv.

First, for the impatient, Apache 2.0 licensed sources are available on github: https://github.com/mjbots/bazel_deps/tree/master/tools/workspace/opencv

OpenCV’s native build system consists of nearly 200 cmake files with over 20,000 total lines of code, plus assorted helper scripts and prototype files into which values are substituted.  Fortunately, I didn’t need to support the full complexity of the opencv build system.  Things I didn’t bother to touch:

  • Autodetecting the location of any packages:  Since this is embedded within a bazel project, all the dependencies I was going to use were already bazel-ified.
  • GPU support:  I didn’t bother with CUDA, OpenCL, or anything else GPU related, as the GPU on the raspberry pi 3 has only a minimally supported OpenCL stack, at best.
  • Examples: I had no real need to build any of the sample or demonstration applications.
  • Modules: I only needed, for now, a small fraction of the total opencv module set.

Intended result

OpenCV is broken up into numerous “modules”, each of which is largely independent and implements a relatively self-contained piece of functionality.  Each module can depend upon other opencv modules and other external packages.  I wanted to reach a point with the bazel configuration where each module could be described at a relatively high level, which didn’t seem too implausible since the module structure is relatively constant across modules.

Think something like:

cvmodule(
  name = "core",
)

cvmodule(
  name = "imgproc",
  deps = [":core"],
)

# ... etc

Bazel implementation

To achieve this in bazel requires three phases: first, a repository rule to actually download the opencv tarball; second, something written in Starlark which could implement a cvmodule-like rule; and finally, a BUILD file to call it and enumerate the opencv modules I cared about.

The repository rule was relatively straightforward.  It just uses the standard bazel mechanisms to download a tarball and template expand a known BUILD file into place.
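In outline it is not much more than an http_archive call; a sketch of the shape (the version and hash handling here are illustrative, and the real rule also handles the BUILD template expansion):

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

def opencv_repository(name):
    http_archive(
        name = name,
        urls = ["https://github.com/opencv/opencv/archive/3.4.5.tar.gz"],
        # A real rule pins the sha256 of the tarball here.
        strip_prefix = "opencv-3.4.5",
        build_file = Label("//tools/workspace/opencv:package.BUILD"),
    )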

The Starlark file is where all the real work lies:

First, we just define a common set of preprocessor defines which will be applied to all opencv translation units:

_OPENCV_COPTS = [
    "-D_USE_MATH_DEFINES",
    "-D__OPENCV_BUILD=1",
    "-D__STDC_CONSTANT_MACROS",
    "-D__STDC_FORMAT_MACROS",
    "-D__STDC_LIMIT_MACROS",
    "-I$(GENDIR)/external/opencv/private/",
]

The only really interesting piece there is the $(GENDIR) fragment, which causes bazel to substitute in the path to the generated output directory. Since a number of the headers that we will be using later on are generated as part of the bazel build process, we need to ensure that the compiler can actually find them.  The alternate formulation, which doesn’t require knowledge of GENDIR, is to list the directory in the includes attribute.  That, however, exposes the headers to all downstream modules, which we don’t want.
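For contrast, the includes based formulation would look roughly like this sketch (target and path names illustrative):

cc_library(
    name = "core",
    srcs = glob(["modules/core/src/**/*.cpp"]),
    # Anything listed in "includes" is added to the include path of
    # every target that depends on ":core", so the generated private
    # headers would leak into all downstream modules.
    includes = ["private"],
)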

Next, we enumerate the set of processor specific options that we will support:

_KNOWN_OPTS = [
    ("neon", "armeabihf"),
    ("vfpv3", "armeabihf"),
    ("avx", "x86_64"),
    ("avx2", "x86_64"),
    ("avx512_skx", "x86_64"),
    # ...

OpenCV has support for a comprehensive set of special instruction set extensions for a variety of different processors. They can either be compiled in and always used, or dispatched at runtime, although these bazel rules ignore runtime dispatching entirely. Some of the generated headers need to know the full possible set of options for a given architecture, thus this list here.

Module definition macros

Now we can start getting into the module definition Starlark macros.  To handle some generated headers that are common to every opencv module, I ended up creating an opencv_base macro that is module independent.  It needs to be called from the BUILD file, to generate a number of the private headers, but exposes no logically public labels.  It does require a passed in “configuration” dictionary (which will also be passed in to the primary module generation macro).

CONFIG = {
    "modules" : [
        "aruco",
        "calib3d",
        "core",
        "imgcodecs",
        "imgproc",
    ],
# ...

The big annoyance here is that you have to enumerate the set of modules twice: once in that config, and again in the set of invocations of the module creation macro.  The reason is the generated opencv_modules.hpp file, which provides preprocessor defines describing which modules are included in the build.

Next is the implementation of the module generation macro, which I ended up calling opencv_module.  It needs just a few arguments for the modules I ended up converting:

  • name – should be self-evident
  • config – the same configuration dictionary from above
  • excludes – source files to skip compiling from this module
  • dispatched_files – the list of files in this module which contain processor specific specializations
  • deps – a set of bazel labels describing the dependency set

This macro generates the appropriate .simd_declarations.hpp and .simd.hpp files along with a non-functional stub OpenCL implementation, and then defines the actual bazel cc_library.
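Heavily abridged, the skeleton of the macro looks something like the sketch below.  The _generate_simd_headers helper is a stand-in for the real header generation logic, and the attribute details differ in the actual implementation:

def opencv_module(name, config, excludes = [], dispatched_files = {}, deps = []):
    # Emit the per-source dispatch headers.  "_generate_simd_headers"
    # is a placeholder for the real genrule/template logic, which also
    # consumes "config" to produce the private generated headers.
    for src, options in dispatched_files.items():
        _generate_simd_headers(name = name, src = src, options = options)

    native.cc_library(
        name = name,
        srcs = native.glob(
            ["modules/{}/src/**/*.cpp".format(name)],
            exclude = excludes,
        ),
        hdrs = native.glob(["modules/{}/include/**/*.hpp".format(name)]),
        copts = _OPENCV_COPTS,
        strip_include_prefix = "modules/{}/include".format(name),
        deps = deps,
    )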

Resulting BUILD file

At this point, the BUILD file can glue all these things together. It ended up being not too far away from my initial ideal:

opencv_base(config=CONFIG)

opencv_module(
    name = "core",
    config = CONFIG,
    dispatched_files = {
        "stat" : ["sse4_2", "avx"],
        "mathfuncs_core" : ["sse2", "avx", "avx2"],
    },
    deps = ["@eigen"],
)

opencv_module(
    name = "imgproc",
    config = CONFIG,
    dispatched_files = {
        "accum" : ["sse2", "avx", "neon"],
    },
    deps = [":core"],
)

# ...

Download

As mentioned at the start, all the sources are available on github: https://github.com/mjbots/bazel_deps/tree/master/tools/workspace/opencv

bazel-ifying simple autoconf packages

This is part N in a series describing how I created the bazel infrastructure to build all the third party packages for mjmech.  Previous installments covered the cross compilation toolchain, the overall plan, and the first simple packages.

We left off with the first, very simple packages configured to build with bazel.  In this installment we will tackle those that require at least minimal configuration, i.e. those that have some files which are normally generated as part of the build process.

snappy / template_file

snappy (https://github.com/google/snappy), a file compression library, is largely a pure C++ project, but it does have a single generated header file.  There are a few possible options within bazel to handle this.  The simplest would be to use a macro, although that has complications with respect to namespacing and label resolution.  Here, I created a custom rule, called “template_file”.

As seen in the source on github, the interface is relatively simple:

template_file(
  src,
  is_executable,
  substitutions,
  substitution_list,
)

There are two possible forms for passing in substitutions: either a string_dict via “substitutions”, or a list of KEY=VALUE strings via “substitution_list”.  Both forms are present because Starlark can make it awkward to deal with dictionaries, so working with a list is often much more convenient.
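For reference, a rule with that interface can be built on ctx.actions.expand_template.  The following is just a sketch of the idea, not the exact code from bazel_deps:

def _template_file_impl(ctx):
    # Merge the two substitution forms into a single dictionary.
    substitutions = dict(ctx.attr.substitutions)
    for item in ctx.attr.substitution_list:
        key, _, value = item.partition("=")
        substitutions[key] = value

    # The rule name doubles as the output file name, as in the snappy
    # example below.
    out = ctx.actions.declare_file(ctx.label.name)
    ctx.actions.expand_template(
        template = ctx.file.src,
        output = out,
        substitutions = substitutions,
        is_executable = ctx.attr.is_executable,
    )
    return [DefaultInfo(files = depset([out]))]

template_file = rule(
    implementation = _template_file_impl,
    attrs = {
        "src": attr.label(allow_single_file = True),
        "substitutions": attr.string_dict(),
        "substitution_list": attr.string_list(),
        "is_executable": attr.bool(default = False),
    },
)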

The snappy build rule uses this to generate snappy-stubs-public.h:

template_file(
    name = "snappy-stubs-public.h",
    src = "snappy-stubs-public.h.in",
    substitutions = {
        "${HAVE_STDINT_H_01}": "1",
        "${HAVE_STDDEF_H_01}": "1",
        "${HAVE_SYS_UIO_H_01}": "1",
        "${SNAPPY_MAJOR}": "1",
        "${SNAPPY_MINOR}": "1",
        "${SNAPPY_PATCHLEVEL}": "7",
    },
)

log4cpp / autoconf

Next up in the difficulty level is log4cpp, which uses autoconf as its native build system.  autoconf typically has a config.h.in file, defined in a semi-standardized format, mostly consisting of lines like:

/* Define to 1 if you have the <string.h> header file. */
#undef HAVE_STRING_H

The pre-processing stage turns each of those lines into either an actual #define (e.g. #define HAVE_STRING_H 1), or just a comment saying that it continues to be undefined.

That is substantive enough that the simple built-in bazel template rule won’t cut it.  For this, we split the work into two pieces: 1) a python helper script to do the actual formatting, and 2) a Starlark bazel rule which fills in some common autoconf defines by default.  For the bazel_deps repository, that autoconf rule contains basically every autoconf define that is shared by more than one project, to reduce duplication and make it more likely that all the dependencies are configured consistently.

With the two of them together, the resulting BUILD file defines a cc_library as normal, but it depends upon the new autoconf rule, as seen in the log4cpp instance:

autoconf_config(
    name = "include/log4cpp/config.h",
    src = "include/config.h.in",
    package = "log4cpp",
    version = "1.1.3",
    defines = autoconf_standard_defines + [
        "DISABLE_SMTP",
        "DISABLE_REMOTE_SYSLOG",
    ],
    prefix = "LOG4CPP_",
)

rules_mbed – bazel for mbed

When working on the firmware for Super Mega Microbot’s improved actuators, I decided to try using mbed-os from ARM for the STM32 libraries instead of the HAL libraries.  I always found that the HAL libraries had a terrible API, and yet they still required that any non-trivial usage of the hardware devolve into register twiddling to be effective.  mbed presents a pretty nice C++ API for the features they do support, which is only a little less capable than HAL, but still makes it trivial to drop down to register twiddling when necessary (and includes all of the HAL by reference).

Most people probably use mbed through their online IDE.  While this is a remarkable convenience, I am a big fan of reproducibility of builds and also of host based testing.  mbed provides mbed-cli which gets part of the way there by letting you build offline.  Unfortunately, it still doesn’t provide great support for projects that both deploy to a target and have automated unit tests.  It actually has zero support for unit tests that can run on the host.

Enter bazel

As many have guessed, I’m a big fan of bazel (https://bazel.build) for building software projects.  Despite being rough around the edges, it does a lot of things right.  Its philosophy is to be fast, correct, and reproducible.  While it doesn’t support flagless multi-platform builds yet (#6519), it does provide some minimal multi-platform support.  It also has robust capabilities for pulling in external projects, including entire toolchains and build environments.  Finally, it has excellent support for host based unit tests.

To make it work, you do have to configure a toolchain though.  That I’ve now done for at least STM32F4 based mbed targets.  It is published under an Apache 2.0 license at: https://github.com/mjbots/rules_mbed

rules_mbed features

  • Seamless toolchain provisioning: It configures bazel to automatically download a recent GNU gcc toolchain from ARM (2018q2).
  • Processor support: Currently all STM32F4 processors are supported, although others could be added with minimal effort.
  • Host based testing: Common modules that rely only on the standard library can have unit tests run on the host using any C/C++ testing tools such as boost::test.
  • Simultaneous multi-target configuration: Multiple targets (currently limited to the same CPU family) can be built with a single bazel command.  Multiple families can be configured within the same source tree.

Once you have it installed in a project, building all host based tests is as simple as:

tools/bazel test //...

and building all binary output files is as simple as:

tools/bazel build -c opt --cpu=stm32f4 //...


First bazel-ified packages

In “Building mjmech dependencies with bazel“, I described my rationale, as it were, for attempting to build all of the mjmech dependencies within bazel for cross compilation onto the raspberry pi.  mjmech has two big dependencies which were going to cause most of the transitive fallout:

  • gstreamer – We use gstreamer to interface with the webcam, format RTSP streams for FPV on the control station, and to render the control station and heads up display.  Granted, not all of gstreamer is used, but we do depend on features that require ffmpeg and X11.
  • opencv – The use of opencv had been minimal to non-existent previously, as we hadn’t actually done any computer vision on the robot itself.  However, one of the big motivations for switching to the raspberry pi in the first place was at least to be able to do active target tracking onboard.

And then there are a few other direct dependencies that are “easy”, if nothing else because they have such few transitive dependencies.

  • boost – The use of boost is almost exclusively the header-only parts, plus boost::date_time, boost::filesystem, boost::program_options, boost::test, and boost::python.  Of these, only boost::python has any transitive dependencies.
  • fmt – This text formatting library has no further dependencies whatsoever.
  • log4cpp – This is just used for writing textual debug output and has no transitive dependencies.
  • snappy – We use snappy to compress logged data, but it depends on nothing.

Simple packages

I started with the simple, no dependency packages from the second set.  The strategy here is to, for each package, create a tools/workspace/FOO/repository.bzl with a FOO_repository method to download the upstream tarball, and a corresponding tools/workspace/FOO/package.BUILD which contains the bazel BUILD file describing how to build that package.

The most straightforward package was “fmt”, from https://github.com/fmtlib/fmt.  Its repository.bzl looks like:

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

def fmt_repository(name):
    http_archive(
        name = name,
        urls = [
            "https://github.com/fmtlib/fmt/archive/5.0.0.tar.gz",
        ],
        sha256 = "fc33d64d5aa2739ad2ca1b128628a7fc1b7dca1ad077314f09affc57d59cf88a",
        strip_prefix = "fmt-5.0.0",
        build_file = Label("//tools/workspace/fmt:package.BUILD"),
    )

It basically just contains the URL of the tarball to download, its hash, the prefix to strip, and where to locate the BUILD file.

Next, the package.BUILD file is:

package(default_visibility = ["//visibility:public"])

cc_library(
    name = "fmt",
    hdrs = glob(["include/**"]),
    srcs = [
        "src/format.cc",
    ],
    includes = ["include"],
)

Since the fmt library is mostly header only, this has a single cc_library definition which compiles the one file and sets up the paths to correctly find the header files.

The other easy package, boost, was handled in a similar manner.

Next level

Moving up the difficulty ladder are snappy and log4cpp, both of which require at least minimally more complicated build rules.  That will be the topic for next time.


Building mjmech dependencies with bazel

Previously, I set up bazel to be able to cross compile for the raspberry pi using an extracted sysroot.  That sysroot was very minimal, basically just glibc and the kernel headers.  The software used for SMMB has many dependencies beyond that though, including some heavyweight ones such as gstreamer, and I needed some solution for building against them.

Options

There were two basic options:

  1. Install all the dependencies I cared about on an actual raspberry pi, and extract them into the sysroot.
  2. Build all the dependencies I cared about using bazel’s external projects mechanism.

The former would certainly be quicker in the short term, at the expense of needing to check in or otherwise version a very large sysroot.  It would also be annoying to update, as I would need to keep around a physical raspberry pi and continually reset it to zero in order to generate suitably pristine sysroots.

The second option had the benefit of requiring a small version control footprint — just the URLs and hashes of each project, along with suitable bazel build configuration.  It is also perfectly compatible with a fully hermetic build result.  However, it had the significant downside that I would need to write the bazel build configuration for all the transitive dependencies of the project.

I decided to take a stab at the second route, partly because of the benefits, but also to see just how hard it would be.

Structure

Since bazel does not yet have recursive WORKSPACE parsing, I went with a structure that is used in the Drake open source project.  The top level project has a tools/workspace directory that contains one sub-directory for each dependency or transitive dependency.  The tools/workspace directory also holds a default.bzl that contains one exported function, add_default_repositories.  It is intended to be called from the top level WORKSPACE file, and it creates bazel rules for all necessary external dependencies.
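From the top level WORKSPACE, using it is then just a load and a single call; a minimal sketch:

# WORKSPACE
load("//tools/workspace:default.bzl", "add_default_repositories")

add_default_repositories()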

The drake project doesn’t support cross compilation, so most of their BUILD rules are of the form, “grab the pre-compiled flags from pkg-config”.  However, the same structure will work just fine even with non-trivial compilation steps.

mjbots/bazel_deps

So that this could be easily used across multiple dependent projects, I put all the resultant rules into a new github project: https://github.com/mjbots/bazel_deps  Like the drake repository, it has a default.bzl with a single export, add_default_repositories, that is used by dependent projects.

Next up, working through actually packaging the dependencies!

Configuring bazel to cross compile for the Raspberry Pi 3

In the previous post, I described the motivation for switching the mjmech build system to bazel.  For that to be useful with Super Mega Microbot, I first had to get a toolchain configured for bazel such that it could produce binaries that would run on the raspberry pi.

All the work in this post is publicly available on github: https://github.com/mjbots/rpi_bazel

Compiler and sysroot

First, I needed to pick the compiler I would be using and figure out how to get the target system libraries available for cross compilation.  In the past I’ve always done the gcc/binutils/gnu-everything cross toolchain dance, however here I thought I would try something a bit more reproducible and see if I could make clang work.  Amazingly, a single clang binary supports all possible target types!  clang-6, which can be had through a PPA on Ubuntu 16.04 and natively on Ubuntu 18.04, supports it out of the box.

For the target system libraries, I wrote a small script, make_sysroot.py, which, when aimed at a live raspberry pi, will extract the Linux kernel headers and glibc headers and binaries.  These are stored in a single tar.xz file, with the latest version checked into the tree: 2018-06-10-sysroot.tar.xz.

Bazel configuration

bazel has a legacy mechanism for configuring toolchains, called the CROSSTOOL file.  It is not exactly pretty and is moderately underdocumented, but at least there are a few write-ups online describing how to create one.  The CROSSTOOL I created here is minimally functional, with options for both host/clang and rpi/clang, with a few caveats:

  1. It isn’t hermetic, it relies on the system installation of clang
  2. It includes hard coded paths to micro releases of clang
  3. The C++ main function isn’t handled correctly yet, and you have to ‘extern “C”‘ it to link applications

With a functioning CROSSTOOL, the next step is to declare the “cc_toolchain_suite” and “cc_toolchain” definitions.  Those are defined in the tools/cc_toolchain/BUILD file and don’t have anything particularly complex in them.
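Stripped down, that BUILD file has roughly the following shape.  The target names, cpu strings, and empty filegroups here are illustrative rather than the actual contents of tools/cc_toolchain/BUILD:

# tools/cc_toolchain/BUILD (abridged sketch)
filegroup(name = "empty")

cc_toolchain_suite(
    name = "toolchain",
    toolchains = {
        "k8": ":host_toolchain",
        "armeabihf": ":rpi_toolchain",
    },
)

cc_toolchain(
    name = "rpi_toolchain",
    cpu = "armeabihf",
    all_files = ":empty",
    compiler_files = ":empty",
    dwp_files = ":empty",
    dynamic_runtime_libs = [":empty"],
    linker_files = ":empty",
    objcopy_files = ":empty",
    static_runtime_libs = [":empty"],
    strip_files = ":empty",
    supports_param_files = 1,
)

# ":host_toolchain" is declared the same way for the host cpu.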

Finally, I created a repository.bzl which is intended to be imported in client projects.  This provides a relatively simple API for a client project to import the toolchain, and gets the sysroot extracted properly.

Using it in a bazel project

The top level README gives the steps to use the toolchain, which, while not too bad, still require touching 4 non-empty files (and 2 possibly empty files).  Once that is done, you can use:

bazel test --config=pi //...

to build your entire project with binaries targeted to run on the raspberry pi!