Tag Archives: opencv

bazel for opencv

The next level of difficulty in bazel-ifying packages for mjmech was opencv.

First, for the impatient, Apache 2.0 licensed sources are available on github: https://github.com/mjbots/bazel_deps/tree/master/tools/workspace/opencv

OpenCV’s native build system consists of nearly 200 cmake files with over 20,000 total lines of code, plus assorted helper scripts and prototype files which are substituted into.  Fortunately, I didn’t need to support the full complexity of the opencv build system.  Things I didn’t bother to touch:

  • Autodetecting the location of any packages:  Since this is embedded within a bazel project, all the dependencies I was going to use were already bazel-ified.
  • GPU support:  I didn’t bother with CUDA, OpenCL, or anything else GPU, as the GPU on the raspberry pi3 has only a minimally supported OpenCL stack, at best.
  • Examples: I had no real need to build any of the sample or demonstration applications.
  • Modules: I only needed, for now, a small fraction of the total opencv module set.

Intended result

OpenCV is broken up into numerous “modules”, each of which is largely independent and implements a relatively self-contained piece of functionality.  Each module can depend upon other opencv modules and other external packages.  I wanted to reach a point with the bazel configuration where each module could be described at a relatively high level, which didn’t seem too implausible since the module structure is relatively constant across modules.

Think something like:

cvmodule(
  name = "core",
)

cvmodule(
  name = "imgproc",
  deps = [":core"],
)

# ... etc

Bazel implementation

To achieve this in bazel requires three phases: first, a repository rule to actually download the opencv tarball, second something written in Starlark which could implement a cvmodule like rule, and finally a BUILD file equivalent to call it and enumerate the opencv modules I cared about.

The repository rule was relatively straightforward.  It just uses the standard bazel mechanisms to download a tarball and template expand a known BUILD file into place.

The Starlark file is where all the real work lies:

First, we just define a common set of preprocessor defines which will be applied to all opencv translation units:

_OPENCV_COPTS = [
    "-D_USE_MATH_DEFINES",
    "-D__OPENCV_BUILD=1",
    "-D__STDC_CONSTANT_MACROS",
    "-D__STDC_FORMAT_MACROS",
    "-D__STDC_LIMIT_MACROS",
    "-I$(GENDIR)/external/opencv/private/",
]

The only real interesting piece there is the $(GENDIR) fragment, which causes bazel to substitute in the path to the generated output directory. Since a number of the headers that we will be using later on are generated as part of the bazel build process, we need to ensure that the compiler can actually find them.  The alternate formulation which doesn’t require knowledge of GENDIR is to list it in the includes attribute.  That however, exposes the headers to all downstream modules, which we don’t want.

Next, we enumerate the set of processor specific options that we will support:

_KNOWN_OPTS = [
    ("neon", "armeabihf"),
    ("vfpv3", "armeabihf"),
    ("avx", "x86_64"),
    ("avx2", "x86_64"),
    ("avx512_skx", "x86_64"),
    # ...

OpenCV has support for a comprehensive set of special instruction set extensions for a variety of different processors. They can either be compiled in and always used, or dispatched at runtime, although these bazel rules ignore runtime dispatching entirely. Some of the generated headers need to know the full possible set of options for a given architecture, thus this list here.

Module definition macros

Now we can start getting into the module definition Starlark macros.  To handle some generated headers that are common to every opencv module, I ended up creating an opencv_base macro that is module independent.  It needs to be called from the BUILD file, to generate a number of the private headers, but exposes no logically public labels.  It does require a passed in “configuration” dictionary (which will also be passed in to the primary module generation macro).

CONFIG = {
    "modules" : [
        "aruco",
        "calib3d",
        "core",
        "imgcodecs",
        "imgproc",
    ],
# ...

The big annoyance here is that you have to enumerate the set of modules twice.  Once in that config, and again in the set of invocations to the module creation macro.  The reason is the generated opencv_modules.hpp file, which provides preprocessor defines describing which modules are included in the build.

Next is the implementation of the module generation macro, which I ended up calling opencv_module.  It needs just a few arguments for the modules I ended up converting:

  • name – should be self-evident
  • config – the same configuration dictionary from above
  • excludes – source files to skip compiling from this module
  • dispatched_files – the list of files in this module which contain processor specific specializations
  • deps – a set of bazel labels describing the dependency set

This macro goes about generating the appropriate .simd_declarations.hpp and .simd.hpp files, a non-functional stub OpenCL implementation, and then goes on to define the actual bazel cc_library.

Resulting BUILD file

At this point, the BUILD file can glue all these things together. It ended up being not too far away from my initial ideal:

opencv_base(config=CONFIG)

opencv_module(
    name = "core",
    config = CONFIG,
    dispatched_files = {
        "stat" : ["sse4_2", "avx"],
        "mathfuncs_core" : ["sse2", "avx", "avx2"],
    },
    deps = ["@eigen"],
)

opencv_module(
    name = "imgproc",
    config = CONFIG,
    dispatched_files = {
        "accum" : ["sse2", "avx", "neon"],
    },
    deps = [":core"],
)

# ...

Download

As mentioned at the start, all the source are available on github: