# `toolchain_utils`

> A Bazel ruleset to enable concise toolchain registration.

## Getting Started

Add the following to `MODULE.bazel`:

```py
bazel_dep(name = "toolchain_utils", version = "<...>")
```

### Register a built target

The `toolchain_info` rule provides a way to create the `ToolchainInfo` provider around an executable Bazel target.

```py
# Assuming a ruleset named `rules_placeholder` and a tool named `binary`
# This would be in `//placeholder/toolchain/binary/BUILD.bazel`
load("@toolchain_utils//toolchain/info:defs.bzl", "toolchain_info")

# Used for registration. Public so that downstream can register other toolchains
toolchain_type(
    name = "type",
    visibility = ["//visibility:public"],
)

# Create toolchain information around a binary target
# This could be a `go_binary`, `py_binary`, etc.
# The theoretical binary generates code that targets `amd64-linux-gnu`
# It will always be built for the current execution platform
toolchain_info(
    name = "info-amd64-linux-gnu",
    target = "//placeholder/binary:amd64-linux-gnu",
    # We want a consistent `$(BINARY)` Make variable
    # This will default to `basename.upper()`
    variable = "BINARY",
)

# Register the toolchain
# The `target_compatible_with` describes the code generated by the toolchain
# No `exec_compatible_with` as Bazel will build the binary for the execution platform
toolchain(
    name = "built-amd64-linux-gnu",
    toolchain = ":info-amd64-linux-gnu",
    toolchain_type = ":type",
    target_compatible_with = [
        "@toolchain_utils//toolchain/constraint/cpu:amd64",
        "@toolchain_utils//toolchain/constraint/os:linux",
        "@toolchain_utils//toolchain/constraint/libc:gnu",
    ],
)
```

The toolchain can be implicitly registered by the current module in `MODULE.bazel`:

```py
register_toolchains("//placeholder/toolchain/binary:built-amd64-linux-gnu")
# Or use a recursive registration
# register_toolchains("//placeholder/toolchain/...")
```

### Register a local binary on `PATH`

It is often useful to use a binary defined on `PATH` as the toolchain binary.

The downside is that the binary is not hermetic and is not available for remote execution.

However, it can still be useful for experimentation and quick setup when a hermetic tool does not exist.

The project provides a repository rule to detect a binary on `PATH`.

Add the following to `MODULE.bazel`:

```py
which = use_repo_rule("@toolchain_utils//toolchain/local/which:defs.bzl", "toolchain_local_which")

# Assuming a binary named `amd64-linux-gnu-binary` on `PATH`
which(
    name = "which-amd64-linux-gnu-binary",
)
```

The repository rule provides a `toolchain_info` target that can be registered against a toolchain type:

```py
# Assuming a ruleset named `rules_placeholder` and a tool named `binary`
# This would be in `//placeholder/toolchain/binary/BUILD.bazel`

# Used for registration. Public so that downstream can register other toolchains
toolchain_type(
    name = "type",
    visibility = ["//visibility:public"],
)

# Register the local toolchain
# The `target_compatible_with` describes the code generated by the toolchain
# `exec_compatible_with` set to local constraints.
# Will not be available for remote execution due to local symlink path
toolchain(
    name = "local-amd64-linux-gnu",
    toolchain = "@which-amd64-linux-gnu-binary",
    toolchain_type = ":type",
    exec_compatible_with = [
        "@toolchain_utils//toolchain/constraint/cpu:local",
        "@toolchain_utils//toolchain/constraint/os:local",
        "@toolchain_utils//toolchain/constraint/libc:local",
    ],
    target_compatible_with = [
        "@toolchain_utils//toolchain/constraint/cpu:amd64",
        "@toolchain_utils//toolchain/constraint/os:linux",
        "@toolchain_utils//toolchain/constraint/libc:gnu",
    ],
)
```

The toolchain can be implicitly registered by the current module in `MODULE.bazel`:

```py
register_toolchains("//placeholder/toolchain/binary:local-amd64-linux-gnu")
# Or use a recursive registration
# register_toolchains("//placeholder/toolchain/...")
```

### Register downloaded binaries

A common use-case is to download pre-built binaries for different execution architectures and register them as a Bazel toolchain.

The binaries will need to be downloaded in `MODULE.bazel`:

```py
bazel_dep(name = "download_utils", version = "<...>")

# Download a binary that runs on `amd64-linux-gnu`
# It will generate code for `arm64-windows-ucrt`
download_file = use_repo_rule("@download_utils//download/file:defs.bzl", "download_file")
download_file(
    name = "binary-amd64-linux-gnu",
    output = "arm64-windows-ucrt-binary",
    executable = True,
    urls = ["https://some.thing/amd64-linux-gnu/arm64-windows-ucrt-binary"],
)
```

A toolchain type must be defined to register the tool against:

```py
# Assuming a ruleset named `rules_placeholder` and a tool named `binary`
# This would be in `//placeholder/toolchain/binary/BUILD.bazel`

# Used for registration. Public so that downstream can register other toolchains
toolchain_type(
    name = "type",
    visibility = ["//visibility:public"],
)
```

The downloaded binary can now be registered against the toolchain type.

To make organisation easier, it can be useful to create a Bazel package for each execution architecture.

```py
# This would be in `//placeholder/toolchain/binary/amd64-linux-gnu/BUILD.bazel`
# Where `amd64-linux-gnu` is the execution architecture these toolchains work on
load("@toolchain_utils//toolchain/info:defs.bzl", "toolchain_info")

# Create toolchain information around the downloaded binary
toolchain_info(
    name = "info-arm64-windows-ucrt",
    target = "@binary-amd64-linux-gnu//:arm64-windows-ucrt-binary",
    # We want a consistent `$(BINARY)` Make variable
    # This will default to `basename.upper()`
    variable = "BINARY",
)

# Register the downloaded toolchain
# The `target_compatible_with` describes the code generated by the toolchain
# The `exec_compatible_with` describes which execution platform the tool can run on
toolchain(
    name = "downloaded-arm64-windows-ucrt",
    toolchain = ":info-arm64-windows-ucrt",
    toolchain_type = "//placeholder/toolchain/binary:type",
    exec_compatible_with = [
        "@toolchain_utils//toolchain/constraint/cpu:amd64",
        "@toolchain_utils//toolchain/constraint/os:linux",
        "@toolchain_utils//toolchain/constraint/libc:gnu",
    ],
    target_compatible_with = [
        "@toolchain_utils//toolchain/constraint/cpu:arm64",
        "@toolchain_utils//toolchain/constraint/os:windows",
        "@toolchain_utils//toolchain/constraint/libc:ucrt",
    ],
)
```

The toolchain can be implicitly registered by the current module in `MODULE.bazel`:

```py
register_toolchains("//placeholder/toolchain/binary/amd64-linux-gnu:downloaded-arm64-windows-ucrt")
# Or use a recursive registration
# register_toolchains("//placeholder/toolchain/...")
```
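The `exec_compatible_with` and `target_compatible_with` lists in the examples above all follow the same `cpu`/`os`/`libc` pattern. As a minimal sketch, a small Starlark helper could expand a triplet string into the three constraint labels. The `constraints` function and where it lives are hypothetical and not part of `toolchain_utils`; only the label pattern is taken from the examples above.

```py
# Hypothetical helper, NOT provided by `toolchain_utils`
# It could live in a `.bzl` file of the ruleset and be loaded into `BUILD.bazel` files
def constraints(triplet):
    """Expands a `cpu-os-libc` triplet into the matching constraint labels."""

    # Errors if the triplet does not have exactly three `-` separated parts
    cpu, os, libc = triplet.split("-")
    return [
        "@toolchain_utils//toolchain/constraint/cpu:" + cpu,
        "@toolchain_utils//toolchain/constraint/os:" + os,
        "@toolchain_utils//toolchain/constraint/libc:" + libc,
    ]
```

With such a helper, the downloaded toolchain could be declared with `exec_compatible_with = constraints("amd64-linux-gnu")` and `target_compatible_with = constraints("arm64-windows-ucrt")`.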
### Use a toolchain in a Bazel `rule`

Declare the usage of the toolchain in a rule definition:

```py
def implementation(ctx):
    toolchain = ctx.toolchains["//placeholder/toolchain/binary:type"]

    # The `ToolchainInfo` generated by `toolchain_info` is always the same shape

    # `toolchain.variable` is the Make variable for the tool

    # `toolchain.default` is the `DefaultInfo` of the wrapped tool
    # Use the `DefaultInfo` to forward runfiles through rules, if needed

    # `toolchain.executable` is the Bazel executable `File`

    # `toolchain.run` can be passed to `ctx.actions.run` to execute the tool
    # It will correctly forward runfiles associated with the tool
    output = ctx.actions.declare_file(ctx.label.name)
    args = ctx.actions.args()
    args.add(output)
    ctx.actions.run(
        outputs = [output],
        executable = toolchain.run,
        arguments = [args],
    )
    return DefaultInfo(files = depset([output]))

example = rule(
    implementation = implementation,
    toolchains = ["//placeholder/toolchain/binary:type"],
)
```
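For completeness, the rule above could be instantiated from a `BUILD.bazel` file as sketched below. The `//placeholder/example:defs.bzl` location is an assumption made for illustration; the source does not say where the rule definition is stored.

```py
# Hypothetical location: assumes the rule above is saved as `//placeholder/example/defs.bzl`
load("//placeholder/example:defs.bzl", "example")

# Building this target triggers toolchain resolution for `//placeholder/toolchain/binary:type`
# The resolved tool is run with the declared output path as its only argument
example(
    name = "generated",
)
```

If no registered toolchain matches the execution and target platforms, Bazel reports a toolchain resolution error for this target at analysis time.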

### Provide a resolved toolchain Make variable

Due to a [quirk] in Bazel, another `rule` implementation must perform the toolchain resolution in order to retrieve the resolved toolchain as a Make variable.

The project provides a repository rule that implements the boilerplate necessary for this.
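To give a feel for the boilerplate involved, here is a rough sketch of a rule that resolves the toolchain and re-exports it as a Make variable. This is not the actual `toolchain_resolved` implementation. It assumes that `toolchain.variable` holds the Make variable name and that `toolchain.executable` is a `File`, as described in the rule usage section above. The `resolved_sketch` name is hypothetical.

```py
# Sketch only: NOT the implementation used by `toolchain_utils`
def _resolved_sketch_impl(ctx):
    # Toolchain resolution happens for this rule, not for the consumer of the Make variable
    toolchain = ctx.toolchains["//placeholder/toolchain/binary:type"]

    # Assumption: `variable` is the Make variable name, e.g. `BINARY`
    # Assumption: `executable` is the `File` for the resolved tool
    return [
        platform_common.TemplateVariableInfo({
            toolchain.variable: toolchain.executable.path,
        }),
    ]

resolved_sketch = rule(
    implementation = _resolved_sketch_impl,
    toolchains = ["//placeholder/toolchain/binary:type"],
)
```

A real implementation would likely also need to forward the files and runfiles of the tool so that consumers can execute it.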

Use the `resolved` repository rule in `MODULE.bazel`:

```py
resolved = use_repo_rule("@toolchain_utils//toolchain/resolved:defs.bzl", "toolchain_resolved")
resolved(
    name = "resolved-binary",
    toolchain_type = "//placeholder/toolchain/binary:type",
)
```

An `alias` can then be provided to expose the resolved Make variable to downstream users:

```py
# Assuming a ruleset named `rules_placeholder` and a tool named `binary`
# This would be in `//placeholder/toolchain/binary/BUILD.bazel`

# Used for registration. Public so that downstream can register other toolchains
toolchain_type(
    name = "type",
    visibility = ["//visibility:public"],
)

# Provides `TemplateVariableInfo` so that the resolved toolchain can be used as a Make variable
alias(
    name = "resolved",
    actual = "@resolved-binary",
    visibility = ["//visibility:public"],
)
```

### Use a resolved toolchain in a `genrule`

A resolved toolchain Make variable can be used in any rule that expands Make variables.

Commonly, this functionality is used in `genrule` targets:

```py
genrule(
    name = "generate",
    outs = ["stdout.log"],
    cmd = "$(BINARY) --help > $@",
    toolchains = ["@rules_placeholder//placeholder/toolchain/binary:resolved"],
)
```

### Test a resolved toolchain

When developing a toolchain it can be useful to do simple testing of the toolchain resolution.

This can be performed with `toolchain_test` and a resolved toolchain target.

```py
# Assuming a ruleset named `rules_placeholder` and a tool named `binary`
# This would be in `//placeholder/toolchain/binary/BUILD.bazel`
load("@toolchain_utils//toolchain/test:defs.bzl", "toolchain_test")

# Used for registration. Public so that downstream can register other toolchains
toolchain_type(
    name = "type",
    visibility = ["//visibility:public"],
)

# Assuming that some `toolchain` targets are added here and registered

# Provides `TemplateVariableInfo` so that the resolved toolchain can be used as a Make variable
alias(
    name = "resolved",
    actual = "@resolved-binary",
    visibility = ["//visibility:public"],
)

# Add a simple test to ensure that the toolchain is resolved, exits with zero and outputs _something_ to `stdout`
toolchain_test(
    name = "test",
    args = ["--version"],
    toolchains = [":resolved"],
)
```

The `toolchain_test` rule can also `diff` the `stdout`/`stderr` of the tool. See the documentation for the rule for details.

## Hermeticity

### POSIX

On POSIX systems, this ruleset is entirely hermetic and only requires a POSIX compatible shell and `/usr/bin/env` to find that shell.

### NT

The ruleset has a Batch implementation on Windows, so it does not require Bash.

A binary Windows launcher is created by compiling [C# code][launcher-cs] with the .NET `csc`. This is provided by the base install of Windows.

The `toolchain_test` uses the `FC.exe` binary to compare `stdout`/`stderr` of toolchain binaries. This is provided in the base install of Windows.

Because both tools ship with the base install of Windows, the ruleset is effectively hermetic.

[launcher-cs]: toolchain/launcher/launcher.cs
[quirk]: https://github.com/bazelbuild/bazel/issues/14009