# `toolchain_utils`

> A Bazel ruleset to enable concise toolchain registration.

## Getting Started

Add the following to `MODULE.bazel`:

```py
bazel_dep(name = "toolchain_utils", version = "<...>")
```

### Register a built target

The `toolchain_info` rule creates the `ToolchainInfo` provider around an executable Bazel target:

```py
# Assuming a ruleset named `rules_placeholder` and a tool named `binary`
# This would be in `//placeholder/toolchain/binary/BUILD.bazel`
load("@toolchain_utils//toolchain/info:defs.bzl", "toolchain_info")

# Used for registration. Public so that downstream can register other toolchains
toolchain_type(
    name = "type",
    visibility = ["//visibility:public"],
)

# Create toolchain information around a binary target
# This could be a `go_binary`, `py_binary`, etc.
# The theoretical binary generates code that targets `amd64-linux-gnu`
# It will always be built for the current execution platform
toolchain_info(
    name = "info-amd64-linux-gnu",
    target = "//placeholder/binary:amd64-linux-gnu",
    # We want a consistent `$(BINARY)` Make variable
    # This will default to `basename.upper()`
    variable = "BINARY",
)

# Register the toolchain
# The `target_compatible_with` describes the code generated by the toolchain
# No `exec_compatible_with` as Bazel will build the binary for the execution platform
toolchain(
    name = "built-amd64-linux-gnu",
    toolchain = ":info-amd64-linux-gnu",
    toolchain_type = ":type",
    target_compatible_with = [
        "@toolchain_utils//toolchain/constraint/cpu:amd64",
        "@toolchain_utils//toolchain/constraint/os:linux",
        "@toolchain_utils//toolchain/constraint/libc:gnu",
    ],
)
```

The toolchain can be implicitly registered by the current module in `MODULE.bazel`:
```py
register_toolchains("//placeholder/toolchain/binary:built-amd64-linux-gnu")
# Or use a recursive registration
# register_toolchains("//placeholder/toolchain/...")
```
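
Because the toolchain type is public, a downstream module can register its own toolchain against it. The following is a minimal sketch, assuming the downstream module declares both `rules_placeholder` and `toolchain_utils` as `bazel_dep`s. The package `//tools/binary` and the target `:custom-binary` are hypothetical names used only for illustration:

```py
# Hypothetical downstream `//tools/binary/BUILD.bazel`
load("@toolchain_utils//toolchain/info:defs.bzl", "toolchain_info")

# Wrap a downstream executable target (hypothetical `:custom-binary`)
toolchain_info(
    name = "info",
    target = ":custom-binary",
    variable = "BINARY",
)

# Register against the public toolchain type from `rules_placeholder`
toolchain(
    name = "custom",
    toolchain = ":info",
    toolchain_type = "@rules_placeholder//placeholder/toolchain/binary:type",
)
```

The downstream `MODULE.bazel` would then call `register_toolchains("//tools/binary:custom")`.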

### Register a local binary on `PATH`

It is often useful for debugging or quick setup to use a binary defined on `PATH` as the toolchain binary.

The project provides a repository rule to detect a binary on `PATH`. Add the following to `MODULE.bazel`:

```py
# Assuming a binary named `amd64-linux-gnu-binary` on `PATH`
which = use_repo_rule("@toolchain_utils//toolchain/local/which:defs.bzl", "toolchain_local_which")
which(
    name = "which-amd64-linux-gnu-binary",
)
```

The repository rule provides a `toolchain_info` target that can be registered against a toolchain type:

```py
# Assuming a ruleset named `rules_placeholder` and a tool named `binary`
# This would be in `//placeholder/toolchain/binary/BUILD.bazel`

# Used for registration. Public so that downstream can register other toolchains
toolchain_type(
    name = "type",
    visibility = ["//visibility:public"],
)

# Register the local toolchain
# The `target_compatible_with` describes the code generated by the toolchain
# The `exec_compatible_with` restricts the toolchain to the local machine where the binary was found
toolchain(
    name = "local-amd64-linux-gnu",
    toolchain = "@which-amd64-linux-gnu-binary",
    toolchain_type = ":type",
    exec_compatible_with = [
        "@toolchain_utils//toolchain/constraint/cpu:local",
        "@toolchain_utils//toolchain/constraint/os:local",
        "@toolchain_utils//toolchain/constraint/libc:local",
    ],
    target_compatible_with = [
        "@toolchain_utils//toolchain/constraint/cpu:amd64",
        "@toolchain_utils//toolchain/constraint/os:linux",
        "@toolchain_utils//toolchain/constraint/libc:gnu",
    ],
)

```

The toolchain can be implicitly registered by the current module in `MODULE.bazel`:

```py
register_toolchains("//placeholder/toolchain/binary:local-amd64-linux-gnu")
# Or use a recursive registration
# register_toolchains("//placeholder/toolchain/...")
```

### Register downloaded binaries

A common use-case is to download pre-built binaries for different execution architectures and register them as a Bazel toolchain.

The binaries will need to be downloaded in `MODULE.bazel`:

```py
bazel_dep(name = "download_utils", version = "<...>")

# Download a binary that runs on `amd64-linux-gnu`
# It will generate code for `arm64-windows-ucrt`
download_file = use_repo_rule("@download_utils//download/file:defs.bzl", "download_file")
download_file(
    name = "binary-amd64-linux-gnu",
    output = "arm64-windows-ucrt-binary",
    executable = True,
    urls = ["https://some.thing/amd64-linux-gnu/arm64-windows-ucrt-binary"],
)
```

The downloaded binary can now be registered:

```py
# Assuming a ruleset named `rules_placeholder` and a tool named `binary`
# This would be in `//placeholder/toolchain/binary/BUILD.bazel`

# Used for registration. Public so that downstream can register other toolchains
toolchain_type(
    name = "type",
    visibility = ["//visibility:public"],
)
```

To make organisation easier, it can be useful to create a Bazel package for each execution architecture.

```py
# This would be in `//placeholder/toolchain/binary/amd64-linux-gnu/BUILD.bazel`
# Where `amd64-linux-gnu` is the execution architecture these toolchains work on
load("@toolchain_utils//toolchain/info:defs.bzl", "toolchain_info")

# Create toolchain information around the downloaded binary
toolchain_info(
    name = "info-arm64-windows-ucrt",
    target = "@binary-amd64-linux-gnu//:arm64-windows-ucrt-binary",
    # We want a consistent `$(BINARY)` Make variable
    # This will default to `basename.upper()`
    variable = "BINARY",
)

# Register the downloaded toolchain
# The `target_compatible_with` describes the code generated by the toolchain
# The `exec_compatible_with` describes which execution platform the tool can run on
toolchain(
    name = "downloaded-arm64-windows-ucrt",
    toolchain = ":info-arm64-windows-ucrt",
    toolchain_type = "//placeholder/toolchain/binary:type",
    exec_compatible_with = [
        "@toolchain_utils//toolchain/constraint/cpu:amd64",
        "@toolchain_utils//toolchain/constraint/os:linux",
        "@toolchain_utils//toolchain/constraint/libc:gnu",
    ],
    target_compatible_with = [
        "@toolchain_utils//toolchain/constraint/cpu:arm64",
        "@toolchain_utils//toolchain/constraint/os:windows",
        "@toolchain_utils//toolchain/constraint/libc:ucrt",
    ],
)
```

The toolchain can be implicitly registered by the current module in `MODULE.bazel`:

```py
register_toolchains("//placeholder/toolchain/binary/amd64-linux-gnu:downloaded-arm64-windows-ucrt")
# Or use a recursive registration
# register_toolchains("//placeholder/toolchain/...")
```
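
The constraint lists in these examples follow the `<cpu>-<os>-<libc>` naming used throughout this README. A small helper can expand such a name into the three labels, which cuts repetition and reduces the chance of mismatched names. This is only a sketch and not part of `toolchain_utils`: `triplet_constraints` is a hypothetical name, and it assumes no component contains a `-`.

```py
# Hypothetical helper, for example in a `.bzl` file inside `rules_placeholder`
def triplet_constraints(triplet):
    """Expands `<cpu>-<os>-<libc>` into the matching `toolchain_utils` constraint labels."""

    # Assumes exactly three `-` separated components
    cpu, os, libc = triplet.split("-")
    return [
        "@toolchain_utils//toolchain/constraint/cpu:" + cpu,
        "@toolchain_utils//toolchain/constraint/os:" + os,
        "@toolchain_utils//toolchain/constraint/libc:" + libc,
    ]
```

It could then be used as `exec_compatible_with = triplet_constraints("amd64-linux-gnu")` and `target_compatible_with = triplet_constraints("arm64-windows-ucrt")` in a `toolchain` target.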
### Use a toolchain in a Bazel `rule`

Declare the usage of the toolchain in a rule definition:

```py
def implementation(ctx):
    toolchain = ctx.toolchains["//placeholder/toolchain/binary:type"]

    # The `ToolchainInfo` generated by `toolchain_info` is always the same shape

    # `toolchain.default` is the `DefaultInfo` of the wrapped tool
    # Use the `DefaultInfo` to forward runfiles through rules, if needed

    # `toolchain.run` can be passed to `ctx.actions.run` to execute the tool
    output = ctx.actions.declare_file(ctx.label.name)
    args = ctx.actions.args()
    args.add(output)
    ctx.actions.run(
        outputs = [output],
        # inputs = [], # likely need some inputs to the rule
        executable = toolchain.run,
        arguments = [args],
    )
    return DefaultInfo(files = depset([output]))

example = rule(
    implementation = implementation,
    toolchains = ["//placeholder/toolchain/binary:type"],
)
```
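
The rule can then be instantiated like any other. A minimal sketch, assuming the rule above is saved in a hypothetical `//placeholder/example:defs.bzl` and that a toolchain for the type has been registered:

```py
# Hypothetical `//placeholder/example/BUILD.bazel`
load(":defs.bzl", "example")

# Building this target resolves the toolchain and runs the tool
# The declared output path is passed as the only argument
example(
    name = "generated",
)
```

Building `//placeholder/example:generated` fails at toolchain resolution if no registered toolchain matches the current platforms, which makes it a quick check of the registration.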

### Provide a resolved toolchain Make variable

Due to a [quirk] in Bazel, retrieving the resolved toolchain as a Make variable requires another `rule` implementation to perform the toolchain resolution.

The project provides a repository rule that implements the boilerplate necessary for this.

Use the `resolved` repository rule in `MODULE.bazel`:

```py
resolved = use_repo_rule("@toolchain_utils//toolchain/resolved:defs.bzl", "toolchain_resolved")
resolved(
    name = "resolved-binary",
    toolchain_type = "//placeholder/toolchain/binary:type",
)
```

An `alias` can then be provided to expose the resolved Make variable to downstream users:

```py
# Assuming a ruleset named `rules_placeholder` and a tool named `binary`
# This would be in `//placeholder/toolchain/binary/BUILD.bazel`

# Used for registration. Public so that downstream can register other toolchains
toolchain_type(
    name = "type",
    visibility = ["//visibility:public"],
)

# Provides `TemplateVariableInfo` so that the resolved toolchain can be used as Make variable
alias(
    name = "resolved",
    actual = "@resolved-binary",
    visibility = ["//visibility:public"],
)
```

### Use a resolved toolchain in a `genrule`

A resolved toolchain Make variable can be used in any rule that expands Make variables.
Commonly, this functionality is used in `genrule` targets:

```py
genrule(
    name = "generate",
    outs = ["stdout.log"],
    cmd = "$(BINARY) --help > $@",
    toolchains = ["@rules_placeholder//placeholder/toolchain/binary:resolved"],
)
```

### Test a resolved toolchain

When developing a toolchain, it can be useful to do simple testing of the toolchain resolution.

This can be performed with `toolchain_test` and a resolved toolchain target.

```py
# Assuming a ruleset named `rules_placeholder` and a tool named `binary`
# This would be in `//placeholder/toolchain/binary/BUILD.bazel`
load("@toolchain_utils//toolchain/test:defs.bzl", "toolchain_test")

# Used for registration. Public so that downstream can register other toolchains
toolchain_type(
    name = "type",
    visibility = ["//visibility:public"],
)

# Assuming that some `toolchain` targets are added here and registered

# Provides `TemplateVariableInfo` so that the resolved toolchain can be used as Make variable
alias(
    name = "resolved",
    actual = "@resolved-binary",
    visibility = ["//visibility:public"],
)

# Add a simple test to ensure that the toolchain is resolved, exits with zero and outputs _something_ to `stdout`
toolchain_test(
    name = "test",
    args = ["--version"],
    toolchains = [":resolved"],
)
```

The `toolchain_test` rule can also `diff` the `stdout`/`stderr` of the binary. See the rule's documentation for details.

## Hermeticity

### POSIX

On POSIX systems, this ruleset is entirely hermetic and only requires a POSIX compatible shell and `/usr/bin/env` to find that shell.

### NT

The ruleset has a Batch implementation on Windows, so it does not require Bash.

A binary Windows launcher is created by compiling [C# code][launcher-cs] with the .NET `csc`. This is provided by the base install of Windows.

The `toolchain_test` uses the `FC.exe` binary to compare `stdout`/`stderr` of toolchain binaries. This is provided in the base install of Windows.

Effectively, the ruleset is hermetic.

[launcher-cs]: toolchain/launcher/launcher.cs
[quirk]: https://github.com/bazelbuild/bazel/issues/14009