fastverk

proven systems, built fast — the hermetic software works.

A vertically-integrated, Bazel-native platform for complex, multi-modal software, and the rules_* constellation it’s built on — every module one concern, composed into hermetic, reproducible builds.

Managed cloud RBE + cache, a WireGuard mesh, and source hosting compose on top: the build, the network, the source, and the machine — reproducible and verified, end to end.

Where to go next

  • The platform — the products: fvkit, the desktop app, the brand, and the turnkey AWS install.
  • The constellation — the rules_* registry: one module per concern, composed into hermetic builds.
  • Quick start — wire the registry into your own Bazel build.
  • Philosophy — Bazel-native, hermetic, honest about gaps.

The platform

The products that sit on top of the constellation.

  • fvkit (core/runtime): the platform library + the fvd daemon (volumes, bazelrc, connections, maintenance, updater)
  • fastverk-app (the macOS desktop app): a menu-bar control plane for the works
  • brand (the visual identity): one parametric source for the mark, icons, brandbook, decks, and this docs theme

Turnkey AWS install

The server side installs into your own AWS account in one launch — a single CloudFormation stack brings up the cluster, nodes, TLS, and DNS, reachable at your own domain with no manual steps. The plugin runtime (gRPC-service plugins, discovered and routed like QueryRPC) lets features ship as narrowly-scoped repos without forking the core.

Keyless workload identity

CI gets a short-lived fastverk credential by exchanging its own OIDC token (no shared secrets): the workload-identity broker verifies a GitHub Actions token and mints a scoped, Cognito-backed token — the machine-identity sibling of the interactive login.

The constellation

The rules_* registry — one module per concern, composed into hermetic builds. Each lands in the fastverk bazel-registry; per-module API reference (stardoc-generated) and this catalog are kept current by the nightly rebuild.

Categories

  • Language toolchains: rules_uv, rules_lean, rules_postgres, rules_autoconf
  • API + schema: rules_jsonschema, rules_openapi, rules_aip
  • Web + bundlers: rules_bun, rules_chrome, rules_nextjs, rules_storybook, rules_vite, rules_docker
  • CI + cloud: rules_github, rules_gitlab, rules_ci, rules_cloudformation, rules_helm
  • Docs + publishing: rules_mdbook, rules_tectonic, rules_readme, rules_markdown
  • Semantic web: rules_jena, rules_rdf, rules_schema_org

Modules

The full catalog, generated from the registry by tools/harvest-catalog.sh (nightly + on repository_dispatch), so new modules + releases appear here automatically. See each module’s API reference (under Reference) for setup.

| Module | Latest | Source |
| --- | --- | --- |
| botnoc | 0.1.0 | fastverk/botnoc |
| brand | 0.3.1 | fastverk/brand |
| buildbarn | 0.0.2 | fastverk/buildbarn |
| fastverk-app | 0.0.2 | fastverk/fastverk-app |
| forge | 0.0.1 | fastverk/forge |
| fvkit | 0.0.6 | fastverk/fvkit |
| meridian | 0.2.2 | mattmarshall/meridian |
| pinax | 0.1.0 | mattmarshall/pinax |
| rules_agentic_ide | 0.0.4 | fastverk/rules_agentic_ide |
| rules_aip | 0.2.2 | fastverk/rules_aip |
| rules_autoconf | 0.1.0 | fastverk/rules_autoconf |
| rules_beam | 0.0.2 | fastverk/rules_beam |
| rules_bibtex | 0.0.6 | fastverk/rules_bibtex |
| rules_bun | 0.4.0 | fastverk/rules_bun |
| rules_cc_cross | 0.1.0 | fastverk/rules_cc_cross |
| rules_cc_host | 0.1.0 | fastverk/rules_cc_host |
| rules_chrome | 0.1.0 | fastverk/rules_chrome |
| rules_ci_ir | 0.0.1 | fastverk/rules_ci_ir |
| rules_cloudformation | 0.8.0 | fastverk/rules_cloudformation |
| rules_docker | 0.2.6 | fastverk/rules_docker_compose |
| rules_eslint | 0.1.0 | fastverk/rules_eslint |
| rules_fastverk | 0.0.2 | fastverk/rules_fastverk |
| rules_github | 0.1.2 | fastverk/rules_github |
| rules_gitlab | 0.3.2 | fastverk/rules_gitlab |
| rules_graphviz | 0.1.0 | fastverk/rules_graphviz |
| rules_helm | 0.1.0 | fastverk/rules_helm |
| rules_huggingface | 0.0.3 | fastverk/rules_huggingface |
| rules_jena | 0.3.2 | fastverk/rules_jena |
| rules_jsonschema | 0.3.0 | fastverk/rules_jsonschema |
| rules_lang | 0.4.0 | fastverk/rules_lang |
| rules_lean | 0.5.3 | fastverk/rules_lean |
| rules_lora | 0.1.3 | fastverk/rules_lora |
| rules_macvm | 0.0.1 | fastverk/rules_macvm |
| rules_markdown | 0.0.3 | fastverk/rules_markdown |
| rules_mdbook | 0.3.1 | fastverk/rules_mdbook |
| rules_meridian | 0.2.1 | mattmarshall/meridian |
| rules_meson | 0.0.1 | fastverk/rules_meson |
| rules_nextjs | 0.3.0 | fastverk/rules_nextjs |
| rules_openapi | 0.2.1 | fastverk/rules_openapi |
| rules_podman | 0.0.2 | fastverk/rules_podman |
| rules_postgres | 0.8.0 | fastverk/rules_postgres |
| rules_puml | 0.0.2 | fastverk/rules_puml |
| rules_rdf | 0.4.0 | fastverk/rules_rdf |
| rules_readme | 0.0.3 | fastverk/rules_readme |
| rules_runpod | 0.0.11 | fastverk/rules_runpod |
| rules_schema_org | 0.0.1 | fastverk/rules_schema_org |
| rules_spec | 0.5.1 | fastverk/rules_spec |
| rules_ssh_tui | 0.0.5 | fastverk/rules_ssh_tui |
| rules_storybook | 0.2.0 | fastverk/rules_storybook |
| rules_systemd | 0.0.1 | fastverk/rules_systemd |
| rules_tap | 0.0.3 | fastverk/rules_tap |
| rules_tectonic | 0.2.0 | fastverk/rules_tectonic |
| rules_uv | 0.7.4 | fastverk/rules_uv |
| rules_vite | 0.1.1 | fastverk/rules_vite |
| rules_vscode | 0.0.2 | fastverk/rules_vscode |
| rules_walkthrough | 0.1.0 | fastverk/rules_walkthrough |
| rules_web | 0.0.1 | fastverk/rules_web |
| rules_xsd | 0.0.1 | fastverk/rules_xsd |
| spec | 0.5.2 | fastverk/spec |
| vpn | 0.0.1 | fastverk/vpn |
| wave | 0.0.1 | fastverk/wave |

Registry split

fastverk and citizen-sh publish from separate registries:

  • fastverk modules: https://registry.fastverk.com/
  • citizen-sh modules: https://raw.githubusercontent.com/citizen-sh/bazel-registry/main/

Use the registry chain that matches the modules you consume.

Quick start

Wire the fastverk registry into your Bazel build, then add the modules you need.

1. Add the registry chain

.bazelrc:

common --registry=https://registry.fastverk.com/
common --registry=https://bcr.bazel.build/

2. Depend on modules

MODULE.bazel:

bazel_dep(name = "rules_uv", version = "0.7.4")
bazel_dep(name = "rules_cloudformation", version = "0.8.0")
# … etc.

See each module’s API reference for module-specific setup (toolchains, extensions, use_repo).

3. Build

bazel build //...

A fresh checkout / CI resolves every fastverk module from the registry — no git submodules, no local_path_override. To hack on one locally, override it ad hoc:

bazel build //... --override_module=<name>=path/to/<name>

Philosophy

  • Bazel-native first. Cross-module workflows are expressible as Bazel targets, not out-of-band scripts.
  • Hermetic by default. Each module either pins its upstream artifact’s sha256 + extracts deterministically, or vendors a source tarball with the same. Host-tool dependencies are limited to OS-provided utilities that don’t drift.
  • Honest about gaps. Modules ship at 0.0.x with explicit “no smoke” labels when not yet verified end-to-end. We don’t pretend.
  • One thing per module. Splitting beats coupling.

Contributing

Each module has its own issues + PRs. For org-wide coordination (cross-module bumps, registry-tier moves, agent dispatch), botnoc — the bot-driven Network Operations Center — is the entry point. botnoc renders the module catalog above and orchestrates work across the constellation.

rules_mdbook

API reference, generated from the module’s .bzl docstrings (stardoc).


User-facing Bazel rules for rules_mdbook.

Exports mdbook_book, which runs mdbook build over a staged source tree and packages the rendered HTML into a tarball. Optional plugin executables (e.g. mdbook-mermaid) are staged onto PATH so mdbook can resolve them by their bare names.

Targets returning MdbookSiteInfo expose the site tarball programmatically so future rules (a deploy step, a link checker, an mdbook serve wrapper) can consume the output without re-running mdbook.

mdbook_book

load("@rules_mdbook//mdbook:defs.bzl", "mdbook_book")

mdbook_book(name, srcs, out, book_toml, plugins, src_strip_prefix)

Run mdbook build over a staged source tree and produce an HTML tarball.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required | |
| srcs | All source files (Markdown, SUMMARY.md, theme assets, etc.). Each file is staged at its package-relative path minus src_strip_prefix. A directory (tree artifact produced by an upstream rule) is copied recursively into its computed relative path, so a rule that stages a generated chapter tree can feed it here directly. | List of labels | required | |
| out | The rendered site, packaged as a .tar.gz. | Label | required | |
| book_toml | The mdbook configuration file. Staged at the root of the build sandbox. | Label | required | |
| plugins | mdbook plugin executables (e.g. @mdbook_mermaid//:mdbook-mermaid). Staged onto PATH so mdbook can resolve them by bare name. | List of labels | optional | [] |
| src_strip_prefix | Prefix to strip from each src’s package-relative path before staging. Empty means files land at their package-relative paths. | String | optional | "" |
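
The staging rule described for srcs and src_strip_prefix can be pictured with a small Python sketch. This is not the rule's implementation: staged_path is a hypothetical helper, and the behaviour for files that sit outside the prefix is an assumption made only for illustration.

```python
from pathlib import PurePosixPath


def staged_path(package_relative: str, src_strip_prefix: str = "") -> str:
    """Return where a source file would land in the staged tree (hypothetical helper)."""
    path = PurePosixPath(package_relative)
    if not src_strip_prefix:
        # Empty prefix: the file lands at its package-relative path.
        return str(path)
    try:
        return str(path.relative_to(PurePosixPath(src_strip_prefix)))
    except ValueError:
        # Assumption: a file outside the prefix keeps its package-relative path.
        return str(path)


print(staged_path("docs/src/SUMMARY.md", "docs"))  # src/SUMMARY.md
print(staged_path("book/intro.md"))                # book/intro.md
```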

mdbook_serve

load("@rules_mdbook//mdbook:defs.bzl", "mdbook_serve")

mdbook_serve(name, plugins)

Run mdbook serve (with watch + live reload) against the live user source tree under $BUILD_WORKSPACE_DIRECTORY/<package>. Invoke via bazel run //path/to:target. The target’s package directory must contain the book.toml; mdbook’s own watch picks up edits without Bazel re-running.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required | |
| plugins | mdbook plugin executables, staged onto PATH so mdbook resolves them by bare name. Match the plugins listed in your book.toml. | List of labels | optional | [] |

MdbookSiteInfo

load("@rules_mdbook//mdbook:defs.bzl", "MdbookSiteInfo")

MdbookSiteInfo(tarball)

A rendered mdbook site.

FIELDS

| Name | Description |
| --- | --- |
| tarball | File: the gzipped tar of the rendered HTML tree. |

Module extension for rules_mdbook.

Auto-fetches prebuilt mdbook + mdbook-mermaid binaries for the host platform. Versions are pinned by sha256 in private/known_versions.bzl. Consumers can override the version per tool via the toolchain tag class.

Default usage (pulls the default-pinned mdbook + mdbook-mermaid):

mdbook = use_extension("@rules_mdbook//mdbook:extensions.bzl", "mdbook")
use_repo(mdbook, "mdbook", "mdbook_mermaid")

Pin a specific version:

mdbook = use_extension("@rules_mdbook//mdbook:extensions.bzl", "mdbook")
mdbook.toolchain(mdbook_version = "0.5.2", mermaid_version = "0.17.0")
use_repo(mdbook, "mdbook", "mdbook_mermaid")

Release fetching is delegated to @rules_github//github:repositories.bzl%github_binary_repository so all our rules_* repos share one URL-shape + sha-pinning impl.

mdbook

mdbook = use_extension("@rules_mdbook//mdbook:extensions.bzl", "mdbook")
mdbook.toolchain(mdbook_version, mermaid_version)

Sets up @mdbook and @mdbook_mermaid as Bazel-fetched prebuilt binaries.

TAG CLASSES

toolchain

Attributes

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| mdbook_version | Override mdbook version. Defaults to the value in known_versions.bzl. | String | optional | "" |
| mermaid_version | Override mdbook-mermaid version. Defaults to the value in known_versions.bzl. | String | optional | "" |

Toolchain rule for rules_mdbook.

mdbook_toolchain wraps a single mdbook binary as a Bazel toolchain. Consumers (the mdbook_book and mdbook_serve rules) resolve mdbook through @rules_mdbook//mdbook:toolchain_type, so users can register custom mdbook binaries (locally-built fork, alternate version, …) via register_toolchains(...) without modifying rule attributes.

The module extension at @rules_mdbook//mdbook:extensions.bzl generates a default toolchain (@mdbook//:mdbook_toolchain_def) wrapping the prebuilt binary. Users register it from their MODULE.bazel:

register_toolchains("@mdbook//:mdbook_toolchain_def")

mdbook_toolchain

load("@rules_mdbook//mdbook:toolchains.bzl", "mdbook_toolchain")

mdbook_toolchain(name, mdbook)

Declare an mdbook binary as a Bazel toolchain.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required | |
| mdbook | Path to the mdbook executable. | Label | required | |

MdbookToolchainInfo

load("@rules_mdbook//mdbook:toolchains.bzl", "MdbookToolchainInfo")

MdbookToolchainInfo(mdbook)

The mdbook binary, resolved via a toolchain.

FIELDS

| Name | Description |
| --- | --- |
| mdbook | File: the mdbook executable. |

rules_cloudformation

API reference, generated from the module’s .bzl docstrings (stardoc).


rules_cloudformation roadmap

Three milestones to first useful release. Numbering matches the rules_docker_compose cadence: v0.1 = schema-derived primitives, v0.2 = hand-written orchestration, v0.3 = deploy wrappers + linter.

v0.1 — schema fetch + codegen

Get the schema into the repo as Bazel-fetched data, run codegen, ship the first typed rule end-to-end.

  • Schema fetch. cloudformation/private/extensions.bzl defines an http_archive-backed module extension pinning aws-cloudformation/cloudformation-template-schema to a specific commit + sha256. Same shape as rules_docker_compose’s compose_spec_extension, except http_archive (not http_file) because the upstream packages the schema as part of a Maven build, not a single JSON file — see docs/SCHEMA_SOURCE.md.

  • MODULE.bazel wires rules_jsonschema. Adds bazel_dep(name = "rules_jsonschema", version = "0.2.0") and a use_extension block consuming the schema repo.

  • Codegen pipeline. A single jsonschema_starlark_codegen invocation reads the master Schema.template and emits cloudformation/cloudformation_rules.bzl — one rule() per AWS::* definition. Estimated ~1000+ rules (e.g. cloudformation_aws_s3_bucket, cloudformation_aws_lambda_function, cloudformation_aws_ec2_instance, cloudformation_aws_iam_role, cloudformation_aws_dynamodb_table, …). The committed .bzl is diff-tested against fresh codegen on every CI build, exactly like compose_rules.bzl in rules_docker_compose.

  • Smoke test. One end-to-end example: a single cloudformation_aws_s3_bucket target rendered through a placeholder aggregator into golden YAML. Validates that the schema-fetch → codegen → typed-attr → JSON-shard → YAML pipeline works for at least one resource type before the v0.2 aggregator arrives.

v0.2 — hand-written orchestration

Replace the placeholder aggregator with the real graph-walking implementation, plus cross-stack ref resolution.

  • cloudformation_stack. Aggregator rule. Collects shards from deps, validates the Ref graph (every logical ID referenced is defined or has a matching cloudformation_resource_ref), and renders one canonical template.yaml via a Rust cfn-gen binary. Same shape as docker_compose: shard JSON → typed struct (from rules_jsonschema’s jsonschema_rust_library) → canonical YAML. Stable key ordering so re-renders are byte-identical.

  • cloudformation_resource_ref. Cross-stack reference resolver. Given a target stack label and an output name, resolves to the exported value at build time (via either a checked-in outputs.json or a stack-output index file), then rewrites a property of a named resource in the rendered template to that value. Same role as docker_compose_oci_image_ref: a build-time override that turns a symbolic reference into a concrete pinned value before deploy.

  • Providers. CloudformationResourceInfo, CloudformationStackInfo, CloudformationResourceRefInfo.
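
The Ref-graph check described for cloudformation_stack can be sketched in a few lines of Python. This is an illustration of the idea only: the real aggregator is described as a Rust cfn-gen binary, and undefined_refs and its external_refs parameter (standing in for IDs covered by a cloudformation_resource_ref) are hypothetical names.

```python
def undefined_refs(resources: dict, external_refs=frozenset()) -> list:
    """Return sorted logical IDs used in {"Ref": ...} that nothing defines."""

    def walk(node):
        if isinstance(node, dict):
            if set(node) == {"Ref"} and isinstance(node["Ref"], str):
                yield node["Ref"]
            else:
                for value in node.values():
                    yield from walk(value)
        elif isinstance(node, list):
            for item in node:
                yield from walk(item)

    defined = set(resources) | set(external_refs)
    return sorted(
        ref
        for ref in set(walk(resources))
        # Pseudo parameters such as AWS::Region are always resolvable.
        if ref not in defined and not ref.startswith("AWS::")
    )


template = {
    "Bucket": {"Type": "AWS::S3::Bucket", "Properties": {}},
    "Policy": {
        "Type": "AWS::S3::BucketPolicy",
        "Properties": {"Bucket": {"Ref": "Bucket"}, "Role": {"Ref": "DeployRole"}},
    },
}
print(undefined_refs(template))                  # ['DeployRole']
print(undefined_refs(template, {"DeployRole"}))  # []
```

A non-empty result would correspond to a failed build in the aggregator described above.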

v0.3 — deploy + lint

Ship the runtime wrappers and the Java-based linter.

  • cloudformation_up. bazel run wrapper that invokes aws cloudformation deploy --template-file <rendered> --stack-name <stack> against the rendered template, with --parameter-overrides flowing through from rule attrs. Same shape as docker_compose_up’s bazel run wrapper.

  • cloudformation_down. bazel run wrapper for aws cloudformation delete-stack. Mirrors docker_compose_down.

  • Java linter. Port of cfn-lint–style rules built with rules_java. Why Java: the upstream schema repo is a Maven project whose intrinsic-function and reference tables already exist in Java — reusing them is cheaper than reimplementing. Packaged as a java_binary invoked from a cloudformation_lint_test rule that runs against every cloudformation_stack.


Schema source

Where rules_cloudformation’s typed rules ultimately come from.

Choice (v0.1)

We run the upstream Java assembler (aws.cfn.codegen.json.Main from aws-cloudformation/cloudformation-template-schema) at build time against a sha-pinned snapshot of the AWS CloudFormation Resource Specification. The assembler emits one JSON Schema per resource group; we feed the storage group’s output (scoped to AWS::S3.* in v0.1) through rules_jsonschema’s jsonschema_starlark_codegen to produce the typed Bazel rules.

Two artifacts are pinned in cloudformation/private/extensions.bzl:

  • aws-cloudformation/cloudformation-template-schema at commit 5d7815b14fd533c15c30f9046a76cdcb89afd32a (sha256 7f40b919bbea6109244903744262074f6afa32fdd780a6dca0540ef1b57bd774). Fetched but not on the compile path — see the Lombok wrinkle section below. Vendored under cloudformation/private/assembler_src/ in delomboked form.
  • The us-east-1 CloudFormationResourceSpecification.json at sha256 3bf0f8b5034b51c622da82f7cec9499112a40719f28fff5c6d2050a0c3a24459. Endpoint: https://d1uauaxba7bl26.cloudfront.net/latest/CloudFormationResourceSpecification.json.
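
Pinning by sha256 amounts to refusing any bytes whose digest differs from the recorded one. The sketch below shows that check in Python, assuming a hypothetical check_pin helper; Bazel performs the equivalent check itself when it fetches a pinned artifact. The example digest is the well-known SHA-256 of empty input, not one of the pins listed above.

```python
import hashlib

# SHA-256 of zero bytes, used here only as a stand-in for a real pin.
EMPTY_SHA256 = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"


def check_pin(data: bytes, pinned_sha256: str) -> None:
    """Raise ValueError unless data hashes to the pinned digest (hypothetical helper)."""
    actual = hashlib.sha256(data).hexdigest()
    if actual != pinned_sha256.lower():
        raise ValueError(
            "sha256 mismatch: expected {}, got {}".format(pinned_sha256, actual)
        )


check_pin(b"", EMPTY_SHA256)  # passes silently
try:
    check_pin(b"drifted upstream content", EMPTY_SHA256)
except ValueError as err:
    print(err)
```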

How the build composes

@cfn_resource_spec//file:CloudFormationResourceSpecification.json
                          │
                          ▼
                  //cloudformation:assembled_storage   (cfn_assemble)
                          │
                          │  storage-spec.json (JSON Schema, ~280 KB,
                          │   223 AWS::S3.* + Tag definitions)
                          ▼
              //cloudformation:aws_s3_bucket_gen        (jsonschema_starlark_codegen)
                          │
                          ▼
                  aws_s3_bucket.bzl                     (committed, diff_test-gated)

cfn_assemble synthesizes a YAML config that points the assembler at the local pinned spec (the upstream bundled config.yml has all 25 region URLs hard-coded to the AWS CDN, which would defeat build-time reproducibility), narrows the region set to us-east-1 (the source-of-truth region), and declares a single custom group with the requested includes/excludes.

Lombok wrinkle

The upstream sources use Lombok 1.16.22 (released 2018). That release predates JDK 21+. The current Lombok release line (1.18.x) fails to initialize under JDK 25 with com.sun.tools.javac.code.TypeTag :: UNKNOWN, and Bazel 9.1.0’s rules_java toolchain runs the JavaBuilder on remotejdk25 by default without an easy override.

After trying the listed fallbacks (bump Lombok, pin --java_runtime_version=remotejdk_21, pin Lombok 1.18.36), none of which sidestepped the issue because the JavaBuilder process itself runs on remotejdk25, we took the documented nuclear option: ran lombok.jar delombok against the upstream sources locally (java -jar lombok.jar delombok src/main/java -d cloudformation/private/assembler_src), stripped the @lombok.Generated annotations the delomboker leaves on each generated method, and committed the result.

Trade-off: refreshing the assembler from a newer upstream commit isn’t a one-line bump anymore — it’s a delombok + commit. In exchange, the build has no annotation-processor at compile time and no Lombok runtime dep, so it stays buildable on whichever JDK Bazel ships with going forward.

The patched upstream Codegen has one rules_cloudformation-local fix: newer CFN spec entries can have Type: Json with no PrimitiveType set, which the upstream code treats as a primitive but then NPEs on. The patch in Codegen.addPrimitiveType falls back to “Json” when the primitive name is null.

Known gap: registry-only resources

The legacy Resource Specification we pin covers ~1582 of the ~1600+ types AWS publishes. A handful of newer types (post-2023 additions, e.g. AWS::EC2::Image, AWS::EC2::SnapshotBlockPublicAccess) only ship via the newer CloudFormation Registry schema source (per-resource JSON files at schema.cloudformation.us-east-1.amazonaws.com/) and never landed in the legacy spec. Surfacing them would mean pulling from the Registry endpoint as a second source: same per-resource-file shape as the v0.0.1 source, but only for the resources the legacy spec is missing. v0.7+ work item; not on the current roadmap because demand is low (savvi-ops, the design’s stress test, hits ~1 of 87 in-use AWS types as a registry-only).

Alternatives considered

| Source | Why not chosen |
| --- | --- |
| Per-resource AWS endpoint (schema.cloudformation.us-east-1.amazonaws.com/<resource>.json) | The v0.0.1 / first-cut v0.1 used this. It works but it’s a per-resource fetch (1200+ URLs to track for full coverage) and the schema content is the AWS resource-provider schema, which is divergent from the CloudFormation Resource Specification. Pivoting now keeps the same source-of-truth as cfn-lint and the CFN Linter docs. |
| aws-cloudformation/cloudformation-cli registry schemas | Same per-resource shape, different repository. No advantage. |
| Hand-curated subset | rules_jsonschema’s whole point is avoiding drift between hand-written rules and upstream. Hard-no. |

Refreshing

Three independent bumps:

  1. CFN Resource Specification (typical: track AWS-published spec versions):

    curl -fsSL https://d1uauaxba7bl26.cloudfront.net/latest/CloudFormationResourceSpecification.json | shasum -a 256
    # bump _RESOURCE_SPEC_SHA256 in cloudformation/private/extensions.bzl
    bazel run //cloudformation:update
    
  2. Upstream assembler source (rare: only when upstream changes how groups are computed or fixes a Codegen bug):

    # Compute the new tarball hash
    curl -fsSL https://github.com/aws-cloudformation/cloudformation-template-schema/archive/<commit>.tar.gz | shasum -a 256
    # Re-delombok + commit
    curl -fsSL https://projectlombok.org/downloads/lombok-1.18.36.jar -o /tmp/lombok.jar
    java -jar /tmp/lombok.jar delombok \
         <unpacked-src>/src/main/java \
         -d cloudformation/private/assembler_src
    find cloudformation/private/assembler_src -name '*.java' -exec sed -i '' 's/@lombok\.Generated//g' {} +
    # Bump _TEMPLATE_SCHEMA_COMMIT + _TEMPLATE_SCHEMA_SHA256 in extensions.bzl
    bazel run //cloudformation:update
    
  3. Maven deps (rare: only when upstream pom.xml shifts):

    # Edit MODULE.bazel's maven.install(artifacts=[...]) list
    REPIN=1 bazel run @cfn_assembler_maven//:pin
    

Path to ~1200 resource types

v0.1 covers AWS::S3::Bucket as a codegen smoke. v0.2 lifts the hard-coded resource set into a tag class:

cfn_resources = use_extension(
    "@rules_cloudformation//cloudformation/private:extensions.bzl",
    "cfn_sources_extension",
)
cfn_resources.bundle(
    name = "storage",
    includes = ["AWS::S3.*", "AWS::DynamoDB.*"],
)
cfn_resources.bundle(
    name = "compute",
    includes = ["AWS::EC2.*", "AWS::Lambda.*"],
)

so consumers opt into the resource set they care about — declaring 1200 typed Bazel rules per consumer when they use 10 is wasted analysis time. Bundling lands in v0.2 (see ROADMAP.md).
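
How a bundle narrows the resource set can be sketched as a filter over type names. The Python below is only an illustration: select_types is a hypothetical helper, and treating includes/excludes as regular expressions that must match the whole type name is an assumption about the assembler's matching, not something the text above specifies.

```python
import re


def select_types(all_types, includes, excludes=()):
    """Keep types matching any include pattern and no exclude pattern."""

    def matches(patterns, name):
        # Assumption: patterns are regexes matched against the full type name.
        return any(re.fullmatch(pattern, name) for pattern in patterns)

    return sorted(
        name
        for name in all_types
        if matches(includes, name) and not matches(excludes, name)
    )


known = [
    "AWS::S3::Bucket",
    "AWS::S3::BucketPolicy",
    "AWS::DynamoDB::Table",
    "AWS::EC2::Instance",
]
print(select_types(known, ["AWS::S3.*", "AWS::DynamoDB.*"]))
# ['AWS::DynamoDB::Table', 'AWS::S3::Bucket', 'AWS::S3::BucketPolicy']
```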

rules_gitlab

API reference, generated from the module’s .bzl docstrings (stardoc).


Public Bazel rules for working with GitLab CI configuration.

Today (v0.1.0):

  • gitlab_ci_validate(name, src): build-action rule. Validates a .gitlab-ci.yml against the official GitLab JSON Schema pinned by sha via the gitlab_schemas module extension. Hermetic; no network, no auth.
  • gitlab_ci_lint(name, src, host, repo): bazel run-able target. Wraps glab ci lint <src> via the glab toolchain. Hits the GitLab API for full pipeline validation (semantic checks beyond pure schema + include: resolution). Requires glab auth login to the target instance.

Future surface:

  • gitlab_ci_lint_remote(name, src, project) — call /api/v4/projects/:id/ci/lint directly (no glab CLI indirection), bake the project context.
  • Deploy + registry helpers, schema-derived typed Starlark rules for authoring .gitlab-ci.yml from Bazel (mirroring the rules_jsonschema + rules_cloudformation pattern).

Limitations of gitlab_ci_validate:

  • Does not follow include: directives. A .gitlab-ci.yml that imports another project’s snippets is validated only at its own leaf level; chain validations on the included files by registering each as a separate gitlab_ci_validate target. gitlab_ci_lint handles includes server-side.

gitlab_ci_lint

load("@rules_gitlab//gitlab:defs.bzl", "gitlab_ci_lint")

gitlab_ci_lint(name, src, host, repo)

bazel run-able target that lints a .gitlab-ci.yml via glab ci lint. Network-bound: hits the GitLab API, requires the user to be glab auth login-ed to the target instance. For hermetic schema-only validation, use gitlab_ci_validate instead.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required | |
| src | Label of the .gitlab-ci.yml (or fragment) to lint. | Label | required | |
| host | GitLab host (e.g. gitlab.savvifi.com). Used to anchor glab’s API target when the runfiles cwd doesn’t have a gitlab remote. Ignored if repo is set (which carries host). | String | optional | "" |
| repo | OWNER/REPO or full URL passed as glab -R. Strongly recommended: lets glab pick the right GitLab instance + project context without inspecting the sandbox’s git state. | String | optional | "" |

gitlab_ci_validate

load("@rules_gitlab//gitlab:defs.bzl", "gitlab_ci_validate")

gitlab_ci_validate(name, src)

Validate a .gitlab-ci.yml against the official GitLab JSON Schema (pinned by sha256 via the gitlab_schemas module extension). Output: a stamp file Bazel checks for caching; on schema violation the build fails with check-jsonschema’s diagnostic on stderr.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required | |
| src | Label of the .gitlab-ci.yml (or sibling fragment) to validate. | Label | required | |

gitlab_ci

load("@rules_gitlab//gitlab:defs.bzl", "gitlab_ci")

gitlab_ci(name, stages, variables, default, image, include, workflow, jobs, extra, out, write_to,
          validate, **kwargs)

Generate a .gitlab-ci.yml from a typed Starlark spec.

Assembles the spec in a fixed top-level order (include, workflow, default, image, stages, variables, jobs sorted by name, extra), emits it deterministically as YAML, and (by default) schema-validates the result. Set write_to (e.g. ".gitlab-ci.yml") to also create <name>.update: bazel run …:<name>.update writes the file into the source tree; bazel test …:<name>.update checks it is up to date.

PARAMETERS

| Name | Description | Default Value |
| --- | --- | --- |
| name | target name. | none |
| stages | list of stage names (order preserved). | [] |
| variables | global CI variables (dict). | {} |
| default | the default: job-config block (dict). | None |
| image | top-level default image (str or dict). | None |
| include | include: entries (list). | None |
| workflow | the workflow: block (dict). | None |
| jobs | dict of job-name -> job (a gitlab_job(...) dict or a raw dict). | {} |
| extra | escape hatch: raw dict merged at the top level last. | {} |
| out | output filename; defaults to <name>.gitlab-ci.yml. | None |
| write_to | source-relative path to also create <name>.update. | None |
| validate | chain gitlab_ci_validate on the generated file (default True). | True |
| kwargs | forwarded to the underlying rule (visibility, tags, …). | none |
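
The fixed top-level order is what makes the emitted file stable across runs. A minimal Python sketch of that assembly step follows; assemble_spec is a hypothetical stand-in for the macro's internals, and dropping unset or empty values is an assumption.

```python
# Top-level keys in the order gitlab_ci is documented to emit them.
TOP_LEVEL_ORDER = ("include", "workflow", "default", "image", "stages", "variables")


def assemble_spec(jobs=None, extra=None, **top_level):
    """Build the spec dict in a deterministic key order (hypothetical helper)."""
    spec = {}
    for key in TOP_LEVEL_ORDER:
        value = top_level.get(key)
        if value:  # assumption: unset or empty values are omitted
            spec[key] = value
    for job_name in sorted(jobs or {}):  # jobs sorted by name
        spec[job_name] = jobs[job_name]
    spec.update(extra or {})  # escape hatch, merged last
    return spec


spec = assemble_spec(
    stages=["build", "test"],
    variables={"CI_DEBUG": "0"},
    include=[{"local": "ci/base.yml"}],
    jobs={"test": {"script": ["make test"]}, "build": {"script": ["make"]}},
    extra={"pages": {"script": ["make docs"]}},
)
print(list(spec))
# ['include', 'stages', 'variables', 'build', 'test', 'pages']
```

Because the key order never depends on how the caller wrote the arguments, two builds of the same spec produce byte-identical YAML.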

gitlab_job

load("@rules_gitlab//gitlab:defs.bzl", "gitlab_job")

gitlab_job(stage, script, image, services, before_script, after_script, rules, needs, artifacts,
           variables, cache, tags, environment, when, allow_failure, interruptible, timeout, retry,
           parallel, coverage, extends, dependencies, extra)

Build one GitLab CI job as a None-stripped, key-ordered dict.

Returns a plain dict (Starlark structs aren’t json.encode-able), so pass the result as a value in gitlab_ci(jobs = {...}). Any key not modeled here can be supplied via extra (a raw dict, merged last).

PARAMETERS

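| Name | Description | Default Value |
| --- | --- | --- |
| stage | - | None |
| script | - | None |
| image | - | None |
| services | - | None |
| before_script | - | None |
| after_script | - | None |
| rules | - | None |
| needs | - | None |
| artifacts | - | None |
| variables | - | None |
| cache | - | None |
| tags | - | None |
| environment | - | None |
| when | - | None |
| allow_failure | - | None |
| interruptible | - | None |
| timeout | - | None |
| retry | - | None |
| parallel | - | None |
| coverage | - | None |
| extends | - | None |
| dependencies | - | None |
| extra | - | {} |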
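
"None-stripped, key-ordered" can be made concrete with a short Python sketch. It is not the Starlark source: gitlab_job_sketch is a hypothetical name, and using the signature order as the emitted key order is an assumption.

```python
import json

# Assumption: keys are emitted in the order of the documented signature.
JOB_KEY_ORDER = (
    "stage", "script", "image", "services", "before_script", "after_script",
    "rules", "needs", "artifacts", "variables", "cache", "tags", "environment",
    "when", "allow_failure", "interruptible", "timeout", "retry", "parallel",
    "coverage", "extends", "dependencies",
)


def gitlab_job_sketch(extra=None, **fields):
    """Return a plain dict with None values dropped and keys in a fixed order."""
    unknown = sorted(set(fields) - set(JOB_KEY_ORDER))
    if unknown:
        raise TypeError("unmodeled keys (pass them via extra): {}".format(unknown))
    job = {key: fields[key] for key in JOB_KEY_ORDER if fields.get(key) is not None}
    job.update(extra or {})  # raw dict, merged last
    return job


job = gitlab_job_sketch(
    script=["make test"], stage="test", when=None, extra={"resource_group": "ci"}
)
print(json.dumps(job))
# {"stage": "test", "script": ["make test"], "resource_group": "ci"}
```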

gitlab_reference

load("@rules_gitlab//gitlab:defs.bzl", "gitlab_reference")

gitlab_reference(*parts)

Emit a GitLab !reference [job, key, ...] tag value.

Usable as a value anywhere in a spec; survives json.encode as a sentinel the emitter turns back into a real !reference YAML tag.

PARAMETERS

| Name | Description | Default Value |
| --- | --- | --- |
| parts | - | none |
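
The sentinel round trip can be illustrated in Python. Everything below is a sketch under assumptions: the sentinel key name, the reference and emit_inline helpers, and the flow-style rendering are all hypothetical, chosen only to show how a value can pass through JSON and come back out as a !reference tag.

```python
import json

_SENTINEL_KEY = "__gitlab_reference__"  # hypothetical sentinel key


def reference(*parts):
    """JSON-encodable stand-in for a !reference [job, key, ...] tag."""
    return {_SENTINEL_KEY: list(parts)}


def emit_inline(value):
    """Render a JSON-decoded value as flow-style YAML, restoring !reference tags."""
    if isinstance(value, dict) and set(value) == {_SENTINEL_KEY}:
        return "!reference [" + ", ".join(value[_SENTINEL_KEY]) + "]"
    if isinstance(value, list):
        return "[" + ", ".join(emit_inline(item) for item in value) + "]"
    if isinstance(value, dict):
        pairs = (json.dumps(k) + ": " + emit_inline(v) for k, v in value.items())
        return "{" + ", ".join(pairs) + "}"
    return json.dumps(value)  # JSON scalars are valid YAML scalars


script = [reference(".setup", "script"), "make test"]
round_tripped = json.loads(json.dumps(script))  # the sentinel survives encoding
print(emit_inline(round_tripped))
# [!reference [.setup, script], "make test"]
```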

rules_jsonschema

API reference, generated from the module’s .bzl docstrings (stardoc).


RFC-001 — Codegen Plugin Protocol

Status: draft, revised. Captures the architecture pivot from “Rust-binary-per-output-language” to “per-language plugins reading the schema directly via a minimal stdin/stdout contract”.

Earlier drafts of this RFC proposed a protoc-style architecture with a frontend, a parsed AST proto, and a dual ast / raw plugin mode. That design was abandoned because (a) JSON Schema is already JSON — every plugin language can parse it directly — and (b) most realistic plugins wrap upstream tools (typify, atombender/go-jsonschema, oapi-codegen, …) that have their own parsing anyway. The AST was a small spec language we’d be inventing for marginal benefit. See “Why we abandoned the AST” below for the full reasoning.

Goal

Decouple rules_jsonschema’s user-facing rules from a hardcoded codegen language. After this RFC lands, adding a new output language is:

  1. Write a plugin binary in that language so it leverages native AST tooling — go/format for Go, quote/syn for Rust, ts-morph for TypeScript.
  2. Register a jsonschema_codegen_toolchain pointing at it.
  3. Add a jsonschema_<lang>_library user-facing rule that wraps the target language’s *_library Bazel rule.

The plugin reads the schema bytes from stdin, options from argv, writes the generated file content to stdout, and signals errors via stderr + exit code. No protobuf dep, no AST proto, no frontend binary. Stdlib-only plugins are achievable in any language.

The contract

A plugin is any executable that conforms to:

INPUT
  stdin              the schema file contents (raw bytes)
  argv               --key=value pairs, repeated. Plugin-specific.
                     The rule may also pass standard flags it owns.

OUTPUT
  stdout             the generated file content (raw bytes)
  stderr             diagnostics / error messages

EXIT
  0                  success — stdout is the generated file
  non-zero           failure — stderr explains why

That’s it. A plugin in Go is:

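package main

import (
    "encoding/json"
    "fmt"
    "io"
    "os"
)

func main() {
    // The schema arrives as raw bytes on stdin.
    schemaBytes, err := io.ReadAll(os.Stdin)
    if err != nil {
        fmt.Fprintln(os.Stderr, "read:", err)
        os.Exit(1)
    }
    var schema map[string]any
    if err := json.Unmarshal(schemaBytes, &schema); err != nil {
        fmt.Fprintln(os.Stderr, "parse:", err)
        os.Exit(1)
    }
    // ... generate Go source from schema ...
    generated := "package generated\n" // placeholder for the real output
    os.Stdout.Write([]byte(generated))
}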

A plugin in Rust is the same thing with serde_json. A plugin in Python wraps json.load(sys.stdin.buffer). There is no contract-specific dep in any language.
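
For comparison, here is roughly what the Python variant could look like. It is a sketch, not a shipped plugin: generate is a toy generator invented for illustration, and main takes its streams as parameters so the example can run against in-memory buffers.

```python
import io
import json


def generate(schema: dict) -> str:
    """Toy generator (illustrative only): one constant per top-level property."""
    lines = ["# generated; do not edit"]
    for name in sorted(schema.get("properties", {})):
        lines.append("FIELD_{} = {!r}".format(name.upper(), name))
    return "\n".join(lines) + "\n"


def main(stdin, stdout, stderr) -> int:
    """Schema on stdin, generated file on stdout, diagnostics on stderr."""
    try:
        schema = json.load(stdin)
    except ValueError as err:  # covers json.JSONDecodeError
        print("parse:", err, file=stderr)
        return 1
    stdout.write(generate(schema))
    return 0


# A real plugin would end with:
#   sys.exit(main(sys.stdin.buffer, sys.stdout, sys.stderr))
out, err = io.StringIO(), io.StringIO()
status = main(io.StringIO('{"properties": {"name": {}, "age": {}}}'), out, err)
print(status)
print(out.getvalue(), end="")
```

Only the standard library is used, which matches the claim that stdlib-only plugins are achievable.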

Standard argv conventions

The rule passes a fixed set of flags every plugin receives, plus whatever the consumer set in options:

| Flag | Set by | Meaning |
| --- | --- | --- |
| --schema-name=NAME | rule | Original schema file basename (e.g. compose-spec.json). For error messages and stable codegen header comments. |
| --rule-name=NAME | rule | The Bazel target’s name. Useful for picking output identifiers. |
| --<consumer-flag>=VAL | consumer | Free-form per-plugin options from the rule attrs. |

Plugins should treat unknown flags as a hard error so misconfigured options don’t silently degrade output.
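
One way a plugin might enforce that rule is sketched below in Python. parse_flags and the example consumer flag "module" are hypothetical; only --schema-name and --rule-name come from the table above.

```python
STANDARD_FLAGS = frozenset({"schema-name", "rule-name"})


def parse_flags(argv, consumer_flags=frozenset()):
    """Parse --key=value pairs, rejecting anything the plugin does not know."""
    known = STANDARD_FLAGS | set(consumer_flags)
    options = {}
    for arg in argv:
        if not arg.startswith("--") or "=" not in arg:
            raise ValueError("expected --key=value, got {!r}".format(arg))
        key, value = arg[2:].split("=", 1)
        if key not in known:
            # Hard error so a misspelled option cannot silently degrade output.
            raise ValueError("unknown flag --{}".format(key))
        options[key] = value
    return options


argv = ["--schema-name=compose-spec.json", "--rule-name=compose", "--module=types"]
print(parse_flags(argv, {"module"}))
try:
    parse_flags(["--modul=types"], {"module"})
except ValueError as err:
    print(err)  # unknown flag --modul
```

A plugin's entry point would turn the ValueError into a message on stderr and a non-zero exit code, per the contract.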

Bazel output declaration

Bazel rules must declare their outputs at analysis time, before any action runs. Three real options were considered:

| Approach | Pros | Cons |
| --- | --- | --- |
| A. Single file per rule invocation | Output path known at analysis. Simple. Matches protoc-gen-go in practice. | Plugin authors can’t naturally split output. |
| B. declare_directory (tree artifact) | Plugin emits arbitrarily many files. | Downstream rust_library / go_library rules have to glob the directory or expand it. Awkward, non-standard. |
| C. Two-pass: pre-flight + emit | Plugin advertises outputs given a schema, then generates. | Two plugin invocations per build. Doubles action overhead. |

Decision: A. Plugin produces exactly one file (on stdout) per rule invocation. Multi-output needs (types vs validators, client vs server) split into separate rule targets:

jsonschema_go_types(name = "person_types", schema = "person.json")
jsonschema_go_validators(name = "person_validators", schema = "person.json")

Each target is independently cacheable; the build graph is clearer. Tree artifacts (B) remain available as an escape hatch for the rare genuinely-multi-file plugin.

Bazel rule shape

Each per-language user-facing rule has the same structure:

load("@bazel_skylib//lib:shell.bzl", "shell")

# Toolchain type label, following the <lang>_codegen_toolchain_type pattern.
_RUST_TOOLCHAIN = "@rules_jsonschema//jsonschema:rust_codegen_toolchain_type"

def _jsonschema_rust_codegen_impl(ctx):
    out = ctx.actions.declare_file(ctx.label.name + ".rs")
    tc = ctx.toolchains[_RUST_TOOLCHAIN].codegen_info

    # Standard flags every plugin receives.
    args = [
        "--schema-name=" + ctx.file.schema.basename,
        "--rule-name=" + ctx.label.name,
    ]
    # Plugin-specific options passed through from rule attrs.
    for k, v in ctx.attr.options.items():
        args.append("--{}={}".format(k, v))

    # Schema on stdin, generated file on stdout, per the plugin contract.
    ctx.actions.run_shell(
        inputs = [ctx.file.schema],
        outputs = [out],
        tools = [tc.binary],
        command = '{plugin} {args} < {schema} > {out}'.format(
            plugin = tc.binary.path,
            args = " ".join([shell.quote(a) for a in args]),
            schema = ctx.file.schema.path,
            out = out.path,
        ),
    )
    return [DefaultInfo(files = depset([out]))]
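To reproduce the action's command line outside Bazel (for example, when debugging a plugin by hand), the same composition can be sketched in Python. The function name build_command, the paths, and the derive option are hypothetical. Unlike the rule above, this sketch also quotes the three paths.

```python
import shlex


def build_command(plugin, schema, out, rule_name, options):
    """Compose the shell command: plugin flags, schema on stdin, output on stdout."""
    args = [
        "--schema-name=" + schema.rsplit("/", 1)[-1],  # basename of the schema path
        "--rule-name=" + rule_name,
    ]
    # Plugin-specific options become --key=value flags.
    for key, value in options.items():
        args.append("--{}={}".format(key, value))
    return "{plugin} {args} < {schema} > {out}".format(
        plugin=shlex.quote(plugin),
        args=" ".join(shlex.quote(a) for a in args),
        schema=shlex.quote(schema),
        out=shlex.quote(out),
    )
```

Quoting each flag separately matters once an option value contains spaces or shell metacharacters.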

The user-facing macro composes that codegen with the target language’s library rule:

def jsonschema_rust_library(name, schema, **kwargs):
    gen_name = name + "_rs_gen"
    _jsonschema_rust_codegen(name = gen_name, schema = schema)
    rust_library(
        name = name,
        srcs = [":" + gen_name],
        edition = "2021",
        deps = [...],
        **kwargs
    )

Same shape per language.

Why we abandoned the AST

The first draft of this RFC proposed a protoc-style architecture: a frontend parses the schema into a canonical AST proto, plugins consume that AST instead of raw bytes. After looking at it harder I think this was the wrong call. Reasons:

  1. The protoc analogy doesn’t transfer. protoc has an AST because .proto files have a grammar nobody else has implemented. Plugin authors would otherwise re-implement parsing. JSON Schema is already JSON — every plugin language has a JSON parser in stdlib or one-line dep. The “no plugin reparses” argument is ~free to ignore for us.

  2. Most plugins wrap upstream tools. typify, atombender/go-jsonschema, oapi-codegen, openapi-generator all take raw schema bytes and have their own parsing. Our AST would be throwaway work for them. The dual mode = "ast" | "raw" we briefly proposed was evidence the AST wasn’t the natural fit.

  3. Cross-plugin consistency was illusory. Different upstream tools interpret edge cases differently (recursive refs, allOf ordering, oneOf discriminator behavior). Putting an AST in front doesn’t unify them — each wrapping plugin still defers to its underlying library.

  4. Maintenance cost is real. Defining Schema / Type / UnionType / IntersectionType is a small spec language we invent and ship. Every JSON Schema feature we don’t model becomes an extra_json escape hatch. We’d end up maintaining a parallel type system that nothing consumes natively.

  5. Plugin author ergonomics matter. “Read stdin, write stdout” is the lowest possible barrier to entry. A Bash script could be a plugin. Adding “deserialise a protobuf request” pushes plugin authors into language-specific toolchain setup before they write the first line of codegen logic.

The toolchain pattern (toolchain types per output language, register your own plugin to override) survives the simplification unchanged.

Why we also abandoned the proto envelope

Even without an AST, we considered keeping a thin proto wrapper: CodeGenRequest{raw_schema, options, version} in, CodeGenResponse{file, error, features} out. Forward-compat without the AST baggage.

The argument against:

  • Structured options are the only part of the proto envelope with no obvious stdin/stderr/exit-code equivalent, and argv handles them fine.
  • For ~5 plugins over the foreseeable future, “add a field without breaking old plugins” isn’t load-bearing; we can coordinate.
  • Plugin author barrier matters more than abstract evolvability. A one-file Python plugin (15 lines) beats a Rust plugin with protobuf codegen deps for any reasonable measure.
  • We can always add a proto envelope later if we hit a real wall. Migrating plugins is straightforward — only the stdin-parsing changes, the codegen logic doesn’t.

Open questions

  1. Stable JSON Schema spec-version handling. Plugins should probably refuse to operate on schemas whose $schema doesn’t match what they expect. Convention: plugins error with --schema-name=… : unsupported $schema: <value> rather than producing wrong output. Each plugin owns its own version detection.

  2. Cross-plugin shared parsing. If we ever need it (we don’t yet), a future RFC could add an optional sidecar artifact: the rule runs a one-time jsonschema_parse action that emits a normalised JSON form, and plugins opt into reading that instead of the original schema. Backward compatible — old plugins still consume raw.

  3. Diagnostic format. stderr is freeform today. If we ever want structured diagnostics (file:line:col annotations), we’d define a stderr-line format like WARNING:path:line:col:msg. Not v1.

  4. Toolchain attr surface. Currently the toolchain rule just carries binary. Future fields might include: supported_drafts (list of $schema values), default_options (dict), version (for diagnostic banners). All additive.
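The first open question could be handled inside a plugin roughly as follows. This is a sketch only. The function name, the set of accepted drafts, the message prefix, and the choice to reject a missing $schema are all assumptions, not decisions made by this RFC.

```python
# Drafts this hypothetical plugin accepts (assumption for illustration).
SUPPORTED_DRAFTS = {
    "http://json-schema.org/draft-07/schema#",
    "https://json-schema.org/draft/2020-12/schema",
}


def check_schema_draft(schema, schema_name, supported=SUPPORTED_DRAFTS):
    """Return None when the draft is supported, else an error line for stderr."""
    declared = schema.get("$schema")
    if declared in supported:
        return None
    # A missing $schema is treated as unsupported here; a real plugin might default instead.
    return "{}: unsupported $schema: {}".format(schema_name, declared)
```

A plugin would run this check right after parsing and exit non-zero on a non-None result, before generating any output.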

Decisions to lock in before Phase 1

  1. Plugin contract: stdin = schema bytes, argv = options, stdout = generated file content, stderr + exit code for errors. No proto, no AST.
  2. Bazel outputs: single file per rule invocation. Multi-output needs split into separate targets. Tree-artifact escape hatch for genuine many-file plugins.
  3. Plugin discovery: toolchain types per output language (already in place).
  4. Repo naming: stay rules_jsonschema.

Phases

Phase 1: nail down the contract in code

  • //jsonschema:plugin_contract.md (or similar) — a concise written spec of stdin/argv/stdout/stderr the contract docs reference.
  • Refit the existing Rust + Starlark codegen binaries to the new contract. schema_to_rust already mostly does this (it reads a path from --schema); switch to stdin and the standard argv flags.
  • Update //rust:defs.bzl and //starlark:defs.bzl to invoke plugins via the contract.
  • Existing rules_docker_compose tests should pass byte-identical.

Phase 2: Go plugin (in Go)

  • tools/plugin_go/main.go reads schema bytes from stdin, parses via encoding/json, emits Go types using go/format. Uses rules_go.
  • //go:defs.bzl with jsonschema_go_library.
  • Smoke example: person.json → Go types → round-trip decode test.

This validates that the cross-language contract works as cleanly as the RFC claims. If implementing the Go plugin is harder than the “15 lines” pitch, the contract needs tightening.

Phase 3: contract testing

A small integration-test rule that runs an arbitrary plugin against a curated set of “interesting” schemas (compose-spec subset, edge cases, malformed input) and asserts on stdout/stderr/exit behavior. Lets plugin authors verify conformance before registering as a toolchain.

Phase 4: rules_docker_compose migration

This should be invisible to end users: the codegen binaries still exist and are just invoked through the new contract. Tests pass byte-identical.


Plugin conformance test.

jsonschema_plugin_contract_test(name, plugin) runs the contract test driver against any executable that claims to implement the rules_jsonschema plugin contract (see plugin_contract.md). The driver exercises:

  1. Minimum-viable invocation produces non-empty stdout + exit 0.
  2. Malformed JSON input → non-zero exit, stderr explanation, empty stdout (the discipline most likely to be violated by plugins emitting partial output before erroring).
  3. Unknown flags are rejected.
  4. Output is deterministic across identical invocations.
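The four scenarios can be sketched as a small driver. This is not the actual test driver. The names check_contract and subprocess_plugin, the fixture flag values, and the malformed-input bytes are illustrative. A plugin is modelled as a callable taking (stdin_bytes, argv) and returning (exit_code, stdout, stderr), so the checks can also run against an in-memory stand-in.

```python
import subprocess


def subprocess_plugin(command):
    """Adapt an executable (argv list) to the callable shape the checks expect."""
    def run(stdin_bytes, argv):
        proc = subprocess.run(command + argv, input=stdin_bytes, capture_output=True)
        return proc.returncode, proc.stdout, proc.stderr
    return run


def check_contract(plugin, valid_schema):
    """Run the four scenarios; return a list of failure descriptions (empty = conforms)."""
    failures = []
    base = ["--schema-name=fixture.json", "--rule-name=fixture"]

    code, out, _ = plugin(valid_schema, base)
    if code != 0 or not out:
        failures.append("minimum-viable invocation must exit 0 with non-empty stdout")

    code, out, err = plugin(b"{not json", base)
    if code == 0 or out or not err:
        failures.append("malformed input must exit non-zero, explain on stderr, leave stdout empty")

    code, _, _ = plugin(valid_schema, base + ["--no-such-flag=1"])
    if code == 0:
        failures.append("unknown flags must be rejected")

    # Only stdout is compared; stderr may legitimately vary between runs.
    if plugin(valid_schema, base)[1] != plugin(valid_schema, base)[1]:
        failures.append("output must be deterministic across identical invocations")
    return failures
```

For a real binary, check_contract(subprocess_plugin(["path/to/plugin"]), schema_bytes) runs the same checks against the executable.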

Plugin authors use it to gate their toolchain registration:

load("@rules_jsonschema//jsonschema:contract_test.bzl",
     "jsonschema_plugin_contract_test")

jsonschema_plugin_contract_test(
    name = "my_plugin_conforms",
    plugin = "//my:rust_codegen",
)

jsonschema_plugin_contract_test

load("@rules_jsonschema//jsonschema:contract_test.bzl", "jsonschema_plugin_contract_test")

jsonschema_plugin_contract_test(name, plugin)

Run the rules_jsonschema plugin contract scenarios against a plugin binary.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required | |
| plugin | The plugin binary to test. Any executable that claims to implement the rules_jsonschema plugin contract. | Label | required | |

Go user-facing rules for rules_jsonschema.

jsonschema_go_library is the Go-specific shape of the schema → code pipeline:

  1. Resolves the go_codegen_toolchain_type toolchain.
  2. Runs the toolchain’s binary on the schema (stdin/argv/stdout per //jsonschema/plugin_contract.md), producing a .go file.
  3. Wraps the .go in a go_library from @rules_go.

The default toolchain (registered by rules_jsonschema’s MODULE.bazel) points at the in-repo schema_to_go Go binary. Coverage is minimal — primitives, structs, slices, maps, optional pointers, refs. For fuller JSON-Schema-to-Go support, register your own jsonschema_codegen_toolchain pointing at a different binary (e.g. atombender/go-jsonschema).

jsonschema_go_library

load("@rules_jsonschema//go:defs.bzl", "jsonschema_go_library")

jsonschema_go_library(name, schema, importpath, package, extra_args, visibility,
                      **go_library_kwargs)

Generate a go_library of typed schema bindings.

The emitted package exports one Go type per schema $defs / definitions entry plus a top-level type from the schema’s title (if set). Required properties become value-typed fields; optional properties become pointer-typed with ,omitempty tags.

PARAMETERS

| Name | Description | Default Value |
| --- | --- | --- |
| name | go_library target name. Consumers add to deps. | none |
| schema | label of a .json schema file. | none |
| importpath | Go import path for the generated package. | none |
| package | Go package name. Defaults to a sanitised rule name. | None |
| extra_args | extra --key=value flags appended to the plugin’s argv. Use to set plugin-specific options without registering a new toolchain. | None |
| visibility | forwarded to go_library. | None |
| go_library_kwargs | forwarded to go_library. | none |

Helpers used by schema_to_starlark-generated rule code.

Kept in a separate file (rather than inlined per generated .bzl) so the codegen output stays small and any helper fix benefits every consumer at once. Generated .bzl files load from this module:

load("@rules_jsonschema//runtime:helpers.bzl", "strip_empty", "parse_json_or_none")

parse_json_or_none

load("@rules_jsonschema//runtime:helpers.bzl", "parse_json_or_none")

parse_json_or_none(s)

Return None for empty input, otherwise json.decode(s).

Used for typed schema attrs whose value is a structured object or array. Generated rule callers pass json.encode({...}) (or leave the attr empty); the generated impl invokes this to expand the encoded payload back into a Starlark dict/list that gets merged into the shard.

PARAMETERS

| Name | Description | Default Value |
| --- | --- | --- |
| s | - | none |

strip_empty

load("@rules_jsonschema//runtime:helpers.bzl", "strip_empty")

strip_empty(d)

Drop dict entries whose values are absent / zero / empty.

Matches the JSON omitempty convention so generated shards stay terse: Bazel attr.* zero values (0, False, "", [], {}) shouldn’t serialise as explicit overrides. Distinguishing “user set to 0” from “user didn’t set” isn’t possible at the Starlark layer, so we conflate them: every typed schema field that wants to mean something non-default ships a non-zero/-empty value.

PARAMETERS

| Name | Description | Default Value |
| --- | --- | --- |
| d | - | none |
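The behavior of the two helpers in this module can be illustrated with Python analogues. These are sketches of the documented semantics, not the Starlark source. The attribute names in the usage note (image, replicas, env, and so on) are made up for the example.

```python
import json


def parse_json_or_none(s):
    """Return None for empty input, otherwise the decoded JSON value."""
    if not s:
        return None
    return json.loads(s)


def strip_empty(d):
    """Drop entries whose values are absent, zero, or empty."""
    # Python truthiness covers None, 0, False, "", [] and {} in one test.
    return {key: value for key, value in d.items() if value}
```

A generated rule impl would typically decode structured attrs first and strip afterwards, so strip_empty({"image": "nginx", "replicas": 0, "env": parse_json_or_none('{"A": "1"}'), "ports": parse_json_or_none("")}) keeps only image and env. The explicit replicas = 0 is dropped too, which is the conflation described above.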

Providers exposed by rules_jsonschema.

JsonschemaCodegenToolchainInfo is the contract every codegen toolchain provides: a single binary File that implements the schema → output-language conversion. Per-language user-facing rules resolve a toolchain by type (@rules_jsonschema//jsonschema:<lang>_codegen_toolchain_type), fetch this provider, and run the binary.

Splitting it out from defs.bzl lets language modules (//rust:, //starlark:, //go:, …) load just the provider without dragging in language-specific BUILD machinery.

JsonschemaCodegenToolchainInfo

load("@rules_jsonschema//jsonschema:providers.bzl", "JsonschemaCodegenToolchainInfo")

JsonschemaCodegenToolchainInfo(binary)

A schema → code codegen tool.

FIELDS

| Name | Description |
| --- | --- |
| binary | File: the codegen executable. Invoked with --schema PATH --out PATH and any language-specific flags the calling rule passes through. |

Rust user-facing rules for rules_jsonschema.

jsonschema_rust_library is the Rust-specific shape of the schema → code pipeline:

  1. Resolves the rust_codegen_toolchain_type toolchain.
  2. Runs the toolchain’s binary on the schema, producing a .rs.
  3. Wraps the .rs in a rust_library with serde / serde_json / regress threaded as direct deps.

The default toolchain (registered by rules_jsonschema’s MODULE.bazel) points at the in-repo typify-based schema_to_rust binary. Swap by declaring your own jsonschema_codegen_toolchain + registering it ahead of the default.

jsonschema_rust_library

load("@rules_jsonschema//rust:defs.bzl", "jsonschema_rust_library")

jsonschema_rust_library(name, schema, extra_args, serde, serde_json, regress, visibility,
                        **rust_library_kwargs)

Generate a rust_library of typed schema bindings.

The emitted library exports one Rust struct/enum per top-level JSON-Schema definition, with #[derive(Serialize, Deserialize)] plus #[serde(deny_unknown_fields)] wherever the source schema sets additionalProperties: false.

PARAMETERS

| Name | Description | Default Value |
| --- | --- | --- |
| name | rust_library target name. Consumers add this to deps. | none |
| schema | label of a .json schema file. | none |
| extra_args | extra --key=value flags appended to the plugin’s argv. Use to set plugin-specific options without registering a new toolchain. The default plugin (schema_to_rust) accepts no extra flags today; consumers of custom toolchains will. | None |
| serde | label of the serde crate to use as a direct dep. Defaults to rules_jsonschema’s own @crates//:serde. Consumers whose binary also depends on serde must point this at their own crate repo, otherwise the generated types’ trait impls live in a different compile unit than the consumer’s and Rust treats them as distinct types (error[E0277]: the trait bound Service: serde::Serialize is not satisfied). | None |
| serde_json | same story for serde_json. | None |
| regress | same story for regress (typify uses it for pattern-validated string newtypes). | None |
| visibility | forwarded to rust_library. | None |
| rust_library_kwargs | forwarded to rust_library (e.g. extra deps). | none |

Starlark user-facing rule for rules_jsonschema.

jsonschema_starlark_codegen emits typed Bazel rule() definitions from a JSON Schema:

  1. Resolves the starlark_codegen_toolchain_type toolchain.
  2. Runs the toolchain’s binary on the schema, producing a .bzl.

The default toolchain (registered by rules_jsonschema’s MODULE.bazel) points at the in-repo schema_to_starlark binary. Swap by declaring your own jsonschema_codegen_toolchain and registering it ahead of the default.

The output is meant to be committed in the consumer repo; pair with a diff_test to catch drift (re-runs codegen on every CI build and asserts the committed .bzl matches what the toolchain emits).

jsonschema_starlark_codegen

load("@rules_jsonschema//starlark:defs.bzl", "jsonschema_starlark_codegen")

jsonschema_starlark_codegen(name, schema, kinds, extra_args, **kwargs)

Generate a .bzl of typed rules from a JSON Schema.

PARAMETERS

| Name | Description | Default Value |
| --- | --- | --- |
| name | target name; output file is <name>.bzl. | none |
| schema | label of a .json schema document. | none |
| kinds | list of (id, pointer, rule_name, provider_name) 4-tuples. id: short tag used in generated symbol names + the rule-name attr (e.g. service). pointer: JSON-pointer into the schema for the definition whose properties become attrs (e.g. #/definitions/service). rule_name: the public Starlark symbol the emitted rule binds to. provider_name: the public Starlark symbol the rule’s companion provider binds to. Optional: if omitted, extra_args typically enables the plugin’s auto-kinds derivation (e.g. --kinds-pointer-base=... for the default schema_to_starlark toolchain). Leaving both empty produces a preamble-only .bzl (legal but rarely useful). | None |
| extra_args | extra --key=value flags appended to the plugin’s argv. Use to set plugin-specific options without registering a new toolchain. | None |
| kwargs | forwarded to the underlying rule (visibility, etc.). | none |

Toolchain rules for rules_jsonschema codegen.

jsonschema_codegen_toolchain wraps a single codegen executable (schema_to_rust, schema_to_starlark, schema_to_go, …) as a Bazel toolchain. The matching toolchain_type lives in //jsonschema:BUILD.bazel — one type per output language so a consumer can independently swap, say, the Rust generator without touching the Starlark or Go ones.

Default toolchains are registered in //rust:BUILD.bazel, //starlark:BUILD.bazel, //go:BUILD.bazel. To swap an implementation, declare your own jsonschema_codegen_toolchain and register_toolchains(...) it ahead of rules_jsonschema’s default in your MODULE.bazel.

jsonschema_codegen_toolchain

load("@rules_jsonschema//jsonschema:toolchains.bzl", "jsonschema_codegen_toolchain")

jsonschema_codegen_toolchain(name, binary)

Declare a schema → code codegen executable as a Bazel toolchain.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required | |
| binary | The codegen executable for this toolchain. Must accept --schema PATH --out PATH plus any language-specific flags. | Label | required | |

write_source_files: copy generated outputs back into source.

The canonical Bazel pattern for committed-codegen workflows. A typical setup pairs a codegen rule (whose output sits under bazel-bin/...) with a write_source_files target that copies the output to a path under source control:

jsonschema_starlark_codegen(
    name = "compose_rules_gen",
    schema = "...",
    kinds = [...],
)

write_source_files(
    name = "update_compose_rules",
    files = {
        "compose_rules.bzl": ":compose_rules_gen",
    },
)
  • bazel build //compose:update_compose_rules — no-op.
  • bazel run //compose:update_compose_rules — copies each generated file to its source-tree destination, respecting BUILD_WORKSPACE_DIRECTORY so multi-repo workspaces still work.

Pair with a diff_test to gate freshness:

diff_test(
    name = "compose_rules_up_to_date",
    file1 = "compose_rules.bzl",
    file2 = ":compose_rules_gen",
)

This rule replaces ad-hoc sh_binary + update.sh pairs throughout rules_jsonschema’s consumers. Functionally equivalent to @aspect_bazel_lib//lib:write_source_files.bzl, but in-repo so we don’t take on aspect_bazel_lib as a dep for a single rule.
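What the bazel run target and the paired diff_test do can be approximated in a few lines of Python. This is a sketch of the described behavior, not the rule's implementation. The function names and the package and workspace_dir parameters are illustrative. BUILD_WORKSPACE_DIRECTORY is the environment variable that bazel run sets to the workspace root.

```python
import os
import shutil


def write_source_files(files, package, workspace_dir=None):
    """Copy each generated file to its package-relative destination in the source tree."""
    root = workspace_dir or os.environ["BUILD_WORKSPACE_DIRECTORY"]
    written = []
    for dest, generated in files.items():
        target = os.path.join(root, package, dest)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        shutil.copyfile(generated, target)
        written.append(target)
    return written


def is_up_to_date(committed, generated):
    """diff_test analogue: the committed copy must match the generated bytes exactly."""
    with open(committed, "rb") as a, open(generated, "rb") as b:
        return a.read() == b.read()
```

The split mirrors the workflow above: the copy step runs only on demand, while the comparison runs on every CI build and fails as soon as the committed file drifts.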

write_source_files

load("@rules_jsonschema//util:write_source_files.bzl", "write_source_files")

write_source_files(name, files)

bazel run-able target that copies generated files back into source control.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required | |
| files | Map of package-relative destination path → label whose single output file should be copied there. Each source label must produce exactly one output file. | Dictionary: String -> Label | required | |

rules_uv

API reference, generated from the module’s .bzl docstrings (stardoc).


rules_uv roadmap

v0.1

  • @uv//:binary built from source via cargo_bootstrap_repository.
  • uv_run macro: sandbox-escaping bazel run wrapper.
  • pip.parse module extension: uv.lock → @pip hub + per-package repos.
  • Pure-Python wheel materialization (py3-none-any).
  • Sdist fallback (raw download; no build step yet).
  • End-to-end smoke test in examples/smoke.

v0.2 (this release)

  • Prebuilt-uv toolchain alternative. uv.toolchain(source = "prebuilt") fetches the official release asset for the host platform from astral-sh/uv releases. Supported hosts today: darwin_{aarch64,x86_64} and linux_{aarch64,x86_64}. musl + 32-bit + Windows triples are intentionally omitted until someone needs them — pinning shas we never test is security theater.
  • Unified target shape. Both build and prebuilt produce @uv//:binary as a File; uv_toolchain accepts the file directly (no more :install rust_binary indirection).
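The host-to-asset mapping behind the prebuilt path can be illustrated as below. The real logic lives in a Bazel repository rule, so this Python version is only an analogy. The function names and the architecture alias table are assumptions. The four supported keys are the ones listed above.

```python
import platform

SUPPORTED = {"darwin_aarch64", "darwin_x86_64", "linux_aarch64", "linux_x86_64"}

# Common spellings of the two supported architectures (assumed alias table).
_ARCH = {"arm64": "aarch64", "aarch64": "aarch64", "x86_64": "x86_64", "amd64": "x86_64"}


def host_key(system, machine):
    """Map an OS name and machine string to a supported <os>_<arch> key, or raise."""
    key = "{}_{}".format(system.lower(), _ARCH.get(machine.lower()))
    if key not in SUPPORTED:
        # Fail loudly rather than guess at an asset whose sha was never pinned.
        raise ValueError("no prebuilt uv pinned for host {}/{}".format(system, machine))
    return key


def current_host_key():
    """Resolve the key for the machine this process runs on."""
    return host_key(platform.system(), platform.machine())
```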

v0.3 (this release)

  • Native wheel selection. PEP 425 / PEP 600 tag scoring in pip/private/wheel_selection.bzl: parse wheel filenames, fan out compressed tag fields, score against a host-specific ordered tag list (pip/private/platform.bzl). MVP covers the 4 fastverk hosts (darwin_{aarch64,x86_64}, linux_{aarch64,x86_64}); rules_python’s whl_target_platforms is more thorough and will be the backing implementation once their internals stabilize.
  • Sdist installation via uv. sdist_install_repo (pip/private/sdist_install.bzl): downloads the sdist, shells to @uv//:uv (uv pip install --target=. --no-deps) at repo-rule time. Python interpreter via python = "host" (python3 on PATH) or python = "uv" (uv python install into a per-repo scratch dir).
  • python_version + python attrs on pip.parse. Wheel tag matching consults python_version; sdist install dispatches on python.
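The tag-scoring idea (parse the filename, fan out the compressed tag fields, rank against an ordered host tag list) can be sketched in Python. This is not the contents of wheel_selection.bzl. The function names are illustrative, and the host tag list is supplied by the caller.

```python
def wheel_tags(filename):
    """Expand a wheel filename's compressed tags into (python, abi, platform) triples."""
    stem = filename[: -len(".whl")]
    # The last three dash-separated fields are the python, abi and platform tags.
    py, abi, plat = stem.split("-")[-3:]
    return {
        (p, a, pl)
        for p in py.split(".")
        for a in abi.split(".")
        for pl in plat.split(".")
    }


def best_wheel(filenames, host_tags):
    """Pick the wheel whose best tag ranks earliest in host_tags; None if none match."""
    rank = {tag: index for index, tag in enumerate(host_tags)}
    best = None
    for name in filenames:
        scores = [rank[tag] for tag in wheel_tags(name) if tag in rank]
        if scores and (best is None or min(scores) < best[0]):
            best = (min(scores), name)
    return best[1] if best else None
```

Ordering host_tags from most to least specific means a native wheel beats a py3-none-any wheel whenever both are compatible, and a None result is the cue to fall back to the sdist path.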

v0.4 (this release)

  • Extras: requirement("pkg[extra]") resolves to a per-extra Bazel target that re-exports :pkg plus the extra’s deps. Generated from each package’s [package.optional-dependencies] table.
  • Markers: PEP 508 subset evaluated at extension time against python_version + host platform. Edges whose markers fail are filtered out. Cross-platform select() is v0.5.
  • Git sources (source = { git = "…", rev = "…" }): new_git_repository with the BUILD wrapper.
  • Path sources (source = { path = "…" }): new_local_repository-style symlink rule.
  • Editable sources: explicit failure with a clear message (editable installs don’t translate to Bazel).
  • Hermetic uv invocation: --no-config on all uv pip and uv python install calls so the user’s ~/.config/uv/uv.toml (which on many machines points at a private index) doesn’t leak into sandbox builds.

v0.5 (this release)

  • Cross-platform wheels. pip.parse(platforms = [...]) opts the hub into multi-platform mode. Packages with platform-divergent native wheels fan out into per-platform repos (@<hub>__<pkg>__<platform>) behind a selector repo that emits alias(name = "pkg", actual = select(...)) over @platforms//os + @platforms//cpu constraint values. Non-host platform repos are declared but lazy-fetched — they only land on disk when Bazel’s configuration triggers that branch of the select().
  • Multi-platform smoke (examples/multiplatform/): pure-python wheel (idna) flows through the single-repo path; native wheel (markupsafe) flows through the per-platform select.

v0.6 (next)

Smoke fixtures for git + path sources

v0.4 wires git/path source materialization, but no smoke fixture exercises either. A fixture that lock-files a tiny pure-Python package from a pinned GitHub commit + a sibling local path package would catch regressions.

Sdist install in multi-platform mode

Today sdist install is host-only — if a multi-platform lockfile references an sdist-only package, the extension fails fast rather than silently producing a broken cross-platform target. v0.6 could support per-platform sdist installs by running uv pip install --target once per requested platform (each producing its own per-platform repo). Requires either cross-compilation toolchains on the host (rare) or Bazel platform-transition magic.

musl + Windows platform tag tables

pip/private/platform.bzl ships tag tables for the four fastverk hosts only. Adding musllinux + Windows entries (with @platforms//os:windows and a musl libc constraint) is mechanical once a consumer needs them.

Marker evaluator: spot tests

pip/private/markers.bzl is a hand-rolled PEP 508 subset parser. A skylib unittest suite covering operators, precedence, and the python_full_version vs python_version edge cases would lock the behavior down.

Beyond v0.6

  • uv_pip_compile: bazel run-able workflow to regenerate requirements.txt from a pyproject.toml (analogous to rules_uv upstream’s compile workflow).
  • Cross-platform wheels: support emitting select() deps when a package has multiple platform wheels but the consumer wants to target several configurations from one tree.
  • Stardoc-generated reference in /docs.

Delete uv/ when rules_python’s uv is stable

rules_python ships its own experimental uv toolchain primitive at @rules_python//python/uv:uv_toolchain.bzl and a binary-fetching module extension at @rules_python//python/uv:uv.bzl. Both are marked “EXPERIMENTAL: This is experimental and may be removed without notice”, so today rules_uv carries its own toolchain + fetch + build paths.

When rules_python promotes these out of experimental, rules_uv’s uv/ directory becomes pure duplication and should be removed:

  • Drop uv/extensions.bzl, uv/toolchains.bzl, uv/private/known_versions.bzl, uv/private/uv_source.BUILD.bazel.
  • Replace our uv_run macro with one that resolves through rules_python’s uv_toolchain_type.
  • The pip extension keeps using @uv//:binary at repo-rule time (just pointing at whichever target rules_python’s extension materializes by then).

This trims rules_uv down to its actual reason for existing: the uv.lock TOML → @pip materializer. Track upstream status at https://github.com/bazelbuild/rules_python/issues/ (search for “uv toolchain experimental”).


pip_parse module extension: uv.lock → @<hub> + per-pkg repos.

Counterpart to rules_python’s pip_parse, but driven by uv.lock instead of requirements.txt. For each package the lockfile resolves to, we create a Bazel-fetched repo containing the unpacked wheel (or installed sdist, or fetched git/path source). A hub repo aggregates these and exposes a requirement("<name>") macro plus pre-aliased @<hub>//<name>:pkg labels.

Consumer:

pip = use_extension("@rules_uv//pip:extensions.bzl", "pip")
pip.parse(
    hub_name = "pip",
    lock = "//:uv.lock",
    python_version = "3.12",
)
use_repo(pip, "pip")

Extras are exposed as additional sub-targets on the package repo:

load("@pip//:requirements.bzl", "requirement")
py_library(
    name = "app",
    deps = [
        requirement("requests"),              # base package
        requirement("requests[security]"),    # base + extra deps
    ],
)

Markers (e.g. marker = "python_version < '3.11'") are evaluated at extension time against the configured python_version + host platform. Edges whose markers fail are silently dropped from the generated BUILD — keeping the host-only view simple. Cross-platform select() is v0.5.
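A minimal sketch of such a marker evaluator follows, to make the filtering concrete. It is not the code in pip/private/markers.bzl. It assumes simple `name op 'value'` clauses joined by and / or with no parentheses, and compares the version variables numerically, padded to three components.

```python
import re

_CLAUSE = re.compile(r"""^\s*(\w+)\s*(==|!=|<=|>=|<|>)\s*['"]([^'"]*)['"]\s*$""")
_VERSION_VARS = {"python_version", "python_full_version"}


def _version(text):
    """Turn '3.9' into (3, 9, 0) so versions compare numerically, not as strings."""
    parts = tuple(int(part) for part in text.split("."))
    return parts + (0,) * (3 - len(parts))


def _clause(text, env):
    match = _CLAUSE.match(text)
    if not match or match.group(1) not in env:
        raise ValueError("unsupported marker clause: " + text.strip())
    var, op, want = match.groups()
    have = env[var]
    if var in _VERSION_VARS:
        have, want = _version(have), _version(want)
    return {
        "==": have == want, "!=": have != want,
        "<": have < want, "<=": have <= want,
        ">": have > want, ">=": have >= want,
    }[op]


def evaluate_marker(marker, env):
    """True when any or-group has all of its and-clauses satisfied."""
    return any(
        all(_clause(clause, env) for clause in group.split(" and "))
        for group in marker.split(" or ")
    )
```

With env = {"python_version": "3.12", "sys_platform": "linux"}, the example marker python_version < '3.11' evaluates to False, so that edge would be dropped. Raising on anything outside the subset keeps an unsupported marker from being silently misread.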

pip

pip = use_extension("@rules_uv//pip:extensions.bzl", "pip")
pip.parse(hub_name, lock, platforms, python, python_version, uv)

Materialize @<hub> + per-pkg repos from a uv.lock.

TAG CLASSES

parse

Attributes

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| hub_name | Name of the hub repo (the @<hub_name>//… namespace). | String | optional | "pip" |
| lock | Label pointing at a uv.lock file. | Label | required | |
| platforms | Optional list of <os>_<arch> platforms this lockfile should support. Default is host-only (the v0.4 behavior: select() is not introduced). Supported entries: darwin_aarch64, darwin_x86_64, linux_aarch64, linux_x86_64. Packages with platform-divergent native wheels fan out into per-platform repos behind a select() alias; sdist/git/path packages remain host-only and the build will fail loudly if a non-host platform tries to resolve them. | List of strings | optional | [] |
| python | How to find a Python interpreter for sdist install. host uses python3 on PATH; uv runs uv python install <python_version> per package. | String | optional | "host" |
| python_version | Python major.minor used for wheel-tag matching and (when python = "uv") the uv-managed interpreter. | String | optional | "3.12" |
| uv | Label of the uv binary used to install sdists. | Label | optional | "@uv" |

User-facing rules for rules_uv.

  • uv_run — sh_binary macro: bazel run //path:NAME invokes uv <subcommand> against the live workspace source. Intentionally non-hermetic (escapes the runfiles sandbox) for the dev loop (uv pip sync, uv lock, uv run …).

Lockfile-driven Python repo materialization lives in @rules_uv//pip:extensions.bzl (pip_parse), which is the rules_uv analogue of rules_python’s pip_parse but reads uv.lock rather than requirements.txt.

uv_run

load("@rules_uv//uv:defs.bzl", "uv_run")

uv_run(name, subcommand, args, **kwargs)

bazel run-able wrapper around uv <subcommand>.

Escapes the runfiles sandbox via BUILD_WORKSPACE_DIRECTORY so uv operates on the user’s source tree (uv lock, uv pip sync … both need to write into the workspace).

PARAMETERS

| Name | Description | Default Value |
| --- | --- | --- |
| name | target name. | none |
| subcommand | first arg passed to uv (e.g. pip, lock, run). | none |
| args | extra args appended after the subcommand. | None |
| kwargs | forwarded to the underlying sh_binary. | none |
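The wrapper's behavior amounts to "run uv from the workspace root instead of the runfiles tree". A Python sketch of that idea follows. The actual macro wraps sh_binary, so the function names here are illustrative and uv_path is assumed to be an absolute path.

```python
import subprocess


def uv_invocation(uv_path, subcommand, args, environ):
    """Return (command, cwd) so uv operates on the live source tree."""
    workspace = environ.get("BUILD_WORKSPACE_DIRECTORY")
    if not workspace:
        # bazel run sets this variable; without it we would act on the runfiles tree.
        raise RuntimeError("BUILD_WORKSPACE_DIRECTORY is unset; launch via bazel run")
    return [uv_path, subcommand] + list(args), workspace


def run_uv(uv_path, subcommand, args, environ):
    """Execute uv in the workspace and return its exit code."""
    command, cwd = uv_invocation(uv_path, subcommand, args, environ)
    return subprocess.run(command, cwd=cwd).returncode
```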

Toolchain wrapper for the uv binary.

UvToolchainInfo.uv is a File for the uv executable. Consumers resolve it via ctx.toolchains["@rules_uv//uv:toolchain_type"].

The attr uses allow_single_file = True rather than executable = True because the bootstrapped binary at @uv//:binary is an alias to a source File (cargo_bootstrap_repository’s output) — Bazel rejects source files as executable attr inputs, so we accept the file and let the consuming rule mark it executable itself.

uv_toolchain

load("@rules_uv//uv:toolchains.bzl", "uv_toolchain")

uv_toolchain(name, uv)

Declares a uv toolchain.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required | |
| uv | Label of the uv binary (either built via cargo_bootstrap_repository or fetched as a prebuilt release asset). | Label | required | |

UvToolchainInfo

load("@rules_uv//uv:toolchains.bzl", "UvToolchainInfo")

UvToolchainInfo(uv)

Information about a uv toolchain.

FIELDS

| Name | Description |
| --- | --- |
| uv | File pointing at the uv executable. |

rules_openapi

API reference, generated from the module’s .bzl docstrings (stardoc).


OpenAPI plugin conformance test.

openapi_plugin_contract_test(name, plugin) runs the rules_openapi plugin contract scenarios against any plugin executable. Mirrors rules_jsonschema’s jsonschema_plugin_contract_test but with OpenAPI-flavored fixtures (a minimal OpenAPI 3.1 document instead of a JSON Schema).

openapi_plugin_contract_test

load("@rules_openapi//openapi:contract_test.bzl", "openapi_plugin_contract_test")

openapi_plugin_contract_test(name, plugin)

Run the rules_openapi plugin contract scenarios against a plugin binary.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required | |
| plugin | The plugin binary to test. | Label | required | |

Rust user-facing rules for rules_openapi.

openapi_rust_client is the Rust client codegen rule:

  1. Resolves the rust_client_codegen_toolchain_type toolchain.
  2. Runs the toolchain’s binary on the OpenAPI spec (stdin/argv/stdout per //openapi/plugin_contract.md), producing a .rs.
  3. Wraps the .rs in a rust_library whose deps include progenitor-client, reqwest, serde, serde_json, and any additional crates the consumer threads through.

The default toolchain (registered by MODULE.bazel) points at the in-repo openapi_to_rust_client binary, which wraps progenitor under the hood. Swap by declaring your own openapi_codegen_toolchain and registering it ahead of the default.

openapi_rust_client

load("@rules_openapi//rust:defs.bzl", "openapi_rust_client")

openapi_rust_client(name, spec, extra_args, progenitor_client, reqwest, serde, serde_json, regress,
                    visibility, **rust_library_kwargs)

Generate a rust_library of a typed OpenAPI HTTP client.

The library exports a Client struct with one method per OpenAPI operation, plus a types module containing serde structs for components/schemas.

PARAMETERS

| Name | Description | Default Value |
| --- | --- | --- |
| name | rust_library target name. Consumers add this to deps. | none |
| spec | label of an OpenAPI .yaml / .yml / .json document. | none |
| extra_args | extra --key=value flags passed to the plugin. | None |
| progenitor_client | label of the progenitor_client runtime crate the generated code references. Defaults to @openapi_crates//:progenitor-client. Consumers using their own crates_universe must thread this through (and likewise the other runtime-dep attrs below) to avoid the same trait-identity mismatch rules_jsonschema documents. | None |
| reqwest | label of reqwest (HTTP client the generated code uses). | None |
| serde | label of serde. | None |
| serde_json | label of serde_json. | None |
| regress | label of regress (used by typify-generated types nested inside progenitor’s output). | None |
| visibility | forwarded to rust_library. | None |
| rust_library_kwargs | forwarded to rust_library (e.g. extra deps). | none |
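
The following sketch shows one way the parameters above might be combined when a consumer manages crates through its own crates_universe hub. The spec file name and the @crates hub name are assumptions made for illustration; leave the runtime-crate attrs unset to use the defaults.

```starlark
load("@rules_openapi//rust:defs.bzl", "openapi_rust_client")

openapi_rust_client(
    name = "petstore_client",
    spec = "petstore.yaml",  # hypothetical OpenAPI document in this package
    # Thread every runtime crate through the same hub so the generated code
    # and the consumer agree on trait identity. "@crates" is an assumed
    # crates_universe hub name.
    progenitor_client = "@crates//:progenitor-client",
    reqwest = "@crates//:reqwest",
    serde = "@crates//:serde",
    serde_json = "@crates//:serde_json",
    regress = "@crates//:regress",
    visibility = ["//visibility:public"],
)
```

A rust_binary or rust_library can then list :petstore_client in its deps.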

Providers exposed by rules_openapi.

Same shape as rules_jsonschema’s JsonschemaCodegenToolchainInfo — the plugin contract is identical (stdin/argv/stdout), the only difference is the schema content shipped on stdin (OpenAPI document rather than a JSON Schema).

OpenapiCodegenToolchainInfo

load("@rules_openapi//openapi:providers.bzl", "OpenapiCodegenToolchainInfo")

OpenapiCodegenToolchainInfo(binary)

An OpenAPI → code codegen tool.

FIELDS

| Name | Description |
| --- | --- |
| binary | File: the codegen executable. Invoked with --schema-name=NAME --rule-name=NAME plus per-plugin flags the calling rule passes through. |

Toolchain rules for rules_openapi codegen.

openapi_codegen_toolchain wraps a single codegen executable as a Bazel toolchain. Toolchain types are split per (language, use_case) pair — Rust clients, Go clients, Rust servers, etc. — so a consumer can swap one plugin without affecting the rest.

Default toolchains are registered in the per-language directories (//rust:BUILD.bazel, …). To swap an implementation, declare your own openapi_codegen_toolchain and register_toolchains(...) it ahead of rules_openapi’s default in your MODULE.bazel.

openapi_codegen_toolchain

load("@rules_openapi//openapi:toolchains.bzl", "openapi_codegen_toolchain")

openapi_codegen_toolchain(name, binary)

Declare an OpenAPI → code codegen executable as a Bazel toolchain.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required |  |
| binary | The codegen executable. Must accept --schema-name=NAME --rule-name=NAME plus any per-plugin flags the calling rule passes through. | Label | required |  |
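
Swapping an implementation could look roughly like the sketch below. The plugin label is hypothetical, and the package that holds rust_client_codegen_toolchain_type is an assumption; check the rules_openapi source for the real toolchain type label.

```starlark
load("@rules_openapi//openapi:toolchains.bzl", "openapi_codegen_toolchain")

# Wrap a custom codegen executable (hypothetical label).
openapi_codegen_toolchain(
    name = "my_rust_client_codegen",
    binary = "//tools/codegen:my_openapi_plugin",
)

toolchain(
    name = "my_rust_client_codegen_toolchain",
    toolchain = ":my_rust_client_codegen",
    # Assumed location of the toolchain type; only its name appears in the docs.
    toolchain_type = "@rules_openapi//rust:rust_client_codegen_toolchain_type",
)

# In MODULE.bazel (assuming this BUILD file lives in //tools/codegen):
# register_toolchains("//tools/codegen:my_rust_client_codegen_toolchain")
```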

rules_bun

API reference, generated from the module’s .bzl docstrings (stardoc).


User-facing rules for rules_bun.

Four pieces:

  • bun_test — runs bun test as a hermetic Bazel test action with explicit srcs + deps. Returns a BunTestInfo provider wrapping the test result file (for downstream consumers; the main consumer is the test framework, which only cares about exit codes).

  • bun_run — sh_binary macro: bazel run //path:NAME invokes bun run <script> against the live workspace source. Intentionally non-hermetic (escapes the runfiles sandbox) for the dev loop. Counterpart to bun_test’s hermetic execution.

  • bun_bundle — bundle a JS/TS entry point into one self-contained file with bun build. Returns BunBundleInfo.

  • bun_compile — compile a JS/TS entry point into a standalone native executable with bun build --compile (Bun runtime + bundled JS). Returns BunBinaryInfo and is bazel run-nable.

All resolve the Bun binary via @rules_bun//bun:toolchain_type (set up by register_toolchains("@bun//:bun_toolchain_def") in your MODULE.bazel).

bun_bundle / bun_compile have two ways to provision node_modules:

  • Bun-native (recommended; no aspect_rules_js, no pnpm-lock): pass a node_modules label (a @<name>//:node_modules from a bun_deps.install tag — see extensions.bzl) plus srcs (the entry + local modules). bun build runs directly via the toolchain Bun; a small shell driver stages the entry into a real tree and symlinks the closure so Bun resolves the import graph natively.
  • Legacy aspect_rules_js: pass a driver js_binary whose entry point is @rules_bun//bun:bun-build-driver and whose data stages the build entry plus its full linked node_modules closure; aspect materializes that closure into the action runfiles.

driver and node_modules are mutually exclusive — set exactly one. bun_test likewise takes an optional node_modules for dep resolution.

bun_bundle

load("@rules_bun//bun:defs.bzl", "bun_bundle")

bun_bundle(name, srcs, out, driver, entry, external, format, node_modules, target)

Bundle a JS/TS entry into one file via the hermetic Bun toolchain. Either Bun-native (node_modules from bun_deps.install, no aspect_rules_js) or the legacy aspect driver js_binary path.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required |  |
| srcs | Bun-native path. The entry file + any local modules it imports, declared as action inputs. Ignored on the legacy driver path (that stages sources via the js_binary’s data). | List of labels | optional | [] |
| out | The single bundled output file (conventionally *.mjs). | Label | required |  |
| driver | LEGACY aspect_rules_js path. A js_binary whose entry point is @rules_bun//bun:bun-build-driver and whose data stages the bundle entry + its full linked node_modules closure. Mutually exclusive with node_modules; set exactly one. | Label | optional | None |
| entry | Path of the entry point relative to the workspace root (e.g. packages/aion-cli/index.js). On the native path this is the execroot-relative path; on the legacy path it is relative to the driver’s _main runfiles root (same string in practice). | String | required |  |
| external | Module names to exclude from the bundle (passed as --external <name>, repeatable). Use for native addons and runtime requires that must stay external, e.g. pg-native, @aws-sdk/*, encoding, source-map-support. | List of strings | optional | [] |
| format | Bun --format. Defaults to esm so import.meta in deps stays valid under Node. | String | optional | "esm" |
| node_modules | Bun-native path. A node_modules closure (typically @<name>//:node_modules from a bun_deps.install tag). When set, bun build runs directly via the toolchain Bun (no js_binary driver, no aspect_rules_js): the closure is symlinked to the execroot root so Bun resolves the import graph by walking up from entry. Mutually exclusive with driver. Pair with srcs (the entry + local modules). | Label | optional | None |
| target | Bun --target: the intended execution environment for the bundle. Defaults to node. | String | optional | "node" |
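
As an illustration of the Bun-native path, the sketch below bundles a CLI entry point. It assumes the BUILD file lives in a hypothetical packages/cli package and that a bun_deps.install tag named "npm" exists in MODULE.bazel.

```starlark
load("@rules_bun//bun:defs.bzl", "bun_bundle")

bun_bundle(
    name = "cli_bundle",
    # The entry file plus the local modules it imports.
    srcs = ["index.ts"] + glob(["src/**/*.ts"]),
    out = "cli.mjs",
    # Workspace-root-relative path to the entry (hypothetical package path).
    entry = "packages/cli/index.ts",
    # Keep a native addon out of the bundle.
    external = ["pg-native"],
    # Bun-native closure; do not also set driver.
    node_modules = "@npm//:node_modules",
)
```

format and target are left at their defaults ("esm" and "node").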

bun_compile

load("@rules_bun//bun:defs.bzl", "bun_compile")

bun_compile(name, srcs, out, driver, entry, external, node_modules, target)

Compile a JS/TS entry into a standalone native executable (Bun runtime + bundled JS) via bun build --compile. Either Bun-native (node_modules) or the legacy aspect driver path.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required |  |
| srcs | Bun-native path. The entry file + any local modules it imports, declared as action inputs. Ignored on the legacy driver path. | List of labels | optional | [] |
| out | The standalone executable output. On --target bun-windows-* give it a .exe suffix. | Label | required |  |
| driver | LEGACY aspect_rules_js path. A js_binary whose entry point is @rules_bun//bun:bun-build-driver and whose data stages the build entry + its full linked node_modules closure. Mutually exclusive with node_modules; set exactly one. | Label | optional | None |
| entry | Path of the entry point relative to the workspace root (e.g. apps/studio-cli/index.js). | String | required |  |
| external | Module names to keep external (--external <name>, repeatable). NOTE: native .node addons are NOT embedded by --compile — list them here and provide the .node files at runtime alongside the produced binary. | List of strings | optional | [] |
| node_modules | Bun-native path. A node_modules closure (typically @<name>//:node_modules from a bun_deps.install tag). When set, bun build --compile runs directly via the toolchain Bun (no js_binary driver, no aspect_rules_js). Mutually exclusive with driver. Pair with srcs. | Label | optional | None |
| target | Bun compile target triple. Empty (the default) compiles for the host platform. Cross-compile values: bun-linux-x64, bun-linux-x64-modern, bun-linux-x64-baseline, bun-linux-arm64, bun-darwin-x64, bun-darwin-arm64, bun-windows-x64, and the *-musl libc variants (e.g. bun-linux-x64-musl). A future enhancement could derive this from the Bazel --platforms via a transition; for v1 pass the string. | String | optional | "" |
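
A cross-compile might be declared as in this sketch, which assumes the same hypothetical packages/cli layout and an "npm" bun_deps.install repo.

```starlark
load("@rules_bun//bun:defs.bzl", "bun_compile")

# Host build: target is left empty, so Bun compiles for the host platform.
bun_compile(
    name = "cli",
    srcs = ["index.ts"],
    out = "cli",
    entry = "packages/cli/index.ts",
    node_modules = "@npm//:node_modules",
)

# Cross build for Linux arm64 using one of the documented target strings.
bun_compile(
    name = "cli_linux_arm64",
    srcs = ["index.ts"],
    out = "cli-linux-arm64",
    entry = "packages/cli/index.ts",
    node_modules = "@npm//:node_modules",
    target = "bun-linux-arm64",
)
```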

bun_test

load("@rules_bun//bun:defs.bzl", "bun_test")

bun_test(name, srcs, data, node_modules)

Run bun test over the listed source files as a Bazel test target.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required |  |
| srcs | Test files (typically *.test.ts, *.test.js). Each is passed to bun test explicitly so Bazel tracks them as inputs. | List of labels | required |  |
| data | Additional runtime inputs (fixtures, bunfig.toml, etc.). | List of labels | optional | [] |
| node_modules | Optional node_modules closure (typically @<name>//:node_modules from a bun_deps.install tag). Staged at the workspace runfiles root as node_modules/ so bun test resolves dependency imports without bun install. The Bun-native replacement for aspect_rules_js’s npm_link_all_packages. | Label | optional | None |
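
A small sketch of a test target, assuming test files sit next to the BUILD file and that fixtures live in a hypothetical fixtures/ directory.

```starlark
load("@rules_bun//bun:defs.bzl", "bun_test")

bun_test(
    name = "unit_tests",
    # Every test file is listed so Bazel tracks it as an input.
    srcs = glob(["*.test.ts"]),
    # Runtime-only inputs; file names here are illustrative.
    data = ["bunfig.toml"] + glob(["fixtures/**"]),
    # Resolves dependency imports without running bun install.
    node_modules = "@npm//:node_modules",
)
```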

BunBinaryInfo

load("@rules_bun//bun:defs.bzl", "BunBinaryInfo")

BunBinaryInfo(binary, target)

A standalone native executable produced by bun build --compile.

FIELDS

| Name | Description |
| --- | --- |
| binary | File: the standalone executable. |
| target | string: the Bun compile target triple (empty = host). |

BunBundleInfo

load("@rules_bun//bun:defs.bzl", "BunBundleInfo")

BunBundleInfo(bundle, format)

A single-file bundle produced by bun build.

FIELDS

| Name | Description |
| --- | --- |
| bundle | File: the bundled output. |
| format | string: the Bun output format (esm/cjs/iife). |

BunTestInfo

load("@rules_bun//bun:defs.bzl", "BunTestInfo")

BunTestInfo(result)

Result metadata for a bun test run.

FIELDS

| Name | Description |
| --- | --- |
| result | File: the captured test output (stdout + stderr concatenated). |

bun_run

load("@rules_bun//bun:defs.bzl", "bun_run")

bun_run(name, script, args, **kwargs)

Invoke bun run <script> against the live workspace source.

Escapes the runfiles sandbox via BUILD_WORKSPACE_DIRECTORY so Bun resolves modules + reads files from the user’s actual source tree. Intentionally NOT hermetic — that’s bun_test’s job.

PARAMETERS

| Name | Description | Default Value |
| --- | --- | --- |
| name | target name. | none |
| script | package-relative path to the Bun script entry point. | none |
| args | extra args passed to bun run after the script name. | None |
| kwargs | forwarded to the underlying sh_binary. | none |
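
For the dev loop, a target might look like the sketch below. The script path is hypothetical, and passing args as a list of strings is an assumption, since the reference only gives its default.

```starlark
load("@rules_bun//bun:defs.bzl", "bun_run")

bun_run(
    name = "dev",
    # Package-relative path to the script (hypothetical file).
    script = "scripts/dev.ts",
    # Assumed shape: a list of strings appended after the script name.
    args = ["--port", "3000"],
    # Extra keyword arguments are forwarded to the underlying sh_binary.
    tags = ["manual"],
)
```

Invoke it with bazel run on the target's label; the script then runs against the live source tree rather than a sandboxed copy.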

Module extensions for rules_bun.

Two extensions:

  • bun — auto-fetches a prebuilt Bun binary for the host platform. Versions are sha256-pinned in private/known_versions.bzl. Consumers can override via the toolchain tag class.

    bun = use_extension("@rules_bun//bun:extensions.bzl", "bun")
    use_repo(bun, "bun")
    register_toolchains("@bun//:bun_toolchain_def")
    

    Pin a specific version:

    bun.toolchain(version = "1.3.14")
    
  • bun_deps — Bun-native node_modules staging. Each install tag produces a repo @<name> whose :node_modules filegroup is the tree produced by bun install --frozen-lockfile. It is the pure-Bun replacement for aspect_rules_js’s npm_translate_lock + npm_link_all_packages (no pnpm-lock, no aspect_rules_js):

    bun_deps = use_extension("@rules_bun//bun:extensions.bzl", "bun_deps")
    bun_deps.install(
        name = "npm",
        package_json = "//:package.json",
        lock = "//:bun.lock",
    )
    use_repo(bun_deps, "npm")
    

    then bun_test(node_modules = "@npm//:node_modules", ...) and bun_bundle(node_modules = "@npm//:node_modules", ...).

The actual release fetching is delegated to @rules_github//github:repositories.bzl%github_binary_repository so that the URL-shape + sha-pinning logic stays consistent across all our rules_* repos.

bun

bun = use_extension("@rules_bun//bun:extensions.bzl", "bun")
bun.toolchain(version)

Sets up @bun as a Bazel-fetched prebuilt Bun binary.

TAG CLASSES

toolchain

Attributes

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| version | Override Bun version. Defaults to the value in known_versions.bzl. | String | optional | "" |

bun_deps

bun_deps = use_extension("@rules_bun//bun:extensions.bzl", "bun_deps")
bun_deps.install(name, bun_version, ignore_scripts, install_flags, lock, package_json,
                 trusted_dependencies)

Bun-native node_modules staging — @<name>//:node_modules from a bun install --frozen-lockfile. Replaces aspect_rules_js’s npm_translate_lock + npm_link_all_packages for pure-Bun repos.

TAG CLASSES

install

Stage a node_modules tree from a package.json + bun.lock.

Attributes

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | Name of the generated repo. Reference its node_modules as @<name>//:node_modules. | Name | required |  |
| bun_version | Bun version to fetch for the install. Empty = the toolchain extension’s default. | String | optional | "" |
| ignore_scripts | Skip dependency lifecycle scripts (--ignore-scripts). Default True. | Boolean | optional | True |
| install_flags | Extra raw flags appended to bun install. | List of strings | optional | [] |
| lock | The bun.lock pinning the install (--frozen-lockfile). | Label | required |  |
| package_json | The package.json to install from. | Label | required |  |
| trusted_dependencies | Packages to --trust (run lifecycle scripts for) even when ignore_scripts is True. | List of strings | optional | [] |
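
The sketch below shows the optional attributes in a MODULE.bazel fragment. The trusted package name is only an example of a dependency that ships an install script; pick the ones your lockfile actually needs.

```starlark
bun_deps = use_extension("@rules_bun//bun:extensions.bzl", "bun_deps")
bun_deps.install(
    name = "npm",
    package_json = "//:package.json",
    lock = "//:bun.lock",
    # Pin the Bun used for the install instead of the extension default.
    bun_version = "1.3.14",
    # Lifecycle scripts stay off by default...
    ignore_scripts = True,
    # ...except for packages listed here ("esbuild" is an illustrative choice).
    trusted_dependencies = ["esbuild"],
)
use_repo(bun_deps, "npm")
```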

Toolchain rule for rules_bun.

bun_toolchain wraps a single Bun binary as a Bazel toolchain. Consumers (bun_test, bun_run, bun_bundle, and bun_compile) resolve Bun through @rules_bun//bun:toolchain_type, so users can register custom Bun binaries (a locally built fork, an alternate version, a baseline-CPU variant) via register_toolchains(...) without modifying rule attrs.

The module extension at @rules_bun//bun:extensions.bzl generates a default toolchain (@bun//:bun_toolchain_def) wrapping the prebuilt binary. Users register it from MODULE.bazel:

register_toolchains("@bun//:bun_toolchain_def")

bun_toolchain

load("@rules_bun//bun:toolchains.bzl", "bun_toolchain")

bun_toolchain(name, bun)

Declare a Bun binary as a Bazel toolchain.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required |  |
| bun | Path to the Bun executable. | Label | required |  |
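
Registering a custom Bun binary could be sketched as follows. The binary label and the //tools/bun package are hypothetical; the toolchain type label is the one named above.

```starlark
load("@rules_bun//bun:toolchains.bzl", "bun_toolchain")

# Wrap a custom Bun executable (hypothetical label for a locally built fork).
bun_toolchain(
    name = "bun_fork",
    bun = "//third_party/bun:bun",
)

toolchain(
    name = "bun_fork_toolchain",
    toolchain = ":bun_fork",
    toolchain_type = "@rules_bun//bun:toolchain_type",
)

# In MODULE.bazel (assuming this BUILD file lives in //tools/bun):
# register_toolchains("//tools/bun:bun_fork_toolchain")
```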

BunToolchainInfo

load("@rules_bun//bun:toolchains.bzl", "BunToolchainInfo")

BunToolchainInfo(bun)

The Bun binary, resolved via a toolchain.

FIELDS

| Name | Description |
| --- | --- |
| bun | File: the bun executable. |

rules_postgres

API reference, generated from the module’s .bzl docstrings (stardoc).


User-facing rules for rules_postgres.

  • pg_parse_valid_test wraps the parse_check C binary as a sh_test that gates a .sql file against PostgreSQL’s own parser (via libpg_query). Passes iff parse_check exits 0, fails with the parser’s error + cursor position on stderr otherwise. Use this to keep emitted-SQL or hand-written-DDL in sync with what PostgreSQL accepts.

  • pg_parse_tree runs the sql_to_protobuf C binary on a .sql file and captures the marshalled pg_query.ParseResult protobuf bytes as a .pgpb artifact. This is the single-file convenience macro; multi-file pipelines should use sql_library + sql_ast_library from @rules_lang//polyglot:sql.bzl instead.

pg_parse_tree

load("@rules_postgres//postgres:defs.bzl", "pg_parse_tree")

pg_parse_tree(name, sql, out, **kwargs)

Run libpg_query over a .sql file, capture the protobuf AST.

Single-file convenience around @rules_postgres//tools:sql_to_protobuf. For multi-file pipelines, prefer sql_library + sql_ast_library from @rules_lang//polyglot:sql.bzl, which use the same C tool via pg_sql_toolchain and propagate SqlAstInfo so downstream projections (json, lean, catalog) compose cleanly.

PARAMETERS

| Name | Description | Default Value |
| --- | --- | --- |
| name | genrule target name. | none |
| sql | label of the .sql file to parse. | none |
| out | output filename. Defaults to name + ".pgpb". | None |
| kwargs | forwarded to the underlying genrule. | none |

RETURNS

A .pgpb file whose bytes are exactly the marshalled pg_query.ParseResult (see @libpg_query//:pg_query.proto).
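
A single-file use might look like this sketch, where schema.sql is a hypothetical file in the same package.

```starlark
load("@rules_postgres//postgres:defs.bzl", "pg_parse_tree")

pg_parse_tree(
    name = "schema_ast",
    sql = "schema.sql",
    # out is omitted, so the artifact is named "schema_ast.pgpb".
)
```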

pg_parse_valid_test

load("@rules_postgres//postgres:defs.bzl", "pg_parse_valid_test")

pg_parse_valid_test(name, sql, **kwargs)

Assert that a SQL file parses cleanly under PostgreSQL’s parser.

PARAMETERS

| Name | Description | Default Value |
| --- | --- | --- |
| name | test target name. | none |
| sql | label of the .sql file to validate (source, genrule output, or filegroup member — anything with a single-file location). | none |
| kwargs | forwarded to the underlying sh_test (e.g. tags, size, timeout). | none |
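
A parse gate over a hand-written DDL file could be declared as below; schema.sql and the tag value are illustrative.

```starlark
load("@rules_postgres//postgres:defs.bzl", "pg_parse_valid_test")

pg_parse_valid_test(
    name = "schema_parses",
    sql = "schema.sql",
    # Forwarded to the underlying sh_test.
    size = "small",
    tags = ["sql"],
)
```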

Module extension for rules_postgres.

Exposes two tag classes:

pg.query(version = …) — fetches libpg_query and builds it as a cc_library. Creates @libpg_query.

pg.source(version = …) — fetches the full PostgreSQL source tarball and lays a minimal BUILD overlay on top (filegroups for source dirs + a probe pg_common_string cc_library). Creates @postgres_src.

The two paths are independent. Most consumers want only pg.query for SQL parse-validation gates; pg.source is for advanced tooling that needs the full PG codebase under Bazel.

Default usage:

pg = use_extension("@rules_postgres//postgres:extensions.bzl", "pg")
pg.query(version = "17-6.2.2")
use_repo(pg, "libpg_query")

With full PG source as well:

pg.source(version = "17.6")
use_repo(pg, "libpg_query", "postgres_src")

For generating compile_commands.json (consumable by rules_lang’s c_ast_dump_from_compdb), see pg_meson_configure in postgres/meson.bzl. That rule runs a hermetic meson setup as a Bazel build action using rules_foreign_cc’s meson + ninja toolchains.

pg

pg = use_extension("@rules_postgres//postgres:extensions.bzl", "pg")
pg.query(version)
pg.source(lay_overlay, version)

Module extension fetching libpg_query and/or the full PostgreSQL source tree.

TAG CLASSES

query

Pull libpg_query as @libpg_query.

Attributes

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| version | libpg_query release tag (e.g. "17-6.2.2"). | String | required |  |

source

Pull the PostgreSQL source tarball as @postgres_src.

Attributes

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| lay_overlay | If True (default), symlink hand-written pg_config.h overlay headers into src/include/. Set False when using pg_meson_configure for AST extraction — see the lay_overlay attr on _postgres_src_repository for the shadowing-via-same-dir-#include explanation. | Boolean | optional | True |
| version | PostgreSQL release version (e.g. "17.6"). | String | required |  |
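
For the pg_meson_configure case described in the lay_overlay row, a MODULE.bazel fragment might read as follows. It reuses the version strings from the usage examples earlier in this section.

```starlark
pg = use_extension("@rules_postgres//postgres:extensions.bzl", "pg")
pg.query(version = "17-6.2.2")

# Skip the hand-written pg_config.h overlay so the meson-generated headers
# are not shadowed during AST extraction.
pg.source(
    version = "17.6",
    lay_overlay = False,
)
use_repo(pg, "libpg_query", "postgres_src")
```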

rules_rdf

API reference, generated from the module’s .bzl docstrings (stardoc).


rules_rdf roadmap

Two waypoints between today’s scaffold and a usable abstract RDF toolchain layer. Each waypoint is one published bazel-registry release.

v0.1 — toolchain types + plugin contract + placeholder rules

The goal is for a consumer to be able to declare every planned target type (rdf_dataset, sparql_query_test, rdf_validate_test, rdf_transform, rdf_reason) today, against a no-op default toolchain, then swap in a real implementation (e.g. rules_jena) without touching their BUILD files. This makes rules_rdf adoptable incrementally — consumers can wire their build graph before any engine is integrated.

Deliverables:

  • Plugin contract document at rdf/plugin_contract.md (draft already in tree). Same shape as rules_jsonschema’s plugin_contract.md, adjusted for RDF semantics:
    • stdin = the RDF document bytes (the dataset; format declared via --in-format), not a JSON schema.
    • argv = --key=value pairs (same as jsonschema). Standard flags: --rule-name, --in-format. Per-toolchain flags: --query, --shapes, --out-format, --profile.
    • stdout = generated output (query results / validation report / converted graph / inferred triples). Same single-file-per-invocation discipline.
    • stderr = diagnostics.
    • exit = 0 / non-zero.
  • All four toolchain types defined in //rdf:BUILD.bazel: sparql_engine_toolchain_type, rdf_validator_toolchain_type, rdf_serializer_toolchain_type, rdf_reasoner_toolchain_type.
  • Providers: RdfDatasetInfo, RdfEngineToolchainInfo, RdfValidatorToolchainInfo, RdfSerializerToolchainInfo, RdfReasonerToolchainInfo. Each toolchain info wraps a single binary File, matching the jsonschema pattern.
  • Default user-facing rules implemented as _no_op placeholders:
    • rdf_dataset — real (returns RdfDatasetInfo; no toolchain needed).
    • sparql_query_test, sparql_query_run, rdf_validate_test, rdf_transform, rdf_reason — declare their toolchain dependency and accept all their final attrs, but the in-repo default toolchain points at a _no_op binary that writes an empty stdout and exits 0. Consumers can declare targets and they build; swapping in rules_jena makes them actually run.
  • Conformance test driver rdf_plugin_contract_test covering the same scenarios as the jsonschema driver — valid_minimal (small dataset round-trips), malformed_input (garbage on stdin → exit non-zero, empty stdout), unknown_flag (rejects unknown argv), determinism (byte-identical stdout on identical invocations). One driver, parameterised by toolchain type.
  • stardoc for the public surface, with diff_test freshness.

Out of scope for v0.1: chained pipelines, real-engine examples, result-set diff helpers.

v0.2 — cross-toolchain wiring + real-engine examples

Once rules_jena is published and registered, rules_rdf grows the glue that ties multiple toolchains together in one pipeline.

Deliverables:

  • Chained pipelines: rdf_validate_test and sparql_query_test accept the output of rdf_reason as their dataset, so a consumer can express “materialise inferences, then run shape validation on the closure” as a typed build graph. The intermediate inferred graph is a real RdfDatasetInfo-bearing target, not a hidden side effect.
  • Result-set helpers — a small Starlark helper for the common zero-row-CSV gate pattern, plus an rdf_results_diff_test for golden SPARQL result sets (SRX/JSON normalisation).
  • Examples directory using a real RDF corpus:
    • W3C example datasets fetched via http_file with a pinned sha256 (the same fetch-and-pin discipline rules_docker_compose uses for the compose-spec schema).
    • One end-to-end smoke target per toolchain type, registered against rules_jena.
  • CI matrix running the conformance test driver against every registered concrete implementation we know about, gating rules_rdf releases on at least one concrete backend passing.

After v0.2 the abstract layer is feature-complete; further work moves into the concrete-implementation repos.


rdf_plugin_contract_test(name, plugin, toolchain_type) runs the rules_rdf conformance test driver against any executable claiming to implement the plugin contract for the named toolchain type. See plugin_contract.md for what the driver asserts.

Plugin authors gate toolchain registration on it:

load("@rules_rdf//rdf:contract_test.bzl", "rdf_plugin_contract_test")

rdf_plugin_contract_test(
    name = "jena_sparql_conforms",
    plugin = "//jena:jena_sparql",
    toolchain_type = "sparql_engine",
)

The four toolchain types each have their own minimum-valid input inside the driver; pass the bare name (without the _toolchain_type suffix or @rules_rdf//rdf: prefix).

rdf_plugin_contract_test

load("@rules_rdf//rdf:contract_test.bzl", "rdf_plugin_contract_test")

rdf_plugin_contract_test(name, plugin, toolchain_type)

Run the rules_rdf conformance test driver against a plugin binary. See plugin_contract.md.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required |  |
| plugin | The plugin binary to test. Any executable that claims to implement the rules_rdf plugin contract. | Label | required |  |
| toolchain_type | Which toolchain type’s scenarios to run: one of sparql_engine, rdf_validator, rdf_serializer, rdf_reasoner. | String | required |  |

rdf_dataset(name, deps, srcs, in_format) — declare a labeled collection of RDF files, optionally linked to other datasets through deps.

This is the single source of “what triples are in this graph?” that every other rule consumes. Carrying both the file depset and the format string up front lets sparql_query_test / rdf_validate_test / … avoid sniffing extensions at action time, and lets consumers keep datasets of different declared formats in one BUILD file without ambiguity.

Multi-file datasets are concatenated by the consuming rule in lexicographic order before being piped to the plugin’s stdin (see rdf/plugin_contract.md). Consumers that care about ordering should name files to sort accordingly.

rdf_dataset

load("@rules_rdf//rdf:dataset.bzl", "rdf_dataset")

rdf_dataset(name, deps, srcs, in_format)

A labeled collection of RDF source files + linked-graph deps.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required |  |
| deps | Other rdf_datasets this graph links to (imported ontologies, vocabulary modules). Their files are folded into this dataset’s transitive_files closure, so reasoning/query over the linked vocabularies resolves. Deps should share in_format (normalize otherwise). | List of labels | optional | [] |
| srcs | RDF source files. Concatenated in lexicographic order by consuming rules before being piped to the plugin’s stdin. | List of labels | required |  |
| in_format | Serialization of every file in srcs. Mixed-format datasets aren’t supported in v0.1 — use rdf_transform first. | String | optional | "turtle" |
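
Linking one dataset to another through deps could be sketched like this. The file names are hypothetical, and both datasets use Turtle so the closure shares one format.

```starlark
load("@rules_rdf//rdf:dataset.bzl", "rdf_dataset")

# A vocabulary module (hypothetical file name).
rdf_dataset(
    name = "vocab",
    srcs = ["skos.ttl"],
    in_format = "turtle",
)

# The corpus folds :vocab into its transitive_files closure.
rdf_dataset(
    name = "corpus",
    srcs = glob(["data/*.ttl"]),
    deps = [":vocab"],
    # in_format defaults to "turtle", matching the dep.
)
```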

Providers for the four rules_rdf toolchain types.

Each provider wraps both the executable and the runfiles needed to invoke it. Carrying runfiles in the provider matters for plugin implementations that aren’t a single self-contained binary — py_binary, java_binary, sh_binary all stage helper files via runfiles. Consuming rules merge the provider’s runfiles into their own to make the plugin actually executable inside a Bazel sandbox.

RdfDatasetInfo

load("@rules_rdf//rdf:providers.bzl", "RdfDatasetInfo")

RdfDatasetInfo(files, transitive_files, in_format)

A declared RDF dataset.

FIELDS

| Name | Description |
| --- | --- |
| files | depset[File]: this dataset’s own source files (excludes deps). |
| transitive_files | depset[File]: the full graph closure — this dataset’s files plus the transitive closure of every deps dataset. Consumers needing all linked triples (sparql_query, rdf_reason, rdf_validate) operate over this; the subclass/import closure of a grounding ontology (schema.org + SKOS + DC + modules) is assembled here. |
| in_format | str: serialization of the dataset files. One of turtle, ntriples, nquads, trig, jsonld, rdfxml. The whole closure must share this format (normalize a differing dep with rdf_transform first). |

RdfReasonerToolchainInfo

load("@rules_rdf//rdf:providers.bzl", "RdfReasonerToolchainInfo")

RdfReasonerToolchainInfo(binary, runfiles, files_to_run)

An RDF inference engine. Resolved by rdf_reason.

FIELDS

| Name | Description |
| --- | --- |
| binary | File: an executable that runs RDFS / OWL / custom-rule inference and emits derived triples. |
| runfiles | runfiles: the plugin binary’s runfiles bundle. |
| files_to_run | FilesToRunProvider: pass in an action’s tools= to materialize the plugin’s runfiles tree. |

RdfSerializerToolchainInfo

load("@rules_rdf//rdf:providers.bzl", "RdfSerializerToolchainInfo")

RdfSerializerToolchainInfo(binary, runfiles, files_to_run)

An RDF format converter. Resolved by rdf_transform.

FIELDS

| Name | Description |
| --- | --- |
| binary | File: an executable that converts between RDF serializations (Turtle / N-Triples / N-Quads / JSON-LD / RDF/XML / TriG). |
| runfiles | runfiles: the plugin binary’s runfiles bundle. |
| files_to_run | FilesToRunProvider: pass in an action’s tools= to materialize the plugin’s runfiles tree. |

RdfValidatorToolchainInfo

load("@rules_rdf//rdf:providers.bzl", "RdfValidatorToolchainInfo")

RdfValidatorToolchainInfo(binary, runfiles, files_to_run)

An RDF validator (SHACL today; ShEx in scope for v0.2). Resolved by rdf_validate_test.

FIELDS

| Name | Description |
| --- | --- |
| binary | File: an executable that validates an RDF dataset against a shapes graph per the contract. |
| runfiles | runfiles: the plugin binary’s runfiles bundle. |
| files_to_run | FilesToRunProvider: pass in an action’s tools= to materialize the plugin’s runfiles tree. |

SparqlEngineToolchainInfo

load("@rules_rdf//rdf:providers.bzl", "SparqlEngineToolchainInfo")

SparqlEngineToolchainInfo(binary, runfiles, files_to_run)

A SPARQL query engine. Resolved by sparql_query_test and sparql_query_run.

FIELDS

| Name | Description |
| --- | --- |
| binary | File: an executable that runs SPARQL queries per the rules_rdf plugin contract. |
| runfiles | runfiles: the plugin binary’s runfiles bundle. |
| files_to_run | FilesToRunProvider: pass in an action’s tools= so Bazel materializes the plugin’s runfiles tree (java_binary / py_binary plugins fail to locate runfiles otherwise). |
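
To make the tools= advice concrete, here is a rough sketch of how a consuming rule might invoke a plugin. run_sparql_plugin is a hypothetical helper, not part of rules_rdf, and the flag spellings follow the roadmap's list; rdf/plugin_contract.md remains the authority on the real invocation.

```starlark
# Hypothetical helper for a rule implementation. `engine` is a
# SparqlEngineToolchainInfo, `dataset` an RdfDatasetInfo, and `query` and
# `out` are Files.
def run_sparql_plugin(ctx, engine, dataset, query, out):
    # Order the closure by path so the stdin concatenation is deterministic.
    by_path = {f.path: f for f in dataset.transitive_files.to_list()}
    srcs = [by_path[p] for p in sorted(by_path.keys())]

    args = ctx.actions.args()
    args.add(engine.binary)      # $1: plugin executable
    args.add(out)                # $2: output file
    args.add(query)              # $3: query file
    args.add(dataset.in_format)  # $4: dataset serialization
    args.add(ctx.label.name)     # $5: rule name
    args.add_all(srcs)           # $6 onward: dataset files, sorted

    ctx.actions.run_shell(
        outputs = [out],
        inputs = depset([query], transitive = [dataset.transitive_files]),
        # Passing files_to_run (not just the binary File) makes Bazel stage
        # the plugin's runfiles tree next to the executable.
        tools = [engine.files_to_run],
        arguments = [args],
        # Assumed flag shapes: dataset on stdin, results on stdout.
        command = 'plugin="$1"; out="$2"; query="$3"; fmt="$4"; rule="$5"; shift 5; ' +
                  'cat "$@" | "$plugin" --rule-name="$rule" --in-format="$fmt" ' +
                  '--query="$query" > "$out"',
        mnemonic = "SparqlQuery",
    )
```

Listing only engine.binary under inputs would stage the launcher without its runfiles, which is the failure mode the files_to_run field is there to avoid.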

User-facing inference rules.

rdf_reason runs the registered rdf_reasoner toolchain over an RDF dataset and emits the derived-triples graph (Turtle) as a build artifact. Unlike sparql_query_test / rdf_validate_test, this is a regular rule — its output is a file that downstream rules can declare as a src or data dependency.

load("@rules_rdf//rdf:dataset.bzl", "rdf_dataset")
load("@rules_rdf//reason:defs.bzl", "rdf_reason")

rdf_dataset(name = "ontology", srcs = glob(["*.ttl"]))

rdf_reason(
    name = "inferred",
    base = ":ontology",
    profile = "rdfs",
)

For custom rule sets (Jena RETE rules):

rdf_reason(
    name = "inferred",
    base = ":ontology",
    profile = "custom",
    rules = "rules/transitive.rule",
)

The reasoner toolchain implementation decides which profiles are supported; the abstract layer only validates that profile = "custom" is paired with rules and vice versa.

rdf_reason

load("@rules_rdf//reason:defs.bzl", "rdf_reason")

rdf_reason(name, base, include_base, profile, rules)

Run inference over an RDF dataset; emit the derived-triples graph (Turtle).

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required |  |
| base | RDF dataset to run inference over. | Label | required |  |
| include_base | If True, emit base + derived triples; otherwise only the derived (default). | Boolean | optional | False |
| profile | Reasoning profile. custom requires rules. | String | optional | "rdfs" |
| rules | Custom rule file (Jena RETE syntax). Required iff profile = "custom". | Label | optional | None |

User-facing SPARQL rules.

sparql_query_test is the zero-row gate idiom: declare an invariant as a SPARQL query whose result set is empty when the graph satisfies the invariant. CI runs it as a Bazel test; any non-empty row triggers a failure.

It’s the rules_rdf analog of the production GateZeroRows.java pattern in the Aion RFC repo’s kg/java/. v0.1 wires the rule through sparql_engine_toolchain_type; the actual SPARQL execution comes from whichever concrete toolchain the consumer registered (rules_jena, a future rules_rdflib, etc.).

load("@rules_rdf//rdf:dataset.bzl", "rdf_dataset")
load("@rules_rdf//sparql:defs.bzl", "sparql_query_test")

rdf_dataset(name = "corpus", srcs = glob(["*.ttl"]))

sparql_query_test(
    name = "no_dangling_refs",
    dataset = ":corpus",
    query = "queries/dangling.rq",
)

sparql_query

load("@rules_rdf//sparql:defs.bzl", "sparql_query")

sparql_query(name, dataset, out_format, query)

Run a SPARQL query and emit the results as a build artifact (the producer counterpart to sparql_query_test’s gate). Turns a reasoned graph into queryable, downstream-consumable data — e.g. grounding tuples for training-data generation.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required |  |
| dataset | The rdf_dataset (closure) to query. | Label | required |  |
| out_format | Result serialization. Tabular (tsv/csv/json/xml) for SELECT/ASK; RDF (turtle/ntriples/…) for CONSTRUCT/DESCRIBE (also yields an rdf_dataset). | String | required |  |
| query | The SPARQL query file (SELECT/ASK → tabular; CONSTRUCT/DESCRIBE → graph). | Label | required |  |
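
A producer target might be declared as in the sketch below. The query file name is hypothetical, and :corpus stands for any rdf_dataset in the package.

```starlark
load("@rules_rdf//sparql:defs.bzl", "sparql_query")

sparql_query(
    name = "grounding_tuples",
    dataset = ":corpus",
    # A SELECT query, so a tabular out_format applies.
    query = "queries/grounding.rq",
    out_format = "tsv",
)
```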

sparql_query_smoke_test

load("@rules_rdf//sparql:defs.bzl", "sparql_query_smoke_test")

sparql_query_smoke_test(name, dataset, queries)

Assert that a set of SPARQL queries all parse + execute against a dataset. The query-smoke gate idiom — catches syntax errors and reference rot after schema changes.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required | |
| dataset | An rdf_dataset the queries run against. | Label | required | |
| queries | SPARQL query files. The test passes iff every one parses and executes without error (no row-count assertion; that’s sparql_query_test). | List of labels | required | |
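A short usage sketch, assuming the queries live under a queries/ directory next to the BUILD file (the directory layout and target names are illustrative, not part of the rule's contract):

```starlark
load("@rules_rdf//rdf:dataset.bzl", "rdf_dataset")
load("@rules_rdf//sparql:defs.bzl", "sparql_query_smoke_test")

rdf_dataset(name = "corpus", srcs = glob(["*.ttl"]))

# Passes when every .rq file parses and executes; row counts are not checked.
sparql_query_smoke_test(
    name = "queries_smoke",
    dataset = ":corpus",
    queries = glob(["queries/*.rq"]),  # hypothetical location of the query files
)
```

Using glob here means a newly added query is picked up by the smoke gate without touching the BUILD file.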

sparql_query_test

load("@rules_rdf//sparql:defs.bzl", "sparql_query_test")

sparql_query_test(name, dataset, query)

Run a SPARQL query against an RDF dataset; fail if the result set is non-empty. The zero-row gate idiom.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required | |
| dataset | An rdf_dataset whose triples the query runs against. | Label | required | |
| query | The SPARQL query file. Result set must be empty for the test to pass (per --fail-on-nonempty). | Label | required | |

Toolchain registration rules for rules_rdf.

One rule per toolchain type. Each takes the plugin binary as a mandatory exec-config label and exposes the matching *ToolchainInfo provider with both the binary File and its runfiles bundle.

Concrete plugins (rules_jena, rules_rdflib, …) register via:

sparql_engine_toolchain(
    name = "jena_arq_sparql_toolchain",
    binary = ":jena_sparql",
)

toolchain(
    name = "jena_arq_sparql",
    toolchain = ":jena_arq_sparql_toolchain",
    toolchain_type = "@rules_rdf//rdf:sparql_engine_toolchain_type",
)
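Declaring the toolchain target does not by itself make Bazel select it; it also has to be registered. A minimal sketch for a MODULE.bazel, assuming the toolchain() target above lives in a package named //toolchains (the package path is hypothetical):

```starlark
# MODULE.bazel of the module that owns the plugin.
# register_toolchains is the standard Bazel mechanism; the label below assumes
# the toolchain() target was declared in //toolchains/BUILD.bazel.
register_toolchains("//toolchains:jena_arq_sparql")
```

Consumers that only depend on a plugin module which already registers its toolchains should not need this call themselves.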

rdf_reasoner_toolchain

load("@rules_rdf//rdf:toolchains.bzl", "rdf_reasoner_toolchain")

rdf_reasoner_toolchain(name, binary)

Declare an RDF reasoner (inference) toolchain.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required | |
| binary | The plugin executable. Must conform to the contract in rdf/plugin_contract.md. | Label | required | |

rdf_serializer_toolchain

load("@rules_rdf//rdf:toolchains.bzl", "rdf_serializer_toolchain")

rdf_serializer_toolchain(name, binary)

Declare an RDF serializer (format-converter) toolchain.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required | |
| binary | The plugin executable. Must conform to the contract in rdf/plugin_contract.md. | Label | required | |

rdf_validator_toolchain

load("@rules_rdf//rdf:toolchains.bzl", "rdf_validator_toolchain")

rdf_validator_toolchain(name, binary)

Declare an RDF validator toolchain.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required | |
| binary | The plugin executable. Must conform to the contract in rdf/plugin_contract.md. | Label | required | |
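The validator toolchain is wired the same way as the SPARQL engine example earlier on this page. The sketch below is an assumption-laden illustration: the binary target is hypothetical, and the toolchain_type label is inferred by analogy with @rules_rdf//rdf:sparql_engine_toolchain_type, so check it against the rules_rdf sources before relying on it.

```starlark
load("@rules_rdf//rdf:toolchains.bzl", "rdf_validator_toolchain")

rdf_validator_toolchain(
    name = "my_shacl_validator_toolchain",
    # Hypothetical executable that follows rdf/plugin_contract.md.
    binary = ":my_shacl_validator_bin",
)

toolchain(
    name = "my_shacl_validator",
    toolchain = ":my_shacl_validator_toolchain",
    # Assumed label, by analogy with the SPARQL engine toolchain type.
    toolchain_type = "@rules_rdf//rdf:rdf_validator_toolchain_type",
)
```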

sparql_engine_toolchain

load("@rules_rdf//rdf:toolchains.bzl", "sparql_engine_toolchain")

sparql_engine_toolchain(name, binary)

Declare a SPARQL engine toolchain.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required | |
| binary | The plugin executable. Must conform to the contract in rdf/plugin_contract.md. | Label | required | |

User-facing format-conversion rule.

rdf_transform re-serializes an RDF dataset into a different format via the registered rdf_serializer toolchain. The output is a regular build artifact.

load("@rules_rdf//rdf:dataset.bzl", "rdf_dataset")
load("@rules_rdf//transform:defs.bzl", "rdf_transform")

rdf_dataset(name = "src_turtle", srcs = ["data.ttl"], in_format = "turtle")

rdf_transform(
    name = "data_ntriples",
    dataset = ":src_turtle",
    out_format = "ntriples",
)

The output filename is <name>.<ext>, where <ext> is the canonical extension for out_format (.ttl, .nt, .nq, .trig, .jsonld, .rdf).

rdf_transform

load("@rules_rdf//transform:defs.bzl", "rdf_transform")

rdf_transform(name, dataset, out_format)

Convert an RDF dataset between serializations.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required | |
| dataset | RDF dataset to convert. | Label | required | |
| out_format | Target serialization. | String | required | |

User-facing RDF validation rules.

rdf_validate_test validates an RDF dataset against a SHACL shapes graph and fails the build if any violations are reported. It resolves through rdf_validator_toolchain_type, so the actual SHACL engine is pluggable (rules_jena’s org.apache.jena.shacl.ShaclValidator, a future rules_pyshacl, …).

load("@rules_rdf//rdf:dataset.bzl", "rdf_dataset")
load("@rules_rdf//validate:defs.bzl", "rdf_validate_test")

rdf_dataset(name = "ontology", srcs = glob(["ontology/*.ttl"]))

rdf_validate_test(
    name = "ontology_conforms",
    dataset = ":ontology",
    shapes = "shapes.ttl",
)

ShEx support is in scope for v0.2. The toolchain contract leaves room for it via the --shapes-language arg, but in v0.1 the shapes file is assumed to be Turtle-encoded SHACL.

rdf_validate_test

load("@rules_rdf//validate:defs.bzl", "rdf_validate_test")

rdf_validate_test(name, dataset, severity, shapes)

Validate an RDF dataset against a SHACL shapes graph.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required | |
| dataset | An rdf_dataset to validate. | Label | required | |
| severity | Minimum severity that fails the build. | String | optional | "violation" |
| shapes | SHACL shapes graph (Turtle). | Label | required | |
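A sketch of tightening the gate with the severity attribute. Only the default "violation" is documented above; the value "warning" used below is an assumption based on the standard SHACL severity levels (Violation, Warning, Info) and may be spelled differently by the rule.

```starlark
load("@rules_rdf//rdf:dataset.bzl", "rdf_dataset")
load("@rules_rdf//validate:defs.bzl", "rdf_validate_test")

rdf_dataset(name = "ontology", srcs = glob(["ontology/*.ttl"]))

# Default behaviour: only results at "violation" severity fail the test.
rdf_validate_test(
    name = "ontology_conforms",
    dataset = ":ontology",
    shapes = "shapes.ttl",
)

# Stricter gate: also fail on warnings (assumed severity value, see above).
rdf_validate_test(
    name = "ontology_conforms_strict",
    dataset = ":ontology",
    shapes = "shapes.ttl",
    severity = "warning",
)
```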

rules_jena

API reference, generated from the module’s .bzl docstrings (stardoc).


rules_jena roadmap

Three releases take rules_jena from “scaffold” to a full Jena-backed implementation of every rules_rdf toolchain type, plus a small set of user-facing convenience rules. Each row points at the production pattern in ~/Documents/rfcs/kg/java/ that gets ported; see SOURCES.md for the full catalog.

v0.1 (next)

Stand up the shared library and the first toolchain. Goal: prove rules_rdf’s contract works for a Jena backend end-to-end.

  • Port Loader.java (corpus-loading) and Writer.java (deterministic Turtle) plus their tests (WriterTest.java) into a public :rdf_io java_library at the rules_jena root.
  • Implement one toolchain — the SPARQL engine — as a java_binary registered under rules_rdf’s sparql_engine_toolchain_type:
    • Read the SPARQL query path from argv (--query=...).
    • Read the data graph as Turtle from stdin.
    • Emit results (TSV or JSON) on stdout, diagnostics on stderr, non-zero exit on parse / execution failure.
    • Same shape as the gate_query_smoke target in kg/java/BUILD.bazel.
  • Gate the toolchain binary with rules_rdf’s plugin-contract conformance test (the analog of jsonschema_plugin_contract_test).
  • Register the default toolchain in MODULE.bazel.
  • Maven coordinates for jena-arq / jena-core / jena-base / jena-iri / slf4j-simple declared inline (no rules_jvm_external pin yet — defer to v0.2 once the full set of binaries is in scope).

v0.2

Round out the toolchain implementations. Goal: every rules_rdf toolchain type has a Jena-backed default.

  • Port GateHarness.java, Gates.java, GateZeroRows.java, GateShacl.java, GateQuerySmoke.java as additional toolchain-backing binaries — one per gate shape.
  • Implement the SHACL validator toolchain on top of org.apache.jena.shacl.ShaclValidator (mirroring GateShacl).
  • Implement the RDF serializer toolchain on top of RDFDataMgr plus the Writer.java invariants.
  • Implement the OWL reasoner toolchain via ReasonerRegistry.getOWLMicroReasoner() (the same call KgReasoner makes in production).
  • Pin Maven artifacts via rules_jvm_external. Single maven.install block in MODULE.bazel; a locked maven_install.json checked in. Removes the hand-rolled @maven//:org_apache_jena_* references that consumers maintain today.
  • Smoke fixture using a tiny pinned ontology (a few classes, a few SHACL shapes, a handful of .rq files) so every toolchain gets an end-to-end test under //examples/smoke.
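The Maven pinning item above could look roughly like the following MODULE.bazel fragment. It uses the rules_jvm_external module extension as published; the artifact list mirrors the coordinates named in the v0.1 section, and every version number is a placeholder rather than the version rules_jena actually pins.

```starlark
# MODULE.bazel (sketch; all versions are placeholders)
bazel_dep(name = "rules_jvm_external", version = "6.0")

maven = use_extension("@rules_jvm_external//:extensions.bzl", "maven")

maven.install(
    artifacts = [
        "org.apache.jena:jena-arq:5.0.0",
        "org.apache.jena:jena-core:5.0.0",
        "org.apache.jena:jena-base:5.0.0",
        "org.apache.jena:jena-iri:5.0.0",
        "org.slf4j:slf4j-simple:2.0.13",
    ],
    # The checked-in lock file mentioned in the roadmap item.
    lock_file = "//:maven_install.json",
)

use_repo(maven, "maven")
```

With a single install block, Java targets refer to the artifacts through the generated @maven repository instead of hand-maintained labels.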

v0.3

Expose the higher-level patterns the corpus uses every day. Goal: a downstream consumer can replace kg/java/ with a thin BUILD file that loads from rules_jena.

  • Extract kg/lint/ patterns into a reusable jena_lint rule — orphan / consistency checks driven by a user-supplied query set.
  • Extract kg/rules/ patterns into a jena_reason rule — runs a pinned set of Jena rule files over a Dataset and emits a deterministic inferred Turtle output (mirroring kg_reasoner --check).
  • Provide a jena_corpus macro that takes an ontology dir, a TTL glob, a queries dir and stitches together the gate test_suite the corpus uses today.
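To make the "stitch the gates together" idea concrete, here is a hypothetical macro built only from rules documented on this page. It is not the planned jena_corpus; the name corpus_gates, its parameters, and the generated target names are all invented for illustration.

```starlark
"""corpus_gates.bzl: hypothetical sketch, not part of rules_jena."""

load("@rules_jena//jena:defs.bzl", "jena_model")
load("@rules_rdf//sparql:defs.bzl", "sparql_query_smoke_test", "sparql_query_test")
load("@rules_rdf//validate:defs.bzl", "rdf_validate_test")

def corpus_gates(name, srcs, zero_row_queries, smoke_queries, shapes):
    """Declares one Turtle model and a test_suite of gates over it.

    Args:
      name: name of the resulting test_suite.
      srcs: Turtle files that make up the corpus.
      zero_row_queries: .rq files that must each return no rows.
      smoke_queries: .rq files that must parse and execute.
      shapes: SHACL shapes file (Turtle).
    """
    model = name + "_model"
    jena_model(name = model, srcs = srcs, in_format = "turtle")

    tests = []

    # One zero-row gate per invariant query.
    for query in zero_row_queries:
        slug = query.replace("/", "_").replace(".", "_")
        test_name = "%s_zero_rows_%s" % (name, slug)
        sparql_query_test(
            name = test_name,
            dataset = ":" + model,
            query = query,
        )
        tests.append(":" + test_name)

    # One smoke gate covering every query file.
    sparql_query_smoke_test(
        name = name + "_query_smoke",
        dataset = ":" + model,
        queries = smoke_queries,
    )
    tests.append(":" + name + "_query_smoke")

    # One SHACL conformance gate.
    rdf_validate_test(
        name = name + "_shacl",
        dataset = ":" + model,
        shapes = shapes,
    )
    tests.append(":" + name + "_shacl")

    native.test_suite(name = name, tests = tests)
```

A BUILD file would load the macro and call it once, passing glob() results for srcs and the two query lists; running the resulting test_suite then exercises every gate.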

Source-of-truth patterns to port

The Jena patterns rules_jena packages as toolchains all exist in production today, in the Aion RFC knowledge-graph tree at ~/Documents/rfcs/kg/java/. This file catalogs which files we extract and what each one teaches.

Reference BUILD files (read-only sources of the wiring):

  • ~/Documents/rfcs/kg/java/BUILD.bazel — the JENA_DEPS list, the :loader, :writer, :gate_harness java_library declarations, plus the gate_* java_tests.
  • ~/Documents/rfcs/kg/java/reasoner/BUILD.bazel — the kg_reasoner java_binary (OWL-MICRO inference).

Files

| File (under ~/Documents/rfcs/kg/java/) | What it teaches |
| --- | --- |
| Loader.java | Single shared library that loads every TTL under a corpus root into one in-memory Jena Dataset. Every downstream binary depends on this; port becomes the public :rdf_io library at the rules_jena root. |
| Writer.java | Deterministic Turtle serializer. Stable prefix ordering, stable blank-node labels, byte-stable round-trip. The invariants documented in the file header become the rules_jena serializer toolchain’s contract. |
| WriterTest.java | Round-trip + parse-equivalence test: write(load(write(model))) byte-equals write(model) and the result is isomorphic to the input. Ports as the rules_jena serializer-toolchain conformance test. |
| GateHarness.java | Orchestrates a set of SPARQL zero-row checks plus a SHACL conformance check against one Dataset. The “compose multiple gates into one suite” pattern. |
| Gates.java | Shared query-plumbing helpers (load a .rq from disk, execute against a Dataset, collect results). Used by every Gate* binary; ports as a private helper for the SPARQL + SHACL toolchain binaries. |
| GateZeroRows.java | The “run one .rq and fail if it returns >0 rows” gate shape. Generalizes to the rules_jena SPARQL toolchain’s zero-row mode. |
| GateQuerySmoke.java | The “every .rq under a dir parses + executes” gate shape. The conformance-test analog for SPARQL toolchains. |
| GateShacl.java | The “load shapes.ttl + data, run ShaclValidator, fail on non-conforming” gate shape. Becomes the rules_jena SHACL validator toolchain core. |
| reasoner/KgReasoner.java | OWL-MICRO inference via ReasonerRegistry.getOWLMicroReasoner(), plus a Jena rule-file driver. Deterministic, idempotent output. Becomes the rules_jena OWL reasoner toolchain. |

What we do not port (yet)

These exist in the kg/java/ tree but are corpus-specific, not generic Jena tooling — they belong in a downstream consumer, not in rules_jena:

  • KgReport.java, CrossCutting.java — Aion-specific reporting.
  • AionPaths.java — XDG/macOS/Windows config paths (not Jena).
  • BuildSummary.java: SUMMARY.md drift gate (not Jena).
  • Anything under kg/java/edit/, kg/java/lint/, kg/java/metrics/, kg/java/research/ — corpus-specific CLIs. The reusable shapes inside them (lint rules, metric formulas) land as jena_lint / jena_reason in v0.3.

jena_dataset(name, default_graph, named_graphs) — a Jena Dataset: a default graph plus a set of named graphs addressable by IRI.

Provider-only. Composes jena_model labels. Like jena_model, also emits RdfDatasetInfo (the union of all triples across the default + named graphs) so rules_rdf rules consume it transparently.

load("@rules_jena//jena:defs.bzl", "jena_model", "jena_dataset")

jena_model(name = "core", srcs = ["core.ttl"], in_format = "turtle")
jena_model(name = "facts", srcs = ["facts.ttl"], in_format = "turtle")
jena_model(name = "claims", srcs = ["claims.ttl"], in_format = "turtle")

jena_dataset(
    name = "corpus",
    default_graph = ":core",
    named_graphs = {
        "http://example.org/g/facts": ":facts",
        "http://example.org/g/claims": ":claims",
    },
)

Datasets without named graphs are a degenerate case — for those, use jena_model directly.

jena_dataset

load("@rules_jena//jena:dataset.bzl", "jena_dataset")

jena_dataset(name, default_graph, named_graphs)

A Jena Dataset composed of named-graph jena_models + an optional default graph.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required | |
| default_graph | A jena_model whose triples form the dataset’s default graph (unnamed). Optional. | Label | optional | None |
| named_graphs | Map of graph IRI → jena_model label. Each entry becomes a named graph in the resulting Dataset. | Dictionary: String -> Label | optional | {} |

Public API surface for rules_jena.

Re-exports the v0.2 user-facing rules (Bazel-idiomatic Jena data primitives) + the JENA_DEPS Maven label set shared with anyone writing their own Jena java_binary.

load("@rules_jena//jena:defs.bzl",
     "JENA_DEPS",
     "jena_model", "jena_dataset", "jena_rule_set", "jena_reasoner",
     "JenaModelInfo", "JenaDatasetInfo", "JenaRuleSetInfo", "JenaReasonerInfo")

Pair with the rules_rdf user-facing test rules (sparql_query_test, rdf_validate_test) — jena_model / jena_dataset emit both JenaModelInfo / JenaDatasetInfo AND RdfDatasetInfo, so they’re drop-in replacements for rdf_dataset in any rules_rdf rule.
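As an illustration of that pairing, the fragment below feeds a jena_dataset straight into two rules_rdf test rules. File names, graph IRIs, and target names are hypothetical. Per the jena_dataset description, rules_rdf rules see the union of the default and named graphs.

```starlark
load("@rules_jena//jena:defs.bzl", "jena_dataset", "jena_model")
load("@rules_rdf//sparql:defs.bzl", "sparql_query_smoke_test")
load("@rules_rdf//validate:defs.bzl", "rdf_validate_test")

jena_model(name = "core", srcs = ["core.ttl"])
jena_model(name = "facts", srcs = ["facts.ttl"])

jena_dataset(
    name = "corpus",
    default_graph = ":core",
    named_graphs = {"http://example.org/g/facts": ":facts"},
)

# The Jena-aware dataset is accepted wherever an rdf_dataset is expected.
rdf_validate_test(
    name = "corpus_conforms",
    dataset = ":corpus",
    shapes = "shapes.ttl",
)

sparql_query_smoke_test(
    name = "corpus_queries_smoke",
    dataset = ":corpus",
    queries = glob(["queries/*.rq"]),
)
```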

rules_jena’s MODULE.bazel auto-registers four toolchains satisfying every rules_rdf toolchain type — pulling in rules_jena is enough to run any of sparql_query_test, rdf_validate_test, rdf_transform, rdf_reason. v0.2’s jena_reason build action is the consumer-facing alternative when a downstream rule wants a concrete file artifact instead of the test-shaped rdf_reason.

jena_dataset

load("@rules_jena//jena:defs.bzl", "jena_dataset")

jena_dataset(name, default_graph, named_graphs)

A Jena Dataset composed of named-graph jena_models + an optional default graph.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required | |
| default_graph | A jena_model whose triples form the dataset’s default graph (unnamed). Optional. | Label | optional | None |
| named_graphs | Map of graph IRI → jena_model label. Each entry becomes a named graph in the resulting Dataset. | Dictionary: String -> Label | optional | {} |

jena_model

load("@rules_jena//jena:defs.bzl", "jena_model")

jena_model(name, srcs, base_iri, in_format)

A Jena Model (single RDF graph) declared as Bazel data.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required | |
| srcs | Source RDF files for this single graph. Concatenated in lexicographic order by Jena tools. | List of labels | required | |
| base_iri | Optional base IRI for resolving relative references in srcs. Empty = none. | String | optional | "" |
| in_format | Serialization of every file in srcs. Mixed formats aren’t supported; pipe through rdf_transform first if you need to combine. | String | optional | "turtle" |
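One way the "pipe through rdf_transform first" advice might look in a BUILD file. This sketch assumes that an rdf_transform target can be listed in srcs, that is, that its default output is the converted file; the file and target names are hypothetical.

```starlark
load("@rules_jena//jena:defs.bzl", "jena_model")
load("@rules_rdf//rdf:dataset.bzl", "rdf_dataset")
load("@rules_rdf//transform:defs.bzl", "rdf_transform")

# A source that is only available as N-Triples.
rdf_dataset(name = "legacy_nt", srcs = ["legacy.nt"], in_format = "ntriples")

# Normalize it to Turtle so every file in the model shares one format.
rdf_transform(
    name = "legacy_turtle",
    dataset = ":legacy_nt",
    out_format = "turtle",
)

jena_model(
    name = "combined",
    srcs = [
        "core.ttl",
        ":legacy_turtle",  # assumed to resolve to the converted .ttl output
    ],
    in_format = "turtle",
)
```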

jena_reasoner

load("@rules_jena//jena:defs.bzl", "jena_reasoner")

jena_reasoner(name, profile, rule_set)

A Jena reasoner configuration (provider-only).

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required | |
| profile | Built-in profile name or custom. custom requires rule_set. | String | optional | "rdfs" |
| rule_set | A jena_rule_set label. Required iff profile = ‘custom’. | Label | optional | None |

jena_rule_set

load("@rules_jena//jena:defs.bzl", "jena_rule_set")

jena_rule_set(name, rules)

A set of Jena rule files for the rule-engine reasoner.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required | |
| rules | Jena rule files. Each must follow the rule-engine syntax at https://jena.apache.org/documentation/inference/#rules. | List of labels | required | |

JenaDatasetInfo

load("@rules_jena//jena:defs.bzl", "JenaDatasetInfo")

JenaDatasetInfo(default_graph, named_graphs)

A Jena Dataset (collection of named graphs + an optional default graph). Used by rules that need named-graph addressability (Fuseki, multi-graph SPARQL).

FIELDS

| Name | Description |
| --- | --- |
| default_graph | JenaModelInfo or None: triples that live outside any named graph. |
| named_graphs | dict[str, JenaModelInfo]: graph IRI → model. Order-preserving. |

JenaModelInfo

load("@rules_jena//jena:defs.bzl", "JenaModelInfo")

JenaModelInfo(files, in_format, base_iri)

A single Jena Model (RDF graph). Provider-only — the files declared on the rule remain the source of truth.

FIELDS

| Name | Description |
| --- | --- |
| files | depset[File]: source files concatenated to form the model. |
| in_format | str: serialization (turtle, ntriples, nquads, trig, jsonld, rdfxml). Matches the rules_rdf RDF_FORMATS vocabulary. |
| base_iri | str: optional base IRI for relative references in the source files. Empty string = no base. |

JenaReasonerInfo

load("@rules_jena//jena:defs.bzl", "JenaReasonerInfo")

JenaReasonerInfo(profile, rule_set)

A Jena reasoner configuration. Either a built-in profile (rdfs, owl-rl, owl-mini, owl-micro) or a custom rule set; never both. Consumed by jena_reason and by the rdf_reasoner_toolchain_type plugin contract.

FIELDS

| Name | Description |
| --- | --- |
| profile | str: built-in profile name, or empty if custom. |
| rule_set | JenaRuleSetInfo or None: rule set for the custom profile. |

JenaRuleSetInfo

load("@rules_jena//jena:defs.bzl", "JenaRuleSetInfo")

JenaRuleSetInfo(files)

A set of Jena rule files consumed by the rule-engine reasoner (Jena’s RETE-based forward/backward inference). See https://jena.apache.org/documentation/inference/ for the rule syntax. Distinct from SPARQL .rq files.

FIELDS

| Name | Description |
| --- | --- |
| files | depset[File]: .rule files in the set. |

jena_model(name, srcs, in_format, base_iri) — declare one RDF graph as a Jena-aware data primitive.

Provider-only: no Bazel actions, no parsed-form artifacts. The srcs files remain the source of truth; downstream rules either read them directly or feed them to a Java tool that parses them into an in-memory Model.

Every jena_model ALSO emits RdfDatasetInfo (the abstract provider from rules_rdf) so it’s a drop-in dataset for any rules_rdf rule:

load("@rules_jena//jena:defs.bzl", "jena_model")
load("@rules_rdf//sparql:defs.bzl", "sparql_query_test")

jena_model(
    name = "ontology",
    srcs = ["ontology.ttl"],
    in_format = "turtle",
)

sparql_query_test(  # works: resolves via RdfDatasetInfo
    name = "ontology_well_formed",
    dataset = ":ontology",
    query = "queries/check.rq",
)

For named-graph use cases see jena_dataset.

jena_model

load("@rules_jena//jena:model.bzl", "jena_model")

jena_model(name, srcs, base_iri, in_format)

A Jena Model (single RDF graph) declared as Bazel data.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required | |
| srcs | Source RDF files for this single graph. Concatenated in lexicographic order by Jena tools. | List of labels | required | |
| base_iri | Optional base IRI for resolving relative references in srcs. Empty = none. | String | optional | "" |
| in_format | Serialization of every file in srcs. Mixed formats aren’t supported; pipe through rdf_transform first if you need to combine. | String | optional | "turtle" |

Provider types for the rules_jena public API.

The data primitives (jena_model, jena_dataset, jena_rule_set, jena_reasoner) are provider-only — they carry references to files + small Jena-shaped config, no build actions. Build-action rules (jena_reason, the rules_rdf-driven rdf_validate_test, etc.) consume them.

Every data-providing rule also emits the abstract RdfDatasetInfo from rules_rdf, so jena_model / jena_dataset are drop-in replacements for rdf_dataset in any rules_rdf rule. Consumers who want Jena-aware features (named graphs, rule sets, OWL profiles) reach for the Jena providers; everyone else stays on the abstract interface.

The names use the package-prefixed convention (JenaXInfo) so that an unwrapped JenaModelInfo import is unambiguous next to the rules_rdf RdfDatasetInfo.

JenaDatasetInfo

load("@rules_jena//jena:providers.bzl", "JenaDatasetInfo")

JenaDatasetInfo(default_graph, named_graphs)

A Jena Dataset (collection of named graphs + an optional default graph). Used by rules that need named-graph addressability (Fuseki, multi-graph SPARQL).

FIELDS

| Name | Description |
| --- | --- |
| default_graph | JenaModelInfo or None: triples that live outside any named graph. |
| named_graphs | dict[str, JenaModelInfo]: graph IRI → model. Order-preserving. |
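For rule authors, a minimal sketch of consuming this provider from a custom rule. The rule graph_manifest is hypothetical and exists only to show field access; it relies solely on the default_graph, named_graphs, and files fields documented on this page, and writes a TSV listing which source file belongs to which graph.

```starlark
"""graph_manifest.bzl: hypothetical consumer of JenaDatasetInfo."""

load("@rules_jena//jena:providers.bzl", "JenaDatasetInfo")

def _graph_manifest_impl(ctx):
    info = ctx.attr.dataset[JenaDatasetInfo]
    lines = []

    # default_graph is documented as optional, so guard against None.
    if info.default_graph != None:
        for f in info.default_graph.files.to_list():
            lines.append("default\t" + f.short_path)

    # named_graphs maps graph IRI -> JenaModelInfo.
    for iri, model in info.named_graphs.items():
        for f in model.files.to_list():
            lines.append(iri + "\t" + f.short_path)

    out = ctx.actions.declare_file(ctx.label.name + ".tsv")
    ctx.actions.write(output = out, content = "\n".join(lines) + "\n")
    return [DefaultInfo(files = depset([out]))]

graph_manifest = rule(
    implementation = _graph_manifest_impl,
    attrs = {
        "dataset": attr.label(
            mandatory = True,
            providers = [JenaDatasetInfo],
        ),
    },
)
```

In a BUILD file this would be used as graph_manifest(name = "corpus_manifest", dataset = ":corpus"), where :corpus is a jena_dataset target. Flattening the depsets with to_list() is acceptable for a small illustration but is usually avoided in rules that handle large inputs.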

JenaModelInfo

load("@rules_jena//jena:providers.bzl", "JenaModelInfo")

JenaModelInfo(files, in_format, base_iri)

A single Jena Model (RDF graph). Provider-only — the files declared on the rule remain the source of truth.

FIELDS

| Name | Description |
| --- | --- |
| files | depset[File]: source files concatenated to form the model. |
| in_format | str: serialization (turtle, ntriples, nquads, trig, jsonld, rdfxml). Matches the rules_rdf RDF_FORMATS vocabulary. |
| base_iri | str: optional base IRI for relative references in the source files. Empty string = no base. |

JenaReasonerInfo

load("@rules_jena//jena:providers.bzl", "JenaReasonerInfo")

JenaReasonerInfo(profile, rule_set)

A Jena reasoner configuration. Either a built-in profile (rdfs, owl-rl, owl-mini, owl-micro) or a custom rule set; never both. Consumed by jena_reason and by the rdf_reasoner_toolchain_type plugin contract.

FIELDS

| Name | Description |
| --- | --- |
| profile | str: built-in profile name, or empty if custom. |
| rule_set | JenaRuleSetInfo or None: rule set for the custom profile. |

JenaRuleSetInfo

load("@rules_jena//jena:providers.bzl", "JenaRuleSetInfo")

JenaRuleSetInfo(files)

A set of Jena rule files consumed by the rule-engine reasoner (Jena’s RETE-based forward/backward inference). See https://jena.apache.org/documentation/inference/ for the rule syntax. Distinct from SPARQL .rq files.

FIELDS

| Name | Description |
| --- | --- |
| files | depset[File]: .rule files in the set. |

jena_reasoner(name, profile|rule_set) — declare a reasoner configuration.

Provider-only. Either a built-in profile or a custom rule set; the rule rejects both-or-neither.

Built-in profiles map onto Jena’s ReasonerRegistry:

| profile | Jena equivalent |
| --- | --- |
| rdfs | ReasonerRegistry.getRDFSReasoner() |
| owl-rl | ReasonerRegistry.getOWLReasoner() |
| owl-mini | ReasonerRegistry.getOWLMiniReasoner() |
| owl-micro | ReasonerRegistry.getOWLMicroReasoner() |
| custom | GenericRuleReasoner with the given rule_set. |

The Aion production kg_reasoner uses owl-micro plus purpose-written Jena rule files; both shapes have first-class support here.

load("@rules_jena//jena:defs.bzl", "jena_rule_set", "jena_reasoner")

jena_rule_set(name = "kg_rules", rules = glob(["rules/*.rule"]))

jena_reasoner(name = "owl_micro_plus_kg", profile = "custom", rule_set = ":kg_rules")

To actually apply the reasoner to a base model, use jena_reason (which runs a build action) or the abstract rdf_reason rule from rules_rdf (which resolves the reasoner toolchain).
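For completeness, the built-in profiles need no rule set. A minimal sketch using the profile names from the mapping table (the target names are illustrative):

```starlark
load("@rules_jena//jena:defs.bzl", "jena_reasoner")

# profile is optional and defaults to "rdfs".
jena_reasoner(name = "rdfs_default")

# Select a built-in OWL profile by name; no rule_set is given.
jena_reasoner(
    name = "owl_micro",
    profile = "owl-micro",
)
```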

jena_reasoner

load("@rules_jena//jena:reasoner.bzl", "jena_reasoner")

jena_reasoner(name, profile, rule_set)

A Jena reasoner configuration (provider-only).

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required | |
| profile | Built-in profile name or custom. custom requires rule_set. | String | optional | "rdfs" |
| rule_set | A jena_rule_set label. Required iff profile = ‘custom’. | Label | optional | None |

jena_rule_set(name, rules) — a collection of Jena rule files for the rule-engine reasoner.

Provider-only. Consumed by jena_reasoner(profile = "custom") and by the rdf_reasoner_toolchain_type plugin contract when the plugin’s --rules flag points at a file from this set.

Jena rule syntax (the RETE forward-chainer’s input — distinct from SPARQL):

@prefix ex: <http://example.org/> .

[transitiveSubOrg:
    (?a ex:partOf ?b),
    (?b ex:partOf ?c)
    -> (?a ex:partOf ?c)
]

See https://jena.apache.org/documentation/inference/#rules. The file extension is .rule by convention; .txt is tolerated.

jena_rule_set

load("@rules_jena//jena:rules.bzl", "jena_rule_set")

jena_rule_set(name, rules)

A set of Jena rule files for the rule-engine reasoner.

ATTRIBUTES

| Name | Description | Type | Mandatory | Default |
| --- | --- | --- | --- | --- |
| name | A unique name for this target. | Name | required | |
| rules | Jena rule files. Each must follow the rule-engine syntax at https://jena.apache.org/documentation/inference/#rules. | List of labels | required | |