fastverk
proven systems, built fast — the hermetic software works.
A vertically-integrated, Bazel-native platform for complex, multi-modal
software, and the rules_* constellation it’s built on — every module one
concern, composed into hermetic, reproducible builds.
Managed cloud RBE + cache, a WireGuard mesh, and source hosting compose on top: the build, the network, the source, and the machine — reproducible and verified, end to end.
Where to go next
- The platform: the products (fvkit, the desktop app, the brand, and the turnkey AWS install).
- The constellation: the rules_* registry, one module per concern, composed into hermetic builds.
- Quick start: wire the registry into your own Bazel build.
- Philosophy: Bazel-native, hermetic, honest about gaps.
The platform
The products that sit on top of the constellation.
| Product | Role |
|---|---|
| fvkit | core/runtime: the platform library + the fvd daemon (volumes, bazelrc, connections, maintenance, updater) |
| fastverk-app | the macOS desktop app: a menu-bar control plane for the works |
| brand | the visual identity: one parametric source for the mark, icons, brandbook, decks, and this docs theme |
Turnkey AWS install
The server side installs into your own AWS account in one launch — a single CloudFormation stack brings up the cluster, nodes, TLS, and DNS, reachable at your own domain with no manual steps. The plugin runtime (gRPC-service plugins, discovered and routed like QueryRPC) lets features ship as narrowly-scoped repos without forking the core.
Keyless workload identity
CI gets a short-lived fastverk credential by exchanging its own OIDC token (no shared secrets): the workload-identity broker verifies a GitHub Actions token and mints a scoped, Cognito-backed token — the machine-identity sibling of the interactive login.
The constellation
The rules_* registry — one module per concern, composed into hermetic builds.
Each lands in the fastverk bazel-registry; per-module API reference
(stardoc-generated) and this catalog are kept current by the nightly rebuild.
Categories
- Language toolchains: rules_uv, rules_lean, rules_postgres, rules_autoconf
- API + schema: rules_jsonschema, rules_openapi, rules_aip
- Web + bundlers: rules_bun, rules_chrome, rules_nextjs, rules_storybook, rules_vite, rules_docker
- CI + cloud: rules_github, rules_gitlab, rules_ci, rules_cloudformation, rules_helm
- Docs + publishing: rules_mdbook, rules_tectonic, rules_readme, rules_markdown
- Semantic web: rules_jena, rules_rdf, rules_schema_org
Modules
The full catalog, generated from the registry by tools/harvest-catalog.sh
(nightly + on repository_dispatch), so new modules + releases appear here
automatically. See each module’s API reference (under Reference) for setup.
| Module | Latest | Source |
|---|---|---|
| botnoc | 0.1.0 | fastverk/botnoc |
| brand | 0.3.1 | fastverk/brand |
| buildbarn | 0.0.2 | fastverk/buildbarn |
| fastverk-app | 0.0.2 | fastverk/fastverk-app |
| forge | 0.0.1 | fastverk/forge |
| fvkit | 0.0.6 | fastverk/fvkit |
| meridian | 0.2.2 | mattmarshall/meridian |
| pinax | 0.1.0 | mattmarshall/pinax |
| rules_agentic_ide | 0.0.4 | fastverk/rules_agentic_ide |
| rules_aip | 0.2.2 | fastverk/rules_aip |
| rules_autoconf | 0.1.0 | fastverk/rules_autoconf |
| rules_beam | 0.0.2 | fastverk/rules_beam |
| rules_bibtex | 0.0.6 | fastverk/rules_bibtex |
| rules_bun | 0.4.0 | fastverk/rules_bun |
| rules_cc_cross | 0.1.0 | fastverk/rules_cc_cross |
| rules_cc_host | 0.1.0 | fastverk/rules_cc_host |
| rules_chrome | 0.1.0 | fastverk/rules_chrome |
| rules_ci_ir | 0.0.1 | fastverk/rules_ci_ir |
| rules_cloudformation | 0.8.0 | fastverk/rules_cloudformation |
| rules_docker | 0.2.6 | fastverk/rules_docker_compose |
| rules_eslint | 0.1.0 | fastverk/rules_eslint |
| rules_fastverk | 0.0.2 | fastverk/rules_fastverk |
| rules_github | 0.1.2 | fastverk/rules_github |
| rules_gitlab | 0.3.2 | fastverk/rules_gitlab |
| rules_graphviz | 0.1.0 | fastverk/rules_graphviz |
| rules_helm | 0.1.0 | fastverk/rules_helm |
| rules_huggingface | 0.0.3 | fastverk/rules_huggingface |
| rules_jena | 0.3.2 | fastverk/rules_jena |
| rules_jsonschema | 0.3.0 | fastverk/rules_jsonschema |
| rules_lang | 0.4.0 | fastverk/rules_lang |
| rules_lean | 0.5.3 | fastverk/rules_lean |
| rules_lora | 0.1.3 | fastverk/rules_lora |
| rules_macvm | 0.0.1 | fastverk/rules_macvm |
| rules_markdown | 0.0.3 | fastverk/rules_markdown |
| rules_mdbook | 0.3.1 | fastverk/rules_mdbook |
| rules_meridian | 0.2.1 | mattmarshall/meridian |
| rules_meson | 0.0.1 | fastverk/rules_meson |
| rules_nextjs | 0.3.0 | fastverk/rules_nextjs |
| rules_openapi | 0.2.1 | fastverk/rules_openapi |
| rules_podman | 0.0.2 | fastverk/rules_podman |
| rules_postgres | 0.8.0 | fastverk/rules_postgres |
| rules_puml | 0.0.2 | fastverk/rules_puml |
| rules_rdf | 0.4.0 | fastverk/rules_rdf |
| rules_readme | 0.0.3 | fastverk/rules_readme |
| rules_runpod | 0.0.11 | fastverk/rules_runpod |
| rules_schema_org | 0.0.1 | fastverk/rules_schema_org |
| rules_spec | 0.5.1 | fastverk/rules_spec |
| rules_ssh_tui | 0.0.5 | fastverk/rules_ssh_tui |
| rules_storybook | 0.2.0 | fastverk/rules_storybook |
| rules_systemd | 0.0.1 | fastverk/rules_systemd |
| rules_tap | 0.0.3 | fastverk/rules_tap |
| rules_tectonic | 0.2.0 | fastverk/rules_tectonic |
| rules_uv | 0.7.4 | fastverk/rules_uv |
| rules_vite | 0.1.1 | fastverk/rules_vite |
| rules_vscode | 0.0.2 | fastverk/rules_vscode |
| rules_walkthrough | 0.1.0 | fastverk/rules_walkthrough |
| rules_web | 0.0.1 | fastverk/rules_web |
| rules_xsd | 0.0.1 | fastverk/rules_xsd |
| spec | 0.5.2 | fastverk/spec |
| vpn | 0.0.1 | fastverk/vpn |
| wave | 0.0.1 | fastverk/wave |
Registry split
fastverk and citizen-sh publish from separate registries:
- fastverk modules: https://registry.fastverk.com/
- citizen-sh modules: https://raw.githubusercontent.com/citizen-sh/bazel-registry/main/
Use the registry chain that matches the modules you consume.
Quick start
Wire the fastverk registry into your Bazel build, then add the modules you need.
1. Add the registry chain
.bazelrc:
common --registry=https://registry.fastverk.com/
common --registry=https://bcr.bazel.build/
2. Depend on modules
MODULE.bazel:
bazel_dep(name = "rules_uv", version = "0.7.4")
bazel_dep(name = "rules_cloudformation", version = "0.8.0")
# … etc.
See each module’s API reference for module-specific setup (toolchains,
extensions, use_repo).
3. Build
bazel build //...
A fresh checkout / CI resolves every fastverk module from the registry — no git
submodules, no local_path_override. To hack on one locally, override it ad hoc:
bazel build //... --override_module=<name>=path/to/<name>
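The registry chain from step 1 is order-sensitive: Bazel looks modules up in earlier registries first and falls back to later ones only when a module is missing. A minimal Python sketch of a check that reads the chain back out of .bazelrc text follows. The helper name registry_chain is hypothetical and not part of fastverk, and the parser only understands the simple one-flag-per-token layout shown in step 1.

```python
def registry_chain(bazelrc_text: str) -> list:
    """Return the --registry URLs in the order they appear in .bazelrc text."""
    chain = []
    for raw in bazelrc_text.splitlines():
        # Drop trailing comments; assumes the URLs themselves contain no '#'.
        line = raw.split("#", 1)[0].strip()
        for token in line.split():
            if token.startswith("--registry="):
                chain.append(token[len("--registry="):])
    return chain


example = """
common --registry=https://registry.fastverk.com/
common --registry=https://bcr.bazel.build/
"""
chain = registry_chain(example)
# The fastverk registry should come before the BCR fallback.
fastverk_first = chain.index("https://registry.fastverk.com/") < chain.index("https://bcr.bazel.build/")
```

A CI job could fail the build when fastverk_first is false, which catches a reordered .bazelrc before module resolution silently changes.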
Philosophy
- Bazel-native first. Cross-module workflows are expressible as Bazel targets, not out-of-band scripts.
- Hermetic by default. Each module either pins its upstream artifact’s sha256 + extracts deterministically, or vendors a source tarball with the same. Host-tool dependencies are limited to OS-provided utilities that don’t drift.
- Honest about gaps. Modules ship at 0.0.x with explicit “no smoke” labels when not yet verified end-to-end. We don’t pretend.
- One thing per module. Splitting beats coupling.
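The "pin the sha256" half of the hermetic rule boils down to one comparison. The sketch below shows that comparison in Python using only hashlib; the function name verify_sha256 is hypothetical, and in a real build Bazel's own repository rules perform this check, not user code.

```python
import hashlib


def verify_sha256(data: bytes, expected_hex: str) -> None:
    """Raise ValueError unless data hashes to the pinned sha256 digest."""
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_hex.lower():
        raise ValueError(f"sha256 mismatch: expected {expected_hex}, got {actual}")


# Illustrative artifact; a real pin would be a digest committed to the repo.
artifact = b"example artifact bytes"
pinned = hashlib.sha256(artifact).hexdigest()
verify_sha256(artifact, pinned)  # passes silently when the bytes match the pin
```

Any change to the artifact bytes, however small, makes the call raise, which is what keeps an upstream re-release from slipping into the build unnoticed.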
Contributing
Each module has its own issues + PRs. For org-wide coordination (cross-module bumps, registry-tier moves, agent dispatch), botnoc — the bot-driven Network Operations Center — is the entry point. botnoc renders the module catalog above and orchestrates work across the constellation.
rules_mdbook
API reference, generated from the module’s .bzl docstrings (stardoc).
User-facing Bazel rules for rules_mdbook.
Exports mdbook_book, which runs mdbook build over a staged source
tree and packages the rendered HTML into a tarball. Optional plugin
executables (e.g. mdbook-mermaid) are staged onto PATH so mdbook can
resolve them by their bare names.
Targets returning MdbookSiteInfo expose the site tarball
programmatically so future rules (a deploy step, a link checker, a
mdbook serve wrapper) can consume the output without re-running
mdbook.
mdbook_book
load("@rules_mdbook//mdbook:defs.bzl", "mdbook_book")
mdbook_book(name, srcs, out, book_toml, plugins, src_strip_prefix)
Run mdbook build over a staged source tree and produce an HTML tarball.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| srcs | All source files (Markdown, SUMMARY.md, theme assets, etc.). Each file is staged at its package-relative path minus src_strip_prefix. A directory (tree artifact produced by an upstream rule) is copied recursively into its computed relative path, so a rule that stages a generated chapter tree can feed it here directly. | List of labels | required | |
| out | The rendered site, packaged as a .tar.gz. | Label | required | |
| book_toml | The mdbook configuration file. Staged at the root of the build sandbox. | Label | required | |
| plugins | mdbook plugin executables (e.g. @mdbook_mermaid//:mdbook-mermaid). Staged onto PATH so mdbook can resolve them by bare name. | List of labels | optional | [] |
| src_strip_prefix | Prefix to strip from each src’s package-relative path before staging. Empty means files land at their package-relative paths. | String | optional | "" |
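The staging behaviour described for srcs and src_strip_prefix can be sketched in Python. The function name staged_path and the decision to reject files that sit outside the prefix are assumptions made for illustration; the real rule is written in Starlark and may handle that case differently.

```python
def staged_path(package_relative_path: str, src_strip_prefix: str = "") -> str:
    """Return where a source file lands in the staged tree (hypothetical helper)."""
    prefix = src_strip_prefix.strip("/")
    if not prefix:
        # Empty prefix: files land at their package-relative paths.
        return package_relative_path
    if not package_relative_path.startswith(prefix + "/"):
        # Assumption: a file outside the prefix is an error, not silently kept.
        raise ValueError(f"{package_relative_path!r} is not under {prefix!r}")
    return package_relative_path[len(prefix) + 1:]


layout = {p: staged_path(p, "docs") for p in ["docs/SUMMARY.md", "docs/guide/intro.md"]}
```

With src_strip_prefix = "docs", SUMMARY.md ends up at the root of the staged tree, which is where mdbook expects to find it relative to the book source directory.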
mdbook_serve
load("@rules_mdbook//mdbook:defs.bzl", "mdbook_serve")
mdbook_serve(name, plugins)
Run mdbook serve (with watch + live reload) against the live user source tree under $BUILD_WORKSPACE_DIRECTORY/<package>. Invoke via bazel run //path/to:target. The target’s package directory must contain the book.toml; mdbook’s own watch picks up edits without Bazel re-running.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| plugins | mdbook plugin executables, staged onto PATH so mdbook resolves them by bare name. Match the plugins listed in your book.toml. | List of labels | optional | [] |
MdbookSiteInfo
load("@rules_mdbook//mdbook:defs.bzl", "MdbookSiteInfo")
MdbookSiteInfo(tarball)
A rendered mdbook site.
FIELDS
Module extension for rules_mdbook.
Auto-fetches prebuilt mdbook + mdbook-mermaid binaries for the host
platform. Versions are pinned by sha256 in
private/known_versions.bzl. Consumers can override the version per
tool via the toolchain tag class.
Default usage (pulls the default-pinned mdbook + mdbook-mermaid):
mdbook = use_extension("@rules_mdbook//mdbook:extensions.bzl", "mdbook")
use_repo(mdbook, "mdbook", "mdbook_mermaid")
Pin a specific version:
mdbook = use_extension("@rules_mdbook//mdbook:extensions.bzl", "mdbook")
mdbook.toolchain(mdbook_version = "0.5.2", mermaid_version = "0.17.0")
use_repo(mdbook, "mdbook", "mdbook_mermaid")
Release fetching is delegated to
@rules_github//github:repositories.bzl%github_binary_repository
so all our rules_* repos share one URL-shape + sha-pinning impl.
mdbook
mdbook = use_extension("@rules_mdbook//mdbook:extensions.bzl", "mdbook")
mdbook.toolchain(mdbook_version, mermaid_version)
Sets up @mdbook and @mdbook_mermaid as Bazel-fetched prebuilt binaries.
TAG CLASSES
toolchain
Attributes
Toolchain rule for rules_mdbook.
mdbook_toolchain wraps a single mdbook binary as a Bazel toolchain.
Consumers (the mdbook_book and mdbook_serve rules) resolve mdbook
through @rules_mdbook//mdbook:toolchain_type, so users can register
custom mdbook binaries (locally-built fork, alternate version, …) via
register_toolchains(...) without modifying rule attributes.
The module extension at @rules_mdbook//mdbook:extensions.bzl generates
a default toolchain (@mdbook//:mdbook_toolchain_def) wrapping the
prebuilt binary. Users register it from their MODULE.bazel:
register_toolchains("@mdbook//:mdbook_toolchain_def")
mdbook_toolchain
load("@rules_mdbook//mdbook:toolchains.bzl", "mdbook_toolchain")
mdbook_toolchain(name, mdbook)
Declare an mdbook binary as a Bazel toolchain.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| mdbook | Path to the mdbook executable. | Label | required | |
MdbookToolchainInfo
load("@rules_mdbook//mdbook:toolchains.bzl", "MdbookToolchainInfo")
MdbookToolchainInfo(mdbook)
The mdbook binary, resolved via a toolchain.
FIELDS
rules_cloudformation
API reference, generated from the module’s .bzl docstrings (stardoc).
rules_cloudformation roadmap
Three milestones to first useful release. Numbering matches the
rules_docker_compose cadence: v0.1 = schema-derived primitives,
v0.2 = hand-written orchestration, v0.3 = deploy wrappers + linter.
v0.1 — schema fetch + codegen
Get the schema into the repo as Bazel-fetched data, run codegen, ship the first typed rule end-to-end.
- Schema fetch. cloudformation/private/extensions.bzl defines an http_archive-backed module extension pinning aws-cloudformation/cloudformation-template-schema to a specific commit + sha256. Same shape as rules_docker_compose’s compose_spec_extension, except http_archive (not http_file) because the upstream packages the schema as part of a Maven build, not a single JSON file; see docs/SCHEMA_SOURCE.md.
- MODULE.bazel wires rules_jsonschema. Adds bazel_dep(name = "rules_jsonschema", version = "0.2.0") and a use_extension block consuming the schema repo.
- Codegen pipeline. A single jsonschema_starlark_codegen invocation reads the master Schema.template and emits cloudformation/cloudformation_rules.bzl: one rule() per AWS::* definition. Estimated ~1000+ rules (e.g. cloudformation_aws_s3_bucket, cloudformation_aws_lambda_function, cloudformation_aws_ec2_instance, cloudformation_aws_iam_role, cloudformation_aws_dynamodb_table, …). The committed .bzl is diff-tested against fresh codegen on every CI build, exactly like compose_rules.bzl in rules_docker_compose.
- Smoke test. One end-to-end example: a single cloudformation_aws_s3_bucket target rendered through a placeholder aggregator into golden YAML. Validates that the schema-fetch → codegen → typed-attr → JSON-shard → YAML pipeline works for at least one resource type before the v0.2 aggregator arrives.
v0.2 — hand-written orchestration
Replace the placeholder aggregator with the real graph-walking implementation, plus cross-stack ref resolution.
- cloudformation_stack. Aggregator rule. Collects shards from deps, validates the Ref graph (every logical ID referenced is defined or has a matching cloudformation_resource_ref), and renders one canonical template.yaml via a Rust cfn-gen binary. Same shape as docker_compose: shard JSON → typed struct (from rules_jsonschema’s jsonschema_rust_library) → canonical YAML. Stable key ordering so re-renders are byte-identical.
- cloudformation_resource_ref. Cross-stack reference resolver. Given a target stack label and an output name, resolves to the exported value at build time (via either a checked-in outputs.json or a stack-output index file), then rewrites a property of a named resource in the rendered template to that value. Same role as docker_compose_oci_image_ref: a build-time override that turns a symbolic reference into a concrete pinned value before deploy.
- Providers. CloudformationResourceInfo, CloudformationStackInfo, CloudformationResourceRefInfo.
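The aggregation step planned for cloudformation_stack (collect shards, validate the Ref graph, render with stable key ordering) can be sketched in Python. Everything here is an assumption for illustration: the shard layout with logical_id, type, and properties keys is hypothetical, and the sketch renders sorted JSON with the standard library where the roadmap calls for YAML from a Rust cfn-gen binary.

```python
import json


def find_refs(node):
    """Yield every name pointed at by a {"Ref": "..."} node, recursively."""
    if isinstance(node, dict):
        if set(node) == {"Ref"} and isinstance(node["Ref"], str):
            yield node["Ref"]
        else:
            for value in node.values():
                yield from find_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_refs(item)


def render_stack(shards, external_refs=()):
    """Merge shards into one template and fail on undefined Ref targets."""
    resources = {}
    for shard in shards:
        if shard["logical_id"] in resources:
            raise ValueError("duplicate logical ID: " + shard["logical_id"])
        resources[shard["logical_id"]] = {
            "Type": shard["type"],
            "Properties": shard["properties"],
        }
    known = set(resources) | set(external_refs)
    # Names starting with "AWS::" are pseudo parameters such as AWS::Region.
    missing = sorted(
        ref for ref in set(find_refs(resources))
        if ref not in known and not ref.startswith("AWS::")
    )
    if missing:
        raise ValueError("undefined Ref targets: " + ", ".join(missing))
    # sort_keys makes the output independent of shard and attribute order.
    return json.dumps({"Resources": resources}, sort_keys=True, indent=2) + "\n"


bucket = {"logical_id": "Assets", "type": "AWS::S3::Bucket",
          "properties": {"BucketName": "assets"}}
policy = {"logical_id": "AssetsPolicy", "type": "AWS::S3::BucketPolicy",
          "properties": {"Bucket": {"Ref": "Assets"}}}
template = render_stack([policy, bucket])
```

Passing the shards in either order yields the same bytes, which is the property that lets a golden-file diff test stay quiet across unrelated refactors.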
v0.3 — deploy + lint
Ship the runtime wrappers and the Java-based linter.
- cloudformation_up. bazel run wrapper that invokes aws cloudformation deploy --template-file <rendered> --stack-name <stack> against the rendered template, with --parameter-overrides flowing through from rule attrs. Same shape as docker_compose_up’s bazel run wrapper.
- cloudformation_down. bazel run wrapper for aws cloudformation delete-stack. Mirrors docker_compose_down.
- Java linter. Port of cfn-lint-style rules built with rules_java. Why Java: the upstream schema repo is a Maven project whose intrinsic-function and reference tables already exist in Java, so reusing them is cheaper than reimplementing. Packaged as a java_binary invoked from a cloudformation_lint_test rule that runs against every cloudformation_stack.
Schema source
Where rules_cloudformation’s typed rules ultimately come from.
Choice (v0.1)
We run the upstream Java assembler
(aws.cfn.codegen.json.Main from
aws-cloudformation/cloudformation-template-schema) at build time
against a sha-pinned snapshot of the AWS CloudFormation Resource
Specification. The assembler emits one JSON Schema per resource
group; we feed the storage group’s output (scoped to AWS::S3.*
in v0.1) through rules_jsonschema’s
jsonschema_starlark_codegen to produce the typed Bazel rules.
Two artifacts are pinned in
cloudformation/private/extensions.bzl:
- aws-cloudformation/cloudformation-template-schema at commit 5d7815b14fd533c15c30f9046a76cdcb89afd32a (sha256 7f40b919bbea6109244903744262074f6afa32fdd780a6dca0540ef1b57bd774). Fetched but not on the compile path; see the Lombok wrinkle section below. Vendored under cloudformation/private/assembler_src/ in delomboked form.
- The us-east-1 CloudFormationResourceSpecification.json at sha256 3bf0f8b5034b51c622da82f7cec9499112a40719f28fff5c6d2050a0c3a24459. Endpoint: https://d1uauaxba7bl26.cloudfront.net/latest/CloudFormationResourceSpecification.json.
How the build composes
@cfn_resource_spec//file:CloudFormationResourceSpecification.json
│
▼
//cloudformation:assembled_storage (cfn_assemble)
│
│ storage-spec.json (JSON Schema, ~280 KB,
│ 223 AWS::S3.* + Tag definitions)
▼
//cloudformation:aws_s3_bucket_gen (jsonschema_starlark_codegen)
│
▼
aws_s3_bucket.bzl (committed, diff_test-gated)
cfn_assemble synthesizes a YAML config that points the assembler
at the local pinned spec (the upstream bundled config.yml has all
25 region URLs hard-coded to the AWS CDN, which would defeat
build-time reproducibility), narrows the region set to us-east-1
(the source-of-truth region), and declares a single custom group
with the requested includes/excludes.
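The includes/excludes narrowing that a custom group performs can be illustrated with a short Python sketch. It assumes the patterns are regular expressions matched against the whole type name, which is how "AWS::S3.*" reads but is not confirmed by the text above; select_types is a hypothetical name.

```python
import re


def select_types(type_names, includes, excludes=()):
    """Return the sorted type names matching any include and no exclude."""
    include_res = [re.compile(p) for p in includes]
    exclude_res = [re.compile(p) for p in excludes]
    return sorted(
        name for name in type_names
        if any(r.fullmatch(name) for r in include_res)
        and not any(r.fullmatch(name) for r in exclude_res)
    )


spec_types = [
    "AWS::S3::Bucket",
    "AWS::S3::BucketPolicy",
    "AWS::S3Express::DirectoryBucket",
    "AWS::EC2::Instance",
]
storage = select_types(spec_types, includes=["AWS::S3.*"])
strict = select_types(spec_types, includes=["AWS::S3.*"],
                      excludes=["AWS::S3Express.*"])
```

Under the regex assumption, "AWS::S3.*" also pulls in AWS::S3Express types, so a group that wants plain S3 only needs either an exclude or a tighter include such as "AWS::S3::.*".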
Lombok wrinkle
The upstream sources use Lombok 1.16.22 (released 2018). That
release predates JDK 21+. The current Lombok release line (1.18.x)
fails to initialize under JDK 25 with
com.sun.tools.javac.code.TypeTag :: UNKNOWN, and Bazel 9.1.0’s
rules_java toolchain runs the JavaBuilder on remotejdk25 by default
without an easy override.
The obvious fallbacks (bumping Lombok, pinning
--java_runtime_version=remotejdk_21, pinning Lombok 1.18.36) did not
sidestep the issue, because the JavaBuilder process itself runs on
remotejdk25. We therefore took the documented nuclear option: ran
lombok.jar delombok against the upstream sources locally
(java -jar lombok.jar delombok src/main/java -d cloudformation/private/assembler_src),
stripped the @lombok.Generated annotations the delomboker leaves
on each generated method, and committed the result.
Trade-off: refreshing the assembler from a newer upstream commit
isn’t a one-line bump anymore — it’s a delombok + commit. In
exchange, the build has no annotation-processor at compile time and
no Lombok runtime dep, so it stays buildable on whichever JDK Bazel
ships with going forward.
The patched upstream Codegen has one rules_cloudformation-local
fix: newer CFN spec entries can have Type: Json with no
PrimitiveType set, which the upstream code treats as a primitive
but then NPEs on. The patch in Codegen.addPrimitiveType falls back
to “Json” when the primitive name is null.
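The fallback can be pictured with a few lines of Python. This is only an illustration of the described behaviour, not the Java patch itself: primitive_type_name is a hypothetical name, and the property dictionaries mimic the Type and PrimitiveType keys mentioned above.

```python
from typing import Optional


def primitive_type_name(property_spec: dict) -> Optional[str]:
    """Return the primitive name, falling back to "Json" for bare Json types."""
    name = property_spec.get("PrimitiveType")
    if name is None and property_spec.get("Type") == "Json":
        # Newer spec entries: Type is Json but PrimitiveType is absent.
        return "Json"
    return name


bare_json = primitive_type_name({"Type": "Json"})
plain_string = primitive_type_name({"PrimitiveType": "String"})
```

Entries that are neither primitive nor bare Json still return None, so callers keep their existing handling for list, map, and structured property types.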
Known gap: registry-only resources
The legacy Resource Specification we pin covers ~1582 of the
~1600+ types AWS publishes. A handful of newer types
(post-2023 additions — e.g. AWS::EC2::Image,
AWS::EC2::SnapshotBlockPublicAccess) only ship via the newer
CloudFormation Registry schema source (per-resource JSON files at
schema.cloudformation.us-east-1.amazonaws.com/) and never
landed in the legacy spec. Surfacing them would mean pulling
from the Registry endpoint as a second source, with the same
per-resource-file shape as the v0.0.1 source, but only for the resources the
legacy spec is missing. v0.7+ work item; not on the current
roadmap because demand is low (savvi-ops, the design’s stress
test, hits ~1 of 87 in-use AWS types as a registry-only).
Alternatives considered
| Source | Why not chosen |
|---|---|
| Per-resource AWS endpoint (schema.cloudformation.us-east-1.amazonaws.com/<resource>.json) | The v0.0.1 / first-cut v0.1 used this. It works but it’s a per-resource fetch (1200+ URLs to track for full coverage) and the schema content is the AWS resource-provider schema, which is divergent from the CloudFormation Resource Specification. Pivoting now keeps the same source-of-truth as cfn-lint and the CFN Linter docs. |
| aws-cloudformation/cloudformation-cli registry schemas | Same per-resource shape, different repository. No advantage. |
| Hand-curated subset | rules_jsonschema’s whole point is avoiding drift between hand-written rules and upstream. Hard-no. |
Refreshing
Three independent bumps:
- CFN Resource Specification (typical: track AWS-published spec versions):
    curl -fsSL https://d1uauaxba7bl26.cloudfront.net/latest/CloudFormationResourceSpecification.json | shasum -a 256
    # bump _RESOURCE_SPEC_SHA256 in cloudformation/private/extensions.bzl
    bazel run //cloudformation:update
- Upstream assembler source (rare: only when upstream changes how groups are computed or fixes a Codegen bug):
    # Compute the new tarball hash
    curl -fsSL https://github.com/aws-cloudformation/cloudformation-template-schema/archive/<commit>.tar.gz | shasum -a 256
    # Re-delombok + commit
    curl -fsSL https://projectlombok.org/downloads/lombok-1.18.36.jar -o /tmp/lombok.jar
    java -jar /tmp/lombok.jar delombok \
      <unpacked-src>/src/main/java \
      -d cloudformation/private/assembler_src
    find cloudformation/private/assembler_src -name '*.java' -exec sed -i '' 's/@lombok\.Generated//g' {} +
    # Bump _TEMPLATE_SCHEMA_COMMIT + _TEMPLATE_SCHEMA_SHA256 in extensions.bzl
    bazel run //cloudformation:update
- Maven deps (rare: only when upstream pom.xml shifts):
    # Edit MODULE.bazel's maven.install(artifacts=[...]) list
    REPIN=1 bazel run @cfn_assembler_maven//:pin
Path to ~1200 resource types
v0.1 covers AWS::S3::Bucket as a codegen smoke. v0.2 lifts the
hard-coded resource set into a tag class:
cfn_resources = use_extension(
"@rules_cloudformation//cloudformation/private:extensions.bzl",
"cfn_sources_extension",
)
cfn_resources.bundle(
name = "storage",
includes = ["AWS::S3.*", "AWS::DynamoDB.*"],
)
cfn_resources.bundle(
name = "compute",
includes = ["AWS::EC2.*", "AWS::Lambda.*"],
)
so consumers opt into the resource set they care about — declaring
1200 typed Bazel rules per consumer when they use 10 is wasted
analysis time. Bundling lands in v0.2 (see
ROADMAP.md).
rules_gitlab
API reference, generated from the module’s .bzl docstrings (stardoc).
Public Bazel rules for working with GitLab CI configuration.
Today (v0.1.0):
- gitlab_ci_validate(name, src): build-action rule. Validates a .gitlab-ci.yml against the official GitLab JSON Schema pinned by sha via the gitlab_schemas module extension. Hermetic; no network, no auth.
- gitlab_ci_lint(name, src, host, repo): bazel run-able target. Wraps glab ci lint <src> via the glab toolchain. Hits the GitLab API for full pipeline validation (semantic checks beyond pure schema + include: resolution). Requires glab auth login to the target instance.
Future surface:
- gitlab_ci_lint_remote(name, src, project): call /api/v4/projects/:id/ci/lint directly (no glab CLI indirection), bake the project context.
- Deploy + registry helpers, schema-derived typed Starlark rules for authoring .gitlab-ci.yml from Bazel (mirroring the rules_jsonschema + rules_cloudformation pattern).
Limitations of gitlab_ci_validate:
- Does not follow include: directives. A .gitlab-ci.yml that imports another project’s snippets is validated only at its own leaf level; chain validations on the included files by registering each as a separate gitlab_ci_validate target. gitlab_ci_lint handles includes server-side.
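Because gitlab_ci_validate stops at the file it is given, it helps to know which local files a config pulls in so each can get its own target. The Python sketch below works on an already-parsed config (a plain dict, however it was loaded) and lists the local include paths. The function name local_includes is hypothetical, and only the string shorthand and the local: form are handled; project, remote, template, and component includes are skipped on purpose.

```python
def local_includes(ci_config: dict) -> list:
    """Return the sorted local include paths of a parsed .gitlab-ci.yml."""
    include = ci_config.get("include", [])
    if isinstance(include, (str, dict)):
        include = [include]  # a single entry may appear without a list
    paths = set()
    for item in include:
        if isinstance(item, str):
            # String shorthand: a URL is a remote include, anything else local.
            if not item.startswith(("http://", "https://")):
                paths.add(item)
        elif isinstance(item, dict) and "local" in item:
            paths.add(item["local"])
    return sorted(paths)


config = {
    "include": [
        "/ci/build.yml",
        {"local": "/ci/test.yml"},
        {"project": "group/shared", "file": "/templates/deploy.yml"},
        "https://example.com/remote.yml",
    ],
    "stages": ["build", "test"],
}
to_validate = local_includes(config)
```

Each returned path is a candidate for its own gitlab_ci_validate target; includes from other projects still need gitlab_ci_lint, which resolves them server-side.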
gitlab_ci_lint
load("@rules_gitlab//gitlab:defs.bzl", "gitlab_ci_lint")
gitlab_ci_lint(name, src, host, repo)
bazel run-able target that lints a .gitlab-ci.yml via glab ci lint. Network-bound: hits the GitLab API, requires the user to be glab auth login-ed to the target instance. For hermetic schema-only validation, use gitlab_ci_validate instead.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| src | Label of the .gitlab-ci.yml (or fragment) to lint. | Label | required | |
| host | GitLab host (e.g. gitlab.savvifi.com). Used to anchor glab’s API target when the runfiles cwd doesn’t have a gitlab remote. Ignored if repo is set (which carries host). | String | optional | "" |
| repo | OWNER/REPO or full URL passed as glab -R. Strongly recommended — lets glab pick the right GitLab instance + project context without inspecting the sandbox’s git state. | String | optional | "" |
gitlab_ci_validate
load("@rules_gitlab//gitlab:defs.bzl", "gitlab_ci_validate")
gitlab_ci_validate(name, src)
Validate a .gitlab-ci.yml against the official GitLab JSON Schema (pinned by sha256 via the gitlab_schemas module extension). Output: a stamp file Bazel checks for caching; on schema violation the build fails with check-jsonschema’s diagnostic on stderr.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| src | Label of the .gitlab-ci.yml (or sibling fragment) to validate. | Label | required | |
gitlab_ci
load("@rules_gitlab//gitlab:defs.bzl", "gitlab_ci")
gitlab_ci(name, stages, variables, default, image, include, workflow, jobs, extra, out, write_to,
validate, **kwargs)
Generate a .gitlab-ci.yml from a typed Starlark spec.
Assembles the spec in a fixed top-level order (include, workflow,
default, image, stages, variables, jobs sorted by name, extra),
emits it deterministically as YAML, and (by default) schema-validates
the result. Set write_to (e.g. ".gitlab-ci.yml") to also create
<name>.update — bazel run …:<name>.update writes the file into the
source tree; bazel test …:<name>.update checks it is up to date.
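The fixed top-level order can be sketched in Python, where dicts preserve insertion order. This is an illustration of the ordering rule only: assemble_ci is a hypothetical name, and the sketch assumes jobs sit at the top level of the document as they do in a .gitlab-ci.yml, with extra keys appended last.

```python
TOP_LEVEL_ORDER = ("include", "workflow", "default", "image", "stages", "variables")


def assemble_ci(jobs, extra=None, **top_level):
    """Build the CI document in a fixed, deterministic key order."""
    unknown = set(top_level) - set(TOP_LEVEL_ORDER)
    if unknown:
        raise TypeError("unknown top-level keys: " + ", ".join(sorted(unknown)))
    doc = {}
    for key in TOP_LEVEL_ORDER:
        if top_level.get(key) is not None:
            doc[key] = top_level[key]
    for name in sorted(jobs):  # jobs sorted by name
        doc[name] = jobs[name]
    doc.update(extra or {})  # extra keys land last
    return doc


doc = assemble_ci(
    jobs={
        "test": {"stage": "test", "script": ["bazel test //..."]},
        "build": {"stage": "build", "script": ["bazel build //..."]},
    },
    variables={"CI": "1"},
    stages=["build", "test"],
)
key_order = list(doc)
```

The keyword arguments arrive as variables then stages, yet the document lists stages first; output order depends on the fixed table, not on how the caller wrote the spec.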
PARAMETERS
gitlab_job
load("@rules_gitlab//gitlab:defs.bzl", "gitlab_job")
gitlab_job(stage, script, image, services, before_script, after_script, rules, needs, artifacts,
variables, cache, tags, environment, when, allow_failure, interruptible, timeout, retry,
parallel, coverage, extends, dependencies, extra)
Build one GitLab CI job as a None-stripped, key-ordered dict.
Returns a plain dict (Starlark structs aren’t json.encode-able), so
pass the result as a value in gitlab_ci(jobs = {...}). Any key not
modeled here can be supplied via extra (a raw dict, merged last).
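A Python sketch of the same idea: drop unset fields, emit modeled keys in a fixed order, and merge extra last. The name make_job is hypothetical, and JOB_KEY_ORDER covers only four of the modeled keys, taken in signature order as an assumption about the real ordering.

```python
import json

# Assumption: modeled keys are emitted in signature order; subset for brevity.
JOB_KEY_ORDER = ("stage", "script", "image", "needs")


def make_job(extra=None, **fields):
    """Return a plain dict with None values stripped and extra merged last."""
    unknown = set(fields) - set(JOB_KEY_ORDER)
    if unknown:
        raise TypeError("unmodeled keys belong in extra: " + ", ".join(sorted(unknown)))
    job = {k: fields[k] for k in JOB_KEY_ORDER if fields.get(k) is not None}
    job.update(extra or {})
    return job


build = make_job(
    script=["bazel build //..."],
    stage="build",
    image=None,  # unset, so it is stripped from the result
    extra={"resource_group": "ci"},
)
encoded = json.dumps(build)
```

Returning a plain dict keeps the value directly JSON-encodable, which mirrors the reason given above for not using structs.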
PARAMETERS
gitlab_reference
load("@rules_gitlab//gitlab:defs.bzl", "gitlab_reference")
gitlab_reference(*parts)
Emit a GitLab !reference [job, key, ...] tag value.
Usable as a value anywhere in a spec; survives json.encode as a
sentinel the emitter turns back into a real !reference YAML tag.
PARAMETERS
rules_jsonschema
API reference, generated from the module’s .bzl docstrings (stardoc).
RFC-001 — Codegen Plugin Protocol
Status: draft, revised. Captures the architecture pivot from “Rust-binary-per-output-language” to “per-language plugins reading the schema directly via a minimal stdin/stdout contract”.
Earlier drafts of this RFC proposed a protoc-style architecture with a frontend, a parsed AST proto, and a dual ast/raw plugin mode. That design was abandoned because (a) JSON Schema is already JSON, so every plugin language can parse it directly, and (b) most realistic plugins wrap upstream tools (typify, atombender/go-jsonschema, oapi-codegen, …) that have their own parsing anyway. The AST was a small spec language we’d be inventing for marginal benefit. See “Why we abandoned the AST” below for the full reasoning.
Goal
Decouple rules_jsonschema’s user-facing rules from a hardcoded codegen language. After this RFC lands, adding a new output language is:
- Write a plugin binary in that language so it leverages native AST tooling: go/format for Go, quote/syn for Rust, ts-morph for TypeScript.
- Register a jsonschema_codegen_toolchain pointing at it.
- Add a jsonschema_<lang>_library user-facing rule that wraps the target language’s *_library Bazel rule.
The plugin reads the schema bytes from stdin, options from argv, writes the generated file content to stdout, and signals errors via stderr + exit code. No protobuf dep, no AST proto, no frontend binary. Stdlib-only plugins are achievable in any language.
The contract
A plugin is any executable that conforms to:
INPUT
stdin the schema file contents (raw bytes)
argv --key=value pairs, repeated. Plugin-specific.
The rule may also pass standard flags it owns.
OUTPUT
stdout the generated file content (raw bytes)
stderr diagnostics / error messages
EXIT
0 success — stdout is the generated file
non-zero failure — stderr explains why
That’s it. A plugin in Go is:
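package main

import (
	"encoding/json"
	"fmt"
	"io"
	"os"
)

func main() {
	// stdin carries the raw schema bytes.
	schemaBytes, err := io.ReadAll(os.Stdin)
	if err != nil {
		fmt.Fprintln(os.Stderr, "read:", err)
		os.Exit(1)
	}
	var schema map[string]any
	if err := json.Unmarshal(schemaBytes, &schema); err != nil {
		fmt.Fprintln(os.Stderr, "parse:", err)
		os.Exit(1)
	}
	// ... generate Go source from schema ...
	generated := "package generated\n" // placeholder for the real output
	os.Stdout.Write([]byte(generated))
}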
A plugin in Rust is the same thing with serde_json. A plugin in
Python wraps json.load(sys.stdin.buffer). There is no
contract-specific dep in any language.
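For comparison, here is a minimal Python sketch of a plugin that follows the same contract using only the standard library. The generate function is placeholder logic invented for illustration (it emits one comment line per top-level property), and main takes its streams as parameters so it can be exercised without a real process.

```python
import io
import json
import sys


def generate(schema: dict) -> bytes:
    """Placeholder codegen: one comment line per top-level property."""
    lines = ["# generated from schema: " + str(schema.get("title", "untitled"))]
    lines += ["# property: " + name for name in sorted(schema.get("properties", {}))]
    return ("\n".join(lines) + "\n").encode("utf-8")


def main(stdin, stdout, stderr) -> int:
    """Read schema bytes from stdin, write output to stdout, return exit code."""
    try:
        schema = json.load(stdin)
    except ValueError as err:  # covers malformed JSON and bad encodings
        print("parse:", err, file=stderr)
        return 1
    if not isinstance(schema, dict):
        print("parse: schema root must be a JSON object", file=stderr)
        return 1
    stdout.write(generate(schema))
    return 0


# As an installed plugin binary the entry point would be:
#   sys.exit(main(sys.stdin.buffer, sys.stdout.buffer, sys.stderr))
out, err = io.BytesIO(), io.StringIO()
code = main(io.BytesIO(b'{"title": "Person", "properties": {"name": {}, "age": {}}}'), out, err)
```

Nothing is written to stdout on failure, so a failed action never leaves a half-generated file behind for downstream rules to pick up.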
Standard argv conventions
The rule passes a fixed set of flags every plugin receives, plus
whatever the consumer set in options:
| Flag | Set by | Meaning |
|---|---|---|
| --schema-name=NAME | rule | Original schema file basename (e.g. compose-spec.json). For error messages and stable codegen header comments. |
| --rule-name=NAME | rule | The Bazel target’s name. Useful for picking output identifiers. |
| --<consumer-flag>=VAL | consumer | Free-form per-plugin options from the rule attrs. |
Plugins should treat unknown flags as a hard error so misconfigured options don’t silently degrade output.
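A strict parser for these flags fits in a few lines. The Python sketch below is one way to do it; parse_flags and the example package option are hypothetical, and a real plugin would print the error to stderr and exit non-zero where this sketch raises.

```python
STANDARD_FLAGS = {"schema-name", "rule-name"}


def parse_flags(argv, plugin_flags=()):
    """Parse --key=value arguments, rejecting anything the plugin does not know."""
    known = STANDARD_FLAGS | set(plugin_flags)
    options = {}
    for arg in argv:
        if not arg.startswith("--") or "=" not in arg:
            raise ValueError(f"expected --key=value, got {arg!r}")
        key, value = arg[2:].split("=", 1)  # values may themselves contain '='
        if key not in known:
            raise ValueError(f"unknown flag: --{key}")
        options[key] = value
    return options


opts = parse_flags(
    ["--schema-name=compose-spec.json", "--rule-name=compose", "--package=compose"],
    plugin_flags=["package"],
)
```

Rejecting unknown keys up front means a typo in the rule's options attribute fails the build instead of quietly producing default output.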
Bazel output declaration
Bazel rules must declare their outputs at analysis time, before any action runs. Three real options were considered:
| Approach | Pros | Cons |
|---|---|---|
| A. Single file per rule invocation | Output path known at analysis. Simple. Matches protoc-gen-go in practice. | Plugin authors can’t naturally split output. |
| B. declare_directory (tree artifact) | Plugin emits arbitrarily many files. | Downstream rust_library / go_library rules have to glob the directory or expand it. Awkward, non-standard. |
| C. Two-pass: pre-flight + emit | Plugin advertises outputs given a schema, then generates. | Two plugin invocations per build. Doubles action overhead. |
Decision: A. Plugin produces exactly one file (on stdout) per rule invocation. Multi-output needs (types vs validators, client vs server) split into separate rule targets:
jsonschema_go_types(name = "person_types", schema = "person.json")
jsonschema_go_validators(name = "person_validators", schema = "person.json")
Each target is independently cacheable; the build graph is clearer. Tree artifacts (B) remain available as an escape hatch for the rare genuinely-multi-file plugin.
Bazel rule shape
Each per-language user-facing rule has the same structure:
# Assumes bazel_skylib is a dependency; shell.quote comes from its shell.bzl.
load("@bazel_skylib//lib:shell.bzl", "shell")

def _jsonschema_rust_codegen_impl(ctx):
    out = ctx.actions.declare_file(ctx.label.name + ".rs")
    tc = ctx.toolchains[_RUST_TOOLCHAIN].codegen_info

    # Standard flags the rule owns.
    args = [
        "--schema-name=" + ctx.file.schema.basename,
        "--rule-name=" + ctx.label.name,
    ]

    # Plugin-specific options passed through from rule attrs.
    for k, v in ctx.attr.options.items():
        args.append("--{}={}".format(k, v))

    # Schema on stdin, generated file on stdout, per the plugin contract.
    ctx.actions.run_shell(
        inputs = [ctx.file.schema],
        outputs = [out],
        tools = [tc.binary],
        command = "{plugin} {args} < {schema} > {out}".format(
            plugin = tc.binary.path,
            args = " ".join([shell.quote(a) for a in args]),
            schema = ctx.file.schema.path,
            out = out.path,
        ),
    )
    return [DefaultInfo(files = depset([out]))]
User-facing macro composes that codegen with the target language’s library rule:
def jsonschema_rust_library(name, schema, **kwargs):
    gen_name = name + "_rs_gen"
    _jsonschema_rust_codegen(name = gen_name, schema = schema)
    rust_library(
        name = name,
        srcs = [":" + gen_name],
        edition = "2021",
        deps = [...],
        **kwargs
    )
Same shape per language.
Why we abandoned the AST
The first draft of this RFC proposed a protoc-style architecture: a frontend parses the schema into a canonical AST proto, plugins consume that AST instead of raw bytes. After looking at it harder I think this was the wrong call. Reasons:
- The protoc analogy doesn’t transfer. protoc has an AST because .proto files have a grammar nobody else has implemented. Plugin authors would otherwise re-implement parsing. JSON Schema is already JSON: every plugin language has a JSON parser in stdlib or one-line dep. The “no plugin reparses” argument is ~free to ignore for us.
- Most plugins wrap upstream tools. typify, atombender/go-jsonschema, oapi-codegen, openapi-generator all take raw schema bytes and have their own parsing. Our AST would be throwaway work for them. The dual mode = "ast" | "raw" we briefly proposed was evidence the AST wasn’t the natural fit.
- Cross-plugin consistency was illusory. Different upstream tools interpret edge cases differently (recursive refs, allOf ordering, oneOf discriminator behavior). Putting an AST in front doesn’t unify them; each wrapping plugin still defers to its underlying library.
- Maintenance cost is real. Defining Schema / Type / UnionType / IntersectionType is a small spec language we invent and ship. Every JSON Schema feature we don’t model becomes an extra_json escape hatch. We’d end up maintaining a parallel type system that nothing consumes natively.
- Plugin author ergonomics matter. “Read stdin, write stdout” is the lowest possible barrier to entry. A Bash script could be a plugin. Adding “deserialise a protobuf request” pushes plugin authors into language-specific toolchain setup before they write the first line of codegen logic.
The toolchain pattern (toolchain types per output language, register your own plugin to override) survives the simplification unchanged.
Why we also abandoned the proto envelope
Even without an AST, we considered keeping a thin proto wrapper:
CodeGenRequest{raw_schema, options, version} in, CodeGenResponse{file, error, features} out. Forward-compat without the AST baggage.
The argument against:
- The structured-options part is the only piece of the proto that isn’t trivially expressible as stdin/argv/stderr/exit-code. argv handles structured options fine.
- For ~5 plugins over the foreseeable future, “add a field without breaking old plugins” isn’t load-bearing; we can coordinate.
- Plugin author barrier matters more than abstract evolvability. A one-file Python plugin (15 lines) beats a Rust plugin with protobuf codegen deps by any reasonable measure.
- We can always add a proto envelope later if we hit a real wall. Migrating plugins is straightforward — only the stdin-parsing changes, the codegen logic doesn’t.
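As a rough illustration of how small a plugin can be under this contract, here is a sketch of a toy Python plugin. The emitted "generated file" (a FIELDS list), the exit codes 1 and 2, and the name run_plugin are assumptions made for the example; only the stdin/argv/stdout/stderr split and the two flag names come from this RFC.

```python
import json
import sys

KNOWN_FLAGS = {"schema-name", "rule-name"}


def run_plugin(argv, schema_bytes):
    """One invocation: returns (exit_code, stdout_text, stderr_text)."""
    flags = {}
    for arg in argv:
        key, sep, value = arg[2:].partition("=")
        if not arg.startswith("--") or not sep or key not in KNOWN_FLAGS:
            return 2, "", "unknown flag: {}\n".format(arg)
        flags[key] = value
    name = flags.get("schema-name", "<stdin>")
    try:
        schema = json.loads(schema_bytes)
    except ValueError as err:
        # Fail before emitting anything, so stdout stays empty on error.
        return 1, "", "{}: malformed JSON: {}\n".format(name, err)
    props = sorted(schema.get("properties", {})) if isinstance(schema, dict) else []
    out = "# generated from {} for rule {}\n".format(name, flags.get("rule-name", "?"))
    out += "FIELDS = {!r}\n".format(props)
    return 0, out, ""


def main():
    """Wire the pure function to the real process streams."""
    code, out, err = run_plugin(sys.argv[1:], sys.stdin.buffer.read())
    sys.stdout.write(out)
    sys.stderr.write(err)
    return code
```

An entry script would finish with sys.exit(main()). Keeping run_plugin free of I/O makes the error discipline (nothing on stdout when the exit code is non-zero) easy to check in a unit test.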
Open questions
- Stable JSON Schema spec-version handling. Plugins should probably refuse to operate on schemas whose $schema doesn’t match what they expect. Convention: plugins error with --schema-name=… : unsupported $schema: <value> rather than producing wrong output. Each plugin owns its own version detection.
- Cross-plugin shared parsing. If we ever need it (we don’t yet), a future RFC could add an optional sidecar artifact: the rule runs a one-time jsonschema_parse action that emits a normalised JSON form, and plugins opt into reading that instead of the original schema. Backward compatible: old plugins still consume raw.
- Diagnostic format. stderr is freeform today. If we ever want structured diagnostics (file:line:col annotations), we’d define a stderr-line format like WARNING:path:line:col:msg. Not v1.
- Toolchain attr surface. Currently the toolchain rule just carries binary. Future fields might include: supported_drafts (list of $schema values), default_options (dict), version (for diagnostic banners). All additive.
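The first open question, refusing schemas with an unexpected $schema, could look roughly like the following inside a plugin. This is a sketch under assumptions: the function name check_schema_draft, the choice of supported drafts, and the decision to accept schemas that declare no $schema at all are all illustrative, and the error line only approximates the convention described above.

```python
# Draft identifiers a hypothetical plugin chooses to support.
SUPPORTED_DRAFTS = (
    "http://json-schema.org/draft-07/schema#",
    "https://json-schema.org/draft/2020-12/schema",
)


def check_schema_draft(schema, schema_name, supported=SUPPORTED_DRAFTS):
    """Return None when the schema is acceptable, else a one-line error for stderr."""
    declared = schema.get("$schema")
    if declared is None:
        # Assumption: an undeclared draft is accepted and handled as the
        # plugin's default; a stricter plugin could reject it instead.
        return None
    if declared in supported:
        return None
    return "{}: unsupported $schema: {}".format(schema_name, declared)


error = check_schema_draft(
    {"$schema": "http://json-schema.org/draft-04/schema#"}, "person.json"
)
print(error)
```

A plugin would write the returned line to stderr and exit non-zero without emitting any output.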
Decisions to lock in before Phase 1
- Plugin contract: stdin = schema bytes, argv = options, stdout = generated file content, stderr + exit code for errors. No proto, no AST.
- Bazel outputs: single file per rule invocation. Multi-output needs split into separate targets. Tree-artifact escape hatch for genuine many-file plugins.
- Plugin discovery: toolchain types per output language (already in place).
- Repo naming: stay rules_jsonschema.
Phases
Phase 1: nail down the contract in code
- //jsonschema:plugin_contract.md (or similar): a concise written spec of stdin/argv/stdout/stderr the contract docs reference.
- Refit the existing Rust + Starlark codegen binaries to the new contract. schema_to_rust already mostly does this (it reads a path from --schema); switch to stdin and the standard argv flags.
- Update //rust:defs.bzl and //starlark:defs.bzl to invoke plugins via the contract.
- Existing rules_docker_compose tests should pass byte-identical.
Phase 2: Go plugin (in Go)
- tools/plugin_go/main.go reads schema bytes from stdin, parses via encoding/json, emits Go types using go/format. Uses rules_go.
- //go:defs.bzl with jsonschema_go_library.
- Smoke example: person.json → Go types → round-trip decode test.
This validates the cross-language contract works as cleanly as the RFC claims. If implementing the Go plugin is harder than the “15 lines” pitch, the contract needs tightening.
Phase 3: contract testing
A small integration-test rule that runs an arbitrary plugin against a curated set of “interesting” schemas (compose-spec subset, edge cases, malformed input) and asserts on stdout/stderr/exit behavior. Lets plugin authors verify conformance before registering as a toolchain.
Phase 4: rules_docker_compose migration
Should be a no-op end-user-visibly — the codegen binaries still exist, just invoked through the new contract. Tests pass byte-identical.
Plugin conformance test.
jsonschema_plugin_contract_test(name, plugin) runs the contract
test driver against any executable that claims to implement the
rules_jsonschema plugin contract (see
plugin_contract.md). The driver exercises:
- Minimum-viable invocation produces non-empty stdout + exit 0.
- Malformed JSON input → non-zero exit, stderr explanation, empty stdout (the discipline most likely to be violated by plugins emitting partial output before erroring).
- Unknown flags are rejected.
- Output is deterministic across identical invocations.
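The four scenarios above can be approximated in a few lines of Python. This is a sketch of the idea, not the actual driver: check_plugin, the fixture schema, and the inlined TOY_PLUGIN are hypothetical, and the real test rule may check more than this.

```python
import subprocess
import sys

# Hypothetical toy plugin, inlined so the sketch is self-contained.
TOY_PLUGIN = r'''
import json, sys
for arg in sys.argv[1:]:
    if arg.split("=")[0] not in ("--schema-name", "--rule-name"):
        sys.exit("unknown flag: " + arg)   # message goes to stderr, exit code 1
try:
    schema = json.load(sys.stdin)
except ValueError as err:
    sys.exit("malformed JSON: %s" % err)
print(sorted(schema.get("properties", {})))
'''


def run(cmd, stdin_bytes):
    return subprocess.run(cmd, input=stdin_bytes, capture_output=True, timeout=60)


def check_plugin(cmd):
    """Run the four scenarios; return the names of the ones that failed."""
    failures = []
    base = ["--schema-name=person.json", "--rule-name=person"]
    good = b'{"type": "object", "properties": {"name": {"type": "string"}}}'

    ok = run(cmd + base, good)
    if ok.returncode != 0 or not ok.stdout:
        failures.append("minimum-viable invocation")

    bad = run(cmd + base, b"{not json")
    if bad.returncode == 0 or bad.stdout or not bad.stderr:
        failures.append("malformed input")

    unknown = run(cmd + base + ["--no-such-flag=1"], good)
    if unknown.returncode == 0:
        failures.append("unknown flag")

    again = run(cmd + base, good)
    if again.stdout != ok.stdout:
        failures.append("determinism")
    return failures


conforming = check_plugin([sys.executable, "-c", TOY_PLUGIN])
sloppy = check_plugin([sys.executable, "-c", "print('partial output')"])
```

The second call shows the failure mode the malformed-input scenario is designed to catch: a plugin that prints output and exits 0 regardless of what it was given.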
Plugin authors use it to gate their toolchain registration:
load("@rules_jsonschema//jsonschema:contract_test.bzl",
"jsonschema_plugin_contract_test")
jsonschema_plugin_contract_test(
name = "my_plugin_conforms",
plugin = "//my:rust_codegen",
)
jsonschema_plugin_contract_test
load("@rules_jsonschema//jsonschema:contract_test.bzl", "jsonschema_plugin_contract_test")
jsonschema_plugin_contract_test(name, plugin)
Run the rules_jsonschema plugin contract scenarios against a plugin binary.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| plugin | The plugin binary to test. Any executable that claims to implement the rules_jsonschema plugin contract. | Label | required | |
Go user-facing rules for rules_jsonschema.
jsonschema_go_library is the Go-specific shape of the schema → code
pipeline:
- Resolves the go_codegen_toolchain_type toolchain.
- Runs the toolchain’s binary on the schema (stdin/argv/stdout per //jsonschema/plugin_contract.md), producing a .go file.
- Wraps the .go in a go_library from @rules_go.
The default toolchain (registered by rules_jsonschema’s MODULE.bazel)
points at the in-repo schema_to_go Go binary. Coverage is minimal —
primitives, structs, slices, maps, optional pointers, refs. For
fuller JSON-Schema-to-Go support, register your own
jsonschema_codegen_toolchain pointing at a different binary (e.g.
atombender/go-jsonschema).
jsonschema_go_library
load("@rules_jsonschema//go:defs.bzl", "jsonschema_go_library")
jsonschema_go_library(name, schema, importpath, package, extra_args, visibility,
**go_library_kwargs)
Generate a go_library of typed schema bindings.
The emitted package exports one Go type per schema $defs /
definitions entry plus a top-level type from the schema’s
title (if set). Required properties become value-typed fields;
optional properties become pointer-typed with ,omitempty tags.
PARAMETERS
Helpers used by schema_to_starlark-generated rule code.
Kept in a separate file (rather than inlined per generated .bzl) so
the codegen output stays small and any helper fix benefits every
consumer at once. Generated .bzl files load from this module:
load("@rules_jsonschema//runtime:helpers.bzl", "strip_empty", "parse_json_or_none")
parse_json_or_none
load("@rules_jsonschema//runtime:helpers.bzl", "parse_json_or_none")
parse_json_or_none(s)
Return None for empty input, otherwise json.decode(s).
Used for typed schema attrs whose value is a structured object
or array. Generated rule callers pass json.encode({...}) (or
leave the attr empty); the generated impl invokes this to expand
the encoded payload back into a Starlark dict/list that gets
merged into the shard.
PARAMETERS
strip_empty
load("@rules_jsonschema//runtime:helpers.bzl", "strip_empty")
strip_empty(d)
Drop dict entries whose values are absent / zero / empty.
Matches the JSON omitempty convention so generated shards stay
terse: Bazel attr.* zero values (0, False, "", [], {}) shouldn’t
serialise as explicit overrides. Distinguishing “user set to 0”
from “user didn’t set” isn’t possible at the Starlark layer, so
we conflate them: every typed schema field that wants to mean
something non-default ships a non-zero/-empty value.
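The described behaviour of the two helpers can be modelled in plain Python. This is a model of the documented semantics, not the module's Starlark source; in particular, using truthiness to decide what counts as "empty" is an assumption, and the attribute names in the example are made up.

```python
import json


def parse_json_or_none(s):
    """None for empty input, otherwise the decoded JSON value."""
    if not s:
        return None
    return json.loads(s)


def strip_empty(d):
    """Drop entries whose value is None, zero, False, or an empty string/list/dict."""
    return {key: value for key, value in d.items() if value}


# Hypothetical attrs as a generated rule implementation might see them.
attrs = {
    "image": "nginx",
    "replicas": 0,                                       # zero value: dropped
    "ports": [],                                         # empty list: dropped
    "healthcheck": parse_json_or_none('{"retries": 3}'),
    "labels": parse_json_or_none(""),                    # empty attr: None, dropped
}
shard = strip_empty(attrs)
print(shard)
```

Note how replicas = 0 disappears from the shard: this is the conflation of "set to zero" and "not set" that the docstring calls out.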
PARAMETERS
Providers exposed by rules_jsonschema.
JsonschemaCodegenToolchainInfo is the contract every codegen
toolchain provides: a single binary File that implements the
schema → output-language conversion. Per-language user-facing rules
resolve a toolchain by type
(@rules_jsonschema//jsonschema:<lang>_codegen_toolchain_type),
fetch this provider, and run the binary.
Splitting it out from defs.bzl lets language modules (//rust:,
//starlark:, //go:, …) load just the provider without dragging in
language-specific BUILD machinery.
JsonschemaCodegenToolchainInfo
load("@rules_jsonschema//jsonschema:providers.bzl", "JsonschemaCodegenToolchainInfo")
JsonschemaCodegenToolchainInfo(binary)
A schema → code codegen tool.
FIELDS
| Name | Description |
|---|---|
| binary | File: the codegen executable. Invoked with --schema PATH --out PATH and any language-specific flags the calling rule passes through. |
Rust user-facing rules for rules_jsonschema.
jsonschema_rust_library is the Rust-specific shape of the
schema → code pipeline:
- Resolves the rust_codegen_toolchain_type toolchain.
- Runs the toolchain’s binary on the schema, producing a .rs.
- Wraps the .rs in a rust_library with serde / serde_json / regress threaded as direct deps.
The default toolchain (registered by rules_jsonschema’s MODULE.bazel)
points at the in-repo typify-based schema_to_rust binary. Swap by
declaring your own jsonschema_codegen_toolchain + registering it
ahead of the default.
jsonschema_rust_library
load("@rules_jsonschema//rust:defs.bzl", "jsonschema_rust_library")
jsonschema_rust_library(name, schema, extra_args, serde, serde_json, regress, visibility,
**rust_library_kwargs)
Generate a rust_library of typed schema bindings.
The emitted library exports one Rust struct/enum per top-level
JSON-Schema definition, with #[derive(Serialize, Deserialize)]
plus #[serde(deny_unknown_fields)] wherever the source schema
sets additionalProperties: false.
PARAMETERS
Starlark user-facing rule for rules_jsonschema.
jsonschema_starlark_codegen emits typed Bazel rule() definitions
from a JSON Schema:
- Resolves the starlark_codegen_toolchain_type toolchain.
- Runs the toolchain’s binary on the schema, producing a .bzl.
The default toolchain (registered by rules_jsonschema’s MODULE.bazel)
points at the in-repo schema_to_starlark binary. Swap by declaring
your own jsonschema_codegen_toolchain and registering it ahead of
the default.
The output is meant to be committed in the consumer repo; pair with a
diff_test to catch drift (re-runs codegen on every CI build and
asserts the committed .bzl matches what the toolchain emits).
jsonschema_starlark_codegen
load("@rules_jsonschema//starlark:defs.bzl", "jsonschema_starlark_codegen")
jsonschema_starlark_codegen(name, schema, kinds, extra_args, **kwargs)
Generate a .bzl of typed rules from a JSON Schema.
PARAMETERS
Toolchain rules for rules_jsonschema codegen.
jsonschema_codegen_toolchain wraps a single codegen executable
(schema_to_rust, schema_to_starlark, schema_to_go, …) as a
Bazel toolchain. The matching toolchain_type lives in
//jsonschema:BUILD.bazel — one type per output language so a
consumer can independently swap, say, the Rust generator without
touching the Starlark or Go ones.
Default toolchains are registered in //rust:BUILD.bazel,
//starlark:BUILD.bazel, //go:BUILD.bazel. To swap an
implementation, declare your own jsonschema_codegen_toolchain and
register_toolchains(...) it ahead of rules_jsonschema’s default in
your MODULE.bazel.
jsonschema_codegen_toolchain
load("@rules_jsonschema//jsonschema:toolchains.bzl", "jsonschema_codegen_toolchain")
jsonschema_codegen_toolchain(name, binary)
Declare a schema → code codegen executable as a Bazel toolchain.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| binary | The codegen executable for this toolchain. Must accept --schema PATH --out PATH plus any language-specific flags. | Label | required | |
write_source_files: copy generated outputs back into source.
The canonical Bazel pattern for committed-codegen workflows. A typical
setup pairs a codegen rule (whose output sits under bazel-bin/...)
with a write_source_files target that copies the output to a path
under source control:
jsonschema_starlark_codegen(
name = "compose_rules_gen",
schema = "...",
kinds = [...],
)
write_source_files(
name = "update_compose_rules",
files = {
"compose_rules.bzl": ":compose_rules_gen",
},
)
- bazel build //compose:update_compose_rules is a no-op.
- bazel run //compose:update_compose_rules copies each generated file to its source-tree destination, respecting BUILD_WORKSPACE_DIRECTORY so multi-repo workspaces still work.
Pair with a diff_test to gate freshness:
diff_test(
name = "compose_rules_up_to_date",
file1 = "compose_rules.bzl",
file2 = ":compose_rules_gen",
)
This rule replaces ad-hoc sh_binary + update.sh pairs throughout
rules_jsonschema’s consumers. Functionally equivalent to
@aspect_bazel_lib//lib:write_source_files.bzl, but in-repo so we
don’t take on aspect_bazel_lib as a dep for a single rule.
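The copy-back and freshness-check pair described above can be sketched in Python as follows. The function signatures are hypothetical and simplified (the real rule is Starlark plus a generated script); the only detail taken from Bazel itself is that bazel run exposes the workspace root through the BUILD_WORKSPACE_DIRECTORY environment variable.

```python
import os
import shutil
import tempfile


def write_source_files(files, package, workspace_dir=None):
    """Copy each generated file to <workspace>/<package>/<dest>; return written paths."""
    root = workspace_dir or os.environ["BUILD_WORKSPACE_DIRECTORY"]
    written = []
    for dest, generated in files.items():
        target = os.path.join(root, package, dest)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        shutil.copyfile(generated, target)
        written.append(target)
    return written


def is_up_to_date(committed, generated):
    """The diff_test side: True when the committed copy matches byte for byte."""
    with open(committed, "rb") as a, open(generated, "rb") as b:
        return a.read() == b.read()


# Demo against a throwaway directory standing in for the workspace.
scratch = tempfile.mkdtemp()
generated = os.path.join(scratch, "compose_rules_gen.bzl")
with open(generated, "w") as f:
    f.write("X = 1\n")
workspace = os.path.join(scratch, "workspace")
written = write_source_files({"compose_rules.bzl": generated}, "compose", workspace)
fresh = is_up_to_date(written[0], generated)
```

Once the generator's output changes, is_up_to_date returns False until the copy-back is run again, which is the drift the diff_test is meant to catch in CI.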
write_source_files
load("@rules_jsonschema//util:write_source_files.bzl", "write_source_files")
write_source_files(name, files)
bazel run-able target that copies generated files back into source control.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| files | Map of package-relative destination path → label whose single output file should be copied there. Each source label must produce exactly one output file. | Dictionary: String -> Label | required | |
rules_uv
API reference, generated from the module’s .bzl docstrings (stardoc).
rules_uv roadmap
v0.1
- @uv//:binary built from source via cargo_bootstrap_repository.
- uv_run macro: sandbox-escaping bazel run wrapper.
- pip.parse module extension: uv.lock → @pip hub + per-package repos.
- Pure-Python wheel materialization (py3-none-any).
- Sdist fallback (raw download; no build step yet).
- End-to-end smoke test in examples/smoke.
v0.2 (this release)
- Prebuilt-uv toolchain alternative. uv.toolchain(source = "prebuilt") fetches the official release asset for the host platform from astral-sh/uv releases. Supported hosts today: darwin_{aarch64,x86_64} and linux_{aarch64,x86_64}. musl + 32-bit + Windows triples are intentionally omitted until someone needs them; pinning shas we never test is security theater.
- Unified target shape. Both build and prebuilt produce @uv//:binary as a File; uv_toolchain accepts the file directly (no more :install rust_binary indirection).
v0.3 (this release)
- Native wheel selection. PEP 425 / PEP 600 tag scoring in pip/private/wheel_selection.bzl: parse wheel filenames, fan out compressed tag fields, score against a host-specific ordered tag list (pip/private/platform.bzl). MVP covers the 4 fastverk hosts (darwin_{aarch64,x86_64}, linux_{aarch64,x86_64}); rules_python’s whl_target_platforms is more thorough and will be the backing implementation once their internals stabilize.
- Sdist installation via uv. sdist_install_repo (pip/private/sdist_install.bzl): downloads the sdist, shells to @uv//:uv (uv pip install --target=. --no-deps) at repo-rule time. Python interpreter via python = "host" (python3 on PATH) or python = "uv" (uv python install into a per-repo scratch dir).
- python_version + python attrs on pip.parse. Wheel tag matching consults python_version; sdist install dispatches on python.
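The wheel-selection step (parse the filename, fan out the compressed tag fields, score against an ordered host tag list) can be sketched in Python as below. This is an illustration of the technique, not the contents of wheel_selection.bzl: the function names, the short host tag list, and the sample filenames are assumptions.

```python
def parse_wheel_tags(filename):
    """Expand a wheel filename's compressed tag set into (python, abi, platform) triples."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: " + filename)
    parts = filename[:-len(".whl")].split("-")
    # name-version[-build]-python-abi-platform: the tags are the last three fields.
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename: " + filename)
    py_tags, abi_tags, plat_tags = (field.split(".") for field in parts[-3:])
    return [(py, abi, plat) for py in py_tags for abi in abi_tags for plat in plat_tags]


def select_wheel(filenames, host_tags):
    """Pick the wheel whose best tag appears earliest in host_tags (most preferred first)."""
    rank = {tag: index for index, tag in enumerate(host_tags)}
    best = None
    for name in filenames:
        scores = [rank[tag] for tag in parse_wheel_tags(name) if tag in rank]
        if scores and (best is None or min(scores) < best[0]):
            best = (min(scores), name)
    return best[1] if best else None


# Illustrative, heavily shortened tag list for a linux_x86_64 host on CPython 3.12.
HOST_TAGS = [
    ("cp312", "cp312", "manylinux_2_17_x86_64"),
    ("cp312", "abi3", "manylinux_2_17_x86_64"),
    ("py3", "none", "any"),
]
WHEELS = [
    "MarkupSafe-2.1.5-cp312-cp312-macosx_10_9_universal2.whl",
    "MarkupSafe-2.1.5-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
]
chosen = select_wheel(WHEELS, HOST_TAGS)
```

A real host tag list is much longer (every older manylinux and ABI variant the interpreter accepts), but the scoring stays the same: lower index wins, and a wheel with no matching tag is never selected.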
v0.4 (this release)
- Extras: requirement("pkg[extra]") resolves to a per-extra Bazel target that re-exports :pkg plus the extra’s deps. Generated from each package’s [package.optional-dependencies] table.
- Markers: PEP 508 subset evaluated at extension time against python_version + host platform. Edges whose markers fail are filtered out. Cross-platform select() is v0.5.
- Git sources (source = { git = "…", rev = "…" }): new_git_repository with the BUILD wrapper.
- Path sources (source = { path = "…" }): new_local_repository-style symlink rule.
- Editable sources: explicit failure with a clear message (editable installs don’t translate to Bazel).
- Hermetic uv invocation: --no-config on all uv pip and uv python install calls so the user’s ~/.config/uv/uv.toml (which on many machines points at a private index) doesn’t leak into sandbox builds.
v0.5 (this release)
- Cross-platform wheels. pip.parse(platforms = [...]) opts the hub into multi-platform mode. Packages with platform-divergent native wheels fan out into per-platform repos (@<hub>__<pkg>__<platform>) behind a selector repo that emits alias(name = "pkg", actual = select(...)) over @platforms//os + @platforms//cpu constraint values. Non-host platform repos are declared but lazy-fetched: they only land on disk when Bazel’s configuration triggers that branch of the select().
- Multi-platform smoke (examples/multiplatform/): pure-python wheel (idna) flows through the single-repo path; native wheel (markupsafe) flows through the per-platform select.
v0.6 (next)
Smoke fixtures for git + path sources
v0.4 wires git/path source materialization, but no smoke fixture exercises either. A fixture that lock-files a tiny pure-Python package from a pinned GitHub commit + a sibling local path package would catch regressions.
Sdist install in multi-platform mode
Today sdist install is host-only — if a multi-platform lockfile
references an sdist-only package, the extension fails fast rather
than silently producing a broken cross-platform target. v0.6 could
support per-platform sdist installs by running uv pip install --target once per requested platform (each producing its own
per-platform repo). Requires either cross-compilation toolchains
on the host (rare) or Bazel platform-transition magic.
musl + Windows platform tag tables
pip/private/platform.bzl ships tag tables for the four fastverk
hosts only. Adding musllinux + Windows entries (with
@platforms//os:windows and a musl libc constraint) is mechanical
once a consumer needs them.
Marker evaluator: spot tests
pip/private/markers.bzl is a hand-rolled PEP 508 subset parser.
A skylib unittest suite covering operators, precedence, and the
python_full_version vs python_version edge cases would lock
the behavior down.
Beyond v0.6
- uv_pip_compile: bazel run-able workflow to regenerate requirements.txt from a pyproject.toml (analogous to rules_uv upstream’s compile workflow).
- Cross-platform wheels: support emitting select() deps when a package has multiple platform wheels but the consumer wants to target several configurations from one tree.
- Stardoc-generated reference in /docs.
Delete uv/ when rules_python’s uv is stable
rules_python ships its own experimental uv toolchain primitive at
@rules_python//python/uv:uv_toolchain.bzl and a binary-fetching
module extension at @rules_python//python/uv:uv.bzl. Both are
marked EXPERIMENTAL: This is experimental and may be removed without notice, so today rules_uv carries its own toolchain +
fetch + build paths.
When rules_python promotes these out of experimental, rules_uv’s
uv/ directory becomes pure duplication and should be removed:
- Drop uv/extensions.bzl, uv/toolchains.bzl, uv/private/known_versions.bzl, uv/private/uv_source.BUILD.bazel.
- Replace our uv_run macro with one that resolves through rules_python’s uv_toolchain_type.
- The pip extension keeps using @uv//:binary at repo-rule time (just pointing at whichever target rules_python’s extension materializes by then).
This trims rules_uv down to its actual reason for existing: the
uv.lock TOML → @pip materializer. Track upstream status at
https://github.com/bazelbuild/rules_python/issues/ (search for
“uv toolchain experimental”).
pip_parse module extension — uv.lock → @
Counterpart to rules_python’s pip_parse, but driven by uv.lock
instead of requirements.txt. For each package the lockfile resolves
to, we create a Bazel-fetched repo containing the unpacked wheel
(or installed sdist, or fetched git/path source). A hub repo
aggregates these and exposes a requirement("<name>") macro plus
pre-aliased @<hub>//<name>:pkg labels.
Consumer:
pip = use_extension("@rules_uv//pip:extensions.bzl", "pip")
pip.parse(
hub_name = "pip",
lock = "//:uv.lock",
python_version = "3.12",
)
use_repo(pip, "pip")
Extras are exposed as additional sub-targets on the package repo:
load("@pip//:requirements.bzl", "requirement")
py_library(
name = "app",
deps = [
requirement("requests"), # base package
requirement("requests[security]"), # base + extra deps
],
)
Markers (e.g. marker = "python_version < '3.11'") are evaluated
at extension time against the configured python_version + host
platform. Edges whose markers fail are silently dropped from the
generated BUILD — keeping the host-only view simple. Cross-platform
select() is v0.5.
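A marker evaluator for a small PEP 508 subset might look like the Python sketch below. It is not the contents of pip/private/markers.bzl: it assumes no parentheses, no in / not in / ~= operators, and purely numeric dotted versions, and the names evaluate_marker and ENV are hypothetical.

```python
import re

_CLAUSE = re.compile(r"""^\s*(\w+)\s*(==|!=|<=|>=|<|>)\s*(['"])(.*?)\3\s*$""")
_VERSION_VARS = ("python_version", "python_full_version")


def _version_key(text):
    # Compare versions numerically, so that 3.9 sorts before 3.12.
    return tuple(int(part) for part in text.split("."))


def _compare(variable, op, literal, env):
    if variable not in env:
        raise ValueError("unknown marker variable: " + variable)
    left, right = env[variable], literal
    if variable in _VERSION_VARS:
        left, right = _version_key(left), _version_key(right)
    return {
        "==": left == right, "!=": left != right,
        "<": left < right, "<=": left <= right,
        ">": left > right, ">=": left >= right,
    }[op]


def evaluate_marker(marker, env):
    """Evaluate 'a and b or c' without parentheses; 'and' binds tighter than 'or'."""
    def clause(text):
        match = _CLAUSE.match(text)
        if not match:
            raise ValueError("unsupported marker clause: " + text)
        return _compare(match.group(1), match.group(2), match.group(4), env)

    return any(
        all(clause(term) for term in re.split(r"\s+and\s+", alternative))
        for alternative in re.split(r"\s+or\s+", marker)
    )


ENV = {"python_version": "3.12", "sys_platform": "linux"}
keep_edge = evaluate_marker("python_version < '3.11'", ENV)
```

With python_version = "3.12" the example edge is dropped. The numeric comparison matters: a plain string comparison would rank "3.9" above "3.12" and keep or drop the wrong edges.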
pip
pip = use_extension("@rules_uv//pip:extensions.bzl", "pip")
pip.parse(hub_name, lock, platforms, python, python_version, uv)
Materialize @
TAG CLASSES
parse
Attributes
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| hub_name | Name of the hub repo (the @<hub_name>//… namespace). | String | optional | "pip" |
| lock | Label pointing at a uv.lock file. | Label | required | |
| platforms | Optional list of <os>_<arch> platforms this lockfile should support. Default is host-only (the v0.4 behavior — select() is not introduced). Supported entries: darwin_aarch64, darwin_x86_64, linux_aarch64, linux_x86_64. Packages with platform-divergent native wheels fan out into per-platform repos behind a select() alias; sdist/git/path packages remain host-only and the build will fail loudly if a non-host platform tries to resolve them. | List of strings | optional | [] |
| python | How to find a Python interpreter for sdist install. host uses python3 on PATH; uv runs uv python install <python_version> per package. | String | optional | "host" |
| python_version | Python major.minor used for wheel-tag matching and (when python = "uv") the uv-managed interpreter. | String | optional | "3.12" |
| uv | Label of the uv binary used to install sdists. | Label | optional | "@uv" |
User-facing rules for rules_uv.
- uv_run (sh_binary macro): bazel run //path:NAME invokes uv <subcommand> against the live workspace source. Intentionally non-hermetic (escapes the runfiles sandbox) for the dev loop (uv pip sync, uv lock, uv run …).
Lockfile-driven Python repo materialization lives in
@rules_uv//pip:extensions.bzl (pip_parse), which is the rules_uv
analogue of rules_python’s pip_parse but reads uv.lock rather
than requirements.txt.
uv_run
load("@rules_uv//uv:defs.bzl", "uv_run")
uv_run(name, subcommand, args, **kwargs)
bazel run-able wrapper around uv <subcommand>.
Escapes the runfiles sandbox via BUILD_WORKSPACE_DIRECTORY so uv
operates on the user’s source tree (uv lock, uv pip sync …
both need to write into the workspace).
PARAMETERS
Toolchain wrapper for the uv binary.
UvToolchainInfo.uv is a File for the uv executable. Consumers
resolve it via ctx.toolchains["@rules_uv//uv:toolchain_type"].
The attr uses allow_single_file = True rather than
executable = True because the bootstrapped binary at @uv//:binary
is an alias to a source File (cargo_bootstrap_repository’s output)
— Bazel rejects source files as executable attr inputs, so we
accept the file and let the consuming rule mark it executable
itself.
uv_toolchain
load("@rules_uv//uv:toolchains.bzl", "uv_toolchain")
uv_toolchain(name, uv)
Declares a uv toolchain.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| uv | Label of the uv binary (either built via cargo_bootstrap_repository or fetched as a prebuilt release asset). | Label | required | |
UvToolchainInfo
load("@rules_uv//uv:toolchains.bzl", "UvToolchainInfo")
UvToolchainInfo(uv)
Information about a uv toolchain.
FIELDS
rules_openapi
API reference, generated from the module’s .bzl docstrings (stardoc).
OpenAPI plugin conformance test.
openapi_plugin_contract_test(name, plugin) runs the rules_openapi
plugin contract scenarios against any plugin executable. Mirrors
rules_jsonschema’s jsonschema_plugin_contract_test but with
OpenAPI-flavored fixtures (a minimal OpenAPI 3.1 document instead
of a JSON Schema).
openapi_plugin_contract_test
load("@rules_openapi//openapi:contract_test.bzl", "openapi_plugin_contract_test")
openapi_plugin_contract_test(name, plugin)
Run the rules_openapi plugin contract scenarios against a plugin binary.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| plugin | The plugin binary to test. | Label | required | |
Rust user-facing rules for rules_openapi.
openapi_rust_client is the Rust client codegen rule:
- Resolves the rust_client_codegen_toolchain_type toolchain.
- Runs the toolchain’s binary on the OpenAPI spec (stdin/argv/stdout per //openapi/plugin_contract.md), producing a .rs.
- Wraps the .rs in a rust_library whose deps include progenitor-client, reqwest, serde, serde_json, and any additional crates the consumer threads through.
The default toolchain (registered by MODULE.bazel) points at the
in-repo openapi_to_rust_client binary, which wraps progenitor
under the hood. Swap by declaring your own openapi_codegen_toolchain
and registering it ahead of the default.
openapi_rust_client
load("@rules_openapi//rust:defs.bzl", "openapi_rust_client")
openapi_rust_client(name, spec, extra_args, progenitor_client, reqwest, serde, serde_json, regress,
visibility, **rust_library_kwargs)
Generate a rust_library of a typed OpenAPI HTTP client.
The library exports a Client struct with one method per
OpenAPI operation, plus a types module containing serde
structs for components/schemas.
PARAMETERS
Providers exposed by rules_openapi.
Same shape as rules_jsonschema’s JsonschemaCodegenToolchainInfo —
the plugin contract is identical (stdin/argv/stdout), the only
difference is the schema content shipped on stdin (OpenAPI document
rather than a JSON Schema).
OpenapiCodegenToolchainInfo
load("@rules_openapi//openapi:providers.bzl", "OpenapiCodegenToolchainInfo")
OpenapiCodegenToolchainInfo(binary)
An OpenAPI → code codegen tool.
FIELDS
| Name | Description |
|---|---|
| binary | File: the codegen executable. Invoked with --schema-name=NAME --rule-name=NAME plus per-plugin flags the calling rule passes through. |
Toolchain rules for rules_openapi codegen.
openapi_codegen_toolchain wraps a single codegen executable as a
Bazel toolchain. Toolchain types are split per (language, use_case)
pair — Rust clients, Go clients, Rust servers, etc. — so a consumer
can swap one plugin without affecting the rest.
Default toolchains are registered in the per-language directories
(//rust:BUILD.bazel, …). To swap an implementation, declare your
own openapi_codegen_toolchain and register_toolchains(...) it
ahead of rules_openapi’s default in your MODULE.bazel.
openapi_codegen_toolchain
load("@rules_openapi//openapi:toolchains.bzl", "openapi_codegen_toolchain")
openapi_codegen_toolchain(name, binary)
Declare an OpenAPI → code codegen executable as a Bazel toolchain.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| binary | The codegen executable. Must accept --schema-name=NAME --rule-name=NAME plus any per-plugin flags the calling rule passes through. | Label | required | |
rules_bun
API reference, generated from the module’s .bzl docstrings (stardoc).
User-facing rules for rules_bun.
Four pieces:
- bun_test: runs bun test as a hermetic Bazel test action with explicit srcs + deps. Returns a BunTestInfo provider wrapping the test result file (for downstream consumers; the main consumer is the test framework, which only cares about exit codes).
- bun_run (sh_binary macro): bazel run //path:NAME invokes bun run <script> against the live workspace source. Intentionally non-hermetic (escapes the runfiles sandbox) for the dev loop. Counterpart to bun_test’s hermetic execution.
- bun_bundle: bundle a JS/TS entry point into one self-contained file with bun build. Returns BunBundleInfo.
- bun_compile: compile a JS/TS entry point into a standalone native executable with bun build --compile (Bun runtime + bundled JS). Returns BunBinaryInfo and is bazel run-nable.
All resolve the Bun binary via @rules_bun//bun:toolchain_type (set
up by register_toolchains("@bun//:bun_toolchain_def") in your
MODULE.bazel).
bun_bundle / bun_compile have two ways to provision node_modules:
- Bun-native (recommended; no aspect_rules_js, no pnpm-lock): pass a node_modules label (a @<name>//:node_modules from a bun_deps.install tag; see extensions.bzl) plus srcs (the entry + local modules). bun build runs directly via the toolchain Bun; a small shell driver stages the entry into a real tree and symlinks the closure so Bun resolves the import graph natively.
- Legacy aspect_rules_js: pass a driver js_binary whose entry point is @rules_bun//bun:bun-build-driver and whose data stages the build entry plus its full linked node_modules closure; aspect materializes that closure into the action runfiles.
driver and node_modules are mutually exclusive — set exactly one.
bun_test likewise takes an optional node_modules for dep resolution.
bun_bundle
load("@rules_bun//bun:defs.bzl", "bun_bundle")
bun_bundle(name, srcs, out, driver, entry, external, format, node_modules, target)
Bundle a JS/TS entry into one file via the hermetic Bun toolchain. Either Bun-native (node_modules from bun_deps.install, no aspect_rules_js) or the legacy aspect driver js_binary path.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| srcs | Bun-native path. The entry file + any local modules it imports, declared as action inputs. Ignored on the legacy driver path (that stages sources via the js_binary’s data). | List of labels | optional | [] |
| out | The single bundled output file (conventionally *.mjs). | Label | required | |
| driver | LEGACY aspect_rules_js path. A js_binary whose entry point is @rules_bun//bun:bun-build-driver and whose data stages the bundle entry + its full linked node_modules closure. Mutually exclusive with node_modules; set exactly one. | Label | optional | None |
| entry | Path of the entry point relative to the workspace root (e.g. packages/aion-cli/index.js). On the native path this is the execroot-relative path; on the legacy path it is relative to the driver’s _main runfiles root (same string in practice). | String | required | |
| external | Module names to exclude from the bundle (passed as --external <name>, repeatable). Use for native addons and runtime requires that must stay external, e.g. pg-native, @aws-sdk/*, encoding, source-map-support. | List of strings | optional | [] |
| format | Bun --format. Defaults to esm so import.meta in deps stays valid under Node. | String | optional | "esm" |
| node_modules | Bun-native path. A node_modules closure (typically @<name>//:node_modules from a bun_deps.install tag). When set, bun build runs directly via the toolchain Bun (no js_binary driver, no aspect_rules_js): the closure is symlinked to the execroot root so Bun resolves the import graph by walking up from entry. Mutually exclusive with driver. Pair with srcs (the entry + local modules). | Label | optional | None |
| target | Bun --target: the intended execution environment for the bundle. Defaults to node. | String | optional | "node" |
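As a rough illustration of how the attributes above could map onto a `bun build` command line, here is a minimal Python sketch. It is not the rule's actual action: the function name `bun_build_argv`, the flag ordering, and the use of `--outfile` for `out` are assumptions made for the example; only `--external`, `--format`, and `--target` are named in the attribute table.

```python
from typing import List, Sequence


def bun_build_argv(
    entry: str,
    out: str,
    external: Sequence[str] = (),
    fmt: str = "esm",      # mirrors the documented default for `format`
    target: str = "node",  # mirrors the documented default for `target`
) -> List[str]:
    """Assemble a `bun build` argv from bun_bundle-style attributes.

    Hypothetical helper for illustration; the real rule may spell or
    order its flags differently.
    """
    argv = ["bun", "build", entry, "--outfile", out,
            "--format", fmt, "--target", target]
    for name in external:
        # --external is repeatable: one flag per excluded module name.
        argv += ["--external", name]
    return argv


# Example: a CLI bundle that keeps a native addon out of the bundle.
example = bun_build_argv(
    entry="packages/aion-cli/index.js",
    out="cli.mjs",
    external=["pg-native"],
)
```

The point of the sketch is that `external` expands to one flag per entry, while `format` and `target` are always passed, so the documented defaults are explicit on the command line.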
bun_compile
load("@rules_bun//bun:defs.bzl", "bun_compile")
bun_compile(name, srcs, out, driver, entry, external, node_modules, target)
Compile a JS/TS entry into a standalone native executable (Bun runtime + bundled JS) via bun build --compile. Either Bun-native (node_modules) or the legacy aspect driver path.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| srcs | Bun-native path. The entry file + any local modules it imports, declared as action inputs. Ignored on the legacy driver path. | List of labels | optional | [] |
| out | The standalone executable output. On --target bun-windows-* give it a .exe suffix. | Label | required | |
| driver | LEGACY aspect_rules_js path. A js_binary whose entry point is @rules_bun//bun:bun-build-driver and whose data stages the build entry + its full linked node_modules closure. Mutually exclusive with node_modules; set exactly one. | Label | optional | None |
| entry | Path of the entry point relative to the workspace root (e.g. apps/studio-cli/index.js). | String | required | |
| external | Module names to keep external (--external <name>, repeatable). NOTE: native .node addons are NOT embedded by --compile — list them here and provide the .node files at runtime alongside the produced binary. | List of strings | optional | [] |
| node_modules | Bun-native path. A node_modules closure (typically @<name>//:node_modules from a bun_deps.install tag). When set, bun build --compile runs directly via the toolchain Bun (no js_binary driver, no aspect_rules_js). Mutually exclusive with driver. Pair with srcs. | Label | optional | None |
| target | Bun compile target triple. Empty (the default) compiles for the host platform. Cross-compile values: bun-linux-x64, bun-linux-x64-modern, bun-linux-x64-baseline, bun-linux-arm64, bun-darwin-x64, bun-darwin-arm64, bun-windows-x64, and the *-musl libc variants (e.g. bun-linux-x64-musl). A future enhancement could derive this from the Bazel --platforms via a transition; for v1 pass the string. | String | optional | "" |
bun_test
load("@rules_bun//bun:defs.bzl", "bun_test")
bun_test(name, srcs, data, node_modules)
Run bun test over the listed source files as a Bazel test target.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| srcs | Test files (typically *.test.ts, *.test.js). Each is passed to bun test explicitly so Bazel tracks them as inputs. | List of labels | required | |
| data | Additional runtime inputs (fixtures, bunfig.toml, etc.). | List of labels | optional | [] |
| node_modules | Optional node_modules closure (typically @<name>//:node_modules from a bun_deps.install tag). Staged at the workspace runfiles root as node_modules/ so bun test resolves dependency imports without bun install. The Bun-native replacement for aspect_rules_js’s npm_link_all_packages. | Label | optional | None |
BunBinaryInfo
load("@rules_bun//bun:defs.bzl", "BunBinaryInfo")
BunBinaryInfo(binary, target)
A standalone native executable produced by bun build --compile.
FIELDS
| Name | Description |
|---|---|
| binary | File: the standalone executable. |
| target | string: the Bun compile target triple (empty = host). |
BunBundleInfo
load("@rules_bun//bun:defs.bzl", "BunBundleInfo")
BunBundleInfo(bundle, format)
A single-file bundle produced by bun build.
FIELDS
BunTestInfo
load("@rules_bun//bun:defs.bzl", "BunTestInfo")
BunTestInfo(result)
Result metadata for a bun test run.
FIELDS
bun_run
load("@rules_bun//bun:defs.bzl", "bun_run")
bun_run(name, script, args, **kwargs)
Invoke bun run <script> against the live workspace source.
Escapes the runfiles sandbox via BUILD_WORKSPACE_DIRECTORY so Bun
resolves modules + reads files from the user’s actual source tree.
Intentionally NOT hermetic — that’s bun_test’s job.
PARAMETERS
Module extensions for rules_bun.
Two extensions:
- bun: auto-fetches a prebuilt Bun binary for the host platform. Versions are sha256-pinned in private/known_versions.bzl. Consumers can override via the toolchain tag class.
bun = use_extension("@rules_bun//bun:extensions.bzl", "bun")
use_repo(bun, "bun")
register_toolchains("@bun//:bun_toolchain_def")
Pin a specific version:
bun.toolchain(version = "1.3.14")
- bun_deps: Bun-native node_modules staging. Each install tag produces a repo @<name> whose :node_modules filegroup is a bun install --frozen-lockfile-ed tree. The pure-Bun replacement for aspect_rules_js’s npm_translate_lock + npm_link_all_packages (no pnpm-lock, no aspect_rules_js):
bun_deps = use_extension("@rules_bun//bun:extensions.bzl", "bun_deps")
bun_deps.install(
    name = "npm",
    package_json = "//:package.json",
    lock = "//:bun.lock",
)
use_repo(bun_deps, "npm")
then bun_test(node_modules = "@npm//:node_modules", ...) and bun_bundle(node_modules = "@npm//:node_modules", ...).
The actual release fetching is delegated to
@rules_github//github:repositories.bzl%github_binary_repository
so that the URL-shape + sha-pinning logic stays consistent across
all our rules_* repos.
bun
bun = use_extension("@rules_bun//bun:extensions.bzl", "bun")
bun.toolchain(version)
Sets up @bun as a Bazel-fetched prebuilt Bun binary.
TAG CLASSES
toolchain
Attributes
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| version | Override Bun version. Defaults to the value in known_versions.bzl. | String | optional | "" |
bun_deps
bun_deps = use_extension("@rules_bun//bun:extensions.bzl", "bun_deps")
bun_deps.install(name, bun_version, ignore_scripts, install_flags, lock, package_json,
trusted_dependencies)
Bun-native node_modules staging — @<name>//:node_modules from a bun install --frozen-lockfile. Replaces aspect_rules_js’s npm_translate_lock + npm_link_all_packages for pure-Bun repos.
TAG CLASSES
install
Stage a node_modules tree from a package.json + bun.lock.
Attributes
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | Name of the generated repo. Reference its node_modules as @<name>//:node_modules. | Name | required | |
| bun_version | Bun version to fetch for the install. Empty = the toolchain extension’s default. | String | optional | "" |
| ignore_scripts | Skip dependency lifecycle scripts (--ignore-scripts). Default True. | Boolean | optional | True |
| install_flags | Extra raw flags appended to bun install. | List of strings | optional | [] |
| lock | The bun.lock pinning the install (--frozen-lockfile). | Label | required | |
| package_json | The package.json to install from. | Label | required | |
| trusted_dependencies | Packages to --trust (run lifecycle scripts for) even when ignore_scripts is True. | List of strings | optional | [] |
Toolchain rule for rules_bun.
bun_toolchain wraps a single Bun binary as a Bazel toolchain.
Consumers (the bun_test and bun_run rules) resolve Bun through
@rules_bun//bun:toolchain_type, so users can register custom Bun
binaries (locally-built fork, alternate version, baseline-CPU
variant) via register_toolchains(...) without modifying rule
attrs.
The module extension at @rules_bun//bun:extensions.bzl generates a
default toolchain (@bun//:bun_toolchain_def) wrapping the prebuilt
binary. Users register it from MODULE.bazel:
register_toolchains("@bun//:bun_toolchain_def")
bun_toolchain
load("@rules_bun//bun:toolchains.bzl", "bun_toolchain")
bun_toolchain(name, bun)
Declare a Bun binary as a Bazel toolchain.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| bun | Path to the Bun executable. | Label | required | |
BunToolchainInfo
load("@rules_bun//bun:toolchains.bzl", "BunToolchainInfo")
BunToolchainInfo(bun)
The Bun binary, resolved via a toolchain.
FIELDS
rules_postgres
API reference, generated from the module’s .bzl docstrings (stardoc).
User-facing rules for rules_postgres.
- pg_parse_valid_test wraps the parse_check C binary as a sh_test that gates a .sql file against PostgreSQL’s own parser (via libpg_query). Passes iff parse_check exits 0, fails with the parser’s error + cursor position on stderr otherwise. Use this to keep emitted-SQL or hand-written-DDL in sync with what PostgreSQL accepts.
- pg_parse_tree runs the sql_to_protobuf C binary on a .sql file and captures the marshalled pg_query.ParseResult protobuf bytes as a .pgpb artifact. This is the single-file convenience macro; multi-file pipelines should use sql_library + sql_ast_library from @rules_lang//polyglot:sql.bzl instead.
pg_parse_tree
load("@rules_postgres//postgres:defs.bzl", "pg_parse_tree")
pg_parse_tree(name, sql, out, **kwargs)
Run libpg_query over a .sql file, capture the protobuf AST.
Single-file convenience around @rules_postgres//tools:sql_to_protobuf.
For multi-file pipelines, prefer sql_library + sql_ast_library
from @rules_lang//polyglot:sql.bzl, which use the same C tool via
pg_sql_toolchain and propagate SqlAstInfo so downstream
projections (json, lean, catalog) compose cleanly.
PARAMETERS
| Name | Description | Default Value |
|---|---|---|
| name | genrule target name. | none |
| sql | label of the .sql file to parse. | none |
| out | output filename. Defaults to name + ".pgpb". | None |
| kwargs | forwarded to the underlying genrule. | none |
RETURNS
A .pgpb file whose bytes are exactly the marshalled
pg_query.ParseResult (see @libpg_query//:pg_query.proto).
pg_parse_valid_test
load("@rules_postgres//postgres:defs.bzl", "pg_parse_valid_test")
pg_parse_valid_test(name, sql, **kwargs)
Assert that a SQL file parses cleanly under PostgreSQL’s parser.
PARAMETERS
Module extension for rules_postgres.
Exposes two tag classes:
- pg.query(version = …): fetches libpg_query and builds it as a cc_library. Creates @libpg_query.
- pg.source(version = …): fetches the full PostgreSQL source tarball and lays a minimal BUILD overlay on top (filegroups for source dirs + a probe pg_common_string cc_library). Creates @postgres_src.
The two paths are independent. Most consumers want only pg.query for
SQL parse-validation gates; pg.source is for advanced tooling that
needs the full PG codebase under Bazel.
Default usage:
pg = use_extension("@rules_postgres//postgres:extensions.bzl", "pg")
pg.query(version = "17-6.2.2")
use_repo(pg, "libpg_query")
With full PG source as well:
pg.source(version = "17.6")
use_repo(pg, "libpg_query", "postgres_src")
For generating compile_commands.json (consumable by rules_lang’s
c_ast_dump_from_compdb), see pg_meson_configure in
postgres/meson.bzl. That rule runs a hermetic meson setup as a
Bazel build action using rules_foreign_cc’s meson + ninja toolchains.
pg
pg = use_extension("@rules_postgres//postgres:extensions.bzl", "pg")
pg.query(version)
pg.source(lay_overlay, version)
Module extension fetching libpg_query and/or the full PostgreSQL source tree.
TAG CLASSES
query
Pull libpg_query as @libpg_query.
Attributes
source
Pull the PostgreSQL source tarball as @postgres_src.
Attributes
rules_rdf
API reference, generated from the module’s .bzl docstrings (stardoc).
rules_rdf roadmap
Two waypoints between today’s scaffold and a usable abstract RDF toolchain layer. Each waypoint is one published bazel-registry release.
v0.1 — toolchain types + plugin contract + placeholder rules
The goal is for a consumer to be able to declare every planned target
type (rdf_dataset, sparql_query_test, rdf_validate_test,
rdf_transform, rdf_reason) today, against a no-op default
toolchain, then swap in a real implementation (e.g.
rules_jena) without
touching their BUILD files. This makes rules_rdf adoptable
incrementally — consumers can wire their build graph before any
engine is integrated.
Deliverables:
- Plugin contract document at rdf/plugin_contract.md (draft already in tree). Same shape as rules_jsonschema’s plugin_contract.md, adjusted for RDF semantics:
  - stdin = the RDF document bytes (the dataset; format declared via --in-format), not a JSON schema.
  - argv = --key=value pairs (same as jsonschema). Standard flags: --rule-name, --in-format. Per-toolchain flags: --query, --shapes, --out-format, --profile.
  - stdout = generated output (query results / validation report / converted graph / inferred triples). Same single-file-per-invocation discipline.
  - stderr = diagnostics.
  - exit = 0 / non-zero.
- All four toolchain types defined in //rdf:BUILD.bazel: sparql_engine_toolchain_type, rdf_validator_toolchain_type, rdf_serializer_toolchain_type, rdf_reasoner_toolchain_type.
- Providers: RdfDatasetInfo, RdfEngineToolchainInfo, RdfValidatorToolchainInfo, RdfSerializerToolchainInfo, RdfReasonerToolchainInfo. Each toolchain info wraps a single binary File, matching the jsonschema pattern.
- Default user-facing rules implemented as _no_op placeholders:
  - rdf_dataset: real (returns RdfDatasetInfo; no toolchain needed).
  - sparql_query_test, sparql_query_run, rdf_validate_test, rdf_transform, rdf_reason: declare their toolchain dependency and accept all their final attrs, but the in-repo default toolchain points at a _no_op binary that writes an empty stdout and exits 0. Consumers can declare targets and they build; swapping in rules_jena makes them actually run.
- Conformance test driver rdf_plugin_contract_test covering the same scenarios as the jsonschema driver: valid_minimal (small dataset round-trips), malformed_input (garbage on stdin → exit non-zero, empty stdout), unknown_flag (rejects unknown argv), determinism (byte-identical stdout on identical invocations). One driver, parameterised by toolchain type.
- stardoc for the public surface, with diff_test freshness.
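The stdin / argv / stdout / exit contract listed above can be sketched as a tiny Python plugin. This is a hypothetical stand-in, not the in-repo `_no_op` binary or any real engine: `run_plugin`, `parse_flags`, and the choice to treat undecodable UTF-8 as "malformed input" are all assumptions made for illustration. The flag names come from the contract text.

```python
import sys
from typing import Dict, List, Tuple

# Flags named in the contract; a real plugin would likely accept only the
# subset that applies to its toolchain type.
KNOWN_FLAGS = {"rule-name", "in-format", "query", "shapes", "out-format", "profile"}


def parse_flags(argv: List[str]) -> Dict[str, str]:
    """Parse --key=value arguments, rejecting anything else."""
    flags: Dict[str, str] = {}
    for arg in argv:
        if not arg.startswith("--") or "=" not in arg:
            raise ValueError(f"expected --key=value, got {arg!r}")
        key, value = arg[2:].split("=", 1)
        if key not in KNOWN_FLAGS:
            raise ValueError(f"unknown flag --{key}")
        flags[key] = value
    return flags


def run_plugin(argv: List[str], stdin: bytes) -> Tuple[int, bytes, str]:
    """Return (exit_code, stdout, stderr) for one invocation."""
    try:
        flags = parse_flags(argv)
        if "in-format" not in flags:
            raise ValueError("missing --in-format")
        stdin.decode("utf-8")  # stand-in for real RDF parsing
    except ValueError as err:  # UnicodeDecodeError is a ValueError subclass
        return 1, b"", f"error: {err}\n"
    # Identity "transform": output depends only on the inputs, so repeated
    # invocations are byte-identical.
    return 0, stdin, ""


def main() -> int:
    """Entry point a real executable would wrap with sys.exit(main())."""
    code, out, err = run_plugin(sys.argv[1:], sys.stdin.buffer.read())
    sys.stdout.buffer.write(out)
    sys.stderr.write(err)
    return code
```

Keeping the logic in a pure `run_plugin` function makes the conformance scenarios (unknown flag, malformed input, determinism) easy to exercise without spawning a process.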
Out of scope for v0.1: chained pipelines, real-engine examples, result-set diff helpers.
v0.2 — cross-toolchain wiring + real-engine examples
Once rules_jena is published and registered, rules_rdf grows the
glue that ties multiple toolchains together in one pipeline.
Deliverables:
- Chained pipelines: rdf_validate_test and sparql_query_test accept the output of rdf_reason as their dataset, so a consumer can express “materialise inferences, then run shape validation on the closure” as a typed build graph. The intermediate inferred graph is a real RdfDatasetInfo-bearing target, not a hidden side effect.
- Result-set helpers: a small Starlark helper for the common zero-row-CSV gate pattern, plus an rdf_results_diff_test for golden SPARQL result sets (SRX/JSON normalisation).
- Examples directory using a real RDF corpus:
  - W3C example datasets fetched via http_file with a pinned sha256 (the same fetch-and-pin discipline rules_docker_compose uses for the compose-spec schema).
  - One end-to-end smoke target per toolchain type, registered against rules_jena.
- CI matrix running the conformance test driver against every registered concrete implementation we know about, gating rules_rdf releases on at least one concrete backend passing.
After v0.2 the abstract layer is feature-complete; further work moves into the concrete-implementation repos.
rdf_plugin_contract_test(name, plugin, toolchain_type) runs
the rules_rdf conformance test driver against any executable
claiming to implement the plugin contract for the named toolchain
type. See plugin_contract.md for what the
driver asserts.
Plugin authors gate toolchain registration on it:
load("@rules_rdf//rdf:contract_test.bzl", "rdf_plugin_contract_test")
rdf_plugin_contract_test(
name = "jena_sparql_conforms",
plugin = "//jena:jena_sparql",
toolchain_type = "sparql_engine",
)
The four toolchain types each have their own minimum-valid input
inside the driver; pass the bare name (without the
_toolchain_type suffix or @rules_rdf//rdf: prefix).
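To make the bare-name convention concrete, the following Python sketch expands a bare name into the full toolchain-type label used elsewhere in this reference. The helper `toolchain_type_label` is hypothetical; the four names and the `@rules_rdf//rdf:` prefix and `_toolchain_type` suffix are taken from the text.

```python
TOOLCHAIN_TYPES = ("sparql_engine", "rdf_validator", "rdf_serializer", "rdf_reasoner")


def toolchain_type_label(bare_name: str) -> str:
    """Expand a bare toolchain-type name into its full Bazel label.

    Hypothetical helper for illustration; rejects anything that is not
    one of the four documented bare names.
    """
    if bare_name not in TOOLCHAIN_TYPES:
        raise ValueError(
            f"expected one of {', '.join(TOOLCHAIN_TYPES)}; got {bare_name!r}"
        )
    return f"@rules_rdf//rdf:{bare_name}_toolchain_type"
```

Passing an already-expanded label (for example `@rules_rdf//rdf:sparql_engine_toolchain_type`) is rejected in this sketch, which mirrors the instruction to pass only the bare name.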
rdf_plugin_contract_test
load("@rules_rdf//rdf:contract_test.bzl", "rdf_plugin_contract_test")
rdf_plugin_contract_test(name, plugin, toolchain_type)
Run the rules_rdf conformance test driver against a plugin binary. See plugin_contract.md.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| plugin | The plugin binary to test. Any executable that claims to implement the rules_rdf plugin contract. | Label | required | |
| toolchain_type | Which toolchain type’s scenarios to run: one of sparql_engine, rdf_validator, rdf_serializer, rdf_reasoner. | String | required | |
rdf_dataset(name, srcs, in_format) — declare a labeled
collection of RDF files.
This is the single source of “what triples are in this graph?” that every other rule consumes. Carrying both the file depset and the format string up-front lets sparql_query_test / rdf_validate_test / … avoid sniffing extensions at action time and lets consumers mix datasets with declared formats in one BUILD target without ambiguity.
Multi-file datasets are concatenated by the consuming rule in
lexicographic order before being piped to the plugin’s stdin
(see rdf/plugin_contract.md). Consumers that care about ordering
should name files to sort accordingly.
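The ordering rule has a practical consequence for file naming, which a short Python sketch can show. It assumes the sort key is the plain file-name string (the exact key a consuming rule uses, such as basename versus full path, is not stated here), and `concat_in_order` is a hypothetical helper.

```python
from typing import Dict


def concat_in_order(files: Dict[str, bytes]) -> bytes:
    """Concatenate file contents in lexicographic order of their names.

    `files` maps a name to its bytes. Sorting is plain string comparison,
    so "10.ttl" sorts before "2.ttl".
    """
    return b"".join(files[name] for name in sorted(files))


# Unpadded numeric prefixes sort character by character, not numerically.
unpadded = concat_in_order({"2.ttl": b"two;", "10.ttl": b"ten;"})

# Zero-padding restores the intended order.
padded = concat_in_order({"02.ttl": b"two;", "10.ttl": b"ten;"})
```

If ordering matters, zero-padded prefixes (`01-`, `02-`, ...) are one simple way to make lexicographic and intended order agree.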
rdf_dataset
load("@rules_rdf//rdf:dataset.bzl", "rdf_dataset")
rdf_dataset(name, deps, srcs, in_format)
A labeled collection of RDF source files + linked-graph deps.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| deps | Other rdf_datasets this graph links to (imported ontologies, vocabulary modules). Their files are folded into this dataset’s transitive_files closure, so reasoning/query over the linked vocabularies resolves. Deps should share in_format (normalize otherwise). | List of labels | optional | [] |
| srcs | RDF source files. Concatenated in lexicographic order by consuming rules before being piped to the plugin’s stdin. | List of labels | required | |
| in_format | Serialization of every file in srcs. Mixed-format datasets aren’t supported in v0.1 — use rdf_transform first. | String | optional | "turtle" |
Providers for the four rules_rdf toolchain types.
Each provider wraps both the executable and the runfiles needed
to invoke it. Carrying runfiles in the provider matters for
plugin implementations that aren’t a single self-contained binary
— py_binary, java_binary, sh_binary all stage helper files via
runfiles. Consuming rules merge the provider’s runfiles into
their own to make the plugin actually executable inside a Bazel
sandbox.
RdfDatasetInfo
load("@rules_rdf//rdf:providers.bzl", "RdfDatasetInfo")
RdfDatasetInfo(files, transitive_files, in_format)
A declared RDF dataset.
FIELDS
RdfReasonerToolchainInfo
load("@rules_rdf//rdf:providers.bzl", "RdfReasonerToolchainInfo")
RdfReasonerToolchainInfo(binary, runfiles, files_to_run)
An RDF inference engine. Resolved by rdf_reason.
FIELDS
RdfSerializerToolchainInfo
load("@rules_rdf//rdf:providers.bzl", "RdfSerializerToolchainInfo")
RdfSerializerToolchainInfo(binary, runfiles, files_to_run)
An RDF format converter. Resolved by rdf_transform.
FIELDS
RdfValidatorToolchainInfo
load("@rules_rdf//rdf:providers.bzl", "RdfValidatorToolchainInfo")
RdfValidatorToolchainInfo(binary, runfiles, files_to_run)
An RDF validator (SHACL today; ShEx in scope for v0.2). Resolved by rdf_validate_test.
FIELDS
SparqlEngineToolchainInfo
load("@rules_rdf//rdf:providers.bzl", "SparqlEngineToolchainInfo")
SparqlEngineToolchainInfo(binary, runfiles, files_to_run)
A SPARQL query engine. Resolved by sparql_query_test and sparql_query_run.
FIELDS
User-facing inference rules.
rdf_reason runs the registered rdf_reasoner toolchain over an
RDF dataset and emits the derived-triples graph (Turtle) as a
build artifact. Unlike sparql_query_test / rdf_validate_test,
this is a regular rule — its output is a file that downstream
rules can declare as a src or data dependency.
load("@rules_rdf//rdf:dataset.bzl", "rdf_dataset")
load("@rules_rdf//reason:defs.bzl", "rdf_reason")
rdf_dataset(name = "ontology", srcs = glob(["*.ttl"]))
rdf_reason(
name = "inferred",
base = ":ontology",
profile = "rdfs",
)
For custom rule sets (Jena RETE rules):
rdf_reason(
name = "inferred",
base = ":ontology",
profile = "custom",
rules = "rules/transitive.rule",
)
The reasoner toolchain implementation decides which profiles are
supported; the abstract layer only validates that profile = "custom" is paired with rules and vice versa.
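The pairing check described above is small enough to state as code. This Python sketch mirrors the rule "custom requires rules, and rules requires custom"; the function name `check_reason_attrs` and the error wording are assumptions, and the real check lives in Starlark rather than Python.

```python
from typing import Optional


def check_reason_attrs(profile: str, rules: Optional[str]) -> Optional[str]:
    """Return an error message if profile and rules are mispaired, else None.

    Profile names other than "custom" are passed through untouched:
    the toolchain implementation decides which ones it supports.
    """
    if profile == "custom" and rules is None:
        return 'profile = "custom" requires rules'
    if profile != "custom" and rules is not None:
        return 'rules is only valid with profile = "custom"'
    return None
```

Note that the sketch deliberately does not validate the profile name itself, matching the statement that supported profiles are the toolchain's concern.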
rdf_reason
load("@rules_rdf//reason:defs.bzl", "rdf_reason")
rdf_reason(name, base, include_base, profile, rules)
Run inference over an RDF dataset; emit the derived-triples graph (Turtle).
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| base | RDF dataset to run inference over. | Label | required | |
| include_base | If True, emit base + derived triples; otherwise only the derived (default). | Boolean | optional | False |
| profile | Reasoning profile. custom requires rules. | String | optional | "rdfs" |
| rules | Custom rule file (Jena RETE syntax). Required iff profile = ‘custom’. | Label | optional | None |
User-facing SPARQL rules.
sparql_query_test is the zero-row gate idiom: declare an
invariant as a SPARQL query whose result set is empty when the
graph satisfies the invariant. CI runs it as a Bazel test; any
returned row triggers a failure.
It’s the rules_rdf analog of the production GateZeroRows.java
pattern in the Aion RFC repo’s kg/java/. v0.1 wires the rule
through sparql_engine_toolchain_type; the actual SPARQL
execution comes from whichever concrete toolchain the consumer
registered (rules_jena, a future rules_rdflib, etc.).
load("@rules_rdf//rdf:dataset.bzl", "rdf_dataset")
load("@rules_rdf//sparql:defs.bzl", "sparql_query_test")
rdf_dataset(name = "corpus", srcs = glob(["*.ttl"]))
sparql_query_test(
name = "no_dangling_refs",
dataset = ":corpus",
query = "queries/dangling.rq",
)
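The zero-row gate itself reduces to "did the query return any rows?". A minimal Python sketch, assuming the engine emits tabular results as TSV with a single header line (as in the SPARQL 1.1 TSV results format); the helper name `zero_row_gate` is hypothetical and the real test may inspect a different serialization.

```python
def zero_row_gate(tsv_text: str) -> bool:
    """Return True when a TSV result set contains no data rows.

    The first non-blank line is treated as the header of variable names;
    any further non-blank line is a result row and means the invariant
    was violated.
    """
    lines = [line for line in tsv_text.splitlines() if line.strip()]
    return len(lines) <= 1


# Header only: the invariant holds.
passes = zero_row_gate("?subject\t?ref\n")

# One offending row: the gate fails and the row itself is the diagnostic.
fails = not zero_row_gate("?subject\t?ref\n<urn:a>\t<urn:missing>\n")
```

A useful property of this idiom is that a failing query's rows double as the error report: each row names a concrete offender.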
sparql_query
load("@rules_rdf//sparql:defs.bzl", "sparql_query")
sparql_query(name, dataset, out_format, query)
Run a SPARQL query and emit the results as a build artifact (the producer counterpart to sparql_query_test’s gate). Turns a reasoned graph into queryable, downstream-consumable data — e.g. grounding tuples for training-data generation.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| dataset | The rdf_dataset (closure) to query. | Label | required | |
| out_format | Result serialization. Tabular (tsv/csv/json/xml) for SELECT/ASK; RDF (turtle/ntriples/…) for CONSTRUCT/DESCRIBE (also yields an rdf_dataset). | String | required | |
| query | The SPARQL query file (SELECT/ASK → tabular; CONSTRUCT/DESCRIBE → graph). | Label | required | |
sparql_query_smoke_test
load("@rules_rdf//sparql:defs.bzl", "sparql_query_smoke_test")
sparql_query_smoke_test(name, dataset, queries)
Assert that a set of SPARQL queries all parse + execute against a dataset. The query-smoke gate idiom — catches syntax errors and reference rot after schema changes.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| dataset | An rdf_dataset the queries run against. | Label | required | |
| queries | SPARQL query files. The test passes iff every one parses and executes without error (no row-count assertion; that’s sparql_query_test). | List of labels | required | |
sparql_query_test
load("@rules_rdf//sparql:defs.bzl", "sparql_query_test")
sparql_query_test(name, dataset, query)
Run a SPARQL query against an RDF dataset; fail if the result set is non-empty. The zero-row gate idiom.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| dataset | An rdf_dataset whose triples the query runs against. | Label | required | |
| query | The SPARQL query file. Result set must be empty for the test to pass (per --fail-on-nonempty). | Label | required | |
Toolchain registration rules for rules_rdf.
One rule per toolchain type. Each takes the plugin binary as a
mandatory exec-config label and exposes the matching *ToolchainInfo
provider with both the binary File and its runfiles bundle.
Concrete plugins (rules_jena, rules_rdflib, …) register via:
sparql_engine_toolchain(
name = "jena_arq_sparql_toolchain",
binary = ":jena_sparql",
)
toolchain(
name = "jena_arq_sparql",
toolchain = ":jena_arq_sparql_toolchain",
toolchain_type = "@rules_rdf//rdf:sparql_engine_toolchain_type",
)
rdf_reasoner_toolchain
load("@rules_rdf//rdf:toolchains.bzl", "rdf_reasoner_toolchain")
rdf_reasoner_toolchain(name, binary)
Declare an RDF reasoner (inference) toolchain.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| binary | The plugin executable. Must conform to the contract in rdf/plugin_contract.md. | Label | required | |
rdf_serializer_toolchain
load("@rules_rdf//rdf:toolchains.bzl", "rdf_serializer_toolchain")
rdf_serializer_toolchain(name, binary)
Declare an RDF serializer (format-converter) toolchain.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| binary | The plugin executable. Must conform to the contract in rdf/plugin_contract.md. | Label | required | |
rdf_validator_toolchain
load("@rules_rdf//rdf:toolchains.bzl", "rdf_validator_toolchain")
rdf_validator_toolchain(name, binary)
Declare an RDF validator toolchain.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| binary | The plugin executable. Must conform to the contract in rdf/plugin_contract.md. | Label | required | |
sparql_engine_toolchain
load("@rules_rdf//rdf:toolchains.bzl", "sparql_engine_toolchain")
sparql_engine_toolchain(name, binary)
Declare a SPARQL engine toolchain.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| binary | The plugin executable. Must conform to the contract in rdf/plugin_contract.md. | Label | required | |
User-facing format-conversion rule.
rdf_transform re-serializes an RDF dataset into a different
format via the registered rdf_serializer toolchain. The output
is a regular build artifact.
load("@rules_rdf//rdf:dataset.bzl", "rdf_dataset")
load("@rules_rdf//transform:defs.bzl", "rdf_transform")
rdf_dataset(name = "src_turtle", srcs = ["data.ttl"], in_format = "turtle")
rdf_transform(
name = "data_ntriples",
dataset = ":src_turtle",
out_format = "ntriples",
)
Output filename = <name>.<ext> where <ext> is the canonical
extension for out_format (.ttl, .nt, .nq, .trig,
.jsonld, .rdf).
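The naming rule can be sketched as a lookup from format name to extension. The extensions are the ones listed above, and `turtle` and `ntriples` appear in the example; the other four format names (`nquads`, `trig`, `jsonld`, `rdfxml`) are assumed spellings chosen for illustration, as is the helper `output_filename`.

```python
# Extensions come from the text above. Only "turtle" and "ntriples" are
# format names confirmed by the example; the remaining keys are assumed.
CANONICAL_EXTENSIONS = {
    "turtle": ".ttl",
    "ntriples": ".nt",
    "nquads": ".nq",      # assumed format name
    "trig": ".trig",      # assumed format name
    "jsonld": ".jsonld",  # assumed format name
    "rdfxml": ".rdf",     # assumed format name
}


def output_filename(name: str, out_format: str) -> str:
    """Return <name>.<ext> for the given target name and out_format."""
    try:
        return name + CANONICAL_EXTENSIONS[out_format]
    except KeyError:
        raise ValueError(f"unsupported out_format: {out_format!r}") from None
```

For the example target above, `output_filename("data_ntriples", "ntriples")` yields `data_ntriples.nt`.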
rdf_transform
load("@rules_rdf//transform:defs.bzl", "rdf_transform")
rdf_transform(name, dataset, out_format)
Convert an RDF dataset between serializations.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| dataset | RDF dataset to convert. | Label | required | |
| out_format | Target serialization. | String | required | |
User-facing RDF validation rules.
rdf_validate_test runs a SHACL shapes graph against an RDF
dataset and fails the build if any violations are reported.
Resolves through rdf_validator_toolchain_type so the actual
SHACL engine is pluggable (rules_jena’s
org.apache.jena.shacl.ShaclValidator, a future
rules_pyshacl, …).
load("@rules_rdf//rdf:dataset.bzl", "rdf_dataset")
load("@rules_rdf//validate:defs.bzl", "rdf_validate_test")
rdf_dataset(name = "ontology", srcs = glob(["ontology/*.ttl"]))
rdf_validate_test(
name = "ontology_conforms",
dataset = ":ontology",
shapes = "shapes.ttl",
)
ShEx support is in scope for v0.2 (the toolchain contract leaves
room for it via the --shapes-language arg, but for v0.1 the
shapes file is assumed Turtle-encoded SHACL).
rdf_validate_test
load("@rules_rdf//validate:defs.bzl", "rdf_validate_test")
rdf_validate_test(name, dataset, severity, shapes)
Validate an RDF dataset against a SHACL shapes graph.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| dataset | An rdf_dataset to validate. | Label | required | |
| severity | Minimum severity that fails the build. | String | optional | "violation" |
| shapes | SHACL shapes graph (Turtle). | Label | required | |
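One way to read "minimum severity that fails the build" is as a threshold over SHACL's three severities (sh:Info, sh:Warning, sh:Violation). The Python sketch below shows that reading. Only `"violation"` is a documented value of `severity`; the lowercase names `"warning"` and `"info"` and the helper `fails_build` are assumptions.

```python
from typing import Iterable

# SHACL severities from least to most severe. "violation" is the documented
# default; "info" and "warning" are assumed spellings.
SEVERITY_RANK = {"info": 0, "warning": 1, "violation": 2}


def fails_build(result_severities: Iterable[str], minimum: str = "violation") -> bool:
    """Return True if any validation result is at or above `minimum`."""
    threshold = SEVERITY_RANK[minimum]
    return any(SEVERITY_RANK[s] >= threshold for s in result_severities)
```

Under this reading, the default lets warnings through, and lowering the threshold to `"warning"` makes the gate stricter without changing the shapes graph.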
rules_jena
API reference, generated from the module’s .bzl docstrings (stardoc).
rules_jena roadmap
Three releases get from “scaffold” to a full Jena-backed implementation
of every rules_rdf toolchain type, plus a small set of user-facing
convenience rules. Each row points at the production pattern in
~/Documents/rfcs/kg/java/ that gets ported; see
SOURCES.md for the full catalog.
v0.1 (next)
Stand up the shared library and the first toolchain. Goal: prove
rules_rdf’s contract works for a Jena backend end-to-end.
- Port Loader.java (corpus-loading) and Writer.java (deterministic Turtle) plus their tests (WriterTest.java) into a public :rdf_io java_library at the rules_jena root.
- Implement one toolchain, the SPARQL engine, as a java_binary registered under rules_rdf’s sparql_engine_toolchain_type:
  - Read the SPARQL query path from argv (--query=...).
  - Read the data graph as Turtle from stdin.
  - Emit results (TSV or JSON) on stdout, diagnostics on stderr, non-zero exit on parse / execution failure.
  - Same shape as the gate_query_smoke target in kg/java/BUILD.bazel.
- Gate the toolchain binary with rules_rdf’s plugin-contract conformance test (the analog of jsonschema_plugin_contract_test).
- Register the default toolchain in MODULE.bazel.
- Maven coordinates for jena-arq/jena-core/jena-base/jena-iri/slf4j-simple declared inline (no rules_jvm_external pin yet; defer to v0.2 once the full set of binaries is in scope).
v0.2
Round out the toolchain implementations. Goal: every rules_rdf
toolchain type has a Jena-backed default.
- Port GateHarness.java, Gates.java, GateZeroRows.java, GateShacl.java, GateQuerySmoke.java as additional toolchain-backing binaries, one per gate shape.
- Implement the SHACL validator toolchain on top of org.apache.jena.shacl.ShaclValidator (mirroring GateShacl).
- Implement the RDF serializer toolchain on top of RDFDataMgr plus the Writer.java invariants.
- Implement the OWL reasoner toolchain via ReasonerRegistry.getOWLMicroReasoner() (the same call KgReasoner makes in production).
- Pin Maven artifacts via rules_jvm_external. Single maven.install block in MODULE.bazel; a locked maven_install.json checked in. Removes the hand-rolled @maven//:org_apache_jena_* references that consumers maintain today.
- Smoke fixture using a tiny pinned ontology (a few classes, a few SHACL shapes, a handful of .rq files) so every toolchain gets an end-to-end test under //examples/smoke.
v0.3
Expose the higher-level patterns the corpus uses every day. Goal: a
downstream consumer can replace kg/java/ with a thin BUILD file
that loads from rules_jena.
- Extract kg/lint/ patterns into a reusable jena_lint rule: orphan / consistency checks driven by a user-supplied query set.
- Extract kg/rules/ patterns into a jena_reason rule: runs a pinned set of Jena rule files over a Dataset and emits a deterministic inferred Turtle output (mirroring kg_reasoner --check).
- Provide a jena_corpus macro that takes an ontology dir, a TTL glob, a queries dir and stitches together the gate test_suite the corpus uses today.
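To make the jena_corpus idea concrete, here is a hypothetical sketch of such a macro. Nothing in it is a shipped API: the signature is an assumption (it takes already-globbed file lists rather than directories), and it only composes jena_model with the rules_rdf sparql_query_test rule and a native test_suite.

```starlark
# corpus.bzl (hypothetical sketch of the planned v0.3 macro; not a shipped API)
load("@rules_jena//jena:defs.bzl", "jena_model")
load("@rules_rdf//sparql:defs.bzl", "sparql_query_test")

def jena_corpus(name, ttl_srcs, queries):
    """Declares one model plus one query test per .rq file, grouped in a test_suite.

    Args:
      name: name of the resulting test_suite.
      ttl_srcs: Turtle files of the corpus, e.g. glob(["ontology/**/*.ttl"]).
      queries: SPARQL files, e.g. glob(["queries/*.rq"]).
    """
    model = name + "_model"
    jena_model(name = model, srcs = ttl_srcs, in_format = "turtle")

    tests = []
    for query in queries:
        # Derive a target name from the path: queries/a.rq -> <name>_queries_a_rq
        test = name + "_" + query.replace("/", "_").replace(".", "_")
        sparql_query_test(name = test, dataset = ":" + model, query = query)
        tests.append(":" + test)

    native.test_suite(name = name, tests = tests)
```

A BUILD file would call it as jena_corpus(name = "gates", ttl_srcs = glob(["ontology/**/*.ttl"]), queries = glob(["queries/*.rq"])), and bazel test :gates would then run every query test.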
Source-of-truth patterns to port
The Jena patterns rules_jena packages as toolchains all exist in
production today, in the Aion RFC knowledge-graph tree at
~/Documents/rfcs/kg/java/. This file catalogs which files we extract
and what each one teaches.
Reference BUILD files (read-only sources of the wiring):
- ~/Documents/rfcs/kg/java/BUILD.bazel: the JENA_DEPS list, the :loader, :writer, :gate_harness java_library declarations, plus the gate_* java_tests.
- ~/Documents/rfcs/kg/java/reasoner/BUILD.bazel: the kg_reasoner java_binary (OWL-MICRO inference).
Files
| File (under ~/Documents/rfcs/kg/java/) | What it teaches |
|---|---|
| Loader.java | Single shared library that loads every TTL under a corpus root into one in-memory Jena Dataset. Every downstream binary depends on this; the port becomes the public :rdf_io library at the rules_jena root. |
| Writer.java | Deterministic Turtle serializer. Stable prefix ordering, stable blank-node labels, byte-stable round-trip. The invariants documented in the file header become the rules_jena serializer toolchain’s contract. |
| WriterTest.java | Round-trip + parse-equivalence test: write(load(write(model))) byte-equals write(model) and the result is isomorphic to the input. Ports as the rules_jena serializer-toolchain conformance test. |
| GateHarness.java | Orchestrates a set of SPARQL zero-row checks plus a SHACL conformance check against one Dataset. The “compose multiple gates into one suite” pattern. |
| Gates.java | Shared query-plumbing helpers (load a .rq from disk, execute against a Dataset, collect results). Used by every Gate* binary; ports as a private helper for the SPARQL + SHACL toolchain binaries. |
| GateZeroRows.java | The “run one .rq and fail if it returns >0 rows” gate shape. Generalizes to the rules_jena SPARQL toolchain’s zero-row mode. |
| GateQuerySmoke.java | The “every .rq under a dir parses + executes” gate shape. The conformance-test analog for SPARQL toolchains. |
| GateShacl.java | The “load shapes.ttl + data, run ShaclValidator, fail on non-conforming” gate shape. Becomes the rules_jena SHACL validator toolchain core. |
| reasoner/KgReasoner.java | OWL-MICRO inference via ReasonerRegistry.getOWLMicroReasoner(), plus a Jena rule-file driver. Deterministic, idempotent output. Becomes the rules_jena OWL reasoner toolchain. |
What we do not port (yet)
These exist in the kg/java/ tree but are corpus-specific, not generic
Jena tooling — they belong in a downstream consumer, not in
rules_jena:
- KgReport.java, CrossCutting.java: Aion-specific reporting.
- AionPaths.java: XDG/macOS/Windows config paths (not Jena).
- BuildSummary.java: SUMMARY.md drift gate (not Jena).
- Anything under kg/java/edit/, kg/java/lint/, kg/java/metrics/, kg/java/research/: corpus-specific CLIs. The reusable shapes inside them (lint rules, metric formulas) land as jena_lint / jena_reason in v0.3.
jena_dataset(name, default_graph, named_graphs) — a Jena
Dataset: a default graph plus a set of named graphs addressable
by IRI.
Provider-only. Composes jena_model labels. Like jena_model,
also emits RdfDatasetInfo (the union of all triples across the
default + named graphs) so rules_rdf rules consume it transparently.
load("@rules_jena//jena:defs.bzl", "jena_model", "jena_dataset")
jena_model(name = "core", srcs = ["core.ttl"], in_format = "turtle")
jena_model(name = "facts", srcs = ["facts.ttl"], in_format = "turtle")
jena_model(name = "claims", srcs = ["claims.ttl"], in_format = "turtle")
jena_dataset(
    name = "corpus",
    default_graph = ":core",
    named_graphs = {
        "http://example.org/g/facts": ":facts",
        "http://example.org/g/claims": ":claims",
    },
)
Datasets without named graphs are a degenerate case — for those,
use jena_model directly.
jena_dataset
load("@rules_jena//jena:dataset.bzl", "jena_dataset")
jena_dataset(name, default_graph, named_graphs)
A Jena Dataset composed of named-graph jena_models + an optional default graph.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| default_graph | A jena_model whose triples form the dataset’s default graph (unnamed). Optional. | Label | optional | None |
| named_graphs | Map of graph IRI → jena_model label. Each entry becomes a named graph in the resulting Dataset. | Dictionary: String -> Label | optional | {} |
Public API surface for rules_jena.
Re-exports the v0.2 user-facing rules (Bazel-idiomatic Jena data
primitives) + the JENA_DEPS Maven label set shared with anyone
writing their own Jena java_binary.
load(
    "@rules_jena//jena:defs.bzl",
    "JENA_DEPS",
    "jena_model", "jena_dataset", "jena_rule_set", "jena_reasoner",
    "JenaModelInfo", "JenaDatasetInfo", "JenaRuleSetInfo", "JenaReasonerInfo",
)
Pair with the rules_rdf user-facing test rules (sparql_query_test,
rdf_validate_test) — jena_model / jena_dataset emit both
JenaModelInfo / JenaDatasetInfo AND RdfDatasetInfo, so they’re
drop-in replacements for rdf_dataset in any rules_rdf rule.
rules_jena’s MODULE.bazel auto-registers four toolchains
satisfying every rules_rdf toolchain type — pulling in
rules_jena is enough to run any of sparql_query_test,
rdf_validate_test, rdf_transform, rdf_reason. v0.2’s
jena_reason build action is the consumer-facing alternative
when a downstream rule wants a concrete file artifact instead of
the test-shaped rdf_reason.
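For the JENA_DEPS re-export mentioned above, a consumer writing their own Jena tool might wire it up as follows. This is a sketch under assumptions: the target name, source file, and main class are hypothetical, JENA_DEPS is taken to be a list of labels usable directly in deps, and the load of java_binary assumes rules_java is available in the workspace.

```starlark
# BUILD.bazel (sketch; target, source file, and main class are hypothetical)
load("@rules_java//java:defs.bzl", "java_binary")
load("@rules_jena//jena:defs.bzl", "JENA_DEPS")

java_binary(
    name = "my_jena_tool",
    srcs = ["MyJenaTool.java"],
    main_class = "com.example.MyJenaTool",
    # Reuse the Jena label set instead of listing @maven//... targets by hand.
    deps = JENA_DEPS,
)
```

Sharing one label set keeps a custom binary on the same Jena artifacts as the registered toolchains.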
jena_dataset
load("@rules_jena//jena:defs.bzl", "jena_dataset")
jena_dataset(name, default_graph, named_graphs)
A Jena Dataset composed of named-graph jena_models + an optional default graph.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| default_graph | A jena_model whose triples form the dataset’s default graph (unnamed). Optional. | Label | optional | None |
| named_graphs | Map of graph IRI → jena_model label. Each entry becomes a named graph in the resulting Dataset. | Dictionary: String -> Label | optional | {} |
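Because jena_dataset also emits RdfDatasetInfo, a dataset label can go wherever a rules_rdf rule expects a dataset. A small sketch, reusing the sparql_query_test attributes shown elsewhere in these docs; the file names are illustrative assumptions.

```starlark
# BUILD.bazel (sketch; file names are illustrative)
load("@rules_jena//jena:defs.bzl", "jena_dataset", "jena_model")
load("@rules_rdf//sparql:defs.bzl", "sparql_query_test")

# in_format is omitted because it defaults to "turtle".
jena_model(name = "core", srcs = ["core.ttl"])
jena_model(name = "facts", srcs = ["facts.ttl"])

jena_dataset(
    name = "corpus",
    default_graph = ":core",
    named_graphs = {"http://example.org/g/facts": ":facts"},
)

# The test sees the dataset through RdfDatasetInfo, i.e. the union of the
# default graph and every named graph.
sparql_query_test(
    name = "corpus_check",
    dataset = ":corpus",
    query = "queries/check.rq",
)
```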
jena_model
load("@rules_jena//jena:defs.bzl", "jena_model")
jena_model(name, srcs, base_iri, in_format)
A Jena Model (single RDF graph) declared as Bazel data.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| srcs | Source RDF files for this single graph. Concatenated in lexicographic order by Jena tools. | List of labels | required | |
| base_iri | Optional base IRI for resolving relative references in srcs. Empty = none. | String | optional | "" |
| in_format | Serialization of every file in srcs. Mixed formats aren’t supported — pipe through rdf_transform first if you need to combine. | String | optional | "turtle" |
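As an illustration of the attributes in the table, the following sketch declares one graph from several Turtle files and sets a base IRI for relative references. The file names and the IRI are made up for the example.

```starlark
# BUILD.bazel (sketch; file names and IRI are illustrative)
load("@rules_jena//jena:defs.bzl", "jena_model")

jena_model(
    name = "vocab",
    # All files must share one serialization; they form a single graph.
    srcs = [
        "vocab/classes.ttl",
        "vocab/properties.ttl",
    ],
    # Relative IRIs inside the sources resolve against this base.
    base_iri = "http://example.org/vocab/",
    in_format = "turtle",
)
```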
jena_reasoner
load("@rules_jena//jena:defs.bzl", "jena_reasoner")
jena_reasoner(name, profile, rule_set)
A Jena reasoner configuration (provider-only).
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| profile | Built-in profile name (rdfs, owl-rl, owl-mini, owl-micro) or custom. custom requires rule_set. | String | optional | "rdfs" |
| rule_set | A jena_rule_set label. Required iff profile = ‘custom’. | Label | optional | None |
jena_rule_set
load("@rules_jena//jena:defs.bzl", "jena_rule_set")
jena_rule_set(name, rules)
A set of Jena rule files for the rule-engine reasoner.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| rules | Jena rule files. Each must follow the rule-engine syntax at https://jena.apache.org/documentation/inference/#rules. | List of labels | required | |
JenaDatasetInfo
load("@rules_jena//jena:defs.bzl", "JenaDatasetInfo")
JenaDatasetInfo(default_graph, named_graphs)
A Jena Dataset (collection of named graphs + an optional default graph). Used by rules that need named-graph addressability (Fuseki, multi-graph SPARQL).
FIELDS
| Name | Description |
|---|---|
| default_graph | JenaModelInfo | None: triples that live outside any named graph. |
| named_graphs | dict[str, JenaModelInfo]: graph IRI → model. Order-preserving. |
JenaModelInfo
load("@rules_jena//jena:defs.bzl", "JenaModelInfo")
JenaModelInfo(files, in_format, base_iri)
A single Jena Model (RDF graph). Provider-only — the files declared on the rule remain the source of truth.
FIELDS
JenaReasonerInfo
load("@rules_jena//jena:defs.bzl", "JenaReasonerInfo")
JenaReasonerInfo(profile, rule_set)
A Jena reasoner configuration. Either a built-in profile (rdfs, owl-rl, owl-mini, owl-micro) or a custom rule set; never both. Consumed by jena_reason and by the rdf_reasoner_toolchain_type plugin contract.
FIELDS
| Name | Description |
|---|---|
| profile | str: built-in profile name, or empty if custom. |
| rule_set | JenaRuleSetInfo | None: rule set for the custom profile. |
JenaRuleSetInfo
load("@rules_jena//jena:defs.bzl", "JenaRuleSetInfo")
JenaRuleSetInfo(files)
A set of Jena rule files consumed by the rule-engine reasoner (Jena’s RETE-based forward/backward inference). See https://jena.apache.org/documentation/inference/ for the rule syntax. Distinct from SPARQL .rq files.
FIELDS
jena_model(name, srcs, in_format, base_iri) — declare one RDF
graph as a Jena-aware data primitive.
Provider-only: no Bazel actions, no parsed-form artifacts. The
srcs files remain the source of truth; downstream rules either
read them directly or feed them to a Java tool that parses them
into an in-memory Model.
Every jena_model ALSO emits RdfDatasetInfo (the abstract
provider from rules_rdf) so it’s a drop-in dataset for any
rules_rdf rule:
load("@rules_jena//jena:defs.bzl", "jena_model")
load("@rules_rdf//sparql:defs.bzl", "sparql_query_test")
jena_model(
    name = "ontology",
    srcs = ["ontology.ttl"],
    in_format = "turtle",
)
sparql_query_test(  # works: resolves via RdfDatasetInfo
    name = "ontology_well_formed",
    dataset = ":ontology",
    query = "queries/check.rq",
)
For named-graph use cases see jena_dataset.
jena_model
load("@rules_jena//jena:model.bzl", "jena_model")
jena_model(name, srcs, base_iri, in_format)
A Jena Model (single RDF graph) declared as Bazel data.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| srcs | Source RDF files for this single graph. Concatenated in lexicographic order by Jena tools. | List of labels | required | |
| base_iri | Optional base IRI for resolving relative references in srcs. Empty = none. | String | optional | "" |
| in_format | Serialization of every file in srcs. Mixed formats aren’t supported — pipe through rdf_transform first if you need to combine. | String | optional | "turtle" |
Provider types for the rules_jena public API.
The data primitives (jena_model, jena_dataset, jena_rule_set,
jena_reasoner) are provider-only — they carry references to
files + small Jena-shaped config, no build actions. Build-action
rules (jena_reason, the rules_rdf-driven rdf_validate_test,
etc.) consume them.
Every data-providing rule also emits the abstract RdfDatasetInfo
from rules_rdf, so jena_model / jena_dataset are drop-in
replacements for rdf_dataset in any rules_rdf rule. Consumers
who want Jena-aware features (named graphs, rule sets, OWL
profiles) reach for the Jena providers; everyone else stays on
the abstract interface.
The names use the package-prefixed convention (JenaXInfo) so
that a bare JenaModelInfo import is unambiguous next to
the rules_rdf RdfDatasetInfo.
JenaDatasetInfo
load("@rules_jena//jena:providers.bzl", "JenaDatasetInfo")
JenaDatasetInfo(default_graph, named_graphs)
A Jena Dataset (collection of named graphs + an optional default graph). Used by rules that need named-graph addressability (Fuseki, multi-graph SPARQL).
FIELDS
| Name | Description |
|---|---|
| default_graph | JenaModelInfo | None: triples that live outside any named graph. |
| named_graphs | dict[str, JenaModelInfo]: graph IRI → model. Order-preserving. |
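A downstream rule that wants named-graph addressability reads JenaDatasetInfo directly. The sketch below is a hypothetical graph_manifest rule, not part of rules_jena, that writes one line per graph. It assumes in_format on JenaModelInfo is a plain string, which the provider signature suggests but the field table does not confirm.

```starlark
# graph_manifest.bzl (hypothetical consumer rule; not part of rules_jena)
load("@rules_jena//jena:providers.bzl", "JenaDatasetInfo")

def _graph_manifest_impl(ctx):
    info = ctx.attr.dataset[JenaDatasetInfo]
    lines = []

    # default_graph is a JenaModelInfo or None.
    if info.default_graph != None:
        lines.append("(default)\t" + info.default_graph.in_format)

    # named_graphs maps graph IRI -> JenaModelInfo, in declaration order.
    for iri, model in info.named_graphs.items():
        lines.append(iri + "\t" + model.in_format)

    out = ctx.actions.declare_file(ctx.label.name + ".tsv")
    ctx.actions.write(output = out, content = "\n".join(lines) + "\n")
    return [DefaultInfo(files = depset([out]))]

graph_manifest = rule(
    implementation = _graph_manifest_impl,
    attrs = {
        # Only targets that provide JenaDatasetInfo are accepted.
        "dataset": attr.label(mandatory = True, providers = [JenaDatasetInfo]),
    },
)
```

Rules that do not care about graph names can keep requesting RdfDatasetInfo instead and accept jena_model, jena_dataset, and rdf_dataset targets alike.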
JenaModelInfo
load("@rules_jena//jena:providers.bzl", "JenaModelInfo")
JenaModelInfo(files, in_format, base_iri)
A single Jena Model (RDF graph). Provider-only — the files declared on the rule remain the source of truth.
FIELDS
JenaReasonerInfo
load("@rules_jena//jena:providers.bzl", "JenaReasonerInfo")
JenaReasonerInfo(profile, rule_set)
A Jena reasoner configuration. Either a built-in profile (rdfs, owl-rl, owl-mini, owl-micro) or a custom rule set; never both. Consumed by jena_reason and by the rdf_reasoner_toolchain_type plugin contract.
FIELDS
| Name | Description |
|---|---|
| profile | str: built-in profile name, or empty if custom. |
| rule_set | JenaRuleSetInfo | None: rule set for the custom profile. |
JenaRuleSetInfo
load("@rules_jena//jena:providers.bzl", "JenaRuleSetInfo")
JenaRuleSetInfo(files)
A set of Jena rule files consumed by the rule-engine reasoner (Jena’s RETE-based forward/backward inference). See https://jena.apache.org/documentation/inference/ for the rule syntax. Distinct from SPARQL .rq files.
FIELDS
jena_reasoner(name, profile|rule_set) — declare a reasoner
configuration.
Provider-only. Either a built-in profile or profile = "custom" with a rule_set; the rule rejects a built-in profile combined with a rule_set, and custom without one.
Built-in profiles map onto Jena’s
ReasonerRegistry:
| profile | Jena equivalent |
|---|---|
| rdfs | ReasonerRegistry.getRDFSReasoner() |
| owl-rl | ReasonerRegistry.getOWLReasoner() |
| owl-mini | ReasonerRegistry.getOWLMiniReasoner() |
| owl-micro | ReasonerRegistry.getOWLMicroReasoner() |
| custom | GenericRuleReasoner with the given rule_set. |
The Aion production kg_reasoner uses owl-micro plus
purpose-written Jena rule files; both shapes have first-class
support here.
load("@rules_jena//jena:defs.bzl", "jena_rule_set", "jena_reasoner")
jena_rule_set(name = "kg_rules", rules = glob(["rules/*.rule"]))
jena_reasoner(name = "owl_micro_plus_kg", profile = "custom", rule_set = ":kg_rules")
To actually apply the reasoner to a base model, use jena_reason
(which runs a build action) or the abstract rdf_reason rule
from rules_rdf (which resolves the reasoner toolchain).
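The example above covers the custom shape; the built-in shape needs no rule_set. A short sketch using only the attributes documented for jena_reasoner (the target names are illustrative):

```starlark
# BUILD.bazel (sketch; target names are illustrative)
load("@rules_jena//jena:defs.bzl", "jena_reasoner")

# Built-in profile: maps to ReasonerRegistry.getOWLMicroReasoner().
jena_reasoner(
    name = "owl_micro",
    profile = "owl-micro",
)

# profile defaults to "rdfs", so this declares an RDFS reasoner.
jena_reasoner(name = "rdfs_default")
```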
jena_reasoner
load("@rules_jena//jena:reasoner.bzl", "jena_reasoner")
jena_reasoner(name, profile, rule_set)
A Jena reasoner configuration (provider-only).
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| profile | Built-in profile name (rdfs, owl-rl, owl-mini, owl-micro) or custom. custom requires rule_set. | String | optional | "rdfs" |
| rule_set | A jena_rule_set label. Required iff profile = ‘custom’. | Label | optional | None |
jena_rule_set(name, rules) — a collection of Jena rule files
for the rule-engine reasoner.
Provider-only. Consumed by jena_reasoner(profile = "custom") and
by the rdf_reasoner_toolchain_type plugin contract when the
plugin’s --rules flag points at a file from this set.
Jena rule syntax (the RETE forward-chainer’s input — distinct from SPARQL):
@prefix ex: <http://example.org/> .
[transitiveSubOrg:
    (?a ex:partOf ?b),
    (?b ex:partOf ?c)
    -> (?a ex:partOf ?c)
]
See https://jena.apache.org/documentation/inference/#rules. The
file extension is .rule by convention; .txt is tolerated.
jena_rule_set
load("@rules_jena//jena:rules.bzl", "jena_rule_set")
jena_rule_set(name, rules)
A set of Jena rule files for the rule-engine reasoner.
ATTRIBUTES
| Name | Description | Type | Mandatory | Default |
|---|---|---|---|---|
| name | A unique name for this target. | Name | required | |
| rules | Jena rule files. Each must follow the rule-engine syntax at https://jena.apache.org/documentation/inference/#rules. | List of labels | required | |