Using Changesets in a polyglot monorepo

L luke.hsiao.dev ↗

▲ 20 points • 6 comments • by lwhsiao • 5w ago • HN discussion ↗

One of the nice things about working in a smaller business is you can enjoy using things that don’t need to scale to extreme sizes. One example in the software world is monorepos. While monorepos can scale well (see Google, Facebook, and others), doing so requires special tooling and infrastructure. With plain git, you can only go so far. But for as long as plain git suffices, a monorepo has meaningful advantages, like being able to make atomic changes that affect many parts of the system in a single commit, which eliminates whole classes of compatibility and integration issues. You can always split a monorepo later (see git-filter-repo).

So, suppose you’re a small-to-medium team using a monorepo. Let’s go further and say that this monorepo stores all your company’s code, meaning it spans many different programming languages: it’s a polyglot monorepo. What tool can you use to manage versioning in a consistent way?

I argue that changesets is a solid choice, even if it’s primarily focused on the JavaScript/TypeScript ecosystem.

Background

For any versioning tool, you are typically looking for how to:

- define what content appears in the changelog/release notes
- influence the version numbers of the packages
- automate the commits doing the actual metadata bumps and tagging
- automate the builds that happen in response

changesets assumes per-package semantic versioning (i.e., all packages have their own version). In addition, each package has its own CHANGELOG.md.

The changesets team also has a GitHub Action, changesets/action, which importantly allows specifying custom scripts for the version and publish commands. That customization is what gives changesets support for polyglot repositories.

In changesets, engineers commit “changeset” files to the repository that define what content ends up in changelogs, and which packages’ versions are bumped and by how much (i.e., major, minor, patch).

See the changesets documentation for more details.

Implementing an automated release process on GitHub

I’m a fan of just. I also really like uv scripts. The example below uses both.

I’m also going to assume you are in an enterprise setting where your entire monorepo is private, not open-source.
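Before getting into the setup, it may help to make the “changeset” files from the Background section concrete. Below is a minimal sketch, assuming the usual layout of a changeset: a Markdown file under .changeset/ whose front matter maps package names to a bump type, followed by the changelog summary. The helper names (write_changeset, parse_changeset), the file name, and the deliberately naive front-matter parsing are all made up for illustration; in practice you would normally let the changesets CLI generate these files.

```python
import tempfile
from pathlib import Path

BUMP_TYPES = {"major", "minor", "patch"}


def write_changeset(changeset_dir: Path, name: str, bumps: dict[str, str], summary: str) -> Path:
    """Write <name>.md: front matter mapping package -> bump type, then the summary."""
    for package, bump in bumps.items():
        if bump not in BUMP_TYPES:
            raise ValueError(f"{package}: unknown bump type {bump!r}")
    front_matter = "\n".join(f'"{package}": {bump}' for package, bump in bumps.items())
    path = changeset_dir / f"{name}.md"
    path.write_text(f"---\n{front_matter}\n---\n\n{summary}\n")
    return path


def parse_changeset(text: str) -> tuple[dict[str, str], str]:
    """Naively split a changeset into ({package: bump}, summary)."""
    # Assumes the text starts with a '---' delimited front matter block.
    _, front_matter, summary = text.split("---", 2)
    bumps = {}
    for line in front_matter.strip().splitlines():
        package, bump = line.rsplit(":", 1)
        bumps[package.strip().strip('"')] = bump.strip()
    return bumps, summary.strip()


if __name__ == "__main__":
    # Use a throwaway directory so the demo never touches a real .changeset/.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_changeset(
            Path(tmp),
            "example-change",  # hypothetical file name
            {"python-one": "minor", "rust-one": "patch"},
            "Add a new option to python-one and fix a bug in rust-one.",
        )
        print(path.read_text())
        print(parse_changeset(path.read_text()))
```

The point is only that a single changeset can bump several packages, in any language, by different amounts; nothing in the file format is specific to JavaScript.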

Repository setup

My recommended organization (at least at time of writing) is something like the following.

    .
    ├── .changeset
    │   ├── config.json
    │   └── README.md
    ├── contrib
    │   └── utils
    ├── docker
    │   └── Dockerfile
    ├── docs
    │   ├── package.json
    │   ├── pnpm-lock.yaml
    │   ├── ...
    │   └── pnpm-workspace.yaml
    ├── Justfile
    ├── package-lock.json
    ├── package.json
    ├── packages
    │   ├── python-one
    │   │   ├── ...
    │   │   └── package.json
    │   ├── rust-one
    │   │   ├── ...
    │   │   └── package.json
    │   └── rust-two
    │       ├── ...
    │       └── package.json
    ├── pnpm-workspace.yaml
    └── third-party

Put all packages in a packages/ directory, no matter what language they are. I also enjoy having documentation as code, so let’s say you have a docs/ directory, too, and that your docs are written with a JavaScript-based frontend (like Starlight), for the purposes of highlighting a nuance later.

Changeset configuration

With this setup, you can configure changesets with a proxy pnpm workspace at the root with all your packages.

    # pnpm-workspace.yaml
    packages:
      - "packages/**"

And declare your changesets dependencies:

    // package.json
    {
      "name": "example-monorepo",
      "private": true,
      "devDependencies": {
        "@changesets/changelog-git": "^0.2.0",
        "@changesets/cli": "^2.29.0"
      }
    }

You should now also update your .gitignore:

    node_modules/

Because changesets is built for JavaScript, we also need “proxy” package.json files for all of our packages; changesets uses these to perform version bumps.

These can be as simple as:

    // packages/python-one/package.json
    {
      "name": "python-one",
      "version": "0.1.0",
      "private": true
    }

With this setup, note how we are intentionally excluding our internal docs/ from the pnpm workspace members, because we only want to version packages.
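Going back to the proxy package.json files for a moment: they are simple enough to generate rather than write by hand. Here is a rough sketch, not part of the setup above, that creates one for a package directory that has a native manifest but no package.json yet. The function name, the NATIVE_MANIFESTS tuple, and the default starting version are assumptions; you would want the starting version to match whatever the native manifest already says.

```python
import json
import tempfile
from pathlib import Path

# Assumption: a directory counts as a package if it holds one of these manifests.
NATIVE_MANIFESTS = ("Cargo.toml", "pyproject.toml")


def ensure_proxy_package_json(pkg_dir: Path, version: str = "0.1.0") -> bool:
    """Create a minimal proxy package.json in pkg_dir; return True if one was written."""
    pkg_json = pkg_dir / "package.json"
    if pkg_json.exists():
        return False  # never overwrite an existing file and its current version
    if not any((pkg_dir / name).exists() for name in NATIVE_MANIFESTS):
        return False  # not a package this sketch knows about
    proxy = {"name": pkg_dir.name, "version": version, "private": True}
    pkg_json.write_text(json.dumps(proxy, indent=2) + "\n")
    return True


if __name__ == "__main__":
    # Demo against a throwaway tree instead of a real packages/ directory.
    with tempfile.TemporaryDirectory() as tmp:
        packages = Path(tmp) / "packages"
        (packages / "rust-one").mkdir(parents=True)
        (packages / "rust-one" / "Cargo.toml").write_text(
            '[package]\nname = "rust-one"\nversion = "0.1.0"\n'
        )
        for pkg_dir in sorted(p for p in packages.iterdir() if p.is_dir()):
            created = ensure_proxy_package_json(pkg_dir)
            print(f"{pkg_dir.name}: {'created' if created else 'skipped'}")
```

Refusing to overwrite an existing package.json matters here, because after the initial setup that file holds the version changesets has been bumping.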

To exclude docs/, declare the docs/ directory as its own pnpm workspace; otherwise it will try to combine the docs/ dependencies into the root package-lock.json. This can be as simple as:

    # docs/pnpm-workspace.yaml
    packages: []

Next, we can add our changeset configuration:

    // .changeset/config.json
    {
      "$schema": "https://unpkg.com/@changesets/[email protected]/schema.json",
      "changelog": "@changesets/changelog-git",
      "commit": false,
      "fixed": [],
      "linked": [],
      "access": "restricted",
      "baseBranch": "main",
      "updateInternalDependencies": "patch",
      "ignore": [],
      "privatePackages": {
        "version": true,
        "tag": true
      },
      "___experimentalUnsafeOptions_WILL_CHANGE_IN_PATCH": {
        "onlyUpdatePeerDependentsWhenOutOfRange": true
      }
    }

Automating releases with GitHub

The glue to create polyglot versioning PRs

Next, we want to automate our releases. That is, generating the changelog PRs, bumping package metadata, pushing tags, and triggering builds on those tags.

Let’s start with our GitHub Workflow definition, and unpack the scripts it calls.

    name: Release

    on:
      push:
        branches:
          - main

    concurrency: ${{ github.workflow }}-${{ github.ref }}

    permissions:
      contents: write
      pull-requests: write

    jobs:
      release:
        name: Release
        runs-on: ubuntu-latest
        outputs:
          published: ${{ steps.changesets.outputs.published }}
        steps:
          - uses: actions/checkout@v6
          - uses: actions/setup-node@v4
            with:
              cache: npm
          - uses: astral-sh/setup-uv@v7
          - uses: taiki-e/install-action@just
          - run: npm install
          - name: Create Release Pull Request or Tag
            id: changesets
            uses: changesets/action@v1
            with:
              version: just version
              publish: npx @changesets/cli publish
              # I like conventional commits
              commit: "chore(release): version packages"
              title: "chore(release): version packages"
            env:
              GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

      docker:
        needs: [release]
        if: needs.release.outputs.published == 'true'
        uses: ./.github/workflows/docker.yml
        secrets: inherit

You might be wondering why we call a workflow explicitly, rather than using something like on.push.tags as a trigger.

It turns out GitHub has two fatal flaws with that intuitive approach (at time of writing). First, if you push more than three tags at once, workflows will not trigger. Unfortunately, this is a relatively common scenario in a monorepo. Second, GitHub’s triggering of on.push.tags is highly unreliable. This unreliability is still present, even if you use a PAT as they instruct.

So, instead, consider an explicit workflow_call for the purpose, as I’ve done here.

Setting version: just version is the key to polyglot support.

    # Version packages based on changesets
    [doc('Consume changesets: bump versions, update changelogs, sync native version files.')]
    [group('release')]
    version:
        npx @changesets/cli version
        uv run --script contrib/utils/sync-versions.py

The meat of the glue for polyglot support, then, is how you implement sync-versions.py.

The key bit here is that we rely on changesets to bump the versions in package.json for us when we call npx @changesets/cli version, but then it is up to us to propagate that version to the respective language’s metadata appropriately.

Here is an example that uses pretty naive parsing. You can write something similar (or better!) for the languages you use.

    #!/usr/bin/env -S uv run --script
    #
    # /// script
    # requires-python = ">=3.12"
    # dependencies = []
    # ///
    #
    # Sync versions from package.json files (updated by changesets) to native
    # package manifests (Cargo.toml, pyproject.toml, etc.).

    import json
    import re
    import subprocess
    from enum import Enum, auto
    from pathlib import Path

    PACKAGES_DIR = Path(__file__).resolve().parent.parent.parent / "packages"

    class SyncResult(Enum):
        NOT_FOUND = auto()
        UP_TO_DATE = auto()
        UPDATED = auto()

    def read_package_json(pkg_dir: Path) -> dict | None:
        """Read and parse a package.json file."""
        pkg_json = pkg_dir / "package.json"
        if not pkg_json.exists():
            return None
        return json.loads(pkg_json.read_text())

    def update_cargo_toml(pkg_dir: Path, version: str) -> SyncResult:
        """Update version in [package] section of Cargo.toml."""
        cargo_toml = pkg_dir / "Cargo.toml"
        if not cargo_toml.exists():
            return SyncResult.NOT_FOUND

        lines = cargo_toml.read_text().splitlines(keepends=True)
        in_package_section = False

        for i, line in enumerate(lines):
            stripped = line.strip()

            # Track which TOML section we're in
            if stripped.startswith("["):
                in_package_section = stripped == "[package]"
                continue

            if in_package_section and stripped.startswith("version"):
                new_line = re.sub(
                    r'^(\s*version\s*=\s*")([^"]+)(")',
                    rf"\g<1>{version}\3",
                    line,
                )
                if new_line != line:
                    lines[i] = new_line
                    cargo_toml.write_text("".join(lines))
                    rel = cargo_toml.relative_to(PACKAGES_DIR.parent)
                    print(f" Updated {rel}")
                    return SyncResult.UPDATED
                return SyncResult.UP_TO_DATE

        return SyncResult.UP_TO_DATE

    def update_pyproject_toml(pkg_dir: Path, version: str) -> SyncResult:
        """Update version in [project] section of pyproject.toml."""
        pyproject = pkg_dir / "pyproject.toml"
        if not pyproject.exists():
            return SyncResult.NOT_FOUND

        lines = pyproject.read_text().splitlines(keepends=True)
        in_project_section = False

        for i, line in enumerate(lines):
            stripped = line.strip()

            # Track which TOML section we're in
            if stripped.startswith("["):
                in_project_section = stripped == "[project]"
                continue

            if in_project_section and stripped.startswith("version"):
                new_line = re.sub(
                    r'^(\s*version\s*=\s*")([^"]+)(")',
                    rf"\g<1>{version}\3",
                    line,
                )
                if new_line != line:
                    lines[i] = new_line
                    pyproject.write_text("".join(lines))
                    rel = pyproject.relative_to(PACKAGES_DIR.parent)
                    print(f" Updated {rel}")
                    return SyncResult.UPDATED
                return SyncResult.UP_TO_DATE

        return SyncResult.UP_TO_DATE

    def refresh_lockfiles() -> None:
        """Refresh all lockfiles under the repo to match updated versions."""
        repo_root = PACKAGES_DIR.parent

        print("Refreshing lockfiles...")

        # Cargo.lock: root workspace + any standalone crate lockfiles
        cargo_locks = sorted(
            set(repo_root.glob("Cargo.lock")) | set(PACKAGES_DIR.rglob("Cargo.lock"))
        )
        for cargo_lock in cargo_locks:
            lock_dir = cargo_lock.parent
            rel = lock_dir.relative_to(repo_root) or Path(".")