cache: publish cached closure narinfos by amaanq · Pull Request #127 · manic-systems/circus

amaanq · 2026-06-24T19:48:38Z

Cached builds that reused an existing output only published narinfos for the primary output, so clients could miss dependencies from the project cache.

Publish signed narinfos for the whole closure, keep project ownership separate for shared store paths, and move recursive lookup and signing out of the scheduler loop.

Cached builds that reused an existing output only published narinfos for the primary output, so clients could miss dependencies from the project cache. Publish signed narinfos for the whole closure, keep project ownership separate for shared store paths, and move recursive lookup and signing out of the scheduler loop. Co-authored-by: Amaan Qureshi <git@amaanq.com>

NotAShelf

Left some comments. Some of them are just a little bit speculative but I'd like to catch things before they explode.

NotAShelf · 2026-06-27T10:38:03Z

@@ -82,6 +83,20 @@ pub async fn upsert(pool: &PgPool, info: UpsertNarInfo<'_>) -> Result<()> {
  .bind(info.project_id)
  .execute(pool)
  .await?;
+
+  if let Some(project_id) = info.project_id {
+    sqlx::query(
+      "INSERT INTO narinfo_cache_projects (store_path, project_id, build_id, \
+       updated_at) VALUES ($1, $2, $3, NOW()) ON CONFLICT (store_path, \
+       project_id) DO UPDATE SET build_id = COALESCE(EXCLUDED.build_id, \
+       narinfo_cache_projects.build_id), updated_at = NOW()",
+    )
+    .bind(info.store_path)
+    .bind(project_id)
+    .bind(info.build_id)
+    .execute(pool)
+    .await?;
+  }


If the first insert succeeds and the second fails, the narinfo_cache row exists without its project association. I feel like this'd silently break narinfo serving for that path until the next build touches it. If I'm correct, wrapping both in a transaction eliminates this partial-write window.

NotAShelf · 2026-06-27T10:41:32Z

+async fn has_signed_persisted_narinfo(
+  pool: &PgPool,
+  store_path: &str,
+  project_id: Option<Uuid>,
+) -> Result<bool, ApiError> {
+  sqlx::query_scalar::<_, bool>(
+    "SELECT EXISTS(SELECT 1 FROM narinfo_cache n WHERE n.store_path = $1 AND \
+     n.sig IS NOT NULL AND btrim(n.sig) != '' AND ($2::uuid IS NULL OR \
+     n.project_id = $2 OR EXISTS (SELECT 1 FROM narinfo_cache_projects ncp \
+     WHERE ncp.store_path = n.store_path AND ncp.project_id = $2)))",
+  )
+  .bind(store_path)
+  .bind(project_id)
+  .fetch_one(pool)
+  .await
+  .map_err(|e| ApiError(circus_common::CiError::Database(e)))
+}


This feels a bit superficial. We're checking btrim(n.sig) != '' rather than verifying the signature cryptographically against the store path fingerprint. "Exploiting" this would require DB write access anyway so it's not particularly a concern right now, but I'd like to consider (now or in a followup PR) validating the signature against the narinfo fingerprint using the server's signing key.

NotAShelf · 2026-06-27T10:43:18Z

+fn narinfo_row_uses_local_nar_route(
+  row: &circus_common::repo::narinfo_cache::NarInfo,
+) -> bool {
+  if row.compression != "none" {
+    return false;
+  }
+  let Some(path) = row.url.strip_prefix("nar/") else {
+    return false;
+  };
+  let Some((object, query)) = path.split_once('?') else {
+    return false;
+  };
+  path::Path::new(object)
+    .extension()
+    .is_some_and(|ext| ext.eq_ignore_ascii_case("nar"))
+    && query.split('&').any(|part| part.starts_with("hash="))
+}


This classifies NAR routes as local purely by URL format (compression == "none", starts with nar/, has ?hash=, ends in .nar). It doesn't validate hash segments are valid nix store hashes, and seems incredibly fragile to me.

NotAShelf · 2026-06-27T10:44:58Z

+fn spawn_succeeded_output_closure_publication(
+  pool: &PgPool,
+  worker_pool: &Arc<WorkerPool>,
+  build_id: Uuid,
+  output_path: &str,
+) {
+  let pool = pool.clone();
+  let worker_pool = Arc::clone(worker_pool);
+  let output_path = output_path.to_owned();
+  tokio::spawn(async move {
+    publish_succeeded_output_closure(
+      &pool,
+      &worker_pool,
+      build_id,
+      &output_path,
+    )
+    .await;
+  });
+}


Panics or runner shutdown silently drop the work because closure publication is spawned as tokio::spawn. The main build dispatch in worker.rs, however, calls persist_closure_narinfos inline (with .await) and thus, we've got a discrepancy. I think it's worth reconciling so maybe consider a JoinSet or tracking completion so panics don't silently drop publication.

amaanq force-pushed the cache-closure-narinfos branch from 04d9936 to 558242d Compare June 24, 2026 20:07

amaanq force-pushed the cache-closure-narinfos branch from 558242d to 9f713ce Compare June 24, 2026 20:39

NotAShelf requested changes Jun 27, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

cache: publish cached closure narinfos#127

cache: publish cached closure narinfos#127
amaanq wants to merge 1 commit into
mainfrom
cache-closure-narinfos

amaanq commented Jun 24, 2026

Uh oh!

NotAShelf left a comment

Uh oh!

NotAShelf Jun 27, 2026

Uh oh!

NotAShelf Jun 27, 2026

Uh oh!

NotAShelf Jun 27, 2026

Uh oh!

NotAShelf Jun 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

amaanq commented Jun 24, 2026

Uh oh!

NotAShelf left a comment

Choose a reason for hiding this comment

Uh oh!

NotAShelf Jun 27, 2026

Choose a reason for hiding this comment

Uh oh!

NotAShelf Jun 27, 2026

Choose a reason for hiding this comment

Uh oh!

NotAShelf Jun 27, 2026

Choose a reason for hiding this comment

Uh oh!

NotAShelf Jun 27, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants