OpenTelemetry metrics for signature verification duration #737

@dahlia

Summary

Add duration metrics for signature verification so operators can tell whether request authentication contributes to inbox latency.

This issue narrows #316 to verification latency. It complements #619, which covers signature verification failure counters but not timing.

Current state

Fedify already creates spans for http_signatures.verify, ld_signatures.verify, and object_integrity_proofs.verify. The inbox span event also records whether HTTP Signatures and Linked Data Signatures were verified, plus HTTP signature failure reasons.

Those spans help debug individual requests. Because traces are usually sampled, they do not provide latency metrics that are independent of sampling, which are needed to answer questions like:

  • What is HTTP Signature verification p95?
  • Are key-fetch failures making verification slower?
  • Is Object Integrity Proof verification a noticeable part of inbox latency?

Proposed solution

Once #619 adds metrics support, add a histogram for signature verification duration.

Proposed instrument:

  • activitypub.signature.verification.duration: histogram, recording verification duration in milliseconds.

Proposed attributes:

  • activitypub.signature.kind: http, linked_data, or object_integrity
  • activitypub.signature.result: verified, rejected, missing, or error
  • http_signatures.failure_reason, only for HTTP Signature failures where the existing reason is available
  • http_signatures.algorithm, ld_signatures.type, or object_integrity_proofs.cryptosuite, only when the value comes from a small known set

Do not include key IDs, actor IDs, request URLs, or object IDs as metric attributes. Those belong on spans, not metrics.
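
A minimal sketch of how these attribute rules could be enforced in one place. The helper buildMetricAttributes(), the VerificationInfo shape, and the allowlist contents are all hypothetical and exist only for illustration; the real allowlist would come from the algorithms Fedify actually supports.

```typescript
type SignatureKind = "http" | "linked_data" | "object_integrity";
type SignatureResult = "verified" | "rejected" | "missing" | "error";

// Placeholder allowlist (assumption): illustrative values only.
const KNOWN_HTTP_ALGORITHMS: ReadonlySet<string> = new Set<string>([
  "rsa-sha256",
  "ed25519",
]);

// Hypothetical input shape; not a Fedify type.
interface VerificationInfo {
  kind: SignatureKind;
  result: SignatureResult;
  failureReason?: string;
  algorithm?: string;
  keyId?: string; // available to spans, never copied into metric attributes
}

function buildMetricAttributes(
  info: VerificationInfo,
): Record<string, string> {
  const attrs: Record<string, string> = {
    "activitypub.signature.kind": info.kind,
    "activitypub.signature.result": info.result,
  };
  if (info.kind === "http") {
    // Failure reason only on HTTP Signature failures where one is known.
    if (info.result !== "verified" && info.failureReason !== undefined) {
      attrs["http_signatures.failure_reason"] = info.failureReason;
    }
    // Unknown algorithm strings are dropped to keep cardinality bounded.
    if (
      info.algorithm !== undefined &&
      KNOWN_HTTP_ALGORITHMS.has(info.algorithm)
    ) {
      attrs["http_signatures.algorithm"] = info.algorithm;
    }
  }
  return attrs;
}
```

Building attributes through a single function like this makes it harder for a key ID or URL to leak into a metric by accident, since the function never reads those fields.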

For HTTP Signatures, measure the full verification path as seen by Fedify, including local key lookup and remote key fetches. This is the latency added to inbox handling. If maintainers later need crypto-only timing, add a separate metric rather than changing this one.
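
As a rough sketch of measuring the full verification path, the wrapper below times whatever the verify callback does, so key lookup and remote key fetches performed inside it are included. The names timeVerification() and DurationHistogram are hypothetical; DurationHistogram only mirrors the record(value, attributes) shape of an OpenTelemetry histogram so the example stays self-contained.

```typescript
// Stand-in for a histogram instrument (assumption): only record() is needed.
interface DurationHistogram {
  record(value: number, attributes?: Record<string, string>): void;
}

// Times one verification attempt end to end and records it in milliseconds.
async function timeVerification<T>(
  histogram: DurationHistogram,
  kind: string,
  verify: () => Promise<T>,
  classify: (outcome: T) => string,
  now: () => number = () => performance.now(),
): Promise<T> {
  const start = now();
  // Stays "error" if verify() or classify() throws.
  let result = "error";
  try {
    const outcome = await verify();
    result = classify(outcome);
    return outcome;
  } finally {
    // Runs on both the success and the thrown-error path.
    histogram.record(now() - start, {
      "activitypub.signature.kind": kind,
      "activitypub.signature.result": result,
    });
  }
}
```

Recording in a finally block means thrown errors still produce an observation, and the injectable clock keeps the wrapper easy to test.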

Scope

  • Instrument HTTP Signature verification through verifyRequestDetailed() or the equivalent internal verification path.
  • Instrument Linked Data Signature verification through verifySignature().
  • Instrument Object Integrity Proof verification through verifyProof().
  • Avoid double-counting wrapper APIs such as verifyRequest(), verifyObject(), or higher-level inbox handling when they call lower-level verification functions.
  • Update docs/manual/opentelemetry.md with metric names, units, and attributes.

Acceptance criteria

  • Signature verification duration metrics are emitted for HTTP Signatures, Linked Data Signatures, and Object Integrity Proofs.
  • Success, rejection, missing-signature, and thrown-error paths are classified consistently. For Linked Data Signatures and Object Integrity Proofs, omit states that cannot occur on that code path.
  • Metrics avoid high-cardinality attributes such as key IDs, actor IDs, and URLs.
  • Tests cover at least one successful verification and one failed verification for HTTP Signatures.
  • Documentation explains whether the HTTP Signature duration includes key fetching.
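
One way to keep classification consistent across the three signature kinds is to route every outcome through a single function. The sketch below is an assumption-laden illustration: classify(), Outcome, and the REACHABLE table are hypothetical, and the table's contents are placeholders that would need to be checked against what verifySignature() and verifyProof() can actually report.

```typescript
type SignatureKind = "http" | "linked_data" | "object_integrity";
type SignatureResult = "verified" | "rejected" | "missing" | "error";

// Placeholder sets (assumption): which results each code path can reach.
const REACHABLE: Record<SignatureKind, ReadonlySet<SignatureResult>> = {
  http: new Set<SignatureResult>(["verified", "rejected", "missing", "error"]),
  linked_data: new Set<SignatureResult>(["verified", "rejected", "error"]),
  object_integrity: new Set<SignatureResult>(["verified", "rejected", "error"]),
};

// Hypothetical normalized outcome of one verification attempt.
interface Outcome {
  signaturePresent: boolean;
  valid: boolean;
  threw: boolean;
}

function classify(kind: SignatureKind, o: Outcome): SignatureResult {
  let result: SignatureResult;
  if (o.threw) result = "error";
  else if (!o.signaturePresent) result = "missing";
  else result = o.valid ? "verified" : "rejected";
  // Surface a state that is declared unreachable for this kind.
  if (!REACHABLE[kind].has(result)) {
    throw new Error(`${kind} verification cannot produce "${result}"`);
  }
  return result;
}
```

Throwing on an unreachable state is meant for tests and development; production instrumentation would more likely log and skip the observation.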

Open questions

  • Should the histogram be a single instrument with activitypub.signature.kind, or separate instruments for HTTP Signatures, Linked Data Signatures, and Object Integrity Proofs?
  • Should key fetch duration be split into a separate metric now, or is the existing activitypub.fetch_key span enough for this milestone?
  • Should missing signatures be recorded as duration observations, or counted only by the failure counter from #619?
