Billing simplification — DX-focused refactor (umbrella tracker) #1540

@whoAbhishekSah

Goal

Make Frontier's Stripe billing easy to read, modify, and integrate with — for new contributors, existing devs, and microservice / SDK integrators. This is a developer-experience pass, not a feature-completeness pass.

Why now

A new dev currently has to internalise 17 domain types, ~38 billing handler RPCs, two checkout codepaths that produce subscriptions through different logic, Stripe SubscriptionSchedule phase arithmetic, plan inference from Stripe subscription items, and a tangled admin-vs-tenant surface — before they can confidently modify anything. Service-layer unit-test coverage for the four largest billing services (subscription, customer, invoice, checkout) is zero today because Stripe is tightly coupled.

What changes (high-level)

  • Provider interface. All Stripe calls move behind billing/provider.Provider. A NoopProvider ships so contributors can run billing locally without a Stripe key, signing secret, or webhook tunnel.
  • Single subscription activation path. CreateCheckout and DelegatedCheckout collapse to one internal SubscriptionService.Activate(); checkout becomes a payment-collection method, not a subscription-creation method.
  • Local plan_id authoritative. Subscription stops inferring its plan by intersecting Stripe subscription-item product IDs (the most fragile code in the system).
  • Flat subscription state. Stripe SubscriptionSchedule arithmetic moves behind the Provider; service layer deals in CurrentPlanID / NextPlanID / ChangeEffectiveAt.
  • Admin vs tenant taxonomy. Six catalogue-mutation RPCs that live on FrontierService but check IsSuperUser move to AdminService. Every billing RPC gets an @auth proto comment.
  • RPC consolidation. ~8–10 duplicate RPCs deprecated, then removed.
  • Data-modelling fixes. Explicit provider_registered_at state replaces the ProviderID == "" sentinel; billing_prices becomes append-only (immutable prices); pending_invoice_items flush queue prevents lost line items on Stripe outages.
  • Docs. Stripe ownership diagram, integration cookbook, subscription state machine, authorization table.
  • Service-layer tests. ≥ 60 % coverage for the four big services, enabled by NoopProvider.

Tasks (23 total)

Independent — startable now:

  • T3. Magic strings → typed constants ("frontier", seat behaviours, intervals, metadata keys)
  • T4. Fix four known billing TODOs + emit audit records on subscription mutations
  • T5. Authorization audit — document required role per RPC; add @auth proto comments
  • T6. Stripe ownership boundary doc — what Frontier owns vs Stripe vs dual
  • T7a. Move handler business logic to services (non-checkout pieces); dedupe duplicate transforms
  • T8. `billing/provider` interface — Phase 1: `CustomerProvider` extraction

Startable after one open-question check:

  • T1. RPC inventory + deprecation annotations (verify `CreatePlan` upsert behaviour first)
  • T2. Stop `PlanID` inference from Stripe items (audit groups-deprecation overlap first)

Provider-interface chain:

  • T9. `billing/provider` interface — Phase 2: Subscription / Invoice / Checkout providers
  • T10. `NoopProvider` + `billing.provider=noop` config
  • T11. Unify `CreateCheckout` + `DelegatedCheckout` into one activation path
  • T12. Flatten subscription state machine (CurrentPlanID / NextPlanID / ChangeEffectiveAt)
  • T13. Break up `checkout.Create()` (282 lines) and `checkout.Apply()` (187 lines)
  • T14. Break up `ChangePlan()` (182 lines) / `UpdateProductQuantity()` (134 lines); dedupe `shouldChange*Quantity`
  • T18. Service-layer unit tests via NoopProvider (≥ 60 % coverage)
  • T20. State-machine + checkout-flow + reconciliation diagrams
  • T21. Explicit `provider_registered_at` state for Customer / Subscription
  • T22. Immutable Prices — treat updates as new rows
  • T23. `pending_invoice_items` flush queue (durable line-item enqueue)

Gated behind RPC inventory or other deps:

  • T7b. Move plan-change conflict validation from handler to service (after T11)
  • T15. Merge `Usage` into `Credit.Transaction` (decision-gated: usage strategic?)
  • T16. Remove deprecated RPCs (after T1 + one SDK release cycle)
  • T17. Admin vs Non-Admin taxonomy — rename / split service; move misclassified RPCs
  • T19. Integration cookbook — six common jobs, Go + JS snippets

Quantified targets

| Artefact | Today | Target |
|---|---|---|
| `billing/subscription/service.go` | 1,194 lines | < 600 |
| `billing/checkout/service.go` | 1,066 lines | < 600 |
| Service files importing `*stripe.Client` | 5 | 0 |
| Stripe-schedule helpers in service layer | 6 | 0 |
| `"frontier"` magic strings in billing service code | 17 | 0 |
| Service-layer unit-test coverage (4 services) | 0 % | ≥ 60 % |
| `FrontierService` billing RPCs checking `IsSuperUser` | 6 | 0 |
| Handler RPCs in `billing_*.go` | ~38 | ≤ 30 |

Out of scope (deferred to follow-up milestones)

Materialised entitlements (perf), outbound billing webhooks (after `core/webhook/` spike), explicit entitlement model, usage event pipeline, billing event-sourcing, currency on Transaction, credit expiry, dunning, MRR / churn reporting, inbound webhook DLQ, swapping Stripe for another production provider.

Open questions to resolve

  1. Does `CreatePlan` actually upsert safely today? (gates T1)
  2. Does any billing code touch `core/group/` in a way that breaks on group deprecation? (gates T2 / T12)
  3. Backwards-compat policy: rename RPCs through deprecation, or alias forever? (sizes T17)
  4. Is usage-based billing strategic in the next two quarters? (gates T15)
  5. Is multi-currency on the roadmap in the next two quarters? (currently out of scope)
  6. `AdminService/RevertBillingUsage` uses `IsAuthorized(Platform, …)` instead of `IsSuperUser` like sibling admin RPCs — intentional or oversight? (resolved as part of T17)
