feat: basic graph definitions by BasilRohner · Pull Request #503 · leanprover/cslib

BasilRohner · 2026-04-19T20:41:30Z

This PR introduces basic graph definitions: Graph, SimpleGraph, DiGraph, and SimpleDiGraph, along with coercions and notation that form the basis of algorithmic graph theory in CSLib.

Design. We intentionally diverge from Mathlib's graph definitions, prioritizing representations that support algorithmic reasoning. In graph algorithm design, it is common to manipulate graphs in many ways (e.g., dynamic graph operations such as adding or removing nodes/edges, contracting edges, and contracting vertex sets). As a result, we prioritize a minimal, simple definition of graphs that best aligns with how they are defined in the literature on algorithmic graph theory. In particular, we use set-based definitions of both the vertex and edge sets, along with minimal additional attributes, to reduce early proof obligations and select those whose proofs are closer to their textbook counterparts.

General structures. Here we use Set everywhere, allowing statements about possibly infinite graphs uniformly without enforcing finiteness through extra axioms. We associate labels with edges for the more general multigraph notions (Graph, DiGraph) via the auxiliary Edge and DiEdge records, and abstract the label away for the more restricted notions (SimpleGraph, SimpleDiGraph), which use Sym2 α and α × α directly. The definitions have been tested in practice by further formalizing concepts such as walks (see the GraphAlgorithms Walk module). However, to keep this PR concise, we defer discussion of the Walk design to a later PR.

Comparison with PR #427. PR #427 introduced SimpleGraph with Finset-based vertex and edge sets and edges via Sym2. Here, we use Set throughout, treating finiteness as an orthogonal concern rather than baking it into the type, and introduce a hierarchy comprising both general multigraph structures (Graph, DiGraph) with edge labels and specialized structures (SimpleGraph, SimpleDiGraph) without labels, since multi-edges are disallowed there.

Main definitions.

Graph: (possibly infinite) undirected multigraph
SimpleGraph: (possibly infinite) undirected graph without loops or multi-edges
DiGraph: (possibly infinite) directed graph
SimpleDiGraph: (possibly infinite) directed graph without loops or multi-edges

Future work. We plan to add fundamental graph algorithms building on these definitions.

Co-authored-by: Sorrachai Yingchareonthawornchai sorrachai.cp@gmail.com

merge

ctchou · 2026-04-19T22:07:35Z

I have some questions/comments:
(1) Does the Graph in this PR differ in a substantive way from the Graph in Mathlib.Combinatorics.Graph.Basic? It seems to me that they are basically equivalent to each other. If my understanding is correct, then can't you just define Graph.endpoints and Graph.incidence on mathlib's Graph, if they are the API you prefer?
(2) A Set that is Set.Finite can be converted into a Finset using:
https://leanprover-community.github.io/mathlib4_docs/Mathlib/Data/Set/Finite/Basic.html#Set.Finite.toFinset
I know that this API is "noncomputable". But once you obtain the Finset versions of vertexSet and edgeSet, you can operate on them in a completely "computable" manner. By having both Set and Finset versions of basically the same definition, I'm afraid that you will have to repeat the statements and proofs of many theorems for the two versions.
(3) It seems to me that you can define DiGraph by extending Quivers in Mathlib.Combinatorics.Quiver.Basic, which is so general that I'm pretty sure it can accommodate any notion of "digraph" you have.

Shreyas4991 · 2026-04-19T23:48:01Z

Discussion thread on this PR :
Leanprover Zulip thread link

sorrachai · 2026-04-30T11:10:17Z

+  /-- The endpoint map, sending each edge to its unordered pair of endpoints. -/
+  endpoints : ε → Sym2 α
+  /-- Every endpoint of an edge is a vertex. -/
+  incidence : ∀ e ∈ edgeSet, ∀ v ∈ endpoints e, v ∈ vertexSet


This may cause a junk value problem on endpoints where an endpoint outside the set must be specified.

To avoid the junk value problem, we can define an edge object separately and define the graph directly over the edgeSet. Note that we use endpoints_id as the identifier for the multiplicity of an edge of the same endpoints

structure UndirectedEdge (α β : Type*) where endpoints_id : β endpoints : Sym2 α deriving DecidableEq abbrev Edge := UndirectedEdge structure UndirectedGraph (α β : Type*) where vertexSet : Set α edgeSet : Set (Edge α β) incidence : ∀ e ∈ edgeSet, ∀ v ∈ e.endpoints, v ∈ vertexSet

Thank you for pointing this out and for the discussion in the Zulip thread. I've updated the definitions in line with your proposed solution to avoid this issue.

eric-wieser · 2026-04-30T18:29:43Z

+/-- Typeclass for graph-like structures that have a vertex set. -/
+class HasVertexSet (G : Type*) (V : outParam Type*) where
+  /-- The vertex set of the graph. -/
+  vertexSet : G → V
+
+/-- Typeclass for graph-like structures that have an edge set. -/
+class HasEdgeSet (G : Type*) (E : outParam Type*) where
+  /-- The edge set of the graph. -/
+  edgeSet : G → E


If you make these the canonical spelling then you need to restate all the loopless and incidence theorems in terms of them (and probably rename those fields to be loopless' to avoid stealing the preferred name)

Thanks for pointing this out. I've restated the loopless and incidence theorems in terms of the new typeclasses and renamed the original fields to loopless' to free up the preferred name.

eric-wieser · 2026-04-30T18:31:11Z

Design. We intentionally diverge from Mathlib's graph definitions, ...

Please record these design decisions in the module docstring for future readers, not just the git history / PR title.

sorrachai · 2026-05-04T20:17:52Z

+@[grind]
+structure SimpleGraph (α : Type*) where
+  /-- The finite set of vertices. -/
+  vertexSet : Finset α


Inconsistency on the set definitions: SimpleGraph is defined using Finset, whereas Graph is defined using Set.

Thank you for pointing this out. This design decision was intentional, with the goal of arriving at a definition of SimpleGraph that serves as a canonical graph definition encompassing the most common usage. That said, I agree this introduces an inconsistency, and I've updated it to Set α. Finiteness can be supplied as a separate hypothesis.

sorrachai · 2026-05-04T20:18:09Z

+pairs of distinct vertices. -/
+structure SimpleDiGraph (α : Type*) where
+  /-- The finite set of vertices. -/
+  vertexSet : Finset α


Finset vs. Set

sorrachai · 2026-05-04T20:21:16Z

+  /-- The set of vertices. -/
+  vertexSet : Set α
+  /-- The set of edges. -/
+  edgeSet : Set ε


Avoidable junk value problem using the suggested definitions of Arc (similarly to Edge).

eric-wieser · 2026-05-05T17:32:30Z

-
-/-- A finite simple graph on `α`: finite vertex and edge sets, edges as unordered pairs of
-distinct vertices. -/
+  edgeSet : Set (Edge α β)


Another option here would be Sym2 a -> Set b, which gives the set of edge labels for a given pair of vertices, and the empty set for disconnected nodes

Thanks for the suggestion! We'd prefer to keep the set-based definition since it's closer to the informal G = (V, E) presentation, whereas your version is closer to the relational definition of Mathlib.Combinatorics.Graph.

sorrachai

I am happy with the current definitions now. After a long discussion in the Zulip chat, the design is not bad, but it is similar to Mathlib's definitions of simple undirected graphs, whose development of set-based definitions is still in its early stages. We are happy to collaborate as the library grows.

Aside from the fragmentation of definitions in mathlib, one key principle of cslib is efficient computation, and we may redesign the definitions in the future to support efficient computation and reasoning about complexity. We prefer to have all graph definitions (undirected/directed and simple/non-simple) in one place, and to build other objects on top of the proposed graph definitions with efficient computation in mind.

P.S. This PR was a product of the Lean hackathon event at ETH Zurich. More basic primitives for graph algorithms will follow in subsequent PRs.

chenson2018

I see that @sorrachai has approved this, but I really do not feel that the Zulip thread has converged on accepting this PR, unless I am missing some other discussion in the graph theory channel.

This is decidedly not my area of expertise, but I am very worried about such a large deviation from very active development in Mathlib. I'd rather take this slowly and have a definite consensus or lack thereof rather than merge something too quickly.

sorrachai · 2026-05-06T08:50:49Z

I see that @sorrachai has approved this, but I really do not feel that the Zulip thread has converged on accepting this PR, unless I am missing some other discussion in the graph theory channel.

This is decidedly not my area of expertise, but I am very worried about such a large deviation from very active development in Mathlib. I'd rather take this slowly and have a definite consensus or lack thereof rather than merge something too quickly.

I appreciate your concern. I would like to understand what happens if there is no consensus on this.

Shreyas4991 · 2026-05-06T10:11:20Z

I disagree that these definitions are computationally efficient, far from it. An efficient defintion would use lean arrays to represent neighbouring sets, better yet Vectors. This definition is merely a representational change which can be obtained by constructors on mathlib definitions.

BasilRohner · 2026-05-06T16:55:20Z

Summary. This pulls together the points we've made in the Zulip thread and earlier reviews. The design has two main ideas:

Embedded-set structure. Vertex and edge sets are represented as vertexSet : Set α and edgeSet : Set β, and not using a type. A prime example of what this buys: subgraphs and graph operations stay inside the same type. A subgraph of G : Graph α β is itself a Graph α β, related by ≤. Induced subgraph, edge and vertex deletion, and contraction are all maps Graph α β → Graph α β. No separate Subgraph G type, no coe/spanningCoe pair to choose between, no parallel lattice and API to keep in sync. The Mathlib workaround (SimpleGraph.Subgraph with its two coercions to SimpleGraph ↥verts and SimpleGraph V) has been a known pain point for years.
Consistency across the family. All four graph types (Graph, SimpleGraph, DiGraph, SimpleDiGraph) use the same setup: embedded set-based vertex and edge sets, shared HasVertexSet/HasEdgeSet typeclasses, shared V(G)/E(G) notation, and forgetful maps (SimpleGraph.toGraph, SimpleDiGraph.toDiGraph) that carry properties from the general types down to the specialised ones. Walk machinery is a prime example: VertexSeq, IsWalk, loopErase, takeUntil, dropUntil, rerootCycle get defined once and reused across all four. Mathlib's SimpleGraph.Walk is indexed by a specific SimpleGraph V and doesn't transfer to Graph α β. Mathlib's graph ecosystem doesn't have this uniformity yet: Mathlib.Combinatorics.Graph.Basic (May 2025) uses embedded sets for multigraphs, while the older SimpleGraph V stays type-as-vertex-set, with no canonical map between them.

On compatibility with Mathlib. From what I understand, the concern raised so far is mostly not the design itself but the divergence from Mathlib's current API. We'd like eventual alignment too. Mathlib is already moving this way (Graph α β and a recent PR to add vertexSet to Digraph made by @Shreyas4991 are already steps in that direction), but extending the same approach to SimpleGraph V, for instance by adding a vertexSet, field may require serious refactoring. A lot of API rests on the type-as-vertex-set design, and every change has to be weighed against it. The new Graph α β is also still being filled in and refined for the same reason.

Introducing the definitions prevents stalling and enables active development of graph theory on the CSLib side without waiting for Mathlib's API to adapt. We hope Mathlib will push its graph definitions in this direction over time, though that will mean significant adaptation of existing APIs given the accumulated technical debt. If it does, alignment should be straightforward. If it doesn't, we still think the embedded-set design is much better suited to algorithmic graph theory. Automata in CSLib are a close precedent: developed in parallel with Mathlib's, the two have coexisted productively, each handling the use cases best suited to its scope.

BasilRohner and others added 11 commits April 18, 2026 18:41

feat: define notions of graphs, walks, cycles and paths

5b021d4

updates

a4e1312

only graph

e7d08db

fixed

d12c198

changes

892dae8

removed unfinished code

48ac00d

Delete flake.lock

48e7aea

Delete flake.nix

9449caf

removed hypergraphs

dafe8b4

Merge branch 'main' of github.com:BasilRohner/cslib

c6d777e

merge

fixes

ba8b1b2

BasilRohner requested review from chenson2018 and fmontesi as code owners April 19, 2026 20:41

sorrachai reviewed Apr 30, 2026

View reviewed changes

eric-wieser reviewed Apr 30, 2026

View reviewed changes

sorrachai reviewed May 4, 2026

View reviewed changes

address review feedback

763858b

eric-wieser reviewed May 5, 2026

View reviewed changes

sorrachai approved these changes May 6, 2026

View reviewed changes

sorrachai assigned arademaker May 6, 2026

chenson2018 requested changes May 6, 2026

View reviewed changes

Conversation

BasilRohner commented Apr 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ctchou commented Apr 19, 2026

Uh oh!

Shreyas4991 commented Apr 19, 2026

Uh oh!

sorrachai Apr 30, 2026

Choose a reason for hiding this comment

Uh oh!

BasilRohner May 5, 2026

Choose a reason for hiding this comment

Uh oh!

eric-wieser Apr 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

BasilRohner May 5, 2026

Choose a reason for hiding this comment

Uh oh!

eric-wieser commented Apr 30, 2026

Uh oh!

sorrachai May 4, 2026

Choose a reason for hiding this comment

Uh oh!

BasilRohner May 5, 2026

Choose a reason for hiding this comment

Uh oh!

sorrachai May 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

sorrachai May 4, 2026

Choose a reason for hiding this comment

Uh oh!

eric-wieser May 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

BasilRohner May 5, 2026

Choose a reason for hiding this comment

Uh oh!

sorrachai left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

chenson2018 left a comment

Choose a reason for hiding this comment

Uh oh!

sorrachai commented May 6, 2026

Uh oh!

Shreyas4991 commented May 6, 2026

Uh oh!

BasilRohner commented May 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

BasilRohner commented Apr 19, 2026 •

edited

Loading

eric-wieser Apr 30, 2026 •

edited

Loading

sorrachai May 4, 2026 •

edited

Loading

eric-wieser May 5, 2026 •

edited

Loading

sorrachai left a comment •

edited

Loading

BasilRohner commented May 6, 2026 •

edited

Loading