Skip to content

feat: pagination in services and types#61

Open
AnonymousXC wants to merge 7 commits into
elixir-cloud-aai:devfrom
AnonymousXC:pagination
Open

feat: pagination in services and types#61
AnonymousXC wants to merge 7 commits into
elixir-cloud-aai:devfrom
AnonymousXC:pagination

Conversation

@AnonymousXC

@AnonymousXC AnonymousXC commented Jun 3, 2026

Copy link
Copy Markdown

Description

Adds backward compatible pagination to the services and service/types endpoint.

Fixes #58 (issue)

Type of change

New feature (non-breaking change which adds functionality)

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have not reduced the existing code coverage
  • I have added docstrings following the Python style guidelines of this project to all new modules, classes, methods and functions are documented with docstrings following; I have updated any previously existing docstrings, if applicable
  • I have updated any sections of the app's documentation that are affected by the proposed changes, if applicable

Summary by Sourcery

Add optional pagination support to the services and service types endpoints while preserving the existing non-paginated behaviour.

New Features:

  • Introduce query-based pagination for the /services endpoint returning paginated service lists when page parameters are supplied.
  • Introduce query-based pagination for the /service/types endpoint returning paginated service type lists when page parameters are supplied.

Enhancements:

  • Extend the OpenAPI specification to describe optional pagination query parameters and paginated response schemas for services and service types.

@sourcery-ai

sourcery-ai Bot commented Jun 3, 2026

Copy link
Copy Markdown

Reviewer's Guide

Adds optional, backward-compatible pagination support to the GA4GH registry services and service types endpoints, and updates the OpenAPI specification to describe the new query parameters and paginated response schemas.

Sequence diagram for optional pagination in getServices endpoint

sequenceDiagram
    actor Client
    participant FlaskServer
    participant getServices
    participant ServicesCollection

    Client->>FlaskServer: GET /services?page&page_size
    FlaskServer->>getServices: getServices

    alt [page and page_size missing]
        getServices->>ServicesCollection: find(filter, projection)
        ServicesCollection-->>getServices: cursor(all services)
        getServices-->>FlaskServer: list(services)
    else [page or page_size provided]
        getServices->>ServicesCollection: count_documents({})
        ServicesCollection-->>getServices: total_count
        getServices->>ServicesCollection: find(filter, projection)
        ServicesCollection-->>getServices: cursor
        getServices->>ServicesCollection: cursor.skip(skip_items).limit(page_size)
        ServicesCollection-->>getServices: paginated_services
        getServices-->>FlaskServer: {results, pagination}
    end

    FlaskServer-->>Client: 200 OK
Loading

Sequence diagram for optional pagination in getServiceTypes endpoint

sequenceDiagram
    actor Client
    participant FlaskServer
    participant getServiceTypes
    participant getServices

    Client->>FlaskServer: GET /service/types?page&page_size
    FlaskServer->>getServiceTypes: getServiceTypes
    getServiceTypes->>getServices: getServices.__wrapped__
    getServices-->>getServiceTypes: services or {results, pagination}
    getServiceTypes->>getServiceTypes: build uniq_types

    alt [page and page_size missing]
        getServiceTypes-->>FlaskServer: uniq_types
    else [page or page_size provided]
        getServiceTypes->>getServiceTypes: paginate uniq_types
        getServiceTypes-->>FlaskServer: {results, pagination}
    end

    FlaskServer-->>Client: 200 OK
Loading

Entity relationship diagram for new pagination response schemas

erDiagram
    PaginationMetadata {
        int page
        int page_size
        int total
    }

    PaginatedExternalService {
    }
    PaginatedServiceType {
    }
    ExternalService {
    }
    ServiceType {
    }

    PaginatedExternalService ||--o{ ExternalService : results
    PaginatedExternalService ||--|| PaginationMetadata : pagination

    PaginatedServiceType ||--o{ ServiceType : results
    PaginatedServiceType ||--|| PaginationMetadata : pagination
Loading

File-Level Changes

Change Details Files
Implement optional pagination for the /services endpoint while preserving the original non-paginated behavior.
  • Read optional page and page_size query parameters from the request, defaulting to page 1 and page_size 10 when pagination is requested but values are missing.
  • When no pagination parameters are provided, return the original full list of services as a plain array for backward compatibility.
  • When pagination is used, compute total document count and page bounds, validate the requested page, and apply MongoDB skip and limit to fetch a page of services.
  • Change the response shape for paginated calls to return a results array plus a pagination object containing page, page_size, and total.
  • Raise BadRequest when the requested page is out of valid range.
cloud_registry/ga4gh/registry/server.py
Implement optional pagination for the /service/types endpoint, compatible with both paginated and non-paginated /services behavior.
  • Call the underlying getServices implementation and normalize its return value so that service lists can be derived from either array or paginated dict responses.
  • Compute distinct service types from the resolved services list as before, then optionally paginate the resulting type list based on page and page_size query parameters.
  • When no pagination parameters are provided, return the original list of unique service types; when provided, return a results plus pagination response structure.
  • Validate page bounds for service types and raise BadRequest for invalid page requests.
cloud_registry/ga4gh/registry/server.py
Extend the OpenAPI specification to document pagination parameters and paginated response types for services and service types.
  • Add optional integer page and page_size query parameters to the /services endpoint definition.
  • Update /services and /service/types 200 responses to use a oneOf schema that allows either the original array response or a new paginated object response.
  • Introduce a reusable PaginationMetadata schema describing page, page_size, and total fields.
  • Define PaginatedExternalService and PaginatedServiceType schemas that wrap existing item schemas in a results array plus pagination metadata.
cloud_registry/api/20201108.e0358db.openapi.yaml

Assessment against linked issues

Issue Objective Addressed Explanation
#58 Implement backward-compatible pagination for GET /services, accepting optional page and page_size query parameters, returning the existing bare array when no pagination params are provided, and returning an envelope { results: [...], pagination: { page, page_size, total, ... } } with 400 Bad Request when the requested page exceeds available pages.
#58 Implement backward-compatible pagination for GET /services/types, accepting optional page and page_size query parameters, returning the existing bare array when no pagination params are provided, and returning an envelope { results: [...], pagination: { page, page_size, total, ... } } with 400 Bad Request when the requested page exceeds available pages.
#58 Update the API specification to document the new pagination behavior for /services and /services/types, including query parameters page and page_size and the paginated response schema.

Possibly linked issues


Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@sourcery-ai sourcery-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 4 issues, and left some high level feedback:

  • In getServices, the total_pages calculation causes a BadRequest when total_count == 0 (e.g., page=1 and no records yield total_pages == 0), which is surprising; consider treating empty collections as a valid first page and returning an empty results list with appropriate pagination metadata instead of erroring.
  • Both getServices and getServiceTypes now contain nearly identical pagination logic (defaults, bounds checking, skip/limit or slicing); it would be cleaner and less error-prone to extract this into a shared helper that can be reused across endpoints.
  • The mixed return type of getServices (raw list vs. object with results/pagination) forces callers like getServiceTypes to use isinstance checks and __wrapped__ workarounds; consider normalizing the return shape (e.g., always a paginated object with sensible defaults) or providing a separate internal helper that always returns a list.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `getServices`, the `total_pages` calculation causes a `BadRequest` when `total_count == 0` (e.g., page=1 and no records yield `total_pages == 0`), which is surprising; consider treating empty collections as a valid first page and returning an empty `results` list with appropriate pagination metadata instead of erroring.
- Both `getServices` and `getServiceTypes` now contain nearly identical pagination logic (defaults, bounds checking, skip/limit or slicing); it would be cleaner and less error-prone to extract this into a shared helper that can be reused across endpoints.
- The mixed return type of `getServices` (raw list vs. object with `results`/`pagination`) forces callers like `getServiceTypes` to use `isinstance` checks and `__wrapped__` workarounds; consider normalizing the return shape (e.g., always a paginated object with sensible defaults) or providing a separate internal helper that always returns a list.

## Individual Comments

### Comment 1
<location path="cloud_registry/ga4gh/registry/server.py" line_range="45-48" />
<code_context>
+    # Return paginated response
+    page = page or 1
+    page_size = page_size or 10
+    total_count = db_collection_service.count_documents({})
+    total_pages = floor(total_count / page_size) + (1 if total_count % page_size > 0 else 0)
+
+    if page < 1 or page > total_pages:
+        raise BadRequest
+    
</code_context>
<issue_to_address>
**issue:** Requesting page 1 on an empty collection results in BadRequest instead of an empty paginated response.

Because `total_count == 0` yields `total_pages = 0`, any `page >= 1` satisfies `page > total_pages` and raises `BadRequest`. This makes `/services?page=1` on an empty store an error instead of an empty `results` with pagination metadata, and it’s inconsistent with `getServiceTypes`, where `page > total_pages` is only invalid when `total_count > 0`. Please adjust the condition so that when `total_count == 0`, `page=1` (and possibly higher pages) returns an empty `results` instead of an error.
</issue_to_address>

### Comment 2
<location path="cloud_registry/ga4gh/registry/server.py" line_range="43-46" />
<code_context>
+        return list(records)
+    
+    # Return paginated response
+    page = page or 1
+    page_size = page_size or 10
+    total_count = db_collection_service.count_documents({})
+    total_pages = floor(total_count / page_size) + (1 if total_count % page_size > 0 else 0)
+
+    if page < 1 or page > total_pages:
</code_context>
<issue_to_address>
**issue (bug_risk):** page_size is not validated and can be zero or negative, leading to runtime errors and invalid skip/limit values.

If a client sends `page_size=0` or a negative value, this will raise `ZeroDivisionError` when computing `total_pages` and also result in invalid `.limit(page_size)` and negative skip values. The same applies to the `getServiceTypes` pagination. Please enforce `page_size >= 1` (and optionally a maximum) and reject invalid values with a `BadRequest`.
</issue_to_address>

### Comment 3
<location path="cloud_registry/ga4gh/registry/server.py" line_range="101-102" />
<code_context>
+    page = request.args.get("page", type=int)
+    page_size = request.args.get("page_size", type=int)
+    
     services = getServices.__wrapped__()
+    if isinstance(services, dict):
+        services = services["results"]
     types = [s["type"] for s in services]
</code_context>
<issue_to_address>
**issue (bug_risk):** Pagination parameters on /service-types currently affect the underlying /services call, which restricts the set of services considered when computing distinct service types.

Because `page` and `page_size` are still present in `request.args` when calling `getServices.__wrapped__()`, `getServices` paginates the services before `getServiceTypes` builds the type list from `services["results"]`, and then you paginate the types again. `getServiceTypes` should derive distinct types from the full, unpaginated service set, and only then apply pagination to the types. Consider calling a helper that always returns the complete service list or otherwise disabling pagination inside `getServices` for this endpoint.
</issue_to_address>

### Comment 4
<location path="cloud_registry/api/20201108.e0358db.openapi.yaml" line_range="24-35" />
<code_context>
+          schema:
+            type: integer
+            example: 1
+        - name: page_size
+          in: query
+          required: false
+          schema:
+            type: integer
+            example: 10
       responses:
         '200':
</code_context>
<issue_to_address>
**suggestion (bug_risk):** The OpenAPI schema for pagination parameters does not constrain page/page_size to positive integers, which conflicts with server-side expectations.

Backend code requires `page` and `page_size` to be positive, non-zero values, but the OpenAPI spec only marks them as generic integers. Please add `minimum: 1` (and optionally an upper `maximum`) to both schemas so that validation and generated clients correctly enforce the backend constraints.

```suggestion
        - name: page
          in: query
          required: false
          schema:
            type: integer
            minimum: 1
            example: 1
        - name: page_size
          in: query
          required: false
          schema:
            type: integer
            minimum: 1
            example: 10
```
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread cloud_registry/ga4gh/registry/server.py Outdated
Comment thread cloud_registry/ga4gh/registry/server.py Outdated
Comment thread cloud_registry/ga4gh/registry/server.py Outdated
Comment on lines +101 to +102
services = getServices.__wrapped__()
if isinstance(services, dict):

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): Pagination parameters on /service-types currently affect the underlying /services call, which restricts the set of services considered when computing distinct service types.

Because page and page_size are still present in request.args when calling getServices.__wrapped__(), getServices paginates the services before getServiceTypes builds the type list from services["results"], and then you paginate the types again. getServiceTypes should derive distinct types from the full, unpaginated service set, and only then apply pagination to the types. Consider calling a helper that always returns the complete service list or otherwise disabling pagination inside getServices for this endpoint.

Comment thread cloud_registry/api/20201108.e0358db.openapi.yaml
@AnonymousXC AnonymousXC requested a review from uniqueg June 5, 2026 17:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add pagination support to Services and Service Types responses

1 participant