Replace client-side lunr search with worker-backed semantic/RAG search by Copilot · Pull Request #148 · mieweb/docs

Copilot · 2026-04-23T21:27:28Z

Replaces the in-browser lunr full-text index with true semantic search powered by the existing Cloudflare Workers AI + Vectorize RAG pipeline. All existing UI components (the ⌘K modal, the React SearchModal, and the CommandPalette) keep their look and behavior — only the underlying search engine changes.

Architecture

graph LR
    subgraph Frontend
        SearchUI[⌘K Search Modal]
        ChatUI[FloatingAIChat]
    end
    subgraph "Cloudflare Worker (ai-assistant)"
        Search["/search endpoint"]
        Chat["/chat endpoint"]
    end
    subgraph Cloudflare
        AI[Workers AI]
        VEC[Vectorize Index]
    end

    SearchUI -->|POST /api/ai-assistant/search| Search
    ChatUI -->|POST /api/ai-assistant/chat| Chat
    Search -->|embed| AI
    Search -->|query| VEC
    Chat -->|embed + generate| AI
    Chat -->|query| VEC

Both /search and /chat share the same embeddings, Vectorize index and retrieval code — /search is simply the retrieval half of RAG (no LLM call), so it's fast and cheap enough to use for interactive search-as-you-type.

What changed

Worker (`workers/ai-assistant`)

New src/search.ts with semanticSearch() — embeds query, calls Vectorize, filters by brand (metadata or URL-prefix fallback), de-duplicates by URL, truncates snippets on sentence boundaries.
src/index.ts now routes POST /search and GET /search?q=…&brand=…&limit=… with request validation (length cap, limit clamp to 1–25).
src/types.ts adds SearchRequest / SearchResponse / SearchResultItem types and a brand field on vector metadata.
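
The post-processing steps named for semanticSearch() above (URL de-duplication and sentence-boundary snippets) could look roughly like the sketch below. This is not the PR's code: the `Match` shape, the helper names, and the 200-character default are assumptions made for illustration.

```typescript
// Hypothetical shape of a retrieved match; field names loosely follow the response example in this PR.
interface Match {
  id: string;
  url: string;
  text: string;
  score: number;
}

// Keep only the highest-scoring match per URL, returned in descending score order.
function dedupeByUrl(matches: Match[]): Match[] {
  const seen = new Set<string>();
  const out: Match[] = [];
  for (const m of [...matches].sort((a, b) => b.score - a.score)) {
    if (seen.has(m.url)) continue;
    seen.add(m.url);
    out.push(m);
  }
  return out;
}

// Cut text at the last sentence end that fits in maxLen.
// Falls back to a hard cut with an ellipsis when no sentence end is found.
function truncateSnippet(text: string, maxLen = 200): string {
  const clean = text.replace(/\s+/g, " ").trim();
  if (clean.length <= maxLen) return clean;
  const head = clean.slice(0, maxLen);
  const lastEnd = Math.max(
    head.lastIndexOf(". "),
    head.lastIndexOf("! "),
    head.lastIndexOf("? ")
  );
  return lastEnd > 0 ? head.slice(0, lastEnd + 1) : head.trimEnd() + "…";
}
```

Sorting before de-duplicating means the surviving entry for each URL is its best-scoring chunk, which is usually the one whose snippet is most relevant to the query.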

Indexer (`scripts/index-docs.ts`)

Chunks are now tagged with brand metadata and namespaced chunk IDs (eh-… / wc-…) so the worker can filter cleanly and both brands can coexist in the same Vectorize index.
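
One way to produce brand-namespaced chunk IDs is sketched below. The scheme is only inferred from the sample ID in this PR's response example (`eh-eh-features-scheduling-chunk-0` for `/eh/features/scheduling/`); the function name and the slug rule are assumptions, and the indexer may derive IDs differently.

```typescript
type Brand = "eh" | "wc";

// Hypothetical helper: prefix a URL-derived slug with the brand so IDs from
// different brands cannot collide in a shared index.
function chunkId(brand: Brand, url: string, index: number): string {
  const slug = url
    .replace(/^\/+|\/+$/g, "") // drop leading and trailing slashes
    .replace(/[^a-zA-Z0-9]+/g, "-"); // collapse separators into dashes
  return `${brand}-${slug}-chunk-${index}`;
}
```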

Frontend

themes/mieweb-docs/assets/js/main.js (the primary ⌘K search modal) — replaced all lunr index loading/building/searching with debounced fetch calls to the worker, using AbortController to cancel stale requests. Shows a friendly "Search is temporarily unavailable" fallback on error. The same HTML templates, skeletons, keyboard navigation, and ?q= deep-linking behavior are preserved.
src/components/SearchModal.tsx — same swap on the React side; now shows snippets under each result.
src/components/DocumentationApp.tsx (CommandPalette) — same worker-backed fetch with cancellation.
themes/mieweb-docs/layouts/_default/baseof.html — injects window.SearchApiUrl from the new Hugo param and drops the lunr.min.js script tag.
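
The debounce-plus-cancellation pattern described for the frontend can be sketched as follows. This is a minimal illustration rather than the code in main.js: `LatestRequest`, `debounce`, and `search` are hypothetical names, and the endpoint and payload are taken from the request example in this PR.

```typescript
// Tracks the most recent request and aborts any earlier one still in flight.
class LatestRequest {
  private controller: AbortController | null = null;

  // Abort the previous request (if any) and return a fresh signal for the next one.
  next(): AbortSignal {
    this.controller?.abort();
    this.controller = new AbortController();
    return this.controller.signal;
  }

  // Abort the current request without starting a new one (e.g. on modal close).
  cancel(): void {
    this.controller?.abort();
    this.controller = null;
  }
}

// Debounce wrapper: only the last call within delayMs actually runs.
function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  delayMs: number
): (...args: A) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: A) => {
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => fn(...args), delayMs);
  };
}

const latest = new LatestRequest();

// Assumed endpoint and payload; a stale call rejects with an AbortError once a newer one starts.
async function search(query: string, apiUrl = "/api/ai-assistant/search"): Promise<unknown> {
  const signal = latest.next();
  const res = await fetch(apiUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query, brand: "eh", limit: 10 }),
    signal,
  });
  if (!res.ok) throw new Error(`search failed: ${res.status}`);
  return res.json();
}
```

A caller would typically wrap `search` with `debounce` on the input handler and ignore `AbortError` rejections, showing the "temporarily unavailable" fallback only for other failures.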

Config

config-eh.toml + config-wc.toml — new [params.search] block with apiUrl = "/api/ai-assistant/search", and removed the lunr.min.js module mount.
package.json — removed lunr and @types/lunr.
eslint.config.js — removed the lunr browser global.
Removed themes/mieweb-docs/assets/js/vendor/lunr.min.js and the now-empty vendor/ directory.

Docs

workers/README.md — updated title, architecture diagram, API reference, and curl examples for the new /search endpoint.
themes/mieweb-docs/README.md — removed the vendor/lunr.min.js line from the file tree.

API — `POST /search`

Request

{ "query": "schedule an appointment", "brand": "eh", "limit": 10 }

Response

{
  "query": "schedule an appointment",
  "results": [
    {
      "id": "eh-eh-features-scheduling-chunk-0",
      "title": "Scheduling",
      "url": "/eh/features/scheduling/",
      "section": "features",
      "snippet": "Scheduling allows you to manage appointments…",
      "score": 0.87
    }
  ]
}

GET /search?q=…&brand=…&limit=… is also supported.
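
The validation mentioned earlier (length cap, limit clamped to 1–25) applies to both the POST body and the GET query string. A minimal sketch, assuming a 500-character cap and a default limit of 10 (neither value is stated in the PR) and hypothetical helper names:

```typescript
type Brand = "eh" | "wc";

interface SearchRequest {
  query: string;
  brand?: Brand;
  limit: number;
}

interface SearchError {
  error: string;
}

const MAX_QUERY_LENGTH = 500; // assumed cap; the PR does not state the real value
const DEFAULT_LIMIT = 10; // assumed default

// Clamp any user-supplied limit into the 1..25 range described in the PR.
function clampLimit(value: unknown): number {
  if (value === null || value === undefined || value === "") return DEFAULT_LIMIT;
  const n = Number(value);
  if (!Number.isFinite(n)) return DEFAULT_LIMIT;
  return Math.min(25, Math.max(1, Math.trunc(n)));
}

// Validate raw input from either a JSON body or query-string parameters.
function parseSearchParams(raw: {
  query?: unknown;
  brand?: unknown;
  limit?: unknown;
}): SearchRequest | SearchError {
  const query = typeof raw.query === "string" ? raw.query.trim() : "";
  if (!query) return { error: "query is required" };
  if (query.length > MAX_QUERY_LENGTH) return { error: "query is too long" };
  const b = raw.brand;
  const brand: Brand | undefined = b === "eh" || b === "wc" ? b : undefined;
  return { query, brand, limit: clampLimit(raw.limit) };
}

// GET variant: map q / brand / limit query parameters onto the same validator.
function parseGet(url: URL): SearchRequest | SearchError {
  const p = url.searchParams;
  return parseSearchParams({ query: p.get("q"), brand: p.get("brand"), limit: p.get("limit") });
}
```

Routing both verbs through one validator keeps the GET and POST forms from drifting apart.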

Deployment notes

Deploy the updated worker: cd workers/ai-assistant && npx wrangler deploy.
Re-index docs (to pick up the new brand metadata): build the site for each brand (./build.sh eh / ./build.sh wc) then npm run index:docs -- --brand eh and npm run index:docs -- --brand wc.
Legacy vectors without brand metadata continue to work — the worker falls back to filtering by URL prefix (/eh/… or /wc/…) until they're re-indexed.
The Pages binding already routes /api/* to the worker, so /api/ai-assistant/search is reachable with no Cloudflare config changes.

Verification

npm run typecheck — clean
npm run lint — 0 errors (73 pre-existing warnings, same as baseline)
npm run build:components — succeeds, bundle contains no references to lunr
./build.sh eh — succeeds, grep -r lunr public/eh/ returns nothing, window.SearchApiUrl is correctly injected
Worker tsc --noEmit — only 3 pre-existing errors from BaseAiTextEmbeddingsModels / BaseAiTextGenerationModels types (unrelated to this PR)

Agent-Logs-Url: https://github.com/mieweb/docs/sessions/4c6f0ac6-1c0b-4eac-a298-31e33f220d98

cloudflare-workers-and-pages · 2026-04-23T21:30:44Z

Deploying wc-docs with Cloudflare Pages

Latest commit:	`4df709e`
Status:	✅ Deploy successful!
Preview URL:	https://2f79b7a1.wc-docs.pages.dev
Branch Preview URL:	https://copilot-replace-client-side.wc-docs.pages.dev


cloudflare-workers-and-pages · 2026-04-23T21:30:47Z

Deploying eh-docs with Cloudflare Pages

Latest commit:	`4df709e`
Status:	✅ Deploy successful!
Preview URL:	https://41edfe00.eh-docs.pages.dev
Branch Preview URL:	https://copilot-replace-client-side.eh-docs.pages.dev


github-actions · 2026-04-23T21:56:23Z

👋 @Copilot, Your documentation has been pushed to https://docs-qa.med-web.com/148-merge/ for commit 38a7fd6

The PR added POST /search to the worker router but /api/ai-assistant/* is served by Cloudflare Pages Functions in this repo (not the worker). Without a Pages Function at functions/api/ai-assistant/search.ts, requests 404'd and the search modal showed 'temporarily unavailable'. Mirrors the worker's semanticSearch logic (over-sampling, brand filter with URL-prefix fallback, URL de-dup, sentence-boundary snippets) and reuses the existing Pages Functions embeddings helper. Also adds SearchRequest/SearchResponse/SearchResultItem types and a brand field on VectorMetadata.

github-actions · 2026-04-23T22:29:48Z

👋 @wreiske, Your documentation has been pushed to https://docs-qa.med-web.com/148-merge/ for commit ff7043f

Legacy vectors in the docs-embeddings index were created without brand metadata and with brand-agnostic URLs like /features/encounters/... (no /eh/ or /wc/ prefix). The old fallback required url.startsWith('/eh/') which filtered every legacy result out, so /search returned an empty array even though /chat found the same sources fine. Now the fallback only rejects URLs explicitly prefixed for the *other* brand; brand-agnostic URLs are accepted. Explicit brand metadata still wins when present.
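
The corrected fallback rule can be expressed as a small predicate. The sketch below assumes a hypothetical `VectorMeta` shape with an optional `brand` field; it illustrates the rule as described in this commit message, not the actual implementation.

```typescript
type Brand = "eh" | "wc";

// Hypothetical subset of the vector metadata relevant to brand filtering.
interface VectorMeta {
  url: string;
  brand?: Brand;
}

// Explicit brand metadata wins. Without it, reject only URLs that are
// explicitly prefixed for the other brand; brand-agnostic URLs pass.
function matchesBrand(meta: VectorMeta, wanted: Brand): boolean {
  if (meta.brand) return meta.brand === wanted;
  const other: Brand = wanted === "eh" ? "wc" : "eh";
  return !meta.url.startsWith(`/${other}/`);
}
```

With this rule a legacy vector at `/features/encounters/` is returned for either brand, whereas the old `startsWith('/eh/')` check would have dropped it.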

github-actions · 2026-04-23T22:40:34Z

👋 @wreiske, Your documentation has been pushed to https://docs-qa.med-web.com/148-merge/ for commit 79553df

- Index markdown by heading so each vector carries an anchor/heading, and deep-link search results to url#anchor (indexer + Hugo template).
- New POST/GET /api/ai-assistant/search/answer endpoint that runs the retrieval half of RAG + a short LLM call with bracketed [n] citations. Refuses to answer when the docs do not cover the query.
- SearchModal shows an 'Ask AI' CTA (or ⌘↩) above results and renders an AnswerCard with the LLM answer + clickable numbered sources.
- Surfaces section heading in each result (falls back to existing section).
- Typed anchor/heading end-to-end: VectorMetadata, SearchResultItem, and the client SearchResult all carry them; older vectors stay compatible (optional fields).
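
Heading-aware chunking with anchors might look like the sketch below. The name `splitByHeadings` appears later in this PR, but the body here is an assumption; in particular, `slugify` is a simplified stand-in and will not match Hugo's anchor generation in every edge case, and `#` lines inside fenced code blocks are not handled.

```typescript
// A section of a page; the first section (text before any heading) has no heading or anchor.
interface Section {
  heading?: string;
  anchor?: string;
  text: string;
}

// Simplified slug rule (assumption); Hugo's real anchor algorithm differs in edge cases.
function slugify(heading: string): string {
  return heading
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9\s-]/g, "")
    .replace(/\s+/g, "-");
}

// Split markdown into sections at ATX headings (#, ##, ...).
function splitByHeadings(markdown: string): Section[] {
  const sections: Section[] = [];
  let current: Section = { text: "" };
  const flush = (): void => {
    if (current.text.trim() || current.heading) {
      sections.push({ ...current, text: current.text.trim() });
    }
  };
  for (const line of markdown.split("\n")) {
    const m = /^(#{1,6})\s+(.*)$/.exec(line);
    if (m) {
      flush();
      const heading = (m[2] ?? "").trim();
      current = { heading, anchor: slugify(heading), text: "" };
    } else {
      current.text += line + "\n";
    }
  }
  flush();
  return sections;
}

// Deep link for a result: append the anchor only when the indexer emitted one.
function deepLink(url: string, anchor?: string): string {
  return anchor ? `${url}#${anchor}` : url;
}
```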

github-actions · 2026-04-24T15:26:35Z

👋 @wreiske, Your documentation has been pushed to https://docs-qa.med-web.com/148-merge/ for commit 2b002b7

The SearchModal source was updated in 23aa4ed but the committed bundle at themes/mieweb-docs/assets/js/react/components.js was never regenerated, so the deployed site kept rendering the old modal without the Ask AI CTA or AnswerCard.

Previously the committed bundle at themes/mieweb-docs/assets/js/react/components.js could drift from src/components/*.tsx because build.sh only ran hugo. Now every build (and --live preview) regenerates the bundle first, so the deployed site always ships the latest React UI.

github-actions · 2026-04-24T15:58:04Z

👋 @wreiske, Your documentation has been pushed to https://docs-qa.med-web.com/148-merge/ for commit a9c5f33

github-actions · 2026-04-24T15:59:59Z

👋 @wreiske, Your documentation has been pushed to https://docs-qa.med-web.com/148-merge/ for commit a7ff77f

The deployed site renders the HTML+JS search modal (main.js), not the React SearchModal component, so the AI features from 23aa4ed were never visible on the live site. Port the Ask AI UI to the actual path used in production:

- Add Ask AI CTA + answer card to search-modal.html partial
- Wire up POST /api/ai-assistant/search/answer in main.js
- ⌘/Ctrl+Enter inside the modal asks the AI
- Deep-link results to url#anchor when the indexer emitted one
- Reset answer UI on every keystroke / modal close

github-actions · 2026-04-24T16:11:27Z

👋 @wreiske, Your documentation has been pushed to https://docs-qa.med-web.com/148-merge/ for commit 8761a84

github-actions · 2026-04-24T16:31:20Z

👋 @wreiske, Your documentation has been pushed to https://docs-qa.med-web.com/148-merge/ for commit bec542f

- Add shared prompt-guard module with heuristic injection detection, delimiter-wrapped user input, and a hardened system-prompt fragment telling the model to treat user input and doc excerpts as untrusted data and refuse off-topic / jailbreak requests.
- Short-circuit obvious jailbreak attempts ("ignore previous instructions", "you are now …", DAN/dev mode, role-injection tokens, etc.) with the canonical refusal before touching the LLM.
- Sanitize answer-endpoint output: strip responses containing code fences or model-instruction leaks, replacing them with the refusal.
- Applied to both the Cloudflare Pages function endpoints (/api/ai-assistant/search/answer, /api/ai-assistant/chat) and the mirror worker (workers/ai-assistant).
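
A heuristic guard of this kind can be sketched as below. The patterns, the `<user_query>` delimiter, and the refusal text are all illustrative assumptions; the real prompt-guard module is likely broader and is not reproduced here.

```typescript
// Illustrative patterns only, based on the examples named in the commit message.
const INJECTION_PATTERNS: RegExp[] = [
  /ignore (all |any )?(previous|prior|above) instructions/i,
  /you are now\b/i,
  /\b(DAN|dev(eloper)?) mode\b/i,
  /<\|?(system|assistant|im_start)\|?>/i, // role-injection style tokens
];

// Hypothetical canonical refusal text.
const REFUSAL = "I can only answer questions about this documentation.";

// Cheap pre-check that runs before any LLM call.
function looksLikeInjection(input: string): boolean {
  return INJECTION_PATTERNS.some((p) => p.test(input));
}

// Wrap user input in delimiters, removing any delimiter the user tried to inject.
function wrapUserInput(input: string): string {
  const safe = input.replace(/<\/?user_query>/gi, "");
  return `<user_query>\n${safe}\n</user_query>`;
}

// Replace answers that contain a code fence with the refusal.
function sanitizeAnswer(answer: string): string {
  const fence = "`".repeat(3);
  return answer.includes(fence) ? REFUSAL : answer;
}
```

Heuristics like these catch only obvious attempts, which is why the commit pairs them with a hardened system prompt and output sanitization rather than relying on any single layer.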

github-actions · 2026-04-24T16:40:13Z

👋 @wreiske, Your documentation has been pushed to https://docs-qa.med-web.com/148-merge/ for commit 2c811ae

github-actions · 2026-04-24T16:53:51Z

👋 @wreiske, Your documentation has been pushed to https://docs-qa.med-web.com/148-merge/ for commit e475962

Copilot

Pull request overview

This PR replaces the docs site’s client-side Lunr search with a worker/Pages-function backed semantic (Vectorize + Workers AI embeddings) search pipeline, and adds an inline “Ask AI” (RAG answer) option in the ⌘K search UI.

Changes:

Add semantic search (/search) + versioning and inline RAG answer (/search/answer) endpoints, plus prompt-injection hardening.
Update indexer to embed heading-aware chunks with brand + anchor metadata and publish a content-addressed index version to KV.
Remove Lunr assets/dependencies and update frontend search UIs to call the new APIs (with cancellation and richer snippets).
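
A content-addressed index version is typically a hash over the indexed chunks, so the version changes only when the content does. The sketch below assumes a hypothetical `Chunk` shape, SHA-256, and a 16-character prefix; none of these details are confirmed by the PR.

```typescript
import { createHash } from "node:crypto";

// Hypothetical minimal chunk shape for hashing purposes.
interface Chunk {
  id: string;
  text: string;
}

// Hash chunk ids and text in a stable (sorted) order so the version depends
// only on content, not on the order in which chunks were produced.
function computeIndexVersion(chunks: Chunk[]): string {
  const hash = createHash("sha256");
  const sorted = [...chunks].sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));
  for (const c of sorted) {
    hash.update(c.id);
    hash.update("\0"); // separator so ("ab","c") and ("a","bc") hash differently
    hash.update(c.text);
    hash.update("\0");
  }
  return hash.digest("hex").slice(0, 16);
}
```

Clients can pin cached results to this value and discard them when the published version changes, and the indexer can skip re-embedding when the newly computed version equals the stored one.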

Reviewed changes

Copilot reviewed 29 out of 33 changed files in this pull request and generated 12 comments.


File	Description
wrangler.toml	Adds `DOCS_CACHE` KV binding for search cache/version support.
workers/ai-assistant/src/types.ts	Adds search-related types + brand/anchor metadata fields.
workers/ai-assistant/src/search.ts	Implements semantic retrieval + brand filtering + URL de-dupe + snippet building.
workers/ai-assistant/src/rag.ts	Adds prompt-guard rules + injection short-circuit for chat RAG.
workers/ai-assistant/src/prompt-guard.ts	Adds worker-side prompt injection guard (mirrors Pages Functions copy).
workers/ai-assistant/src/index.ts	Routes `/search` (GET/POST) to semantic search handler.
workers/README.md	Documents new search endpoints and usage.
wikigdrive.toml	Removes lunr module mount.
themes/mieweb-docs/layouts/partials/search-modal.html	Adds Ask-AI CTA container + answer card region.
themes/mieweb-docs/layouts/_default/search.json	Adds `rawContent` to support heading/anchor chunking.
themes/mieweb-docs/layouts/_default/home.searchindex.json	Updates comment to reflect Vectorize indexing consumption.
themes/mieweb-docs/layouts/_default/baseof.html	Injects `window.SearchApiUrl`; removes Lunr script include.
themes/mieweb-docs/assets/js/vendor/lunr.min.js	Deletes Lunr vendored bundle.
themes/mieweb-docs/assets/js/main.js	Switches ⌘K modal to semantic search + adds inline Ask-AI.
themes/mieweb-docs/README.md	Updates theme feature list/tree for semantic search.
src/components/SearchModal.tsx	React search modal now calls semantic search + adds inline RAG answer + client caching.
src/components/DocumentationApp.tsx	CommandPalette search swaps to worker-backed search.
scripts/index-docs.ts	Adds brand/anchor metadata, content hashing + KV version publishing, and change-skip logic.
package.json	Removes `lunr` and `@types/lunr`.
package-lock.json	Removes Lunr packages from lockfile.
functions/api/ai-assistant/version.ts	Adds KV-backed index version reader.
functions/api/ai-assistant/types.ts	Adds search + version response types and metadata fields.
functions/api/ai-assistant/search/version.ts	Adds `/search/version` endpoint for client cache pinning.
functions/api/ai-assistant/search/answer.ts	Adds `/search/answer` RAG answer endpoint with injection defenses + caching.
functions/api/ai-assistant/search.ts	Adds semantic `/search` endpoint with version-aware caching headers.
functions/api/ai-assistant/rag.ts	Adds prompt-guard rules + injection short-circuit for chat.
functions/api/ai-assistant/prompt-guard.ts	Adds shared injection heuristics + prompt fragment for LLM calls.
eslint.config.js	Removes `lunr` global.
config-wc.toml / config-eh.toml	Adds `[params.search]` and removes Lunr mount.
build.sh	Rebuilds React bundle during builds; optionally refreshes Vectorize index when env vars are present.

Comments suppressed due to low confidence (1)

src/components/SearchModal.tsx:680

ArrowRight uses group-hover:opacity-100, but the parent button doesn’t have the group class, so the hover style will never activate and the icon will remain opacity-0. Add group to the button className (or remove the group-hover styling) so the indicator behaves as intended.

              <button
                key={result.id}
                data-index={index}
                onClick={() => handleSelect(result)}
                className={cn(
                  "hover:bg-muted w-full px-4 py-3 text-left transition-colors focus:outline-none",
                  index === selectedIndex && "bg-muted"
                )}
              >
                <div className="flex items-center gap-3">
                  <FileText className="text-muted-foreground h-4 w-4 flex-shrink-0" />
                  <div className="min-w-0 flex-1">
                    <div className="text-foreground truncate font-medium">
                      {result.title}
                    </div>
                    {(result.heading || result.section) && (
                      <div className="text-muted-foreground flex items-center gap-1 text-xs">
                        <Hash className="h-3 w-3" />
                        {result.heading || result.section}
                      </div>
                    )}
                    {result.snippet && (
                      <div className="text-muted-foreground mt-1 line-clamp-2 text-xs">
                        {result.snippet}
                      </div>
                    )}
                  </div>
                  <ArrowRight className="text-muted-foreground h-4 w-4 flex-shrink-0 opacity-0 group-hover:opacity-100" />
                </div>

- SearchModal: abort in-flight request on cache-hit and empty-query paths so an earlier fetch can't race in and overwrite newer results.
- SearchModal: reset isLoading on open so a modal closed mid-request no longer shows a stale spinner next time it opens.
- SearchModal: thread brand through fetchIndexVersion so the brand-scoped /version cache is used when available.
- DocumentationApp: wire performSearch into the CommandPalette by subscribing to the context query and debouncing; remove dead 'void performSearch' line.
- DocumentationApp: carry anchor/heading through WorkerSearchResult and navigate to '#anchor' when present (parity with SearchModal / main.js).
- index-docs: correct splitByHeadings docstring (first section has no heading/anchor; caller substitutes a title if it wants one).
- index-docs: reuse newVersion for the KV write instead of recomputing, avoiding duplicate hashing and future drift.
- main.js: normalize SEARCH_API_BASE trailing slash before building ANSWER_API_URL to avoid '.../search//answer'.
- main.js: hide the 'Ask AI' CTA in closeSearchModal so a stale CTA doesn't flash when the modal re-opens empty.
- version.ts: read brand-scoped 'index:version:<brand>' KV key when a brand is provided, falling back to the global key for compat.
- search/version.ts: accept optional ?brand=eh|wc so clients can pin to a brand-specific version.
- search.ts: pass parsed.brand into getIndexVersion.
- prompt-guard: add scripts/check-prompt-guard-sync.ts and CI step to fail the build if the Pages and Workers copies drift on behavior.
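
Two of the fixes above are small enough to sketch directly. Both helpers are hypothetical: `buildAnswerUrl` shows the trailing-slash normalization, and `versionKeys` shows the brand-scoped lookup order, where the global key name `index:version` is an assumption (the commit message names only the brand-scoped form).

```typescript
// Strip trailing slashes before appending the sub-path, so a base of
// ".../search/" does not produce ".../search//answer".
function buildAnswerUrl(searchApiBase: string): string {
  return searchApiBase.replace(/\/+$/, "") + "/answer";
}

// KV keys to try, in order: brand-scoped first, then the global fallback.
function versionKeys(brand?: "eh" | "wc"): string[] {
  const globalKey = "index:version"; // assumed name of the global key
  return brand ? [`index:version:${brand}`, globalKey] : [globalKey];
}
```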

github-actions · 2026-04-24T17:06:27Z

👋 @wreiske, Your documentation has been pushed to https://docs-qa.med-web.com/148-merge/ for commit c0ccdcf

Replace client-side lunr search with worker-backed semantic/RAG search

d90da7d

Agent-Logs-Url: https://github.com/mieweb/docs/sessions/4c6f0ac6-1c0b-4eac-a298-31e33f220d98

Copilot AI assigned Copilot and wreiske Apr 23, 2026

Copilot created this pull request from a session on behalf of wreiske April 23, 2026 21:29

Copilot AI requested a review from wreiske April 23, 2026 21:29

Copilot finished work on behalf of wreiske April 23, 2026 21:29

wreiske added 2 commits April 24, 2026 08:56

Remove remaining lunr references after semantic search migration

21741a5

wreiske marked this pull request as ready for review April 24, 2026 16:49

Merge branch 'master' into copilot/replace-client-side-search

8fad211

wreiske requested a review from Copilot April 24, 2026 16:49

Copilot started reviewing on behalf of wreiske April 24, 2026 16:50

Copilot AI reviewed Apr 24, 2026


wreiske merged commit 6cc2729 into master Apr 24, 2026
5 checks passed

wreiske deleted the copilot/replace-client-side-search branch April 24, 2026 17:12
