
feat: add learn export-kg and NormalizedTerm action metadata (Refs #759, #735)#834

Closed
AlexMikhalev wants to merge 1705 commits into main from task/759-export-corrections-markdown

Conversation

@AlexMikhalev
Contributor

Summary

Two features in one PR:

feat(agent): learn export-kg command (Issue #759)

  • New learn export-kg CLI subcommand that reads captured CorrectionEvent markdown files
  • Groups compatible corrections by their corrected value
  • Emits Logseq-style KG markdown artefacts per unique corrected term
  • Supports --output flag and --correction-type filter
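The grouping step described above can be sketched as follows. This is a minimal illustration under stated assumptions: the `Correction` struct, its field names, and the `synonyms::` page layout are hypothetical stand-ins, not the PR's actual `CorrectionEvent` type or the exact output of `learn export-kg`.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical, simplified stand-in for a captured correction.
/// The field names are assumptions, not the PR's CorrectionEvent.
#[derive(Debug, Clone)]
pub struct Correction {
    pub original: String,
    pub corrected: String,
}

/// Group corrections by corrected value, collecting the distinct originals.
/// BTreeMap and BTreeSet keep the output order deterministic.
pub fn group_by_corrected(corrections: &[Correction]) -> BTreeMap<String, BTreeSet<String>> {
    let mut groups: BTreeMap<String, BTreeSet<String>> = BTreeMap::new();
    for c in corrections {
        groups
            .entry(c.corrected.clone())
            .or_default()
            .insert(c.original.clone());
    }
    groups
}

/// Render one group as a Logseq-style page: a title plus a `synonyms::` property.
/// The exact layout written by export-kg is an assumption here.
pub fn render_kg_page(term: &str, originals: &BTreeSet<String>) -> String {
    let synonyms: Vec<&str> = originals.iter().map(String::as_str).collect();
    format!("# {}\n\nsynonyms:: {}\n", term, synonyms.join(", "))
}
```

With this shape, two corrections that share a corrected value collapse into one page listing both originals, which matches the "one artefact per unique corrected term" behaviour in the summary.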

feat(types): add action metadata to NormalizedTerm (Issue #735)

  • Add action, priority, trigger, pinned fields to NormalizedTerm
  • Add same four fields to AutocompleteMetadata for alignment
  • All fields use #[serde(default)] for backward-compatible JSON

Notable Fix

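  • bincode bug: #[serde(skip_serializing_if)] on optional fields of structs stored as HashMap values causes UnexpectedEof on deserialization when those fields are None. bincode is not self-describing, so a skipped field leaves the byte stream shorter than the reader expects. Removed skip_serializing_if from the AutocompleteMetadata fields.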
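To see why skipping a field breaks a positional format, here is a toy encoder and decoder written with the standard library only. It is not bincode's actual wire layout; the `Meta` struct, the one-byte Option tag, and the `skip_none` flag are assumptions chosen to mimic the effect of `skip_serializing_if = "Option::is_none"`.

```rust
/// Toy record; the real AutocompleteMetadata has more fields.
#[derive(Debug, PartialEq)]
pub struct Meta {
    pub id: u32,
    pub action: Option<u8>,
}

/// Positional encoding: fields are written in order, with no names or lengths.
/// `skip_none` mimics skipping a None field at serialization time.
pub fn encode(m: &Meta, skip_none: bool) -> Vec<u8> {
    let mut out = m.id.to_le_bytes().to_vec();
    match m.action {
        Some(a) => out.extend_from_slice(&[1, a]),
        None if skip_none => {} // field omitted entirely
        None => out.push(0),    // explicit "None" tag
    }
    out
}

/// The decoder always expects an Option tag after `id`.
pub fn decode(bytes: &[u8]) -> Result<Meta, &'static str> {
    if bytes.len() < 4 {
        return Err("unexpected EOF");
    }
    let id = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
    let action = match *bytes.get(4).ok_or("unexpected EOF")? {
        0 => None,
        1 => Some(*bytes.get(5).ok_or("unexpected EOF")?),
        _ => return Err("invalid Option tag"),
    };
    Ok(Meta { id, action })
}
```

Encoding a `None` with `skip_none = true` yields only the four `id` bytes, so the decoder runs out of input while looking for the tag. A self-describing format such as JSON tolerates the omission because the reader sees field names rather than positions.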

Files Changed

  • crates/terraphim_agent/src/learnings/export_kg.rs (new)
  • crates/terraphim_agent/src/main.rs (ExportKg variant + handler)
  • crates/terraphim_agent/src/learnings/mod.rs (exports)
  • crates/terraphim_types/src/lib.rs (NormalizedTerm fields + methods)
  • crates/terraphim_automata/src/autocomplete.rs (AutocompleteMetadata fields)
  • 9 other files updated with new struct fields

Testing

  • cargo test --workspace passes
  • cargo clippy --workspace --all-targets clean
  • Build verified on all targets including benches and tests

AlexMikhalev and others added 30 commits April 12, 2026 11:00
Drop unused learning-system re-exports and wrapper code, move list access to the concrete module path, and remove unused report/result fields so CI Clippy passes without lint suppressions or dead production code.
Keep the performance workflow consistent with the existing API benchmark skip path so a missing local health endpoint does not fail the run after custom benchmarks complete.
Use real file discovery for generated benchmark artifacts and keep PR commenting best-effort so the performance workflow fails only on benchmark gates, not on quoted globs or token-permission noise.
- Add get_target_triples_with_fallback() for Linux dual-target support
- Try GNU first, fall back to MUSL if GNU binary not available
- Prioritize signed archives (.tar.gz) over raw binaries
- Skip Rust cache for x86_64-unknown-linux-gnu to prevent stale artifacts
- Add MUSL SHA256 fallback in Homebrew formula generation
- Fix validation test latency threshold for CI stability

Refs #791
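A rough sketch of the GNU-first, MUSL-fallback selection described in this commit. The function bodies, the asset naming scheme, and the rule that a raw GNU binary wins over a MUSL archive are assumptions for illustration; the real get_target_triples_with_fallback() may differ.

```rust
/// Candidate target triples in preference order.
/// Hypothetical sketch: only the GNU-then-MUSL order for Linux comes from the commit message.
pub fn target_triples_with_fallback(arch: &str, os: &str) -> Vec<String> {
    match os {
        "linux" => vec![
            format!("{}-unknown-linux-gnu", arch),
            format!("{}-unknown-linux-musl", arch),
        ],
        "macos" => vec![format!("{}-apple-darwin", arch)],
        _ => Vec::new(),
    }
}

/// Pick the first available asset, walking triples in order and preferring
/// a `.tar.gz` archive over a raw binary for the same triple.
pub fn pick_asset<'a>(triples: &[String], available: &[&'a str]) -> Option<&'a str> {
    for triple in triples {
        let matching: Vec<&'a str> = available
            .iter()
            .copied()
            .filter(|name| name.contains(triple.as_str()))
            .collect();
        if let Some(archive) = matching.iter().copied().find(|n| n.ends_with(".tar.gz")) {
            return Some(archive);
        }
        if let Some(first) = matching.first() {
            return Some(*first);
        }
    }
    None
}
```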
Replace outdated 15-crate overview with complete workspace reference
covering all 52 crates across 9 categories. Auto-generated from
Cargo.toml descriptions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ressions

fix: restore Gitea tracker paging and claim verification
Remove the obsolete cargo-audit fetch flag, acknowledge the current rand advisory in cargo-deny, and update the cross-mode consistency test to parse the current CLI human output so main branch security and test jobs pass again.
fix(ci): restore main validation after #788
fix(update): GNU/MUSL fallback for autoupdate and release pipeline
Carry forward the telemetry routing changes from #789 onto current main by making routing decisions async and wiring the orchestrator call sites to the updated control-plane behavior without the unrelated hook changes.
all_model_performances now acquires the read lock once for all models
via compute_snapshot helper. record_telemetry uses record_batch to write
a full tick's events in a single write-lock acquisition instead of N.
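The locking pattern in this commit can be illustrated with `std::sync::RwLock`. The `TelemetryStore` type, its single latency field, and the mean-latency metric are assumptions; only the idea of one lock acquisition per batch or per snapshot is taken from the commit message.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical telemetry store; method names mirror the commit message,
/// but the fields and signatures are assumptions.
#[derive(Default)]
pub struct TelemetryStore {
    latencies_ms: RwLock<HashMap<String, Vec<u64>>>,
}

impl TelemetryStore {
    /// Take the write lock once and apply every event from the tick.
    pub fn record_batch(&self, events: &[(String, u64)]) {
        let mut guard = self.latencies_ms.write().expect("lock poisoned");
        for (model, latency) in events {
            guard.entry(model.clone()).or_default().push(*latency);
        }
    }

    /// Take the read lock once and compute the mean latency for all models.
    pub fn all_model_performances(&self) -> HashMap<String, f64> {
        let guard = self.latencies_ms.read().expect("lock poisoned");
        guard
            .iter()
            .filter(|(_, samples)| !samples.is_empty())
            .map(|(model, samples)| {
                let mean = samples.iter().sum::<u64>() as f64 / samples.len() as f64;
                (model.clone(), mean)
            })
            .collect()
    }
}
```

Holding one guard for the whole loop trades slightly longer lock hold times for far fewer acquire and release cycles, which is usually the better deal when a tick produces many small events.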
Extract `parse_stdout_for_telemetry` and `parse_stderr_for_telemetry`
as private methods on `AgentOrchestrator`, eliminating the duplicated
inline output-parsing logic that existed in both `drain_output_events`
and `poll_agent_exits`.

Both call sites now delegate to the shared helpers. No behaviour change;
440 tests pass, clippy clean.

Co-Authored-By: Terraphim AI <noreply@terraphim.ai>
…gration tests Refs #523

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…main

feat(orchestrator): refresh telemetry integration on current main
Bump the workspace release line to 1.16.33, fix the remaining performance baseline update glob on main, and make terraphim_agent persistence tests derive valid roles from the live CLI instead of hardcoding stale role names so main validation stays green as configuration evolves.
chore(release): prepare v1.16.33 and restore main CI
Use the live CLI/config role set in selected_role_tests instead of hardcoding Default so the main release build no longer depends on one specific embedded role name being present.
fix(agent): make selected role tests config-agnostic
Make terraphim_agent role-selection and persistence tests derive valid roles from the live config, and make terraphim_mcp_server integration tests pass zlob to their nested cargo build under CI so the main release build stops failing on stale role assumptions and fff build-script panics.
fix(ci): stabilize release build test suites
The remaining main release-build failures came from MCP test harnesses that spawned nested cargo run/build commands under CI=true without forwarding the zlob feature required by fff-search. Make every nested MCP cargo invocation propagate zlob in CI so the release test sweep is deterministic on main.
…lization

fix(mcp): pass zlob through nested CI cargo runs
Carry the last main-branch release blockers into the open PR by making all nested MCP cargo invocations propagate zlob under CI, switching the server workflow integration test to the dedicated role-selection path, and aligning the terraphim_service LLM validation test with the current fallback client contract.
…lization

fix(ci): finish release 1.16.33 mainline test stabilization
… educational content chunking Refs #559

Extend terraphim-markdown-parser with heading tree, configurable section
type classification, and education-aware content chunking for the Odilo
DLT ingest pipeline.

- Add heading.rs: HeadingNode tree from mdast AST, SectionType enum,
  SectionConfig with configurable pattern matching, build_heading_tree()
  and classify_sections()
- Add chunk.rs: ContentChunk with stable ULIDs, chunk_by_headings()
  producing composite chunk IDs (content_id#section_path#block_ulid)
- Store AST in NormalizedMarkdown to avoid double-parse
- Delete unused scratchpad.rs (pulldown-cmark experiment)
- 24 tests pass (14 new), clippy clean, no new dependencies
Two remaining CI Main Branch failures after #801:

1. orchestrator_tests: git_diff_baseline() fell back to HEAD~1 for
   the diff baseline, but shallow CI checkouts lack history before HEAD.
   Now uses git's empty-tree SHA as stable fallback when root commit
   lookup returns nothing, making test_orchestrator_compound_review_integration,
   test_git_diff_matching_changes_spawns, and test_spawn_agent_proceeds_with_git_diff_findings
   reliable across shallow/fetch-depth=0 CI environments.

2. integration_tests: test_end_to_end_server_workflow was selecting
   the already-active role (* marker) which fails with exit 1.
   Now selects a different role (non-current) so the switch is
   always exercised meaningfully.

Refs: #801
… classification, and educational content chunking Refs #559' (#560) from task/559-heading-hierarchy-chunking into main
Test User and others added 26 commits April 21, 2026 15:51
…llback (Fixes adf-fleet#44)

Problem
-------
When the orchestrator polled the repo-wide Gitea comments endpoint, every
PR comment came through with issue_number=0. That zero propagated:

    webhook/poll -> AdfCommand::SpawnAgent { issue_number: 0, .. }
        -> DispatchTask::MentionDriven { issue_number: 0, .. }
        -> mention_def.gitea_issue = Some(0)
        -> OutputPoster::post_agent_output_for_project(..., 0, ...)
        -> POST /api/v1/repos/{owner}/{repo}/issues/0/comments
        -> 500 {"message": "issue does not exist [id: 0, repo_id: 0, index: 0]"}

so the agent output was silently lost (it ran, exited 0, but never reached
Gitea). Observed live on 2026-04-21:

    15:02:08 webhook  received webhook event ... issue=28 comment_id=9880
    15:02:16 orch     dispatching mention-driven agent ... issue=0 comment_id=9880
    15:04:16 orch     agent exit classified ... exit_class=success
    15:04:16 ERROR    failed to post output for reviewer in project digital-twins:
                      Gitea post_comment error 500 on issue 0

Root cause
----------
`GET /api/v1/repos/{owner}/{repo}/issues/comments` returns TWO mutually-
exclusive URL fields per comment:

  - issue_url        -- set for issue comments, empty string for PR comments
  - pull_request_url -- set for PR comments, empty string for issue comments

`impl From<RepoComment> for IssueComment` at gitea.rs:1132 only read
issue_url. For PR comments it fed an empty string to `rsplit('/').next()`,
`.parse()` failed, and `.unwrap_or(0)` kicked in.

Fix
---
- Deserialise `pull_request_url` alongside `issue_url` on `RepoComment`.
- Try issue_url first, fall back to pull_request_url; skip empty strings.
- PRs share the issue numeric namespace in Gitea, so the same `rsplit('/')`
  trailing-segment trick works for both.
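The fallback described in the fix can be sketched like this. The struct is a simplified stand-in for `RepoComment` with only the two URL fields, the `u64` return type is an assumption, and the example URLs used to exercise it are placeholders.

```rust
/// Simplified stand-in for the two URL fields on RepoComment.
pub struct RepoComment {
    pub issue_url: String,
    pub pull_request_url: String,
}

/// Parse the trailing path segment of a URL as a number.
/// Empty input or a non-numeric segment yields None.
fn trailing_number(url: &str) -> Option<u64> {
    if url.is_empty() {
        return None;
    }
    url.rsplit('/').next()?.parse().ok()
}

/// Try issue_url first, then pull_request_url; 0 only when neither resolves.
pub fn issue_number(c: &RepoComment) -> u64 {
    trailing_number(&c.issue_url)
        .or_else(|| trailing_number(&c.pull_request_url))
        .unwrap_or(0)
}
```

Returning `Option<u64>` from the helper keeps the "no usable URL" case explicit until the final `unwrap_or(0)`, so the legacy zero is produced in exactly one place.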

Tests
-----
- New: test_repo_comments_pr_comment_extracts_pr_number_from_pull_request_url
  asserts both PR comments (via pull_request_url) and issue comments (via
  issue_url) resolve to the correct number in a single response.
- Existing: the `assert_eq!(comments[0].issue_number, 0)` test for the "no URLs at all"
  case still passes; behaviour is preserved when both URLs are missing or empty.

Verification
------------
- `cargo test -p terraphim_tracker --all-targets` — 40/40 PASS
- `cargo test -p terraphim_orchestrator --all-targets` — all suites PASS
- `cargo clippy -p terraphim_tracker --all-targets -- -D warnings` — clean
- `cargo fmt --check` — clean
… pull_request_url fallback (Fixes adf-fleet#44)' (#738) from task/fix-outputposter-issue-zero into main
…, tracker, output, quickwit - Refs terraphim/adf-fleet#4

Threads project context through the orchestrator runtime so one process can serve multiple projects:

- DispatchTask variants carry project: String; Dispatcher adds per-project fairness (round-robin within same priority)
- Restart cooldown + concurrency caps keyed on (project, agent); ConcurrencyConfig gains per_project caps
- Agent spawn resolves agent.project -> Project and builds SpawnContext with working_dir + ADF_PROJECT_ID / ADF_WORKING_DIR / GITEA_OWNER / GITEA_REPO env
- dual_mode runs one Tracker per project (RunningTrackers map), with __global__ fallback for legacy single-project mode
- output_poster routes comments to the per-project Gitea repo
- quickwit events tagged with project_id; index_id resolved per project
- Legacy single-project configs keep working unchanged
Documents the full pipeline from DispatchContext through
RoutingDecisionEngine (KG + keyword + static merge, C1/C3 filter, budget
filter, scoring, telemetry adjustment) through spawn_with_fallback,
model_args, the tokio subprocess, exit classification and OutputPoster
write-back.

Cross-references the code (routing.rs, kg_router.rs, provider_probe.rs,
provider_budget.rs, error_signatures.rs, spawner) and the log lines so
an operator can read a journal trace alongside it. Includes a worked
example from a real security-sentinel dispatch.

Also documents the adf-fleet#44 fix (PR #738): pull_request_url
fallback for PR-comment issue_number extraction.
…nce' (#739) from task/docs-adf-model-selection into main
Before this change, an agent's own `gtr comment` / API calls inside its
task shell used `$GITEA_TOKEN` from `source ~/.profile` — the shared
root token — so the agent posted as `root` even though OutputPoster was
using the agent's own token for the wrapped completion comment.

After: build_spawn_context_for_agent resolves the per-agent token via
OutputPoster::agent_token(project, name) (reading agent_tokens.json)
and injects it as a `GITEA_TOKEN` env override, which beats the
~/.profile value. Every agent now posts to Gitea under its own login
for both the wrapped completion comment AND any direct `gtr` call in
its task script.

Changes:
- output_poster.rs: add ProjectTrackers.agent_tokens (HashMap<String,
  String>) parallel to agent_trackers; rewrite build_project_trackers
  to return both from one pass; add pub OutputPoster::agent_token
  getter.
- lib.rs: extend build_spawn_context_for_agent to accept
  Option<&OutputPoster>; inject GITEA_TOKEN when agent_token returns
  Some. Update the two callsites in spawn_agent and handle_review_pr.
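A minimal sketch of the env-override idea using `std::process::Command`. The helper names are hypothetical and the token lookup is reduced to an `Option<&str>` argument; only the `GITEA_TOKEN` variable name comes from the commit message.

```rust
use std::collections::HashMap;
use std::process::Command;

/// Build the env overrides for a spawn.
/// `agent_token` stands in for the result of a per-agent token lookup.
pub fn env_overrides_for_agent(agent_token: Option<&str>) -> HashMap<String, String> {
    let mut overrides = HashMap::new();
    if let Some(token) = agent_token {
        overrides.insert("GITEA_TOKEN".to_string(), token.to_string());
    }
    overrides
}

/// Apply the overrides to a command before it is spawned.
/// Variables set this way replace inherited values of the same name.
pub fn apply_overrides(cmd: &mut Command, overrides: &HashMap<String, String>) {
    cmd.envs(overrides);
}
```

When no per-agent token is configured the override map stays empty, so the child process simply inherits whatever `GITEA_TOKEN` the parent environment already carries.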

Tests:
- agent_token_returns_configured_value_when_tokens_file_loaded: loads
  a tempdir agent_tokens.json, asserts agent_token() returns the
  configured value; unknown agent/project returns None.
- agent_token_returns_none_when_no_tokens_file: no file configured →
  always None; agent defaults to project root token.

Verification:
- cargo test -p terraphim_orchestrator --all-targets — all pass
  (including the 2 new unit tests + existing 497 suite tests).
- cargo clippy -p terraphim_orchestrator --all-targets -- -D warnings
  — clean.
- cargo fmt --check — clean.
…into spawn env' (#741) from task/inject-per-agent-gitea-token into main
Adds a per-agent identity paragraph to Stage 1 (DispatchContext) and a
new section covering PR #741 (env_overrides injection) + the retasked
meta-coordinator that now picks top-PageRank and dispatches via @adf:
mention instead of posting health reports to issue #107.
…inator dispatch role' (#743) from task/docs-adf-token-inject into main
…its with manual conflict resolution)' (#740) from sync/github-to-gitea-full into main
Bumps rust from 1.94-slim-bookworm to 1.95-slim-bookworm.

---
updated-dependencies:
- dependency-name: rust
  dependency-version: 1.95-slim-bookworm
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* feat(drift-detector): agent work [auto-commit]

* feat(security-sentinel): agent work [auto-commit]

* feat(spec-validator): agent work [auto-commit]

* feat(sessions): Session enrichment pipeline with concept extraction (Spec F5.1 / Task 3.1) Refs #756

- Add enrichment field to SessionMetadata (feature-gated)
- Wire up actual enrichment in REPL /sessions enrich command
- Add enrichment feature to terraphim_agent Cargo.toml
- Fix SessionMetadata construction sites for feature-gated field

Acceptance criteria:
- SessionEnricher processes messages through terraphim_automata
- ConceptMatch with occurrence count, message IDs, and confidence
- Concept pair detection (co-occurring concepts)
- Dominant topic identification via frequency analysis
- Enrichment results stored in SessionMetadata
- /sessions enrich command triggers enrichment on demand
- Feature-gated via enrichment feature flag
- cargo test passes (57/57)
- cargo clippy -- -D warnings passes

* chore: fix formatting Refs #756

* fix: remove redundant struct update syntax in SessionMetadata Refs #756

* fix: add SessionMetadata constructor to avoid clippy field-assignment error Refs #756
…Refs #758 (#831)

* feat(types): extract BM25 score module to terraphim_types Refs #758

* refactor(service): replace score module with re-export from terraphim_types Refs #758

* feat(sessions): add search-index feature with BM25 session search adapter Refs #758

* feat(sessions): replace brute-force search with BM25-ranked results Refs #758

* feat(agent): add /sessions index CLI command for BM25 search index stats Refs #758

* fix(types): add Default derives for all scorer structs (clippy) Refs #758

* feat(sessions): hybrid search with KG concepts first, BM25 fallback Refs #758

* fix(sessions): UTF-8 boundary panic in build_body truncation, log BM25 errors Refs #758
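The UTF-8 boundary panic mentioned in the last item is the classic result of slicing a `&str` at a byte offset that falls inside a multi-byte character. The commit does not show how build_body was changed; the function below is one common way to truncate safely, with a hypothetical name.

```rust
/// Truncate `s` to at most `max_bytes` bytes without splitting a UTF-8 character.
/// Walks back from `max_bytes` to the nearest character boundary.
pub fn truncate_at_char_boundary(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    let mut end = max_bytes;
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}
```

By contrast, `&s[..max_bytes]` panics whenever `max_bytes` lands in the middle of a character, for example at byte 2 of "héllo".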
…759

- Add export_kg.rs module with export_corrections_as_kg() function
- Add ExportKg variant to LearnSub CLI enum
- Corrections are grouped by corrected value, exported as Logseq-style KG markdown
- Supports --type tool-preference (default) or --type all filter
- 7 unit tests covering empty dir, single export, merging, filtering, filenames
- clippy clean
…raphim/terraphim-ai into task/759-export-corrections-markdown
@AlexMikhalev
Contributor Author

The export-kg and NormalizedTerm metadata changes were cherry-picked into main as commit 781df78. Closing this PR, as all valuable changes are now merged.
