Improved speaker detection with GLiNER + expanded regex by MithilSaiReddy · Pull Request #6735 · BasedHardware/omi

MithilSaiReddy · 2026-04-17T03:00:12Z

Improve Speaker Detection (Issue #3039)

Description
Improves speaker name extraction accuracy by combining GLiNER (NER) with enhanced regex detection, better handling of lowercase ASR output, and fixes for incorrect extraction in self-introduction phrases.

What this delivers

Detects speaker names from natural phrases like “this is bob”, “hey it’s charlie”, etc.
Handles lowercase ASR output and normalizes it to properly capitalized names.
Prevents GLiNER from returning full phrases instead of names in self-introductions.
Improves support for multi-word names (e.g., “John Smith”).
Maintains backward compatibility with existing detection logic.

Key implementation

GLiNER filtering

Skips NER for self-introduction phrases across 30+ languages.
Introduces _contains_intro_phrase() to avoid incorrect entity extraction.

Expanded regex patterns

Adds support for common patterns missed by GLiNER:
- “This is X”
- “Hey it’s X”
- “Call me X”
- “You’re speaking with X”
- “You’re talking to X”
- “The name’s X”
Supports both single and multi-word names.

Lowercase ASR handling

Processes raw inputs like “this is bob”.
Removes filler words (e.g., “hi”, “uh”, “um”).
Filters non-name tokens.
Normalizes output to properly capitalized names.

Multi-word name fix

Ensures correct capitalization for names like “John Smith”.

Impact

Improves speaker identification accuracy in real-world ASR scenarios, especially for noisy, lowercase, and conversational inputs, without introducing regressions or additional dependencies.

Test Results

✅ 48/48 tests passing
🌍 Coverage across 33 languages
🔒 No regressions observed

Considerations

Performance: No new dependencies; regex operates in O(n); GLiNER usage remains cached.
Privacy: No additional data collection; operates on transient ASR text only.
Reliability: Fully covered by test suite with fallback handling for edge cases.

Docs

No user-facing changes; existing documentation and inline comments are sufficient.

AI Usage

Claude and OpenCode were used to assist with regex design, edge case handling, and PR structuring.
All logic and changes were manually reviewed and validated.

Files

backend/utils/speaker_identification.py
.gitignore

Video Recording (Test Cases)

greptile-apps · 2026-04-17T03:05:32Z

Greptile Summary

This PR integrates GLiNER NER as a speaker-detection layer in speaker_identification.py, adds expanded English regex patterns ("This is X", "Call me X", etc.), and introduces lowercase ASR fallback handling. The new batch_detect_speakers_from_texts async helper and a 258-line test suite are also included.

P1 – in-function import: from gliner import GLiNER inside _get_gliner_model() directly violates the project's "no in-function imports" rule; since gliner is now in requirements.txt, it should be a top-level import.
P1 – _contains_intro_phrase false positives: plain substring matching on short tokens ("я", "olen", "sono", "soy") bypasses GLiNER for entire Slavic-language conversations and for common proper nouns, silently degrading detection quality.
P1 – _clean_person_name two-letter name bug: the len(first_word) <= 2 guard discards valid two-letter first names (Ed, Bo, Jo) and returns the second word instead.

Confidence Score: 2/5

Not safe to merge — three P1 logic bugs need fixes before landing

The in-function import is a clear rule violation per CLAUDE.md; the substring-matching false-positive in _contains_intro_phrase silently breaks GLiNER bypass for entire languages (e.g., all Russian text containing 'я'); and _clean_person_name incorrectly discards two-letter first names. All three are present-defect correctness issues on the changed code path.

backend/utils/speaker_identification.py (all three P1 issues); backend/requirements.txt (unpinned gliner version)

Important Files Changed

Filename	Overview
backend/utils/speaker_identification.py	Adds GLiNER-based NER for speaker detection; contains in-function import violation, substring false-positive logic in _contains_intro_phrase for short Slavic tokens, and a 2-letter first-name truncation bug in _clean_person_name
backend/requirements.txt	Adds gliner as a new ML dependency with a loose minimum-version constraint (>=0.2.0) contrary to the exact-pin convention used throughout the rest of the file
backend/tests/unit/test_gliner_ner.py	New unit test file with reasonable coverage of _clean_person_name and detect_speaker_from_text; tests for 'This is John' actually exercise the lowercase-fallback path, not GLiNER itself
backend/test.sh	New test file correctly added to the CI test runner
.gitignore	Adds /tmp/, .pyc, and pycache/ ignore entries; pycache*/ is usually already covered by defaults and /tmp/ is very broad

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["detect_speaker_from_text(text)"] --> B{text empty\nor len < 3?}
    B -->|Yes| Z[return None]
    B -->|No| C["_detect_person_entities_cached(text)"]
    C --> D{"_contains_intro_phrase(text)?"}
    D -->|Yes| E["Skip GLiNER\nreturn ([], True)"]
    D -->|No| F["GLiNER model.predict_entities()"]
    F -->|Exception| G["return ([], False)"]
    F -->|Success| H["Filter intro-phrase words\n+ len >= 2"]
    H --> I["return (persons, True)"]
    E --> J{ner_available\nand persons?}
    I --> J
    G --> J
    J -->|Yes| K["_clean_person_name(person)\nfor each person"]
    K --> L{cleaned not None?}
    L -->|Yes| M[return cleaned name]
    L -->|No| N[try next person]
    J -->|No| O["Try regex patterns\n(patterns_to_check)"]
    N --> O
    O --> P{regex match?}
    P -->|Yes| Q[return capitalized name]
    P -->|No| R["Strip filler words\nfrom text_lower"]
    R --> S{"startswith 'this is'?"}
    S -->|Yes| T[return first_word.capitalize]
    S -->|No| V{"startswith intro_phrases?"}
    V -->|Yes| W[return first_word.capitalize]
    V -->|No| Z2[return None]

_{Reviews (1): Last reviewed commit: "Merge branch 'BasedHardware:main' into m..." | Re-trigger Greptile}

greptile-apps · 2026-04-17T03:05:37Z

+def _contains_intro_phrase(text: str) -> bool:
+    """Check if text contains any intro phrase."""
+    text_lower = text.lower()
+    for phrase in GLINER_INTRO_PHRASES:
+        if phrase in text_lower:
+            return True
+    return False


Substring matching on short phrases causes widespread false GLiNER bypasses

_contains_intro_phrase uses plain substring containment (phrase in text_lower) over phrases that include single or very short tokens. For example: "я" (Cyrillic "I") is in GLINER_INTRO_PHRASES and will match as a substring inside меня, моя, твоя, and practically every sentence of Russian text, causing GLiNER to be silently bypassed for the whole language even on non-introduction utterances. Similarly, "olen" matches inside last names like "Bolen", and "sono" matches inside "Sonoma". The function should use word-boundary matching for short tokens:

Suggested change

def _contains_intro_phrase(text: str) -> bool:

"""Check if text contains any intro phrase."""

text_lower = text.lower()

for phrase in GLINER_INTRO_PHRASES:

if phrase in text_lower:

return True

return False

def _contains_intro_phrase(text: str) -> bool:

"""Check if text contains any intro phrase (word-boundary aware)."""

text_lower = text.lower()

for phrase in GLINER_INTRO_PHRASES:

if len(phrase) <= 4:

# Short tokens: require word boundaries to avoid substring false positives

if re.search(r'\b' + re.escape(phrase) + r'\b', text_lower):

return True

else:

if phrase in text_lower:

return True

return False

this issue is fixed

greptile-apps · 2026-04-17T03:05:38Z

+    if first_word in NAME_PREFIXES and len(words) > 1:
+        return words[1].capitalize()
+
+    if len(first_word) <= 2 and first_word not in NAME_PREFIXES:
+        if len(words) > 1:
+            return words[1].capitalize()
+        return None


Two-letter first names incorrectly treated as non-name tokens

The <= 2 guard is designed to skip pronouns like "I" or articles like "A", but it also catches valid two-letter first names (Bo, Ed, Al, Jo, Li). For a GLiNER-extracted entity like "Ed Sullivan", first_word = "ed", len("ed") == 2 <= 2, so the function returns "Sullivan" (the last name) instead of "Ed". The condition should only exclude single-character tokens:

Suggested change

if first_word in NAME_PREFIXES and len(words) > 1:

return words[1].capitalize()

if len(first_word) <= 2 and first_word not in NAME_PREFIXES:

if len(words) > 1:

return words[1].capitalize()

return None

if len(first_word) == 1 and first_word not in NAME_PREFIXES:

if len(words) > 1:

return words[1].capitalize()

return None

greptile-apps · 2026-04-17T03:05:39Z

 fsspec==2024.6.1
 gitdb==4.0.11
 GitPython==3.1.43
+gliner>=0.2.0


Unpinned version constraint breaks reproducible builds

Every other dependency in this file is pinned to an exact version (==). gliner>=0.2.0 allows any future major version to be installed, which can introduce breaking changes silently. Pin to the specific tested version:

Suggested change

gliner>=0.2.0

gliner==0.2.17

(Replace with whatever version was tested locally.)

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

MithilSaiReddy · 2026-04-17T03:20:04Z

@aaravgarg @beastoin guys i cleared the issues by the bot btw !

MithilSaiReddy and others added 2 commits April 17, 2026 08:08

Fixed BasedHardware#3039

57d4909

Merge branch 'BasedHardware:main' into main

55cfd7b

greptile-apps bot reviewed Apr 17, 2026

View reviewed changes

MithilSaiReddy and others added 2 commits April 17, 2026 08:37

Update backend/utils/speaker_identification.py

e8f0dba

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

Fixed!

b104a91

MithilSaiReddy mentioned this pull request Apr 17, 2026

Use NER (Named Entity Recognition) or better techniques (like self-hosted LLM) to improve speaker detection based on transcripts ($500) #3039

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improved speaker detection with GLiNER + expanded regex#6735

Improved speaker detection with GLiNER + expanded regex#6735
MithilSaiReddy wants to merge 4 commits intoBasedHardware:mainfrom
MithilSaiReddy:main

MithilSaiReddy commented Apr 17, 2026 •

edited

Loading

Uh oh!

greptile-apps bot commented Apr 17, 2026

Uh oh!

Uh oh!

greptile-apps bot Apr 17, 2026

Uh oh!

MithilSaiReddy Apr 17, 2026

Uh oh!

greptile-apps bot Apr 17, 2026

Uh oh!

greptile-apps bot Apr 17, 2026

Uh oh!

MithilSaiReddy commented Apr 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

MithilSaiReddy commented Apr 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Improve Speaker Detection (Issue #3039)

What this delivers

Key implementation

GLiNER filtering

Expanded regex patterns

Lowercase ASR handling

Multi-word name fix

Impact

Test Results

Considerations

Docs

AI Usage

Files

Video Recording (Test Cases)

Uh oh!

greptile-apps bot commented Apr 17, 2026

Greptile Summary

Confidence Score: 2/5

Important Files Changed

Flowchart

Uh oh!

Uh oh!

greptile-apps bot Apr 17, 2026

Choose a reason for hiding this comment

Uh oh!

MithilSaiReddy Apr 17, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Apr 17, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Apr 17, 2026

Choose a reason for hiding this comment

Uh oh!

MithilSaiReddy commented Apr 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

MithilSaiReddy commented Apr 17, 2026 •

edited

Loading