author-watch

Monitor specific authors across multiple publication venues and get notified whenever they publish something new.

Most RSS readers let you follow a publication — but not a specific author within it. author-watch solves that: define the authors you care about, point it at their publication pages or RSS feeds, and it tracks new articles across all of them. When something new appears, it can notify you via Telegram and/or automatically save it to Readwise Reader.

Features

Author-specific tracking — follow individual writers, not whole publications
Multiple venues per author — a single author can publish on a think tank site, a Substack, and a magazine; watch all of them
RSS + HTML scraping — RSS where available, CSS-selector HTML scraping as fallback
Deduplication — same article found on two venues in one run is only reported once
Telegram notifications — get a message with title + link when something new appears
Readwise Reader integration — articles are automatically saved and tagged by author name
Stateful — tracks what's been seen so you only hear about genuinely new articles
Dry-run mode — test your config without sending any notifications
Venue discovery — search the web to find new publication venues for any author, then interactively add them to your watch list
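
As a rough illustration of the deduplication feature, the sketch below collapses articles that share a normalized URL within one run. The `Article` tuple, the `normalize_url` helper, and the normalization rules (lowercase host, no query string, no fragment, no trailing slash) are assumptions made for this example; the README does not state how watcher.py decides that two entries are the same article.

```python
from typing import NamedTuple
from urllib.parse import urlsplit, urlunsplit


class Article(NamedTuple):
    """Hypothetical record for one discovered article."""
    title: str
    url: str
    venue: str


def normalize_url(url: str) -> str:
    """Reduce a URL to scheme + lowercase host + path without trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    # Query string and fragment are dropped (assumed to be tracking noise).
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, "", ""))


def dedupe(articles):
    """Keep the first occurrence of each normalized URL, preserving order."""
    seen = set()
    unique = []
    for article in articles:
        key = normalize_url(article.url)
        if key in seen:
            continue
        seen.add(key)
        unique.append(article)
    return unique
```

Dropping the query string is a trade-off: it merges links that differ only in tracking parameters, but it would also merge distinct articles on sites that identify posts by query parameter.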

Quick start

# 1. Install dependencies
pip install requests pyyaml python-dotenv beautifulsoup4

# 2. Copy and edit the config
cp authors.yaml.example authors.yaml
$EDITOR authors.yaml

# 3. Copy and edit credentials
cp .env.example .env
$EDITOR .env

# 4. Dry-run to verify article detection without sending anything
python watcher.py --dry-run --verbose

# 5. Run for real
python watcher.py

Discovering new venues

Not sure where an author publishes? Run the discovery tool:

# Show up to 12 new venues where "Michael Doran" has published in the last year
python discover.py "Michael Doran"

# Show 20 results, search back 2 years
python discover.py "Michael Doran" --limit 20 --year 2

# Search then interactively choose which venues to add to authors.yaml
python discover.py "Michael Doran" --add

# Machine-readable JSON output
python discover.py "Michael Doran" --json

The tool:

Runs multiple targeted web searches for the author's recent work
Deduplicates results by domain
Probes each new domain for RSS autodiscovery
Shows already-watched venues (marked ✓) and new ones (numbered)
With --add: lets you type numbers to select venues, then appends them to authors.yaml automatically — using RSS if found, HTML otherwise
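
The RSS autodiscovery step can be sketched as follows: parse a page's HTML and collect `<link rel="alternate">` tags whose type is an RSS or Atom feed. This is a minimal standard-library sketch, not the code in discover.py; `FeedLinkParser` and `find_feeds` are hypothetical names, and the real tool may probe additional locations.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types commonly used to advertise feeds in <link> tags.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Collects href values from <link rel="alternate" type="...feed..."> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {key: (value or "") for key, value in attrs}
        rels = attr.get("rel", "").lower().split()
        is_feed = attr.get("type", "").lower() in FEED_TYPES
        if "alternate" in rels and is_feed and attr.get("href"):
            self.hrefs.append(attr["href"])


def find_feeds(html: str, base_url: str) -> list:
    """Return absolute feed URLs advertised by the given HTML document."""
    parser = FeedLinkParser()
    parser.feed(html)
    return [urljoin(base_url, href) for href in parser.hrefs]
```

An empty result would correspond to the case where the tool falls back to an HTML venue.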

For best results, add a free search API key to .env:

# Serper.dev (2,500 free searches/month) — https://serper.dev
SERPER_API_KEY=your_key_here

# OR Brave Search API — https://api.search.brave.com
BRAVE_API_KEY=your_key_here

Without an API key the tool falls back to DuckDuckGo HTML scraping, which is rate-limited after a few queries. A single Serper free account is more than sufficient for occasional discovery runs.

Note on false positives: web search results may include different people who share the author's name. Always review the sample titles/URLs before adding a venue.

Configuration: authors.yaml

authors:
  - name: "Jane Author"           # Display name in notifications
    match_names:                  # Name variants to match in RSS text
      - "Jane Author"
      - "J. Author"
    venues:
      # Option 1 — RSS feed (simplest, most reliable)
      - type: rss
        url: "https://example.com/rss.xml"

      # Option 2 — HTML author page with CSS selectors
      - type: html
        url: "https://example.com/experts/jane-author"
        item_selector: ".article-card"       # Container for each article
        date_selector: ".article-date"       # Date within container (optional)
        # Only include cards that link to this author's profile page
        # (useful on multi-author pages where articles from other authors
        # appear in the same list)
        author_link_contains: "/jane-author"

See authors.yaml.example for more patterns.

Field reference

| Field | Type | Description |
| --- | --- | --- |
| `name` | str | Author display name used in notifications |
| `match_names` | list[str] | Substrings to match in RSS author/title/content fields |
| `venues[].type` | `rss` or `html` | Venue type |
| `venues[].url` | str | RSS feed URL or HTML page URL |
| `venues[].item_selector` | str | CSS selector for article cards (HTML only) |
| `venues[].title_selector` | str | CSS selector for title within a card (HTML only, optional) |
| `venues[].link_selector` | str | CSS selector for the link within a card (HTML only, optional) |
| `venues[].date_selector` | str | CSS selector for date within a card (HTML only, optional) |
| `venues[].author_link_contains` | str | Filter cards by author link substring (HTML only, optional) |
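
To make the `match_names` behaviour concrete, here is a minimal sketch of substring matching against an RSS entry. The entry is modelled as a plain dict with `author`, `title`, and `content` keys, and the comparison is case-insensitive; both are assumptions for illustration, and watcher.py may handle feed entries differently.

```python
def matches_author(entry: dict, match_names: list) -> bool:
    """Return True if any name variant occurs in the entry's text fields."""
    fields = ("author", "title", "content")
    haystack = " ".join(str(entry.get(field, "")) for field in fields).lower()
    # Empty variants are skipped, since "" is a substring of every string.
    return any(name.lower() in haystack for name in match_names if name)
```

Because this is a substring test, a short variant such as "J. Author" can also match unrelated people, so more specific variants reduce false positives.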

Environment variables

Copy .env.example to .env and fill in at least one output:

| Variable | Required | Description |
| --- | --- | --- |
| `TELEGRAM_BOT_TOKEN` | One of these | Bot token from @BotFather |
| `TELEGRAM_CHAT_ID` | If using Telegram | Your chat/user ID (get it from @userinfobot) |
| `READWISE_TOKEN` | One of these | Readwise API token from readwise.io/access_token |
| `SERPER_API_KEY` | Optional | Serper.dev key for richer venue discovery |
| `BRAVE_API_KEY` | Optional | Brave Search API key for venue discovery |

At least one of TELEGRAM_BOT_TOKEN or READWISE_TOKEN must be set. Both can be active simultaneously.
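
The rule above can be expressed as a small startup check. This is a sketch under assumptions: `enabled_outputs` is a hypothetical helper, and the choice to raise an error when the bot token is set without a chat ID is a guess at sensible behaviour, not a description of what watcher.py does.

```python
import os


def enabled_outputs(env=None) -> list:
    """Return the configured outputs, or raise if none is usable."""
    env = os.environ if env is None else env
    outputs = []
    if env.get("TELEGRAM_BOT_TOKEN"):
        # Telegram needs both the bot token and a destination chat.
        if not env.get("TELEGRAM_CHAT_ID"):
            raise ValueError("TELEGRAM_BOT_TOKEN is set but TELEGRAM_CHAT_ID is missing")
        outputs.append("telegram")
    if env.get("READWISE_TOKEN"):
        outputs.append("readwise")
    if not outputs:
        raise ValueError("Set at least one of TELEGRAM_BOT_TOKEN or READWISE_TOKEN")
    return outputs
```

Passing a dict instead of reading `os.environ` directly keeps the check easy to test.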

Outputs

Telegram

When a new article is found, sends a message like:

✍️ Jane Author published on example.com

Article Title Here
📅 2026-05-30
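
A message in this shape could be built and sent roughly as shown here. The sketch uses the Telegram Bot API `sendMessage` method over the standard library's `urllib` rather than `requests`, purely to stay self-contained. `format_message` and `send_telegram` are hypothetical names, and placing the link on its own line under the title is an assumption, since the example above does not show where the link goes.

```python
import json
import urllib.request
from urllib.parse import urlsplit


def format_message(author: str, title: str, url: str, date: str = None) -> str:
    """Build the notification text; the date line is omitted when unknown."""
    host = urlsplit(url).netloc
    if host.startswith("www."):
        host = host[4:]
    lines = [f"✍️ {author} published on {host}", "", title, url]
    if date:
        lines.append(f"📅 {date}")
    return "\n".join(lines)


def send_telegram(token: str, chat_id: str, text: str, timeout: int = 15) -> dict:
    """POST the text to the Bot API sendMessage method and return its JSON reply."""
    api = f"https://api.telegram.org/bot{token}/sendMessage"
    body = json.dumps({"chat_id": chat_id, "text": text}).encode("utf-8")
    request = urllib.request.Request(
        api, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping formatting separate from sending means a dry run can print the message without touching the network.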

Readwise Reader

Articles are saved to your Reader inbox via the Reader API, automatically tagged with the author's name (e.g. jane-author). Readwise fetches and parses the full article content.
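
For orientation, saving a document through the Reader API amounts to a POST to its save endpoint with the article URL and a list of tags. The sketch below is not taken from watcher.py: `author_tag`, `build_payload`, and `save_to_reader` are hypothetical names, and the slug rule (lowercase, non-alphanumerics collapsed to hyphens) is inferred from the `jane-author` example.

```python
import json
import re
import urllib.request

READER_SAVE_URL = "https://readwise.io/api/v3/save/"


def author_tag(name: str) -> str:
    """Turn a display name into a tag slug, e.g. 'Jane Author' -> 'jane-author'."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def build_payload(article_url: str, author: str) -> dict:
    """Request body: the article URL plus one tag derived from the author."""
    return {"url": article_url, "tags": [author_tag(author)]}


def save_to_reader(token: str, article_url: str, author: str, timeout: int = 15) -> int:
    """POST the payload to Reader and return the HTTP status code."""
    body = json.dumps(build_payload(article_url, author)).encode("utf-8")
    request = urllib.request.Request(
        READER_SAVE_URL,
        data=body,
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```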

Scheduling

Run watcher.py on a schedule using any method you prefer:

cron (e.g. twice daily at 6am and 6pm):

0 6,18 * * * cd /path/to/author-watch && python watcher.py >> watcher.log 2>&1

A systemd timer, launchd (macOS), or any other task scheduler works as well.

State file

state.json records the ID of every article that has already triggered a notification. It is created automatically on first run. To re-trigger notifications for all currently visible articles (useful for testing), delete it:

rm state.json
python watcher.py --dry-run  # preview what would fire
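
A state file of this kind can be as simple as a JSON list of seen IDs. The sketch below shows one possible shape; the `{"seen": [...]}` layout and the helper names `load_seen`, `save_seen`, and `filter_new` are assumptions, and the actual format of state.json is not documented here.

```python
import json
import os
import tempfile
from pathlib import Path


def load_seen(path: str) -> set:
    """Return the set of seen article IDs; a missing file means nothing seen."""
    file = Path(path)
    if not file.exists():
        return set()
    data = json.loads(file.read_text(encoding="utf-8"))
    return set(data.get("seen", []))


def save_seen(path: str, seen: set) -> None:
    """Write the IDs to a temp file, then rename it over the state file."""
    file = Path(path)
    fd, tmp_name = tempfile.mkstemp(dir=str(file.parent), suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump({"seen": sorted(seen)}, handle, indent=2)
    os.replace(tmp_name, file)


def filter_new(article_ids: list, seen: set) -> list:
    """Keep only IDs that have not been recorded yet, preserving order."""
    return [article_id for article_id in article_ids if article_id not in seen]
```

Writing to a temporary file and renaming it avoids leaving a half-written state file if a scheduled run is interrupted.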

CLI reference

python watcher.py [options]

Options:
  --config PATH    Path to authors.yaml  (default: ./authors.yaml)
  --state PATH     Path to state.json    (default: ./state.json)
  --dry-run        Check for new articles without sending any notifications
  --verbose / -v   Enable DEBUG logging

python discover.py AUTHOR [options]

Options:
  --limit N        Max new venues to return (default: 12)
  --year N         How many years back to search (default: 1)
  --no-rss-probe   Skip RSS feed probing (faster)
  --add            Interactive: choose venues to add to authors.yaml
  --json           Output results as JSON
  --config PATH    Path to authors.yaml (default: ./authors.yaml)
  --verbose / -v   Enable DEBUG logging
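
For readers who want to wrap or extend the watcher, the options listed for watcher.py map onto `argparse` roughly as follows. This is a reconstruction from the option list above, not the parser in watcher.py; `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the documented watcher.py options and defaults."""
    parser = argparse.ArgumentParser(prog="watcher.py")
    parser.add_argument("--config", metavar="PATH", default="./authors.yaml",
                        help="Path to authors.yaml")
    parser.add_argument("--state", metavar="PATH", default="./state.json",
                        help="Path to state.json")
    parser.add_argument("--dry-run", action="store_true",
                        help="Check for new articles without sending notifications")
    parser.add_argument("--verbose", "-v", action="store_true",
                        help="Enable DEBUG logging")
    return parser
```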

Running tests

pip install pytest
python -m pytest test_watcher.py -v

Tests use mocked HTTP — no real network calls or notifications are made.

Common venue patterns

| Publication type | Recommended approach |
| --- | --- |
| Substack | RSS: `https://AUTHOR.substack.com/feed` |
| Medium | RSS: `https://medium.com/feed/@USERNAME` |
| WordPress blog | RSS: usually `/feed` or `/rss.xml` |
| Hudson Institute | HTML + `item_selector: ".research-card"` + `author_link_contains: "/EXPERT-ID"` |
| The Free Press | HTML author page: `https://www.thefp.com/w/AUTHOR-SLUG` |
| Generic think tank / magazine | HTML author page with appropriate CSS selectors |

RSS is always preferred when available — it's faster and more reliable than HTML scraping.
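
The two fixed RSS patterns in the table can be turned into venue entries mechanically. The helpers below are hypothetical conveniences, not part of the tool; they only fill in the URL templates listed above and return dicts in the shape used under `venues:` in authors.yaml.

```python
def substack_venue(subdomain: str) -> dict:
    """RSS venue for a Substack publication, given its subdomain."""
    return {"type": "rss", "url": f"https://{subdomain}.substack.com/feed"}


def medium_venue(username: str) -> dict:
    """RSS venue for a Medium user; a leading '@' in the name is tolerated."""
    handle = username.lstrip("@")
    return {"type": "rss", "url": f"https://medium.com/feed/@{handle}"}
```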

Dependencies

requests — HTTP
pyyaml — config parsing
beautifulsoup4 — HTML scraping (falls back to regex if not installed)
python-dotenv — .env file loading (optional)
