Monitor specific authors across multiple publication venues and get notified whenever they publish something new.
Most RSS readers let you follow a publication — but not a specific author within it. author-watch solves that: define the authors you care about, point it at their publication pages or RSS feeds, and it tracks new articles across all of them. When something new appears, it can notify you via Telegram and/or automatically save it to Readwise Reader.
- Author-specific tracking — follow individual writers, not whole publications
- Multiple venues per author — a single author can publish on a think tank site, a Substack, and a magazine; watch all of them
- RSS + HTML scraping — RSS where available, CSS-selector HTML scraping as fallback
- Deduplication — same article found on two venues in one run is only reported once
- Telegram notifications — get a message with title + link when something new appears
- Readwise Reader integration — articles are automatically saved and tagged by author name
- Stateful — tracks what's been seen so you only hear about genuinely new articles
- Dry-run mode — test your config without sending any notifications
- Venue discovery — search the web to find new publication venues for any author, then interactively add them to your watch list
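The deduplication and stateful-tracking features above can be sketched together as follows. This is a minimal illustration, not the project's actual code: the helper names (article_id, load_seen, filter_new), the URL normalization rule, and the {"seen": [...]} layout of the state file are all assumptions.

```python
import json
from pathlib import Path
from urllib.parse import urlsplit, urlunsplit


def article_id(url: str) -> str:
    """Normalize a URL into an ID (hypothetical rule: drop query/fragment)."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")  # treat /a and /a/ as the same article
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, "", ""))


def load_seen(state_path: Path) -> set[str]:
    """Return previously notified IDs, or an empty set on first run."""
    if not state_path.exists():
        return set()
    return set(json.loads(state_path.read_text())["seen"])  # assumed layout


def filter_new(urls: list[str], seen: set[str]) -> list[str]:
    """Keep only unseen URLs; each article is reported once per run."""
    fresh = []
    for url in urls:
        aid = article_id(url)
        if aid not in seen:
            seen.add(aid)  # a second venue listing the same article is skipped
            fresh.append(url)
    return fresh
```

Dropping the query string is a simplification that would merge distinct articles on sites that identify posts by query parameter, so a real implementation may need a more careful rule.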
# 1. Install dependencies
pip install requests pyyaml python-dotenv beautifulsoup4
# 2. Copy and edit the config
cp authors.yaml.example authors.yaml
$EDITOR authors.yaml
# 3. Copy and edit credentials
cp .env.example .env
$EDITOR .env
# 4. Dry-run to verify article detection without sending anything
python watcher.py --dry-run --verbose
# 5. Run for real
python watcher.py

Not sure where an author publishes? Run the discovery tool:
# Show up to 12 new venues where "Michael Doran" has published in the last year
python discover.py "Michael Doran"
# Show 20 results, search back 2 years
python discover.py "Michael Doran" --limit 20 --year 2
# Search then interactively choose which venues to add to authors.yaml
python discover.py "Michael Doran" --add
# Machine-readable JSON output
python discover.py "Michael Doran" --json

The tool:
- Runs multiple targeted web searches for the author's recent work
- Deduplicates results by domain
- Probes each new domain for RSS autodiscovery
- Shows already-watched venues (marked ✓) and new ones (numbered)
- With --add: lets you type numbers to select venues, then appends them to authors.yaml automatically, using RSS if found and HTML otherwise
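The RSS autodiscovery probe mentioned in the list above usually means looking for link tags with rel="alternate" in a page's HTML. The sketch below shows that idea using only the standard library. The names FeedLinkParser and discover_feeds are hypothetical, and discover.py may well do this differently.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Collect href values of <link rel="alternate"> tags that advertise a feed."""

    def __init__(self) -> None:
        super().__init__()
        self.feeds: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {key: (value or "") for key, value in attrs}
        rel = attr.get("rel", "").lower().split()
        is_feed = attr.get("type", "").lower() in FEED_TYPES
        if "alternate" in rel and is_feed and attr.get("href"):
            self.feeds.append(attr["href"])


def discover_feeds(page_url: str, html: str) -> list[str]:
    """Return absolute feed URLs advertised by the given HTML document."""
    parser = FeedLinkParser()
    parser.feed(html)
    return [urljoin(page_url, href) for href in parser.feeds]
```

The HTML would come from an ordinary GET request to the candidate domain; fetching is left out so the sketch stays free of network calls.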
For best results, add a free search API key to .env:
# Serper.dev (2,500 free searches/month) — https://serper.dev
SERPER_API_KEY=your_key_here
# OR Brave Search API — https://api.search.brave.com
BRAVE_API_KEY=your_key_here
Without an API key the tool falls back to DuckDuckGo HTML scraping, which is rate-limited after a few queries. A single Serper free account is more than sufficient for occasional discovery runs.
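The fallback behaviour described above could be expressed roughly like this. The function name pick_search_backend and the preference for Serper over Brave when both keys are set are assumptions; only the environment variable names come from this README.

```python
import os


def pick_search_backend(env=None) -> str:
    """Choose a search backend from the configured API keys."""
    env = os.environ if env is None else env
    if env.get("SERPER_API_KEY"):
        return "serper"
    if env.get("BRAVE_API_KEY"):
        return "brave"
    # No key configured: fall back to rate-limited DuckDuckGo HTML scraping.
    return "duckduckgo"
```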
Note on false positives: web search results may include different people who share the author's name. Always review the sample titles/URLs before adding a venue.
authors:
  - name: "Jane Author"            # Display name in notifications
    match_names:                   # Name variants to match in RSS text
      - "Jane Author"
      - "J. Author"
    venues:
      # Option 1: RSS feed (simplest, most reliable)
      - type: rss
        url: "https://example.com/rss.xml"
      # Option 2: HTML author page with CSS selectors
      - type: html
        url: "https://example.com/experts/jane-author"
        item_selector: ".article-card"   # Container for each article
        date_selector: ".article-date"   # Date within container (optional)
        # Only include cards that link to this author's profile page
        # (useful on multi-author pages where articles from other authors
        # appear in the same list)
        author_link_contains: "/jane-author"

See authors.yaml.example for more patterns.
| Field | Type | Description |
|---|---|---|
| name | str | Author display name used in notifications |
| match_names | list[str] | Substrings to match in RSS author/title/content fields |
| venues[].type | rss or html | Venue type |
| venues[].url | str | RSS feed URL or HTML page URL |
| venues[].item_selector | str | CSS selector for article cards (HTML only) |
| venues[].title_selector | str | CSS selector for title within a card (HTML only, optional) |
| venues[].link_selector | str | CSS selector for the link within a card (HTML only, optional) |
| venues[].date_selector | str | CSS selector for date within a card (HTML only, optional) |
| venues[].author_link_contains | str | Filter cards by author link substring (HTML only, optional) |
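To make the match_names field concrete, here is a minimal sketch of substring matching against an RSS entry. It assumes the entry has already been parsed into a dict with author, title, and content keys, and it matches case-insensitively. Both are assumptions for illustration, and matches_author is a hypothetical name.

```python
def matches_author(entry: dict, match_names: list[str]) -> bool:
    """Return True if any name variant appears in the entry's text fields."""
    fields = ("author", "title", "content")
    haystack = " ".join(str(entry.get(key, "")) for key in fields).lower()
    return any(name.lower() in haystack for name in match_names)
```

Because this is plain substring matching, short variants such as initials can match unrelated text, so longer and more specific variants are the safer choice.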
Copy .env.example to .env and fill in at least one output:
| Variable | Required | Description |
|---|---|---|
| TELEGRAM_BOT_TOKEN | One of these | Bot token from @BotFather |
| TELEGRAM_CHAT_ID | If using Telegram | Your chat/user ID (get it from @userinfobot) |
| READWISE_TOKEN | One of these | Readwise API token from readwise.io/access_token |
| SERPER_API_KEY | Optional | Serper.dev key for richer venue discovery |
| BRAVE_API_KEY | Optional | Brave Search API key for venue discovery |
At least one of TELEGRAM_BOT_TOKEN or READWISE_TOKEN must be set. Both can be active simultaneously.
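The rule above can be checked at startup with something like the following sketch. The function name enabled_outputs and the error messages are hypothetical; the variable names and the requirement itself come from the table.

```python
def enabled_outputs(env) -> list[str]:
    """Return the configured outputs, or raise if none is usable."""
    outputs = []
    if env.get("TELEGRAM_BOT_TOKEN"):
        if not env.get("TELEGRAM_CHAT_ID"):
            raise ValueError("TELEGRAM_CHAT_ID is required when using Telegram")
        outputs.append("telegram")
    if env.get("READWISE_TOKEN"):
        outputs.append("readwise")
    if not outputs:
        raise ValueError("Set TELEGRAM_BOT_TOKEN or READWISE_TOKEN in .env")
    return outputs
```

Passing os.environ (after loading .env) as env gives the real check; a plain dict works for tests.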
When a new article is found, the Telegram bot sends a message like:
✍️ Jane Author published on example.com
Article Title Here
📅 2026-05-30
Articles are saved to your Readwise Reader inbox via the Reader API and automatically tagged with the author's name (e.g. jane-author). Readwise fetches and parses the full article content.
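As a rough sketch of the save step, the snippet below derives a tag like jane-author from the display name and prepares a request for the Reader API's document save endpoint. The helper names and the exact slug rule are assumptions; the project may build its tags differently.

```python
import re

READER_SAVE_URL = "https://readwise.io/api/v3/save/"


def author_tag(name: str) -> str:
    """Turn a display name into a lowercase, hyphen-separated tag."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def reader_request(token: str, article_url: str, author: str) -> tuple[dict, dict]:
    """Return the headers and JSON payload for saving one article to Reader."""
    headers = {"Authorization": f"Token {token}"}
    payload = {"url": article_url, "tags": [author_tag(author)]}
    return headers, payload
```

The pair would then be sent with requests.post(READER_SAVE_URL, headers=headers, json=payload, timeout=10).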
Run watcher.py on a schedule using any method you prefer:
cron (e.g. twice daily at 6am and 6pm):
0 6,18 * * * cd /path/to/author-watch && python watcher.py >> watcher.log 2>&1
systemd timer, launchd (macOS), or any task scheduler also work.
state.json tracks all article IDs that have been notified. It is created automatically on first run. To re-trigger notifications for all currently visible articles (useful for testing), delete it:
rm state.json
python watcher.py --dry-run # preview what would fire

python watcher.py [options]
Options:
--config PATH Path to authors.yaml (default: ./authors.yaml)
--state PATH Path to state.json (default: ./state.json)
--dry-run Check for new articles without sending any notifications
--verbose / -v Enable DEBUG logging
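For readers wrapping or extending the script, the watcher options listed above map naturally onto argparse. This is an illustrative sketch built from the documented flags and defaults, not a copy of watcher.py's actual parser.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented watcher.py options."""
    parser = argparse.ArgumentParser(prog="watcher.py")
    parser.add_argument("--config", metavar="PATH", default="./authors.yaml",
                        help="Path to authors.yaml")
    parser.add_argument("--state", metavar="PATH", default="./state.json",
                        help="Path to state.json")
    parser.add_argument("--dry-run", action="store_true",
                        help="Check for new articles without sending notifications")
    parser.add_argument("--verbose", "-v", action="store_true",
                        help="Enable DEBUG logging")
    return parser
```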
python discover.py AUTHOR [options]
Options:
--limit N Max new venues to return (default: 12)
--year N How many years back to search (default: 1)
--no-rss-probe Skip RSS feed probing (faster)
--add Interactive: choose venues to add to authors.yaml
--json Output results as JSON
--config PATH Path to authors.yaml (default: ./authors.yaml)
--verbose / -v Enable DEBUG logging
pip install pytest
python -m pytest test_watcher.py -v

Tests use mocked HTTP, so no real network calls or notifications are made.
| Publication type | Recommended approach |
|---|---|
| Substack | RSS: https://AUTHOR.substack.com/feed |
| Medium | RSS: https://medium.com/feed/@USERNAME |
| WordPress blog | RSS: usually /feed or /rss.xml |
| Hudson Institute | HTML + item_selector: ".research-card" + author_link_contains: "/EXPERT-ID" |
| The Free Press | HTML author page: https://www.thefp.com/w/AUTHOR-SLUG |
| Generic think tank / magazine | HTML author page with appropriate CSS selectors |
RSS is always preferred when available — it's faster and more reliable than HTML scraping.
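The feed URL patterns in the table can be turned into small helpers when filling in authors.yaml. The function names below are hypothetical, and the WordPress paths are only the usual candidates named in the table, not a guarantee that a given site serves them.

```python
def substack_feed(subdomain: str) -> str:
    """Feed URL for a Substack publication, given its subdomain."""
    return f"https://{subdomain}.substack.com/feed"


def medium_feed(username: str) -> str:
    """Feed URL for a Medium user; a leading @ is accepted and normalized."""
    return f"https://medium.com/feed/@{username.lstrip('@')}"


def wordpress_feed_candidates(base_url: str) -> list[str]:
    """Common feed paths to try on a WordPress site, in order."""
    base = base_url.rstrip("/")
    return [f"{base}/feed", f"{base}/rss.xml"]
```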
- requests: HTTP
- pyyaml: config parsing
- beautifulsoup4: HTML scraping (falls back to regex if not installed)
- python-dotenv: .env file loading (optional)