RFC-002 — RSS / Atom subscription management
- Status: Draft
- Authors: @yiidtw
- Created: 2026-04-21
- Related: RFC-001 (feature flags, `amem rss setup`)
TL;DR
Add a first-class subscription layer on top of the existing capture pipeline. Users run `amem sub add <feed>` to follow a source (an arXiv category, a blog, a YouTube channel); a polling loop fetches the feed, dedups by GUID, and routes each new item through the existing capture + (optional) compile flow. MCP exposes `amem_sub_add` / `amem_sub_list` so agents can manage the user’s reading queue.
Motivation
amem’s capture flow is reactive: it only runs when a human (or agent) hands it a URL. That makes it useless for tracking ongoing sources:
- Following arXiv `cs.CL` as new papers drop
- Karpathy / Willison / LessWrong blog posts
- A YouTube channel’s new uploads (YouTube publishes per-channel RSS natively)
Every knowledge worker we’ve talked to does some version of this manually today — Feedly/NetNewsWire for reading, then copy-paste URLs into whatever capture tool they use. amem can collapse both steps.
This also inverts the self-recording story: amem is designed for you to produce content into. RSS lets other people’s content flow in on the same rails, so the wiki grows continuously rather than only after active capture.
Proposal
1. Subscription storage
# ~/.amem/subscriptions.toml
version = 1
[[subscription]]
id = "arxiv-cs-cl"
url = "http://export.arxiv.org/rss/cs.CL"
title = "arXiv cs.CL (Computation and Language)"
auto_compile = false # capture-only by default; compile is opt-in
poll_minutes = 60
enabled = true
added_at = "2026-04-21T00:00:00Z"
last_polled = "2026-04-21T00:30:00Z"
[[subscription]]
id = "3b1b-channel"
url = "https://www.youtube.com/feeds/videos.xml?channel_id=UCYO_jab_esuFRV4b17AJtAw"
title = "3Blue1Brown"
auto_compile = true # small channel, OK to auto-transcribe
poll_minutes = 240
enabled = true
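The config above maps naturally onto an in-memory record. A minimal Rust sketch, with field names taken from the TOML keys; the struct name, constructor, and defaults are illustrative assumptions, not amem’s actual types:

```rust
/// Sketch of a subscription record mirroring ~/.amem/subscriptions.toml.
/// Names and defaults are illustrative, not amem's real types.
#[derive(Debug, Clone, PartialEq)]
struct Subscription {
    id: String,
    url: String,
    title: String,
    auto_compile: bool, // capture-only unless opted in
    poll_minutes: u64,
    enabled: bool,
}

impl Subscription {
    /// New subscription with the defaults this RFC proposes:
    /// capture-only, hourly polling, enabled.
    fn new(id: &str, url: &str, title: &str) -> Self {
        Subscription {
            id: id.to_string(),
            url: url.to_string(),
            title: title.to_string(),
            auto_compile: false,
            poll_minutes: 60,
            enabled: true,
        }
    }
}
```

`auto_compile` defaulting to false keeps the expensive path (transcription, compilation) opt-in per feed, matching the comments in the TOML example.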
2. Dedup ledger
~/.amem/subscriptions/
ledger.jsonl # append-only, one JSON object per seen item
state/{sub_id}/last_etag # HTTP caching
Each ledger line:
{"sub_id":"3b1b-channel","guid":"yt:video:aircAruvnKk","captured_at":"2026-04-21T00:30:00Z","cite_key":"3blue1brown2017neural"}
Dedup is GUID-based. If a feed republishes an item (edit, repost), the existing capture wins; we don’t re-download.
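The dedup check itself is small once the ledger is loaded into memory. A sketch keyed by (sub_id, guid), so the same GUID can legitimately appear under two different feeds; the `Ledger` type and method names are assumptions (the real ledger also persists `captured_at` / `cite_key`):

```rust
use std::collections::HashSet;

/// In-memory view of ledger.jsonl: one (sub_id, guid) pair per seen item.
/// Illustrative sketch; persistence to the append-only JSONL file is omitted.
struct Ledger {
    seen: HashSet<(String, String)>,
}

impl Ledger {
    fn new() -> Self {
        Ledger { seen: HashSet::new() }
    }

    /// Returns true if the item is new (and records it); false for a
    /// duplicate, in which case the existing capture wins and we skip it.
    fn record(&mut self, sub_id: &str, guid: &str) -> bool {
        self.seen.insert((sub_id.to_string(), guid.to_string()))
    }
}
```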
3. CLI
amem sub add <url> [--auto-compile] [--poll-minutes N] [--title "..."]
amem sub list [--json]
amem sub remove <id>
amem sub enable|disable <id>
amem sub fetch [<id>] # one-shot poll, honours etag
amem sub daemon # long-running poller (used by service unit)
amem rss setup # install daemon (macOS launchd / Linux systemd user)
4. Poll algorithm
For each enabled sub whose now - last_polled >= poll_minutes:
- GET feed with `If-None-Match: {last_etag}` and `If-Modified-Since: {last_polled}`
- 304 → update `last_polled`, skip
- 200 → parse via `feed-rs`, iterate items
- For each item not in ledger:
  - Route to existing `cite::cmd_capture(item.link)` (auto-picks arxiv / PDF / YouTube based on URL)
  - If `auto_compile = true` → also call the appropriate `cmd_compile`
  - Write ledger line
- Update ledger + state
Failures per-item don’t block the rest of the feed. Aggregate failures re-queue with exponential backoff (15 min → 2 h cap).
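The scheduling arithmetic above reduces to two pure functions. A sketch in minutes, with the 15 min → 2 h doubling schedule from this section; the function names are assumptions:

```rust
/// A subscription is due when at least poll_minutes have elapsed since
/// the last poll. saturating_sub guards against clock skew going negative.
fn is_due(now_min: u64, last_polled_min: u64, poll_minutes: u64) -> bool {
    now_min.saturating_sub(last_polled_min) >= poll_minutes
}

/// Exponential backoff for a feed whose whole poll failed:
/// 15 min, 30 min, 60 min, then capped at 120 min (2 h).
fn backoff_minutes(consecutive_failures: u32) -> u64 {
    let base: u64 = 15;
    base.saturating_mul(1u64 << consecutive_failures.min(3)).min(120)
}
```

Keeping these pure makes the daemon loop trivially testable: it only has to feed in clock readings and failure counts.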
5. MCP surface
amem_sub_add(url, auto_compile?) -> sub_id
amem_sub_list() -> [{ id, title, last_polled, enabled, ... }]
amem_sub_remove(id)
amem_sub_fetch(id?) -> {fetched: N, captured: M, errors: [...] }
This lets an agent maintain its own research feed without a human in the loop: “follow every arxiv paper that cites Vaswani 2017” becomes a single MCP call.
6. Gated behind features.rss
Disabled by default. amem rss setup enables it, installs the daemon, and writes features.rss = true to ~/.amem/config.toml (per RFC-001).
Non-goals
- Rich reader UI. amem is not Feedly. Reading lives in the wiki + `amem recall`. If people want visual unread counts, that belongs in an extension page, not the core.
- OPML import on day 1. Easy add later; skip for MVP to keep surface small.
- Arbitrary cron-style scheduling. `poll_minutes` is enough; cron-syntax scheduling is out of scope.
- Podcast audio-only feeds. These would need whisper anyway — treat them as RFC-002b when the YouTube pipeline is stable on more models.
Risks
| Risk | Mitigation |
|---|---|
| A popular arxiv category fills disk (dozens of papers/day) | poll_minutes default 120 + per-feed disk quota + user confirmation on first-time auto_compile = true |
| Feed publisher rate-limits us | Honour Retry-After, respect 429; back off to 6 h for repeat offenders |
| Duplicate captures when arxiv updates a paper’s version | Keep first ingest; subsequent versions append a note to the existing wiki entry rather than creating a new cite_key |
| RSS spec is loose — malformed feeds break parser | feed-rs handles common variants; log + skip malformed entries, do not abort the poll |
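The “log + skip” mitigation in the last row is the same per-item failure isolation the poll algorithm requires. A sketch with a pluggable capture function standing in for `cite::cmd_capture`; the function name, error type, and return shape are illustrative assumptions:

```rust
/// Process every item link from a fetched feed, skipping failures instead
/// of aborting the poll. `capture` stands in for the real capture call.
fn process_items<F>(links: &[&str], mut capture: F) -> (usize, Vec<String>)
where
    F: FnMut(&str) -> Result<(), String>,
{
    let mut captured = 0;
    let mut errors = Vec::new();
    for link in links.iter().copied() {
        match capture(link) {
            Ok(()) => captured += 1,
            // Log + skip: one malformed or failing entry never blocks the rest.
            Err(e) => errors.push(format!("{link}: {e}")),
        }
    }
    (captured, errors)
}
```

Returning the error list (rather than bailing on the first `Err`) also gives `amem sub fetch` its `{fetched, captured, errors}` report shape for free.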
Concrete work
- Rust crate additions: `feed-rs = "2"`, `toml_edit = "0.22"` (config writes preserve comments) (amem-sh)
- `amem sub` subcommand family (amem-sh)
- `amem rss setup` installer and `amem sub daemon` long-runner (amem-sh)
- MCP tools (amem-sh)
- SPEC.md: add `subscription` to the storage layout section (amem-hq)
- Docs: new `guide/subscriptions.md` page (amem-hq)
Extension UI for managing subscriptions is deferred — CLI first.