Problem and Motivation
I bookmark a lot on X. Technical threads, Go deep-dives, system design explainers, random debugging tips. Over time it turns into hundreds of posts with zero organization. X gives you a chronological list and nothing else. No search, no filtering, no way to tell learning content apart from memes I saved at 2am.
The bigger problem is that bookmarking something doesn't mean I learned it. I'll save a 15-tweet thread on Go channel internals, never look at it again, and six months later I can't remember any of it. I wanted a system that actually tracks what I've studied and resurfaces material on a schedule, not just a better bookmark viewer.
Architecture Overview
It's a monorepo with a Go backend and a React TypeScript frontend. The backend handles everything: OAuth with X, bookmark syncing on a cron schedule, media downloading, LLM classification, thread expansion, and serving the SPA. The frontend builds to backend/web/dist and gets compiled into the Go binary via go:embed, so the whole thing is a single executable. No separate frontend server, no reverse proxy, just ./mnemos and you're running.
The sync pipeline has a few moving parts. On each tick, SyncService.Run grabs the last synced bookmark ID from SQLite, probes the X API with max_results=1 to check for new content, and if the top ID changed, paginates through everything new. Each page gets persisted immediately with a resume token, so if the process crashes mid-sync you don't re-fetch from scratch. Media gets downloaded in the same pass, photos and videos stored to ./data/media/{post_id}/ with paths recorded in the bookmark's JSON column. After sync, unclassified bookmarks get batched (up to 20 at a time) and sent to the Claude API for topic and intent tagging.
Bookmark fetching is abstracted behind a BookmarkSource interface:
```go
type BookmarkSource interface {
	FetchNew(ctx context.Context, sinceID string) ([]Bookmark, error)
}
```
The X implementation is the only source right now, but the interface keeps the door open for Bluesky or whatever comes next without touching the sync logic.
Key Technical Decisions and Tradeoffs
SQLite with modernc.org/sqlite instead of CGO. I wanted a single static binary, and the standard mattn/go-sqlite3 driver requires CGO which means cross-compilation gets annoying fast. modernc.org/sqlite is a pure Go translation of the SQLite C source. It's slower, maybe 2-3x on write-heavy workloads, but for a single-user app writing a handful of bookmarks per sync cycle that's completely irrelevant. Worth it for go build just working everywhere.
Paged sync with resume tokens. The initial sync can pull up to 800 bookmarks (X's hard cap). That's 8 pages of API calls, media downloads, and DB inserts. If the process dies halfway through, I didn't want to start over. So each page callback persists a resume token to sync_metadata, and on restart the sync picks up from the last checkpoint. Probably overkill for a personal tool, but I got bitten by a crash during the first full sync while testing and added it immediately after.
```go
fetchedNewestID, fetchErr := source.FetchNewPages(ctx, lastSyncedID, resumeToken, 0,
	func(bookmarks []models.Bookmark, nextToken string) error {
		for _, bookmark := range bookmarks {
			if err := s.db.InsertBookmark(bookmark); err != nil {
				return fmt.Errorf("persist bookmark %q: %w", bookmark.ID, err)
			}
		}
		return s.db.SetSyncResumeToken(nextToken)
	})
```
Claude API for classification, not local models. I considered running a local classifier, but the quality would have been noticeably worse for topic tagging, and the API cost is negligible: a batch of 20 bookmarks costs maybe a fraction of a cent. The system prompt constrains output to a fixed taxonomy loaded from YAML, and the response parsing is defensive enough to handle the LLM occasionally returning unknown topics or invalid intents (it drops unknown topics and defaults invalid intents to "other").
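A hedged sketch of that validation layer. The taxonomy values below are made up for the example; the real ones come from the YAML file.

```go
package main

import "fmt"

// Allowed taxonomy; in the app this is loaded from YAML. Values here
// are invented for illustration.
var allowedTopics = map[string]bool{"go": true, "databases": true, "distributed-systems": true}
var allowedIntents = map[string]bool{"learn": true, "reference": true, "other": true}

// sanitize enforces the taxonomy on whatever the LLM returned:
// unknown topics are dropped, invalid intents fall back to "other".
func sanitize(topics []string, intent string) ([]string, string) {
	var kept []string
	for _, t := range topics {
		if allowedTopics[t] {
			kept = append(kept, t)
		}
		// the real code also logs dropped topics here
	}
	if !allowedIntents[intent] {
		intent = "other"
	}
	return kept, intent
}

func main() {
	topics, intent := sanitize([]string{"go", "webdev3000"}, "doomscroll")
	fmt.Println(topics, intent) // [go] other
}
```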
FTS5 for full-text search. SQLite's FTS5 extension handles the search feature. I set up triggers on the bookmarks table so the FTS index stays in sync automatically on insert, update, and delete. It's not elasticsearch, but for searching through a few hundred bookmarks by text, author, or URL it's instant and requires zero additional infrastructure.
Screenshots and Video
No screenshots yet. Planning to capture the bookmark list with topic chips and intent badges, the thread expansion view, and the topic filtering panel.
Tech Stack with Rationale
- Go for the backend because I needed in-process cron scheduling (robfig/cron), goroutines for the sync pipeline, and a single binary output. Also just the language I'm fastest in. The standard library HTTP server handles routing and the SPA serving without needing a framework.
- React 19 + TypeScript for the frontend. It's a straightforward SPA with a bookmark list, topic chips, search bar, and thread expansion. Nothing fancy enough to need a meta-framework, so it's just Vite + React Router.
- SQLite because there's no reason to run Postgres for a single-user local app. The pure Go driver (modernc.org/sqlite) means no CGO and the whole thing compiles to one binary. FTS5 handles search. JSON columns handle the flexible schema parts (media metadata, topic arrays, raw API responses).
- Claude API (Anthropic Messages API) for bookmark classification. Sends batches of bookmark text with a system prompt specifying the allowed taxonomy, gets back structured JSON with topic tags and intent labels. Cost is somewhere around $0.10-0.50/month.
- X API v2 with OAuth 2.0 PKCE for bookmark access. The offline.access scope is critical for background sync because without it tokens expire in 2 hours with no refresh token. Thread expansion uses the search API with conversation_id queries.
- Vite for frontend builds. Fast, good TypeScript support, and --watch mode feeds into the air hot-reload setup for full-stack development.
Challenges and Learnings
- The X API's pay-per-use tier launched in January 2026 and it's been shaky. There are active reports of 401 errors on the bookmarks endpoint, and I hit them during development. The fallback is the Basic tier at $200/month, which is absurd for a personal tool, so I'm hoping the pay-per-use tier stabilizes.
- X caps bookmark access at 800. If you have more bookmarks than that, the oldest ones are just gone. I didn't realize this until I read the API docs more carefully. The mitigation is "sync early, sync often" since once a bookmark is in your local DB it's there forever.
- The incremental sync has a surprising number of edge cases. What if a bookmark gets deleted between polls? What if the API returns a different ordering? What if the resume token becomes stale after a schema change? I ended up with a selectNewestBookmarkID function that compares by bookmarked_at, then created_at, then raw ID as a tiebreaker, because relying on array position wasn't reliable.
- Getting OAuth PKCE right with X took longer than I expected. The callback server spins up a temporary HTTP listener on 127.0.0.1:8080/oauth/callback, waits for the redirect, exchanges the code with a PKCE verifier, and shuts down. It works, but the flow of "open browser, authorize, callback hits local server, exchange code, persist token" has a lot of moving pieces that can fail quietly.
- The classification system prompt needs to be pretty explicit about output format. Early versions of the prompt let the LLM return markdown-wrapped JSON (```json blocks), so the parser has to strip the fences before parsing. It also sometimes returns topic strings that don't match the taxonomy exactly, so there's a validation layer that drops unknown topics and logs them.
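The fence-stripping is simple enough to sketch. This is a guess at the shape of that parser, not the project's actual code:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// stripFences removes a leading ```json (or bare ```) fence and the
// trailing ``` so the payload can be unmarshalled directly. Plain JSON
// passes through untouched.
func stripFences(s string) string {
	s = strings.TrimSpace(s)
	if strings.HasPrefix(s, "```") {
		s = strings.TrimPrefix(s, "```json")
		s = strings.TrimPrefix(s, "```")
		s = strings.TrimSuffix(strings.TrimSpace(s), "```")
	}
	return strings.TrimSpace(s)
}

func main() {
	raw := "```json\n{\"topics\": [\"go\"], \"intent\": \"learn\"}\n```"
	var out struct {
		Topics []string `json:"topics"`
		Intent string   `json:"intent"`
	}
	if err := json.Unmarshal([]byte(stripFences(raw)), &out); err != nil {
		panic(err)
	}
	fmt.Println(out.Topics, out.Intent) // [go] learn
}
```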