The Model on Your Machine

The default for AI in journalism has been to send sensitive queries through someone else's model, with logs you can't see, under terms that can change. For a lot of stories that's fine. For the ones that need to stay closed (sources, locations, reporting in progress), it meant doing things the old way.

Not anymore.

I fine-tuned two Gemma4 models for journalism last week. Over 4,000 downloads so far, and gemma4-e4b-journalist runs on a laptop, no API calls. The training set includes 700 reasoning chains designed by Opus, grounded in publicly available investigative methodology from UNESCO, GIJN, Bellingcat, Indicator, CiFAR, Al Jazeera, and SPJ. The results are wild, and fast.

A weekend and a MacBook can do what used to be impossible. The next logical step? Migrating all systems and tools to sovereign models, hosted in Switzerland, and building on them in the future.

The stack

Three tools shipped into the open this month. Two into public beta, one open-sourced as a Claude Code plugin.

Navigator

OSINT Navigator is live: 7,500+ tools aggregated from nine open-source toolkits, updated weekly and refined by community votes. A WebGPU toggle downloads the fine-tuned LLM into the browser, so sensitive queries never leave the device. API-ready for agents. Built with Craig Silverman and Alexios Mantzarlis at Indicator.

Scoutpost

Scoutpost, the successor to Klaxon, is in public beta under MuckRock. Scouts track web pages, locations, beats, social profiles, and council meetings. They extract political promises with due dates and ping you when those dates arrive. Full API, self-hosted license, open source under Sustainable Use.
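
As a rough illustration of the "promises with due dates" idea, and not Scoutpost's actual code or schema, here is a minimal Python sketch. The Promise record, the due_promises helper, and the sample entries are all hypothetical and made up for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Promise:
    # Hypothetical record shape, invented for this sketch.
    text: str
    source: str
    due: date


def due_promises(promises, today):
    """Return the promises whose due date has arrived, oldest first."""
    hits = [p for p in promises if p.due <= today]
    return sorted(hits, key=lambda p: p.due)


# Made-up sample data, only to exercise the function.
promises = [
    Promise("Repave Main Street", "council meeting", date(2025, 3, 1)),
    Promise("Publish budget audit", "council meeting", date(2025, 9, 1)),
]

for p in due_promises(promises, today=date(2025, 6, 1)):
    print(f"DUE: {p.text} ({p.source}, {p.due.isoformat()})")
```

The extraction step (getting a promise and a date out of a meeting transcript) is the hard part and is not shown here; this only covers the check that decides when to ping.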

Spotlight

I open-sourced Spotlight — my Claude Code plugin for OSINT. Investigation and fact-checking sub-agents loop in cycles, pausing at gates for your approval. Firecrawl, Exa, or Tavily ground the research; evidence stays local. Install from the Claude Code marketplace.

Model-agnostic this week, sovereign in May.
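
For readers who want to picture the cycle-and-gate pattern, here is a minimal Python sketch under stated assumptions. It is not Spotlight's implementation: run_cycles, the stub investigate and fact_check functions, and the scripted approve callback are hypothetical stand-ins for sub-agents and a human reviewer.

```python
def run_cycles(investigate, fact_check, approve, max_cycles=3):
    """Run investigate -> fact-check cycles, stopping at the first rejected gate.

    investigate(cycle, findings) returns draft claims for this cycle.
    fact_check(draft) returns the claims that survived checking.
    approve(cycle, checked) is the gate: a human (or test stub) says yes or no.
    """
    findings = []
    for cycle in range(1, max_cycles + 1):
        draft = investigate(cycle, findings)
        checked = fact_check(draft)
        if not approve(cycle, checked):
            break  # the reviewer declined, so nothing from this cycle is kept
        findings.extend(checked)
    return findings


# Stub sub-agents, invented for the example.
def investigate(cycle, findings):
    return [f"claim-{cycle}a", f"claim-{cycle}b"]


def fact_check(draft):
    # Pretend only the "a" claims could be verified.
    return [claim for claim in draft if claim.endswith("a")]


def approve(cycle, checked):
    # Scripted reviewer: approves the first two cycles, rejects the third.
    return cycle < 3


print(run_cycles(investigate, fact_check, approve))
```

In an interactive setting the approve callback would prompt the reporter instead of returning a scripted answer; the point of the structure is that no cycle's output is accepted without passing the gate.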

Field notes

Two case studies this cycle.

Cleared — a collaboration model for creators and investigative media

Cleared investigates India's eviction campaign in Assam — 100,000+ displaced across 33 operations since 2021, reported with Ahmer Khan and The New Humanitarian.

Independent creators have production skills. Investigative newsrooms have editorial depth. The collaboration is natural.

The pilot split the work: TNH brought field reporting, editorial, and their audience; I brought visual production, the AI investigation pipeline, and video distribution. Same investigation, two outputs — longform scrollytelling on TNH, a 10-minute YouTube documentary with shorts across TikTok and Instagram on my side. A template for multiplying reach without duplicating effort.

AI archive enrichment for MediaStorm

A semantic search layer over MediaStorm's 20-year documentary archive, 350+ stories searchable by meaning rather than keyword.

Every media organization sits on an archive. AI turns it into the librarian that never forgets.
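
To make "searchable by meaning rather than keyword" concrete, here is a minimal Python sketch of embedding-based search. It assumes each story has already been turned into a vector by some embedding model; the three-dimensional vectors and story titles below are invented placeholders, not MediaStorm data, and the ranking uses plain cosine similarity rather than whatever the production system uses.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec, index, top_k=2):
    """Return the top_k titles whose vectors are closest to the query."""
    scored = [(cosine(query_vec, vec), title) for title, vec in index.items()]
    scored.sort(reverse=True)
    return [title for _, title in scored[:top_k]]


# Toy index: placeholder titles mapped to made-up embedding vectors.
index = {
    "Story A (displacement)": [0.9, 0.1, 0.0],
    "Story B (health care)": [0.1, 0.9, 0.1],
    "Story C (migration)": [0.8, 0.2, 0.1],
}

# A query vector that points toward the first dimension.
query = [1.0, 0.0, 0.0]
print(search(query, index))
```

A query about people forced from their homes can surface both A and C here even though they share no keyword, because the match is made on vector direction rather than on words.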

Worth your time

Tools

Exoscale — Swiss sovereign cloud. Dedicated inference in Zurich from EUR 1.34/hr per GPU.

Fireworks AI — fastest open-weight inference, fine-tuning, and embeddings in one provider.

Turnstone — Bellingcat's new open-source ADS-B flight-data explorer. Crawler and database, not a chat interface.

ImageWhisperer OCR — Henk van Ess' detector for AI-generated documents. Scans text for the model's "tells."

Netryx Astra V2 — open-source geolocation from a single photo. Pure computer vision, no LLMs.

OpenSanctions — 320+ sources aggregated into one sanctions, PEP, and criminal-watchlist database. Start here for any ownership trace.

Visual

The 47th News & Documentary Emmy nominations landed this month — the best snapshot of video journalism right now.

Newpress — Johnny and Iz Harris' creator-journalism studio, nominated for Outstanding Graphic Design: News. The first creator-journalism shop to sit next to Bloomberg Originals in that category. Their launch manifesto video below is the thesis: community-funded reporting, no algorithmic middlemen, journalists who own their audience.

Same ballot: Bellingcat pulled three nominations through co-productions.

Exit signal

Owning your stack handles the sovereignty problem. It doesn't handle the verification problem.

A Reuters Institute analysis argues that generative AI is breaking three of OSINT's foundational assumptions. Location and time no longer anchor authenticity — ChatGPT confidently geolocated AI-fabricated "CCTV" footage without spotting the fake. Verification is no longer replicable — Gemini reinterpreted a Kyiv missile strike as an LA fire because the prompt nudged it.

Even if the tools run on our machines, they can still lie to us there.

The journalists who adapt develop dual literacy (how generative models see, how they fail), combined with local expertise, documented steps, and human verification across independent sources. None of that is new. All of it is just becoming non-optional.

— Tom, with GLM 5.1.