Fine-tuning an LLM to write docs like it's 1995

Hacker News · Jun 5, 2026, 5:46 AM · Also reported by 1 other source

Key takeaways

  • In my predictions for 2030 I wrote that tech writers would be using specialized LLMs, running locally on powerful hardware.
  • To train a personal, local model to write like a technical writer from the 90s, one needs tons of written sources.
  • I downloaded the OCR’d text files and cleaned the content from artifacts and clutter (like indices and frontmatter) using good old Python scripts.

In my predictions for 2030 I wrote that tech writers would be using specialized LLMs, running locally on powerful hardware. I see hints of this move to “local first” among engineering pundits, but we’re not there yet, in part because of how much more powerful connected frontier models are. That doesn’t mean we can’t experiment, though. That’s precisely what I did last week, trying to fine-tune an instruct model to write like a software technical writer from the 80s and 90s.

To train a personal, local model to write like a technical writer from the 90s, one needs tons of written sources. If I wanted to fine-tune a model to write like myself, for example, this blog would not be enough, as it’s barely 100k words at the time of this post. You would need more samples for thorough training, and those are not easy to come by, nor simple to produce. The only quick way is to use an existing corpus. Where could I get one?

Meet Bitsavers: it’s a website that collects and scans old computer manuals and brochures. It’s an incredibly valuable repository of computer history and ancient tech writing, with mirrors available everywhere. As I’m fond of Microsoft manuals from the 90s, I chose the Microsoft collection as the source of training materials. The collection contains out-of-print docs published between 1977 and 2005: more than 37 million words, covering old systems and SDKs.
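The cleanup scripts themselves are not shown in this preview, so the following is only a rough sketch of what cleaning OCR'd manual text might look like. The function names (`clean_ocr_text`, `word_count`) and the heuristics (dropping bare page numbers and dot-leader table-of-contents lines, rejoining words hyphenated across line breaks) are assumptions for illustration, not the author's actual code.

```python
import re

# Hypothetical heuristics: the original cleanup scripts are not shown in the post.
PAGE_NUMBER = re.compile(r"^\s*(?:page\s+)?\d{1,4}\s*$", re.IGNORECASE)
DOT_LEADER = re.compile(r"\.{4,}\s*\d+\s*$")  # e.g. "Installing MS-DOS ........ 12"


def clean_ocr_text(text: str) -> str:
    """Drop common OCR clutter and rejoin words hyphenated across lines."""
    text = text.replace("\f", "\n")  # form feeds often mark scanned page breaks
    kept = []
    for line in text.splitlines():
        # Skip lines that are only a page number or a table-of-contents entry.
        if PAGE_NUMBER.match(line) or DOT_LEADER.search(line):
            continue
        kept.append(line.rstrip())
    joined = "\n".join(kept)
    # "com-\nputer" -> "computer". Caveat: this also merges genuine compounds
    # that happen to break at the hyphen, such as "MS-\nDOS".
    joined = re.sub(r"(\w)-\n(\w)", r"\1\2", joined)
    # Collapse runs of blank lines left behind by the removed clutter.
    joined = re.sub(r"\n{3,}", "\n\n", joined)
    return joined.strip()


def word_count(text: str) -> int:
    """Count whitespace-separated tokens, a rough proxy for corpus size."""
    return len(text.split())
```

Running `word_count` over each cleaned file and summing the results is one simple way to arrive at a corpus-size figure like the one quoted above, though the exact number depends on which heuristics are applied.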
