LLMs and almost good code
TL;DR: My new prior is that top-of-the-line LLMs working on easy tasks generate code that is maybe 10 % more complicated than necessary. I also think we accept this complexity too easily, because it comes from code that is right here, right now, solving an immediate problem. This may have consequences for maintenance in the long term.

(The text of the Less Wrong version of this article is lightly adjusted to fit a more general audience than my usual readership of software product developers.)

The background to this discovery was that I needed to do some software plumbing in a work project. It was a simple change that mostly mirrored existing functionality. This is a perfect fit for LLMs, in my experience, so I used a frontier model to generate the code for it. The change ended up being a total of just over 200 lines, mostly additions.

The part of the generated code we’ll talk about is a 24-line function that converts an arbitrary (user-supplied) string to a safe HTTP header value.[1]

    toHeaderValue :: Text -> Text
    toHeaderValue raw =
      let
        attrChars = "!#$&+-.^_`|~"
        padHex t =
          if Text.length t < 2 then "0" <> t else t
        percentEncode c =
          if (isAscii c && isAlphaNum c) || elem c attrChars
            then Text.singleton c
            else
              Text.concat
                [ "%" <> padHex (Text.toUpper (Text.pack (showHex b "")))
                | b <- ByteString.unpack (encodeUtf8 (Text.singleton c))
                ]
        rfc5987Encode = Text.concatMap percentEncode
        isPrintable c = c >= ' ' && c /= '\DEL'
        replacePathSeparator c =
          if c == '/' || c == '\\'
            then '_'
            else c
        cleaned =
          Text.map replacePathSeparator (Text.filter isPrintable raw)
      in
        rfc5987Encode cleaned

When looking at this function in isolation, it obviously seems a bit too complicated, but remember that this was just 24 lines in a 200-line change. I confirmed that the underlying idea was correct, and that the generated tests covered all the edge cases I would want to see covered. It’s not pretty code, but it is proven correct by tests.

More importantly, it is highly local.
If anything about this co