The Bitter Lesson: Why General Methods Win
Richard Sutton's short 2019 essay makes one claim and defends it with seventy years of evidence. Across game playing, speech recognition, and computer vision, the methods that won in the end were not the ones that encoded human knowledge about the problem. They were the general methods that kept improving as computation grew: search and learning.

Sutton walks through the cases. In computer chess, the 1997 Deep Blue result came from deep search, and the researchers who had bet on human chess understanding were left behind. In Go the pattern repeated, with self-play, search, and learning rather than handcrafted features. Speech recognition moved from hand-built linguistic rules to statistical methods and then to deep learning. Vision moved from hand-designed edge detectors to learned features.

This keeps happening, Sutton argues, because building in human knowledge feels good and helps in the short term, but the gains plateau and the built-in assumptions then block further progress. He calls the lesson bitter because researchers keep relearning it. The prescription is specific: build in the meta-methods of discovery, not the contents of what we think we have discovered. The original essay takes only a few minutes to read.
Why it matters
If you are deciding where to spend effort on an AI system, this is the argument for betting on scalable compute and general learning over clever domain encoding. It is why the large labs chose to scale rather than hand-engineer, and it still predicts which approaches will age badly.