AI benchmarks have been falling left and right, and I wanted to stop and take a somber moment to reflect on the benchmarks we’ve lost.
I also wanted to try out OpenAI’s new coding model, GPT-5-Codex, so here’s a video tribute I made in an hour using Codex. Codex wrote all the code, composed the music, and decided on the aesthetics, with me giving it some guidance. I could have continued to spruce it up, but an hour seemed like a good point to ship :)
Based on this experience, I think we are going to have a longer In Memoriam video next year.
Thanks to R0bk for providing most of the benchmark data:
https://r0bk.github.io/killedbyllm/