Notre Dame is back! (Not AI)

Think Differently. I do.

Bigger is not (that much) better. Probably. Future scaling unproven.
  • 2024 hasn’t shown much
  • ChatGPT-4 is again at the top of the benchmarks despite only modest improvements in performance. Llama improved as it grew to 70B and now 405B, but far less than proportionately. I’ve seen similar results from other LLMs.
  • The era of “exponential growth” in frontier LLMs seems to be behind us, though it’s not a plateau. Nearly everyone except Gary Marcus sees some improvement as the models get bigger, just less than the earlier predictions.
  • The results keep improving from better data and better methods. Progress in applications is rapid: accuracy is up, the hallucination rate is down, and hundreds of agents are reaching the market. Some will soon make a difference.
  • Many of the smartest people in the business, like Dario Amodei of Anthropic, predict the “scaling laws” will hold.
  • It’s possible the apparent slowdown is a temporary artifact. When OpenAI releases ChatGPT-5, it could be a major advance. Claude’s next Opus has potential. Meta and, I believe, three others are building models with ten times the computing effort of the current ones.
  • My best guess is that progress will continue, but at a lower rate. It’s right to add “probably” above, given the quality of the people convinced scaling gains will recover. The power-law sketch after this list shows why gains shrink even if the scaling laws hold.
  • The AI giants are investing over $200 billion this year and similar amounts in the next few years. Goldman and others think they’re crazy and the companies won’t see a decent return for many years.
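
To make the diminishing-returns argument concrete, here is a minimal sketch, assuming a Chinchilla-style power law relating loss to parameter count. The coefficients E, A, and ALPHA are invented for illustration, not fitted to any real model.

```python
# Illustrative only: a Chinchilla-style power law, loss(N) = E + A / N**ALPHA.
# E, A, and ALPHA are made-up coefficients, NOT fitted to any real model.
E, A, ALPHA = 1.7, 400.0, 0.34

def loss(n_params: float) -> float:
    """Hypothetical predicted loss for a model with n_params parameters."""
    return E + A / n_params**ALPHA

prev = None
for n in [7e9, 70e9, 405e9, 4e12]:  # 7B, 70B, 405B, and a hypothetical 4T
    cur = loss(n)
    gain = "" if prev is None else f"  (gain over previous: {prev - cur:.3f})"
    print(f"{n / 1e9:>6.0f}B params -> loss {cur:.3f}{gain}")
    prev = cur
```

Under any power law of this shape, each jump in scale buys a smaller absolute gain, which matches “still improving, but far less than proportionately.”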

Most people and companies are seeing few benefits from AI.

Fewer than 20% have seen important results, mostly people who write. That will change.

“Exponential growth” slowed after 2023. Only a modest improvement in top-model SOTA in 2024.

Much hope, but little proof it will get better quickly. Only incremental improvement in the top model, OpenAI’s ChatGPT-4.

All giant LLMs – from ChatGPT to Baidu’s Ernie to Meta’s Llama – have roughly similar performance. For most purposes, pick any.

Next week, the benchmark results will change and another LLM will leapfrog over your choice. Any difference of less than 5% or even 10% means little; the benchmarks don’t measure that precisely, as the sketch below shows.
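
Here is a minimal sketch of why small gaps mean little, assuming a hypothetical 1,000-question benchmark with independent questions; the accuracies for model_a and model_b are invented for illustration.

```python
import math

def accuracy_stderr(acc: float, n_questions: int) -> float:
    """Standard error of an accuracy measured over n independent questions."""
    return math.sqrt(acc * (1 - acc) / n_questions)

n = 1000                       # hypothetical benchmark size
model_a, model_b = 0.86, 0.83  # hypothetical measured accuracies

gap = model_a - model_b
noise = math.hypot(accuracy_stderr(model_a, n), accuracy_stderr(model_b, n))
print(f"gap = {gap:.3f}, ~95% noise band = +/-{1.96 * noise:.3f}")
# gap = 0.030, ~95% noise band = +/-0.032: the three-point lead sits inside
# the noise band, so this week's "leapfrog" can be pure sampling luck.
```

On a few-hundred-question benchmark the band is roughly twice as wide, so even a five-point lead can be statistical noise.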

Some may call me a heretic

The chip shortage should have ended already, based on TSMC & Samsung production capacity. Overordering for protection is why it seems to be dragging on. Nvidia is already cutting prices in some markets.

Chip shortage is over, says Sequoia. Many disagree, but availability of cloud server chips is very good.

For practical purposes, China & the U.S. are roughly equal. China has more graduates and patents, and Chinese researchers write more papers. The U.S. has perhaps two dozen giant language models, about twice the Chinese count. But since six or eight would be plenty, this makes little difference. My friend at Stanford considers Tsinghua a peer, along with Berkeley and MIT.

There will not be a singularity for AGI. It will not appear all at once but rather one skill at a time, with no firm date. AI is already better than people at many things; soon, more, but it is a long way from better at everything.