Zuck's new Llama is a beast

Meta's new Llama 3.1 took months to train on 16,000 Nvidia H100 GPUs, costing hundreds of millions of dollars and using enough electricity to power a small country. The result? A massive 405-billion-parameter model with a 128,000-token context length that reportedly outperforms OpenAI's GPT-4 and even beats Claude 3.5 Sonnet on some key benchmarks.

However, benchmarks can be deceptive. The only way to truly gauge a model's performance is by using it. In today's update, we'll dive into Llama 3.1 and see if it lives up to the hype.

Key Highlights:

- Llama 3.1 comes in three sizes: 8B, 70B, and 405B, where B refers to billions of parameters.

- More parameters can capture more complex patterns, but that alone doesn't guarantee a better model: GPT-4, for instance, is rumored to have over 1 trillion parameters, yet Llama 3.1's 405B reportedly goes toe to toe with it on key benchmarks.

- Llama 3.1 is kind of open source: you can use it freely unless your app has more than 700 million monthly active users, in which case you'll need a license from Meta.

- Meta hasn't fully disclosed the training data, but it likely includes diverse sources like blogs and GitHub repos, and possibly even your Facebook posts and WhatsApp messages.

- The training code is remarkably simple: just 300 lines of Python and PyTorch, using FairScale to distribute training across multiple GPUs (see the sketch after this list).

- The model weights are open, a significant advantage for developers wanting to build AI-powered apps without relying on GPT-4’s API.
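
Since the reference code leans on FairScale's model parallelism, here's a minimal sketch of what that setup looks like. It assumes a torchrun launch; the layer sizes and the `setup_model_parallel` helper are illustrative, not Meta's actual code.

```python
# Sketch of a FairScale model-parallel setup, launched with:
#   torchrun --nproc_per_node=<num_gpus> this_script.py
import os

import torch
import torch.distributed as dist
from fairscale.nn.model_parallel.initialize import initialize_model_parallel
from fairscale.nn.model_parallel.layers import ColumnParallelLinear, RowParallelLinear


def setup_model_parallel() -> int:
    """Start the NCCL process group and shard the model across all ranks."""
    world_size = int(os.environ["WORLD_SIZE"])  # set by torchrun
    local_rank = int(os.environ["LOCAL_RANK"])  # set by torchrun
    dist.init_process_group("nccl")
    initialize_model_parallel(world_size)
    torch.cuda.set_device(local_rank)
    return local_rank


if __name__ == "__main__":
    rank = setup_model_parallel()

    # A single feed-forward block whose weight matrices are split across GPUs:
    # ColumnParallelLinear shards the output dimension, RowParallelLinear the input.
    w1 = ColumnParallelLinear(4096, 14336, bias=False, gather_output=False).cuda()
    w2 = RowParallelLinear(14336, 4096, bias=False, input_is_parallel=True).cuda()

    x = torch.randn(1, 4096, device="cuda")
    y = w2(torch.nn.functional.silu(w1(x)))
    print(f"rank {rank}: output shape {tuple(y.shape)}")
```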

While self-hosting this model is expensive (the weights alone are 230GB, and even an RTX 4090 struggles to run it), platforms like Meta, Groq, and Nvidia's playground let you try it out for free.
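
If you'd rather hit the model from code than a playground, Groq also exposes an OpenAI-compatible API with a free tier. Here's a minimal sketch; the exact model id is an assumption, so check Groq's current model list, and you'll need a GROQ_API_KEY in your environment.

```python
# Query Llama 3.1 through Groq's OpenAI-compatible endpoint.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.groq.com/openai/v1",
    api_key=os.environ["GROQ_API_KEY"],
)

response = client.chat.completions.create(
    model="llama-3.1-70b-versatile",  # assumed id; 8B and 405B variants also exist
    messages=[
        {"role": "user", "content": "Summarize what a 128K context window lets you do."},
    ],
)
print(response.choices[0].message.content)
```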

Initial feedback suggests the smaller versions of Llama 3.1 are more impressive than the massive 405B model. The real power, however, lies in the ability to fine-tune these models on custom data, which could lead to some incredible uncensored models in the near future.
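
For a sense of what fine-tuning on your own data looks like, here's a rough LoRA sketch using Hugging Face's transformers and peft. The gated repo id, 4-bit quantization, and hyperparameters are my assumptions for squeezing the 8B model onto a single consumer GPU, not a recipe from Meta.

```python
# Rough LoRA fine-tuning sketch for Llama 3.1 8B (requires access to the gated
# "meta-llama/Meta-Llama-3.1-8B" repo on Hugging Face and a CUDA GPU).
import torch
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Meta-Llama-3.1-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),  # QLoRA-style memory savings
    device_map="auto",
)

# Freeze the 4-bit base weights and attach small trainable LoRA adapters
# to the attention projections.
model = prepare_model_for_kbit_training(model)
lora = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()

# One illustrative training step on a toy example; a real run would loop
# over your own dataset with a DataLoader.
optimizer = torch.optim.AdamW((p for p in model.parameters() if p.requires_grad), lr=2e-4)
batch = tokenizer("Q: What's our refund policy?\nA: 30 days, no questions asked.",
                  return_tensors="pt").to(model.device)
loss = model(**batch, labels=batch["input_ids"]).loss
loss.backward()
optimizer.step()
```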

Despite Llama 3.1's strengths, it still lags behind Claude on some tasks. For example, it struggled to build a Svelte 5 web application with runes, a new feature that only Claude 3.5 Sonnet handled correctly in a single shot. Llama 3.1 holds its own in other areas like creative writing and poetry, though it isn't the best I've seen.

It's fascinating that despite multiple companies training massive models on massive computers, they seem to be plateauing at similar levels of capability. OpenAI made a significant leap from GPT-3 to GPT-4, but subsequent advancements have been incremental.

Reflecting on the current state of AI, it seems we're far from achieving artificial superintelligence, which remains a concept largely confined to Silicon Valley imaginations. Meta, however, is keeping it real in the AI space, and Llama 3.1 represents a small step for man but a giant leap for Zuckerberg's redemption arc.

Thank you for reading, and stay tuned for more updates.