Zuck's new Llama is a beast

This powerhouse took months to train on 16,000 Nvidia H100 GPUs, costing hundreds of millions of dollars and using enough electricity to power a small country. The result? A massive 405-billion-parameter model with a 128,000-token context length that reportedly outperforms OpenAI’s GPT-4 and even beats Claude 3.5 Sonnet on some key benchmarks.

However, benchmarks can be deceptive. The only way to truly gauge a model's performance is by using it. In today's update, we’ll dive into LLaMA 3.1 and see if it lives up to the hype.
Key Highlights:
- LLaMA 3.1 comes in three sizes: 8B, 70B, and 405B, where B refers to billions of parameters.
- More parameters let a model capture more complex patterns, but a higher count doesn't guarantee a better model; GPT-4, for instance, is rumored to have over 1 trillion parameters.
- LLaMA 3.1 is kind of open source: you can use it freely unless your app has more than 700 million monthly active users, in which case you’ll need a license from Meta.
- The training data includes diverse sources like blogs, GitHub repos, and even Facebook posts and WhatsApp messages.
- The training code is remarkably simple: just 300 lines of Python and PyTorch, using Fairscale to distribute training across multiple GPUs (see the sketch after this list).
- The model weights are open, a significant advantage for developers wanting to build AI-powered apps without relying on GPT-4’s API.
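To give a sense of that simplicity, here's a minimal sketch of the pattern Meta's reference repo uses, not the actual training code: torch.distributed sets up the process groups, and Fairscale's model-parallel layers shard the big weight matrices across GPUs. It assumes a torchrun launch with one process per GPU and NCCL available.

```python
# A minimal sketch of the distribution pattern, NOT Meta's actual training
# code. torch.distributed handles process groups; Fairscale shards the
# weight matrices across GPUs. Assumes launch via torchrun, one process
# per GPU, with NCCL available.
import os

import torch
import torch.distributed as dist
import torch.nn.functional as F
from fairscale.nn.model_parallel.initialize import initialize_model_parallel
from fairscale.nn.model_parallel.layers import ColumnParallelLinear, RowParallelLinear


def setup(model_parallel_size: int) -> None:
    dist.init_process_group("nccl")              # one process per GPU (torchrun)
    initialize_model_parallel(model_parallel_size)
    torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))


class FeedForward(torch.nn.Module):
    """A transformer MLP whose weights are split across the model-parallel group."""

    def __init__(self, dim: int, hidden_dim: int) -> None:
        super().__init__()
        # Each GPU holds one slice of each matrix; Fairscale inserts the
        # collective ops needed to stitch the activations back together.
        self.w1 = ColumnParallelLinear(dim, hidden_dim, bias=False, gather_output=False)
        self.w2 = RowParallelLinear(hidden_dim, dim, bias=False, input_is_parallel=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w2(F.silu(self.w1(x)))
```

No single GPU ever sees a full weight matrix, which is the only way a 405B-parameter model fits in memory at all.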
While self-hosting this model is expensive (the weights alone are 230GB, far more than even an RTX 4090 can handle), platforms like Meta, Groq, or Nvidia's Playground let you try it out for free.
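The quickest way in is an API. As a hedged example, here's how you could hit LLaMA 3.1 through Groq's OpenAI-compatible endpoint; the model id below is an assumption, so check Groq's current model list.

```python
# A hedged example of trying Llama 3.1 without self-hosting, via Groq's
# OpenAI-compatible API. The model id is an assumption; check Groq's docs.
# Requires a GROQ_API_KEY environment variable.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.groq.com/openai/v1",
    api_key=os.environ["GROQ_API_KEY"],
)

resp = client.chat.completions.create(
    model="llama-3.1-70b-versatile",  # assumed id; swap for whatever Groq lists
    messages=[{"role": "user", "content": "Summarize the Llama 3.1 release in two sentences."}],
)
print(resp.choices[0].message.content)
```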
Initial feedback suggests the smaller versions of LLaMA are more impressive than the massive 405B model. However, the real power lies in its ability to be fine-tuned with custom data, potentially leading to some incredible uncensored models in the near future.
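If you want to try fine-tuning yourself, a common community recipe (not an official Meta one) is LoRA via Hugging Face transformers and peft. The model id, data file, and hyperparameters below are illustrative assumptions, and the weights are gated, so you'd need to accept Meta's license on the Hub first.

```python
# An illustrative LoRA fine-tuning sketch with Hugging Face transformers +
# peft. This is a common community recipe, not Meta's method; the model id,
# data file, and hyperparameters are assumptions for the example.
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_id = "meta-llama/Meta-Llama-3.1-8B"  # assumed Hub id; check the model card
tok = AutoTokenizer.from_pretrained(model_id)
tok.pad_token = tok.eos_token              # Llama tokenizers ship without one

model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
# Train tiny low-rank adapter matrices instead of all 8B base weights.
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"]
))

# my_corpus.jsonl is a hypothetical file of {"text": ...} records.
data = load_dataset("json", data_files="my_corpus.jsonl")["train"]
data = data.map(lambda b: tok(b["text"], truncation=True, max_length=1024),
                batched=True, remove_columns=data.column_names)

Trainer(
    model=model,
    args=TrainingArguments("llama31-lora", per_device_train_batch_size=1,
                           gradient_accumulation_steps=8, num_train_epochs=1),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
).train()
```

The payoff of LoRA is that only the small adapter matrices get gradients, so a single consumer GPU can customize the 8B model even though full fine-tuning would be out of reach.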
Despite LLaMA 3.1's strengths, it still lags behind Claude on some tasks. For example, it struggled to build a Svelte 5 web application using runes, a new feature that only Claude 3.5 Sonnet handled correctly in a single shot. LLaMA 3.1 fares better in areas like creative writing and poetry, though even there it isn't the best I've seen.
It's fascinating that despite multiple companies training massive models on massive computers, they seem to be plateauing at similar levels of capability. OpenAI made a significant leap from GPT-3 to GPT-4, but subsequent advancements have been incremental.
Reflecting on the current state of AI, it seems we're far from achieving artificial superintelligence, which remains a concept largely confined to Silicon Valley imaginations. Meta, however, is keeping it real in the AI space, and LLaMA 3.1 represents a small step for man but a giant leap for Zuckerberg's redemption arc.
Thank you for reading, and stay tuned for more updates.