Meta Llama 3.1 is now available.

With the release of OpenAI's GPT-4o, Anthropic's Claude 3.5 Sonnet, and Google's Gemini 1.5 family, artificial intelligence has already had a big year, but Meta's massive 405-billion-parameter Llama 3.1 release may be the most important of them all.

Meta released version 3.1 of its open-source Llama AI model family yesterday, matching or beating OpenAI's and Anthropic's proprietary models on many major benchmarks and quickly gaining a reputation as one of the most powerful and useful models available.

The model family is already finding its way into a wide range of applications.

Overall, its performance is roughly equivalent to GPT-4o and Claude 3.5 Sonnet, slightly behind in some areas and slightly ahead in others. What matters, however, is not raw performance but the fact that it is open source, widely available, and can be downloaded and run on your own machine.

This level of access to and control over a frontier-grade AI model is groundbreaking: it opens the door to new research, new kinds of derivative models, and progress in areas where the per-token cost of GPT-4o or Claude 3.5 Sonnet would not be worth paying.

You don't need your own data center: the smaller models can run on a good gaming laptop. There are also numerous cloud platforms and services offering access, including Groq and Perplexity, and if you are in the US, the models are available through WhatsApp and the Meta.ai chatbot.
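For readers who want to try one of the smaller models on their own machine, a common route is a local runner such as Ollama, which serves models behind an OpenAI-compatible HTTP API on localhost. The sketch below is a minimal illustration, not an official recipe: the endpoint URL and the `llama3.1:8b` model tag are assumptions that depend on your setup.

```python
import json
import urllib.request

# Assumption: Ollama is running locally and exposes its OpenAI-compatible
# chat-completions endpoint at this default address.
OLLAMA_URL = "http://localhost:11434/v1/chat/completions"

def build_chat_payload(model, prompt):
    """Assemble an OpenAI-style chat-completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def ask_llama(prompt, model="llama3.1:8b", url=OLLAMA_URL):
    """Send a prompt to the locally served model and return its reply text."""
    body = json.dumps(build_chat_payload(model, prompt)).encode()
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]

if __name__ == "__main__":
    # Build (without sending) a sample request to show the payload shape.
    payload = build_chat_payload("llama3.1:8b", "Summarize Llama 3.1 in one sentence.")
    print(payload["model"])
```

Because the request format is OpenAI-compatible, pointing the same code at a hosted provider is mostly a matter of swapping the URL and adding that provider's API key header.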

Training large language models is tremendously expensive. Recently, the focus has been on increasing efficiency rather than scale, and even OpenAI has released a smaller model in GPT-4o mini. With Llama 3.1 405b, Meta found a compromise, packing the quality of a trillion-parameter-class model into less than half that size. It is the first frontier-grade model released as open source, and Meta has gone a step further, allowing companies, organizations, and individuals to fine-tune or even fully train their own models using data generated by 405b.

Meta also released a complete ecosystem around the model: sample applications, prompt guards for moderation and guardrails, a family of smaller models, and a proposed API standard intended to make it easier to build applications on top of the AI.

Aside from being open source, offering advanced functionality, and shipping with a full ecosystem of smaller models and custom features, Llama 3.1 405b appears to excel at multilingual translation, general knowledge, and mathematics, and it is easy to customize for specific needs.

Victor Botev, CTO of AI research firm Iris.ai, described Llama 3.1 405b as “an important step toward democratizing access to AI technology,” saying its openness and accessibility will make it easier for researchers and developers to “build state-of-the-art language AI without the barriers of proprietary APIs and expensive licensing fees.”

Llama 3.1 405b may already be one of the most widely used AI models, but demand for it is so high that even normally reliable platforms like Groq are struggling with overload.

The best places to try it out are Meta's own meta.ai chatbot or WhatsApp messaging platform. Both of these offer ways to view and use models “as intended,” and Meta.ai comes with access to image generation and other features.

The downside to this option is that it is only available in the U.S. and requires a Facebook or Instagram account, and Meta has successfully blocked VPN workarounds.

I have long admired Groq, an AI startup building chips designed to run AI models extremely fast. Groq provides easy access to open-source models, including previous versions of the Llama family, and currently hosts all three Llama 3.1 models.

Access to 405b is also available but limited due to the high demand, and the model may not appear even when you look for it. This applies to both the chatbot and GroqCloud, but either is a great way to try out the upgraded 70b and 8b models in Llama 3.1.

Perplexity is a great tool for searching the web and uses a variety of custom and public AI models to enhance the results returned by traditional web crawlers. It can also generate custom pages that provide Wikipedia-style guides to topics.

One of the newest models available on Perplexity is Llama 3.1 405b, but there is a catch: it is only available with the “Pro” plan at $20/month.

HuggingChat is something of a hidden gem in the AI chatbot space, offering access to a wide range of models, including some not available elsewhere, plus tools for web searching, image creation, and more. A HuggingFace account is required, but setup and getting started are easy.

It is completely free to use, and once you sign in, simply go to settings and select Llama 3.1 405b. The downside is a learning curve: the platform does not shy away from full model names and technical descriptors, so it is not especially beginner-friendly.

Poe, a Quora-backed chatbot marketplace, is a bit like HuggingChat in that you have access to a variety of models and can customize how they interact with you, but it takes a more user-friendly, consumer-focused approach.

Unlike HuggingChat, Poe is relatively open and mostly free. The free daily allowance is adequate for most models, but 405b is expensive, costing 485 compute points per message.

If none of these work for you and you want more control over how you use Llama 3.1 405b, and you don't have your own data center, cloud computing platforms such as Amazon's AWS, Google Cloud, and Microsoft Azure are all worth a look; each of them offers access to the new models.

Snowflake, Cloudflare, DataBricks, Nvidia AI Foundry, and IBM Cloud are just a few places where you can sign up for a developer account and access open source models. You can also try it directly from SambaNova Systems without signing up.
