Product

Announcing Llama 3.1 405B Support on Hyperbolic

XDiscordRedditYoutubeLinkedin

Llama 3.1 405B is poised to disrupt the industry, advancing open-source LLMs towards state-of-the-art performance and rivaling GPT-4 even before instruct tuning. Today, we’re excited to announce support for Llama 3.1 405B under our inference service. This means you can access inference and compute at a fraction of the cost, allowing you to build AI applications without relying on centralized infrastructures.

At Hyperbolic, we’re building the leading open-access AI cloud, and we invite builders, compute providers, researchers, and individuals to join us on this journey.

What is Llama 3.1 405B?

Llama 3.1 405B is the latest version of Meta's open-source large language models, designed to deliver top-tier performance across various tasks. It aims to provide cutting-edge capabilities while being accessible to developers and researchers. Large language models (LLMs) are artificial intelligence systems trained to understand and generate human-like text based on vast amounts of data.

Llama 3.1 405B, trained on over 15 trillion tokens from public sources, offers significant improvements over its predecessor, including multilingual support for seven languages (French, German, Hindi, Italian, Portuguese, Spanish, and Thai) and a 128k context length. Additionally, it incorporates 15 million synthetic samples for fine-tuning.

Llama 3.1 405B Performance Comparison

Here’s a comparison of Llama 3.1 405B with other notable models, including GPT-4o:

Source: Meta

Key Takeaways 

  • Performance Parity with GPT-4o: Llama 3.1 405B, especially the 405B and 70B models, matches or exceeds GPT-4o on many tasks. This shows open-source models can compete with proprietary ones, making advanced AI more accessible to developers and researchers.

  • Benchmark Excellence: Llama 3.1 models consistently score high across various benchmarks such as HumanEval, GSM8K, and ARC Challenge. The 405B model, in particular, achieves the highest scores in most categories, showcasing its state-of-the-art capabilities.

  • Math and Reasoning Superiority: In math and reasoning tasks, Llama 3.1 405B excels, achieving top scores (96.8 in GSM8K and 96.9 in ARC Challenge), indicating its strength in logical and numerical problem-solving.

  • Multilingual Capabilities: Llama 3.1 models, especially the 405B version, demonstrate excellent performance in multilingual benchmarks like MGSM, with scores above 85%, emphasizing its proficiency in handling multiple languages effectively.

Access Affordable Llama 3.1 405B Inference API Endpoint

Hyperbolic’s AI inference services offer top-tier performance without relying on centralized infrastructures, supporting text-to-text, text-to-speech, and text-to-image models, with future plans for text-to-video. 

Key Features

  • Proprietary Optimization: Our system is as fast or faster than well-known solutions.

    • LLM Throughput: For Mistral 8x-7B, Hyperbolic processes 43 tokens per second.

    • Text-to-Image Speed: For SDXL, Hyperbolic creates 17.6 images per minute.

    • Record Speed on AMD MI250: Llama-3-8B produces 145% more output tokens compared to vLLM.

  • User-Friendly API: Our API is compatible with OpenAI, making it easy to integrate into existing workflows.

  • Integration: We collaborate with AI companies specializing in vector embedding, vector databases, and Retrieval-Augmented Generation (RAG).

  • Uptime: Our Hyper-dOS platform ensures strong GPU network support, guaranteeing service uptime for production-ready applications.

  • Developer Experience: Integration is simple and comprehensive, allowing developers to easily plug into existing applications and monitor performance.

Llama 3.1 405B stands shoulder to shoulder with some of the best models available today. We believe it is important for anyone to access this technology at launch, so we made sure to be one of the first to provide the update. 

You can access Llama 3.1 405B for free, with support for up to 200 requests per minute, at app.hyperbolic.xyz/models.

Try Llama 3.1 405B Model and API on Hyperbolic

Llama 3.1 405B Demo

With Hyperbolic, you can begin using Llama 3.1 405B for free and customize your experience by defining tokens, temperature, and more.

Llama 3.1 405B API

Ready to build using Hyperbolic’s API? Use Python, TypesScript and cURL easily. 

Access your API Key on the left side panel under ‘Settings’. Your API keys allow you to securely access Hyperbolic's services. Please protect your key because anyone who has the key has full access to your account.

About Hyperbolic 

Hyperbolic is a leading provider of open access AI computing and inference services, pioneering open access to AI for developers and researchers worldwide. With a mission to break down barriers that limit AI’s potential, Hyperbolic believes in a future where AI technology is universally accessible, empowering every individual and community with the tools to innovate, create, and advance our world. The Hyperbolic founding team is led by award-winning Math and AI researchers from UC Berkeley and the University of Washington.

Website | X | Discord | LinkedIn | YouTube | GitHub | Documentation

Blog
More Articles
Hyperbolic Referral Program
Become GPU Rich with Hyperbolic’s Referral Program

May 2, 2025

LLM Serving Frameworks

Apr 29, 2025

Auto Top Ups
Auto Top-Ups Now Available on Hyperbolic

Apr 28, 2025

GPU Drop: 96 H100s Now Available on Hyperbolic's GPU Marketplace

Apr 21, 2025

Comparing Fine Tuning Frameworks

Apr 10, 2025

Custom Ports for GPU Instances
Custom Port Configuration for GPU Instances Now Available on Hyperbolic’s GPU Marketplace

Apr 8, 2025

march 2025 hyperbolic recap
Hyperbolic Monthly Recap: March 2025

Apr 2, 2025

An Intro To Fine Tuning

Mar 30, 2025

DeepSeek-V3-0324 Now Live on Hyperbolic

Mar 24, 2025

GPU Marketplace Landscape

Mar 11, 2025

AI Inference Provider Landscape

Mar 7, 2025

Hyperbolic Monthly Recap: February 2025

Mar 3, 2025

AI Czar David Sacks Explains the DeepSeek Freak

Feb 27, 2025

AI Infrastructure That Scales for Open-Source Models and Agents

Feb 27, 2025

Taking the Agent GAME Hyperbolic

Feb 27, 2025

The Rise of the Open-Source AI Stack

Feb 26, 2025

Censorship or Cultural Alignment? DeepSeek R1’s Political Sensitivity Explored

Feb 26, 2025

ETHDenver Hackathon: PMF or Die Agent Hackathon

Feb 21, 2025

Growing on Demand: Automated Scaling in AI

Feb 14, 2025

DeepSeek R1: A Trojan Horse for Data Mining or a Leap in AI Reasoning?

Feb 10, 2025

Hyperbolic Monthly Recap: January 2025

Feb 5, 2025

A digital image titled "Google Whitepaper Agents" by Hyperbolic. The image features three segments: Model Component with a green pixelated icon, Tools Component with a purple cube icon, and Orchestration Layer with a blue circular icon.
Summary of Google’s AI Whitepaper ‘Agents’

Jan 31, 2025

Graphic with a blue and green rectangle featuring text "Your AI, Your Data" and "Now Available Deep Seek R1 on Hyperbolic's Privacy-First Platform." A whale illustration and three stacked machines are also depicted.
Your AI, Your Data: DeepSeek-R1 Now Hosted on Hyperbolic’s Privacy-First Platform

Jan 28, 2025

Advertisement for the Coinbase AI Hackathon displaying three challenges: "Build a self-evolving agent," "Create an AI sales agent," and "Develop the most hyperbolic agent," each offering a $1K prize.
Devs: Build Hyperintellgence at Coinbase's AI Hackathon in San Francisco

Jan 28, 2025

A stylized, pixelated green silhouette of a person holding an object is depicted. Text reads, "a new space for ACCELERATION - Hyperbolic e/acc." At the bottom left is a circular logo with abstract design elements.
Introducing Hyperbolic e/acc: A New Space for Acceleration

Jan 28, 2025