Product

Announcing Llama 3.1 405B Support on Hyperbolic

XDiscordRedditYoutubeLinkedin

Llama 3.1 405B is poised to disrupt the industry, advancing open-source LLMs towards state-of-the-art performance and rivaling GPT-4 even before instruct tuning. Today, we’re excited to announce support for Llama 3.1 405B under our inference service. This means you can access inference and compute at a fraction of the cost, allowing you to build AI applications without relying on centralized infrastructures.

At Hyperbolic, we’re building the leading open-access AI cloud, and we invite builders, compute providers, researchers, and individuals to join us on this journey.

What is Llama 3.1 405B?

Llama 3.1 405B is the latest version of Meta's open-source large language models, designed to deliver top-tier performance across various tasks. It aims to provide cutting-edge capabilities while being accessible to developers and researchers. Large language models (LLMs) are artificial intelligence systems trained to understand and generate human-like text based on vast amounts of data.

Llama 3.1 405B, trained on over 15 trillion tokens from public sources, offers significant improvements over its predecessor, including multilingual support for seven languages (French, German, Hindi, Italian, Portuguese, Spanish, and Thai) and a 128k context length. Additionally, it incorporates 15 million synthetic samples for fine-tuning.

Llama 3.1 405B Performance Comparison

Here’s a comparison of Llama 3.1 405B with other notable models, including GPT-4o:

Source: Meta

Key Takeaways 

  • Performance Parity with GPT-4o: Llama 3.1 405B, especially the 405B and 70B models, matches or exceeds GPT-4o on many tasks. This shows open-source models can compete with proprietary ones, making advanced AI more accessible to developers and researchers.

  • Benchmark Excellence: Llama 3.1 models consistently score high across various benchmarks such as HumanEval, GSM8K, and ARC Challenge. The 405B model, in particular, achieves the highest scores in most categories, showcasing its state-of-the-art capabilities.

  • Math and Reasoning Superiority: In math and reasoning tasks, Llama 3.1 405B excels, achieving top scores (96.8 in GSM8K and 96.9 in ARC Challenge), indicating its strength in logical and numerical problem-solving.

  • Multilingual Capabilities: Llama 3.1 models, especially the 405B version, demonstrate excellent performance in multilingual benchmarks like MGSM, with scores above 85%, emphasizing its proficiency in handling multiple languages effectively.

Access Affordable Llama 3.1 405B Inference API Endpoint

Hyperbolic’s AI inference services offer top-tier performance without relying on centralized infrastructures, supporting text-to-text, text-to-speech, and text-to-image models, with future plans for text-to-video. 

Key Features

  • Proprietary Optimization: Our system is as fast or faster than well-known solutions.

    • LLM Throughput: For Mistral 8x-7B, Hyperbolic processes 43 tokens per second.

    • Text-to-Image Speed: For SDXL, Hyperbolic creates 17.6 images per minute.

    • Record Speed on AMD MI250: Llama-3-8B produces 145% more output tokens compared to vLLM.

  • User-Friendly API: Our API is compatible with OpenAI, making it easy to integrate into existing workflows.

  • Integration: We collaborate with AI companies specializing in vector embedding, vector databases, and Retrieval-Augmented Generation (RAG).

  • Uptime: Our Hyper-dOS platform ensures strong GPU network support, guaranteeing service uptime for production-ready applications.

  • Developer Experience: Integration is simple and comprehensive, allowing developers to easily plug into existing applications and monitor performance.

Llama 3.1 405B stands shoulder to shoulder with some of the best models available today. We believe it is important for anyone to access this technology at launch, so we made sure to be one of the first to provide the update. 

You can access Llama 3.1 405B for free, with support for up to 200 requests per minute, at app.hyperbolic.xyz/models.

Try Llama 3.1 405B Model and API on Hyperbolic

Llama 3.1 405B Demo

With Hyperbolic, you can begin using Llama 3.1 405B for free and customize your experience by defining tokens, temperature, and more.

Llama 3.1 405B API

Ready to build using Hyperbolic’s API? Use Python, TypesScript and cURL easily. 

Access your API Key on the left side panel under ‘Settings’. Your API keys allow you to securely access Hyperbolic's services. Please protect your key because anyone who has the key has full access to your account.

About Hyperbolic 

Hyperbolic is a leading provider of open access AI computing and inference services, pioneering open access to AI for developers and researchers worldwide. With a mission to break down barriers that limit AI’s potential, Hyperbolic believes in a future where AI technology is universally accessible, empowering every individual and community with the tools to innovate, create, and advance our world. The Hyperbolic founding team is led by award-winning Math and AI researchers from UC Berkeley and the University of Washington.

Website | X | Discord | LinkedIn | YouTube | GitHub | Documentation

Blog
More Articles