Welcome to Furiosa Docs#

Welcome! FuriosaAI offers a streamlined software stack designed for deep learning model inference on FuriosaAI NPUs. This guide covers the entire workflow for creating inference applications: starting from a PyTorch model, through model quantization, to model serving and deployment.

Warning

This document is based on the Furiosa SDK 2025.1.0 (beta 1) version. The features and APIs described herein are subject to change in the future.

📢 Latest Release 2025.1.0

Stay up to date with the newest features, improvements, and fixes in the latest release, version 2025.1.0 (beta 0).

What’s New
🚀 Quick Start with Furiosa LLM

Furiosa LLM is a high-performance inference engine for large language models (LLMs). This document explains how to install and use Furiosa LLM.

Quick Start with Furiosa LLM
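As a rough sketch of the flow the Quick Start guide covers, installation and serving might look like the following. The package name, command, and model identifier here are assumptions for illustration; consult the Quick Start guide for the authoritative steps for your SDK version.

```shell
# Install the Furiosa LLM package (package name assumed; a pre-release
# flag may be required for beta SDK versions).
pip install furiosa-llm

# Launch an inference server for a prepared model artifact
# (command and artifact path are hypothetical placeholders).
furiosa-llm serve ./my-model-artifact
```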
📊 Running MLPerf Benchmark

This document describes how to reproduce the MLPerf™ Inference Benchmark using the FuriosaAI Software Stack.

Running MLPerf™ Inference Benchmark
📋 Roadmap Overview

See what’s ahead for FuriosaAI with our planned releases and upcoming features. Stay informed on development progress and key milestones.

Roadmap

Overview#

FuriosaAI Software Stack

Getting Started#

Furiosa LLM#

Cloud Native Toolkit#

Device Management#

Customer Support#