Welcome to Furiosa Docs#

Welcome! FuriosaAI offers a streamlined software stack designed for deep learning model inference on FuriosaAI NPUs. This guide covers the entire workflow for building inference applications: starting from a PyTorch model, through model quantization, to model serving and deployment.

Warning

This document is based on the Furiosa SDK 2025.2.0 (beta 1) version. The features and APIs described herein are subject to change in the future.

📢 Latest Release 2025.2.0

Stay up to date with the newest features, improvements, and fixes in the latest release.

What’s New
🚀 Quick Start with Furiosa-LLM

Furiosa-LLM is a high-performance inference engine for large language models (LLMs). This document explains how to install and use Furiosa-LLM.

Quick Start with Furiosa-LLM
📋 Roadmap Overview

See what’s ahead for FuriosaAI with our planned releases and upcoming features. Stay informed on development progress and key milestones.

Roadmap
🤗 Hugging Face Hub

Pre-optimized and pre-compiled models for FuriosaAI NPUs are available on the Hugging Face Hub. Check out the latest models and their capabilities.

https://huggingface.co/furiosa-ai

Overview#

FuriosaAI Software Stack

Getting Started#

Furiosa-LLM#

Cloud Native Toolkit#

Device Management#

Customer Support#