FuriosaAI Developer Center#
Welcome to the FuriosaAI Developer Center. FuriosaAI provides a streamlined software stack for deep learning model inference on FuriosaAI NPUs. This document guides you through the entire workflow of writing inference applications, from a PyTorch model through model quantization and serving to production deployment.
Warning
This document is based on the Furiosa SDK 2024.2.1 (beta0) release; the features and APIs described here may change in future releases.
2024.2.1 is the latest SDK release for RNGD; see What's New for an overview of the features and changes it introduces.
Furiosa LLM is a high-performance inference engine for large language models. The Furiosa LLM section explains how to install and use it.
The MLPerf™ Inference Benchmark guide describes how to reproduce the benchmark results using the FuriosaAI Software Stack.
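For a quick sense of what working with Furiosa LLM looks like, here is a minimal offline-inference sketch. The class and method names (`LLM.load_artifact`, `SamplingParams`, `generate`) follow the vLLM-style Python API that Furiosa LLM exposes; the artifact path and sampling settings are placeholders, so treat this as a sketch rather than a verbatim recipe and consult the Furiosa LLM references below for the exact API.

```python
from furiosa_llm import LLM, SamplingParams

# Load a pre-compiled model artifact; the path is a placeholder.
llm = LLM.load_artifact("./Llama-3.1-8B-Instruct")

# Illustrative sampling settings, not recommendations.
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=100)

# Run batched offline inference and print the generated text.
for output in llm.generate(["Say this is a test."], sampling_params):
    print(output.outputs[0].text)
```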
Overview#
FuriosaAI RNGD: RNGD hardware specifications and features
FuriosaAI’s Software Stack: An overview of the FuriosaAI software stack
Supported Models: A list of supported models
What’s New: New features and changes in the latest release
Roadmap: The future roadmap of the FuriosaAI Software Stack
Getting Started#
Installing Prerequisites: How to install the prerequisites for the FuriosaAI Software Stack
Upgrading Furiosa Software Stack: How to upgrade the FuriosaAI Software Stack
Furiosa LLM#
Furiosa LLM: An introduction to Furiosa LLM
OpenAI Compatible Server: More details about the OpenAI-compatible server and its features; a minimal client sketch follows this list
Model Preparation Workflow: How to quantize and compile Hugging Face models
Model Parallelism: Tensor/Pipeline/Data parallelism in Furiosa LLM
References: The Python API reference for Furiosa LLM
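Because the server speaks the OpenAI API, any standard OpenAI client can talk to it. Below is a minimal sketch assuming a Furiosa LLM server is already running locally on port 8000; the base URL and model name are placeholders for your own deployment.

```python
from openai import OpenAI

# A local server typically ignores the API key, but the client requires one.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Send a chat completion request exactly as you would to the OpenAI API.
response = client.chat.completions.create(
    model="furiosa-ai/Llama-3.1-8B-Instruct",  # placeholder model name
    messages=[{"role": "user", "content": "What does an NPU do?"}],
)
print(response.choices[0].message.content)
```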
Cloud Native Toolkit#
Cloud Native Toolkit: An overview of the Cloud Native Toolkit
Kubernetes Support: An overview of Kubernetes support
Device Management#
Furiosa SMI CLI: A command line utility for managing FuriosaAI NPUs
Furiosa SMI Library: A library for managing FuriosaAI NPUs; a usage sketch follows this list
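As an illustration of the library route, the sketch below enumerates NPUs from Python. The module and call names used here (`furiosa_smi_py`, `list_devices`, `device_info`) are assumptions about the SMI Python binding; check the Furiosa SMI Library reference for the exact API.

```python
# Assumed API: the SMI Python binding exposes list_devices(), which
# returns handles for each FuriosaAI NPU visible on the host.
from furiosa_smi_py import list_devices

for device in list_devices():
    info = device.device_info()  # static device metadata (assumed accessor)
    print(info.name())
```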