FuriosaAI Developer Center#
Welcome to the FuriosaAI Developer Center! FuriosaAI offers a streamlined software stack designed for deep learning model inference on FuriosaAI NPUs. This guide covers the entire workflow for creating inference applications: starting from a PyTorch model, quantizing it, and then serving and deploying it.
Warning
This document is based on Furiosa SDK 2024.2.1 (beta0). The features and APIs described herein are subject to change.
What's New: 2024.2.1 is the latest SDK release for RNGD; an overview of the new features and changes in this release
Furiosa LLM: A high-performance inference engine for LLM models; how to install and use Furiosa LLM
MLPerf™ Inference Benchmark: How to reproduce the MLPerf™ Inference Benchmark using the FuriosaAI Software Stack
Overview#
FuriosaAI RNGD: The RNGD hardware specification and features
FuriosaAI’s Software Stack: An overview of the FuriosaAI software stack
Supported Models: A list of supported models
What’s New: New features and changes in the latest release
Roadmap: The future roadmap of the FuriosaAI Software Stack
Getting Started#
Installing Prerequisites: How to install the prerequisites for the FuriosaAI Software Stack
Upgrading FuriosaAI’s Software: How to upgrade the FuriosaAI Software Stack
Furiosa LLM#
Furiosa LLM: An introduction to Furiosa LLM
OpenAI-Compatible Server: More details about the OpenAI-compatible server and its features (see the client sketch after this list)
Model Preparation Workflow: How to quantize and compile Hugging Face models
Model Parallelism: Tensor/Pipeline/Data parallelism in Furiosa LLM
API Reference: The Python API reference for Furiosa LLM (a minimal usage sketch follows this list)
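As a taste of the Python API, here is a minimal offline-generation sketch. It assumes a vLLM-style interface (`LLM`, `SamplingParams`, `generate`, and `LLM.load_artifact`); the artifact path and sampling values are placeholders, so consult the API Reference for the exact signatures.

```python
from furiosa_llm import LLM, SamplingParams

# Placeholder path: point this at a model artifact produced by the
# model preparation workflow (quantization + compilation).
llm = LLM.load_artifact("./Llama-3.1-8B-Instruct")

# Illustrative sampling settings; see the API Reference for all options.
params = SamplingParams(max_tokens=100, top_p=0.9)

# Generate a completion for a single prompt and print the text.
outputs = llm.generate(["What is an NPU?"], params)
print(outputs[0].outputs[0].text)
```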
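To illustrate the OpenAI-compatible server, the following sketch queries a running endpoint with the official `openai` Python client. The base URL, API key, and model id are assumptions; match them to your server's actual configuration.

```python
from openai import OpenAI

# Placeholder endpoint and credentials: adjust to where your
# OpenAI-compatible Furiosa LLM server is listening.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.chat.completions.create(
    model="Llama-3.1-8B-Instruct",  # placeholder model id
    messages=[{"role": "user", "content": "Hello!"}],
)
print(completion.choices[0].message.content)
```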
Cloud Native Toolkit#
Cloud Native Toolkit: An overview of the Cloud Native Toolkit
Kubernetes Support: An overview of Kubernetes support
Device Management#
Furiosa SMI CLI: A command line utility for managing FuriosaAI NPUs (see the sketch after this list)
Furiosa SMI Library: A library for managing FuriosaAI NPUs
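For a quick check that the tooling can see your NPUs, here is a small Python sketch that shells out to the CLI. The `info` subcommand is an assumption for illustration; consult the Furiosa SMI CLI page for the actual subcommands, or use the Furiosa SMI Library bindings directly.

```python
import subprocess

# Assumed subcommand: `furiosa-smi info` is used here to print device
# information; see the Furiosa SMI CLI docs for the real options.
result = subprocess.run(
    ["furiosa-smi", "info"],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout)
```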