What’s New

What’s New#

This page describes the changes and functionality available in in the latest releases of Furiosa SDK 2024.2.0.

Furiosa SDK 2024.2.0 (2024-12-23)#

2024.2.0 is the second SDK release for RNGD. This release is beta 0 release, and the features and APIs described in this document may change in the future.

Highlights#

New Model support: Solar, EXAONE-3.0, CodeLLaMA2, Vicuna, …
Tensor Parallelism support (tensor_parallel_size <= 8)
Torch 2.4.1 support
Transformers 4.44.2 support
Furiosa LLM
- Huggingface Transformers compatible API support (furiosa_llm.optimum)
  
  AutoModel, AutoModelForCausalLM, AutoModelForQuestionAnswering API
  
  QuantizerForCausalLM API support for calibration and quantization
- ArtifactBuilder API and CLI tools (refer to ArtifactBuilder API)
- LLMEngine, AsyncLLMEngine API support compatiable with vLLM
Up to 8k context length support in LLaMA 3.1 models
20% performance improvements in LLaMA 3.1 models
- 3300 tokens/sec on 8B model

Furiosa SDK 2024.1.0 (2024-10-11)#

2024.1.0 is the first SDK release for RNGD. This release is alpha release, and the features and APIs described in this document may change in the future.

Highlights#

Model Support: LLaMA 3.1 8B/70B, BERT Large, GPT-J 6B
Furiosa Quantizer supports the following quantization methods:
- BF16 (W16A16)
- INT8 Weight-Only (W8A16)
- FP8 (W8A8)
- INT8 SmoothQuant (W8A8)
Furiosa LLM
- Efficient KV cache management with PagedAttention
- Continuous batching support in serving
- OpenAI-compatible API server
- Greedy search and beam search
- Pipeline Parallelism and Data Parallelism across multiple NPUs
furiosa-mlperf command
- Server and Offline scenarios
- BERT, GPT-J, LLaMA 3.1 benchmarks
System Management Interface
- System Management Interface Library and CLI for Furiosa NPU family
Cloud Native Toolkit
- Kubernetes integration for managing and monitoring the Furiosa NPU family

Component version#
Package name	Version
furiosa-compiler	2024.2.0
furiosa-device-plugin	2024.2.0
furiosa-driver-rngd	2024.2.0
furiosa-feature-discovery	2024.2.0
furiosa-firmware-image-tools	2024.2.0
furiosa-firmware-image-rngd	2024.2.0
furiosa-libsmi	2024.2.0
furiosa-llm	2024.2.0
furiosa-llm-models	2024.2.0
furiosa-mlperf	2024.2.0
furiosa-mlperf-resources	4.1.0
furiosa-model-compressor	2024.2.0
furiosa-model-compressor-impl	2024.2.0
furiosa-native-compiler	2024.2.0
furiosa-native-runtime	2024.2.0
furiosa-smi	2024.2.0
furiosa-torch-ext	2024.2.0

What’s New

Contents

What’s New#

Furiosa SDK 2024.2.0 (2024-12-23)#

Highlights#

Furiosa SDK 2024.1.0 (2024-10-11)#

Highlights#