Supported Models#
FuriosaAI’s software stack supports a wide range of Transformer-based models available on the Hugging Face Hub. Below is a list of model architectures currently supported by Furiosa-LLM. If your model is based on any of these architectures, you can use Furiosa-LLM to compile and run the model efficiently on Furiosa’s NPUs.
For many of these architectures, FuriosaAI also publishes pre-compiled
models under the
Hugging Face Hub 🤗 - FuriosaAI organization,
each shipping a Furiosa Executable Bundle (FXB) so you can download and run it
quickly with Furiosa-LLM. The architecture names in the tables below link to
per-architecture guides covering how to launch the pre-compiled variants with
the furiosa-llm serve command — including the model-specific options each
needs — and how to use their features, such as reasoning, tool calling, and
multimodal input, with example requests. Each guide also notes the
quantization and parallelism strategy for reference. Individual
repository-level model cards live on the Hugging Face Hub.
Decoder-only Models (Text Generation)#
Model Name |
Architecture |
Example Hugging Face Models |
|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Pooling Models#
Model Name |
Architecture |
Task |
Example Hugging Face Models |
|---|---|---|---|
|
Embedding |
|
|
|
Reranking |
|
Vision-Language Models (Multimodal)#
See Vision-Language Models for a guide on launching a VL server and sending image inputs over the OpenAI-compatible Chat Completions API.
Model Name |
Architecture |
Modalities |
Example Hugging Face Models |
|---|---|---|---|
|
Text, Image |
|
Status of Models#
You can compile and run any of the architectures listed above on RNGD yourself. The models below go a step further: each is one that FuriosaAI actively validates, and the table shows how far that validation has progressed across three checks — does it run (Function), does it produce correct results (Correctness), and has its performance been tuned (Performance).
The status of each check uses the following scale:
Status |
Meaning |
|---|---|
✅ Passed |
Verified and working as expected. |
🟡 Experimental |
Works, but not yet fully validated or tuned. |
⛔️ Unplanned |
Not planned for this model. |
Model |
Type |
Function |
Correctness |
Performance |
|---|---|---|---|---|
Text |
✅ |
✅ |
✅ |
|
Text (MoE) |
✅ |
✅ |
🟡 |
|
Text |
✅ |
✅ |
✅ |
|
Text |
✅ |
✅ |
✅ |
|
Text (MoE) |
✅ |
✅ |
🟡 |
|
Text (MoE) |
✅ |
✅ |
🟡 |
|
Text (MoE) |
✅ |
✅ |
🟡 |
|
Text |
✅ |
✅ |
✅ |
|
Text |
✅ |
✅ |
⛔️ |
|
Text |
✅ |
✅ |
🟡 |
|
Text (MoE) |
✅ |
✅ |
🟡 |
|
Embedding |
✅ |
✅ |
🟡 |
|
Reranking |
✅ |
✅ |
🟡 |
|
Multimodal |
✅ |
✅ |
🟡 |
|
Text (MoE) |
✅ |
✅ |
🟡 |
|
Text (MoE) |
✅ |
✅ |
🟡 |
|
Text (MoE) |
✅ |
✅ |
🟡 |
For models planned for future releases, see the Roadmap.