Roadmap

FuriosaAI strives to deliver a release each month, along with patch releases. This page presents the forward-looking roadmap of ongoing and upcoming projects and when they are expected to land, broken down by area of our software stack.

Latest Release

The latest release is 2024.2.1 (beta 0), released on Jan 10, 2025. You can find the release notes here.

Future Releases

2025 Q1

  • 🔲 Tensor Parallelism support Phase 2: Inter-chip (planned for 2025.1.0 release)

  • 🔲 Speculative decoding with a draft model (planned for 2025.1.0 release)

  • 🔲 CPU memory swapping of KV cache in Furiosa LLM (planned for 2025.2.0 release)

  • 🔲 torch.compile() backend (planned for 2025.2.0 release)

  • 🔲 Embedding API support in Furiosa LLM (planned for 2025.1.0 release)

  • 🔲 Tool-calling support in Furiosa LLM (planned for 2025.1.0 release)

  • 🔲 Chunked Prefill support in Furiosa LLM (planned for 2025.2.0 release)

2024 Q4

  • ✅ Language Model Support: CodeLLaMA2, Vicuna, Solar, EXAONE-3.0 (2024.2.0 release)

  • ✅ Vision Model Support: MobileNetV1, MobileNetV2, ResNet152, ResNet50, EfficientNet, YOLOv8m, ... (2024.2.0 release)

  • ✅ Tensor Parallelism support Phase 1: Intra-chip (2024.2.0 release)

  • ✅ Torch 2.4.1 support (2024.2.0 release)

  • ✅ Hugging Face Optimum integration (2024.2.0 release)

  • 🔲 Device remapping support (e.g., /dev/rngd/npu2pe0-3 -> /dev/rngd/npu0pe0-3) for containers (planned for 2024.2.2 release)

  • 🔲 CPU memory swapping of KV cache in Furiosa LLM (postponed to 2025 Q1)

  • 🔲 Speculative decoding with a draft model (postponed to 2025 Q1)

  • 🔲 torch.compile() backend (postponed to 2025 Q1)