Bringing Function Calling to DeepSeek Models on SGLang
If you’ve worked with large language models (LLMs) in production, you’ve probably heard of SGLang — an open-source platform designed for serving LLMs and vision-language models at scale. With over 13k stars on GitHub, SGLang has quickly become a go-to choice for developers and researchers looking for a flexible, high-performance way to deploy models.
One of the key features that makes SGLang so powerful is its support for function calling — a mechanism that allows models to invoke external tools or APIs by generating structured JSON outputs. This unlocks a wide range of use cases: fetching real-time data, querying databases, triggering backend services, and more. Function calling transforms static model responses into dynamic, action-driven workflows.
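To make this concrete, here is a hedged sketch of how a tool is declared in the OpenAI-compatible "tools" format that SGLang's chat endpoint accepts. The function name, parameter schema, and server URL are illustrative placeholders, not part of the contribution described here:

```python
# A tool declaration in the OpenAI-compatible "tools" format.
# The function name, parameters, and server URL below are
# illustrative placeholders.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Fetch the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "City name, e.g. 'Boston'",
                }
            },
            "required": ["city"],
        },
    },
}

# Against a running SGLang server, a request would look roughly like:
#
#   from openai import OpenAI
#   client = OpenAI(base_url="http://localhost:30000/v1", api_key="EMPTY")
#   resp = client.chat.completions.create(
#       model="deepseek-ai/DeepSeek-V3",
#       messages=[{"role": "user",
#                  "content": "What's the weather in Boston today?"}],
#       tools=[get_weather_tool],
#   )
#   # resp.choices[0].message.tool_calls then holds the structured calls.
```

When the model decides a tool is needed, it emits a structured call naming the function and its JSON arguments instead of a plain-text answer.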
But until recently, there was one important piece missing.
The Gap: No Function Calling for DeepSeek Models
While SGLang has supported function calling for several popular model families, it didn’t work for the DeepSeek series of models out of the box. That meant if you were deploying DeepSeek models — like DeepSeek-V3 or DeepSeek-R1 — through SGLang, they couldn’t take advantage of tool use.
For example, if your app asked the model,
“What’s the weather in Boston today?”
a DeepSeek model wouldn’t be able to call out to a weather API and fetch the current forecast. Without function calling, the model is limited to whatever knowledge it was trained on — no real-time data, no external integrations.
In modern LLM-powered systems, this is a serious limitation.
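To illustrate what the application side of that loop looks like, here is a minimal sketch of dispatching a model-emitted tool call to local code. The weather function is a stub and the registry pattern is one common convention, not SGLang's API:

```python
import json

def get_weather(city: str) -> dict:
    # Stub standing in for a real weather-API call.
    return {"city": city, "forecast": "sunny", "temp_f": 68}

# Registry mapping tool names the model may emit to local functions.
TOOLS = {"get_weather": get_weather}

def dispatch(tool_name: str, arguments_json: str):
    """Run the local function that a model's tool call refers to."""
    args = json.loads(arguments_json)
    return TOOLS[tool_name](**args)

# A parsed tool call (name + JSON arguments) is all the app needs:
result = dispatch("get_weather", '{"city": "Boston"}')
print(result["forecast"])  # -> sunny
```

The result is then sent back to the model as a tool message, letting it fold real data into its final answer.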
Our Contribution: Adding DeepSeek Function Calling to SGLang
Our team at NetMind set out to close this gap and contribute the work back to the open-source community.
In Pull Request #5224, our engineering team added a dedicated function-call parser for DeepSeek models to SGLang. The contribution included:
- A new parsing module tailored for DeepSeek’s function-calling format
- An updated chat template (deepseek2.jinja) for DeepSeek-V3
- Fixes for token-ID handling to ensure proper alignment with the V3 tokenizer
- Documentation updates aligned with the official V3 tokenizer config
These improvements enable reliable function-calling support for DeepSeek models within SGLang, helping unlock tool use and external function integration for this model family.
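At its core, a function-call parser scans the model's raw output for tool-call spans and converts them into structured calls. The sketch below uses simplified <tool_call> markers purely for illustration; DeepSeek-V3 emits its own model-specific tokens, and the actual parsing module in PR #5224 handles that real format:

```python
import json
import re

# Illustrative markers only: DeepSeek-V3's real output uses
# model-specific tokens (see PR #5224 for the actual format).
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)

def parse_tool_calls(text: str) -> list[dict]:
    """Extract structured tool calls from raw model output."""
    calls = []
    for span in TOOL_CALL_RE.findall(text):
        payload = json.loads(span)  # {"name": ..., "arguments": {...}}
        calls.append({"name": payload["name"],
                      "arguments": payload["arguments"]})
    return calls

output = ('Let me check. <tool_call>'
          '{"name": "get_weather", "arguments": {"city": "Boston"}}'
          '</tool_call>')
print(parse_tool_calls(output))
# -> [{'name': 'get_weather', 'arguments': {'city': 'Boston'}}]
```

The parser's job is exactly this translation step: without it, tool-call text from a DeepSeek model would pass through the server as opaque prose instead of actionable, structured calls.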
We’re grateful to the SGLang maintainers for their review and support, and we’re excited that this feature is now part of the official project.
Why This Matters
As LLMs become key components in real-world systems, the ability to reason with language and act through tools is becoming essential. Whether it’s querying live data, running calculations, or interacting with other services, function calling bridges the gap between language understanding and real-world action.
By contributing this feature to the SGLang ecosystem, we hope to make it easier for developers, researchers, and companies to build more capable, interactive AI systems — with DeepSeek models fully in the mix.
We can’t wait to see what the community builds next.
About NetMind
NetMind is an AI infrastructure and services company focused on making large language models easier to use and deploy. We provide flexible model inference APIs, AI service APIs, and Model Context Protocol (MCP) versions of those services — MCP being an open standard for connecting models with tools and services.
Beyond APIs, we also offer infrastructure solutions, private deployment options, and AI consulting to help businesses build reliable, production-ready AI systems.
This contribution to SGLang is part of our ongoing effort to support the open-source ecosystem and help more teams unlock the full potential of LLMs.
Learn more at NetMind.ai.