Jina-Serve

Jina-serve is a framework for building and deploying AI services that communicate via gRPC, HTTP and WebSockets. Scale your services from local development to production while focusing on your core logic.

Key Features

Native support for all major ML frameworks and data types
High-performance service design with scaling, streaming, and dynamic batching
LLM serving with streaming output
Built-in Docker integration and Executor Hub
One-click deployment to Jina AI Cloud
Enterprise-ready with Kubernetes and Docker Compose support

Comparison with FastAPI

Key advantages over FastAPI:

DocArray-based data handling with native gRPC support
Built-in containerization and service orchestration
Seamless scaling of microservices
One-command cloud deployment

Install

Install from PyPI with pip install jina. See the guides for Apple Silicon and Windows.

Core Concepts

Three main layers:

Data: BaseDoc and DocList for input/output
Serving: Executors process Documents, Gateway connects services
Orchestration: Deployments serve Executors, Flows create pipelines
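To make the three layers concrete, here is a minimal sketch in plain Python. It deliberately does not use the Jina API: `Prompt`, `Generation`, `UppercaseExecutor`, and `ToyDeployment` are hypothetical stand-ins for typed documents, an Executor, and a Deployment, and the `/generate` endpoint name is an assumption chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


# Data layer stand-ins: typed input and output documents.
@dataclass
class Prompt:
    text: str


@dataclass
class Generation:
    prompt: str
    text: str


# Serving layer stand-in: an executor maps input documents to output documents.
class UppercaseExecutor:
    def generate(self, docs: List[Prompt]) -> List[Generation]:
        return [Generation(prompt=d.text, text=d.text.upper()) for d in docs]


# Orchestration layer stand-in: a deployment exposes executor methods by endpoint.
class ToyDeployment:
    def __init__(self, executor: UppercaseExecutor) -> None:
        self.endpoints: Dict[str, Callable[[List[Prompt]], List[Generation]]] = {
            "/generate": executor.generate,
        }

    def post(self, on: str, inputs: List[Prompt]) -> List[Generation]:
        # Look up the handler registered for this endpoint and call it.
        return self.endpoints[on](inputs)


deployment = ToyDeployment(UppercaseExecutor())
results = deployment.post("/generate", [Prompt(text="hello"), Prompt(text="jina")])
print([r.text for r in results])
```

The point of the split is that the executor only knows about documents, while the deployment only knows about endpoints, so either side can change without touching the other.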

Build AI Services

Let's create a gRPC-based AI service using StableLM:

Deploy with Python or YAML:

Use the client:

Build Pipelines

Chain services into a Flow:
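The chaining idea can be sketched without the framework. The example below is a plain-Python illustration, not the Jina API: `ToyFlow`, `Doc`, `clean`, and `annotate` are hypothetical names, and each step simply receives the previous step's output.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Doc:
    text: str
    tags: List[str] = field(default_factory=list)


Step = Callable[[List[Doc]], List[Doc]]


def clean(docs: List[Doc]) -> List[Doc]:
    # First stage: collapse whitespace and lowercase the text.
    return [
        Doc(text=" ".join(d.text.split()).lower(), tags=d.tags + ["cleaned"])
        for d in docs
    ]


def annotate(docs: List[Doc]) -> List[Doc]:
    # Second stage: attach a word count computed from the cleaned text.
    return [
        Doc(text=d.text, tags=d.tags + [f"words={len(d.text.split())}"])
        for d in docs
    ]


class ToyFlow:
    """Hypothetical pipeline stand-in: runs steps in the order they were added."""

    def __init__(self) -> None:
        self.steps: List[Step] = []

    def add(self, step: Step) -> "ToyFlow":
        self.steps.append(step)
        return self  # returning self allows chained .add() calls

    def run(self, docs: List[Doc]) -> List[Doc]:
        for step in self.steps:
            docs = step(docs)
        return docs


flow = ToyFlow().add(clean).add(annotate)
out = flow.run([Doc(text="  Hello   Jina Serve ")])
print(out[0])
```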

Scaling and Deployment

Local Scaling

Boost throughput with built-in features:

Replicas for parallel processing
Shards for data partitioning
Dynamic batching for efficient model inference
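The three mechanisms listed above differ in how work is split. The following plain-Python sketch illustrates the ideas only and does not reflect how the framework implements them: `round_robin`, `shard_for`, and `batches` are hypothetical helpers, and the batching shown here is simplified (a real dynamic batcher would typically also flush on a timeout).

```python
import zlib
from itertools import cycle
from typing import Dict, Iterable, Iterator, List


def round_robin(incoming: Iterable[str], replicas: int) -> Dict[int, List[str]]:
    """Replicas: identical workers; each request goes to the next worker in turn."""
    assignment: Dict[int, List[str]] = {i: [] for i in range(replicas)}
    for replica_id, request in zip(cycle(range(replicas)), incoming):
        assignment[replica_id].append(request)
    return assignment


def shard_for(key: str, shards: int) -> int:
    """Shards: each worker owns a slice of the data, picked by a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % shards


def batches(items: List[str], preferred_batch_size: int) -> Iterator[List[str]]:
    """Dynamic batching (simplified): group queued requests into one model call."""
    for start in range(0, len(items), preferred_batch_size):
        yield items[start:start + preferred_batch_size]


queued = ["a", "b", "c", "d", "e"]
by_replica = round_robin(queued, replicas=2)
grouped = list(batches(queued, preferred_batch_size=2))
print(by_replica)
print(grouped)
print(shard_for("doc-1", shards=3))
```

Replicas raise throughput for the same data, shards split the data itself, and batching amortizes per-call model overhead across several requests.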

Example scaling a Stable Diffusion deployment:

Cloud Deployment

Containerize Services

Structure your Executor:

requirements.txt

Configure:

Push to Hub:

Deploy to Kubernetes

Use Docker Compose

JCloud Deployment

Deploy with a single command:

LLM Streaming

Enable token-by-token streaming for responsive LLM applications:

Define schemas:

Initialize service:

Implement streaming:

Serve and use:
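The steps above follow a common pattern: the service yields one small document per token, and the client consumes them as they arrive. Below is a minimal sketch of that pattern using only `asyncio`; it is not the Jina API. `PromptDoc`, `TokenDoc`, `stream_tokens`, and `collect` are hypothetical names, and the "model" simply echoes the prompt's words.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, List


# Schemas: one document for the request, one per streamed token.
@dataclass
class PromptDoc:
    prompt: str
    max_tokens: int


@dataclass
class TokenDoc:
    token_id: int
    text: str


async def stream_tokens(doc: PromptDoc) -> AsyncIterator[TokenDoc]:
    # Stand-in for a model: emits the prompt's words one at a time.
    words = doc.prompt.split()
    for token_id, word in enumerate(words[: doc.max_tokens]):
        await asyncio.sleep(0)  # hand control back to the event loop between tokens
        yield TokenDoc(token_id=token_id, text=word)


async def collect(doc: PromptDoc) -> List[str]:
    received: List[str] = []
    async for token in stream_tokens(doc):
        received.append(token.text)  # a client could display each token here
    return received


tokens = asyncio.run(
    collect(PromptDoc(prompt="what is the capital of France", max_tokens=4))
)
print(tokens)
```

Because the endpoint is an async generator, the consumer sees the first token as soon as it is produced instead of waiting for the full completion.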

Support

Jina-serve is backed by Jina AI and licensed under Apache-2.0.
