Qubrid AI Inference API
The Qubrid AI Inference API provides a single, OpenAI-compatible endpoint for orchestrating 40+ open-source models running on NVIDIA GPU infrastructure. By abstracting hardware orchestration through TensorRT-LLM and Triton Inference Server, the API allows enterprise developers to run inference on models without managing underlying infrastructure.
Documentation
Specifications
Schemas & Data
OpenAPI
#Artificial Intelligence
#Inference
#Large Language Models
#Machine Learning
#NVIDIA
#Serverless