qubrid-ai

Qubrid AI Inference API

The Qubrid AI Inference API provides a single, OpenAI-compatible endpoint for orchestrating 40+ open-source models running on NVIDIA GPU infrastructure. By abstracting hardware orchestration through TensorRT-LLM and Triton Inference Server, the API allows enterprise developers to run inference on models without managing underlying infrastructure.

Documentation GitHub OpenAPI

OpenAPI

#Artificial Intelligence #Inference #Large Language Models #Machine Learning #NVIDIA #Serverless

← Back to Q APIs

API Learnings

Toolbox

API Evangelist LLC

Qubrid AI Inference API

Documentation

Specifications

Schemas & Data

OpenAPI

API Details

Provider

Explore more