API Reference
The following API documentation is generated from the OpenAPI specification of the inference endpoints.
LLMaaS API 1.0.0
LLMaaS Inference API — unified interface for AI model inference (models, chat completions, embeddings, rerank).
Model parameters use the format provider/model (e.g. ew/minimax27, anthropic/claude-4-7-opus).
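A minimal sketch of how a client might split such an identifier into its provider and model parts. The helper name `parse_model_ref` is hypothetical, and splitting at the first slash is an assumption; the specification only shows the `provider/model` shape.

```python
from typing import NamedTuple


class ModelRef(NamedTuple):
    provider: str
    model: str


def parse_model_ref(value: str) -> ModelRef:
    """Split a 'provider/model' string at the first slash (assumed convention)."""
    provider, sep, model = value.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider/model', got {value!r}")
    return ModelRef(provider, model)


print(parse_model_ref("ew/minimax27"))
```

Validating the identifier on the client side gives an early, readable error before a request is sent.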
Servers
| Description | URL |
|---|---|
| Base URL of the LLMaaS service. | https://ai.ewcs.ch/ |
Models
GET /v1/models
List available models
Description
Lists available models. If provider is not specified, lists all models from all configured providers.
If a virtual key is provided, Bifrost only lists (and only queries) providers allowed by that virtual key.
Input parameters
| Parameter | In | Type | Default | Nullable | Description |
|---|---|---|---|---|---|
| ApiKeyAuth | header | string | N/A | No | API key authentication via the `x-api-key` header. Virtual keys (prefixed with `sk-bf-`) can also be passed here. |
| VirtualKeyAuth | header | string | N/A | No | LLMaaS Virtual Key for governance, routing, and access control. Supported on all inference endpoints (`/v1/*`, `/openai/*`, `/anthropic/*`, `/bedrock/*`, `/cohere/*`, `/genai/*`, `/langchain/*`, `/litellm/*`, `/pydanticai/*`, `/mcp`), not on management APIs (`/api/*`). Example: `sk-bf-*` prefixed keys. |
| BasicAuth | header | string | N/A | No | Basic authentication using username and password. |
| BearerAuth | header | string | N/A | No | Bearer token authentication. Use your provider API key or LLMaaS authentication token. Virtual keys (prefixed with `sk-bf-`) can also be passed here. |
| page_size | query | integer | | No | Maximum number of models to return |
| page_token | query | string | | No | Token for pagination |
| provider | query | | | No | Filter by provider (e.g., openai, anthropic, bedrock) |
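The sketch below shows one way to assemble a `GET /v1/models` request from these parameters using only the Python standard library. The function name `build_list_models_request` and the key value `sk-bf-example` are placeholders, and the request is only constructed, not sent.

```python
from urllib.parse import urlencode, urljoin
from urllib.request import Request

BASE_URL = "https://ai.ewcs.ch/"


def build_list_models_request(api_key, provider=None, page_size=None, page_token=None):
    """Build (but do not send) a GET /v1/models request."""
    params = {"provider": provider, "page_size": page_size, "page_token": page_token}
    # Only include query parameters the caller actually set.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = urljoin(BASE_URL, "v1/models")
    if query:
        url += "?" + query
    return Request(url, headers={"x-api-key": api_key}, method="GET")


request = build_list_models_request("sk-bf-example", provider="openai", page_size=20)
print(request.get_method(), request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would perform the call; that step is left out so the example runs without network access or a real key.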
Responses
{
"data": [
{
"id": "string",
"canonical_slug": "string",
"name": "string",
"deployment": "string",
"created": 258,
"context_length": 0,
"max_input_tokens": 0,
"max_output_tokens": 0,
"architecture": {
"modality": "string",
"tokenizer": "string",
"instruct_type": "string",
"input_modalities": [
"string"
],
"output_modalities": [
"string"
]
},
"pricing": {
"prompt": "string",
"completion": "string",
"request": "string",
"image": "string",
"web_search": "string",
"internal_reasoning": "string",
"input_cache_read": "string",
"input_cache_write": "string"
},
"top_provider": {
"is_moderated": true,
"context_length": 0,
"max_completion_tokens": 0
},
"per_request_limits": {
"prompt_tokens": 0,
"completion_tokens": 0
},
"supported_parameters": [
"string"
],
"default_parameters": {
"temperature": 10.12,
"top_p": 10.12,
"frequency_penalty": 10.12
},
"hugging_face_id": "string",
"description": "string",
"owned_by": "string",
"supported_methods": [
"string"
]
}
],
"extra_fields": {
"request_type": "string",
"provider": "openai",
"model_requested": "string",
"model_deployment": "string",
"latency": 109,
"chunk_index": 0,
"raw_request": {},
"raw_response": {},
"cache_debug": {
"cache_hit": true,
"cache_id": "string",
"hit_type": "string",
"requested_provider": "string",
"requested_model": "string",
"provider_used": "string",
"model_used": "string",
"input_tokens": 0,
"threshold": 10.12,
"similarity": 10.12
}
},
"next_page_token": "string"
}
Schema of the response body
{
"type": "object",
"properties": {
"data": {
"type": "array",
"items": {
"$ref": "#/components/schemas/Model"
}
},
"extra_fields": {
"$ref": "#/components/schemas/BifrostResponseExtraFields"
},
"next_page_token": {
"type": "string"
}
}
}
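The `next_page_token` field suggests token-based pagination. A minimal sketch of a client-side loop follows, assuming that a missing or empty `next_page_token` marks the last page (the specification does not state this) and that the token is passed back as `page_token`. The `fetch_page` callable is hypothetical and stands in for the actual HTTP call.

```python
from typing import Callable, Iterator, Optional


def iter_models(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield models from every page; fetch_page(token) returns one response body."""
    token = None
    while True:
        page = fetch_page(token)
        yield from page.get("data", [])
        token = page.get("next_page_token")
        if not token:  # assumption: no token means there are no further pages
            break


# Two fake pages stand in for real responses.
_pages = {
    None: {"data": [{"id": "a/x"}], "next_page_token": "t1"},
    "t1": {"data": [{"id": "b/y"}]},
}
ids = [model["id"] for model in iter_models(lambda token: _pages[token])]
print(ids)
```

Injecting the fetch function keeps the paging logic separate from transport and authentication details.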
{
"event_id": "string",
"type": "string",
"is_bifrost_error": true,
"status_code": 0,
"error": {
"type": "string",
"code": "string",
"message": "string",
"param": "string",
"event_id": "string"
},
"extra_fields": {
"provider": "openai",
"model_requested": "string",
"request_type": "string"
}
}
Schema of the response body
{
"type": "object",
"description": "Error response from Bifrost",
"properties": {
"event_id": {
"type": "string"
},
"type": {
"type": "string"
},
"is_bifrost_error": {
"type": "boolean"
},
"status_code": {
"type": "integer"
},
"error": {
"$ref": "#/components/schemas/ErrorField"
},
"extra_fields": {
"$ref": "#/components/schemas/BifrostErrorExtraFields"
}
}
}
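A sketch of how a client might turn such an error body into a typed value. The `ApiError` class and `parse_error` helper are hypothetical, the sample values are made up for illustration, and treating the presence of an `error` object as the error signal is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ApiError:
    status_code: Optional[int]
    error_type: Optional[str]
    code: Optional[str]
    message: Optional[str]
    event_id: Optional[str]


def parse_error(payload: dict) -> Optional[ApiError]:
    """Return an ApiError if the payload carries an `error` object, else None."""
    error = payload.get("error")
    if not isinstance(error, dict):
        return None
    return ApiError(
        status_code=payload.get("status_code"),
        error_type=error.get("type"),
        code=error.get("code"),
        message=error.get("message"),
        # event_id appears both at the top level and inside `error`.
        event_id=payload.get("event_id") or error.get("event_id"),
    )


sample = {
    "event_id": "evt-1",
    "is_bifrost_error": True,
    "status_code": 429,
    "error": {"type": "rate_limit", "message": "Too many requests"},
}
print(parse_error(sample))
```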
Chat Completions
POST /v1/chat/completions
Create a chat completion
Description
Creates a completion for the provided messages. Supports streaming via SSE.
Input parameters
| Parameter | In | Type | Default | Nullable | Description |
|---|---|---|---|---|---|
| ApiKeyAuth | header | string | N/A | No | API key authentication via the `x-api-key` header. Virtual keys (prefixed with `sk-bf-`) can also be passed here. |
| VirtualKeyAuth | header | string | N/A | No | LLMaaS Virtual Key for governance, routing, and access control. Supported on all inference endpoints (`/v1/*`, `/openai/*`, `/anthropic/*`, `/bedrock/*`, `/cohere/*`, `/genai/*`, `/langchain/*`, `/litellm/*`, `/pydanticai/*`, `/mcp`), not on management APIs (`/api/*`). Example: `sk-bf-*` prefixed keys. |
| BasicAuth | header | string | N/A | No | Basic authentication using username and password. |
| BearerAuth | header | string | N/A | No | Bearer token authentication. Use your provider API key or LLMaaS authentication token. Virtual keys (prefixed with `sk-bf-`) can also be passed here. |
Request body
{
"model": "openai/gpt-4",
"messages": [
{
"role": "assistant",
"name": "string",
"content": null,
"tool_call_id": "string",
"refusal": "string",
"audio": {
"id": "string",
"data": "string",
"expires_at": 0,
"transcript": "string"
},
"reasoning": "string",
"reasoning_details": [
{
"id": "string",
"index": 0,
"type": "reasoning.summary",
"summary": "string",
"text": "string",
"signature": "string",
"data": "string"
}
],
"annotations": [
{
"type": "string",
"url_citation": {
"start_index": 0,
"end_index": 0,
"title": "string",
"url": "string",
"sources": {},
"type": "string"
}
}
],
"tool_calls": [
{
"index": 0,
"type": "string",
"id": "string",
"function": {
"name": "string",
"arguments": "string"
}
}
]
}
],
"fallbacks": [
"string"
],
"stream": true,
"frequency_penalty": 10.12,
"logit_bias": {},
"logprobs": true,
"max_completion_tokens": 0,
"metadata": {},
"modalities": [
"string"
],
"parallel_tool_calls": true,
"presence_penalty": 10.12,
"prompt_cache_key": "string",
"reasoning": {
"effort": "none",
"max_tokens": 0
},
"response_format": {},
"safety_identifier": "string",
"service_tier": "string",
"stream_options": {
"include_obfuscation": true,
"include_usage": true
},
"store": true,
"temperature": 10.12,
"tool_choice": null,
"tools": [
{
"type": "function",
"function": {
"name": "string",
"description": "string",
"parameters": {
"type": "string",
"description": "string",
"required": [
"string"
],
"properties": {},
"enum": [
"string"
],
"additionalProperties": true
},
"strict": true
},
"custom": {
"format": {
"type": "string",
"grammar": {
"definition": "string",
"syntax": "lark"
}
}
},
"cache_control": {
"type": "ephemeral",
"ttl": "string"
}
}
],
"seed": 0,
"top_p": 10.12,
"top_logprobs": 0,
"stop": null,
"prediction": {
"type": "string",
"content": null
},
"prompt_cache_retention": "in-memory",
"web_search_options": {
"search_context_size": "low",
"user_location": {
"type": "string",
"approximate": {
"city": "string",
"country": "string",
"region": "string",
"timezone": "string"
}
}
},
"truncation": "string",
"user": "string",
"verbosity": "low"
}
Schema of the request body
{
"type": "object",
"required": [
"model",
"messages"
],
"properties": {
"model": {
"type": "string",
"description": "Model in provider/model format (e.g., openai/gpt-4)",
"example": "openai/gpt-4"
},
"messages": {
"type": "array",
"items": {
"$ref": "#/components/schemas/ChatMessage"
},
"description": "List of messages in the conversation"
},
"fallbacks": {
"type": "array",
"items": {
"type": "string"
},
"description": "Fallback models in provider/model format"
},
"stream": {
"type": "boolean",
"description": "Whether to stream the response"
},
"frequency_penalty": {
"type": "number",
"minimum": -2.0,
"maximum": 2.0
},
"logit_bias": {
"type": "object",
"additionalProperties": {
"type": "number"
}
},
"logprobs": {
"type": "boolean"
},
"max_completion_tokens": {
"type": "integer"
},
"metadata": {
"type": "object",
"additionalProperties": true
},
"modalities": {
"type": "array",
"items": {
"type": "string"
}
},
"parallel_tool_calls": {
"type": "boolean"
},
"presence_penalty": {
"type": "number",
"minimum": -2.0,
"maximum": 2.0
},
"prompt_cache_key": {
"type": "string"
},
"reasoning": {
"type": "object",
"properties": {
"effort": {
"type": "string",
"description": "Reasoning effort level",
"enum": [
"none",
"minimal",
"low",
"medium",
"high",
"xhigh"
]
},
"max_tokens": {
"type": "integer"
}
}
},
"response_format": {
"type": "object",
"description": "Format for the response"
},
"safety_identifier": {
"type": "string"
},
"service_tier": {
"type": "string"
},
"stream_options": {
"type": "object",
"properties": {
"include_obfuscation": {
"type": "boolean"
},
"include_usage": {
"type": "boolean"
}
}
},
"store": {
"type": "boolean"
},
"temperature": {
"type": "number",
"minimum": 0,
"maximum": 2
},
"tool_choice": {
"oneOf": [
{
"type": "string",
"enum": [
"none",
"auto",
"required"
]
},
{
"type": "object",
"required": [
"type"
],
"properties": {
"type": {
"type": "string",
"enum": [
"none",
"any",
"required",
"function",
"allowed_tools",
"custom"
]
},
"function": {
"type": "object",
"required": [
"name"
],
"properties": {
"name": {
"type": "string"
}
}
},
"allowed_tools": {
"type": "object",
"properties": {
"mode": {
"type": "string",
"enum": [
"auto",
"required"
]
},
"tools": {
"type": "array",
"items": {
"type": "object",
"required": [
"type"
],
"properties": {
"type": {
"type": "string"
},
"function": {
"type": "object",
"required": [
"name"
],
"properties": {
"name": {
"type": "string"
}
}
}
}
}
}
}
}
}
}
]
},
"tools": {
"type": "array",
"items": {
"type": "object",
"required": [
"type"
],
"properties": {
"type": {
"type": "string",
"enum": [
"function",
"custom"
]
},
"function": {
"type": "object",
"required": [
"name"
],
"properties": {
"name": {
"type": "string"
},
"description": {
"type": "string"
},
"parameters": {
"type": "object",
"properties": {
"type": {
"type": "string"
},
"description": {
"type": "string"
},
"required": {
"type": "array",
"items": {
"type": "string"
}
},
"properties": {
"type": "object",
"additionalProperties": true
},
"enum": {
"type": "array",
"items": {
"type": "string"
}
},
"additionalProperties": {
"type": "boolean"
}
}
},
"strict": {
"type": "boolean"
}
}
},
"custom": {
"type": "object",
"properties": {
"format": {
"type": "object",
"required": [
"type"
],
"properties": {
"type": {
"type": "string"
},
"grammar": {
"type": "object",
"required": [
"definition",
"syntax"
],
"properties": {
"definition": {
"type": "string"
},
"syntax": {
"type": "string",
"enum": [
"lark",
"regex"
]
}
}
}
}
}
}
},
"cache_control": {
"$ref": "#/components/schemas/CacheControl"
}
}
}
},
"seed": {
"type": "integer",
"description": "Deterministic sampling seed"
},
"top_p": {
"type": "number",
"minimum": 0,
"maximum": 1,
"description": "Nucleus sampling parameter"
},
"top_logprobs": {
"type": "integer",
"minimum": 0,
"maximum": 20,
"description": "Number of most likely tokens to return at each position"
},
"stop": {
"oneOf": [
{
"type": "string"
},
{
"type": "array",
"items": {
"type": "string"
}
}
],
"description": "Up to 4 sequences where the API will stop generating tokens"
},
"prediction": {
"type": "object",
"description": "Predicted output content for the model to reference (OpenAI only). Can reduce latency.",
"properties": {
"type": {
"type": "string",
"description": "Always \"content\""
},
"content": {
"description": "Predicted content (string or array of content parts)",
"oneOf": [
{
"type": "string"
},
{
"type": "array",
"items": {
"type": "object",
"additionalProperties": true
}
}
]
}
}
},
"prompt_cache_retention": {
"type": "string",
"enum": [
"in-memory",
"24h"
],
"description": "Prompt cache retention policy"
},
"web_search_options": {
"type": "object",
"description": "Web search options for chat completions (OpenAI only)",
"properties": {
"search_context_size": {
"type": "string",
"enum": [
"low",
"medium",
"high"
],
"description": "Amount of search context to include"
},
"user_location": {
"type": "object",
"properties": {
"type": {
"type": "string",
"description": "Location type (e.g., \"approximate\")"
},
"approximate": {
"type": "object",
"properties": {
"city": {
"type": "string"
},
"country": {
"type": "string",
"description": "Two-letter ISO country code (e.g., \"US\")"
},
"region": {
"type": "string",
"description": "Region or state (e.g., \"California\")"
},
"timezone": {
"type": "string",
"description": "IANA timezone (e.g., \"America/Los_Angeles\")"
}
}
}
}
}
}
},
"truncation": {
"type": "string"
},
"user": {
"type": "string"
},
"verbosity": {
"type": "string",
"enum": [
"low",
"medium",
"high"
]
}
}
}
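A minimal sketch of building a request body that respects the required fields and numeric limits in this schema. The helper `build_chat_request` is hypothetical, and the `user` role in the sample message is an assumption, since the `ChatMessage` schema is only referenced here.

```python
import json

# Inclusive ranges taken from the request schema.
_RANGES = {
    "temperature": (0, 2),
    "top_p": (0, 1),
    "frequency_penalty": (-2.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
    "top_logprobs": (0, 20),
}


def build_chat_request(model, messages, **options):
    """Return a chat completion body after basic client-side checks."""
    if not model or not messages:
        raise ValueError("'model' and 'messages' are required")
    for name, (low, high) in _RANGES.items():
        value = options.get(name)
        if value is not None and not low <= value <= high:
            raise ValueError(f"{name}={value} is outside [{low}, {high}]")
    stop = options.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        raise ValueError("'stop' accepts up to 4 sequences")
    return {"model": model, "messages": messages, **options}


body = build_chat_request(
    "openai/gpt-4",
    [{"role": "user", "content": "Hello"}],
    temperature=0.2,
    fallbacks=["anthropic/claude-4-7-opus"],
)
print(json.dumps(body, indent=2))
```

The server remains the authority on validation; checks like these only catch obvious mistakes before a request is sent.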
Responses
{
"id": "string",
"choices": [
{
"index": 0,
"finish_reason": "string",
"log_probs": {
"content": [
{
"bytes": [
0
],
"logprob": 10.12,
"token": "string",
"top_logprobs": [
{
"bytes": [
0
],
"logprob": 10.12,
"token": "string"
}
]
}
],
"refusal": [
{
"bytes": [
0
],
"logprob": 10.12,
"token": "string"
}
],
"text_offset": [
0
],
"token_logprobs": [
10.12
],
"tokens": [
"string"
],
"top_logprobs": [
{}
]
},
"text": "string",
"message": {
"role": "assistant",
"name": "string",
"content": null,
"tool_call_id": "string",
"refusal": "string",
"audio": {
"id": "string",
"data": "string",
"expires_at": 0,
"transcript": "string"
},
"reasoning": "string",
"reasoning_details": [
{
"id": "string",
"index": 0,
"type": "reasoning.summary",
"summary": "string",
"text": "string",
"signature": "string",
"data": "string"
}
],
"annotations": [
{
"type": "string",
"url_citation": {
"start_index": 0,
"end_index": 0,
"title": "string",
"url": "string",
"sources": {},
"type": "string"
}
}
],
"tool_calls": [
{
"index": 0,
"type": "string",
"id": "string",
"function": {
"name": "string",
"arguments": "string"
}
}
]
},
"delta": {
"role": "string",
"content": "string",
"refusal": "string",
"audio": {
"id": "string",
"data": "string",
"expires_at": 0,
"transcript": "string"
},
"reasoning": "string",
"reasoning_details": [
{
"id": "string",
"index": 0,
"type": "reasoning.summary",
"summary": "string",
"text": "string",
"signature": "string",
"data": "string"
}
],
"tool_calls": [
{
"index": 0,
"type": "string",
"id": "string",
"function": {
"name": "string",
"arguments": "string"
}
}
]
}
}
],
"created": 0,
"model": "string",
"object": "string",
"service_tier": "string",
"system_fingerprint": "string",
"usage": {
"prompt_tokens": 0,
"prompt_tokens_details": {
"text_tokens": 0,
"audio_tokens": 0,
"image_tokens": 0,
"cached_read_tokens": 0,
"cached_write_tokens": 0
},
"completion_tokens": 0,
"completion_tokens_details": {
"text_tokens": 0,
"accepted_prediction_tokens": 0,
"audio_tokens": 0,
"citation_tokens": 0,
"num_search_queries": 0,
"reasoning_tokens": 0,
"image_tokens": 0,
"rejected_prediction_tokens": 0
},
"total_tokens": 0,
"cost": {
"input_tokens_cost": 10.12,
"output_tokens_cost": 10.12,
"reasoning_tokens_cost": 10.12,
"citation_tokens_cost": 10.12,
"search_queries_cost": 10.12,
"request_cost": 10.12,
"total_cost": 10.12
}
},
"extra_fields": {
"request_type": "string",
"provider": "openai",
"model_requested": "string",
"model_deployment": "string",
"latency": 85,
"chunk_index": 0,
"raw_request": {},
"raw_response": {},
"cache_debug": {
"cache_hit": true,
"cache_id": "string",
"hit_type": "string",
"requested_provider": "string",
"requested_model": "string",
"provider_used": "string",
"model_used": "string",
"input_tokens": 0,
"threshold": 10.12,
"similarity": 10.12
}
},
"search_results": [
{
"title": "string",
"url": "string",
"date": "string",
"last_updated": "string",
"snippet": "string",
"source": "string"
}
],
"videos": [
{
"url": "string",
"thumbnail_url": "string",
"thumbnail_width": 0,
"thumbnail_height": 0,
"duration": 10.12
}
],
"citations": [
"string"
]
}
Schema of the response body
{
"type": "object",
"properties": {
"id": {
"type": "string"
},
"choices": {
"type": "array",
"items": {
"type": "object",
"properties": {
"index": {
"type": "integer"
},
"finish_reason": {
"type": "string"
},
"log_probs": {
"type": "object",
"properties": {
"content": {
"type": "array",
"items": {
"type": "object",
"properties": {
"bytes": {
"type": "array",
"items": {
"type": "integer"
}
},
"logprob": {
"type": "number"
},
"token": {
"type": "string"
},
"top_logprobs": {
"type": "array",
"items": {
"type": "object",
"properties": {
"bytes": {
"type": "array",
"items": {
"type": "integer"
}
},
"logprob": {
"type": "number"
},
"token": {
"type": "string"
}
}
}
}
}
}
},
"refusal": {
"type": "array",
"items": {
"type": "object",
"properties": {
"bytes": {
"type": "array",
"items": {
"type": "integer"
}
},
"logprob": {
"type": "number"
},
"token": {
"type": "string"
}
}
}
},
"text_offset": {
"type": "array",
"items": {
"type": "integer"
}
},
"token_logprobs": {
"type": "array",
"items": {
"type": "number"
}
},
"tokens": {
"type": "array",
"items": {
"type": "string"
}
},
"top_logprobs": {
"type": "array",
"items": {
"type": "object",
"additionalProperties": {
"type": "number"
}
}
}
}
},
"text": {
"type": "string",
"description": "For text completions"
},
"message": {
"$ref": "#/components/schemas/ChatMessage",
"description": "For non-streaming chat completions"
},
"delta": {
"type": "object",
"properties": {
"role": {
"type": "string"
},
"content": {
"type": "string"
},
"refusal": {
"type": "string"
},
"audio": {
"type": "object",
"properties": {
"id": {
"type": "string"
},
"data": {
"type": "string"
},
"expires_at": {
"type": "integer"
},
"transcript": {
"type": "string"
}
}
},
"reasoning": {
"type": "string"
},
"reasoning_details": {
"type": "array",
"items": {
"type": "object",
"properties": {
"id": {
"type": "string"
},
"index": {
"type": "integer"
},
"type": {
"type": "string",
"enum": [
"reasoning.summary",
"reasoning.encrypted",
"reasoning.text"
]
},
"summary": {
"type": "string"
},
"text": {
"type": "string"
},
"signature": {
"type": "string"
},
"data": {
"type": "string"
}
}
}
},
"tool_calls": {
"type": "array",
"items": {
"type": "object",
"required": [
"function"
],
"properties": {
"index": {
"type": "integer"
},
"type": {
"type": "string"
},
"id": {
"type": "string"
},
"function": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"arguments": {
"type": "string"
}
}
}
}
}
}
},
"description": "For streaming chat completions"
}
}
}
},
"created": {
"type": "integer"
},
"model": {
"type": "string"
},
"object": {
"type": "string"
},
"service_tier": {
"type": "string"
},
"system_fingerprint": {
"type": "string"
},
"usage": {
"$ref": "#/components/schemas/BifrostLLMUsage"
},
"extra_fields": {
"$ref": "#/components/schemas/BifrostResponseExtraFields"
},
"search_results": {
"type": "array",
"items": {
"$ref": "#/components/schemas/PerplexitySearchResult"
}
},
"videos": {
"type": "array",
"items": {
"$ref": "#/components/schemas/PerplexityVideoResult"
}
},
"citations": {
"type": "array",
"items": {
"type": "string"
}
}
}
}
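A short sketch of pulling the commonly needed values out of a non-streaming response: the first choice's message content, its finish reason, and the token and cost totals from `usage`. The helper name `summarize_completion` and the sample values are illustrative only.

```python
def summarize_completion(response: dict) -> dict:
    """Extract text, finish reason, and usage totals from a completion body."""
    choices = response.get("choices") or []
    first = choices[0] if choices else {}
    message = first.get("message") or {}
    usage = response.get("usage") or {}
    cost = usage.get("cost") or {}
    return {
        "content": message.get("content"),
        "finish_reason": first.get("finish_reason"),
        "total_tokens": usage.get("total_tokens"),
        "total_cost": cost.get("total_cost"),
    }


sample = {
    "choices": [
        {
            "index": 0,
            "finish_reason": "stop",
            "message": {"role": "assistant", "content": "Hi there"},
        }
    ],
    "usage": {"total_tokens": 12, "cost": {"total_cost": 0.0004}},
}
print(summarize_completion(sample))
```

Every lookup tolerates a missing field, because the schema marks none of these properties as required.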
{
"id": "string",
"choices": [
{
"index": 0,
"finish_reason": "string",
"log_probs": {
"content": [
{
"bytes": [
0
],
"logprob": 10.12,
"token": "string",
"top_logprobs": [
{
"bytes": [
0
],
"logprob": 10.12,
"token": "string"
}
]
}
],
"refusal": [
{
"bytes": [
0
],
"logprob": 10.12,
"token": "string"
}
],
"text_offset": [
0
],
"token_logprobs": [
10.12
],
"tokens": [
"string"
],
"top_logprobs": [
{}
]
},
"text": "string",
"message": {
"role": "assistant",
"name": "string",
"content": null,
"tool_call_id": "string",
"refusal": "string",
"audio": {
"id": "string",
"data": "string",
"expires_at": 0,
"transcript": "string"
},
"reasoning": "string",
"reasoning_details": [
{
"id": "string",
"index": 0,
"type": "reasoning.summary",
"summary": "string",
"text": "string",
"signature": "string",
"data": "string"
}
],
"annotations": [
{
"type": "string",
"url_citation": {
"start_index": 0,
"end_index": 0,
"title": "string",
"url": "string",
"sources": {},
"type": "string"
}
}
],
"tool_calls": [
{
"index": 0,
"type": "string",
"id": "string",
"function": {
"name": "string",
"arguments": "string"
}
}
]
},
"delta": {
"role": "string",
"content": "string",
"refusal": "string",
"audio": {
"id": "string",
"data": "string",
"expires_at": 0,
"transcript": "string"
},
"reasoning": "string",
"reasoning_details": [
{
"id": "string",
"index": 0,
"type": "reasoning.summary",
"summary": "string",
"text": "string",
"signature": "string",
"data": "string"
}
],
"tool_calls": [
{
"index": 0,
"type": "string",
"id": "string",
"function": {
"name": "string",
"arguments": "string"
}
}
]
}
}
],
"created": 0,
"model": "string",
"object": "string",
"usage": {
"prompt_tokens": 0,
"prompt_tokens_details": {
"text_tokens": 0,
"audio_tokens": 0,
"image_tokens": 0,
"cached_read_tokens": 0,
"cached_write_tokens": 0
},
"completion_tokens": 0,
"completion_tokens_details": {
"text_tokens": 0,
"accepted_prediction_tokens": 0,
"audio_tokens": 0,
"citation_tokens": 0,
"num_search_queries": 0,
"reasoning_tokens": 0,
"image_tokens": 0,
"rejected_prediction_tokens": 0
},
"total_tokens": 0,
"cost": {
"input_tokens_cost": 10.12,
"output_tokens_cost": 10.12,
"reasoning_tokens_cost": 10.12,
"citation_tokens_cost": 10.12,
"search_queries_cost": 10.12,
"request_cost": 10.12,
"total_cost": 10.12
}
},
"extra_fields": {
"request_type": "string",
"provider": "openai",
"model_requested": "string",
"model_deployment": "string",
"latency": 13,
"chunk_index": 0,
"raw_request": {},
"raw_response": {},
"cache_debug": {
"cache_hit": true,
"cache_id": "string",
"hit_type": "string",
"requested_provider": "string",
"requested_model": "string",
"provider_used": "string",
"model_used": "string",
"input_tokens": 0,
"threshold": 10.12,
"similarity": 10.12
}
}
}
Schema of the response body
{
"type": "object",
"description": "Streaming chat completion response (SSE format)",
"properties": {
"id": {
"type": "string"
},
"choices": {
"type": "array",
"items": {
"type": "object",
"properties": {
"index": {
"type": "integer"
},
"finish_reason": {
"type": "string"
},
"log_probs": {
"type": "object",
"properties": {
"content": {
"type": "array",
"items": {
"type": "object",
"properties": {
"bytes": {
"type": "array",
"items": {
"type": "integer"
}
},
"logprob": {
"type": "number"
},
"token": {
"type": "string"
},
"top_logprobs": {
"type": "array",
"items": {
"type": "object",
"properties": {
"bytes": {
"type": "array",
"items": {
"type": "integer"
}
},
"logprob": {
"type": "number"
},
"token": {
"type": "string"
}
}
}
}
}
}
},
"refusal": {
"type": "array",
"items": {
"type": "object",
"properties": {
"bytes": {
"type": "array",
"items": {
"type": "integer"
}
},
"logprob": {
"type": "number"
},
"token": {
"type": "string"
}
}
}
},
"text_offset": {
"type": "array",
"items": {
"type": "integer"
}
},
"token_logprobs": {
"type": "array",
"items": {
"type": "number"
}
},
"tokens": {
"type": "array",
"items": {
"type": "string"
}
},
"top_logprobs": {
"type": "array",
"items": {
"type": "object",
"additionalProperties": {
"type": "number"
}
}
}
}
},
"text": {
"type": "string",
"description": "For text completions"
},
"message": {
"$ref": "#/components/schemas/ChatMessage",
"description": "For non-streaming chat completions"
},
"delta": {
"type": "object",
"properties": {
"role": {
"type": "string"
},
"content": {
"type": "string"
},
"refusal": {
"type": "string"
},
"audio": {
"type": "object",
"properties": {
"id": {
"type": "string"
},
"data": {
"type": "string"
},
"expires_at": {
"type": "integer"
},
"transcript": {
"type": "string"
}
}
},
"reasoning": {
"type": "string"
},
"reasoning_details": {
"type": "array",
"items": {
"type": "object",
"properties": {
"id": {
"type": "string"
},
"index": {
"type": "integer"
},
"type": {
"type": "string",
"enum": [
"reasoning.summary",
"reasoning.encrypted",
"reasoning.text"
]
},
"summary": {
"type": "string"
},
"text": {
"type": "string"
},
"signature": {
"type": "string"
},
"data": {
"type": "string"
}
}
}
},
"tool_calls": {
"type": "array",
"items": {
"type": "object",
"required": [
"function"
],
"properties": {
"index": {
"type": "integer"
},
"type": {
"type": "string"
},
"id": {
"type": "string"
},
"function": {
"type": "object",
"properties": {
"name": {
"type": "string"
},
"arguments": {
"type": "string"
}
}
}
}
}
}
},
"description": "For streaming chat completions"
}
}
}
},
"created": {
"type": "integer"
},
"model": {
"type": "string"
},
"object": {
"type": "string"
},
"usage": {
"$ref": "#/components/schemas/BifrostLLMUsage"
},
"extra_fields": {
"$ref": "#/components/schemas/BifrostResponseExtraFields"
}
}
}
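A sketch of consuming the streamed chunks and concatenating `delta.content`. It assumes each event arrives as a single `data: <json>` line and that the stream ends with `data: [DONE]`; both follow a common SSE convention for chat APIs and are not confirmed by this specification. The function names are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_sse_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON chunks from 'data:' lines until the assumed [DONE] marker."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        yield json.loads(data)


def collect_text(chunks: Iterable[dict]) -> str:
    """Concatenate the delta.content fragments of all choices, in arrival order."""
    parts = []
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            content = (choice.get("delta") or {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)


sample_stream = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
]
text = collect_text(iter_sse_chunks(sample_stream))
print(text)
```

Events whose data spans several `data:` lines are not handled here; a full SSE client would join them before decoding.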
{
"event_id": "string",
"type": "string",
"is_bifrost_error": true,
"status_code": 0,
"error": {
"type": "string",
"code": "string",
"message": "string",
"param": "string",
"event_id": "string"
},
"extra_fields": {
"provider": "openai",
"model_requested": "string",
"request_type": "string"
}
}
Schema of the response body
{
"type": "object",
"description": "Error response from Bifrost",
"properties": {
"event_id": {
"type": "string"
},
"type": {
"type": "string"
},
"is_bifrost_error": {
"type": "boolean"
},
"status_code": {
"type": "integer"
},
"error": {
"$ref": "#/components/schemas/ErrorField"
},
"extra_fields": {
"$ref": "#/components/schemas/BifrostErrorExtraFields"
}
}
}
Embeddings
POST /v1/embeddings
Create embeddings
Description
Creates an embedding vector representing the input text.
Input parameters
| Parameter | In | Type | Default | Nullable | Description |
|---|---|---|---|---|---|
| ApiKeyAuth | header | string | N/A | No | API key authentication via the `x-api-key` header. Virtual keys (prefixed with `sk-bf-`) can also be passed here. |
| VirtualKeyAuth | header | string | N/A | No | LLMaaS Virtual Key for governance, routing, and access control. Supported on all inference endpoints (`/v1/*`, `/openai/*`, `/anthropic/*`, `/bedrock/*`, `/cohere/*`, `/genai/*`, `/langchain/*`, `/litellm/*`, `/pydanticai/*`, `/mcp`), not on management APIs (`/api/*`). Example: `sk-bf-*` prefixed keys. |
| BasicAuth | header | string | N/A | No | Basic authentication using username and password. |
| BearerAuth | header | string | N/A | No | Bearer token authentication. Use your provider API key or LLMaaS authentication token. Virtual keys (prefixed with `sk-bf-`) can also be passed here. |
Request body
{
"model": "string",
"input": null,
"fallbacks": [
"string"
],
"encoding_format": "float",
"dimensions": 0
}
Schema of the request body
{
"type": "object",
"required": [
"model",
"input"
],
"properties": {
"model": {
"type": "string",
"description": "Model in provider/model format"
},
"input": {
"oneOf": [
{
"type": "string"
},
{
"type": "array",
"items": {
"type": "string"
}
},
{
"type": "array",
"items": {
"type": "integer"
}
},
{
"type": "array",
"items": {
"type": "array",
"items": {
"type": "integer"
}
}
}
],
"description": "Input for embedding - text or token arrays"
},
"fallbacks": {
"type": "array",
"items": {
"type": "string"
}
},
"encoding_format": {
"type": "string",
"enum": [
"float",
"base64"
]
},
"dimensions": {
"type": "integer"
}
}
}
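The `input` field accepts four shapes. The sketch below classifies a value against them and builds a request body; the helper names and the model identifier `openai/example-embedding-model` are hypothetical. Rejecting an empty outer list is a choice of this sketch, since an empty array would match more than one `oneOf` branch.

```python
_FORMATS = {"float", "base64"}


def _is_int(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def input_kind(value) -> str:
    """Name the oneOf branch a value matches, or raise ValueError."""
    if isinstance(value, str):
        return "string"
    if isinstance(value, list) and value:
        if all(isinstance(v, str) for v in value):
            return "string[]"
        if all(_is_int(v) for v in value):
            return "integer[]"
        if all(isinstance(v, list) and all(_is_int(t) for t in v) for v in value):
            return "integer[][]"
    raise ValueError("input must be text, a list of texts, or token arrays")


def build_embeddings_request(model, input_value, encoding_format=None, dimensions=None):
    """Return an embeddings request body after checking the input shape."""
    input_kind(input_value)  # raises on unsupported shapes
    if encoding_format is not None and encoding_format not in _FORMATS:
        raise ValueError("encoding_format must be 'float' or 'base64'")
    body = {"model": model, "input": input_value}
    if encoding_format is not None:
        body["encoding_format"] = encoding_format
    if dimensions is not None:
        body["dimensions"] = dimensions
    return body


print(build_embeddings_request("openai/example-embedding-model", ["first text", "second text"]))
```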
Responses
{
"data": [
{
"index": 0,
"object": "string",
"embedding": null
}
],
"model": "string",
"object": "string",
"usage": {
"prompt_tokens": 0,
"prompt_tokens_details": {
"text_tokens": 0,
"audio_tokens": 0,
"image_tokens": 0,
"cached_read_tokens": 0,
"cached_write_tokens": 0
},
"completion_tokens": 0,
"completion_tokens_details": {
"text_tokens": 0,
"accepted_prediction_tokens": 0,
"audio_tokens": 0,
"citation_tokens": 0,
"num_search_queries": 0,
"reasoning_tokens": 0,
"image_tokens": 0,
"rejected_prediction_tokens": 0
},
"total_tokens": 0,
"cost": {
"input_tokens_cost": 10.12,
"output_tokens_cost": 10.12,
"reasoning_tokens_cost": 10.12,
"citation_tokens_cost": 10.12,
"search_queries_cost": 10.12,
"request_cost": 10.12,
"total_cost": 10.12
}
},
"extra_fields": {
"request_type": "string",
"provider": "openai",
"model_requested": "string",
"model_deployment": "string",
"latency": 38,
"chunk_index": 0,
"raw_request": {},
"raw_response": {},
"cache_debug": {
"cache_hit": true,
"cache_id": "string",
"hit_type": "string",
"requested_provider": "string",
"requested_model": "string",
"provider_used": "string",
"model_used": "string",
"input_tokens": 0,
"threshold": 10.12,
"similarity": 10.12
}
}
}
Schema of the response body
{
"type": "object",
"properties": {
"data": {
"type": "array",
"items": {
"type": "object",
"properties": {
"index": {
"type": "integer"
},
"object": {
"type": "string"
},
"embedding": {
"oneOf": [
{
"type": "string"
},
{
"type": "array",
"items": {
"type": "number"
}
},
{
"type": "array",
"items": {
"type": "array",
"items": {
"type": "number"
}
}
}
]
}
}
}
},
"model": {
"type": "string"
},
"object": {
"type": "string"
},
"usage": {
"$ref": "#/components/schemas/BifrostLLMUsage"
},
"extra_fields": {
"$ref": "#/components/schemas/BifrostResponseExtraFields"
}
}
}
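The `embedding` field can be a string or an array of numbers. A sketch of normalizing both cases follows, assuming that the string form is base64-encoded little-endian 32-bit floats. This encoding is a common convention for `encoding_format: "base64"` but is not stated in this specification, so treat it as an assumption. The helper name `decode_embedding` is hypothetical.

```python
import base64
import struct


def decode_embedding(value):
    """Return the embedding as Python numbers; decode the base64 string form if given."""
    if isinstance(value, str):
        raw = base64.b64decode(value)
        if len(raw) % 4 != 0:
            raise ValueError("byte length is not a multiple of 4 (float32 assumed)")
        count = len(raw) // 4
        # "<" selects little-endian, "f" is a 32-bit float (assumed wire format).
        return list(struct.unpack(f"<{count}f", raw))
    return value  # already an array of numbers (or an array of arrays)


packed = base64.b64encode(struct.pack("<3f", 1.0, -0.5, 0.25)).decode("ascii")
print(decode_embedding(packed))
```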
{
"event_id": "string",
"type": "string",
"is_bifrost_error": true,
"status_code": 0,
"error": {
"type": "string",
"code": "string",
"message": "string",
"param": "string",
"event_id": "string"
},
"extra_fields": {
"provider": "openai",
"model_requested": "string",
"request_type": "string"
}
}
Schema of the response body
{
"type": "object",
"description": "Error response from Bifrost",
"properties": {
"event_id": {
"type": "string"
},
"type": {
"type": "string"
},
"is_bifrost_error": {
"type": "boolean"
},
"status_code": {
"type": "integer"
},
"error": {
"$ref": "#/components/schemas/ErrorField"
},
"extra_fields": {
"$ref": "#/components/schemas/BifrostErrorExtraFields"
}
}
}
Rerank
POST /v1/rerank
Rerank documents
Description
Reorders input documents by relevance to a query.
Input parameters
| Parameter | In | Type | Default | Nullable | Description |
|---|---|---|---|---|---|
| ApiKeyAuth | header | string | N/A | No | API key authentication via the `x-api-key` header. Virtual keys (prefixed with `sk-bf-`) can also be passed here. |
| VirtualKeyAuth | header | string | N/A | No | LLMaaS Virtual Key for governance, routing, and access control. Supported on all inference endpoints (`/v1/*`, `/openai/*`, `/anthropic/*`, `/bedrock/*`, `/cohere/*`, `/genai/*`, `/langchain/*`, `/litellm/*`, `/pydanticai/*`, `/mcp`), not on management APIs (`/api/*`). Example: `sk-bf-*` prefixed keys. |
| BasicAuth | header | string | N/A | No | Basic authentication using username and password. |
| BearerAuth | header | string | N/A | No | Bearer token authentication. Use your provider API key or LLMaaS authentication token. Virtual keys (prefixed with `sk-bf-`) can also be passed here. |
Request body
{
"model": "cohere/rerank-v3.5",
"query": "string",
"documents": [
{
"text": "string",
"id": "string",
"meta": {}
}
],
"fallbacks": [
"string"
],
"top_n": 0,
"max_tokens_per_doc": 0,
"priority": 0,
"return_documents": true
}
Schema of the request body
{
"type": "object",
"required": [
"model",
"query",
"documents"
],
"properties": {
"model": {
"type": "string",
"description": "Model in provider/model format",
"example": "cohere/rerank-v3.5"
},
"query": {
"type": "string",
"minLength": 1,
"description": "Query used to score and reorder documents"
},
"documents": {
"type": "array",
"description": "Documents to rerank",
"minItems": 1,
"items": {
"$ref": "#/components/schemas/RerankDocument"
}
},
"fallbacks": {
"type": "array",
"items": {
"type": "string"
},
"description": "Fallback models in provider/model format"
},
"top_n": {
"type": "integer",
"minimum": 1,
"description": "Maximum number of ranked results to return"
},
"max_tokens_per_doc": {
"type": "integer",
"minimum": 1,
"description": "Maximum tokens to consider per document (provider-dependent)"
},
"priority": {
"type": "integer",
"description": "Request priority hint (provider-dependent)"
},
"return_documents": {
"type": "boolean",
"description": "Whether to include document content in each result"
}
}
}
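As an illustration of the request schema above, the following Python sketch builds a rerank payload, enforces the documented constraints (`query` non-empty, at least one document, `top_n` at least 1), and prepares the POST request. The function names are hypothetical; the base URL is the server listed in this reference, and bearer authentication is just one of the documented schemes. Which fields of a document are required is defined by the `RerankDocument` schema, which is not shown here, so the sketch does not validate individual documents.

```python
import json
import urllib.request
from typing import Any, Dict, List, Optional

BASE_URL = "https://ai.ewcs.ch"  # server URL listed in this reference


def build_rerank_payload(
    model: str,
    query: str,
    documents: List[Dict[str, Any]],
    top_n: Optional[int] = None,
    return_documents: Optional[bool] = None,
) -> Dict[str, Any]:
    """Build a request body, enforcing the constraints from the schema."""
    if not query:
        raise ValueError("query must contain at least one character")
    if not documents:
        raise ValueError("documents must contain at least one item")
    if top_n is not None and top_n < 1:
        raise ValueError("top_n must be at least 1")
    payload: Dict[str, Any] = {"model": model, "query": query, "documents": documents}
    # Optional fields are only sent when the caller sets them.
    if top_n is not None:
        payload["top_n"] = top_n
    if return_documents is not None:
        payload["return_documents"] = return_documents
    return payload


def make_rerank_request(payload: Dict[str, Any], token: str) -> urllib.request.Request:
    """Prepare (but do not send) the POST request for /v1/rerank."""
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/rerank",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # BearerAuth scheme
        },
        method="POST",
    )
```

Passing the prepared request to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be inspected without network access.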
Responses
{
"id": "string",
"results": [
{
"index": 0,
"relevance_score": 10.12,
"document": {
"text": "string",
"id": "string",
"meta": {}
}
}
],
"model": "string",
"usage": {
"prompt_tokens": 0,
"prompt_tokens_details": {
"text_tokens": 0,
"audio_tokens": 0,
"image_tokens": 0,
"cached_read_tokens": 0,
"cached_write_tokens": 0
},
"completion_tokens": 0,
"completion_tokens_details": {
"text_tokens": 0,
"accepted_prediction_tokens": 0,
"audio_tokens": 0,
"citation_tokens": 0,
"num_search_queries": 0,
"reasoning_tokens": 0,
"image_tokens": 0,
"rejected_prediction_tokens": 0
},
"total_tokens": 0,
"cost": {
"input_tokens_cost": 10.12,
"output_tokens_cost": 10.12,
"reasoning_tokens_cost": 10.12,
"citation_tokens_cost": 10.12,
"search_queries_cost": 10.12,
"request_cost": 10.12,
"total_cost": 10.12
}
},
"extra_fields": {
"request_type": "string",
"provider": "openai",
"model_requested": "string",
"model_deployment": "string",
"latency": 157,
"chunk_index": 0,
"raw_request": {},
"raw_response": {},
"cache_debug": {
"cache_hit": true,
"cache_id": "string",
"hit_type": "string",
"requested_provider": "string",
"requested_model": "string",
"provider_used": "string",
"model_used": "string",
"input_tokens": 0,
"threshold": 10.12,
"similarity": 10.12
}
}
}
Schema of the response body
{
"type": "object",
"required": [
"results",
"model"
],
"properties": {
"id": {
"type": "string",
"description": "Unique identifier for the rerank response"
},
"results": {
"type": "array",
"description": "Ranked results ordered by relevance score descending",
"items": {
"$ref": "#/components/schemas/RerankResult"
}
},
"model": {
"type": "string",
"description": "Model used to perform reranking"
},
"usage": {
"$ref": "#/components/schemas/BifrostLLMUsage"
},
"extra_fields": {
"$ref": "#/components/schemas/BifrostResponseExtraFields"
}
}
}
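A client typically needs to map each ranked result back to a document. The Python sketch below pairs every `relevance_score` with its document. It assumes that `index` is the zero-based position of the document in the request's `documents` array, which the `RerankResult` schema (not shown in this section) would need to confirm; the function name is hypothetical.

```python
from typing import Any, Dict, List, Tuple


def ranked_documents(
    response: Dict[str, Any], documents: List[Dict[str, Any]]
) -> List[Tuple[float, Dict[str, Any]]]:
    """Pair each result's relevance_score with its document."""
    ranked = []
    for result in response["results"]:
        # Prefer the echoed document; otherwise look it up by index
        # (ASSUMPTION: index refers to the position in the request's documents).
        doc = result.get("document") or documents[result["index"]]
        ranked.append((result["relevance_score"], doc))
    # The schema describes results as already sorted by score descending;
    # sorting again is a cheap safeguard.
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return ranked
```

The lookup by index matters when `return_documents` is false or omitted, since the response may then carry no document content.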
{
"event_id": "string",
"type": "string",
"is_bifrost_error": true,
"status_code": 0,
"error": {
"type": "string",
"code": "string",
"message": "string",
"param": "string",
"event_id": "string"
},
"extra_fields": {
"provider": "openai",
"model_requested": "string",
"request_type": "string"
}
}
Schema of the response body
{
"type": "object",
"description": "Error response from Bifrost",
"properties": {
"event_id": {
"type": "string"
},
"type": {
"type": "string"
},
"is_bifrost_error": {
"type": "boolean"
},
"status_code": {
"type": "integer"
},
"error": {
"$ref": "#/components/schemas/ErrorField"
},
"extra_fields": {
"$ref": "#/components/schemas/BifrostErrorExtraFields"
}
}
}
Common responses
This section describes common responses that are reused across operations.
BadRequest
Bad request
{
"event_id": "string",
"type": "string",
"is_bifrost_error": true,
"status_code": 0,
"error": {
"type": "string",
"code": "string",
"message": "string",
"param": "string",
"event_id": "string"
},
"extra_fields": {
"provider": "openai",
"model_requested": "string",
"request_type": "string"
}
}
Schema of the response body
{
"type": "object",
"description": "Error response from Bifrost",
"properties": {
"event_id": {
"type": "string"
},
"type": {
"type": "string"
},
"is_bifrost_error": {
"type": "boolean"
},
"status_code": {
"type": "integer"
},
"error": {
"$ref": "#/components/schemas/ErrorField"
},
"extra_fields": {
"$ref": "#/components/schemas/BifrostErrorExtraFields"
}
}
}
NotFound
Resource not found
{
"event_id": "string",
"type": "string",
"is_bifrost_error": true,
"status_code": 0,
"error": {
"type": "string",
"code": "string",
"message": "string",
"param": "string",
"event_id": "string"
},
"extra_fields": {
"provider": "openai",
"model_requested": "string",
"request_type": "string"
}
}
Schema of the response body
{
"type": "object",
"description": "Error response from Bifrost",
"properties": {
"event_id": {
"type": "string"
},
"type": {
"type": "string"
},
"is_bifrost_error": {
"type": "boolean"
},
"status_code": {
"type": "integer"
},
"error": {
"$ref": "#/components/schemas/ErrorField"
},
"extra_fields": {
"$ref": "#/components/schemas/BifrostErrorExtraFields"
}
}
}
InternalError
Internal server error
{
"event_id": "string",
"type": "string",
"is_bifrost_error": true,
"status_code": 0,
"error": {
"type": "string",
"code": "string",
"message": "string",
"param": "string",
"event_id": "string"
},
"extra_fields": {
"provider": "openai",
"model_requested": "string",
"request_type": "string"
}
}
Schema of the response body
{
"type": "object",
"description": "Error response from Bifrost",
"properties": {
"event_id": {
"type": "string"
},
"type": {
"type": "string"
},
"is_bifrost_error": {
"type": "boolean"
},
"status_code": {
"type": "integer"
},
"error": {
"$ref": "#/components/schemas/ErrorField"
},
"extra_fields": {
"$ref": "#/components/schemas/BifrostErrorExtraFields"
}
}
}
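Because all three common responses share the same body, a client can handle them with a single code path. The following Python sketch turns a decoded error body into an exception. The `BifrostError` class and `raise_for_error` helper are hypothetical names, and the check on `is_bifrost_error` or the presence of an `error` object is an assumption about how to recognize an error body, not documented behavior.

```python
from typing import Any, Dict


class BifrostError(Exception):
    """Hypothetical exception type wrapping an error response body."""

    def __init__(self, body: Dict[str, Any]) -> None:
        error = body.get("error") or {}
        self.status_code = body.get("status_code")
        self.code = error.get("code")
        self.param = error.get("param")
        # event_id appears both at the top level and inside the error object.
        self.event_id = body.get("event_id") or error.get("event_id")
        super().__init__(error.get("message") or "unknown error")


def raise_for_error(body: Dict[str, Any]) -> Dict[str, Any]:
    """Return the body unchanged unless it looks like an error response."""
    if body.get("is_bifrost_error") or "error" in body:
        raise BifrostError(body)
    return body
```

Keeping `event_id` on the exception makes it easy to quote when reporting a failed request.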
Common parameters
This section describes common parameters that are reused across operations.
AsyncJobId
| Name | In | Type | Default | Nullable | Description |
|---|---|---|---|---|---|
| job_id | path | string | | No | |
AsyncResultTTL
| Name | In | Type | Default | Nullable | Description |
|---|---|---|---|---|---|
| x-bf-async-job-result-ttl | header | integer | 3600 | No | |
Security schemes
| Name | Type | Scheme | Description |
|---|---|---|---|
| BearerAuth | http | bearer | Bearer token authentication. Use your provider API key or LLMaaS authentication token. Virtual keys (prefixed with `sk-bf-`) can also be passed here. |
| BasicAuth | http | basic | Basic authentication using username and password. |
| VirtualKeyAuth | apiKey | | LLMaaS Virtual Key for governance, routing, and access control. Supported on all inference endpoints (`/v1/*`, `/openai/*`, `/anthropic/*`, `/bedrock/*`, `/cohere/*`, `/genai/*`, `/langchain/*`, `/litellm/*`, `/pydanticai/*`, `/mcp`), not on management APIs (`/api/*`). Example: `sk-bf-*` prefixed keys. |
| ApiKeyAuth | apiKey | | API key authentication via the `x-api-key` header. Virtual keys (prefixed with `sk-bf-`) can also be passed here. |
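The schemes in the table translate into request headers roughly as in the Python sketch below. The helper names are hypothetical. The `Authorization` header formats follow the standard HTTP bearer and basic schemes, and `x-api-key` is the header named in the table. This reference does not name a dedicated header for `VirtualKeyAuth`, so the sketch omits one; per the descriptions above, an `sk-bf-` prefixed key can be passed through the bearer or API key helpers instead.

```python
import base64
from typing import Dict


def bearer_headers(token: str) -> Dict[str, str]:
    """BearerAuth: provider API key, LLMaaS token, or sk-bf- virtual key."""
    return {"Authorization": f"Bearer {token}"}


def basic_headers(username: str, password: str) -> Dict[str, str]:
    """BasicAuth: base64-encoded 'username:password' pair."""
    credentials = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {credentials}"}


def api_key_headers(key: str) -> Dict[str, str]:
    """ApiKeyAuth: key sent in the x-api-key header."""
    return {"x-api-key": key}
```

Only one of these is normally needed per request; which one applies depends on how the key was issued.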
Tags
| Name | Description |
|---|---|
| Models | Model listing and information |
| Chat Completions | Chat-based text generation |
| Rerank | Document reranking by relevance to a query |
| Embeddings | Text embedding generation |