POST /v1/chat/completions
Create chat completion
curl --request POST \
  --url https://api.edgee.ai/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "openai/gpt-4o",
  "messages": [
    {
      "role": "system",
      "content": "<string>",
      "name": "<string>",
      "tool_call_id": "<string>",
      "refusal": "<string>",
      "tool_calls": [
        {
          "id": "<string>",
          "type": "function",
          "function": {
            "name": "<string>",
            "arguments": "<string>"
          }
        }
      ]
    }
  ],
  "max_tokens": 2,
  "stream": false,
  "stream_options": {
    "include_usage": true
  },
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "<string>",
        "description": "<string>",
        "parameters": {}
      }
    }
  ],
  "tool_choice": "none",
  "edgee_tool_ids": [
    "edgee_current_time",
    "edgee_generate_uuid"
  ],
  "edgee_pending_id": "<string>",
  "tags": [
    "<string>"
  ]
}
'
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "openai/gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I assist you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 10,
    "total_tokens": 20,
    "input_tokens_details": {
      "cached_tokens": 0
    },
    "output_tokens_details": {
      "reasoning_tokens": 0
    }
  },
  "compression": {
    "saved_tokens": 450,
    "cost_savings": 27000,
    "reduction": 48,
    "time_ms": 12
  }
}
Creates a model completion for the given chat conversation. The Edgee API is OpenAI-compatible and works with any model and provider. Supports both streaming and non-streaming responses.
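Because the endpoint is OpenAI-compatible, it can be called from any HTTP client. Below is a minimal Python sketch using only the standard library; the API key is a placeholder, and the network call is left commented out so the snippet can be adapted with a real key.

```python
import json
import urllib.request

# Build an OpenAI-compatible chat completion request for the Edgee gateway.
# EDGEE_API_KEY is a placeholder -- substitute your real key.
payload = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 128,
    "stream": False,
}

req = urllib.request.Request(
    "https://api.edgee.ai/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": "Bearer EDGEE_API_KEY",
        "Content-Type": "application/json",
    },
    method="POST",
)
# resp = urllib.request.urlopen(req)   # uncomment with a real key
# body = json.loads(resp.read())
```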

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your API key.

Headers

X-Edgee-Enable-Compression
boolean

Enable token compression for this request. When true, the gateway compresses the prompt at the edge before forwarding to the provider, reducing input token costs by up to 50%. When compression is applied, the response includes a compression object with savings metrics.

X-Edgee-Tags
string

Comma-separated list of tags for categorizing and filtering requests in analytics and logs. Example: production,chatbot,customer-support

X-Edgee-Debug
boolean

Enable debug mode to include additional debugging information in the response.
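The optional headers above can be combined on a single request. A small sketch of a header set follows; the token and tag values are illustrative, and note that the boolean headers are sent as the string "true".

```python
# Per-request headers enabling edge compression and tagging.
# Header names come from this page; the key and tags are placeholders.
headers = {
    "Authorization": "Bearer EDGEE_API_KEY",
    "Content-Type": "application/json",
    "X-Edgee-Enable-Compression": "true",
    "X-Edgee-Tags": "production,chatbot,customer-support",
}
```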

Body

application/json
model
string
required

ID of the model to use. Format: {author_id}/{model_id} (e.g. openai/gpt-4o)

Example:

"openai/gpt-4o"

messages
object[]
required

A list of messages comprising the conversation so far.

Minimum array length: 1
max_tokens
integer

The maximum number of tokens that can be generated in the chat completion.

Required range: x >= 1
stream
boolean
default:false

If set, partial message deltas are sent, as in the OpenAI API. Streamed chunks are delivered as Server-Sent Events (SSE).
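In an SSE stream, each event carries a data: line holding a JSON chunk, and the stream terminates with data: [DONE]. The Python sketch below parses that framing against a hard-coded sample rather than a live response; the delta shape is assumed to follow the standard OpenAI chunk format.

```python
import json

# Sample text mimicking the SSE wire format; real code would read
# these lines incrementally from the HTTP response body.
sample = (
    'data: {"choices":[{"delta":{"content":"Hel"}}]}\n\n'
    'data: {"choices":[{"delta":{"content":"lo!"}}]}\n\n'
    "data: [DONE]\n\n"
)

def collect_content(sse_text: str) -> str:
    """Concatenate the content deltas from an SSE chat completion stream."""
    parts = []
    for line in sse_text.splitlines():
        if not line.startswith("data: "):
            continue  # skip blank separators and comments
        data = line[len("data: "):]
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)

# collect_content(sample) -> "Hello!"
```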

stream_options
object

Options for streaming response.

tools
object[]

A list of tools the model may call. Currently, only function type is supported.

tool_choice

Controls which tool is called by the model.

Available options:
none,
auto
edgee_tool_ids
string[]

List of Edge Tool IDs to inject (e.g. edgee_current_time, edgee_generate_uuid). Each ID must be activated for your API key. When omitted or empty, only tools with hydration enabled for your org or API key are auto-injected. Invalid or non-activated IDs return 400 with invalid_edgee_tool_ids.

Example:
["edgee_current_time", "edgee_generate_uuid"]
edgee_pending_id
string

Pending operation ID when continuing a conversation after Edge Tool execution (e.g. when mixing client-side and Edge Tools). The gateway injects stored Edge Tool results into the message history.
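A follow-up request in that flow might look like the sketch below. The pending ID value and message content are purely illustrative, since the real ID comes from the gateway's earlier response.

```python
# Hypothetical follow-up request continuing a conversation after
# Edge Tool execution. "pending-abc123" is an assumed placeholder;
# the gateway supplies the actual pending operation ID.
followup = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "What time is it?"}],
    "edgee_tool_ids": ["edgee_current_time"],
    "edgee_pending_id": "pending-abc123",
}
```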

tags
string[]

Optional tags to categorize and label the request. Useful for filtering and grouping requests in analytics and logs. Can also be sent via the x-edgee-tags header as a comma-separated string.

Response

Chat completion created successfully

id
string
required

A unique identifier for the chat completion.

Example:

"chatcmpl-123"

object
enum<string>
required

The object type, which is always chat.completion.

Available options:
chat.completion
created
integer
required

The Unix timestamp (in seconds) of when the chat completion was created.

Example:

1677652288

model
string
required

The model used for the chat completion.

Example:

"openai/gpt-4o"

choices
object[]
required

A list of chat completion choices. Can be more than one if n is greater than 1.

usage
object
required

Usage statistics for the completion. In streaming responses, this is only present in the final chunk when stream_options.include_usage is true.

compression
object

Token compression metrics. Present in the response when token compression was applied to the request (via X-Edgee-Enable-Compression: true header or console settings). The usage.prompt_tokens field reflects the compressed token count actually billed by the provider.