Tool calling

Let the model call functions. Works with both the OpenAI SDK and the Anthropic SDK.

POST /v1/chat/completions

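With tool calling (function calling), the model returns structured arguments for a function you declared in advance instead of answering directly. Your code runs the work the model cannot do itself, such as calling an external API, querying a database, or performing a calculation, and passes the result back to the model. PleumRouter accepts tools in the OpenAI format. When a request is routed to an Anthropic (Claude) model, the router converts them to the Anthropic tool format automatically, so you do not need to change the request format.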

Request

Parameter	Type	Required	Description
tools	array	Optional	List of functions the model may call. Uses the OpenAI format `[{"type": "function", "function": {"name", "description", "parameters"}}]`, where `parameters` is a JSON Schema.
tool_choice	string | object	Optional	`"auto"` (the model decides), `"required"` (the model must call a tool), or `"none"` (no tool use); alternatively `{"type": "function", "function": {"name": "..."}}` to force a specific function.
parallel_tool_calls	boolean	Optional	Whether the model may call multiple tools in a single response. Applies only to OpenAI-compatible providers; it is not translated for Anthropic.
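The three parameters in the table can be assembled in code before they are merged into a request body. The following is a minimal sketch, not part of PleumRouter: `force_function` and `tool_options` are hypothetical helper names, and the validation of the string modes is an assumption based only on the values listed above.

```python
from typing import Optional, Union

# The string modes listed for tool_choice in the table above.
TOOL_CHOICE_MODES = {"auto", "required", "none"}


def force_function(name: str) -> dict:
    """Build the object form of tool_choice that forces one named function."""
    return {"type": "function", "function": {"name": name}}


def tool_options(
    tools: list,
    tool_choice: Union[str, dict] = "auto",
    parallel_tool_calls: Optional[bool] = None,
) -> dict:
    """Return the tool-related fields to merge into a chat request body."""
    if isinstance(tool_choice, str) and tool_choice not in TOOL_CHOICE_MODES:
        raise ValueError(f"unknown tool_choice mode: {tool_choice!r}")
    options = {"tools": tools, "tool_choice": tool_choice}
    # Only send the flag when the caller set it explicitly.
    if parallel_tool_calls is not None:
        options["parallel_tool_calls"] = parallel_tool_calls
    return options
```

For example, `tool_options(tools, force_function("get_weather"))` yields fields that force a `get_weather` call, and the result can be merged into the body alongside `model` and `messages`.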

Call POST /v1/chat/completions with tools and, optionally, tool_choice in the chat request body. Authenticate by passing your plm_ API key in the Authorization: Bearer header or the x-api-key header.

request

curl https://router.pleum.ai/v1/chat/completions \
  -H "Authorization: Bearer $PLEUM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "user", "content": "What is the weather in Seoul?"}
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get the current weather for a city.",
          "parameters": {
            "type": "object",
            "properties": {
              "city": {"type": "string", "description": "City name, e.g. Seoul"},
              "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["city"]
          }
        }
      }
    ],
    "tool_choice": "auto"
  }'
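
The same request can be built from Python. This is a minimal sketch using only the standard library; the URL, headers, and body mirror the curl example above, while `build_request` and `send` are hypothetical helper names introduced for illustration.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
ROUTER_URL = "https://router.pleum.ai/v1/chat/completions"

# Same tool declaration as in the curl example.
GET_WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name, e.g. Seoul"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city"],
        },
    },
}


def build_request(api_key: str, user_text: str, model: str = "gpt-4o") -> urllib.request.Request:
    """Build (but do not send) the POST request with tools attached."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "tools": [GET_WEATHER_TOOL],
        "tool_choice": "auto",
    }
    return urllib.request.Request(
        ROUTER_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

Keeping the build step separate from the network call makes the request body easy to inspect or log before anything is sent.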

Response

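When the model decides to call a tool, choices[0].message.content is null and the message.tool_calls array holds the functions to call and their arguments. arguments is a JSON string, so parse it before use. In this case finish_reason is "tool_calls". Every chat response also includes the PleumRouter extensions cost (cost in KRW, exchange rate, and markup) and request_id.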

200 OK (tool_calls)

{
  "id": "chatcmpl-gpt-4o-612ms",
  "object": "chat.completion",
  "model": "gpt-4o",
  "provider": "openai",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"city\": \"Seoul\", \"unit\": \"celsius\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ],
  "usage": {
    "prompt_tokens": 78,
    "completion_tokens": 21,
    "total_tokens": 99
  },
  "cost": {
    "usd": 0.000396,
    "krw": 1,
    "fx_rate": 1390.5,
    "markup_rate": 0.0
  },
  "request_id": "req_01J9X2Qm7..."
}
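
Because arguments arrives as a JSON string, each call has to be decoded before it can be dispatched. The sketch below shows one way to do that for a response shaped like the sample above; `extract_tool_calls` is a hypothetical helper, and `SAMPLE_RESPONSE` is a trimmed copy of the sample used only for illustration.

```python
import json


def extract_tool_calls(response: dict) -> list[tuple[str, str, dict]]:
    """Return (tool_call_id, function_name, parsed_arguments) for each tool call.

    Returns an empty list when the model answered directly.
    """
    message = response["choices"][0]["message"]
    calls = message.get("tool_calls") or []
    parsed = []
    for call in calls:
        function = call["function"]
        # "arguments" is a JSON-encoded string, not an object.
        arguments = json.loads(function["arguments"])
        parsed.append((call["id"], function["name"], arguments))
    return parsed


# Trimmed version of the sample response above.
SAMPLE_RESPONSE = {
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": None,
                "tool_calls": [
                    {
                        "id": "call_abc123",
                        "type": "function",
                        "function": {
                            "name": "get_weather",
                            "arguments": "{\"city\": \"Seoul\", \"unit\": \"celsius\"}",
                        },
                    }
                ],
            },
            "finish_reason": "tool_calls",
        }
    ]
}

for call_id, name, args in extract_tool_calls(SAMPLE_RESPONSE):
    print(call_id, name, args)
```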

Multi-turn loop

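When you receive tool_calls, run the corresponding function in your code, then append the model's assistant message (including tool_calls) and the execution result as a role: "tool" message to the same messages array and call the endpoint again. The tool message carries tool_call_id, name, and content holding the result. Repeat this loop until the model produces a final answer.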

follow-up request body

{
  "model": "gpt-4o",
  "messages": [
    {"role": "user", "content": "What is the weather in Seoul?"},
    {
      "role": "assistant",
      "content": null,
      "tool_calls": [
        {
          "id": "call_abc123",
          "type": "function",
          "function": {
            "name": "get_weather",
            "arguments": "{\"city\": \"Seoul\", \"unit\": \"celsius\"}"
          }
        }
      ]
    },
    {
      "role": "tool",
      "tool_call_id": "call_abc123",
      "name": "get_weather",
      "content": "{\"temp\": 21, \"unit\": \"celsius\", \"sky\": \"clear\"}"
    }
  ],
  "tools": [
    {"type": "function", "function": {"name": "get_weather", "description": "Get the current weather for a city.", "parameters": {"type": "object", "properties": {"city": {"type": "string"}, "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}}, "required": ["city"]}}}
  ]
}
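
The loop described in this section can be sketched as a small driver function. This is an illustration under stated assumptions, not a PleumRouter client: `send` is a hypothetical callable that posts a request body to the endpoint and returns the parsed JSON response, `handlers` maps function names to local Python functions, and the `max_turns` cap is an added safeguard that the API itself does not require.

```python
import json
from typing import Callable


def run_tool_loop(
    send: Callable[[dict], dict],
    messages: list,
    tools: list,
    handlers: dict,
    model: str = "gpt-4o",
    max_turns: int = 5,
) -> str:
    """Call the model repeatedly, executing tool calls, until it answers in text."""
    for _ in range(max_turns):
        response = send({"model": model, "messages": messages, "tools": tools})
        message = response["choices"][0]["message"]
        tool_calls = message.get("tool_calls")
        if not tool_calls:
            # No tool call requested: this is the final answer.
            return message["content"]

        # Keep the assistant message (with tool_calls) in the history.
        messages.append(message)
        for call in tool_calls:
            name = call["function"]["name"]
            arguments = json.loads(call["function"]["arguments"])
            result = handlers[name](**arguments)
            # One role "tool" message per call, linked by tool_call_id.
            messages.append(
                {
                    "role": "tool",
                    "tool_call_id": call["id"],
                    "name": name,
                    "content": json.dumps(result),
                }
            )
    raise RuntimeError("model did not produce a final answer within max_turns")
```

The function mutates `messages` in place, so the caller keeps the full conversation history after the loop ends. Handling every entry in `tool_calls` also covers responses that contain more than one call.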

parallel_tool_calls applies only to OpenAI-compatible providers. For Anthropic (Claude) models the router converts the tool format automatically, so you do not need to change the request, but this flag is not forwarded to Anthropic.