# Request Session


## OpenAPI

````yaml post /v3/voice/realtime
openapi: 3.0.3
info:
  title: DeepL API Documentation
  description: >-
    The DeepL API provides programmatic access to DeepL’s language AI
    technology.


    Note: this OpenAPI spec is embedded into our API documentation and has
    shortened descriptions.
  termsOfService: https://www.deepl.com/pro-license
  contact:
    name: DeepL - Contact us
    url: https://www.deepl.com/contact-us
  version: 3.11.0
servers:
  - url: https://api.deepl.com
    description: DeepL API Pro
  - url: https://api-free.deepl.com
    description: DeepL API Free
security: []
tags:
  - name: beta
    description: >-
      Experimental features that are under testing and not yet intended for
      production use.
  - name: TranslateText
    description: >-
      The text-translation API currently consists of a single endpoint,
      `translate`, which is described below.
  - name: TranslateDocuments
    description: >-
      The document translation API allows you to translate whole documents and
      supports the following file types and extensions:
        * `docx` - Microsoft Word Document
        * `pptx` - Microsoft PowerPoint Document
        * `xlsx` - Microsoft Excel Document
        * `pdf` - Portable Document Format
        * `htm / html` - HTML Document
        * `txt` - Plain Text Document
        * `xlf / xliff` - XLIFF Document, version 2.1
        * `srt` - SRT Document
        * `jpeg` / `jpg` / `png` - Image (currently in beta)
  - name: RephraseText
    description: >-
      The `rephrase` endpoint is used to make corrections and adjustments to
      texts based on style or tone.
  - name: CorrectText
    description: >-
      The `correct` endpoint fixes spelling and grammar errors without broader
      rephrasing. Use it when you want a minimal-change correction pass rather
      than the broader rewriting performed by `rephrase`.
  - name: ManageMultilingualGlossaries
    description: >-
      The *glossary* functions allow you to create, inspect, edit and delete
      glossaries.

      Glossaries created with the glossary function can be used in translate
      requests by specifying the `glossary_id` parameter. A glossary contains
      (several) dictionaries.

      A dictionary is a mapping of source phrases to target phrases for a single
      language pair.

      If you encounter issues, please let us know at support@DeepL.com.


      Currently you can create glossaries with any of the languages DeepL
      supports (with the exception of Thai).


      The maximum size of a glossary is 10 MiB (10485760 bytes). Each
      source/target text, as well as the name of the glossary, is limited to
      1024 UTF-8 bytes.

      A total of 1000 glossaries are allowed per account.


      When creating a dictionary with target language `EN`, `PT`, or `ZH`, it's
      not necessary to specify a variant (e.g. `EN-US`, `EN-GB`, `PT-PT`,
      `PT-BR`, or `ZH-HANS`).

      Dictionaries with target language `EN` can be used in translations with
      either English variant.

      Similarly, `PT` and `ZH` dictionaries can be used in translations with
      their corresponding variants.

      (When you provide the ID of a glossary to a translation, the appropriate
      dictionary is automatically applied. Glossaries cannot yet be used with
      source language detection.)


      Glossaries created via the DeepL API are now unified with glossaries
      created via the DeepL website and DeepL apps.

      Please only use the v3 glossary API in conjunction with multilingual or
      edited glossaries from the website.
  - name: ManageGlossaries
    description: >-
      Please note that this is the spec for the (old) v2 glossary endpoint.

      We recommend users switch to the newer v3 glossary endpoints, which
      support editability and multilinguality.


      The *glossary* functions allow you to create, inspect, and delete
      glossaries.

      Glossaries created with the glossary function can be used in translate
      requests by specifying the `glossary_id` parameter.

      If you encounter issues, please let us know at support@DeepL.com.


      Currently you can create glossaries with any of the languages DeepL
      supports (with the exception of Thai).
  - name: MetaInformation
    description: Information about API usage and value ranges
  - name: TranslationMemories
    description: >-
      The translation memory endpoints allow you to interact with your account's
      translation memories, which are used to store and reuse previously created
      translations. Translation memories can be used in text translation
      requests by specifying the `translation_memory_id` parameter to denote a
      specific translation memory and the `translation_memory_threshold`
      parameter, which defines the minimum matching percentage required for a
      translation memory segment to be applied (recommended: 75% or higher).
  - name: VoiceAPI
    description: >-
      The Voice API provides real-time voice transcription and translation
      services.

      Use a two-step flow: first request a streaming URL via REST, then
      establish a WebSocket connection for streaming audio and receiving
      transcriptions.
  - name: VoiceTranslateJob
    description: >-
      **Alpha.** Async voice translation jobs. This API may change without
      notice.
  - name: AdminApi
    description: >-
      Endpoints for organization administrators to manage API keys and retrieve
      usage analytics.
  - name: QualityEvaluation
    description: >-
      **Closed alpha.** Evaluate translation quality. Submit source/target
      segment pairs and retrieve per-segment quality issues categorized by error
      type and severity, with character spans pointing to where each issue
      occurs.
externalDocs:
  description: DeepL Pro - Plans and pricing
  url: https://www.deepl.com/pro#developer
paths:
  /v3/voice/realtime:
    servers:
      - url: https://api.deepl.com
        description: Override base path for all operations with the /v3/voice path
    post:
      tags:
        - VoiceAPI
      summary: Get Streaming URL
      operationId: getVoiceStreamingUrl
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required:
                - source_media_content_type
              properties:
                message_format:
                  $ref: '#/components/schemas/VoiceMessageFormat'
                source_media_content_type:
                  $ref: '#/components/schemas/VoiceSourceMediaContentType'
                source_language:
                  $ref: '#/components/schemas/VoiceSourceLanguage'
                source_language_mode:
                  $ref: '#/components/schemas/VoiceSourceLanguageMode'
                target_languages:
                  $ref: '#/components/schemas/VoiceTargetLanguages'
                target_media_languages:
                  $ref: '#/components/schemas/VoiceTargetMediaLanguages'
                target_media_content_type:
                  $ref: '#/components/schemas/VoiceTargetMediaContentType'
                target_media_voice:
                  $ref: '#/components/schemas/VoiceTargetMediaVoice'
                spoken_terms_id:
                  $ref: '#/components/schemas/SpokenTermsId'
                glossary_id:
                  $ref: '#/components/schemas/GlossaryId'
                formality:
                  $ref: '#/components/schemas/VoiceFormality'
            examples:
              basic:
                summary: Basic configuration
                value:
                  source_media_content_type: audio/ogg;codecs=opus
                  source_language: en
                  source_language_mode: auto
                  target_languages:
                    - de
                    - fr
                    - es
                  message_format: json
              with_glossary:
                summary: With glossary and formality to customize translation
                value:
                  source_media_content_type: audio/pcm;encoding=s16le;rate=16000
                  source_language: en
                  source_language_mode: fixed
                  target_languages:
                    - de
                    - fr
                  message_format: msgpack
                  glossary_id: def3a26b-3e84-45b3-84ae-0c0aaf3525f7
                  formality: formal
              with_spoken_terms:
                summary: With spoken terms to inform transcription
                value:
                  source_media_content_type: audio/pcm;encoding=s16le;rate=16000
                  source_language: en
                  source_language_mode: fixed
                  target_languages:
                    - de
                  spoken_terms_id: 7c4f1080-cfe2-41d4-8269-0e6ec15a0354
              with_tts:
                summary: With translated audio (default format)
                value:
                  source_media_content_type: audio/ogg;codecs=opus
                  source_language: en
                  target_languages:
                    - de
                    - fr
                    - es
                  target_media_languages:
                    - de
                  target_media_content_type: audio/webm;codecs=opus
                  target_media_voice: female
              with_tts_short_form:
                summary: With translated audio using short-form MIME types
                value:
                  source_media_content_type: audio/webm
                  source_language: en
                  target_languages:
                    - de
                    - es
                  target_media_languages:
                    - de
                  target_media_content_type: audio/ogg
              with_tts_high_quality:
                summary: With translated audio using high-quality PCM
                value:
                  source_media_content_type: audio/pcm;encoding=s16le;rate=16000
                  source_language: en
                  target_languages:
                    - de
                  target_media_languages:
                    - de
                  target_media_content_type: audio/pcm;encoding=s16le;rate=24000
              with_tts_raw_opus:
                summary: With translated audio using raw Opus
                value:
                  source_media_content_type: audio/pcm;encoding=s16le;rate=16000
                  source_language: en
                  target_languages:
                    - de
                  target_media_languages:
                    - de
                  target_media_content_type: audio/opus
      responses:
        '200':
          description: Successfully obtained streaming URL and token.
          headers:
            X-Trace-ID:
              $ref: '#/components/headers/X-Trace-ID'
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/VoiceStreamingResponse'
              example:
                streaming_url: wss://api.deepl.com/v3/voice/realtime/connect
                token: VGhpcyBpcyBhIGZha2UgdG9rZW4K
                session_id: 4f911080-cfe2-41d4-8269-0e6ec15a0354
        '400':
          $ref: '#/components/responses/BadRequest'
        '401':
          $ref: '#/components/responses/Unauthorized'
        '403':
          $ref: '#/components/responses/Forbidden'
        '429':
          $ref: '#/components/responses/TooManyRequests'
        '456':
          $ref: '#/components/responses/QuotaExceeded'
        '500':
          $ref: '#/components/responses/InternalServerError'
        '503':
          $ref: '#/components/responses/ServiceUnavailable'
      security:
        - auth_header: []
components:
  schemas:
    VoiceMessageFormat:
      description: >-
        Message encoding format for WebSocket communication. Determines how
        messages are serialized and transmitted.

        Using `json`, messages are JSON-encoded and sent as TEXT WebSocket
        frames. All binary fields (such as audio data) are base64-encoded
        strings.

        Using `msgpack`, messages are MessagePack-encoded and sent as BINARY
        WebSocket frames. All binary fields (such as audio data) contain raw
        binary data.


        For more details, see [Message
        Encoding](/api-reference/voice#message-encoding).
      type: string
      enum:
        - json
        - msgpack
      default: json
      example: json
    VoiceSourceMediaContentType:
      type: string
      description: >2-
         The audio format for streaming, which specifies container, codec, and encoding parameters. See the table below for supported formats. If `audio/auto` is specified, the server auto-detects the container and codec for all supported combinations except PCM, which requires explicit encoding parameters. All formats must be single-channel audio.
         
         | Content Type                          | Container                                         | Codec                                     |
         | :------------------------------------ | :------------------------------------------------ | :---------------------------------------- |
         | `audio/auto`                          | Auto-detect: FLAC / Matroska / MPEG / Ogg / WebM  | Auto-detect: AAC / FLAC / MP3 / OPUS      |
         | `audio/flac`                          | FLAC (flac)                                       | FLAC                                      |
         | `audio/mpeg`                          | MPEG (mp3/m4a)                                    | MP3                                       |
         | `audio/ogg`                           | Ogg (ogg/oga)                                     | Auto-detect: FLAC / OPUS                  |
         | `audio/webm`                          | WebM (webm)                                       | OPUS                                      |
         | `audio/x-matroska`                    | Matroska (mkv/mka)                                | Auto-detect: AAC / FLAC / MP3 / OPUS      |
         | `audio/ogg;codecs=flac`               | Ogg (ogg/oga)                                     | FLAC                                      |
         | `audio/ogg;codecs=opus`               | Ogg (ogg/oga)                                     | OPUS                                      |
         | `audio/pcm;encoding=alaw;rate=8000`   | -                                                 | PCM A-Law 8000 Hz (G.711)                 |
         | `audio/pcm;encoding=ulaw;rate=8000`   | -                                                 | PCM µ-Law 8000 Hz (G.711)                 |
         | `audio/pcm;encoding=s16le;rate=8000`  | -                                                 | PCM signed 16-bit little-endian 8000 Hz   |
         | `audio/pcm;encoding=s16le;rate=16000` | -                                                 | PCM signed 16-bit little-endian 16000 Hz  |
         | `audio/pcm;encoding=s16le;rate=44100` | -                                                 | PCM signed 16-bit little-endian 44100 Hz  |
         | `audio/pcm;encoding=s16le;rate=48000` | -                                                 | PCM signed 16-bit little-endian 48000 Hz  |
         | `audio/webm;codecs=opus`              | WebM (webm)                                       | OPUS                                      |
         | `audio/x-matroska;codecs=aac`         | Matroska (mkv/mka)                                | AAC                                       |
         | `audio/x-matroska;codecs=flac`        | Matroska (mkv/mka)                                | FLAC                                      |
         | `audio/x-matroska;codecs=mp3`         | Matroska (mkv/mka)                                | MP3                                       |
         | `audio/x-matroska;codecs=opus`        | Matroska (mkv/mka)                                | OPUS                                      |
         

        We recommend the following bitrates as a good tradeoff between quality
        and bandwidth:
         - AAC: 96 kbps
         - FLAC: 256 kbps (16000 Hz)
         - MP3: 128 kbps
         - OPUS: 32 kbps (recommendation for low bandwidth scenarios)
         - PCM: 256 kbps (16000 Hz, default recommendation)
         
      enum:
        - audio/auto
        - audio/flac
        - audio/mpeg
        - audio/ogg
        - audio/webm
        - audio/x-matroska
        - audio/ogg;codecs=flac
        - audio/ogg;codecs=opus
        - audio/pcm;encoding=alaw;rate=8000
        - audio/pcm;encoding=ulaw;rate=8000
        - audio/pcm;encoding=s16le;rate=8000
        - audio/pcm;encoding=s16le;rate=16000
        - audio/pcm;encoding=s16le;rate=44100
        - audio/pcm;encoding=s16le;rate=48000
        - audio/webm;codecs=opus
        - audio/x-matroska;codecs=aac
        - audio/x-matroska;codecs=flac
        - audio/x-matroska;codecs=mp3
        - audio/x-matroska;codecs=opus
      example: audio/ogg;codecs=opus
    VoiceSourceLanguage:
      type: string
      description: >
        The source language of the audio stream. It can be left empty;
        otherwise, it must be one of the supported Voice API source languages
        and comply with IETF BCP 47 language tags.

        Note: Some source transcription languages are provided through external
        service partners. See the [supported languages
        table](/api-reference/voice#show-supported-languages) for details.
      enum:
        - ar
        - bg
        - bn
        - cs
        - da
        - de
        - el
        - en
        - es
        - et
        - fi
        - fr
        - ga
        - he
        - hr
        - hu
        - id
        - it
        - ja
        - ko
        - lt
        - lv
        - mt
        - nb
        - nl
        - pl
        - pt
        - ro
        - ru
        - sk
        - sl
        - sv
        - th
        - tl
        - tr
        - uk
        - vi
        - zh
      default: null
      example: en
    VoiceSourceLanguageMode:
      type: string
      description: >-
        Controls how the source_language value is used.

        - `auto`: Treats source language as a hint; server can override

        - `fixed`: Treats source language as mandatory; server must use this
        language
      enum:
        - auto
        - fixed
      default: auto
      example: fixed
    VoiceTargetLanguages:
      type: array
      description: >
        List of target languages for translation. The stream will emit
        translations for each language. Language identifiers must comply with
        IETF BCP 47. See the [supported languages
        table](/api-reference/voice#show-supported-languages) for details.
      items:
        type: string
        enum:
          - ar
          - bg
          - bn
          - cs
          - da
          - de
          - el
          - en
          - en-GB
          - en-US
          - es
          - et
          - fi
          - fr
          - ga
          - he
          - hr
          - hu
          - id
          - it
          - ja
          - ko
          - lt
          - lv
          - mt
          - nb
          - nl
          - pl
          - pt
          - pt-BR
          - pt-PT
          - ro
          - ru
          - sk
          - sl
          - sv
          - th
          - tl
          - tr
          - uk
          - vi
          - zh
          - zh-HANS
          - zh-HANT
      maxItems: 5
      default: []
      example:
        - de
        - fr
        - es
    VoiceTargetMediaLanguages:
      type: array
      description: >
        (closed beta) List of target languages for which to generate synthesized
        audio. Languages specified here will automatically be added to
        target_languages if not already present, ensuring you receive both text
        translation and audio synthesis for these languages. If omitted, only
        text transcription and translation will be provided (no audio
        synthesis). Language identifiers must comply with IETF BCP 47.

        Note: Some translated audio languages are provided through external
        service partners. See the [supported languages
        table](/api-reference/voice#show-supported-languages) for details.
      items:
        type: string
        enum:
          - ar
          - bg
          - cs
          - da
          - de
          - el
          - en
          - en-GB
          - en-US
          - es
          - fi
          - fr
          - hu
          - id
          - it
          - ja
          - ko
          - nb
          - nl
          - pl
          - pt
          - pt-BR
          - pt-PT
          - ro
          - ru
          - sk
          - sv
          - tr
          - uk
          - vi
          - zh
          - zh-HANS
          - zh-HANT
      maxItems: 1
      default: []
      example:
        - de
    VoiceTargetMediaContentType:
      type: string
      description: |2-
         (closed beta) The audio format for synthesized target media streaming.
         Specifies container, codec, and encoding parameters for the audio returned in target_media_chunk messages.
         If not specified, defaults to audio/webm;codecs=opus.
         Only applies when target_media_languages is specified.
         
         | Content Type | Container | Codec |
         | :--- | :--- | :--- |
         | `audio/flac` | FLAC (flac) | FLAC 24000 Hz |
         | `video/mp2t;codecs=aac` | MPEG Transport Stream (Audio only) | AAC 70 kbit/s |
         | `video/mp2t;codecs=opus` | MPEG Transport Stream (Audio only) | OPUS 32 kbit/s |
         | `audio/ogg` | Ogg (ogg/oga) | OPUS 32 kbit/s |
         | `audio/ogg;codecs=flac` | Ogg (ogg/oga) | FLAC 24000 Hz |
         | `audio/ogg;codecs=opus` | Ogg (ogg/oga) | OPUS 32 kbit/s |
         | `audio/opus` | - | OPUS 32 kbit/s |
         | `audio/pcm;encoding=alaw;rate=8000` | - | PCM A-Law 8000 Hz (G.711) |
         | `audio/pcm;encoding=ulaw;rate=8000` | - | PCM µ-Law 8000 Hz (G.711) |
         | `audio/pcm;encoding=s16le;rate=16000` | - | PCM signed 16-bit little-endian 16000 Hz |
         | `audio/pcm;encoding=s16le;rate=24000` | - | PCM signed 16-bit little-endian 24000 Hz |
         | `audio/webm` | WebM (webm) | OPUS 32 kbit/s |
         | `audio/webm;codecs=opus` | WebM (webm) | OPUS 32 kbit/s |
         | `audio/x-matroska;codecs=aac` | Matroska (mkv/mka) | AAC 70 kbit/s |
         | `audio/x-matroska;codecs=flac` | Matroska (mkv/mka) | FLAC 24000 Hz |
         | `audio/x-matroska;codecs=opus` | Matroska (mkv/mka) | OPUS 32 kbit/s |
         
         We recommend the following formats as good tradeoffs between quality and bandwidth:
         - OPUS (WebM): 32 kbps, recommended for low bandwidth scenarios (default)
         - PCM 24kHz: 384 kbps, high quality
      enum:
        - audio/flac
        - video/mp2t;codecs=aac
        - video/mp2t;codecs=opus
        - audio/ogg
        - audio/ogg;codecs=flac
        - audio/ogg;codecs=opus
        - audio/opus
        - audio/pcm;encoding=alaw;rate=8000
        - audio/pcm;encoding=ulaw;rate=8000
        - audio/pcm;encoding=s16le;rate=16000
        - audio/pcm;encoding=s16le;rate=24000
        - audio/webm
        - audio/webm;codecs=opus
        - audio/x-matroska;codecs=aac
        - audio/x-matroska;codecs=flac
        - audio/x-matroska;codecs=opus
      default: audio/webm;codecs=opus
      example: audio/webm;codecs=opus
    VoiceTargetMediaVoice:
      description: >-
        (closed beta) Target audio voice selection for synthesized speech. The
        default voice is language dependent.
      type: string
      enum:
        - male
        - female
      example: female
    SpokenTermsId:
      type: string
      format: uuid
      description: (beta) The ID of a spoken terms list used to inform transcription.
      example: 7c4f1080-cfe2-41d4-8269-0e6ec15a0354
    GlossaryId:
      type: string
      description: A unique ID assigned to a glossary.
      example: def3a26b-3e84-45b3-84ae-0c0aaf3525f7
    VoiceFormality:
      description: >-
        Sets whether the translated text should lean towards formal or informal
        language.

        Possible options are:
          * `default` - use the default formality for the target language
          * `formal`/`more` - for a more formal language
          * `informal`/`less` - for a more informal language
      type: string
      enum:
        - default
        - formal
        - more
        - informal
        - less
      default: default
      example: formal
    VoiceStreamingResponse:
      type: object
      required:
        - streaming_url
        - token
      properties:
        streaming_url:
          type: string
          description: >
            The WebSocket URL to use for establishing [the stream
            connection](/api-reference/voice/websocket-streaming).
          example: wss://api.deepl.com/v3/voice/realtime/connect
        token:
          type: string
          description: >
            A unique ephemeral token for authentication with the streaming
            endpoint. Pass it as a query parameter when connecting to [the
            streaming URL](/api-reference/voice/websocket-streaming). The token
            is valid for a short time and for one-time use only.
          example: VGhpcyBpcyBhIGZha2UgdG9rZW4K
        session_id:
          type: string
          description: |
            Internal use only. A unique identifier for the requested stream.
          example: 4f911080-cfe2-41d4-8269-0e6ec15a0354
    ErrorResponse:
      type: object
      required:
        - message
      properties:
        message:
          type: string
          description: A human-readable description of the error.
        code:
          type: string
          description: >-
            A machine-readable identifier for the error, when available. Clients
            should match on this value rather than on `message` when branching
            on error types.
          example: invalid_content_type
  headers:
    X-Trace-ID:
      description: >-
        A unique identifier for the request that can be included in bug reports
        to DeepL support.
      schema:
        type: string
      example: 501c3d93cc0c4f11ae2f60a226c2f0f0
  responses:
    BadRequest:
      description: Bad request. Please check the error message and your parameters.
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/ErrorResponse'
    Unauthorized:
      description: >-
        Authorization failed. Please supply a valid `DeepL-Auth-Key` via the
        `Authorization` header.
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/ErrorResponse'
    Forbidden:
      description: >-
        Authorization failed. Please supply a valid `DeepL-Auth-Key` via the
        `Authorization` header.
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/ErrorResponse'
    TooManyRequests:
      description: Too many requests. Please wait and resend your request.
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/ErrorResponse'
    QuotaExceeded:
      description: Quota exceeded. The character limit has been reached.
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/ErrorResponse'
    InternalServerError:
      description: Internal error.
    ServiceUnavailable:
      description: Resource currently unavailable. Try again later.
  securitySchemes:
    auth_header:
      type: apiKey
      description: >
        Authentication with `Authorization` header and `DeepL-Auth-Key`
        authentication scheme. Example: `DeepL-Auth-Key <api-key>`
      name: Authorization
      in: header
      x-default: 'DeepL-Auth-Key '

````
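
The spec above describes a two-step flow: a `POST` to `/v3/voice/realtime` returns a `streaming_url` and a one-time `token`, which are then used to open the WebSocket connection. The following is a minimal Python sketch of the first step, using only the standard library. The function names (`build_session_request`, `build_websocket_url`, `request_session`) are hypothetical helpers, not part of any DeepL client library. The spec says only that the token is passed as a query parameter, so the parameter name `token` used here is an assumption that should be checked against the WebSocket streaming reference.

```python
import json
import urllib.parse
import urllib.request

# Pro server URL and path as listed in the spec above.
SESSION_URL = "https://api.deepl.com/v3/voice/realtime"


def build_session_request(api_key: str, body: dict,
                          url: str = SESSION_URL) -> urllib.request.Request:
    """Build (but do not send) the POST request that asks for a streaming URL."""
    # Client-side checks mirroring constraints stated in the schema.
    if "source_media_content_type" not in body:
        raise ValueError("source_media_content_type is required")
    if len(body.get("target_languages", [])) > 5:
        raise ValueError("target_languages accepts at most 5 entries")
    if len(body.get("target_media_languages", [])) > 1:
        raise ValueError("target_media_languages accepts at most 1 entry")
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Authentication scheme from the `auth_header` security scheme.
            "Authorization": f"DeepL-Auth-Key {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def build_websocket_url(streaming_url: str, token: str,
                        param: str = "token") -> str:
    """Append the ephemeral token to the streaming URL as a query parameter.

    Assumption: the parameter is named "token"; the spec does not name it.
    """
    separator = "&" if urllib.parse.urlsplit(streaming_url).query else "?"
    return f"{streaming_url}{separator}{urllib.parse.urlencode({param: token})}"


def request_session(api_key: str, body: dict) -> dict:
    """Send the request and return the parsed VoiceStreamingResponse."""
    request = build_session_request(api_key, body)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Calling `request_session(api_key, {"source_media_content_type": "audio/ogg;codecs=opus", "target_languages": ["de"]})` would perform the network call; passing the returned `streaming_url` and `token` to `build_websocket_url` yields the address to connect to. Because the token is short-lived and single-use, a new session would need to be requested for each connection attempt.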