
Azure Speech on TTSForge

A focused workspace for Azure-only voices, mstts styles, and Azure-friendly SSML iteration.

Write inner SSML only, insert Azure-friendly tags from the toolbar, validate your markup, and generate downloadable audio without leaving the page.


Azure Speech SSML Editor

Convert Azure-friendly SSML into speech, with support for mstts styles, silence control, pronunciation overrides, and expressive neural voice testing.

Quick guide

1. This workspace is SSML-only; the page adds the Azure <speak> root and selected <voice> wrapper automatically.

SSML editor

Write inner SSML only. The page adds Azure speak and voice wrappers automatically when generating.


Azure SSML Toolbar

Insert snippets quickly or wrap the selected text. The page auto-adds the Azure speak and voice wrappers for you.
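As a rough illustration of what "wrap the selected text" means, here is a minimal sketch. The function name `wrap_selection` and its parameters are hypothetical and do not reflect the page's actual implementation; the selection is modeled as a pair of character offsets.

```python
def wrap_selection(source: str, start: int, end: int,
                   open_tag: str, close_tag: str) -> str:
    """Return source with source[start:end] wrapped in the given tags.

    Hypothetical helper: start and end are character offsets of the
    selection, as an editor would report them.
    """
    if not 0 <= start <= end <= len(source):
        raise ValueError("selection is outside the text")
    return source[:start] + open_tag + source[start:end] + close_tag + source[end:]


text = "This is very important."
start = text.index("very important")
end = start + len("very important")
wrapped = wrap_selection(text, start, end, '<emphasis level="strong">', "</emphasis>")
print(wrapped)
```

With an empty selection (`start == end`), the same function simply inserts an empty tag pair at the cursor position.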

You do not need to type the root <speak> or the selected <voice> tags.

Preset templates

Insert Azure-friendly structures with one click.

Supported tags

A short Azure-oriented cheat sheet so you can test mstts and standard SSML tags faster.

<break> (Timing)
Insert a pause in the spoken output.
Example: <break time="500ms"/>

<mstts:silence> (Timing)
Control silence around sentence or punctuation boundaries with Azure-specific timing.
Example: <mstts:silence type="Sentenceboundary" value="200ms"/>

<say-as> (Pronunciation)
Read numbers, dates, and characters with the right interpretation.
Example: <say-as interpret-as="date" format="mdy">03/18/2026</say-as>

<sub> (Pronunciation)
Replace spoken output with a friendlier alias.
Example: <sub alias="Azure Speech">Azure TTS</sub>

<phoneme> (Pronunciation)
Override pronunciation using IPA or another phoneme alphabet.
Example: <phoneme alphabet="ipa" ph="təˈmeɪtoʊ">tomato</phoneme>

<prosody> (Styling)
Adjust speaking rate, pitch, or volume for a section.
Example: <prosody rate="-10%" pitch="+1st">Hello everyone</prosody>

<emphasis> (Styling)
Emphasize a keyword or phrase.
Example: <emphasis level="strong">very important</emphasis>

<mstts:express-as> (Styling)
Azure-specific tag for changing style, styledegree, and role in a section.
Example: <mstts:express-as style="customerservice" role="YoungAdultFemale">How can I help you today?</mstts:express-as>

<p> (Structure)
Group content into a paragraph.
Example: <p>Paragraph content</p>

<s> (Structure)
Group content into a sentence.
Example: <s>Sentence content</s>

<lang> (Multilingual)
Temporarily switch the language for a section.
Example: <lang xml:lang="en-US">hello world</lang>

<bookmark> (Timing)
Place a bookmark marker for synchronization events in the audio timeline.
Example: <bookmark mark="scene-1"/>

Voice settings

Show only Azure Speech voices and keep generation aligned with the main TTS workflow.


Output

Audio player, request info, and file download.

No audio yet. The player and download button will appear here after a successful generation.


Related Workflows You Can Run Next

Working with documents? Use PDF to Speech or Document to Speech to convert files directly into audio.

For a subtitle-based video workflow, go from Video to SRT and then SRT to Speech for timeline-synced dubbing.

For multi-character scripts, use Multi-Voice TTS to assign different voices in one script.

Review, play, and download your generated audio files. Files are deleted after 90 days, so download anything you want to keep.


Quick workflow

Designed for both first-time users and Azure SSML-heavy editing.

  1. Compose inner SSML

    Write directly, start from a preset, or use the toolbar for Azure-friendly tags.

  2. Choose an Azure voice

    Pick the language and the exact Azure neural voice you want to test.

  3. Adjust pitch

    Use pitch when needed and keep the rest of the playback customization in the output player.

  4. Validate and generate

    The form adds Azure speak and voice wrappers automatically before creating audio.
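The validation in the last step can be approximated locally before pasting markup into the editor. The sketch below is an assumption about what a reasonable check looks like, not the page's actual validator: it wraps the inner SSML in a temporary root, tests well-formedness with Python's standard XML parser, and flags tags outside the cheat sheet on this page. The name `validate_inner_ssml` is hypothetical.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"
MSTTS_NS = "https://www.w3.org/2001/mstts"

# Tags listed in this page's cheat sheet, in ElementTree's {namespace}name form.
ALLOWED_TAGS = {
    f"{{{SSML_NS}}}{name}"
    for name in ("break", "say-as", "sub", "phoneme", "prosody",
                 "emphasis", "p", "s", "lang", "bookmark")
} | {f"{{{MSTTS_NS}}}{name}" for name in ("silence", "express-as")}


def validate_inner_ssml(inner: str) -> list[str]:
    """Return a list of problems; an empty list means both checks passed."""
    # Temporary root so the mstts: prefix and default namespace resolve.
    wrapped = (
        f'<speak xmlns="{SSML_NS}" xmlns:mstts="{MSTTS_NS}">'
        f"{inner}</speak>"
    )
    try:
        root = ET.fromstring(wrapped)
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]

    problems = []
    for element in root.iter():
        if element is not root and element.tag not in ALLOWED_TAGS:
            problems.append(f"unsupported tag: {element.tag}")
    return problems


print(validate_inner_ssml('<prosody rate="-10%">Hello everyone</prosody>'))
print(validate_inner_ssml('<prosody rate="-10%">Hello everyone'))
```

Because the inner content is checked inside a temporary root, a stray `<speak>` or `<voice>` typed by hand is reported as unsupported, which matches the rule that the page adds those wrappers itself. The check does not verify attribute values or voice-specific style support.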

Tip: start with mstts:express-as and change one variable at a time, such as style, styledegree, or prosody.
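One way to follow this tip is to generate a small set of snippets that differ in a single attribute and paste them into the editor one at a time. The helper below is a hypothetical sketch (the name `express_as_variants` is not part of the page or of any Azure SDK); it holds the text and style fixed and varies only styledegree. To my understanding Azure documents styledegree as a value from 0.01 to 2 with a default of 1, but check the current Azure documentation for the voice you are testing.

```python
from xml.sax.saxutils import escape, quoteattr


def express_as_variants(text, style, degrees):
    """Yield one mstts:express-as snippet per styledegree value.

    Only styledegree changes between snippets, so any audible
    difference can be attributed to that one variable.
    """
    for degree in degrees:
        yield (
            f"<mstts:express-as style={quoteattr(style)} "
            f'styledegree="{degree}">{escape(text)}</mstts:express-as>'
        )


for snippet in express_as_variants(
    "How can I help you today?", "customerservice", [0.5, 1, 2]
):
    print(snippet)
```

The same pattern works for other variables: keep styledegree fixed and loop over styles, or wrap the text in `<prosody>` and loop over rate values.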

Why use the Azure TTS page?

A cleaner workflow for Azure Speech users who need more than the home page form.

Azure-only workspace

The page filters the voice list to Azure Speech voices only, so style testing is not mixed with Google or other providers.

Azure SSML wrapper built in

You write inner SSML only. The page automatically adds the Azure speak root, mstts namespace, xml:lang, and the selected voice wrapper.
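For readers who want to reproduce the wrapping outside the page, the sketch below shows roughly what the final document looks like. It is an illustration under assumptions, not the page's actual code: the function name `wrap_inner_ssml` is hypothetical, the namespace URIs follow Azure's published SSML examples, and `en-US-JennyNeural` is used only as an example voice name.

```python
from xml.sax.saxutils import quoteattr

SSML_NS = "http://www.w3.org/2001/10/synthesis"
MSTTS_NS = "https://www.w3.org/2001/mstts"


def wrap_inner_ssml(inner: str, voice: str, lang: str = "en-US") -> str:
    """Wrap inner SSML in a <speak> root and a <voice> element.

    The inner markup is inserted as-is, so it must already be
    well-formed and must not contain its own <speak> or <voice>.
    """
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" '
        f'xmlns:mstts="{MSTTS_NS}" xml:lang={quoteattr(lang)}>'
        f"<voice name={quoteattr(voice)}>{inner}</voice>"
        "</speak>"
    )


document = wrap_inner_ssml(
    '<prosody rate="-10%" pitch="+1st">Hello everyone</prosody>',
    voice="en-US-JennyNeural",  # example voice name, chosen for illustration
)
print(document)
```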

Style and role testing

Quick presets and toolbar snippets make it faster to test mstts:express-as, role, sentence timing, and prosody variations.

Same output workflow

Quota, generation, audio playback, download, and share-link logic stay aligned with the main TTS workflow.

Frequently Asked Questions

Q: How is this Azure page different from the main Text-to-Speech page?

A: The main page is broader and text-first. This Azure page is narrower by design: it focuses on Azure-only voices, Azure SSML, mstts styles, and faster iteration for expressive speech workflows.

Q: Do I need to type the <speak> or <voice> tags myself?

A: No. You only write inner SSML content. The page automatically wraps it with the Azure speak root, mstts namespace, xml:lang, and the selected voice tag before sending the request.

Q: Which Azure-specific tags are emphasized here?

A: The page highlights Azure-friendly tags such as mstts:express-as and mstts:silence, alongside standard SSML tags like break, prosody, say-as, sub, and phoneme.

Q: Will every Azure voice support every style or role?

A: Not always. Azure styles, styledegree, and role support depend on the specific neural voice. This page helps you test quickly, but voice-level capability still depends on Azure Speech support.
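If you have a list of voice metadata on hand, a quick filter can show which voices claim a given style or role before you test them. The sketch below assumes each voice is a dictionary with `ShortName`, `StyleList`, and `RolePlayList` keys, which is how I understand the Azure voices list response to be shaped; treat those field names as an assumption and verify them against the current Azure documentation. The sample data is entirely hypothetical and does not describe real voices.

```python
def voices_supporting(voices, style=None, role=None):
    """Return the ShortName of each voice that lists the style and role.

    A voice with no StyleList or RolePlayList key is treated as
    supporting no styles or roles.
    """
    matches = []
    for voice in voices:
        if style is not None and style not in voice.get("StyleList", []):
            continue
        if role is not None and role not in voice.get("RolePlayList", []):
            continue
        matches.append(voice["ShortName"])
    return matches


# Hypothetical sample data for illustration only.
sample_voices = [
    {"ShortName": "example-voice-a", "StyleList": ["customerservice", "cheerful"]},
    {"ShortName": "example-voice-b", "StyleList": ["cheerful"],
     "RolePlayList": ["YoungAdultFemale"]},
    {"ShortName": "example-voice-c"},
]

print(voices_supporting(sample_voices, style="customerservice"))
```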

Q: Can I use plain text instead of SSML here?

A: This workspace is intentionally SSML-first. If you only need quick plain-text generation, the main Text to Speech page is a better fit.

Q: Can I download and share the result?

A: Yes. After a successful generation, you can play the audio, download it, and copy a share link from the output panel.
