TTS Verbalization Guide and Specification for en-GB

Introduction

This document describes how specific kinds of input strings are read by the TTS Engine for British English (en-GB). The specifications and usage notes apply to the following voice pack:

Voice Pack

Version

Release Date

en-GB Voice Pack

TBD

TBD

These specifications also apply to TTS Engine version 2.8.0.

Numbers

Integers

Integers less than 1000

Format

Example

Input String

Verbalized Form

[1-9][0-9]

31

thirty one

[1-9][0-9]…​[0-9]

159

one hundred and fifty nine

<dash> [1-9]…​[0-9]

-58

minus fifty eight

In en-GB, the word "and" is used after "hundred" (e.g., "one hundred and fifty nine").

Integers greater than 999

Allowed thousands delimiters

  • Comma

  • None

A space used as a delimiter results in two separate number groups.

Format

Example

Input String

Verbalized Form

[1-9][0-9]…​[0-9][0-9]

2824

two thousand eight hundred and twenty four

[1-9] <comma> [0-9] …​[0-9][0-9]

2,824

two thousand eight hundred and twenty four

[1-9][0-9][0-9] <comma> [1-9][0-9][0-9] <comma> [1-9][0-9][0-9]

857,467,987

eight hundred and fifty seven million, four hundred and sixty seven thousand, nine hundred and eighty seven

[1-9][0-9][0-9][1-9][0-9][0-9][1-9][0-9][0-9]

857467987

eight hundred and fifty seven million, four hundred and sixty seven thousand, nine hundred and eighty seven

<dash> [1-9][0-9][0-9]…​[0-9]

-12345

minus twelve thousand three hundred and forty five

<dash> [1-9][0-9][0-9] <comma> …​ <comma> [1-9][0-9][0-9]

-12,345

minus twelve thousand three hundred and forty five

The maximum supported value is 999 billion. Larger numbers are read out digit by digit.
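The integer rules above can be approximated with a short Python sketch. It is not the engine's implementation: the function name `verbalize_integer`, the reading of 0 as "zero", and the digit-by-digit fallback above 999,999,999,999 are assumptions made for illustration. The sketch joins all words with single spaces and does not reproduce the commas shown in the million-range rows of the table.

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]
SCALES = ["", "thousand", "million", "billion"]
MAX_VALUE = 999_999_999_999  # assumed meaning of "999 billion"


def _below_1000(n, force_and):
    """Words for 1..999; en-GB inserts 'and' before the tens/units."""
    words = []
    hundreds, rest = divmod(n, 100)
    if hundreds:
        words += [ONES[hundreds], "hundred"]
    if rest:
        if hundreds or force_and:
            words.append("and")
        if rest < 20:
            words.append(ONES[rest])
        else:
            tens, ones = divmod(rest, 10)
            words.append(TENS[tens])
            if ones:
                words.append(ONES[ones])
    return words


def verbalize_integer(text):
    """Verbalize an integer string with optional leading dash and commas."""
    negative = text.startswith("-")
    digits = text.lstrip("-").replace(",", "")
    value = int(digits)
    if value > MAX_VALUE:
        # Assumed fallback: read every digit on its own.
        words = [ONES[int(d)] for d in digits]
    elif value == 0:
        words = ["zero"]
    else:
        groups = []
        while value:
            value, group = divmod(value, 1000)
            groups.append(group)
        words = []
        for index in range(len(groups) - 1, -1, -1):
            group = groups[index]
            if group == 0:
                continue
            # "eight thousand and eighty": 'and' before a final group < 100.
            force_and = index == 0 and bool(words)
            words += _below_1000(group, force_and)
            if SCALES[index]:
                words.append(SCALES[index])
    return " ".join((["minus"] if negative else []) + words)
```

For example, `verbalize_integer("2,824")` and `verbalize_integer("2824")` both give "two thousand eight hundred and twenty four", matching the table rows for the comma and no-delimiter formats.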

Ordinal numbers

Format

Example

Input String

Verbalized Form

[1-9]…​[0-9]{st,nd,rd,th}

1st

first

Decimals

Allowed decimal separator

  • dot

The individual numbers after the dot delimiter are spelled as single digits.

Format

Example

Input String

Verbalized Form

[1-9]…​[0-9] <dot> [0-9][0-9]…​[0-9]

2.4527

two point four five two seven

<dash> [1-9]…​[0-9] <dot> [0-9][0-9]…​[1-9]

-25.78

minus twenty five point seven eight

Number groups

Format

Example

Input String

Verbalized Form

[0-9] <dash> …​ [0-9] <space> [0-9] <dash> …​ [0-9] <space> …​

4-2-1-4 2-3-4 7-5-3

four two one four, two three four, seven five three
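As a rough illustration of the number-group rule, the following Python sketch reads each dash-separated digit individually and separates space-delimited groups with a comma (the short pause). The function name `verbalize_number_groups` is hypothetical and the code only mirrors the table row above, not the engine itself.

```python
DIGITS = ["zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"]


def verbalize_number_groups(text):
    """Read 'd-d-d d-d-d' style input digit by digit, group by group."""
    groups = []
    for group in text.split():
        digits = group.split("-")
        # Every dash-separated item must be exactly one ASCII digit.
        if not all(len(d) == 1 and d in "0123456789" for d in digits):
            raise ValueError(f"not a dash-separated digit group: {group!r}")
        groups.append(" ".join(DIGITS[int(d)] for d in digits))
    return ", ".join(groups)
```

Calling `verbalize_number_groups("4-2-1-4 2-3-4 7-5-3")` returns "four two one four, two three four, seven five three".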

16-digit card numbers

Card numbers are a kind of number group. To have a 16-digit card number read in groups of four digits with a short pause between the groups, separate the digits within each group with dashes and the groups with spaces, e.g.:

"Just to confirm, you said 5-8-9-7 4-2-6-5 1-2-3-4 7-4-6-8, correct?"

Input text without dashes, such as "5896 4265 1234 7468",

will be read as whole numbers: "five thousand eight hundred and ninety six four thousand two hundred and sixty five,…"

Telephone numbers

Telephone numbers should also be written as number groups. The following example:

"Your telephone number is 3-8-7 8-5-6 9-3-0."

would be read as "three eight seven <short pause> eight five six <short pause> nine three zero".
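When card or telephone numbers arrive as plain digit strings, the prompt text has to be rewritten into the dashed number-group form before it is sent to the engine. A minimal Python sketch of that preparation step follows; the helper name `to_number_groups` and its `sizes` parameter are assumptions for illustration, not part of the TTS Engine.

```python
def to_number_groups(digits, sizes):
    """Rewrite a digit string as dash-separated groups joined by spaces.

    digits: the number, possibly containing spaces or other separators
    sizes:  how many digits each spoken group should contain
    """
    cleaned = "".join(ch for ch in digits if ch in "0123456789")
    if sum(sizes) != len(cleaned):
        raise ValueError("group sizes do not match the number of digits")
    groups, start = [], 0
    for size in sizes:
        groups.append("-".join(cleaned[start:start + size]))
        start += size
    return " ".join(groups)
```

For a card number, `to_number_groups("5897 4265 1234 7468", [4, 4, 4, 4])` yields "5-8-9-7 4-2-6-5 1-2-3-4 7-4-6-8"; for the telephone example, `to_number_groups("387856930", [3, 3, 3])` yields "3-8-7 8-5-6 9-3-0".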

Combination of numbers and words

Format

Example

Input String

Verbalized Form

[0-9] …​ [0-9] <dash> {word}

16-digit

sixteen digit

{word} <dash> [0-9] …​ [0-9]

COVID-19

covid nineteen

Be careful with spaces. When the dash is surrounded by spaces, it is verbalized as the word "dash". For example, "16 - digit" is verbalized as "sixteen dash digit".

Ratio

Format

Example

Input String

Verbalized Form

[0-9] …​ [0-9] <colon> [0-9] …​ [0-9]

16:9

sixteen to nine

[0-9] …​ [0-9] <dash> [0-9] …​ [0-9]

16-9

sixteen to nine

Percentage

Format

Example

Input String

Verbalized Form

[1-9]…​[0-9] <percent sign>

15%

fifteen percent

Time

Format

Example

Input String

Verbalized Form

[00-99] <colon> [00-59] <colon> [00-59]

12:11:01

twelve hours eleven minutes and one second

[00-24] <colon> [00-59]

23:15

twenty three fifteen

[1-12] <space> {PM, AM}

2 PM

two p m

[1-12] <space> {P.M., A.M.}

2 P.M.

two p m

[0-12] <colon> [0-59] <space> {PM, AM}

02:20 PM

two twenty p m

[0-12] <colon> [0-59] <space> {P.M., A.M.}

02:20 P.M.

two twenty p m

00:00

00:00

midnight

Time Intervals

Format

Example

Input String

Verbalized Form

[0-9] …​ [0-9] <dash> [1-9] …​ [0-9] <space> {days, hours, months, minutes}

3-5 days

three to five days

Date

Format

Example

Input String

Verbalized Form

[1-31] <space> {month} <space> [0-9][0-9][0-9][0-9]

14 May 2022

the fourteenth of may twenty twenty two

[1-31] <space> {month}

14 May

the fourteenth of may

{month} <space> [1-31] <comma> <space> [1-9][0-9][0-9][0-9]

June 6, 2020

the sixth of june twenty twenty

{month} <space> [1-31]

June 6

the sixth of june

[00-31] <slash> [00-12] <slash> [1-9][0-9]

31/12/19

the thirty first of december twenty nineteen

[00-12] <slash> [00-31] <slash> [1-9][0-9]

12/31/22

the thirty first of december twenty twenty two

[00-31] <dash> [00-12] <dash> [1-9][0-9][0-9][0-9]

27-05-2021

the twenty seventh of may twenty twenty one

[00-31] <dot> [00-12] <dot> [1-9][0-9][0-9][0-9]

27.05.2021

the twenty seventh of may twenty twenty one

[0-31] <dot> [0-12] <dot> [1-9][0-9][0-9][0-9]

27.5.2021

the twenty seventh of may twenty twenty one

[1-9][0-9][0-9][0-9] <dash> [00-12] <dash> [00-31]

2022-11-02

the second of november twenty twenty two

[00-31] <slash> [00-12]

05/12

the fifth of december

{month}<space> [1-31]{st,nd,rd,th}

October 21st

october the twenty first

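In en-GB, day-first dates are verbalized with "the" before the ordinal day and "of" before the month (e.g., "the fourteenth of may"). Month-first input without an ordinal suffix is read the same way (e.g., "June 6" is read as "the sixth of june"). When the month comes first and the day carries an ordinal suffix, the month is read first and "the" precedes the ordinal (e.g., "october the twenty first"). Slash-separated dates follow the DD/MM/YY(YY) convention in en-GB, although the table also lists the month-first input "12/31/22", where 31 cannot be a month.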
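The numeric date formats in the table can be sketched in Python as a two-step process: resolve the fields into a calendar date, then read day and month. The names `resolve_numeric_date` and `verbalize_day_month` are hypothetical. Two behaviours are assumptions inferred from the table rows rather than documented rules: two-digit years are mapped to 20YY, and a month-first reading is tried only when the day-first reading is not a valid date. Year verbalization is left out.

```python
import re
from datetime import date

_UNITS = ["first", "second", "third", "fourth", "fifth",
          "sixth", "seventh", "eighth", "ninth"]
_TEENS = ["tenth", "eleventh", "twelfth", "thirteenth", "fourteenth",
          "fifteenth", "sixteenth", "seventeenth", "eighteenth", "nineteenth"]
# Ordinals for day 1..31, indexed by day - 1.
ORDINALS = (_UNITS + _TEENS + ["twentieth"]
            + ["twenty " + unit for unit in _UNITS]
            + ["thirtieth", "thirty first"])
MONTHS = ["january", "february", "march", "april", "may", "june", "july",
          "august", "september", "october", "november", "december"]


def resolve_numeric_date(text):
    """Parse DD/MM/YY, DD-MM-YYYY, DD.MM.YYYY or YYYY-MM-DD into a date."""
    fields = re.split(r"[/.\-]", text)
    if len(fields) != 3 or not all(f.isdigit() for f in fields):
        raise ValueError(f"unsupported date format: {text!r}")
    if len(fields[0]) == 4:
        year, month, day = (int(f) for f in fields)
        return date(year, month, day)
    day, month, year = (int(f) for f in fields)
    if len(fields[2]) == 2:
        year += 2000  # assumption: "19" means 2019, "22" means 2022
    try:
        return date(year, month, day)
    except ValueError:
        # Assumed fallback for inputs such as 12/31/22.
        return date(year, day, month)


def verbalize_day_month(d):
    """Day-first en-GB reading, e.g. 'the fourteenth of may'."""
    return f"the {ORDINALS[d.day - 1]} of {MONTHS[d.month - 1]}"
```

For instance, `verbalize_day_month(resolve_numeric_date("27-05-2021"))` gives "the twenty seventh of may", the day and month part of the table's reading for that input.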

Units

Format

Example

Input String

Verbalized Form

{amount} {unit}

2.5GB

two point five gigabytes

{amount} <space> {unit}

3 m

three metres

{amount} {unit} <slash> {unit}

1500MB/s

one thousand five hundred megabytes per second

{amount} <space> {unit} <slash> {unit}

1 km/h

one kilometre per hour

Unit

Abbreviation

Verbalization

mm

millimetre(s)

cm

centimetre(s)

m

metre(s)

km,Km

kilometre(s)

m²,m2

square metre(s)

m³,m3

cubic metre(s)

",in

inch(es)

ft

foot(feet)

yd

yard(s)

mi

mile(s)

mph,MPH

mile(s) per hour

h,hr

hour(s)

min,mins

minute(s)

s

second(s)

g,gm

gram(s)

kg,Kg

kilogram(s)

oz

ounce(s)

lb,lbs

pound(s)

ml,mL

millilitre(s)

cl,cL

centilitre(s)

l,L

litre(s)

gal

gallon(s)

Hz,hz

hertz

kHz,KHz,khz

kilohertz

MHz

megahertz

GHz,Ghz,ghz

gigahertz

mW

milliwatt(s)

w,W

watt(s)

kW,KW

kilowatt(s)

MW

megawatt(s)

GW

gigawatt(s)

kwh,kWh,KWh

kilowatt hour(s)

v,V

volt(s)

mA

milliampere(s)

A

ampere(s)

db,dB

decibel(s)

kb,Kb,kbit

kilobit(s)

mb,Mb,Mbit

megabit(s)

gb,Gb,Gbit

gigabit(s)

tb,Tb,Tbit

terabit(s)

pb,Pb,Pbit

petabit(s)

bps

bit(s) per second

Kbps

kilobit(s) per second

Mbps

megabit(s) per second

Gbps

gigabit(s) per second

Tbps

terabit(s) per second

kB,KB

kilobyte(s)

MB

megabyte(s)

GB

gigabyte(s)

TB

terabyte(s)

PB

petabyte(s)

kBps,KBps

kilobyte(s) per second

MBps

megabyte(s) per second

GBps

gigabyte(s) per second

TBps

terabyte(s) per second

PBps

petabyte(s) per second

°C,°c

degree(s) celsius

°F,°f

degree(s) fahrenheit
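The unit table implies a simple lookup with a singular and a plural form, plus "per" for a slash between two units. The Python sketch below covers only a handful of the listed abbreviations; the function name `verbalize_unit` and the rule "singular only when the amount is exactly 1" are assumptions drawn from the examples ("one kilometre per hour", "three metres", "two point five gigabytes").

```python
# Small excerpt of the unit table: abbreviation -> (singular, plural).
UNITS = {
    "m": ("metre", "metres"),
    "km": ("kilometre", "kilometres"),
    "ft": ("foot", "feet"),
    "h": ("hour", "hours"),
    "s": ("second", "seconds"),
    "Hz": ("hertz", "hertz"),
    "MB": ("megabyte", "megabytes"),
    "GB": ("gigabyte", "gigabytes"),
}


def verbalize_unit(amount, unit):
    """Return the spoken unit phrase for an amount, e.g. (1, 'km/h')."""
    numerator, _, denominator = unit.partition("/")
    singular, plural = UNITS[numerator]
    phrase = singular if amount == 1 else plural
    if denominator:
        # The unit after the slash stays singular: "per second".
        phrase += " per " + UNITS[denominator][0]
    return phrase
```

With this sketch, `verbalize_unit(1500, "MB/s")` returns "megabytes per second" and `verbalize_unit(1, "km/h")` returns "kilometre per hour"; the amount itself would be verbalized separately.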

Words

Abbreviations and Spelling

Format

Example

Input String

Verbalized Form

[A-Z] …​ [A-Z]

ABC

a b c

[A-Z] <dot> [A-Z] <dot> …​ [A-Z] <dot>

B.C.

b c

[a-z] <dot> [a-z] <dot> …​ [a-z] <dot>

e.g.

for example

[a-zA-Z] & [a-zA-Z]

R&D

r-n-d

[a-zA-Z] , like [a-zA-Z]

a, like alpha

A, like alpha

{common abbreviation}

Dr.

doctor

<space> [A-Z] {<comma>, <dot>, <space>}

A or B

a or b

Sequences of capital letters are spelled out letter by letter.
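A minimal Python sketch of the spelling rule, assuming that a run of two or more capital letters counts as an abbreviation. The function name `spell_capitals` is hypothetical, and exceptions visible elsewhere in this guide (for example "COVID-19", which is read as a word) are not handled.

```python
import re

# Two or more consecutive capitals standing as a whole word.
_CAPS = re.compile(r"\b[A-Z]{2,}\b")


def spell_capitals(text):
    """Replace all-caps words with their letters separated by spaces."""
    return _CAPS.sub(lambda m: " ".join(m.group(0).lower()), text)
```

For example, `spell_capitals("Welcome to ABC Bank")` returns "Welcome to a b c Bank".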

Common abbreviation

Abbreviation

Verbalization

Prof.

professor

Mr.

mister

Dr.

Doctor

Mrs.

Misses

e.g.

for example

Ms.

miss

etc.

and so on

Jr.

junior

Inc.

incorporated

Co.

company

Ph.d

P H D

Sr.

senior

hon.

honourable

Pres.

President

Gov.

Governor

Lt.

Lieutenant

Col.

Colonel

Gen.

General

corp.

corporation

Univ.

university

assn.

association

Dept.

department

ave.

avenue

rd.

road

st.

street

blvd.

boulevard

ln.

lane

vs.

versus

ft.

foot

in.

inch

lb.

pound

oz.

ounce

gal.

gallon

pt.

pint

qt.

quart

cu.

cubic

sq.

square

hr.

hour

min.

minute

sec.

second

yr.

year

mo.

month

wk.

week

no.

number

pg.

page

vol.

volume

ed.

edition

rev.

revised

trans.

translated

pub.

published

ltd.

limited

govt.

government

inst.

institute

acad.

academy

soc.

society

bros.

brothers

mfg.

manufacturing

natl.

national

intl.

international

org.

organisation

temp.

temperature

approx.

approximately

prelim.

preliminary

prov.

province

jan.

January

feb.

February

mar.

March

apr.

April

jun.

June

jul.

July

aug.

August

sept.

September

oct.

October

nov.

November

dec.

December

mon.

Monday

tue.

Tuesday

wed.

Wednesday

thu.

Thursday

fri.

Friday

sat.

Saturday

sun.

Sunday

attn

attention

attn.

attention

attn:

attention

P0

P 0

P.0.

P 0
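The abbreviation table behaves like a dictionary lookup on tokens that end in a dot. The Python sketch below uses a small excerpt of the table; the function name `expand_abbreviations` is hypothetical, and the case-insensitive matching is an assumption, since the table mixes capitalized and lowercase entries without stating a rule. Entries with an internal dot, such as "e.g.", are not covered.

```python
import re

# Excerpt of the common-abbreviation table, keyed in lowercase.
ABBREVIATIONS = {
    "prof.": "professor",
    "dr.": "doctor",
    "mr.": "mister",
    "vs.": "versus",
    "dept.": "department",
    "approx.": "approximately",
}

# A run of letters immediately followed by a dot.
_TOKEN = re.compile(r"\b[A-Za-z]+\.")


def expand_abbreviations(text):
    """Expand known abbreviations and leave every other token untouched."""
    def replace(match):
        token = match.group(0)
        return ABBREVIATIONS.get(token.lower(), token)
    return _TOKEN.sub(replace, text)
```

For example, `expand_abbreviations("Dr. Smith vs. Prof. Jones")` returns "doctor Smith versus professor Jones".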

Dashed words

Format

Example

Input String

Verbalized Form

[a-zA-Z] … [a-z] <dash> [a-zA-Z] … [a-z]

E-mail

e mail

[a-zA-Z] … [a-z] <space> <dash> <space> [a-zA-Z] … [a-z]

Paris - Texas

paris dash texas

URLs and E-mails

Format

Example

Input String

Verbalized Form

(www) <dot> {word} <dot> {domain}

www.google.com

w w w dot google dot com

(www) <dot> {word} <dot> {domain} <forward slash> {path}

www.google.com/maps

w w w dot google dot com forward slash maps

{protocol} <colon> <forward slash> <forward slash> {word} <dot> {domain}

https://google.com

h t t p s colon forward slash forward slash google dot com

{protocol} <colon> <forward slash> <forward slash> {word} <dot> {domain} <forward slash> {word} <hyphen> {word}

https://www.example.com/my-page

h t t p s colon forward slash forward slash w w w dot example dot com forward slash my hyphen page

{protocol} <colon> <forward slash> <forward slash> {word} <dot> {domain} <colon> {port} <forward slash> {path}

http://example.com:8080/path/to/resource

h t t p colon forward slash forward slash example dot com colon eight thousand and eighty forward slash path forward slash to forward slash resource

{protocol} <colon> <forward slash> <forward slash> {ip} <forward slash> {path}

http://123.123.123.123/path/to/resource

h t t p colon forward slash forward slash one hundred and twenty three dot one hundred and twenty three dot one hundred and twenty three dot one hundred and twenty three forward slash path forward slash to forward slash resource

{protocol} <colon> <forward slash> <forward slash> {word} <dot> {domain} <forward slash> {path} <percent> {encoded}

http://example.com/path%20with%20spaces

h t t p colon forward slash forward slash example dot com forward slash path percent two zero with percent two zero spaces

{protocol} <colon> <forward slash> <forward slash> {word} <dot> {domain} <forward slash> {word} <equal sign> <question mark> {query}

https://www.ocp.ai/input=?$ad

h t t p s colon forward slash forward slash w w w dot ocp dot ai forward slash input equal sign question mark dollar sign ad

ftp <colon> <forward slash> <forward slash> ftp <dot> {word} <dot> {domain} <forward slash> {path}

ftp://ftp.example.com/resource

f t p colon forward slash forward slash f t p dot example dot com forward slash resource

{word} <at sign> {word} <dot> {domain}

info@google.com

i n f o at google dot com

Note: In all English locales, the following characters are verbalized in URLs and email addresses:

Character

Verbalization

.

dot

@

at

/

forward slash

-

hyphen

:

colon

_

underscore

=

equal sign

?

question mark

%

percent

$

dollar sign
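The character table above, combined with the URL rows, suggests a token-by-token reading. The following Python sketch is a simplified approximation: the function name `verbalize_url` is hypothetical, the set of spelled-out tokens (`www`, `http`, `https`, `ftp`) is taken from the examples, digit runs are passed through unchanged because their reading varies between rows, and email addresses (whose local part is spelled out in the table) are not handled.

```python
import re

SYMBOLS = {
    ".": "dot", "@": "at", "/": "forward slash", "-": "hyphen",
    ":": "colon", "_": "underscore", "=": "equal sign",
    "?": "question mark", "%": "percent", "$": "dollar sign",
}
# Tokens that the URL examples spell letter by letter.
SPELLED = {"www", "http", "https", "ftp"}

# Letter runs, digit runs, or any single other character.
_TOKEN = re.compile(r"[A-Za-z]+|\d+|.")


def verbalize_url(url):
    """Approximate the spoken form of a URL that contains no digits."""
    words = []
    for token in _TOKEN.findall(url):
        if token in SYMBOLS:
            words.append(SYMBOLS[token])
        elif token.lower() in SPELLED:
            words.append(" ".join(token.lower()))
        else:
            words.append(token.lower())
    return " ".join(words)
```

For example, `verbalize_url("https://www.example.com/my-page")` returns "h t t p s colon forward slash forward slash w w w dot example dot com forward slash my hyphen page", the same reading as the corresponding table row.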

Currency

Format

Example

Input String

Verbalized Form

{currency symbol} {amount}

$25.10

twenty five dollars and ten cents


£25,000

twenty five thousand pounds

{amount} {currency symbol}

25.10€

twenty five euros and ten cents

{amount} <space> {currency code}

25 USD

twenty five us dollars

{currency code} <space> {amount}

GBP 25

twenty five pounds sterling

{currency symbol} {amount} <space> {quantity}

$1.3 million

one point three million dollars

{amount} <space> {quantity} <space> {currency code}

1 million USD

one million us dollars

Currency code

Code

Verbalized form

Subdivision

USD

u s dollar(s)

cent(s)

EUR

euro(s)

cent(s)

GBP

pound(s) sterling

penny, pence

Currency symbol

Symbol

Verbalized form

$

dollar(s)

€

euro(s)

£

pound(s)

Amount examples

Amount

Verbalized form

$235,125,250.12

two hundred and thirty five million, one hundred and twenty five thousand, two hundred and fifty dollars and twelve cents

$3.458

three point four five eight dollars

$5,000.00

five thousand dollars

$0.01

one cent

$-0.01

minus one cent

£1,250.99

one thousand two hundred and fifty pounds and ninety nine pence

£0.01

one penny
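The amount examples follow a pattern that can be sketched in Python: with exactly two decimals the amount is split into main unit and subdivision, a zero part is dropped, and any other number of decimals is read as a plain decimal number. The function name `plan_currency_reading` and its output shape (a list of number and unit-name pairs, with the numbers still to be verbalized) are assumptions for illustration. Treating a single decimal digit like the three-decimal case is also an assumption.

```python
import re

# symbol -> (singular, plural, subdivision singular, subdivision plural)
CURRENCIES = {
    "$": ("dollar", "dollars", "cent", "cents"),
    "€": ("euro", "euros", "cent", "cents"),
    "£": ("pound", "pounds", "penny", "pence"),
}
_AMOUNT = re.compile(r"^([$€£])(-?)(\d[\d,]*)(?:\.(\d+))?$")


def plan_currency_reading(text):
    """Split '<symbol><amount>' into (number, unit name) parts to be read."""
    match = _AMOUNT.match(text)
    if match is None:
        raise ValueError(f"unsupported amount: {text!r}")
    symbol, sign, whole, fraction = match.groups()
    one, many, sub_one, sub_many = CURRENCIES[symbol]
    whole = whole.replace(",", "")
    if fraction is not None and len(fraction) != 2:
        # e.g. $3.458 is read as a decimal number of dollars.
        return [(f"{sign}{whole}.{fraction}", many)]
    units = int(whole)
    subunits = int(fraction or "0")
    parts = []
    if units or not subunits:
        parts.append((f"{sign}{units}", one if units == 1 else many))
    if subunits:
        prefix = "" if parts else sign  # the sign goes on the first part only
        parts.append((f"{prefix}{subunits}",
                      sub_one if subunits == 1 else sub_many))
    return parts
```

For example, `plan_currency_reading("£1,250.99")` returns `[("1250", "pounds"), ("99", "pence")]`, and `plan_currency_reading("$0.01")` returns `[("1", "cent")]`, in line with "one thousand two hundred and fifty pounds and ninety nine pence" and "one cent" in the table.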

Quantity

million

billion

trillion

Punctuation and Characters Processing

Supported characters

Char

Description

Used as/in

.

period

sentence ending, decimal separator

,

comma

sentence break, thousands separator

?

question mark

sentence ending

!

exclamation mark

sentence ending

'

apostrophe

contraction, possessive noun

%

percent sign

percentage value

@

at sign

email address

-

dash

number group, word group, minus sign

+

plus sign

math equation

$

dollar sign

part of currency

€

euro sign

part of currency

£

pound sign

part of currency

&

ampersand

and (except for the R&D like cases)

Any other characters or punctuation marks are removed and do not affect the speech output.

Punctuation

If the input text has no terminal punctuation, a period is added automatically. The engine does not inspect the sentence content or type to choose the punctuation mark, so a question without a question mark also receives a period.

Thank you for using our service → Thank you for using our service.

What can I do for you → What can I do for you.
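The behaviour shown in these two examples amounts to the following Python sketch. The function name `ensure_terminal_punctuation` is hypothetical; the set of sentence-ending characters is taken from the supported characters listed in this guide.

```python
def ensure_terminal_punctuation(text):
    """Append a period when the text does not end a sentence itself."""
    stripped = text.rstrip()
    # No content analysis: a missing mark always becomes a period.
    if stripped and stripped[-1] not in ".?!":
        return stripped + "."
    return stripped
```

Prompt authors who want question intonation therefore need to write the question mark themselves, since `ensure_terminal_punctuation("What can I do for you")` returns "What can I do for you."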

Char

Description

Effect

.

period

falling intonation

?

question mark

rising intonation

<space>

space

pause

-

dash

short pause

,

comma

falling intonation

!

exclamation mark

increased intensity

A question mark does not always produce rising intonation. The engine follows the en-GB question intonation rules, using rising, falling or rise-fall intonation depending on whether the question is a yes/no, wh- or multiple-choice question.

Examples

This section shows examples of common prompts and how they are verbalized.

Input:

"Welcome to voice-enabled service line of ABC Bank! What can I do for you today?"

Verbalized output:

"welcome to voice enabled service line of a b c bank ! what can I do for you today ?"


Input:

"Ok, first, please say or enter your 16-digit card number."

Verbalized output:

"ok , first , please say or enter your sixteen digit card number ."


Input:

"Just to confirm, you said: 5-8-9-7 4-2-6-5 1-2-3-4 7-4-6-8."

Verbalized output:

"just to confirm , you said five eight nine seven , four two six five , one two three four , seven four six eight ."


Input:

"The balance on your visa is $2670.52 and it is due the 21st of March. Anything else?"

Verbalized output:

"the balance on your visa is two thousand six hundred and seventy dollars and fifty two cents and it is due the twenty first of march . anything else ?"


Input:

"Your email is cadomaitis@omilia.com. Is that correct?"

Verbalized output:

"your email is c a d o m a i t i s at o m i l i a dot com . is that correct ?"


Input:

"To confirm, your appointment will be set for Tuesday, the 25th of October, 09:00 to 10:30 AM. Shall I proceed?"

Verbalized output:

"to confirm , your appointment will be set for tuesday , the twenty fifth of october , nine to ten thirty a m . shall I proceed ?"


Input:

"The balance on your mastercard is -1.35 USD. The debt has to be paid until the 2nd of November 2022."

Verbalized output:

"the balance on your mastercard is minus one u s dollar and thirty five cents . the debt has to be paid until the second of november twenty twenty two ."


Input:

"Please visit www.omilia.com/support for more information."

Verbalized output:

"please visit w w w dot omilia dot com forward slash support for more information ."