
Google Text-to-Speech API misses some words when speaking Chinese

I ran into a problem with the Google Cloud Text-to-Speech API when running the following code:

from google.cloud import texttospeech

# Create the client once at module level and reuse it for every request.
client = texttospeech.TextToSpeechClient()

def synthesize_text(text, speaking_rate=1.0):
    """Synthesize SSML input to output.mp3 with a cmn-TW WaveNet voice."""
    input_text = texttospeech.SynthesisInput(ssml=text)

    # Chinese (Mandarin, Taiwan)
    voice = texttospeech.VoiceSelectionParams(
        language_code="cmn-TW",
        name="cmn-TW-Wavenet-B",
        ssml_gender=texttospeech.SsmlVoiceGender.MALE,
    )

    # speaking_rate below 1.0 slows the speech down; above 1.0 speeds it up.
    audio_config = texttospeech.AudioConfig(
        audio_encoding=texttospeech.AudioEncoding.MP3,
        speaking_rate=speaking_rate,
    )

    response = client.synthesize_speech(
        request={"input": input_text, "voice": voice, "audio_config": audio_config}
    )

    with open("output.mp3", "wb") as out:
        out.write(response.audio_content)
        print('Audio content written to file "output.mp3"')

text = '<speak>毛利率為45.87%,創歷史新⾼,主要來⾃於產品⽑利上升,服務營收上升,應為⻑期趨勢。</speak>'

synthesize_text(text, speaking_rate=0.91)

The problem is that the synthesized Chinese audio skips some characters, including 高, 自, 毛, and 長. What causes this, and how can I fix it?
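One check that may help narrow this down is to print the Unicode code point of each character in the input and see whether the skipped characters are the ordinary CJK ideographs or look-alike code points (for example, from the Kangxi Radicals block, which sometimes appear in text copied from a PDF). The sketch below is only a diagnostic under that assumption; `describe_chars` and `normalize_text` are hypothetical helper names, and whether this actually explains the missing audio is not confirmed. NFKC normalization only rewrites characters that have a compatibility mapping, so any that lack one would need to be replaced by hand.

```python
import unicodedata

def describe_chars(text):
    """Return (char, code point, Unicode name) for each non-ASCII character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
        if ord(ch) > 127
    ]

def normalize_text(text):
    """Apply NFKC, which maps compatibility characters to their standard forms."""
    return unicodedata.normalize("NFKC", text)

# Hypothetical sample: U+65B0 followed by U+2FBC (a Kangxi radical look-alike).
sample = "\u65b0\u2fbc"
for ch, code_point, name in describe_chars(sample):
    print(ch, code_point, name)

# Characters that differ after normalization are worth a closer look.
changed = [ch for ch in sample if normalize_text(ch) != ch]
print("Changed by NFKC:", changed)
```

If this reports characters outside the usual CJK Unified Ideographs range, passing the normalized string to `synthesize_text` would be one thing to try.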

You can listen to the output audio here:

tag: google-text-to-speech