Welcome!
This is the community forum for my apps Pythonista and Editorial.
For individual support questions, you can also send an email. If you have a very short question or just want to say hello — I'm @olemoritz on Twitter.
How to use a Siri voice for speech synthesis from Pythonista.
-
@Vent easier and minimal
```python
from objc_util import *

txt = 'hello'

AVSpeechUtterance = ObjCClass('AVSpeechUtterance')
AVSpeechSynthesizer = ObjCClass('AVSpeechSynthesizer')
AVSpeechSynthesisVoice = ObjCClass('AVSpeechSynthesisVoice')

voices = AVSpeechSynthesisVoice.speechVoices()
for i in range(len(voices)):
    # print(i, voices[i].language(), voices[i].identifier())
    if 'ja-JP' in str(voices[i].language()) and 'siri' in str(voices[i].identifier()):
        vi = i
        break

utterance = AVSpeechUtterance.speechUtteranceWithString_(txt)
utterance.rate = 0.5
utterance.useCompactVoice = False
utterance.voice = voices[vi]

synthesizer = AVSpeechSynthesizer.new()
synthesizer.speakUtterance_(utterance)
```
-
I successfully got Siri to speak any string of text I wanted, but ran into the following problem: when I run the reading multiple times, the voices overlap. It seems that the next reading starts without waiting for the previous one to finish. I learned that I need to use an `AVSpeechSynthesizerDelegate` to wait for it, but that is too difficult for me to use. Can anyone help?

```python
from objc_util import *

txt = ['こんにちは', '私はSiriです。']  # 'Hello', 'I am Siri.'

AVSpeechUtterance = ObjCClass('AVSpeechUtterance')
AVSpeechSynthesizer = ObjCClass('AVSpeechSynthesizer')
AVSpeechSynthesisVoice = ObjCClass('AVSpeechSynthesisVoice')

voices = AVSpeechSynthesisVoice.speechVoices()
for i in range(len(voices)):
    # print(i, voices[i].language(), voices[i].identifier())
    if 'ja-JP' in str(voices[i].identifier()):
        # if you have the Japanese Siri voice, replace 'ja-JP' with 'siri_O-ren_ja-JP'
        vi = i
        break

for t in txt:
    utterance = AVSpeechUtterance.speechUtteranceWithString_(t)
    utterance.rate = 0.5
    utterance.useCompactVoice = False
    utterance.voice = voices[vi]
    synthesizer = AVSpeechSynthesizer.new()
    synthesizer.speakUtterance_(utterance)
```
-
@Vent why do you use an array for txt instead of joining your two texts with a space?
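For reference, joining the list with a space produces a single string, and hence a single utterance with nothing to overlap:

```python
txt = ['こんにちは', '私はSiriです。']
joined = ' '.join(txt)  # one string → one utterance, no overlap
print(joined)
```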
-
A utility function...

```python
from typing import List

from objc_util import ObjCClass

AVSpeechSynthesisVoice = ObjCClass('AVSpeechSynthesisVoice')

def get_voices_by_substring(substring: str = 'ja-JP') -> List:
    """Return a list of all voices whose identifier contains the substring."""
    voices = AVSpeechSynthesisVoice.speechVoices()
    return [voice for voice in voices if substring in str(voice.identifier())]

# [ ... ]
utterance.voice = get_voices_by_substring()[0]  # use the first voice in the list
```
-
@cvp It will eventually be used to read out an e-book; once one episode has been read, the next episode must be read. It is not a good idea to load everything and run a 10,000-word readout in one go.
-
@ccc Thank you for making that. But I just need to select the one voice that contains `siri_O-ren_ja-JP` ...
-
@Vent you're right, but you could have a button that you tap when you have listened to one entire episode, and its action could read and speak the next episode.
But I agree it would be better to be notified when speaking is finished, thanks to the delegate.
It is possible, but I don't have enough time at the moment. I had eye surgery and I'm not allowed to use a screen for long each day. Hoping you can be patient or somebody else can help you.
-
Your answer helped me a lot. Thanks to you, I was able to get Siri to speak. I will post this question once again elsewhere.
-
The delegate looks fairly simple. The help for objc_util shows a good, simple example of creating a mail delegate; this would be very similar.
You will be creating an `AVSpeechSynthesizerDelegate`. The method you want to implement is (note the trailing underscore, so objc_util maps it to the two-argument selector `speechSynthesizer:didFinishSpeechUtterance:`):

```python
def speechSynthesizer_didFinishSpeechUtterance_(_obj, _sel, synthesizer, utterance):
    # do something... like setting a semaphore, releasing a threading.Lock,
    # or calling the next utterance in a list, etc.
    print('Utterance complete')

methods = [speechSynthesizer_didFinishSpeechUtterance_]
protocols = ['AVSpeechSynthesizerDelegate']
MyAVSpeechSynthesizerDelegate = create_objc_class(
    'MyAVSpeechSynthesizerDelegate', NSObject,
    methods=methods, protocols=protocols)
delegate = MyAVSpeechSynthesizerDelegate.alloc().init()

# create your synth, etc., then call:
# synthesizer.delegate = delegate
```
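The delegate callback only reports that an utterance finished; to actually block until it fires, you can pair it with a `threading.Event`, as the "semaphore" comment above suggests. A minimal, platform-independent sketch of that pattern (here `fake_finish` is a hypothetical stand-in for the real delegate callback, which would call `done.set()`):

```python
import threading

done = threading.Event()

def fake_finish():
    # In the real delegate, speechSynthesizer_didFinishSpeechUtterance_
    # would call done.set(); this stand-in simulates that.
    done.set()

def speak_and_wait():
    done.clear()
    # start "speaking"; a thread simulates the synthesizer finishing
    threading.Thread(target=fake_finish).start()
    # block until the delegate reports the utterance is complete
    return done.wait(timeout=5)

print(speak_and_wait())
```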
-
Also, as a simpler method, you can just monitor `synthesizer.speaking()` to see if it is currently talking.
Are you creating a new synthesizer for each utterance? That will cause things to overlap. Instead, if you create the synthesizer object up front, calling speakUtterance should just queue up utterances, to be spoken one at a time.
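A sketch of the polling idea, using a hypothetical `FakeSynth` stand-in so the loop structure is clear without an iOS device; with objc_util you would test `synthesizer.speaking()` instead of the `speaking` attribute:

```python
import threading
import time

class FakeSynth:
    """Hypothetical stand-in for AVSpeechSynthesizer."""
    def __init__(self):
        self.speaking = False

    def speak(self, text):
        self.speaking = True
        # simulate the utterance finishing shortly afterwards
        threading.Timer(0.1, self._finish).start()

    def _finish(self):
        self.speaking = False

synth = FakeSynth()
for t in ['こんにちは', '私はSiriです。']:
    synth.speak(t)
    while synth.speaking:   # poll until the current utterance ends
        time.sleep(0.05)
print('all utterances done')
```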
-
Just moving the synthesizer creation outside your loop should do what you want without needing the delegate. The delegate would allow you to pause, figure out what is being spoken at a given time, etc, but isn't needed to prevent overlaps.
```python
from objc_util import *

txt = ['こんにちは', '私はSiriです。']

AVSpeechUtterance = ObjCClass('AVSpeechUtterance')
AVSpeechSynthesizer = ObjCClass('AVSpeechSynthesizer')
AVSpeechSynthesisVoice = ObjCClass('AVSpeechSynthesisVoice')

voices = AVSpeechSynthesisVoice.speechVoices()
for i in range(len(voices)):
    # print(i, voices[i].language(), voices[i].identifier())
    if 'ja-JP' in str(voices[i].identifier()):
        # if you have the Japanese Siri voice, replace 'ja-JP' with 'siri_O-ren_ja-JP'
        vi = i
        break

# create the synthesizer ONCE, outside the loop, so utterances queue up
synthesizer = AVSpeechSynthesizer.new()
for t in txt:
    utterance = AVSpeechUtterance.speechUtteranceWithString_(t)
    utterance.rate = 0.5
    utterance.useCompactVoice = False
    utterance.voice = voices[vi]
    synthesizer.speakUtterance_(utterance)
```