Welcome!
This is the community forum for my apps Pythonista and Editorial.
For individual support questions, you can also send an email. If you have a very short question or just want to say hello — I'm @olemoritz on Twitter.
Implementing live voice commands?
-
Thanks @cvp! I should have mentioned I did look at that thread, but it doesn't really solve my problem. The issue with the ping-ponging recordings (or in my case staggered sampling windows) is that you need to run speech recognition on each one concurrently in order to keep up. The runtime error I'm getting seems to indicate that the underlying SFSpeechRecognizer only supports one active instance, and that instead I need to register a callback to handle partial speech results (enabled via shouldReportPartialResults).
Is there anyone with objc_util experience who's played around with SFSpeechRecognizer, or would be willing to help me get started? (@omz @JonB @dgelessus @zrzka @Brun0oO @mikael @scj643 @shaun-h @filippocld)
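To make the streaming approach concrete, here is a pure-Python sketch of the pattern described above: instead of running one recognizer per staggered recording window, a single request accumulates audio and one callback receives growing partial results. All class and method names here are invented for illustration; they only mimic the shape of `SFSpeechAudioBufferRecognitionRequest` plus a result handler.

```python
# Sketch of the "one request, streaming partial results" pattern.
# Hypothetical names -- not a real API.

class FakeStreamingRequest:
    def __init__(self, result_handler, report_partial=True):
        self.result_handler = result_handler
        self.report_partial = report_partial  # like shouldReportPartialResults
        self.words = []

    def append_chunk(self, word):
        # Stands in for appendAudioPCMBuffer_: audio keeps flowing
        # into the same request object as it is captured.
        self.words.append(word)
        if self.report_partial:
            self.result_handler(' '.join(self.words), is_final=False)

    def end_audio(self):
        # Final result once the audio stream ends.
        self.result_handler(' '.join(self.words), is_final=True)

partials = []
req = FakeStreamingRequest(lambda text, is_final: partials.append((text, is_final)))
for chunk in ['turn', 'lights', 'on']:
    req.append_chunk(chunk)
req.end_audio()
print(partials[-1])  # ('turn lights on', True)
```

The point is that the callback fires repeatedly with ever-longer hypotheses, so there is never more than one active recognizer.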
-
It would be possible to set up the audio session and feed the speech recognizer the way it is meant to be used. Your link has the code, it just needs to be converted to objc_util. Sadly, I'm using an older iOS version so I wouldn't be able to try it.
-
Ok, thanks @JonB! The code I linked to is in Swift, right? Even though I selected Obj-C in the dropdown. Trying to produce the Obj-C equivalent of the example would be a lot for me... any chance you could take an untested stab at it, and then I could iterate from there? Something to start with would be super helpful. I know that's a lot to ask, and no worries if you don't have the time or inclination.
-
@daltonb I could try to translate it to Objective-C in Pythonista, but I'm not sure of the result... nor of the delay 😢
-
@cvp that would be awesome.. even if it doesn’t work out I’d love to see a partial result
-
https://github.com/yao23/iOS_Playground/blob/master/SpeechRecognitionPractice/SpeechRecognitionPractice/ViewController.m
is an Obj-C implementation. The tricky bit, obviously, is getting those blocks implemented in objc_util.
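For anyone new to the block-wrapping part: an `ObjCBlock` in objc_util is mechanically very close to a plain ctypes callback, which you can try anywhere (no Pythonista required). As a self-contained analogy, here libc's `qsort` calls back into Python, just like `installTapOnBus_bufferSize_format_block_` calls back into a tap handler:

```python
import ctypes

# On Linux/macOS, loading "None" exposes the C library's symbols.
libc = ctypes.CDLL(None)

# Declare the callback signature up front, exactly as ObjCBlock
# needs restype/argtypes.
CMPFUNC = ctypes.CFUNCTYPE(ctypes.c_int,
                           ctypes.POINTER(ctypes.c_int),
                           ctypes.POINTER(ctypes.c_int))

def py_cmp(a, b):
    # Ascending order: negative if a < b, positive if a > b.
    return a[0] - b[0]

cmp_cb = CMPFUNC(py_cmp)  # keep a reference alive, like retain_global

libc.qsort.restype = None
libc.qsort.argtypes = [ctypes.c_void_p, ctypes.c_size_t,
                       ctypes.c_size_t, CMPFUNC]

arr = (ctypes.c_int * 5)(5, 1, 4, 2, 3)
libc.qsort(arr, 5, ctypes.sizeof(ctypes.c_int), cmp_cb)
print(list(arr))  # [1, 2, 3, 4, 5]
```

The same two rules carry over to objc_util: declare the argument types before wrapping, and keep a Python reference to the wrapper so it isn't garbage-collected while native code still holds it.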
-
@daltonb, I am tempted to give it a try, but not this week.
-
@cvp, the man is fast! :-D
-
@mikael Heh, that is not my code 😂, I just found it there. I'm just beginning to modify it...
-
First part (enough for today)
```python
from objc_util import *

AVAudioEngine = ObjCClass('AVAudioEngine').alloc().init()
AVAudioSession = ObjCClass('AVAudioSession')
AVAudioRecorder = ObjCClass('AVAudioRecorder')
shared_session = AVAudioSession.sharedInstance()
category_set = shared_session.setCategory_mode_options_error_(
    ns('AVAudioSessionCategoryRecord'),
    ns('AVAudioSessionModeMeasurement'),
    ns('AVAudioSession.CategoryOptionsDuckOthers'),
    None)
setActiveOptions = 0  # notifyOthersOnDeactivation
shared_session.setActive_withOptions_error_(True, setActiveOptions, None)
inputNode = AVAudioEngine.inputNode()
```
-
2nd part and really enough for today
```python
from objc_util import *

AVAudioEngine = ObjCClass('AVAudioEngine').alloc().init()
AVAudioSession = ObjCClass('AVAudioSession')
AVAudioRecorder = ObjCClass('AVAudioRecorder')
shared_session = AVAudioSession.sharedInstance()
category_set = shared_session.setCategory_mode_options_error_(
    ns('AVAudioSessionCategoryRecord'),
    ns('AVAudioSessionModeMeasurement'),
    ns('AVAudioSession.CategoryOptionsDuckOthers'),
    None)
setActiveOptions = 0  # notifyOthersOnDeactivation
shared_session.setActive_withOptions_error_(True, setActiveOptions, None)
inputNode = AVAudioEngine.inputNode()

# Configure the microphone input.
recordingFormat = inputNode.outputFormatForBus_(0)

def handler(_cmd, obj1_ptr, obj2_ptr):
    # param1 = AVAudioPCMBuffer: a buffer of audio captured
    # from the output of an AVAudioNode.
    # param2 = AVAudioTime: the time the buffer was captured.
    if obj1_ptr:
        obj1 = ObjCInstance(obj1_ptr)
        #self.recognitionRequest?.append(buffer)

handler_block = ObjCBlock(handler, restype=None,
                          argtypes=[c_void_p, c_void_p, c_void_p])
inputNode.installTapOnBus_bufferSize_format_block_(0, 1024, recordingFormat,
                                                   handler_block)
AVAudioEngine.prepare()
err_ptr = c_void_p()
AVAudioEngine.startAndReturnError_(byref(err_ptr))
if err_ptr:
    err = ObjCInstance(err_ptr)
    print(err)

# Create and configure the speech recognition request.
recognitionRequest = ObjCClass('SFSpeechAudioBufferRecognitionRequest').alloc()
print(dir(recognitionRequest))
recognitionRequest.setShouldReportPartialResults_(True)
```
And I get:

```
Fatal Python error: Bus error

Thread 0x000000016fb67000 (most recent call first):
```

No error if I comment out the line

```python
AVAudioEngine.startAndReturnError_(byref(err_ptr))
```
-
You had some errors in one of your constants: the audio session options value should have been 0x2 for the duckOthers option -- this is a bitmask, not a string.
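To see why the string constant fails, a small `IntFlag` mirror of the category options makes the bitmask nature explicit. The duckOthers value (0x2) is the one named above; the other values follow Apple's `AVAudioSessionCategoryOptions` headers as I understand them, so treat them as an assumption:

```python
from enum import IntFlag

# Assumed values, mirroring AVAudioSessionCategoryOptions (NS_OPTIONS).
class CategoryOptions(IntFlag):
    MixWithOthers = 0x1
    DuckOthers = 0x2      # the value the code above needed
    AllowBluetooth = 0x4
    DefaultToSpeaker = 0x8

opts = CategoryOptions.DuckOthers
print(int(opts))  # 2 -- an integer mask, not a string like 'CategoryOptionsDuckOthers'

# Options combine with bitwise OR, as NS_OPTIONS masks do:
both = CategoryOptions.DuckOthers | CategoryOptions.MixWithOthers
print(int(both))  # 3
```

So `setCategory_withOptions_error_` wants that integer (`0x2`), not an `ns(...)` string.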
Here is a minor mod -- I verified the handler gets called, but I don't have a speech recognizer to test against:

https://gist.github.com/ad17f52c8944993092f537d963ce1963
-
@JonB Thanks, I'll try to continue today...
-
@JonB Really need help now:
- segmentation fault if no underscore before appendAudioPCMBuffer_(obj1)
- segmentation fault in last line not commented
```python
from objc_util import *

AVAudioEngine = ObjCClass('AVAudioEngine').alloc().init()
AVAudioSession = ObjCClass('AVAudioSession')
AVAudioRecorder = ObjCClass('AVAudioRecorder')
shared_session = AVAudioSession.sharedInstance()
category_set = shared_session.setCategory_withOptions_error_(
    ns('AVAudioSessionCategoryRecord'),
    0x2,  # duckOthers
    None)
shared_session.setMode_error_(ns('AVAudioSessionModeMeasurement'), None)
setActiveOptions = 0  # notifyOthersOnDeactivation
shared_session.setActive_withOptions_error_(True, setActiveOptions, None)
inputNode = AVAudioEngine.inputNode()

# Configure the microphone input.
recordingFormat = inputNode.outputFormatForBus_(0)

# Create and configure the speech recognition request.
recognitionRequest = ObjCClass('SFSpeechAudioBufferRecognitionRequest').alloc()
print(dir(recognitionRequest))
recognitionRequest.setShouldReportPartialResults_(True)
retain_global(recognitionRequest)

@on_main_thread
def handler_buffer(_cmd, obj1_ptr, obj2_ptr):
    print('handler_buffer')
    # param1 = AVAudioPCMBuffer: a buffer of audio captured
    # from the output of an AVAudioNode.
    # param2 = AVAudioTime: the time the buffer was captured.
    if obj1_ptr:
        obj1 = ObjCInstance(obj1_ptr)
        #print(str(obj1._get_objc_classname()))  # AVAudioPCMBuffer
        #print(str(obj1.frameLength()))  # 4410
        # segmentation fault in the next line if no "_" before appendAudioPCMBuffer
        recognitionRequest._appendAudioPCMBuffer_(obj1)

handler_block_buffer = ObjCBlock(handler_buffer, restype=None,
                                 argtypes=[c_void_p, c_void_p, c_void_p])
inputNode.installTapOnBus_bufferSize_format_block_(0, 1024, recordingFormat,
                                                   handler_block_buffer)
AVAudioEngine.prepare()
err_ptr = c_void_p()
AVAudioEngine.startAndReturnError_(byref(err_ptr))
if err_ptr:
    err = ObjCInstance(err_ptr)
    print(err)

@on_main_thread
def handler_recognize(_cmd, obj1_ptr, obj2_ptr):
    print('handler_recognize')
    # param1 = result: the partial or final transcriptions of the audio content.
    # param2 = error: an error object if a problem occurred,
    # nil if speech recognition was successful.
    if obj1_ptr:
        obj1 = ObjCInstance(obj1_ptr)
        #print(str(obj1))

handler_block_recognize = ObjCBlock(handler_recognize, restype=None,
                                    argtypes=[c_void_p, c_void_p, c_void_p])
SFSpeechRecognizer = ObjCClass('SFSpeechRecognizer').alloc().init()
recognitionTask = SFSpeechRecognizer.recognitionTaskWithRequest_resultHandler_(
    recognitionRequest, handler_block_recognize)
```
-
```python
recognitionRequest = ObjCClass('SFSpeechAudioBufferRecognitionRequest').alloc()
```
Missing .init()?
By the way, you will want AVAudioEngine.stop() handy.
For instance, you might want to create a ui.View with a will_close handler, so that when you are experimenting you can just close the view to kill the engine. Anyway, you will eventually need to show the recognized words.
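One portable way to keep `stop()` "handy" while experimenting is a context manager that guarantees the engine is stopped even when the script raises. Shown here with a dummy engine so it runs anywhere; in Pythonista, the real `AVAudioEngine` instance (plus `removeTapOnBus_` and ending the recognition request) would take its place:

```python
from contextlib import contextmanager

class DummyEngine:
    # Stand-in for the objc AVAudioEngine instance -- illustration only.
    def __init__(self):
        self.running = False
    def startAndReturnError_(self, err):
        self.running = True
    def stop(self):
        self.running = False

@contextmanager
def running_engine(engine):
    engine.startAndReturnError_(None)
    try:
        yield engine
    finally:
        engine.stop()  # always runs, like a will_close on a ui.View

engine = DummyEngine()
with running_engine(engine) as e:
    assert e.running  # engine is live inside the block
print(engine.running)  # False -- stopped on exit, even after an exception
```

This keeps the teardown in one place, which matters in Pythonista since a crashed script otherwise leaves the audio engine holding the microphone.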