• daltonb

    MONEY. This works great for accessing the sound data!!

    Sadly, upon further testing, setting the frame length to 1024 makes the speech recognition results very poor. Not sure why.. any ideas? Do you think the speech recognizer is expecting the original frame length somehow? For instance I say "Hello" and it outputs "LOL" sometimes, so maybe the input is getting clipped.

    My frame length on my phone is actually 4410 by default which is ok, but I guess this is a platform specific number.
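
    For reference, a rough sketch of what those frame lengths mean in time. The 44.1 kHz sample rate is an assumption (4410 frames would then be exactly 0.1 s); I haven't confirmed it against the input node's actual format, and the helper names are just for illustration.

```python
# Assumption: 44.1 kHz input sample rate; check the input node's format on your device.
SAMPLE_RATE_HZ = 44100.0


def buffer_duration(frame_length, sample_rate=SAMPLE_RATE_HZ):
    """Seconds of audio held by a buffer of `frame_length` frames."""
    return frame_length / sample_rate


def callbacks_per_second(frame_length, sample_rate=SAMPLE_RATE_HZ):
    """How often a buffer of that size would be delivered, if delivery is steady."""
    return sample_rate / frame_length


for n in (4410, 1024):
    ms = round(buffer_duration(n) * 1000, 1)
    rate = round(callbacks_per_second(n), 1)
    print(n, 'frames:', ms, 'ms per buffer,', rate, 'buffers per second')
```

    So under that assumption a 1024-frame buffer covers roughly a quarter of the audio that a 4410-frame buffer does, which may or may not be related to the clipped-sounding results.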

    posted in Pythonista read more
  • daltonb

    Ok, I'll try to figure out the right casting call in the meantime. Yeah, that is annoying; however, from that Stack Overflow post I've verified that calling buffer.setFrameLength(1024) does make the tap deliver buffers much more often after the first long (0.375s) sample. I haven't checked yet whether I can set it before the first sample, but it shouldn't matter too much for my purposes.

  • daltonb

    No worries man, appreciate the help. Any chance you could just post that casting snippet for now? That sounds tricky for me.

    As to the second point.. any reason not to just add the processing code to the tap block that updates the RecognitionRequest? (instead of adding a mixer)

  • daltonb

    I'm trying to implement volume level metering using AVAudioEngine as described here: https://stackoverflow.com/questions/30641439/level-metering-with-avaudioengine. However, I'm not sure how to access the standalone function vDSP_meamgv() (as opposed to a class/instance method).

    I assume I would first need to call load_framework('Accelerate'), but after that I'm trying to figure out the equivalent of vDSP_meamgv = ObjCFunction('vDSP_meamgv').
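
    While I sort out the C-function call, here is a plain-Python stand-in for the metering math. vDSP_meamgv computes the mean of the magnitudes of a float vector; the dB conversion below is the usual 20*log10 form. The function names and the floor_db parameter are my own placeholders, not part of Accelerate.

```python
import math


def mean_magnitude(samples):
    """Mean of absolute sample values (the quantity vDSP_meamgv computes)."""
    if not samples:
        return 0.0
    return sum(abs(s) for s in samples) / len(samples)


def level_db(samples, floor_db=-100.0):
    """Mean magnitude in decibels relative to full scale (1.0).

    floor_db is a hypothetical lower bound that avoids log10(0) on silence.
    """
    m = mean_magnitude(samples)
    if m <= 0.0:
        return floor_db
    return max(floor_db, 20.0 * math.log10(m))


# Example: a full-scale square wave, a quieter one, and silence
print(level_db([1.0, -1.0, 1.0, -1.0]))
print(level_db([0.1, -0.1, 0.1, -0.1]))
print(level_db([0.0, 0.0]))
```

    This is much slower than Accelerate on large buffers, but it is enough to check that the numbers coming out of a tap look sane.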

  • daltonb

    Ok thanks!

  • daltonb

    @JonB just to double-check, do I need to roll my own audio filtering/metering algorithm if I want to use SFSpeechRecognizer? (e.g. https://stackoverflow.com/questions/30641439/level-metering-with-avaudioengine). From my quick research it seems the metering algorithm is only provided with AVAudioRecorder, which I couldn't figure out how to make compatible with SFSpeechRecognizer since (according to my rudimentary understanding) they both try to install a tap on the mic input node. As a quick check I tried using a sound.Recorder() along with the SFSpeechRecognizer implementation and got this crash:
    com.apple.coreaudio.avfaudio: required condition is false: IsFormatSampleRateAndChannelCountValid(format)
    (which is informing my interpretation that both are trying to tap the mic input.. not that I necessarily understand what that means). Thanks for any commentary!

  • daltonb

    Thanks @JonB, I do like how it turned out! (and any suggestions on the effect are welcome). Nice chance to brush up on my math a bit. I grabbed the offset and the squared exponent in the "envelope function" from this excellent Quora exchange: https://www.quora.com/What-function-is-Arctic-Monkeys-album-cover

    1. Yessir, that's my next goal... for me it was easier to get this MVP working before messing with Objective-C code. It will be nicer that way for sure though. For starters, I've had a hard time getting files to close cleanly with threaded code. Just a heads up: the issue affects your ping pong example from a while back as well. At least for me, after running it, whichever .m4a file was last active keeps growing after exit.

    2. Oof.. I didn't know the objc interop was inherently unreliable.. that's a bit sad to hear. Do you know any details about why or specifically what scenarios? I love working in Python, but this may end up being my gateway drug to actual iOS development, ha. Thanks for the example code!

  • daltonb

    @mikael Magic, I think that worked! It seemed to be running more smoothly, but then I got those errors again. On further trial and error, though, I think they were due to another script I ran which had similar issues. Maybe the other script left behind some thready detritus on close?

  • daltonb

    @JonB that was super helpful.. would have taken me days to write that script even with all the hints. I made a little mic indicator for voice recognition- here’s progress so far! (Also, for me right now it runs perfectly half the time, but I’m curious if you get random segfaults and/or index out of bounds errors on some runs.. haven’t figured out a cause or a pattern yet)

    import ui
    import numpy as np
    import sound
    
    import time
    from objc_util import *
    CAShapeLayer=ObjCClass('CAShapeLayer')
    
    W=225
    f=25
    tau=0.035
    scroll=0.4
    voice_thresh = 25
    voice_scale = 10
    voice_max = 200
    N=1024
    t=np.linspace(-0.5,0.5,N)
    pingpong = 10.0
    
    class micView(ui.View):
    
        def __init__(self, *args, **kwargs):
            ui.View.__init__(self, *args, **kwargs)
            self.bg_color='white'
            L=CAShapeLayer.alloc().init()
            L.strokeColor=UIColor.grayColor().CGColor()
            L.fillColor=UIColor.clearColor().CGColor()
            L.lineWidth=2
            self.objc_instance.layer().addSublayer_(L)
            L.setNeedsDisplay()
            self.L=L
            self.A=0
            self.r_N = 2
            self.r=[sound.Recorder('r'+str(i)) for i in range(self.r_N)]
            [r.meters for r in self.r]
            self.r_i = 0
            self.r_active = self.r[self.r_i]
            self.r_active.record()
            self.t=time.perf_counter()
            self.r_t = self.t
        
        def next_recorder(self):
            self.r_active.stop()
            self.r_i = (self.r_i+1)%self.r_N
            self.r_active = self.r[self.r_i]
            self.r_active.record()
            self.r_t = self.t
    
        def update(self):
            if self.r_active:
                self.A=min(voice_scale*max(0, (voice_thresh+max(self.r_active.meters['average']))), voice_max)
                p=ui.Path()
                if self.A:
                    y=self.A*(.01+np.sin(2*np.pi*(t**2)))*np.cos(2*np.pi*f*(t-scroll*self.t))*np.exp(-(t**2)/tau) + self.height/2
                    y_offset = self.height/2 - 100
                    x_offset = (self.width-W)/2
                    p.move_to(x_offset,y[0]+y_offset)
                    for ti,yi in zip(x_offset+(t-t[0])/(t[-1]-t[0])*W,y+y_offset):
                        p.line_to(ti,yi)
                self.L.path=p.objc_instance.CGPath()
                self.L.setNeedsDisplay()
                self.t=time.perf_counter()
                if (self.t-self.r_t) > pingpong:
                    self.next_recorder()
        
        def will_close(self):
            # stop every recorder, not just the active one
            for r in self.r:
                r.stop()
            # necessary to free these or one recorder never stops.. not sure why
            self.r_active = None
            self.r = None
    
    v=micView()
    v.update_interval=1/60
    v.present()
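
    To make the amplitude line in update() easier to tune, here is the same mapping pulled out on its own. It assumes the meter reading is in decibels with 0 as the loudest value (so quiet input is strongly negative); the amplitude() helper is just for illustration and isn't in the script above.

```python
voice_thresh = 25   # dB added to the reading; anything below -25 dB maps to 0
voice_scale = 10    # drawing units per dB above the threshold
voice_max = 200     # cap so loud input doesn't overshoot the view


def amplitude(level_db):
    """Map a meter reading in dB (assumed <= 0) to a wave amplitude."""
    return min(voice_scale * max(0, voice_thresh + level_db), voice_max)


for level in (-40, -20, -10, 0):
    print(level, 'dB ->', amplitude(level))
```

    With these settings the wave stays flat until about -25 dB and saturates at -5 dB, so voice_thresh sets the noise gate and voice_scale sets how quickly the wave grows.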
    
    

  • daltonb

    Awesome, thanks so much! I probably won’t get a chance to play around with these till tomorrow but will update when I do

