Real time audio buffer synth/Real time image smudge tool
-
Yeah, I’ll try that, but again, it’s the speed of render_to_texture() that will tell whether it’s fast enough for real time. A function like wavy needs a texture of the image at frame n to display the image at frame n+1, but then I need to render that image to a texture so that the shader can process it and display the image at frame n+2, and so on.
-
Actually, now that I think about it, the problem is that scene and shaders compute their display only at 60 fps, and I think that’s not enough, because for fast strokes you need to compute more often than that (otherwise you will have holes or irregularities between the smudge spots).
In my code I use a while(true) loop to compute the smudging (outside of the ui class) and its rate is only limited by the (very short) time numpy takes to add arrays.
By the way, I know it’s not good to use while(True) loops that way, but I don’t know the good practice for achieving the equivalent at the same speed. Because of that loop, for example, right now when I close the ui window the code doesn’t stop, and I need to stop it manually with the cross in the editor. What should I do about that?
-
@JonB :
So, back to the topic of real-time audio: I modified your code to have a sawtooth instead of a sine, and then implemented a simple lowpass filter. There is an unwanted vibrato sound happening in the background for high frequencies, which is probably an aliasing artifact due to the program’s inability to keep a perfect rate? I am not sure. If I set the sampleRate to 44100, the vibrato seems less pronounced (which kind of supports my aliasing assumption? Again, not sure) but is still noticeable. Interestingly, I tried sampleRate = 88200 and the unwanted vibrato was gone. The thing is, when one changes the sampleRate, the filter actually behaves differently: a higher sampleRate with the same filter algorithm tends to raise its cutoff. So, for the comparison to be “fair”, with an 88200 sampleRate I replaced the 0.9 in the render method below with 0.95, and unfortunately the unwanted vibrato was back :(
I also thought maybe it was a problem with the data precision and error accumulation so I tried scaling up the data in the render method and renormalizing it in the end for the buffer but that didn’t fix the issue.
To hear the unwanted vibrato with a 11000 sampleRate, all you need to do is add an attribute
self.z=[0,0]
in the AudioRenderer class and then change the render method this way (to have a filtered sawtooth):
```python
def render(self, buffer, numFrames, sampleTime):
    '''override this with a method that fills buffer with numFrames'''
    #print(self.sounds, self.theta, v.touches)
    # The scale factor was to try to win some precision with the data.
    # scale=1 means it doesn't scale.
    scale = 1
    z = self.z
    for frame in range(numFrames):
        b = 0
        for t in self.sounds:
            f, a = self.sounds[t]
            theta = self.theta[t]
            #dTheta = 2*math.pi*f/self.sampleRate
            dTheta = (f*scale)/self.sampleRate
            #b += math.sin(theta) * a
            b += ((theta % scale)*2 - scale)*a
            theta += dTheta
            #self.theta[t] = theta % (2*math.pi)
            self.theta[t] = theta % scale
        z[0] = 0.9*z[0] + 0.1*b
        z[1] = 0.9*z[1] + 0.1*z[0]
        buffer[frame] = self.z[1]/scale
    self.z = z
    return 0
```
-
@Mederic Re: rendering numpy arrays, iosurface/calayer is amazingly fast:
Here is an iosurface wrapper that exposes a numpy array (w x h x 4 channels) and a ui.View:
https://gist.github.com/87d9292b238c8f7169f1f2dcffd170c8
See the notes regarding using the .Lock context manager, which is required.
Just manipulate the array inside a with s.Lock(): block, and it works just like you would hope. On my crappy ipad3, I get > 100 fps when updating a 50x50 region, which is probably plenty fast.
edit: i see you are using float arrays. conversion from float to uint8 is kinda slow, so that is a problem.
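One way to soften the float-to-uint8 cost (my sketch, not from the gist) is to keep the working image as floats for precision and convert only the dirty region each frame, rather than the whole image:

```python
import numpy as np

# Hypothetical 512x512 grayscale image kept as floats for smudge precision.
img = np.full((512, 512), 254.6, dtype=np.float64)

# Slow path: converting the entire image every frame.
full = np.clip(img, 0, 255).astype(np.uint8)

# Cheaper path: convert only the small region touched this frame.
rows, cols = slice(100, 150), slice(100, 150)
patch = np.clip(img[rows, cols], 0, 255).astype(np.uint8)
```

The conversion cost then scales with the stroke size instead of the canvas size.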
-
@Mederic regarding while True:
doing while v.on_screen:
or at least checking on_screen is a good way to kill a loop once the view is closed.
-
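That on_screen pattern can be sketched like this; FakeView is a stand-in for a real ui.View (whose on_screen flips to False when the presented view is closed), so the sketch runs outside Pythonista too:

```python
import threading
import time

class FakeView:
    # Stand-in for a ui.View: in Pythonista, on_screen becomes False
    # once the user closes the presented view.
    def __init__(self):
        self.on_screen = True

def compute_loop(v, results):
    # The loop's only exit condition is the view leaving the screen,
    # so closing the window stops the computation automatically.
    while v.on_screen:
        results.append(len(results))  # one step of work per iteration
        time.sleep(0.001)

v = FakeView()
results = []
t = threading.Thread(target=compute_loop, args=(v, results))
t.start()
time.sleep(0.05)
v.on_screen = False   # simulates closing the view
t.join(timeout=1)
```

With a real view you would simply pass the presented ui.View instead of FakeView.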
Ok thank you.
I ran your code and it is very fast but I have a question (and as I am still not familiar with the libraries you use, it might take a while to figure out the answer on my own):
The printed fps is around 1000 on my iPad Pro.
Now, I computed the fps of my PythoniSmudge code and I realize it’s important to have two fps data here:
- The computation fps of my while(True) loop was around 300
- The fps of my Views (computed by incrementing an N every time a draw function is over) was 40
That is important because the first fps ensures the smudge tool is computed internally often enough to avoid irregularities and holes in the path on the final image (nothing to do with lag), which is the case with a computation fps of 300; the second fps ensures that my eye doesn’t see lag on the screen, which is the case as soon as the view fps is above 30.
My question is, what does your fps=1000 compute exactly? It seems to only be the computation fps but maybe I am wrong and it somehow includes the view fps as a part of it, but I would really need to isolate the view fps because that is really what causes the sensation of lag.
If really 1000 IS the view fps, then it’s more than enough.
-
I believe it is the actual view FPS but you might want to increase N to get better timing. The redraw method should effectively block while data is copied over.
What you would do is have a single view, from the iosurface. You could try s.array[:,:,0]=imageArray, but that may be slow since it must copy the entire image.
Better would be to determine the affected bounding box on each touch_moved, then only copy those pixels:

```python
with s.Lock():
    s.array[rows, cols, 0] = imageArray[rows, cols]
```

(where rows and cols are indexes to the affected pixels)
To keep monochrome, you would want your imageArray to be sized (r, c, 1) to allow broadcasting to work:

```python
with s.Lock():
    s.array[rows, cols, 0:3] = imageArray[rows, cols]
```
This way you only copy over and convert the changed pixels each move.
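The broadcasting point can be checked with a small self-contained sketch; here a plain numpy array stands in for the iosurface’s s.array, which is assumed to be (h, w, 4) uint8:

```python
import numpy as np

# Stand-in for the iosurface pixel array: height x width x 4 channels.
surface = np.zeros((50, 50, 4), dtype=np.uint8)

# Monochrome working image shaped (h, w, 1) so it can broadcast
# across the three color channels.
imageArray = np.full((50, 50, 1), 200, dtype=np.uint8)

rows = slice(10, 20)
cols = slice(10, 20)
# (10, 10, 1) broadcasts against the (10, 10, 3) destination:
with np.errstate():  # no lock needed for a plain array; s.Lock() in the real code
    surface[rows, cols, 0:3] = imageArray[rows, cols]
```

If imageArray were shaped (r, c) instead of (r, c, 1), the assignment to three channels would raise a shape error; the trailing singleton axis is what makes the broadcast legal.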
-
By the way... you might get acceptable performance with your original code if you use pil2ui with JPEG instead of PNG format during touch_moved, then switch over to the PNG during touch_ended.
Also, you might eke out some performance by using a single overlay view, but rendering N ui.Images that are drawn during the view's draw method. That way you don't have the overhead of multiple views moving around; you would keep track of the pixel locations. See ui.Image.draw, which lets you draw into an image context. I think draw itself is fast if you have the ui.Images already created. That said, the iosurface approach should beat the pants off these methods.
-
Ok, I am going to give it a try. Regarding the N ui.Images method, I actually did that before and it was lagging. I think that’s because at every frame it dynamically draws the N ui.Images, as opposed to my current approach where at each frame the set_needs_display() method is used for only one miniview, while the other ones are just “inactive” or “frozen”.
Also, I got a big improvement by only sending a set_needs_display() request every 3 touch_moved.
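That every-3rd-move throttle can be factored into a tiny helper; this is a hypothetical sketch (in Pythonista the redraw callback would be the view’s set_needs_display):

```python
class MoveThrottle:
    # Fires the redraw callback only once every `every` touch_moved
    # events, cutting down on display requests.
    def __init__(self, every, redraw):
        self.every = every
        self.redraw = redraw
        self.count = 0

    def touch_moved(self):
        self.count += 1
        if self.count % self.every == 0:
            self.redraw()

calls = []
throttle = MoveThrottle(3, lambda: calls.append(1))
for _ in range(10):
    throttle.touch_moved()
# redraw fired on moves 3, 6 and 9
```

In the real view you would construct it as MoveThrottle(3, miniview.set_needs_display) and call throttle.touch_moved() from the touch handler.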
-
Regarding the use of a float array: it’s kind of necessary for the smudge to be beautiful; otherwise, with uint8, you get visibly ugly and persistent spots around the white areas. The cause is that if, for instance, you have a pixel of value 254 next to a pixel of value 255 and smudge them, then at the first frame the 254 pixel will try to become, say, 254.2, but as it is an integer it will stay equal to 254, so the same thing will happen at the second frame, the third frame, etc. It will keep trying to go to 255 but fail, and get completely absorbed to 254. In the end the smudge won’t have affected it, and it gets worse: it will stay equal to 254 whatever number of strokes you make on it. On the other hand, if you use floats, then at the first frame the 254.0 pixel will become 254.2 (rounded to 254 for display, but kept as 254.2 in the array), at the second frame it will become, say, 254.4, and maybe then 254.55, which will be displayed as a 255 pixel, so the smudge really will have affected it correctly.
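The quantization stickiness described above is easy to reproduce numerically. Here is a toy smudge step that moves a pixel 20% of the way toward 255 each frame (the 20% is an illustrative number, not the real smudge kernel):

```python
import numpy as np

def smudge_step(p):
    # One smudge iteration nudges the pixel 20% of the way toward 255.
    return p + 0.2 * (255 - p)

# uint8 pixel: the +0.2 increment is truncated away every single frame,
# so the pixel is stuck at 254 forever.
p_int = np.uint8(254)
for _ in range(10):
    p_int = np.uint8(smudge_step(float(p_int)))   # 254.2 -> 254 again

# float pixel: the fractional progress accumulates, and after a few
# frames it rounds up to 255 on display.
p_float = 254.0
for _ in range(10):
    p_float = smudge_step(p_float)
```

After ten iterations the integer pixel is still 254 while the float pixel has crossed the 254.5 rounding threshold.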
-
I tried with IOSurface, and it’s really extremely fast!
I didn’t have to change my code too much, so it will only take me a few minutes to clean things up and post a link!
Thank you!!!
-
Here it is!!!
https://gist.github.com/medericmotte/37e43e477782ce086880e18f5dbefcc8
It made the code so much simpler and faster!
Thank you so much!
PS: Have you seen my post above about the “aliasing” vibrato in the real time audio buffer code? I don’t want to take too much of your time but now that one problem has definitely been fixed, I kind of hope the same for audio :)
-
I have not run the audio issue yet... but two possibilities:
- Precision issue. The samples are float32, not double. For filtering you probably want to work in doubles before writing.
- Overrun: if your code falls behind, iOS will skip frames. There are some fields in the timestamp structure that help tell you the time the buffer will start, etc., but I haven't dug into them.
Going to a high sample rate means your code has less time to produce the same number of samples, increasing the chance of overrun. You could compare the time that render takes to numSamples/sampleRate; render time should be less than, say, 80% of the actual audio time. That's why I started with a low sample rate.
I tried speeding things up with numpy, but got bad results; care needs to be taken with how time is treated. Since frequency and amplitude change discretely, there might be a better design that ensures continuity of samples.
- Have you tried writing your samples to a wave file, then playing it back? I.e., is your filter and logic set up correctly?
Also, for the sawtooth, I would think scaling the amplitude correctly is super important, because the signal must stay between -1 and +1, otherwise you saturate and that will produce harmonics. I haven't really looked at your code, but it might be worth mocking up the code to write to a wave file and see.
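The wave-file suggestion is easy to mock up. Here is one hedged sketch that renders a naive (not yet antialiased) sawtooth offline with Python's standard wave module, so the oscillator and filter logic can be checked by ear without real-time constraints; the 11000 sample rate and 440 Hz are arbitrary choices:

```python
import struct
import wave

# Render one second of a naive sawtooth and write it to a
# 16-bit mono wave file for offline listening.
sampleRate = 11000
f = 440.0
theta = 0.0
frames = []
for n in range(sampleRate):
    saw = 2.0 * theta - 1.0               # phase in [0, 1) -> saw in [-1, 1)
    theta = (theta + f / sampleRate) % 1.0
    frames.append(struct.pack('<h', int(saw * 32767)))

with wave.open('saw_test.wav', 'wb') as w:
    w.setnchannels(1)
    w.setsampwidth(2)      # 16-bit samples
    w.setframerate(sampleRate)
    w.writeframes(b''.join(frames))
```

Any filter under test can be inserted between the oscillator and the int conversion, then the result auditioned at leisure.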
-
@JonB Ok now I realize the issue was actually there even without the filter. I went back to my non-real-time synth, took out all the effects, played a simple high pitch sawtooth and the unwanted vibrato/“ghost frequency” was there. I am pretty sure now it is aliasing. I solved the problem using the PolyBLEP method, see Phelan Kane’s article and the code sample at the end:
http://metafunction.co.uk/all-about-digital-oscillators-part-2-blits-bleps/
Here are my current modifications of @JonB’s audio processing buffer files, which you can get at:
https://gist.github.com/87d9292b238c8f7169f1f2dcffd170c8
Here are the attributes to add in the AudioRenderer class:
```python
filtersNumber = 16
self.filtersBuffers = [0.0]*filtersNumber
# cutoffParam=1 will mean no filtering
self.cutoffParam = 1
```
Here is the anti-aliased sawtooth and the 16 filters in the render method:
```python
def render(self, buffer, numFrames, sampleTime):
    '''override this with a method that fills buffer with numFrames'''
    #print(self.sounds, self.theta, v.touches)
    fb = self.filtersBuffers
    for frame in range(numFrames):
        b = 0.0
        cut = self.cutoffParam
        for touch in self.sounds:
            f, a = self.sounds[touch]
            t = self.theta[touch]
            # replace 110.0 by f if you want to control the frequency
            # and check that there is no aliasing
            dt = 110.0/self.sampleRate
            t += dt
            t = t % 1
            saw = 2*t - 1
            self.theta[touch] = t
            if t < dt:
                t /= dt
                saw -= t + t - t*t - 1.0
            elif t > 1.0 - dt:
                t = (t - 1.0)/dt
                saw -= t*t + t + t + 1.0
            b += saw*1
            # the a control (from 0 to 1) is used to change the cutoff:
            lerpfact = 0.2
            cut = (1 - lerpfact)*cut + lerpfact*a
        # set the first filter input to b = sawtooth wave
        input = b
        # loop over the chain of filters
        for f in range(len(fb)):
            fb[f] = (1 - cut)*fb[f] + cut*input
            input = fb[f]
        buffer[frame] = input
    self.filtersBuffers = fb
    self.cutoffParam = cut
    return 0
```
It’s set up to a have a fat filtered 110 Hz saw bass (showing the filter works in real time, despite a few glitches), but if you replace 110 by f to control the frequency you can check there is no aliasing even at high frequencies.
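For reference, the PolyBLEP correction from the render method can be pulled out into a standalone function (same formulas as above, with phase t in [0, 1) and dt = f/sampleRate), which makes it easy to verify that the corrected wave stays inside [-1, 1] even right at the discontinuity:

```python
def polyblep_saw(t, dt):
    # Naive sawtooth with the PolyBLEP correction applied around the
    # phase wrap at t = 0 (t is the phase in [0, 1), dt = f/sampleRate).
    saw = 2.0 * t - 1.0
    if t < dt:
        x = t / dt
        saw -= x + x - x * x - 1.0      # smooth the start of the ramp
    elif t > 1.0 - dt:
        x = (t - 1.0) / dt
        saw -= x * x + x + x + 1.0      # smooth the end of the ramp
    return saw

# One period at a fairly high frequency (3520 Hz at 44100 Hz).
dt = 3520.0 / 44100.0
n = 1000
samples = [polyblep_saw(i / n, dt) for i in range(n)]
```

At t = 0 the correction exactly cancels the naive -1, so the wrap point lands on 0 instead of jumping, which is what removes the aliasing energy.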
You can find the modified files here:
https://gist.github.com/medericmotte/b523acbc1c446ca889e7471afa5a9b2f
-
@JonB How would I go about getting a stereo buffer with your code? Most of my sound is mono but in the very end I like to add two delay effects for each ear with different parameters to give a sensation of space, so I would like to end the render loop with something like
buffer[frame] = [out1, out2] or something similar.
-
Here is a cleaned-up idea, whereby there are different generators. Each generator has an amplitude and base frequency, and an internal numpy buffer which is used as a circular buffer, so we can make use of numpy rather than filling one sample at a time; that should be more performant.
Samples get buffered based on a timestamp, and get generated in touch_moved or update. But phase always increments on a per-sample basis, so if we fill the buffer, we just fill what we can. The render method of the audio unit then simply pops the number of requested samples out of the circular buffer and writes them into the audio unit buffer, so that part should never overrun.
I still get some dropouts, though you can pre-charge the buffer more in touch_began. The debug text is showing number of buffer underruns, overruns, and current number of buffered samples.
This approach may result in some latency, but it's better at preventing tearing.
I offer a sine wave, sawtooth, and triangle wave; my plan was then to implement a filter decorator/function that lets you define filter coefficients, but that is not done yet.
https://gist.github.com/jsbain/ed6a6956c43f3d8fd40092e93e49a007
The buffering, filtering, and mixing are all done as doubles; conversion to float happens at the end. I try to maintain phase continuity (though in retrospect I might be doing it wrong).
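The pop-only render design can be sketched independently of the audio unit. This is my own minimal stand-in (the RingBuffer name and API are not from the gist): the generator pushes chunks whenever it can, and the render callback pops exactly the requested number of samples, counting underruns instead of blocking:

```python
import numpy as np

class RingBuffer:
    # Minimal single-producer/single-consumer circular buffer sketch:
    # a generator pushes chunks when it can; the audio render callback
    # pops exactly n samples per call and counts underruns.
    def __init__(self, size):
        self.data = np.zeros(size)
        self.read = 0
        self.count = 0       # samples currently buffered
        self.underruns = 0

    def push(self, chunk):
        n = len(chunk)
        write = (self.read + self.count) % len(self.data)
        idx = (write + np.arange(n)) % len(self.data)
        self.data[idx] = chunk
        self.count += n

    def pop(self, n):
        if self.count < n:
            self.underruns += 1
            return np.zeros(n)            # play silence on underrun
        idx = (self.read + np.arange(n)) % len(self.data)
        out = self.data[idx]
        self.read = (self.read + n) % len(self.data)
        self.count -= n
        return out

rb = RingBuffer(1024)
rb.push(np.arange(300.0))   # generator side: 300 samples buffered
a = rb.pop(256)             # render side: ok, 300 >= 256
b = rb.pop(256)             # only 44 left -> underrun, silence
```

Pre-charging the buffer in touch_began corresponds to an extra push before the first pop, trading latency for a safety margin against underruns.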
-
@Mederic for stereo,
streamFormat.mChannelsPerFrame = 1;
would be set to 2 instead of 1. Some of the other fields would then get multiplied by 2, see
https://developer.apple.com/documentation/coreaudio/audiostreambasicdescription
In my more recent code, above, buffer in render_callback would be cast to a pointer to c_float * 2 * inNumberFrames, in which case, when converting to an array, b in render could be accessed as b[idxChannel, :] for one channel.
You could either create the generators in stereo, or have different generators filling reach channel.
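The channel layout can be sketched with plain numpy, assuming an interleaved stream format where frames are laid out as L0 R0 L1 R1 ... (the exact layout depends on the AudioStreamBasicDescription flags, so treat this as an illustration rather than the buffer CoreAudio actually hands over):

```python
import numpy as np

numFrames = 4
# Stand-in for the raw callback buffer: 2*numFrames interleaved floats.
raw = np.zeros(2 * numFrames, dtype=np.float32)

# A (numFrames, 2) view lets each channel be filled separately,
# e.g. a dry left channel and a differently-delayed right channel:
b = raw.reshape(numFrames, 2)
b[:, 0] = 0.5     # left: out1
b[:, 1] = -0.5    # right: out2
```

Because the reshape is a view (not a copy), writing through b fills the underlying interleaved buffer directly, with no extra pass at the end of the render loop.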
-
@JonB Thanks. I noticed you haven’t antialiased your sawtooth, hence I still hear the unwanted vibrato/ghost frequency in the high frequencies (although maybe less dominant).
Here is my previous code with only the antialiased sawtooth (you will notice that the ghost frequency is gone):
https://gist.github.com/medericmotte/d99357919ce0ed658e5fa6e3b9d82121
And here is my current code with antialiased sawtooth, vibrato, unison, chords, filter, and delay all in one render method:
https://gist.github.com/medericmotte/d8e81b7e0961006d7026f16cc195682c
With this inefficient “all computation in the render method” implementation, on my device I am able to play one LFO vibrato, 4 notes at a time (4 fingers), each triggering 4 detuned unison voices, so 16 antialiased sawtooths in total plus one LFO sine, plus 4 one-pole IIR filters in series, and a 1-second delay with feedback, all in one render method, with sampleRate = 44100.
It works glitchless with this setting, but if I set the filtersNumber attribute to, say, an unusually high 16 poles, some glitches can be heard while changing the cutoff.
My initial idea, when creating this topic, was to fill the circular buffer with the numbers computed by my already-programmed non-real-time synth, as they come (in 60 Hz chunks). As you said, it may or may not introduce a bit of latency depending on the complexity of the synth, but it will at least prevent the glitching as long as I set the latency high enough. That’s what I am going to do next.
Regarding the use of numpy in this context: before I created this thread I had actually implemented two versions of my non-real-time synth, one standard serialized, and the other using numpy in parallel.
For the parallel one, the saw/chord/unison part was easy to parallelize, but the trick was to get a parallel algorithm for the IIR filters and feedback delays. The problem is that their output depends on its own past values, so they can’t be parallelized as-is. I managed to get a parallel algorithm by truncating the infinite expansion of their transfer function, for instance by approximating 1/(1 - a z^-1) = 1 + a z^-1 + a^2 z^-2 + a^3 z^-3 + ... + a^n z^-n + ... by 1 + a z^-1 + a^2 z^-2 + ... + a^10 z^-10. Then it becomes a 10th-order FIR filter and can be implemented in parallel with numpy. A fast implementation of this approximation is the (classic):
```python
for i in range(approximationOrder):
    OutputVector = 1 + a*InputVector
    InputVector = OutputVector
```
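The truncation idea can be sanity-checked with numpy. This is my own sketch with made-up parameters, not the synth code itself: a one-pole IIR 1/(1 - a z^-1) run sample by sample, against its 10th-order FIR truncation evaluated as a single convolution:

```python
import numpy as np

a = 0.6
order = 10
x = np.random.default_rng(0).standard_normal(64)

# Exact one-pole IIR: y[n] = x[n] + a*y[n-1], transfer 1/(1 - a z^-1).
y_iir = np.zeros_like(x)
acc = 0.0
for n in range(len(x)):
    acc = x[n] + a * acc
    y_iir[n] = acc

# Truncated FIR approximation: coefficients 1, a, a^2, ..., a^order,
# i.e. the geometric-series expansion cut off after `order` terms.
coeffs = a ** np.arange(order + 1)
y_fir = np.convolve(x, coeffs)[:len(x)]

# The residual is the discarded tail, on the order of a^(order+1).
err = np.max(np.abs(y_iir - y_fir))
```

The convolution is embarrassingly parallel, which is what makes the truncated form numpy-friendly; the price is the a^(order+1)/(1-a) tail error, so the closer |a| is to 1 (low cutoffs, long delays), the higher the order needs to be.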
Here is a test code for my serialized synth (the saw hasn’t been antialiased yet in this code):
https://gist.github.com/medericmotte/5330028059e3e94198a14b4f87a9189e
Here is the equivalent for my parallelized synth:
https://gist.github.com/medericmotte/85b0d3f9eb7bb30c03f87e5c0eb16322
On my device, to compute a 5-second sound (3*4 sawtooths + 16 one-pole filters), the serialized synth takes 2.9 seconds and the parallelized synth takes 1.2 seconds (with a 10th-order FIR approximation of each IIR).
So they are kind of faster than real time with this simple setup, which makes me think that latency may not even be a problem in most cases.
-
@JonB My last post has changed quite a lot during the day so I hope you read the last version :) Sorry about that.