Real time audio buffer synth/Real time image smudge tool

Mederic

@JonB I never said I used the filter for antialiasing. Maybe I was confusing because I started talking about the aliasing problem at the same time as I talked about the filter, but they have nothing to do with each other in my code.

The only thing I use for antialiasing is the polyBLEP method (the polynomial residual), and it doesn't just help a little, it works really well (although it doesn't technically kill frequencies > Nyquist, it weakens them enough to be inaudible).
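For readers unfamiliar with polyBLEP, here is a minimal sketch of the commonly published two-sample polynomial correction applied to a naive sawtooth. It is not necessarily the code used in this synth; the names `poly_blep` and `saw_samples` are illustrative assumptions.

```python
def poly_blep(phase, dt):
    """Polynomial correction around the saw's wrap-around discontinuity.

    phase is in [0, 1); dt = freq / sample_rate is the phase step per sample.
    """
    if phase < dt:                       # first sample after the jump
        x = phase / dt
        return x + x - x * x - 1.0
    if phase > 1.0 - dt:                 # last sample before the jump
        x = (phase - 1.0) / dt
        return x * x + x + x + 1.0
    return 0.0                           # far from the jump: no correction


def saw_samples(freq, sample_rate, n):
    """Naive rising saw in [-1, 1) with the polyBLEP residual subtracted."""
    dt = freq / sample_rate
    phase = 0.0
    out = []
    for _ in range(n):
        out.append(2.0 * phase - 1.0 - poly_blep(phase, dt))
        phase += dt
        if phase >= 1.0:
            phase -= 1.0
    return out
```

Away from the discontinuity the output is the plain saw; only the two samples around each jump are reshaped.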

That being said, theoretically speaking there is actually one (theoretically perfect) filter that WOULD antialias a sound with base frequency=freq: it's, by definition, the one with frequency response g(n*freq)=(n*freq<Nyquist), i.e. 1 for every harmonic below Nyquist and 0 above. The corresponding signal the filter would need to convolve its input with is then

IR(t)=sin(freq*t)+sin(2*freq*t)+sin(3*freq*t)+...+sin(Nyquist*t)

(Here I assumed that Nyquist is a multiple of freq, and I omitted the 2pi factors) This function can be computed using sin(x)=Im(exp(ix)) and the geometric sum formula. I am gonna omit the 2pi factors here again.

IR(t)=sin((freq+Nyquist)*t/2)*sin(Nyquist*t/2)/sin(freq*t/2)
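The closed form above can be checked numerically against the explicit sum. This is only a sanity-check sketch: the function names and the test values are assumptions, the 2pi factors are omitted as in the post, and Nyquist is picked as an exact multiple of freq.

```python
import math

def ir_sum(t, freq, nyquist):
    """Explicit sum sin(freq*t) + sin(2*freq*t) + ... + sin(nyquist*t)."""
    n_max = round(nyquist / freq)
    return sum(math.sin(n * freq * t) for n in range(1, n_max + 1))

def ir_closed(t, freq, nyquist):
    """Closed form obtained from the geometric sum formula."""
    return (math.sin((freq + nyquist) * t / 2) * math.sin(nyquist * t / 2)
            / math.sin(freq * t / 2))

freq, nyquist = 100.0, 22000.0          # 220 harmonics
# Avoid values of t where sin(freq*t/2) vanishes (the closed form is 0/0 there).
max_err = max(abs(ir_sum(t, freq, nyquist) - ir_closed(t, freq, nyquist))
              for t in (0.0013, 0.0217, 0.5))
print(max_err)
```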

Anyway, it is a continuous-time function. In other words, you would have to convolve the filter's input with a continuous-time function, as opposed to the discrete impulse response DSP filters use. That's why DSP filters can't do (perfect) antialiasing, and that's why I am not counting on them. Even if oversampling would bring them closer to continuous time, they would still be time-discrete by nature. (Also, the form of their transfer functions shows that their frequency response wraps around the unit circle in the complex plane, so any lowpass DSP filter ends up not killing some frequencies if you go high enough.)

Mederic

And regarding the sum of sines, for a 100Hz note you would need to go up to Nyquist=22050 so approximately 220 sines to sum just for one note. If you want 7 detuned saws playing one note in unison, you would need 7*220= 1540 sines. If you want to play a chord of 4 notes with it and in that range, you would need approximately 1540*4= 6160 sines to sum at the same time, each multiplied element-wise by some frequency response function to evaluate in parallel on a 6160 sized vector (the frequencies), all that knowing that one might play notes in a fast way so you can’t compute things too much in advance. Do you think it could work? You know more than me when it comes to computational time (I have a mathematical background and have done a lot of programming but only as a hobby so still a lot to learn).
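The counting in this estimate can be written out as a short calculation. The helper name is hypothetical, and it assumes a 44100 Hz sample rate and that every note of the chord sits near 100 Hz.

```python
def harmonics_below_nyquist(freq, sample_rate=44100):
    """Number of harmonics n with n * freq <= sample_rate / 2."""
    return int((sample_rate / 2) // freq)

per_saw = harmonics_below_nyquist(100.0)   # sines for one saw at 100 Hz
per_note = per_saw * 7                     # 7 detuned saws in unison
per_chord = per_note * 4                   # 4-note chord
print(per_saw, per_note, per_chord)
```

Higher notes need fewer sines (a 1000 Hz note only has 22 harmonics under Nyquist), so the figure for the chord is a worst case for low notes.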

Mederic

@JonB I think I need to take a step back and put into practice all these ideas we mentioned. There is a lot of information in there, and I have a lot to learn (ctypes, objc_util, more about numpy, Apple Core Audio) and a lot of ideas to try (standard DSP, parallel approximation, convolution, sum of sines, pre-buffering), and I feel like I need to catch up with all that before we keep thinking about new ideas (the more ideas we get, the more work it will take to try them all) ;)

Also, I will try to code it first myself (as a learning exercise) before looking more at your code (as a reference/solution). I learn better that way.

I will get back to you when all of this is done. I am under the impression that you are also interested in coding a synth with Pythonista (correct me if I am wrong), but it seems that your vision is to have a complete modular environment, while I am (right now) really trying to do something minimalistic and customized for my personal (and subjective) preferences, so we will probably end up with two different codes :)

JonB

I am not coding a synth. Mostly I have been interested in pushing the boundaries of high performance on pythonista, as a fun exercise, and trying to understand these various libraries. Making the iPad bleep is also kinda fun ;)

Mederic

@JonB Coming back sooner than I thought :)

I ran an interesting experiment with your code, and there is a phenomenon I am not sure I understand correctly.

If you take your audiounittest.py code (the very first one) and play the sine while modulating the frequency very fast over the whole range, you will notice quantized modulation. Now if you take your other code, audiounittest2.py, set it to a sine, and do the same thing, you get a perfectly smooth frequency modulation. Do you hear the difference? It's not really audio glitches like those caused by overheading. It's really « quantized » frequencies.

Here is what I tried to fight that. In the render() method of audiounittest.py, I stored the last used frequency in the previous render() call in a self.previous_frequency attribute.

Then, at the beginning of render(), I assign self.previous_frequency to a prev_freq variable and the frequency from self.sound[touch_id] to a current_freq variable. Then, during the « for frame in range(numFrames) » loop, I interpolate between prev_freq and current_freq. It killed the frequency quantization effect. That’s the only way I could get that perfectly smooth modulation I was hearing in audiounittest2.py.
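The interpolation described above could look roughly like the following. This is a standalone sketch, not the actual render() from audiounittest.py: the function name, its signature, and the returned phase are assumptions made so it can run on its own.

```python
import math

def render_block(num_frames, prev_freq, current_freq, phase, sample_rate=44100.0):
    """Fill one block of sine samples, ramping the frequency across the block."""
    out = []
    for frame in range(num_frames):
        # a goes from 1/num_frames up to exactly 1.0 on the last frame,
        # so the block ends on current_freq.
        a = (frame + 1) / num_frames
        freq = prev_freq + (current_freq - prev_freq) * a
        out.append(math.sin(phase))
        # Advance the phase with the interpolated frequency to avoid clicks.
        phase = (phase + 2.0 * math.pi * freq / sample_rate) % (2.0 * math.pi)
    return out, phase
```

The caller would keep `phase` and the last frequency between calls, e.g. `samples, phase = render_block(1024, prev_freq, current_freq, phase)` followed by `prev_freq = current_freq`.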

The problem was solved, but I still wanted to understand what the issue was. I first thought it had to do with maybe the touch_moved() method getting slowed down in audiounittest.py by the less efficient implementation compared to audiounittest2.py, but I just timed them and both touch_moved are around 110Hz. In other words, the sounds[touch_id] attribute changes as often in audiounittest.py as in audiounittest2.py. So it doesn't explain the quantization.

Here is my current guess.

In audiounittest2.py, you compute the samples perfectly without missing any frequency values because the touch_moved() method is actually calling the generator and sending to it the frequency as a parameter. So no frequency is missed by the generator.

On the other hand, in audiounittest.py, the render() method generally fills the buffer faster than the duration of the buffer itself. In other words, there is always time when the render() method « waits » before filling the buffer again. The consequence is that when it fills the buffer, it actually only accesses the first few frequency values happening during this very short time and uses them to fill the whole buffer. Then, during the waiting time, it misses all the other frequency values. Btw, I don't think it falls under the scope of overheading. To me overheading is the opposite (the render() method being too slow), but here it would be kind of too fast.

What do you think about this? Am I understanding the issue correctly?

JonB

Check how many samples render is asking for (display numFrames). In my version, it asks for 1024 at a time for a 44100 sample rate. So, if you move the frequency over 8000 Hz in half a sec (about 22000 samples), it will be quantized into roughly 8000/22 Hz chunks.
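Spelled out as arithmetic, with the same numbers as above (the variable names are only for illustration):

```python
sample_rate = 44100      # samples per second
num_frames = 1024        # samples requested per render call
sweep_hz = 8000.0        # frequency range covered by the finger sweep
sweep_seconds = 0.5      # duration of the sweep

buffer_ms = 1000.0 * num_frames / sample_rate                 # duration of one buffer
buffers_in_sweep = sweep_seconds * sample_rate / num_frames   # render calls during the sweep
step_hz = sweep_hz / buffers_in_sweep                         # frequency jump between buffers
print(round(buffer_ms, 1), round(buffers_in_sweep, 1), round(step_hz))
```

So each buffer lasts a little over 23 ms, and a frequency that is only read once per buffer jumps by a few hundred Hz between consecutive buffers.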

The buffered approach updates the buffer (up to present time, at least using time.perf_counter) for every touch moved, and within update. So over 0.5 sec you might get several hundred touch events, and updates every few msec.

https://gist.github.com/6ccd9ad8ba95c373ec7d76ceaf9061bc has some minor corrections, adds some diagnostic printouts, and pulls all the ctypes garbage into a separate file.

So, even if you don't use the modular generator approach, you might consider using a custom generator (i.e. subclass ToneGenerator and handle the logic within the buffer filling methods).

It is also theoretically possible to force the audiounit to ask for data more frequently. https://developer.apple.com/documentation/audiotoolbox/1534199-generic_audio_unit_properties/kaudiounitproperty_maximumframesperslice?language=objc

		# Set the maximum number of frames the unit may be asked to render per call
		maxFrames = c_uint32(256)
		err = AudioUnitSetProperty(toneUnit, kAudioUnitProperty_MaximumFramesPerSlice,
			kAudioUnitScope_Global, 0, byref(maxFrames), sizeof(maxFrames))

However, this didn't work when I tried.

Mederic

1024 as well for me. Please refer to audiounittest.py or audiounittest2.py because they are both your version :)

I agree with you about the 8000/22Hz computation but it’s only true if my assumption (the render() method filling the buffer way faster than the buffer’s duration) is true.

If the render() method were taking 0.02s to fill a 0.02s buffer, then, since it accesses the sounds[touch_id] attribute at each iteration of the « for frame in range(numFrames) » loop, it would be reading the correct frequency values in real time and would fill a 0.02s buffer with frequency values corresponding to 0.02s of modulation. So there shouldn't be any quantizing, at least not more than in audiounittest2.py: touch_moved updates occur at 110Hz even for the fastest moves in both codes, so the quantization should happen at a 110Hz rate (which is not noticeable and sounds smooth). (Even if you remove the computation in the update part of audiounittest2.py you won't notice quantization.)

If, on the contrary, the render() method takes 0.001s to fill a 0.02s buffer, then, even if it accesses the sounds[touch_id] attribute at each iteration of the « for frame in range(numFrames) » loop, it does so within a 0.001s window, so it only sees the frequency values from that tiny period and uses them to fill the whole 0.02s buffer. It is almost as if it only gets the very first frequency of the 0.02s and uses it for the whole chunk, resulting in a 1/0.02 = 50Hz quantization. Somehow this is noticeable on fast modulation (like 30fps vs 15fps in graphics when there is fast motion, I guess).

Sorry for the redundancy but I wanted to be more precise in what I meant.

Also, I do intend to have a prebuffered approach in the future, I am just studying the differences in order to learn and figure out more exactly how the render() thread works.

Mederic

@JonB , just to keep you up to date: I finally took my initial non-real-time synth and merged it with audiounittest. It wasn’t long to do because my synth was actually already computing 60Hz chunks of sound data in the update method (of a Scene), appending it to an out array. I had done that in the past back when I was trying to achieve real time by continuously writing in a .wav file at each update (that’s when I realized that it couldn’t really be done right and came here to create this thread about the audio circular buffer functionality).

So it was already all set up for real time and the circular buffer! The only thing I had to do was to not write in a .wav file, and place the out array in the AudioRenderer’s attributes, and have the render() method start reading it after waiting a fixed latency attribute. Weirdly, latency=0 actually works glitchless for me :) I guess that’s because the first render() doesn’t work, in which case the first audio chunk is kept null by default, and then the render() thread naturally takes a delay of one chunk behind the update() method.

Of course, later I will have to use some kind of deque or circular array for the out attribute, because right now it's basically recording the whole performance without deleting anything, but I am already really happy to have been able to just take my non-real-time synth code from before and almost just copy/paste it into audiounittest.py! When you think about it, it's very similar to what I had to do with the smudge tool: I almost didn't change my code and basically merged it with your IOSurface wrapper! It's interesting how one can just plug a pure Python script into these features (IOSurface wrapper and AudioRenderer) and have it work perfectly :)
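A bounded queue for the out attribute could be sketched with collections.deque as follows. The class and method names are hypothetical, and the sketch leaves out any coordination between the update() side and the render() thread.

```python
from collections import deque

class SampleQueue:
    """FIFO of samples: the update side writes chunks, the render side reads frames."""

    def __init__(self, max_samples):
        # With maxlen set, the oldest samples are dropped once the queue is full.
        self.samples = deque(maxlen=max_samples)

    def write(self, chunk):
        """Append a chunk of freshly computed samples."""
        self.samples.extend(chunk)

    def read(self, num_frames):
        """Pop num_frames samples, padding with silence if not enough are queued."""
        n = min(num_frames, len(self.samples))
        out = [self.samples.popleft() for _ in range(n)]
        out.extend([0.0] * (num_frames - n))
        return out
```

Samples already played are discarded as they are read, so memory use stays bounded by `max_samples` instead of growing with the length of the performance.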

And I still intend to take advantage of numpy in multiple places.

JonB

@Mederic
I remembered reading about the Accelerate framework. Might have some useful bits for you

https://developer.apple.com/documentation/accelerate/vdsp?language=objc

In particular: vector polynomial evaluation (though I imagine numpy is competitive here), IIR filter in biquad form, and some other goodness that would be in scipy, but not numpy.

Mederic

@JonB Sorry for the late answer. Thank you for this. I won’t have the time to look at it for a while because I have put other stuff on pause for too long for this project and now I have to catch up.

Right now everything will just feel like a bonus/improvement to my current code because it’s already working very well on my device.

Btw, as an experiment I tried using numpy to compute sawtooths as sums of sines (in real time), and it quickly gets hard for my device to keep up. A few notes at the same time and there was overhead. To be honest, I didn't try every possibility. I basically summed the sines with the corresponding weights by stacking them as rows in a matrix and left-multiplying this matrix by a row vector of the weights.
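The matrix formulation described above might look like this in numpy. It is a sketch under assumptions: the function name is made up, and the weights are the standard Fourier coefficients of a rising sawtooth, which may differ from the weights used in the actual experiment.

```python
import numpy as np

def saw_from_sines(freq, n_samples, sample_rate=44100.0):
    """Band-limited saw: row vector of weights times a matrix of sine rows."""
    n_harm = int((sample_rate / 2) // freq)        # harmonics up to Nyquist
    k = np.arange(1, n_harm + 1)                   # harmonic numbers
    t = np.arange(n_samples) / sample_rate         # sample times in seconds
    # One row per harmonic, one column per sample: shape (n_harm, n_samples).
    sines = np.sin(2.0 * np.pi * freq * np.outer(k, t))
    # Fourier weights of a saw rising from -1 to 1: (2/pi) * (-1)^(k+1) / k.
    weights = (2.0 / np.pi) * (-1.0) ** (k + 1) / k
    return weights @ sines                         # shape (n_samples,)
```

For a 100 Hz note and a 1024-sample buffer the matrix already holds 220 x 1024 sine evaluations, which gives a feel for why several simultaneous notes become expensive.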

I think the only reasonable way to use the sum of sines decomposition is to precompute wavetables and then use them.
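A minimal sketch of that wavetable idea, assuming a single-cycle table read with linear interpolation. The function names, the table size, and the choice of weights are illustrative assumptions, not code from this thread.

```python
import math

def build_saw_table(n_harmonics, table_size=2048):
    """Precompute one cycle of a saw built from its first n_harmonics sines."""
    table = []
    for i in range(table_size):
        theta = 2.0 * math.pi * i / table_size
        s = sum((-1.0) ** (k + 1) * math.sin(k * theta) / k
                for k in range(1, n_harmonics + 1))
        table.append(2.0 / math.pi * s)
    return table

def read_table(table, phase):
    """Look up the table at phase in [0, 1) with linear interpolation."""
    pos = phase * len(table)
    i = int(pos)
    frac = pos - i
    a = table[i % len(table)]
    b = table[(i + 1) % len(table)]    # wraps around at the end of the cycle
    return a + (b - a) * frac
```

In practice one would build several tables, each with `n_harmonics` chosen so that the highest harmonic stays under Nyquist for the notes it serves, and pick the table from the note's frequency at play time.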
