Saturday, April 30, 2022

Implementing the (functional equivalent of a) Hilbert Transform with minimal overhead

I recently had a need to take existing audio and derive a quadrature pair of audio channels from this single source (e.g. the two channels being 90 degrees from each other) in order to do some in-band frequency conversion (frequency shifting).  The "normal" way to do this is to apply a Hilbert transformation using an FIR algorithm - but I needed to keep resources to an absolute minimum, so throwing a 50-80 tap FIR at it wasn't going to be my first choice.  

Another way to do this is to apply cascaded "Allpass" filters.  In the analog domain, such filters are used not to provide any sort of band-filtering effect, but to cause a phase change without affecting the amplitude and by carefully selecting several different filters and cascading them.  This is often done in "Phasing" type radios and this is accomplished with 3 or 4 op amp sections (often Biquad) cascaded - with another, similar branch of op-amps providing the other channel.  By careful selection of values, a reasonable 90 degree phase shift between the two audio channels can be obtained over the typical 300-3000 Hz "communications" bandwidth such that 40+ dB of opposite sideband attenuation is obtainable.


One tool that allows this to be done in hardware using op amps is Tonne Software's  "QuadNet" program which is an interactive tool that allows the input and analysis of parameters to derive component values - see .

I wished to do this in software, so a bit of searching let me to an older blog entry by Olli Niemitalo of Finland, found here:  which, in turn, references several other sources, including:

This very same same technique is also used in the "csound" library (found here) - a collection of tools that allow manipulation of sound in various ways.

My intent was this to be done in Javascript where I was processing audio real-time (hence the need for it to be lightweight) and this fit the bill.  Olli's blog entry provided suitable information to get this "Hilbert" transformation working.   Note the quotes around "Hilbert" indicating that it performs the function - but not via the method - of a "real" Hilbert transform in the sense that it provides a quadrature signal.

The beauty of this code is that only a single multiplication is required for each channel's filter - a total of eight multiplications in all for each iteration of the two channels - each with four sections - something that is highly beneficial when it comes to keeping CPU and memory utilization down!

As noted above, this code was implemented in Javascript and the working version is represented below:  It would be trivial to convert this to another language - particularly C:

* * *

Here comes the code!

First, here are the coefficients used in the allpass filters themselves - the "I" and the "Q" channels being named arbitrarily:

// Biquad coefficients for "Hilbert - "I" channel
  var ci1=0.47940086558884;  //0.6923878^2
  var ci2=0.87621849353931; //0.9360654322959^2
  var ci3=0.97659758950819; //0.9882295226860^2
  var ci4=0.99749925593555; //0.9987488452737^2
  // Biquad coefficients for "Hilbert" - "Q" channel
  var cq1=0.16175849836770; //0.4021921162426^2
  var cq2=0.73302893234149; //0.8561710882420^2
  var cq3=0.94534970032911;  //0.9722909545651^2
  var cq4=0.99059915668453;  //0.9952884791278^2

Olli's page gives the un-squared values as it is a demonstration of derivation - a fact implied by the comments in the code snippet above.

In order to achieve the desired accuracy over the half-band (e.g. half of the sampling rate) a total of FOUR all-pass sections are required, so several arrays are needed to hold the working values as defined here:

  var tiq1=[0,0,0];  // array for input for Q channel, filter 1
  var toq1=[0,0,0];  // array for output for Q channel, filter 1
  var tii1=[0,0,0];  // array for input for I channel, filter 1
  var toi1=[0,0,0];  // array for output for I channel, filter 1
  var tiq2=[0,0,0];  // array for input for Q channel, filter 2
  var toq2=[0,0,0];  // array for output for Q channel, filter 2
  var tii2=[0,0,0];  // array for input for I channel, filter 2
  var toi2=[0,0,0];  // array for output for I channel, filter 2
  var tiq3=[0,0,0];  // array for input for Q channel, filter 3
  var toq3=[0,0,0];  // array for output for Q channel, filter 3
  var tii3=[0,0,0];  // array for input for I channel, filter 3
  var toi3=[0,0,0];  // array for output for I channel, filter 3
  var tiq4=[0,0,0];  // array for input for Q channel, filter 4
  var toq4=[0,0,0];  // array for output for Q channel, filter 4
  var tii4=[0,0,0];  // array for input for I channel, filter 4
  var toi4=[0,0,0];  // array for output for I channel, filter 4


The general form of the filter as described in Olli's page is as follows:

 out(t) = coeff*(in(t) + out(t-2)) - in(t-2)

In this case,  our single multiplication is with our coefficient being multiplied by the input sample, and from that we add our output from two operations previous and subtract from that our input value - also  from two operations previous.

The variables "tiq" and "toq" and "tii" and "toi" refer to input and output values of the Q and I channels, respectively.  As you might guess, these arrays must be static as they must contain the results of the previous iteration.

The algorithm itself is as follows - a few notes embedded on each section

  tp0++;        // array counters
  if(tp0>2) tp0=0;

// The code above uses the modulus function to make sure that the working variable arrays are accessed in the correct order.  There are any number of ways that this could be done, so knock yourself out!

// The audio sample to be "quadrature-ized" is found in the variable "audio" - which should be a floating point number in the implementation below.  Perhaps unnecessarily, the output values of each stage are passed in variable "di" and "dq" - but this was convenient for initial testing.

  // Biquad section 1
  di=ci1*(tii1[tp0] + toi1[tp2]) - tii1[tp2];

  dq=cq1*(tiq1[tp0] + toq1[tp2]) - tiq1[tp2];

  // Biquad section 2

  di=ci2*(tii2[tp0] + toi2[tp2]) - tii2[tp2];

  dq=cq2*(tiq2[tp0] + toq2[tp2]) - tiq2[tp2];

  // Biquad section 3

  di=ci3*(tii3[tp0] + toi3[tp2]) - tii3[tp2];

  dq=cq3*(tiq3[tp0] + toq3[tp2]) - tiq3[tp2];

  // Biquad section 4

  di=ci4*(tii4[tp0] + toi4[tp2]) - tii4[tp2];

  dq=cq4*(tiq4[tp0] + toq4[tp2]) - tiq4[tp2];

// Here, at the end, our quadrature values may be found in "di" and "dq"

* * *

Doing a frequency conversion:

The entire point of this exercise was to produce quadrature audio so that it could be linearly shifted up or down while suppressing the unwanted image - this being done using the "Phasing method" - also called the "Hartley Modulator" in which the quadrature audio is mixed with a quadrature local oscillator and through addition or subtraction, a single sideband of the resulting mix may be preserved.

An example of how this may be done is as follows:

  i_out = i_in * sine + q_in * cosine;
  q_out = q_in * sine - i_in * cosine;

In the above, we have "i_in" and "q_in" - which are the I and Q audio inputs, which could be our "di" and "dq" samples from our "Hilbert" transformation and with this is an oscillator with both sine and cosine outputs (e.g. 90 degrees apart).

These sine and cosine value would be typically produced using an NCO - a numerically-controlled oscillator - running at the sample rate of the audio system.  In this case, I used a 1k (1024) entry sine wave table with the cosine being generated by adding 256 (exactly 1/4th of the table size) to its index pointer with the appropriate modulus applied to cause the cosine pointer to "wrap around" back to the beginning of the table as needed.

If I needed just one audio output from my frequency shifting efforts, I could use either "i_out" or "q_out" so one need not do both of the operations, above - but if one wanted to preserve the quadrature audio after the frequency shift, the code snippet shows how it could be done.

* * *

Does it work?

Olli's blog indicates that the "opposite sideband" attenuation - when used with a mixer - should be on the order of -43 dB at worst - and actual testing indicated this to be so from nearly DC.  This value isn't particularly high when it comes to the "standard" for communications/amateur receivers where the goal is typically greater than 50 or 55 dB, but in casual listening, the leakage is inaudible.

One consequence of the attenuation being "only" 43 dB or so is that if one does frequency shifting, a bit of the local oscillator used to accomplish this can bleed through - and even at -43 dB, a single, pure sine wave can often be detected by the human ear amongst the noise and audio content - particularly if there is a period of silence.  Because this tone is precisely known, can be easily removed with the application of a moderately sharp notch filter tuned to the local oscillator frequency.

This page stolen from