Create a Music Visualizer in 7 Steps with Ruby

October 17th 2010

I haven’t seen much written about this, but with Ruby-Processing (JRuby plus Processing), it’s a snap to get visuals responding to sound.

These instructions will assume you’re running Snow Leopard, but will most likely work on another OS (Linux, Windows) if you have Ruby and Java installed — I just make no guarantees =)

Now, let’s open up a terminal and get down to business:

1) Install Ruby-Processing

For this exercise we’ll be using Ruby-Processing, a fantastic gem by Jeremy Ashkenas. What’s really great about this gem is that it comes complete with JRuby, Processing, and Minim: pretty much everything we need to make the magic happen.

To get this badboy installed all you need to do is the following:

sudo gem install ruby-processing 

2) Create your visualizer processing sketch

Ruby-Processing comes with some handy generators, so let’s use one. Move to a nice directory where you want this sketch to live and the following command will start a file for you:

rp5 create visualizer 

Now, boot up your text-editor of choice and open visualizer.rb. I’ve commented the lines I’ve added… pretty basic…

# visualizer.rb

  class Visualizer < Processing::App

    def setup
      smooth  # smoother == prettier
      size(1280,100)  # let's pick a more interesting size
      background 10  # ...and a darker background color
    end

    def draw
    end

  end

  Visualizer.new :title => "Visualizer" 

To run your (currently not very interesting) sketch:

rp5 run visualizer.rb 

3) Import Minim

Since Ruby-Processing is JRuby, you can use Java libraries as well as Ruby gems in your sketches (just no C extensions). Anyways, the following code is how you load in Minim, the sound library we’ll be using. Ruby-Processing includes Minim with its install, and will include that directory in its search path.

The only thing to note here is that with Java libraries we have to be specific about which packages we want to load from the library. We’ll use the import method after using Ruby-Processing’s load_library command.

# visualizer.rb

  class Visualizer < Processing::App

    # Load minim and import the packages we'll be using
    load_library "minim"
    import "ddf.minim"
    import "ddf.minim.analysis"

    def setup
      smooth  # smoother == prettier
      size(1280,100)  # let's pick a more interesting size
      background 10  # ...and a darker background color
    end

    def draw
      # nothing here yet...
    end

  end

  Visualizer.new :title => "Visualizer"
 

4) Initialize Minim to get Sound/Music Input

This is where we get Minim all set up to do our sound magic. We’ll create a Minim object and tell it to start listening to our primary sound input (microphone, line in, or Soundflower if you’re using it).

Then we’ll create an FFT object, which we’ll be using to create most of the sound-responsiveness, as well as a beat detection object, which is also fun. For the FFT, we’ll create an array called @freqs, which will list the audio frequencies that we are interested in getting values for. I just looked at which frequencies were in the VLC equalizer, and I figure this will give us a pretty good range.

Finally, we’ll create a bunch of arrays that we’ll be using a little bit later and set a “smoothness” factor, which I will also talk a bit about in the next step.

# visualizer.rb
  ...

  def setup_sound
    # Creates a Minim object
    @minim = Minim.new(self)

    # Lets Minim grab sound data from mic/soundflower
    @input = @minim.get_line_in

    # Gets FFT values from sound data
    @fft = FFT.new(@input.left.size, 44100)

    # Our beat detector object
    @beat = BeatDetect.new

    # Set an array of frequencies we'd like to get FFT data for
    #   -- I grabbed these numbers from VLC's equalizer
    @freqs = [60, 170, 310, 600, 1000, 3000, 6000, 12000, 14000, 16000]

    # Create arrays to store the current FFT values,
    #   previous FFT values, highest FFT values we've seen,
    #   and scaled/normalized FFT values (which are easier to work with)
    @current_ffts   = Array.new(@freqs.size, 0.001)
    @previous_ffts  = Array.new(@freqs.size, 0.001)
    @max_ffts       = Array.new(@freqs.size, 0.001)
    @scaled_ffts    = Array.new(@freqs.size, 0.001)

    # We'll use this value to adjust the "smoothness" factor
    #   of our sound responsiveness
    @fft_smoothing = 0.8
  end

…and let’s be sure to call this from “setup”

# visualizer.rb

  ...

  def setup
    smooth  # Make it prettier
    size(1280,100)  # Let's pick a more interesting size
    background 10  # Pick a darker background color

    setup_sound
  end

5) Use Minim to get FFT each Frame

Now that we have Minim in there and set up, let’s go ahead and use it. Since we want updated info from it every frame, let’s create a method that we’ll call every frame to get the current sound data.

We’re primarily interested in FFT values for those frequencies I mentioned above. We’ll iterate through those, ultimately populating the “Scaled FFTs” array — the one we’ll be using to draw. (I’ll go into this a bit more after the code)

# visualizer.rb
  ...

  def update_sound
    @fft.forward(@input.left)

    # Keep a copy of the last frame's values (dup avoids both names
    #   pointing at the same array)
    @previous_ffts = @current_ffts.dup

    # Iterate over the frequencies of interest and get FFT values
    @freqs.each_with_index do |freq, i|
      # The FFT value for this frequency
      new_fft = @fft.get_freq(freq)

      # Set it as the frequency max if it's larger than the previous max
      @max_ffts[i] = new_fft if new_fft > @max_ffts[i]

      # Use our "smoothness" factor and the previous FFT to set a current FFT value
      @current_ffts[i] = ((1 - @fft_smoothing) * new_fft) + (@fft_smoothing * @previous_ffts[i])

      # Set a scaled/normalized FFT value that will be
      #   easier to work with for this frequency
      @scaled_ffts[i] = (@current_ffts[i]/@max_ffts[i])
    end

    # Check if there's a beat, will be stored in @beat.is_onset
    @beat.detect(@input.left)
  end

and add this method call to “draw”:

# visualizer.rb
  ...

  def draw
    update_sound
  end
 

The FFT values that we’ll be getting from Minim can basically be thought of as how much of that frequency is present at that moment. So, generally, when bass is heavy the lower-frequency FFT values will be greater, bells and claps will affect the higher-frequency FFT values, and vocals will move the values for the frequencies in the middle.
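If “how much of that frequency is present” feels abstract, here is a rough sketch in plain Ruby (no Minim needed). To be clear, this is not how Minim computes its FFT: it is a naive single-frequency correlation, and the names sine_buffer and strength_at are made up for this illustration. It just shows that a 170 Hz tone scores high at 170 Hz and close to nothing at 6000 Hz.

```ruby
# Rough stand-in for "how much of frequency f is in this buffer".
# NOT Minim's FFT: a naive single-frequency correlation, for illustration only.
SAMPLE_RATE = 44_100.0

# Build a buffer holding a pure sine tone (hypothetical helper).
def sine_buffer(freq, size = 1024)
  Array.new(size) { |n| Math.sin(2 * Math::PI * freq * n / SAMPLE_RATE) }
end

# Correlate the buffer with a cosine and sine at `freq` and return the
# resulting amplitude (close to 1.0 for a full-scale tone at that frequency).
def strength_at(buffer, freq)
  re = 0.0
  im = 0.0
  buffer.each_with_index do |sample, n|
    angle = 2 * Math::PI * freq * n / SAMPLE_RATE
    re += sample * Math.cos(angle)
    im -= sample * Math.sin(angle)
  end
  2 * Math.sqrt(re * re + im * im) / buffer.size
end

bass = sine_buffer(170)
puts strength_at(bass, 170)    # large: the tone sits right at this frequency
puts strength_at(bass, 6000)   # tiny: almost no energy up here
```

Real music is a mix of many tones at once, which is why we ask for several frequencies every frame.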

However, the units of the values Minim gives us are somewhat unpredictable/not uniform across frequencies, so these arrays will be used to make these values more manageable. We want to do two operations on the FFT values to tame them a bit: normalization and smoothing. We use the “Max FFTs” array to help with normalization and the “Previous FFTs” array for smoothing.

Normalizing the FFT values will make sure that for each frequency we have a value between 0 and 1. This will allow us to easily turn three sound frequencies into a color simply by multiplying each value by 255 and assigning them to red, green, and blue. Similarly we can turn a frequency into a horizontal position by multiplying it by the width of the sketch. To normalize an FFT value we need to keep track of the largest value we’ve witnessed for that frequency (it will be different for each) and divide any new value by that.
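Here is a minimal sketch of that normalization idea on its own, outside the sketch. The class name RunningMaxNormalizer and the readings are made up for illustration, and it normalizes raw readings, whereas the visualizer normalizes the smoothed values.

```ruby
# Hypothetical helper: scales readings against the largest value seen so far.
class RunningMaxNormalizer
  def initialize(floor = 0.001)
    @max = floor   # small non-zero start avoids dividing by zero
  end

  def normalize(value)
    @max = value if value > @max
    value / @max.to_f
  end
end

norm = RunningMaxNormalizer.new
readings = [2.0, 8.0, 4.0, 1.0]            # made-up FFT magnitudes
scaled = readings.map { |r| norm.normalize(r) }
p scaled                                   # => [1.0, 1.0, 0.5, 0.125]
p scaled.map { |s| (s * 255).round }       # as color channels: [255, 255, 128, 32]
```

Notice that the first reading always comes out as 1.0, because at that point it is the largest value we have seen. The scale settles down once a few loud moments have gone by.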

FFT values change rapidly and drastically. If we were using them raw, our animation would be all over the place, which may not be a bad thing, but it’s definitely something we’d like to control. Enter the “smoothness” factor. By keeping track of the previous FFT value and using this “smoothness” factor (a number between 0 and 1), we can scale the effect of the new FFT values coming in. The “smoothness” factor is the share of our current value that comes from our last value; the rest (1 minus the factor) comes from Minim’s new reading. It’s basically a running average. A value of 0 turns this off (like raw), and a value of 1 won’t be very interesting, because the values never change. Play with this variable =)
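To get a feel for the “smoothness” factor without any audio, here is a small sketch that runs the same update rule as update_sound over some made-up readings. The smooth helper and the readings array are just for illustration.

```ruby
# Hypothetical helper mirroring the update rule used in update_sound.
def smooth(previous, reading, smoothing)
  (1 - smoothing) * reading + smoothing * previous
end

readings = [0.0, 1.0, 1.0, 0.0, 0.0]   # made-up, already-normalized values

[0.0, 0.5, 1.0].each do |smoothing|
  value = 0.0
  trace = readings.map { |r| value = smooth(value, r, smoothing) }
  puts "smoothing=#{smoothing}: #{trace.inspect}"
end
```

With a factor of 0.0 the trace is identical to the readings, with 0.5 it rises and falls gradually (0.0, 0.5, 0.75, 0.375, 0.1875), and with 1.0 it stays stuck at the starting value.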

6) Draw using those scaled values as parameters

Now, we get to the fun part. We use that “Scaled FFTs” array to create an animation. This simple animation will be a circle that changes position, color, and size to the sound/music/noise.

We do this by setting its different parameters equal to the “Scaled FFT” value for a frequency of our choice. For example, if we wanted the circle to get bigger when lower sound frequencies are stronger, we could use @scaled_ffts[0], @scaled_ffts[1], @scaled_ffts[2], or @scaled_ffts[3] (which represent the frequencies 60 Hz, 170 Hz, 310 Hz, and 600 Hz respectively). However, we need to remember to multiply that value by something. Those “Scaled FFT” values by themselves are just numbers between 0 and 1, so if the size of our circle were just set to equal one of those values, it would be a very small circle. So, we use the following pattern:

desired_value = @scaled_ffts[frequency_index] * maximum_desired_value + minimum_desired_value
 

For colors this is easy. By default Processing expects red, green, and blue values between 0 and 255. We don’t need to add zero, so we just multiply the “Scaled FFT” value for whichever frequency we want by 255.
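As a quick sanity check of that pattern, here is a tiny sketch in plain Ruby. The helper name scale_to and the numbers are made up for illustration. One thing it makes visible: the largest value you can get is the multiplier plus the minimum, so the multiplier is really the size of the range.

```ruby
# Hypothetical helper: stretch a 0..1 value to span `range`, starting at `minimum`.
def scale_to(scaled, range, minimum = 0)
  scaled * range + minimum
end

p scale_to(0.5, 255)        # color channel: 127.5
p scale_to(0.25, 90, 10)    # circle diameter between 10 and 100 pixels: 32.5
```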

See below for an example of me using this pattern for each of the circle’s parameters. Also I use Minim’s beat detection object to quadruple the size.

  def animate_sound
    # Create a circle animated with sound:
    # Horizontal position will be controlled by the FFT of 60hz (normalized against width)
    # Vertical position - 170hz (normalized against height)
    # red, green, blue - 310hz, 600hz, 1khz (normalized against 255)
    # Size - 170hz (normalized against height), quadrupled on beat

    @size = @scaled_ffts[1]*height
    @size *= 4 if @beat.is_onset

    @x1  = @scaled_ffts[0]*width + width/2
    @y1  = @scaled_ffts[1]*height + height/2

    @red1    = @scaled_ffts[2]*255
    @green1  = @scaled_ffts[3]*255
    @blue1   = @scaled_ffts[4]*255

    fill @red1, @green1, @blue1
    stroke @red1+20, @green1+20, @blue1+20

    ellipse(@x1, @y1, @size, @size)
  end

Oh, and be sure to add this to “draw”:

  def draw
    update_sound
    animate_sound
  end

7) Play!

Well, that’s about it actually. So the only thing left to do is experiment. Add this to animate_sound to create another circle that operates on higher frequencies.

  # Add another circle using different controlling frequencies
  @x2  = width/2 - @scaled_ffts[5]*width
  @y2  = height/2 - @scaled_ffts[6]*height

  @red2    = @scaled_ffts[7]*255
  @green2  = @scaled_ffts[8]*255
  @blue2   = @scaled_ffts[9]*255

  fill @red2, @green2, @blue2
  stroke @red2+20, @green2+20, @blue2+20

  ellipse(@x2, @y2, @size, @size)
 

Enjoy!

Source: visualizer.rb