The Ruby diff gem seems pretty good to me. I am new to Ruby though, and it seems like the documentation is lacking. Maybe there's something good out there, but I cannot seem to find it. Maybe Ruby developers just know how things like this work by magic, I don't know. Here's what reading the source code and puzzling for a while allowed me to work out.

The basic idea is to create a new object with the two lists (arrays) you want to compare. It treats strings as an array of characters. I don't know if Ruby always does that. I like it though. Feels like Haskell.

diff = Diff.new([1,2,3,4],[2,4,6])

So, what is that thing, diff? It turns out it's an object that contains a list of lists of lists. Or, just the empty list, if there are no differences between the two things being compared. You can access the differences by using an accessor on the Diff class that is called diff.

So, the gem uses the standard longest common subsequence algorithm, just like good old Unix diff. Each list at the first level inside the main list returned by the diff accessor is one contiguous run of differences between the two lists. Inside those interior lists are three-item lists, each detailing the difference at a particular place in the list. This part seems strange to me. Wouldn't an object be better here? Maybe this is the Ruby way.
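For anyone who hasn't met the longest common subsequence idea before, here is a minimal dynamic-programming sketch in plain Ruby. It is not the gem's code: `lcs` is a helper name I made up for illustration, and the sketch only shows what "longest common subsequence" means for the two arrays from the earlier example.

```ruby
# Illustrative only: a textbook LCS, not the diff gem's implementation.
# `lcs` is a hypothetical helper name.
def lcs(a, b)
  # table[i][j] holds the LCS length of a[i..] and b[j..]
  table = Array.new(a.size + 1) { Array.new(b.size + 1, 0) }
  (a.size - 1).downto(0) do |i|
    (b.size - 1).downto(0) do |j|
      table[i][j] =
        if a[i] == b[j]
          table[i + 1][j + 1] + 1
        else
          [table[i + 1][j], table[i][j + 1]].max
        end
    end
  end

  # Walk the table from the top-left corner to recover one LCS.
  result = []
  i = j = 0
  while i < a.size && j < b.size
    if a[i] == b[j]
      result << a[i]
      i += 1
      j += 1
    elsif table[i + 1][j] >= table[i][j + 1]
      i += 1
    else
      j += 1
    end
  end
  result
end

common = lcs([1, 2, 3, 4], [2, 4, 6])
puts common.inspect # prints [2, 4]
```

Everything outside that common subsequence is what a diff ends up reporting as an addition or a deletion.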

Anyway, each three-item list contains either a "+" or a "-" in the zeroth slot, depending on whether the difference is an addition or a deletion. The next item is the integer index of the change. The final item is the object that has been removed or added. Here's an example:

require 'rubygems'
require 'diff'

foo = ["apple","banana","cherry","damson","emblica","fig","guava"]
bar = ["apple","cherry","damson"]
baz = ["apple","banana","damson","fig","guava"]

puts Diff.new(foo, foo).inspect
puts Diff.new(foo, bar).inspect
puts Diff.new(foo, baz).inspect
puts Diff.new(bar, baz).inspect

Running that code leads to the following output:

[[["-", 1, "banana"]], [["-", 4, "emblica"], ["-", 5, "fig"], ["-", 6, "guava"]]]
[[["-", 2, "cherry"]], [["-", 4, "emblica"]]]
[[["-", 1, "cherry"], ["+", 1, "banana"]], [["+", 3, "fig"], ["+", 4, "guava"]]]

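The inspect method just delegates to the inspect method on the diff accessor on the Diff object. inspect is a standard Ruby method, available on every object, that returns a human-readable string representation; for an Array it looks much like the literal you would type. (Actual pretty printing is what pp is for.)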
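To make the nesting concrete, the sketch below walks a result shaped like the last line of output above. It uses a plain array literal copied from that output rather than calling the gem, so it runs without the gem installed; the variable names are mine.

```ruby
# Array literal copied from the last line of output above;
# no gem call is made here.
changes = [
  [["-", 1, "cherry"], ["+", 1, "banana"]],
  [["+", 3, "fig"], ["+", 4, "guava"]]
]

lines = []
changes.each_with_index do |run, n|
  # Each outer entry is one contiguous run of differences.
  lines << "run #{n}:"
  # Each change is a three-item list: sign, index, element.
  run.each do |sign, index, item|
    verb = sign == "+" ? "add" : "remove"
    lines << "  #{verb} #{item.inspect} at index #{index}"
  end
end

puts lines
```

Destructuring the three items in the block parameters (`|sign, index, item|`) keeps the positional layout from leaking into the rest of the code.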

Really, the whole thing is pretty easy to use; I just feel like a little more documentation would be helpful. I am also curious about the decision to make the innermost atom of difference reporting a list instead of an object with descriptive methods.
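As a thought experiment on that last point, here is one way the three-item lists could be wrapped in a small object. This is purely hypothetical: `Change`, `addition?`, and `deletion?` are names invented for the sketch and are not part of the gem.

```ruby
# Hypothetical wrapper; nothing here comes from the diff gem.
Change = Struct.new(:sign, :index, :item) do
  def addition?
    sign == "+"
  end

  def deletion?
    sign == "-"
  end
end

# One run of changes, in the same shape as the output above.
run = [["-", 1, "cherry"], ["+", 1, "banana"]]
wrapped = run.map { |triple| Change.new(*triple) }

wrapped.each do |change|
  kind = change.addition? ? "added" : "removed"
  puts "#{change.item} #{kind} at #{change.index}"
end
```

A Struct keeps the data just as light as the three-item list while giving each slot a name, which is roughly what I had in mind by "descriptive methods".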

One tip: the flatten method on Array can be your friend here, particularly the flatten(1) variant, which gets rid of exactly one level of listiness.
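Here is a small sketch of that tip, again using an array literal copied from the example output instead of a live call to the gem (the variable names are mine):

```ruby
# Nested result copied from the example output; no gem call is made.
nested = [
  [["-", 1, "banana"]],
  [["-", 4, "emblica"], ["-", 5, "fig"], ["-", 6, "guava"]]
]

# flatten(1) removes only the grouping into contiguous runs,
# leaving a flat list of three-item changes.
flat = nested.flatten(1)
puts flat.inspect

# Plain flatten goes all the way down and loses the triples.
puts nested.flatten.inspect

# With the flat list it is easy to pull out, say, every removed item.
removed = flat.select { |change| change[0] == "-" }.map(&:last)
puts removed.inspect
```

The flat form is handy when you only care about what changed and not about which changes sit next to each other.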