Archive for the 'Ruby' Category

Ruby One Liner to Sort and Run Length Encode a String

Tuesday, December 30th, 2008

I'm not a Ruby programmer but I thought this was kind of cool. While poking around on Stack Overflow the subject of storing letter frequency for words came up. While there may be a better solution, the idea of alphabetizing the word and storing letter frequencies of 3 or over as the number of occurrences followed by the letter seemed like a passable solution. For instance, "mississippi" is alphabetized to "iiiimppssss" and the multiple occurrences are further reduced to result in "4impp4s". Seems simple enough and in the case being discussed it would result in very little impact on the storage mechanism or the code around it.

The whole thing turns out to be pretty easy as a Ruby one liner:

"mississippi".split( // ).sort.join.gsub(/(.)\1{2,}/) { |s| s.length.to_s + s[0,1] }

That can probably be made a lot better by a Ruby expert. The regular expression finds any character followed by the same character two or more times and then passes the matching string to the following block as a parameter s. It then returns the replacement string which will be the length of the matched string (the character count) followed by one of characters from the matching string. It executes this as a global substitution on the original string. Wha-bam!!! I wonder if there's an odd edge case where this breaks.