Ruby One Liner to Sort and Run Length Encode a String

I'm not a Ruby programmer, but I thought this was kind of cool. While I was poking around on Stack Overflow, the subject of storing letter frequencies for words came up. While there may be a better solution, alphabetizing the word and writing any run of three or more identical letters as the number of occurrences followed by the letter seemed like a passable one. For instance, "mississippi" is alphabetized to "iiiimppssss", and the runs of three or more are then collapsed to give "4impp4s" (the two p's stay as they are because they fall below the threshold). Seems simple enough, and in the case being discussed it would have very little impact on the storage mechanism or the code around it.

The whole thing turns out to be pretty easy as a Ruby one liner:

"mississippi".split( // ).sort.join.gsub(/(.)\1{2,}/) { |s| s.length.to_s + s[0,1] }

That can probably be made a lot better by a Ruby expert. The regular expression matches any character followed by the same character two or more times (that is, a run of three or more) and passes each matched run to the block as the parameter s. The block returns the replacement string: the length of the matched run (the character count) followed by one of the characters from it. gsub applies this as a global substitution on the sorted, re-joined string. Wha-bam!!! I wonder if there's an odd edge case where this breaks.
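On the edge-case question, here is a rough sketch that wraps the same logic in a method so it can be poked at. The method name sort_and_rle is made up for illustration, and chars is swapped in for split( // ) only because it reads a little more directly. One case that does look shaky is input containing digits: a literal digit in the word cannot be told apart from a run count, so two different inputs can produce the same output.

```ruby
# Hypothetical helper (the name is an assumption, not from the original post).
# Sorts the characters of a word, then collapses every run of three or more
# identical characters into "<count><char>".
def sort_and_rle(word)
  sorted = word.chars.sort.join # "mississippi" -> "iiiimppssss"
  sorted.gsub(/(.)\1{2,}/) { |run| "#{run.length}#{run[0]}" }
end

puts sort_and_rle("mississippi") # 4impp4s
puts sort_and_rle("aab")         # aab (no run reaches three, so nothing changes)

# Possible edge case: digits in the input collide with run counts.
puts sort_and_rle("aaa1")        # 13a (sorted to "1aaa", then "aaa" -> "3a")
puts sort_and_rle("a31")         # 13a (sorted to "13a", no runs to collapse)
```

For plain alphabetic words this should not matter, which is presumably the case the original discussion had in mind. If digits can appear in the input, the encoding would need a separator or an escape for literal digits to stay unambiguous.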

2 Responses to “Ruby One Liner to Sort and Run Length Encode a String”

  1. Cari Michael Says:

    Where did you find that picture of the guy with the long hair wearing the pink sleeveless shirt? I love it!

  2. Robert Simmons Says:

    I'm not sure what that has to do with the post on which you commented, but I'm guessing you're referring to a picture I used in a comment on a thread on an entirely different site. That picture is currently on the first page of results when you use Google's Image Search with the term "sleeveless shirt". It really is a stunning piece of clothing.
