Filtering Each Word in a String in Ruby

Posted by Zach Baker Thu, 27 Apr 2006 21:03:00 GMT

Here’s a little Ruby method I found handy. I recently needed to convert some strings from all capital letters into something more polite. Normally, using String#scan as a word iterator will do just fine. However, I was looking for a way to modify each word in the string but keep punctuation and other non-word stuff in place.
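
To make the limitation concrete, here is a minimal sketch of the scan approach. The sample string and variable names are illustrative, not taken from the original code. String#scan hands back only the words, so the comma and the question mark are gone by the time the words are joined again.

```ruby
text = "DUDE, WHERE IS MY CAR?"

# scan returns just the matched words; everything between them is discarded
words = text.scan(/\w+/)
# words == ["DUDE", "WHERE", "IS", "MY", "CAR"]

# Rejoining the transformed words cannot restore the original punctuation
polite = words.map { |w| w.capitalize }.join(" ")
# polite == "Dude Where Is My Car"
```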

So I created a filter (and filter!) method for String. It pulls each word from the string, yields it to the given block and uses the result to replace the original word. Sort of like map for strings:

"e. e. cummings".filter {|w| w.capitalize }
=> "E. E. Cummings"

"Merrye Olde Englande".filter {|w| w.chop }
=> "Merry Old England"

"EVETS KAINZOW".filter {|w| w.reverse.downcase.capitalize }
=> "Steve Wozniak"

So here’s the code.

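class String
  # Returns a copy of the string in which every match of word_pattern has been
  # replaced by the block's result; text between matches is left untouched.
  def filter(word_pattern = '\w+', &block)
    result_string, rest_of_string = +'', self
    word_re = Regexp.new(word_pattern) # accepts a String or a Regexp
    while (word_match = word_re.match(rest_of_string))
      rest_of_string = word_match.post_match
      # Append the text before the match, then the block's result
      # (or the original word if the block returns nil or false).
      result_string << word_match.pre_match << (yield(word_match[0]) || word_match[0])
    end
    result_string + rest_of_string
  end

  # In-place version of filter.
  def filter!(word_pattern = '\w+', &block)
    replace(filter(word_pattern, &block))
  end
end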
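
For comparison, much the same behavior can probably be had from the block form of String#gsub, which passes each match to the block and substitutes whatever the block returns. The sketch below is an alternative, not the method above: filter_words is a hypothetical helper name, and it is written as a plain method so that it does not touch the String class.

```ruby
# Hypothetical helper (not part of the original code): a gsub-based take
# on the same idea. word_pattern may be a String or a Regexp, as above.
def filter_words(string, word_pattern = '\w+')
  word_re = Regexp.new(word_pattern)
  string.gsub(word_re) do |word|
    # gsub would turn a nil block result into "", so fall back to the word
    yield(word) || word
  end
end

filter_words("e. e. cummings") { |w| w.capitalize }
# => "E. E. Cummings"
```

The text between matches is never handed to the block, so punctuation and spacing stay where they were.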

And here’s the rub:

"DUDE, WHERE'S MY CAR?".filter {|w| w.capitalize }
=> "Dude, Where'S My Car?"

Whoa! See that rogue capital S? That’s what happens when the default pattern \w+ is used to match words. The apostrophe is not a word character, so “WHERE” and “S” are matched as two separate words and each one gets capitalized on its own. So you may find you need to pass in a more sophisticated regular expression:

"DUDE, WHERE'S MY CAR?".filter(/\w+'[Ss]|\w+/) {|w| w.capitalize }
=> "Dude, Where's My Car?"

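Of course, you might want something a bit more sophisticated than that. This pattern only knows about apostrophe-s, so a contraction like DON'T would still come out as Don'T.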
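
As one possible direction, the sketch below uses a pattern that treats any run of word characters joined by straight apostrophes as a single word. The constant name and the sample sentence are made up for illustration, and the snippet uses String#gsub so that it runs on its own; the same pattern could equally be passed to filter.

```ruby
# Illustrative pattern: word characters, optionally followed by one or more
# apostrophe-plus-word-characters groups (WHERE'S, DON'T, and so on)
WORD_WITH_APOSTROPHES = /\w+(?:'\w+)*/

shout = "DUDE, WHERE'S MY CAR? I DON'T KNOW."
polite = shout.gsub(WORD_WITH_APOSTROPHES) { |w| w.capitalize }
# polite == "Dude, Where's My Car? I Don't Know."
```

It is still not a complete answer: a name like O'NEIL would come out as O'neil, and typographic apostrophes (’) are not matched at all.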