In Ruby, a String is just an object

by Mechaferret on September 18th, 2009

I was deploying a somewhat intricate new search ordering feature in one of our Rails applications recently, for which the developer had done a significant amount of metaprogramming. Because of that, I’d looked over the code and tested it carefully in my own development environment. I’d found and fixed the inevitable functional bugs and requirements misinterpretations (when ordering by years of experience, it’s supposed to be descending), and it seemed solid. So I deployed it to the QA/staging environment for one final run-through before deploying to production. I was expecting the last test to be completely uneventful: there were no configuration changes, and Passenger is usually quite well-behaved (unlike some app servers I can think of, where the trip between development and staging was always an event… WebLogic comes to mind…).

So of course I get a weird SQL error the minute I search on the staging environment. Very weird: it looks like the custom query is joining some tables twice. I go back to the dev machine and check the generated SQL there. Fine (of course). I put in some logging and deploy to staging again. I search using the new ordering: it’s fine. I try another one. It breaks again. After a little bit of panic that it might be a threading issue, I finally figure it out: configuration strings are being overwritten.

Specifically: the static config was being stored as a hash of hashes of strings, and processed with each call to the search object to generate SQL using roughly the logic below:

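conditions = Hash.new
# constant_config_hash is a hash of hashes of strings, so walk its values
constant_config_hash.each_value { |hash_of_strings|
  hash_of_strings.each { |key, s|
    if conditions.has_key?(key)
      # Appends in place, mutating whatever string object is stored here
      conditions[key] << " and #{s}"
    else
      # Stores the config string object itself, not a copy
      conditions[key] = s
    end
  }
}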

If two hashes in constant_config_hash shared a key (which they did), the append in the “if” branch modified not only the value in conditions but also the original config string. The “else” branch had stored the actual config string object in conditions, not a copy, so modifying the hash value in place with << modified the config string too. The first time through, conditions[key] would wind up as “X and Y”, but the second time it would be “X and Y and Y”, because the config string that was formerly “X” had become “X and Y”.
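Here is a minimal sketch of that aliasing in plain Ruby. The config contents, the "users" key, and the helper names build_config and build_conditions are made up for illustration; only the if/else logic mirrors the code above. The last call rebuilds the config from scratch each time, which is roughly what class reloading in development does, and that is why the bug stays hidden there.

```ruby
# Hypothetical stand-in for the static config: a hash of hashes of strings.
# String.new guarantees mutable strings even if string literals are frozen.
def build_config
  {
    first:  { "users" => String.new("X") },
    second: { "users" => String.new("Y") }
  }
end

# Same logic as the original: the else branch stores the config string itself.
def build_conditions(config)
  conditions = {}
  config.each_value do |hash_of_strings|
    hash_of_strings.each do |key, s|
      if conditions.key?(key)
        conditions[key] << " and #{s}"   # mutates the aliased config string
      else
        conditions[key] = s              # no copy: same object as the config
      end
    end
  end
  conditions
end

config = build_config

# Take snapshots with dup, since the returned string is the config string.
first_call  = build_conditions(config)["users"].dup
second_call = build_conditions(config)["users"].dup

puts first_call                 # X and Y
puts second_call                # X and Y and Y
puts config[:first]["users"]    # X and Y and Y (the "constant" has changed)

# Rebuilding the config on every call (as a class reload would) hides the bug.
fresh_call = build_conditions(build_config)["users"].dup
puts fresh_call                 # X and Y
```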

The fix was simple:

    if conditions.has_key?(key)
      conditions[key] << " and #{s}"
    else
      conditions[key] = s.clone
    end
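As a quick check of the fix, here is the same kind of sketch with the copy in place. Again, the config contents and the helper name build_conditions_safely are hypothetical. One caveat worth knowing: clone also copies the frozen state of a string, so if the config strings were frozen the later << would raise; dup returns an unfrozen copy and would be the safer choice in that case.

```ruby
# Hypothetical config, as before: a hash of hashes of mutable strings.
safe_config = {
  first:  { "users" => String.new("X") },
  second: { "users" => String.new("Y") }
}

def build_conditions_safely(config)
  conditions = {}
  config.each_value do |hash_of_strings|
    hash_of_strings.each do |key, s|
      if conditions.key?(key)
        conditions[key] << " and #{s}"   # mutates only our private copy
      else
        conditions[key] = s.clone        # copy, so the config stays intact
      end
    end
  end
  conditions
end

safe_first  = build_conditions_safely(safe_config)["users"]
safe_second = build_conditions_safely(safe_config)["users"]

puts safe_first                      # X and Y
puts safe_second                     # X and Y
puts safe_config[:first]["users"]    # X
```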

This can show up in one other Rails-specific way (here model is an ActiveRecord model object):

attr_value = model.some_attribute
# ... do a bunch of in-place processing on attr_value ...
model.some_attribute = attr_value
model.save

And here, the new value won’t save… because the processing on attr_value changed the actual string containing the original value of model.some_attribute, so the new value set by some_attribute= and the old value are the same, so the ActiveRecord optimization logic that only saves dirty attributes doesn’t think there is anything to change (model.changes is empty). Again, a simple application of “clone” will suffice to fix the issue.
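To make the mechanism concrete without pulling in Rails, here is a plain-Ruby stand-in. TinyModel is a hypothetical class, not ActiveRecord: it assumes dirty tracking works by comparing the incoming value with the current one in the setter, which is a simplification of the behavior described above.

```ruby
# Hypothetical stand-in for a model with naive dirty tracking.
class TinyModel
  attr_reader :some_attribute, :changes

  def initialize(value)
    @some_attribute = value
    @changes = {}
  end

  def some_attribute=(new_value)
    # Record a change only when the new value differs from the current one.
    unless new_value == @some_attribute
      @changes[:some_attribute] = [@some_attribute, new_value]
    end
    @some_attribute = new_value
  end
end

# Buggy version: process the very string object the model holds.
model = TinyModel.new(String.new("draft"))
attr_value = model.some_attribute    # same object as the model's value
attr_value << " v2"                  # mutates the model's value as well
model.some_attribute = attr_value    # old and new compare equal
buggy_changes = model.changes
puts buggy_changes.empty?            # true: nothing looks dirty

# Fixed version: process a copy instead.
fixed = TinyModel.new(String.new("draft"))
copy = fixed.some_attribute.clone
copy << " v2"
fixed.some_attribute = copy
fixed_changes = fixed.changes
puts fixed_changes[:some_attribute].inspect   # ["draft", "draft v2"]
```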

It’s occurrences like these that have shown me why, in Java, instances of String are constant once initialized. As maddening as it was to have to move Strings in and out of StringBuffers to do string operations, it at least saved the debugging time lost when you forgot to clone a string you needed to preserve.

Morals:

  1. Be very careful to clone or otherwise copy strings if you need to preserve their initial values in Ruby and any other language in which strings are mutable objects.
  2. Remember that classes get reloaded on each request in the Rails development environment, which resets class-level state and can hide some evils that later show up in production.
  3. Don’t assume every difference between development and production in Rails is a multi-threading issue.¹

¹ OK, the last one is not really obvious from the story. Feel free to extrapolate how long “a little bit of panic that it might be a threading issue” actually was.
