The problem

When I first launched this blog I decided to benchmark how long it takes to generate various pages on my web site, I have found that the two slowest pages to generate are my atom and rss feeds. They take on average 150ms to generate.

So, I decided to dig down and figure out why this is so slow. After watching the Railscast episode about request profiling I decided to try it out on my atom feed.

So, I installed the ruby-prof gem and created a little script (“get ‘/posts.atom’”) and ran it 30 times:

script/performance/request -n 30 lib/profile_atom_feed.rb  

Next up I dug into the output (which is in the rails tmp folder)

I discovered there were a TON of calls being made to the function String#to_xs, they all seemed to be originating from builder and they were taking more than 50% of the time. (It turns out profiler lies and they were taking more than 90% of the time)

Let’s look at the source code in builder:

class Fixnum
  XChar = Builder::XChar if ! defined?(XChar)

  # XML escaped version of chr
  def xchr
    n = XChar::CP1252[self] || self
    case n when *XChar::VALID
      XChar::PREDEFINED[n] or (n<128 ? n.chr : "&##{n};")
    else
      '*'
    end
  end
end

class String
  # XML escaped version of to_s
  def to_xs
    unpack('U*').map {|n| n.xchr}.join # ASCII, UTF-8
  rescue
    unpack('C*').map {|n| n.xchr}.join # ISO-8859-1, WIN-1252
  end
end

Ok, so string_xs is a method for turning a string into an xml safe string. It will call Fixnum::xchr for every letter it is passed. Builder needs to ensure that all the text it renders is xml safe. So there you are, once per letter (barring xml tags) in my atom feed, a call is made to Fixnum::xchr, this involved a complex bit of lookup logic in ranges. Nothing is wrong with this code, but it does involve looking up a value in up to 2 hashs (CP1252 and PREDEFINED) and 1 range lookup (VALID). This all adds up, especially if you have a big rss document.

I explored some mini optimizations:

This monkey patch

class Fixnum
  alias_method :xchr_old, :xchr  

  def xchr
    @@XChar_Cache ||= (0..255).map{|x| x.send :xchr_old} 
    @@XChar_Cache[self] or xchr_old 
  end 

end

Gives me 2X speed improvement. I suspect that with a bit of Ruby fu you could get this down to a 4x speed improvement. But… I decided to Google a bit.

The solution

The easiest thing to do is

sudo gem install fast_xs 

This makes my feed generation 10x faster. What it does is natively implement String::to_xs, the good news is that rails 2.0.2 and later, is aware of this patch and all you need to do is install the gem and restart your rails app.

Here is a screenshot of the state of affairs after the patch.

Comments

Paul Ceccato about 16 years ago
Paul Ceccato

fast_xs sounds like blashpemy to me! Optimised native code, thats cheating! can't we do everything in ruby?

Sam Saffron about 16 years ago
Sam Saffron

Paul,

<p>I agree with you on that. But… it does seem like the fastest solution at the moment. I have not tried to rewrite String::to_xs yet to avoid all method calls, but it is possible this will provide good enough performance.</p>


<p>Method calls are pretty expensive in <span class='caps'>MRI</span>, hopefully in Rubinius optimizations like this will no longer be required.</p>


<p>Cheers

Sam

Millisami almost 15 years ago
Millisami

Sam, does this apply apply for RSS feeds or just for ATOM only?

Sam Saffron almost 15 years ago
Sam Saffron

@Millisami

It applies to anything that uses builder, which is rss and atom usually


comments powered by Discourse