One very common problem Ruby and Rails have is memory usage. Often when hosting sites the bottleneck is memory, not performance. At Discourse we spend a fair amount of time tuning our application so self-hosters can afford to host Discourse on 1GB droplets.

To help debug memory usage I created the memory_profiler gem, which lets you easily report on application memory usage. I highly recommend you give it a shot on your Rails app; it is often surprising how much low-hanging fruit there is. On unoptimized applications you can often reduce memory usage by 20-30% in a single day of work.

Memory profiler generates a memory usage report broken into 2 parts:

Allocated memory

Memory you allocated during the block that was measured.

Retained memory

Memory that remains in use after the block being measured is executed.

So, for example:

def get_obj
   allocated_object1 = "hello "
   allocated_object2 = "world"
   allocated_object1 + allocated_object2
end

retained_object = nil

MemoryProfiler.report do
   retained_object = get_obj
end.pretty_print 

Will be broken up as:

[a lot more text]
Allocated String Report
-----------------------------------
         1  "hello "
         1  blog.rb:3

         1  "hello world"
         1  blog.rb:5

         1  "world"
         1  blog.rb:4


Retained String Report
-----------------------------------
         1  "hello world"
         1  blog.rb:5

As a general rule we focus on reducing retained memory when we want our process to consume less memory and we focus on reducing allocated memory when optimising hot code paths.
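To make the allocated versus retained distinction concrete without the gem, here is a rough sketch that relies only on the core GC.stat counter. The helper name allocations_during is hypothetical, and this is not how memory_profiler gathers its data; it only illustrates that a block can allocate several objects while just one of them stays referenced afterwards.

```ruby
def get_obj
  allocated_object1 = "hello "
  allocated_object2 = "world"
  allocated_object1 + allocated_object2
end

# Hypothetical helper: how many objects were allocated while the block ran.
# GC.stat(:total_allocated_objects) is a cumulative counter, so a difference
# of two readings approximates the "allocated" side of the report.
def allocations_during
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

retained_object = nil
allocated = allocations_during { retained_object = get_obj }

# The exact count depends on the Ruby version and settings.
puts "objects allocated in block: #{allocated}"
# Only the concatenated string is still referenced once the block is done.
puts "still referenced afterwards: #{retained_object.inspect}"
```

The temporaries become garbage as soon as get_obj returns, while retained_object keeps the result alive, which is the part that counts towards long-term process size.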

For the purpose of this blog post I would like to focus on retained memory optimisations and in particular in the String portion of memory retained.

How can you get a memory profiler report for your Rails app?

We use the following script to profile Rails boot time:

if ENV['RAILS_ENV'] != "production"
  exec "RAILS_ENV=production ruby #{__FILE__}"
end

require 'memory_profiler'

MemoryProfiler.report do
  # this assumes file lives in /scripts directory, adjust to taste...
  require File.expand_path("../../config/environment", __FILE__)

  # we have to warm up the rails router
  Rails.application.routes.recognize_path('abc') rescue nil

  # load up the yaml for the localization bits, in master process
  I18n.t(:posts)

  # load up all models so AR warms up internal caches
  (ActiveRecord::Base.connection.tables - %w[schema_migrations versions]).each do |table|
    table.classify.constantize.first rescue nil
  end
end.pretty_print

You can see an example of such a report here:

Very early on in my journey of optimizing memory usage I noticed that Strings are a huge portion of the retained memory. To help cut down on String usage, memory_profiler has a dedicated String section.

For example in the report above you can see:

Retained String Report
-----------------------------------
       942  "format"
       940  /home/sam/.rbenv/versions/2.5.0/lib/ruby/gems/2.5.0/gems/actionpack-5.1.4/lib/action_dispatch/journey/nodes/node.rb:83
         1  /home/sam/.rbenv/versions/2.5.0/lib/ruby/gems/2.5.0/gems/actionpack-5.1.4/lib/action_controller/log_subscriber.rb:3
         1  /home/sam/.rbenv/versions/2.5.0/lib/ruby/gems/2.5.0/gems/activemodel-5.1.4/lib/active_model/validations/validates.rb:115

       941  ":format"
       940  /home/sam/.rbenv/versions/2.5.0/lib/ruby/gems/2.5.0/gems/actionpack-5.1.4/lib/action_dispatch/journey/scanner.rb:49
         1  /home/sam/.rbenv/versions/2.5.0/lib/ruby/gems/2.5.0/gems/activesupport-5.1.4/lib/active_support/dependencies.rb:292
... a lot more ...

We can see that there are 940 copies of the string "format" living in my Ruby heaps. These strings are all “rooted” so they just sit there in the heap and never get collected. Rails needs the 940 copies so it can quickly figure out what params my controller should get.

In Ruby RVALUEs (slots on the Ruby heap / unique object_ids) will consume 40 bytes on x64. The string “format” is quite short so it fits in a single RVALUE without an external pointer or extra malloc. Still, this is 37,600 bytes (940 copies × 40 bytes) just to store the single string “format”. That is clearly wasteful; we should send a PR to Rails.

It is wasteful on a few counts:

  1. Every object in the Ruby heap is going to get scanned every time a full GC runs, from now till the process eventually dies.

  2. Small chunks of memory do not fit perfectly into your process address space; memory fragments over time, and the actual impact of a 40 byte RVALUE may end up being larger due to gaps between RVALUE heaps.

  3. The larger your Ruby heaps are the faster they grow (out-of-the-box); see Feature #12967: Add a default for RUBY_GC_HEAP_GROWTH_MAX_SLOTS out-of-the-box.

  4. A single RVALUE in a Ruby heap that contains 500 or so RVALUEs can stop that heap from being reclaimed.

  5. More objects means less efficient CPU caching, more chances of hitting swap and so on.
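If you want a quick look at duplicates in a running process without generating a full report, a rough sketch along these lines can help. It assumes MRI (ObjectSpace.each_object is not available everywhere), uses the 40 byte slot figure quoted above as a constant, and the duplicated array is hypothetical demo data standing in for something like the Rails router.

```ruby
# Slot size quoted in the text for a 64-bit build; treat it as an assumption.
SLOT_BYTES = 40

# Hypothetical demo data: 100 separate String objects with equal content.
duplicated = Array.new(100) { "for" + "mat" }

GC.start
# Snapshot the live strings first, so strings created while counting
# are not visited as well.
live_strings = ObjectSpace.each_object(String).to_a

counts = Hash.new(0)
live_strings.each { |s| counts[s] += 1 }

# Print the five most duplicated strings with an estimate of slot usage.
counts.sort_by { |_, n| -n }.first(5).each do |string, n|
  puts format("%6d copies  ~%8d bytes of slots  %s",
              n, n * SLOT_BYTES, string[0, 40].inspect)
end
```

The byte estimate only covers the slots themselves; longer strings also carry malloc'd buffers that this sketch ignores.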

Techniques for string deduplication

I created this Gist to cover quite a bit of the nuance around the techniques you can use for string deduplication in Ruby 2.5 and up. For those feeling brave, I recommend you spend some time reading it carefully:

For those who prefer words, well here are some techniques you can use:

Use constants

# before
def give_me_something
   "something"
end

# after
SOMETHING = "something".freeze

def give_me_something
   SOMETHING
end

Advantages:

  • Works in all versions of Ruby

Disadvantages:

  • Ugly and verbose
  • If you forget the magic “freeze” you may not reuse the string properly on Ruby > 2.3
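A small check of the constant approach, as a sketch: every call hands back the very same frozen object, so no new String is allocated per call.

```ruby
SOMETHING = "something".freeze

def give_me_something
  SOMETHING
end

first = give_me_something
second = give_me_something

# equal? compares object identity, not content.
puts "same object: #{first.equal?(second)}"
puts "frozen: #{first.frozen?}"
```

Because the constant is frozen, callers that try to mutate the result will raise instead of silently changing the shared string.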

Use the magic frozen_string_literal: true comment

# before
def give_me_something
   "something"
end

# after

# frozen_string_literal: true
def give_me_something
   "something"
end

Ruby 2.3 introduces the frozen_string_literal: true pragma. When the comment # frozen_string_literal: true is the first line of your file, Ruby treats the file differently.

Every simple string literal is frozen and deduplicated.

Every interpolated string is frozen and not deduplicated. E.g. x = "#{y}" is a frozen, non-deduplicated string.
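Here is a minimal sketch of the simple-literal case. Since the magic comment has to appear before any code, the example evaluates a small source string that starts with it; the source string itself is hypothetical demo code. Interpolated strings are left out on purpose, because their behaviour differs between versions (Ruby 3.0 and later no longer freeze them).

```ruby
# Demo source that begins with the magic comment.
source = <<~'RUBY'
  # frozen_string_literal: true
  a = "something"
  b = "something"
  [a.frozen?, a.equal?(b)]
RUBY

frozen, same_object = eval(source)

# Both literals are frozen and resolve to one shared object.
puts "frozen: #{frozen}, same object: #{same_object}"
```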

I feel this should be the default for Ruby and many projects are embracing this including Rails. Hopefully this becomes the default for Ruby 3.0.

Advantages:

  • Very easy to use
  • Not ugly
  • Long term this enables fancier optimisations

Disadvantages:

  • Can be complicated to apply on existing files, a great test suite is highly recommended.

Pitfalls:

There are a few cliffs you can fall off which you should be careful about. The biggest is the default encoding on String.new:

buffer = String.new
buffer.encoding # => Encoding::ASCII_8BIT

# vs

# String#+@ is new in Ruby 2.3; it gives you an unfrozen string
buffer = +""
buffer.encoding # => Encoding::UTF_8

Usually this nuance will not matter to you at all cause as soon as you append to the String it will switch encoding. However, if you are passing references to the empty string you created to a 3rd party library, havoc can ensue. So, "".dup or +"" is a good habit.

Dynamic string deduplication

Ruby 2.5 introduces a new technique you can use to deduplicate strings. It was introduced in Feature #13077: [PATCH] introduce String#fstring method, by Eric Wong.

To quote Matz

For the time being, let us make -@ to call rb_fstring.
If users want more descriptive name, let’s discuss later.
In my opinion, fstring is not acceptable.

So, String’s -@ method will allow you to dynamically de-duplicate strings.

a = "hello"
b = "hello"
puts ((-a).object_id == (-b).object_id) # I am true in Ruby 2.5 (usually) 

This syntax exists in Ruby 2.3 and up, the optimisation though is only available in Ruby 2.5 and up.

This technique is safe, meaning that the strings you deduplicate still get garbage collected.

It relies on a facility that has existed in Ruby for quite a while where it maintains a hash table of deduplicated strings:

The table was used in the past for the "string".freeze optimisation and automatic Hash key deduplication. Ruby 2.5 is the first time this feature is exposed to the general public.

It is incredibly useful when parsing input with duplicate content (like the Rails routes) and when generating dynamic lookup tables.
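As a sketch of the parsing case, assuming Ruby 2.5 or later: the comma-separated input below is hypothetical, standing in for something like route segments. Splitting allocates a fresh String per token, and mapping each token through -@ collapses equal tokens onto shared objects.

```ruby
# Hypothetical input with repeated tokens.
raw = "format,id,format,id,format"

# split allocates one String object per token, even when content repeats.
tokens = raw.split(",")
distinct_before = tokens.map(&:object_id).uniq.size

# String#-@ returns a frozen, deduplicated string for each token.
deduped = tokens.map { |token| -token }
distinct_after = deduped.map(&:object_id).uniq.size

puts "distinct objects before: #{distinct_before}, after: #{distinct_after}"
```

The originals in tokens are untouched and can be collected once nothing references them; only the deduplicated copies need to be kept in the lookup table.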

However, it is not all :rose:s

Firstly, some people’s sense of aesthetics is severely offended by the ugly syntax. Some are offended so much they refuse to use it.

Additionally, this technique has a bunch of pitfalls documented in extreme detail here.

Until Feature #14478: String #uminus should de-dupe unconditionally is fixed, you need to “unfreeze” strings prior to deduping:

yuck = "yuck"
yuck.freeze
yuck_deduped = -+yuck

If a string is tainted you can only “partially” dedupe it

This means the VM will create a shared string for long strings, but will still maintain the RVALUE

love = "love"
love.taint
(-love).object_id == love.object_id 

# got to make a copy to dedupe
deduped = -love.dup.untaint

Trouble is lots of places that want to apply this fix end up trading in tainted strings, a classic example is the postgres adapter for Rails that has 142 copies of "character varying" in the Discourse report from above. In some cases this limitation means we are stuck with an extra and pointless copy of the string just cause we want to deduplicate (cause untainting may be unacceptable for the 3 people in the universe using the feature).

Personally, I wish we just nuked all the messy tainting code from Ruby’s codebase :fire: , which would make it simpler, safer and faster.

If a string has any instance vars defined you can only partially dedupe

# html_safe sets an ivar on String so it will not be deduplicated
str = -"<html>test</html>".html_safe 

This particular limitation is unavoidable and I am not sure there is anything Ruby can do to help us out here. So, if you are looking to deduplicate fragments of html, well, you are in a bind: you can share the string, but you cannot deduplicate it perfectly.

Additional reading:

https://ruby-talk.trydiscourse.com/t/psa-string-memory-use-reduction-techniques/74477

Good luck reducing your application’s memory usage, I hope this helps!
