I’ve got a large Rails app and I’m looking to improve (dismal) performance.
Running with ruby-prof doesn’t help me much. I get output similar to this (running in production mode on thin):
Thread ID: 9322800
Total: 1.607768
Sort by: self_time
%self total self wait child calls name
26.03 0.42 0.42 0.00 0.00 1657 Module#define_method
8.03 0.13 0.13 0.00 0.00 267 Set#initialize
4.41 0.07 0.07 0.00 0.00 44 PG::Result#values
4.28 0.07 0.07 0.00 0.00 1926 ActiveSupport::Callbacks::Callback#start
4.21 0.07 0.07 0.00 0.00 14835 Kernel#hash
4.13 0.08 0.07 0.00 0.01 469 Module#redefine_method
4.11 0.07 0.07 0.00 0.00 63 *<Class::ActiveRecord::Base>#with_scope
4.02 0.07 0.06 0.00 0.00 774 ActiveSupport::Callbacks::Callback#_compile_options
3.24 0.05 0.05 0.00 0.00 30 PG::Connection#async_exec
2.31 0.40 0.04 0.00 0.37 2130 *Module#class_eval
1.47 0.02 0.02 0.00 0.00 6 PG::Connection#unescape_bytea
1.03 0.05 0.02 0.00 0.03 390 *Array#select
* indicates recursively called methods
I guessed that it might be spending a lot of time in the garbage collector, so, since I’m running on REE, I decided to try GC.enable_stats to get some more information. I added the following to my application controller:
around_filter :enable_gc_stats
private
def enable_gc_stats
  GC.enable_stats
  begin
    yield
  ensure
    GC.disable_stats
    GC.clear_stats
  end
end
On a relatively large page running on my machine here in production mode with REE and the thin webserver (ruby-prof disabled since it makes it a bit slower) I get:
Completed 200 OK in 1093ms (Views: 743.1ms | ActiveRecord: 139.2ms)
GC.collections: 11
GC.time: 666299 us 666.299 ms
GC.growth: 461 KB
GC.allocated_size: 152 MB
GC.num_allocations: 1,924,773
ObjectSpace.live_objects: 1,015,195
ObjectSpace.allocated_objects: 12,393,644
So for a page that took 1093 ms, it seems that almost 700 ms was spent in the garbage collector. Has anybody had this kind of problem before? I realize you cannot help with my app in particular (it is quite big, with a lot of gems and things), but are there techniques or tools to get a better idea of why so much garbage is being created?
Any ideas would be very much appreciated!
Your Rails log shows that most of the time (about 68%: 743.1 ms of 1093 ms) is spent in view code.
Your profile report shows three obvious hotspots:
Module#define_method for self time, Module#class_eval for total time, and Set#initialize.
define_method and class_eval indicate there’s likely a lot of dynamic code execution, which seems excessive to me. Generally you want to generate that code early and reuse it instead of repeatedly regenerating it. It is almost certainly part of the problem with your excessive object allocation. Producing a graph report instead of a flat report should help you find the parent methods that are falling into these expensive paths, and that may give you a pointer to where you could optimize.
Set#initialize may be a real artifact of what your code needs to do, or it might be a sign that there are some significant Set[...] or Set::new set creation calls inline which could be done once and assigned to a constant or instance/class variable for reuse.
ruby-prof is OK, but you might also want to try perftools.rb, which is easy to hook up to Rack/Rails with rack-perftools_profiler. perftools has some enhanced visualization tools which can make it much easier to understand hot execution paths.
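As a rough illustration of "generate it once and reuse it", here is a minimal sketch. The class names (RoleCheck, Report), the role and field lists, and the method names are all hypothetical and not taken from your app; the point is only the contrast between building a Set on every call and building it once, and defining dynamic methods when the class is loaded rather than inside a request.

```ruby
require 'set'

# Hypothetical example class, not from the original app.
class RoleCheck
  # Built once when the class is loaded, then reused by every call.
  ADMIN_ROLES = Set.new(%w[admin owner]).freeze

  # Allocates a new Array, new Strings and a new Set on every call.
  def self.admin_rebuilding?(role)
    Set.new(%w[admin owner]).include?(role)
  end

  # Only performs a lookup against the shared, frozen Set.
  def self.admin?(role)
    ADMIN_ROLES.include?(role)
  end
end

# Hypothetical example class, not from the original app.
class Report
  FIELDS = %w[title author].freeze

  # define_method runs once per field while the class body is evaluated,
  # not each time a request is handled.
  FIELDS.each do |field|
    define_method("#{field}_label") { field.capitalize }
  end
end

puts RoleCheck.admin?("admin")
puts Report.new.title_label
```

Both admin? variants return the same answer; the difference is only in how many short-lived objects each call leaves behind for the garbage collector.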
Since you’re running REE and extensive object allocation (and hence garbage collection) is an issue, you could try memprof to get some insight into what these allocations are and where they are coming from.
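If you just want a quick number before reaching for a full memory profiler, a small helper that counts allocations around a block can confirm a suspicion. This is only a sketch, and it assumes MRI 2.2 or newer, where GC.stat(:total_allocated_objects) is available; it does not work on REE, and the helper name count_allocations is made up for this example.

```ruby
require 'set'

# Hypothetical helper: returns how many objects were allocated while the
# block ran. Assumes MRI 2.2+ (GC.stat(:total_allocated_objects)).
def count_allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

SHARED = Set.new([1, 2, 3]).freeze

# Builds a fresh Set on every iteration.
fresh = count_allocations { 1_000.times { Set.new([1, 2, 3]).include?(2) } }

# Reuses the Set that was built once above.
reused = count_allocations { 1_000.times { SHARED.include?(2) } }

puts "fresh Set per call: #{fresh} objects"
puts "shared Set:         #{reused} objects"
```

Wrapping a suspicious partial or helper call in such a block gives a rough idea of which code paths produce the most garbage.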
If you can’t find a path to reducing the number of objects being allocated, you could ease the GC burden at the expense of a larger process memory size by tuning the GC to preallocate a heap large enough to hold a typical request’s allocation demands. Unicorn offers a Rack module for out-of-band GC. You might be able to adapt this module’s approach to work with thin and move all the GC time to between requests. You’ll still pay the CPU cost, but at least you won’t delay your responses for garbage collection.
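To make the out-of-band idea concrete, here is a minimal sketch of a Rack middleware in that spirit. It is not Unicorn's implementation: the class name DeferredGC and the interval parameter are made up for this example, it is not thread-safe, and it assumes the server calls close on the response body once it is done with it. Whether the response has actually reached the client by then depends on the server, so treat this as a starting point to measure, not a drop-in fix. Disabling GC during a request also lets the heap grow, so it trades memory for latency.

```ruby
# Hypothetical middleware: keeps GC off while a request is handled and
# runs a collection every `interval` requests, after the body is closed.
class DeferredGC
  # Wraps the response body so we are notified when the server closes it.
  class Body
    def initialize(body, &on_close)
      @body = body
      @on_close = on_close
    end

    def each(&block)
      @body.each(&block)
    end

    def close
      @body.close if @body.respond_to?(:close)
    ensure
      @on_close.call
    end
  end

  def initialize(app, interval = 5)
    @app = app
    @interval = interval
    @requests = 0
  end

  def call(env)
    GC.disable
    status, headers, body = @app.call(env)
    [status, headers, Body.new(body) { collect_if_due }]
  rescue Exception
    # Never leave GC switched off if the app raised.
    GC.enable
    raise
  end

  private

  def collect_if_due
    GC.enable
    @requests += 1
    return unless @requests >= @interval
    @requests = 0
    GC.start
  end
end
```

In a config.ru this would be wired in with something like `use DeferredGC, 5` ahead of the application.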