Dwi Wahyudi

Senior Software Engineer (Ruby, Golang, Java)


Sometimes, in Ruby on Rails webapp, looping/iterating will cause N+1 queries which will make the code to execute SQL queries over and over again, making the response and processing times longer.

Example

Here is an example:

  Vehicle
    .where('brand_id = ? and manufacturing_date >= ?',
           brand_id, 
           manufacturing_date)
    .each do |vehicle|
      # a call to vehicle's manufacturer will cause N+1 query here.
      # if there are 1000 vehicle (the N), 
      # this will execute 1000 queries.
      vehicle.manufacturer
      # N is 1000, query to Vehicle above is 1.
      # This is N+1 query.
      # Because up to this line, there are 1001 queries executed.

      # this as well will execute another 1000 queries.
      vehicle.brand

      # and another 1000 queries
      vehicle.manufacturer.spareparts
    end

In order to reduce the amount of queries executed, we can use eager-loading, by using includes syntax.

  Vehicle
    .includes(:brand, manufacturer: :spareparts)
    .where('brand_id = ? and manufacturing_date >= ?',
           brand_id, 
           manufacturing_date)
    .each do |vehicle|
      # These call to associations will no longer trigger any query.
      # Because they are all already loaded by includes method above.
      vehicle.manufacturer
      vehicle.brand
      vehicle.manufacturer.spareparts

      # You may check if an association is loaded via loaded? method.
      vehicle.manufacturer.loaded? # will return true.
    end

Querying Loaded Records

What if we want to query the spareparts from manufacturer of vehicle above? We can call .where or .find_by again, but this again will trigger a query,

vehicle.manufacturer.spareparts.where(component_id: 4)

vehicle.manufacturer.spareparts.find_by(component_id: 4)

Which in turn, if done inside a loop will cause another N+1 queries.

Because for each of above manufacturer’s spareparts is already loaded in memory, we just need to call ruby methods. .select instead of .where and .detect instead of .find_by.

These two statements won’t execute/generate any query to database:

vehicle.manufacturer.spareparts.select { |at| at.component_id == 4 }
# This will return many.

vehicle.manufacturer.spareparts.detect { |at| at.component_id == 4 }
# This will return only one.

Manual Pre-Loading

Note that we can also manually preload some associations from a collection of records. Let say we have a collection of books.

books = Book.where('published_date < ?', '3 Jan 2019'.to_date)

Let say Book model has_many tags.

Note: This will only call 1 query to preload all tags for the books collections. ActiveRecord::Associations::Preloader.new.preload(books, :tags)

Now, each book has tags association preloaded.

books.first.tags.loaded? # return true
books.second.tags.loaded? # return true

Manual preloading has an advantage for eager-loading associations for a polymorphic/STI associations.

ActiveRecord::Associations::Preloader
  .new.preload(vehicles.select { |record| record.manufacturer.respond_to?(:car_seat_suppliers) }, 
               manufacturer: :car_seat_suppliers)

In above case, manufacturer, it can be any kind of manufacturer, bus manufacturer, car manufacturer, bike manufacturer, etc. Not all type of manufacturer have car_seat_suppliers associations, some do, some don’t. By doing this, we make sure, we correctly preload car_seat_suppliers only to manufacturer that have such association/method.

Pitfalls

  • By using eager-loading, we query all associations of a model/table before-hand and load them all to memory. So the trade-off here is that we will have bigger memory usage. Be very careful of this.
  • Beware of eager-loading of table with abnormal indexing behaviour which needs USE INDEX (use_index('PRIMARY')) to retrieve bulk of them.