Apr 30, 2020

Rails to Ruby

#programming #rails

~ views | ~ words

I have some rake tasks that sit alongisde a Rails app that fetch data from remote sources and stuff into the database. These tasks sun in a background worker every 5 minutes. In Rails, rake tasks can load the entire Rails environment with the environment option:

task some_task: :environment do
  SomeClass.do_my_thing!
end

My task was to move these tasks from the infrastructure where the Rails app lives into a CI-like environment that executes short-lived tasks more efficiently, and exposes better logs. The purpose of this move was to be able to debug errors in the data syncs more easily, restart them on demand, and to isolate logs away from the web server.

Because the new system does not have access to the production database server, I had to make my rake tasks run independently and POST the data to the Rails server at a privileged API endpoint.

Intro aside, the biggest hurdle I ran into during this conversion was to load code without Rails' autoloading feature. Rails does this fantastic thing where you can reference any class from anywhere. Code in Controllers or classes in lib/ can reference models, Models can reference classes from installed gems, etc. There are a set of rules that govern this automatic loading behavior and its configurable with config.autoload_paths, but the point is that it's rare to need explicit require statements.

Converting my rake task to run without loading the Rails environment made it painfully obvious that organizing code that doesn't come with its own structure is challenging! The good news is that I ended up with a decent pattern for using require and require_relative effectively:

Use require for files outside the module and require_relative for files inside the same module.

To illustrate, here's a sample directory structure:

thing_1
  - foo.rb
  - bar.rb
thing_2
  - baz.rb
thing_1.rb
thing_2.rb

Note that thing_1 has both a root level file and a root level directory. The file thing_1.rb looks like this:

# thing_1.rb
Dir["./thing_1/**/*.rb"].each { |f| require f }

module Thing1
end

and thing1/foo.rb looks like this:

### thing1/foo.rb
module Thing1
  class Foo
  end
end

If thing_2/baz.rb wants to use the Thing1::Foo class, it can require it like this:

# thing2/baz.rb
require './thing_1'

but if thing_1/bar.rb wants to use Thing1::Foo (inside the same module), it can require it like this:

# thing1/bar.rb
require_relative './foo'

The tradeoff for this is that thing_1 is all-or-nothing. thing_2 cannot import Thing1::Foo without also importing Thing1::Bar. I tend to think this tradeoff is worth it, and if a module gets too big, it's time to think of how to break it down further.

Another thing worth noting here is that require always runs relative to $CWD, whereas require_relative runs relative to __FILE__ (the current file). So if thing2/baz.rb really wanted to require only thing1/foo.rb (and not thing1/bar.rb), its two options would be:

require_relative '../thing1/foo.rb'
# or
require './thing1/foo.rb'

I found that it was easier to constrain myself to requireing only the top level "loader" files than to reach into other modules and require indivudual files and have to remember the right path to use.