Delegating long-running jobs in Rails

Although Ruby on Rails provides a complete stack for building web applications, some work doesn’t fit well in the request/response cycle that Rails handles so well. Examples include generating large reports and other CPU-intensive tasks. Keeping the user waiting for such a request to complete is not only bad for the user experience, it also ties up a process that could otherwise be handling other requests. This negatively affects the scalability of an application and can quickly bring throughput to a crawl.

Luckily, there are many plugins available for Rails to address such issues. I recently came across delayed_job from the folks at Shopify, which serves this purpose really well. I liked this plugin because it uses a database table as a sort of queue, giving you an easy way to manage and administer the jobs with the built-in Rails scaffolding and a little tweaking.

First off, we’ll start with the bort base Rails application, which gives us users and authentication out of the box. After you’ve downloaded and set it up, install the delayed_job plugin by downloading it and extracting it into your vendor/plugins folder.

If you’re using git you can accomplish the steps above using the following commands:

$ git clone git://github.com/fudgestudios/bort.git
$ cd bort
$ ./script/plugin install git://github.com/tobi/delayed_job.git
$ ./script/generate migration CreateDelayedJobs

Next, open the generated migration (in db/migrate), and add the following in the self.up method:

create_table :delayed_jobs, :force => true do |table|
  table.integer  :priority, :default => 0      # Allows some jobs to jump to the front of the queue
  table.integer  :attempts, :default => 0      # Provides for retries, but still fail eventually.
  table.text     :handler                      # YAML-encoded string of the object that will do work
  table.string   :last_error                   # Reason for last failure
  table.datetime :run_at                       # When to run. Could be Time.now for immediately, or sometime in the future.
  table.datetime :locked_at                    # Set when a client is working on this object
  table.datetime :failed_at                    # Set when all retries have failed (actually, by default, the record is deleted instead)
  table.string   :locked_by                    # Who is working on this object (if locked)
  table.timestamps
end

Of course, you should also add the corresponding drop command in the self.down method:

drop_table :delayed_jobs
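
To make the "table as a queue" idea concrete, here is a minimal in-memory sketch of how a worker might choose its next row using the priority, run_at, locked_at and locked_by columns. The selection rule (highest priority first, then earliest run_at, skipping locked and future rows) is an assumption for illustration, and JobRow, next_job and lock! are hypothetical names, not part of the plugin:

```ruby
# In-memory stand-in for rows of the delayed_jobs table (not the plugin's code).
JobRow = Struct.new(:id, :priority, :run_at, :locked_at, :locked_by)

# Assumed rule: among unlocked rows whose run_at has passed, take the highest
# priority first and break ties by the earliest run_at.
def next_job(rows, now)
  rows
    .select { |row| row.locked_at.nil? && row.run_at <= now }
    .min_by { |row| [-row.priority, row.run_at] }
end

# Mark a row as claimed so that other workers skip it.
def lock!(row, worker_name, now)
  row.locked_at = now
  row.locked_by = worker_name
  row
end

now = Time.at(1_000)
rows = [
  JobRow.new(1, 0, now - 60, nil, nil),      # oldest, default priority
  JobRow.new(2, 5, now - 10, nil, nil),      # higher priority, jumps the queue
  JobRow.new(3, 9, now + 3600, nil, nil),    # scheduled for the future
  JobRow.new(4, 7, now - 30, now - 5, "w1")  # already locked by another worker
]

first = lock!(next_job(rows, now), "w2", now)
second = next_job(rows, now)
puts first.id   # 2
puts second.id  # 1
```

Rows 3 and 4 are never picked here: one is not due yet and the other is already claimed by a different worker.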

Next, we’ll create a new domain model (with scaffolding) so that we can tie each long-running process to the user it belongs to. We’ll use the generate scaffold command as follows:

$ ./script/generate scaffold UserJob name:string data:text

This command generated a migration (among other things). Open it now and add the following inside the create_table block:

t.references :user, :job

These two references add user_id and job_id columns, which let us tie the user job to the user and to the delayed job.

Now, we can update the UserJob model (in app/models/user_job.rb) and add the following to the class definition:

belongs_to :job, :class_name => "Delayed::Job"
belongs_to :user
def ready?
  job.nil?
end

The ready? method relies on the way delayed_job works by default: the job record is deleted once it has completed successfully, so a missing job tells us that the work is complete.
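
One caveat with this check: job is also nil before a job has been attached and, given the delete-by-default behavior mentioned in the failed_at comment of the migration, after a job has failed for good. Here is a minimal sketch of a stricter check, using a plain Struct as a stand-in for the ActiveRecord model. UserJobSketch and finished_with_output? are hypothetical names for illustration, not part of the original code:

```ruby
# Plain-Ruby stand-in for the UserJob model; no database involved.
UserJobSketch = Struct.new(:name, :data, :job) do
  # Check from the article: ready as soon as no job is attached.
  def ready?
    job.nil?
  end

  # Hypothetical stricter variant: also require that output was written.
  def finished_with_output?
    job.nil? && !data.nil?
  end
end

fresh = UserJobSketch.new("report", nil, nil)          # no job attached yet
running = UserJobSketch.new("report", nil, :some_job)  # job still queued
done = UserJobSketch.new("report", "rows...", nil)     # job ran and was deleted

puts fresh.ready?                 # true, although nothing has run
puts fresh.finished_with_output?  # false
puts running.ready?               # false
puts done.finished_with_output?   # true
```

Whether the extra check is worth it depends on how much a premature Download link would matter in your application.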

Now we are ready to create the class that will do the work. The delayed_job plugin serializes this object and saves it in the delayed_jobs table. You can create this class anywhere; I’ll put it in app/models/report_job.rb:

class ReportJob < Struct.new(:user_job_id, :name, :params)
  def perform
    user_job = UserJob.find(user_job_id)
    user_job.update_attribute :name, name
    mydata = "something that took a long time to generate"
    user_job.update_attribute :data, mydata
  end
end
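
To see what "serialize this object" means in practice, here is a small standalone sketch that round-trips a Struct-based job through YAML, the format the handler column comment refers to. ReportJobSketch is a hypothetical stand-in whose perform only builds a string, so the example runs without Rails. How the plugin itself dumps and loads the object may differ:

```ruby
require "yaml"

# Same shape as the article's job class, with perform reduced to a pure
# function so the example runs without Rails or a database.
class ReportJobSketch < Struct.new(:user_job_id, :name, :params)
  def perform
    "report #{name} for user job #{user_job_id}"
  end
end

job = ReportJobSketch.new(42, "monthly", { "format" => "csv" })

# Serialize the object to text, as would be stored in the handler column.
handler = YAML.dump(job)

# Newer Psych versions only revive custom classes via unsafe_load;
# older ones do so with plain load.
restored = YAML.respond_to?(:unsafe_load) ? YAML.unsafe_load(handler) : YAML.load(handler)

puts handler
puts restored.perform
```

Because only the struct members are stored, it helps to keep them small and simple (an id rather than a whole record), which is what the article’s ReportJob does with user_job_id.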

Now, to create a delayed_job from any controller action:

@user_job = UserJob.create(:user => current_user)
@job = Delayed::Job.enqueue ReportJob.new(@user_job.id, name, params)
@user_job.job = @job
@user_job.save!

What we’re doing above is first creating the UserJob, since we have to pass its id to the ReportJob before we put it in the queue. The perform method uses that id to find the UserJob again and, once the output has been produced, updates its data attribute with it.
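
The whole lifecycle can be sketched in memory as follows. This is only an illustration under simplifying assumptions: TinyQueue, UserJobRecord and ReportWork are hypothetical names, the payload holds the user job object directly instead of looking it up by id, and detaching the job stands in for the job row being deleted after success:

```ruby
# In-memory stand-ins; class and method names here are hypothetical.
UserJobRecord = Struct.new(:id, :name, :data, :job) do
  def ready?
    job.nil?
  end
end

ReportWork = Struct.new(:user_job, :name) do
  def perform
    user_job.name = name
    user_job.data = "something that took a long time to generate"
  end
end

class TinyQueue
  def initialize
    @jobs = []
  end

  # Store the payload and hand back a handle, mirroring how the article keeps
  # the return value of enqueue on the user job.
  def enqueue(payload)
    @jobs << payload
    payload
  end

  # Run every queued payload, then detach it to mimic the job row being
  # deleted after success.
  def work_off
    @jobs.each do |payload|
      payload.perform
      payload.user_job.job = nil
    end
    @jobs.clear
  end
end

queue = TinyQueue.new
user_job = UserJobRecord.new(1, nil, nil, nil)
user_job.job = queue.enqueue(ReportWork.new(user_job, "monthly"))

puts user_job.ready?  # false: the job is still queued
queue.work_off
puts user_job.ready?  # true: the job ran and was removed
puts user_job.data
```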

Next, we’ll need to update the scaffold view for the user job so we can see the output. Open up app/views/user_jobs/index.html.erb and add the following before the destroy link:

<td><%= link_to('Download', download_user_job_path(user_job)) if user_job.ready? %></td>

We’ll have to add this download action to the controller as well, so open app/controllers/user_jobs_controller.rb and add:

def download
  user_job = UserJob.find(params[:id])
  send_data user_job.data, :type => "text/plain", :filename => "mydata.txt"
end

Finally, open config/routes.rb and update the user_jobs route to look like this:

map.resources :user_jobs, :member => {:download => :get}

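And that should be all to get you started! Keep in mind that queued jobs are only processed while a delayed_job worker is running, so keep one going in a separate terminal (the plugin’s README describes a rake jobs:work task for this). Then just run the migrations and start the server: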

$ rake db:migrate
$ ./script/server