Sidekiq Batches with Sub-Batches. Simple Way to Organize Code

Sidekiq Batches is a feature of the Sidekiq job processing framework that lets you group a set of jobs, track them as a single unit, and run a callback once every job in the group has finished. This makes it possible to express dependencies between groups of jobs. In this article, we’ll share our experience using this tool in a real project and give you some insights.

The Challenge

In one of our projects, we have a synchronization process that ensures seamless communication between multiple systems. This is a common scenario where you need to synchronize a list of resources across different services. Naturally, we utilize Sidekiq to handle this background process.

The algorithm is quite straightforward:

  1. Retrieve a list of resources;
  2. Initiate synchronization for each resource.

The second step can also be executed individually if only a single resource needs to be synchronized.

Our resources include:

  • University: UniversitySync class
  • Faculties: FacultySync class
  • Departments: DepartmentSync class
  • Courses: CourseSync class
  • Professors: ProfessorSync class
  • Students: StudentSync class
  • Enrollments: EnrollmentSync class

These granular classes make it easy to scale the synchronization process across various resource types. However, from a business standpoint, we can’t perform synchronization for all resources in parallel due to dependencies among them. For instance, Students must be synchronized before Enrollments, and both University and Faculties need to be synchronized before any other resources.
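The ordering constraints can also be written down as data. The following is a minimal sketch in plain Ruby (no Sidekiq involved) that groups resources into stages, where each stage depends only on earlier ones. The resource symbols and the `sync_stages` helper are illustrative names, not part of the project. The sketch encodes only the two dependencies mentioned above and treats University and Faculties as independent of each other, since the text does not say which of them comes first; a real system may have more constraints.

```ruby
# Resources that everything else depends on, per the text above.
CORE = %i[university faculties].freeze

# resource => resources that must be synchronized first (illustrative data)
DEPENDENCIES = {
  university:  [],
  faculties:   [],
  departments: CORE,
  courses:     CORE,
  professors:  CORE,
  students:    CORE,
  enrollments: CORE + %i[students]
}.freeze

# Groups resources into stages. Everything in a stage depends only on
# resources from earlier stages, so one stage can be synchronized in parallel.
def sync_stages(dependencies)
  remaining = dependencies.dup
  stages = []
  until remaining.empty?
    done = stages.flatten
    ready = remaining.select { |_, deps| (deps - done).empty? }.keys
    raise ArgumentError, "circular dependency" if ready.empty?

    stages << ready
    remaining = remaining.reject { |name, _| ready.include?(name) }
  end
  stages
end

sync_stages(DEPENDENCIES).each_with_index do |stage, index|
  puts "Stage #{index + 1}: #{stage.join(', ')}"
end
# Stage 1: university, faculties
# Stage 2: departments, courses, professors, students
# Stage 3: enrollments
```

Each stage maps naturally onto one group of jobs that has to finish before the next group starts, which is the constraint the rest of the article tries to enforce.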

So, how do we maintain this synchronization order while using the existing architecture with vanilla Sidekiq?

One simple approach is to introduce delays between synchronizations, which is how the process was originally implemented. However, this method comes with its own set of challenges:

  1. Unpredictability. Delays are difficult to manage and maintain because the time each synchronization takes varies. A dependent resource can start synchronizing before its parent resource has finished, causing inconsistencies.
  2. Inefficiency. With fixed delays, resources wait idly even when the resources they depend on have already been synchronized. This wastes processing time.
  3. Unclear completion. There is no reliable way to tell when the whole synchronization process has finished.

Sidekiq Pro and Enterprise, however, provide a Batches mechanism that fits this case better. By organizing the work into a batch with sub-batches, we can enforce the order with callbacks instead of delays and close the gaps described above.

The Sidekiq documentation describes how to develop complex workflows with batches and sub-batches.

The approach is pretty neat, but the code organization suggested there has several downsides:

  • Callbacks all live in one class. In theory this simplifies things a bit, but in reality a long list of batches and sub-batches creates a mess. From a testing perspective, it would be better to have each step in its own class.
  • Developers need to deal with the Batch API directly. To create a sub-batch, you need to:
    – reopen the parent batch with `Sidekiq::Batch.new(status.parent_bid)`
    – define the sub-batch inside the parent batch’s `jobs` block, and enqueue the workers inside the sub-batch’s own `jobs` block.

Here’s an example:

[Image: code example]

Implementing a simple wrapper

Possible requirements for the wrapper:

  • The ability to define batches and sub-batches in one class
  • Callbacks for a batch that live next to the batch class

[Image: wrapper implementation]

With this wrapper, we can define batches with sub-batches this way:

[Image: batch and sub-batch classes defined with the wrapper]

As you can see, the batch and sub-batch classes look almost the same, because the messy details of the Sidekiq Batch API are hidden in the Base class.
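One possible shape for such a wrapper is sketched below. It is a minimal illustration under stated assumptions, not the code used in the project. `BaseBatch`, its `start` and `enqueue` methods, and the four batch classes are hypothetical names. Grouping Departments, Courses, Professors, and Students into one stage is also an assumption. `InlineBatch` is a tiny stand-in for `Sidekiq::Batch` that runs everything inline, so the example executes without Sidekiq Pro or Redis; it mimics only the calls used here.

```ruby
LOG = [] # records what ran, in order

# Inline stand-in for Sidekiq::Batch (assumption: only new, bid, on and jobs
# are mimicked, plus a status object exposing bid and parent_bid).
class InlineBatch
  Status = Struct.new(:bid, :parent_bid)
  REGISTRY = {} # bid => { parent:, callbacks:, depth: }
  OPEN = []     # bids whose jobs block is currently running

  attr_reader :bid

  def initialize(bid = nil)
    @bid = bid || "bid-#{REGISTRY.size + 1}"
    REGISTRY[@bid] ||= { parent: OPEN.last, callbacks: [], depth: 0 }
  end

  def on(event, klass, options = {})
    REGISTRY[@bid][:callbacks] << [klass, options] if event == :success
  end

  def jobs
    state = REGISTRY[@bid]
    state[:depth] += 1
    OPEN.push(@bid)
    yield
    OPEN.pop
    state[:depth] -= 1
    return unless state[:depth].zero? # batch is still open further up

    status = Status.new(@bid, state[:parent])
    state[:callbacks].each { |klass, options| klass.new.on_success(status, options) }
  end
end

BATCH_CLASS = InlineBatch # in a real app this would be Sidekiq::Batch

# Stand-in workers; real ones would be Sidekiq jobs taking resource arguments.
%w[UniversitySync FacultySync DepartmentSync CourseSync
   ProfessorSync StudentSync EnrollmentSync].each do |name|
  Object.const_set(name, Class.new { define_singleton_method(:perform_async) { LOG << name } })
end

# Hypothetical base class that hides the batch API from concrete batches.
class BaseBatch
  # Without parent_bid a top-level batch is created; with it, the parent is
  # reopened and the new batch is defined inside the parent's jobs block.
  def self.start(parent_bid: nil)
    create = lambda do
      batch = BATCH_CLASS.new
      batch.on(:success, self)
      batch.jobs { new.enqueue }
    end
    parent_bid ? BATCH_CLASS.new(parent_bid).jobs(&create) : create.call
  end

  def enqueue; end                       # enqueue workers or start sub-batches
  def on_success(_status, _options); end # runs once the whole batch succeeded
end

class FullSyncBatch < BaseBatch
  def enqueue
    CoreSyncBatch.start # started inside this batch's jobs block, so it nests
  end

  def on_success(_status, _options)
    LOG << "full sync finished"
  end
end

class CoreSyncBatch < BaseBatch
  def enqueue
    UniversitySync.perform_async
    FacultySync.perform_async
  end

  def on_success(status, _options)
    DependentSyncBatch.start(parent_bid: status.parent_bid)
  end
end

class DependentSyncBatch < BaseBatch
  def enqueue
    [DepartmentSync, CourseSync, ProfessorSync, StudentSync].each(&:perform_async)
  end

  def on_success(status, _options)
    EnrollmentSyncBatch.start(parent_bid: status.parent_bid)
  end
end

class EnrollmentSyncBatch < BaseBatch
  def enqueue
    EnrollmentSync.perform_async
  end
end

FullSyncBatch.start
puts LOG
```

Running the sketch prints the University and Faculty syncs first, then the four dependent resources, then Enrollments, and finally the completion message from the top-level callback. Each step is a small class with its jobs and its callback side by side, and only `BaseBatch` knows how to reopen a parent batch. With real Sidekiq the jobs and callbacks would run asynchronously in worker processes rather than inline.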

Hopefully, you found this article helpful. Follow our blog and don’t miss new insightful posts!