Module: Kiba::Extend::Jobs

Extended by:
Parser
Defined in:
lib/kiba/extend/jobs.rb,
lib/kiba/extend/jobs/job.rb,
lib/kiba/extend/jobs/parser.rb,
lib/kiba/extend/jobs/runner.rb,
lib/kiba/extend/jobs/base_job.rb,
lib/kiba/extend/jobs/marc_job.rb,
lib/kiba/extend/jobs/reporter.rb,
lib/kiba/extend/jobs/show_me_job.rb,
lib/kiba/extend/jobs/tell_me_job.rb,
lib/kiba/extend/jobs/job_segmenter.rb,
lib/kiba/extend/jobs/dependency_job.rb,
lib/kiba/extend/jobs/json_to_csv_job.rb,
lib/kiba/extend/jobs/multi_source_prep_job.rb

Overview

Reusable, composable patterns for jobs

Heretofore, I have been repeating tons of code/logic for setting up a job in migration code:

  • Defining sources/destinations, @srcrows, @outrows
  • Changing CSV rows to hashes (initial transforms)
  • Changing hashes back to CSV rows
  • Calling postprocessing

Most of this never changes, and when it does there is way too much tedious work in a given migration to make it consistent across all jobs.

This is an attempt to dry up calling jobs and make it possible to test them via RSpec

Running Kiba.parse to define a job generates a Kiba::Control object, which is a wrapper bundling together: pre_processes, config, sources, transforms, destinations, and post_processes.

As described here, pre_ and post_processors get called once per ETL run—either before or after the ETL starts working through the source rows

This Kiba::Control object created by Kiba.parse is generated with a particular Kiba::Context, and once created, you cannot get access to or manipulate variables or configuration that the entire job needs to know about.

What Kiba::Extend::Jobs adds is the ability to set up reusable initial_transformers and final_transformers. Basically, job templates where just the meat of the transformations change.

files is the configuration of destination, source, and lookup files the job will use. It is a Hash, with the following format:

``` { source: [registry_key, registry_key], destination: [registry_key], lookup: [registry_key] } ~~~

OR

 { source: [registry_key, registry_key],
   destination: [registry_key]
 }

source and destination must each have at least one registry key. lookup may be omitted, or it may be included with one or more registry keys

transformer is a sequence of data transformations that could theoretically be called with interchangable input/output settings (i.e. materials).

In project code, instead of defining an entire job in a Kiba.parse block, you will define a Kiba.job_segment block containing just the transforms unique to that job.

Since:

  • 2.2.0

Defined Under Namespace

Modules: DependencyJob, JobSegmenter, Parser, Reporter, Runner, ShowMeJob, TellMeJob Classes: BaseJob, Job, JsonToCsvJob, MarcJob, MultiSourcePrepJob

Method Summary

Methods included from Parser

parse_job