Class: Kiba::Extend::Transforms::Clean::EnsureConsistentFields

Inherits:
Object
Defined in:
lib/kiba/extend/transforms/clean/ensure_consistent_fields.rb

Overview

Note:

This transform buffers every row in memory until the source is exhausted, so on very large sources it may take a long time or fail.

Ensures each output Hash/row has the same keys, adding any missing key with a nil value. This is important for writing out to Destinations::CSV, which expects all rows to have the same headers.

Examples:

# Used in pipeline as:
# transform Clean::EnsureConsistentFields
xform = Clean::EnsureConsistentFields.new
input = [
  {foo: 'foo', bar: 'bar'},
  {baz: 'baz', boo: 'boo'}
]
result = Kiba::StreamingRunner.transform_stream(
  input, xform
).map{ |row| row }
expected = [
  {foo: 'foo', bar: 'bar', baz: nil, boo: nil},
  {foo: nil, bar: nil, baz: 'baz', boo: 'boo'}
]
expect(result).to eq(expected)

Since:

  • 4.0.0

Constructor Details

#initialize ⇒ EnsureConsistentFields

Returns a new instance of EnsureConsistentFields.

Since:

  • 4.0.0



# File 'lib/kiba/extend/transforms/clean/ensure_consistent_fields.rb', line 33

def initialize
  @keys = {}
  @rows = []
end

Instance Method Details

#close {|evened_row| ... } ⇒ Object

Yield Parameters:

  • evened_row (Hash{ Symbol => String, nil })

Since:

  • 4.0.0



# File 'lib/kiba/extend/transforms/clean/ensure_consistent_fields.rb', line 49

def close
  @allfields = keys.keys

  rows.each do |row|
    evened_row = add_fields(row)
    yield evened_row
  end
end
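
The add_fields helper called above is private and not shown on this page. As a rough sketch of what such a padding step could look like, the snippet below uses a hypothetical pad_row method (not part of kiba-extend) that overlays a row on a hash of all known fields set to nil. The field list is hard-coded here for illustration only.

```ruby
# Hypothetical stand-in for the private add_fields helper; the real
# implementation is not shown in this documentation.
def pad_row(row, all_fields)
  # Start from every known field set to nil, then overlay the row's own
  # values. Key order follows all_fields, so every padded row lines up.
  all_fields.to_h { |field| [field, nil] }.merge(row)
end

# Assumed field list, as it would look after both example rows were seen.
all_fields = %i[foo bar baz boo]
padded = pad_row({baz: "baz", boo: "boo"}, all_fields)
```

Building from the full field list, rather than adding missing keys to the row, also gives every row the same key order, which matters when a CSV writer takes its headers from the first row.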

#process(row) ⇒ Object

Returns nil, so no row is passed downstream until #close yields the evened rows.

Parameters:

  • row (Hash{ Symbol => String, nil })

Returns:

  • nil

Since:

  • 4.0.0



# File 'lib/kiba/extend/transforms/clean/ensure_consistent_fields.rb', line 40

def process(row)
  @keys = keys.merge(row.keys
              .map { |key| [key, nil] }
              .to_h)
  @rows << row
  nil
end
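
The key-accumulation step in #process can be tried outside a pipeline. The sketch below repeats the same merge pattern on two made-up rows, using plain local variables in place of the instance state. It is an illustration of the pattern, not the library's code path.

```ruby
# Local stand-in for the transform's accumulated key hash.
keys = {}

# Sample rows invented for this illustration; :bar appears in both.
rows = [
  {foo: "foo", bar: "bar"},
  {bar: "bar2", baz: "baz"}
]

rows.each do |row|
  # Map each key of the row to nil and merge it into the running hash.
  # Keys already present keep their original position.
  keys = keys.merge(row.keys.to_h { |key| [key, nil] })
end

# The hash keys now form the de-duplicated field list in first-seen order.
all_fields = keys.keys
```

Using a Hash rather than an Array for accumulation gives de-duplication for free while preserving insertion order.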