Class: Kiba::Extend::Transforms::CombineValues::FromFieldsWithDelimiter

Inherits:
Object
Includes:
Allable
Defined in:
lib/kiba/extend/transforms/combine_values/from_fields_with_delimiter.rb

Overview

Note:

Do not set both prepend_source_field_name and deduplicate to true. There is no safe way to interpret the desired behavior for this combination of options, and the constructor raises Kiba::Extend::UnsafeParameterComboError.

Combine values from given fields into the target field.

This is like the CONCATENATE function in many spreadsheets. The given delim value is used as a separator between the combined values.

Note: Used with defaults, this has the same function as FullRecord, but deletes the source fields. FullRecord retains source fields by default.

If target field has the same name as one of the source fields, and delete_sources = true, no values are lost. The target field is not deleted.

Blank/nil values are dropped. If prepend_source_field_name = true, the names of blank/nil fields are omitted as well.
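The blank-dropping and name-prepending rules described above can be sketched in plain Ruby. This is only an illustration, not the library's implementation: the `combine_sketch` method and its `prepend_names` keyword are hypothetical names, and "blank" is assumed here to mean nil or an empty string.

```ruby
# Hypothetical helper illustrating the documented rules; not part of Kiba::Extend.
# Assumption: a value counts as blank when it is nil or an empty string.
def combine_sketch(row, fields:, delim: " ", prepend_names: false)
  parts = fields.filter_map do |field|
    value = row[field]
    # Blank values are skipped, so their field names are never emitted either.
    next if value.nil? || value.empty?

    prepend_names ? "#{field}: #{value}" : value
  end
  # An all-blank row yields nil rather than an empty string.
  parts.empty? ? nil : parts.join(delim)
end

row = {name: "Keet", sex: nil, source: "hatched"}
combine_sketch(row, fields: %i[name sex source])
combine_sketch(row, fields: %i[name sex source], prepend_names: true)
```

With the row shown, the first call gives "Keet hatched" and the second gives "name: Keet source: hatched", which agrees with the examples below.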

Examples:

With defaults

# Used in pipeline as:
# transform CombineValues::FromFieldsWithDelimiter
xform = CombineValues::FromFieldsWithDelimiter.new
input = [
  {name: "Weddy", sex: "m", source: "adopted"},
  {source: "hatched", sex: "f", name: "Niblet"},
  {source: "", sex: "m", name: "Tiresias"},
  {name: "Keet", sex: nil, source: "hatched"},
  {name: "", sex: nil, source: nil}
]
result = input.map{ |row| xform.process(row) }
expected = [
  {index: "Weddy m adopted"},
  {index: "Niblet f hatched"},
  {index: "Tiresias m"},
  {index: "Keet hatched"},
  {index: nil}
]
expect(result).to eq(expected)

Prepending field names

# Used in pipeline as:
# transform CombineValues::FromFieldsWithDelimiter,
#   prepend_source_field_name: true
xform = CombineValues::FromFieldsWithDelimiter.new(
  prepend_source_field_name: true
)
input = [
  {name: "Weddy", sex: "m", source: "adopted"},
  {source: "hatched", sex: "f", name: "Niblet"},
  {source: "", sex: "m", name: "Tiresias"},
  {name: "Keet", sex: nil, source: "hatched"},
  {name: "", sex: nil, source: nil}
]
result = input.map{ |row| xform.process(row) }
expected = [
  {index: "name: Weddy sex: m source: adopted"},
  {index: "name: Niblet sex: f source: hatched"},
  {index: "name: Tiresias sex: m"},
  {index: "name: Keet source: hatched"},
  {index: nil}
]
expect(result).to eq(expected)

With custom sources, a source as target, and delim

# Used in pipeline as:
# transform CombineValues::FromFieldsWithDelimiter,
#  sources: %i[name sex],
#  target: :name,
#  delim: ", "
xform = CombineValues::FromFieldsWithDelimiter.new(
  sources: %i[name sex],
  target: :name,
  delim: ", "
)
input = [
  {name: "Weddy", sex: "m", source: "adopted"},
  {source: "hatched", sex: "f", name: "Niblet"},
  {source: "", sex: "m", name: "Tiresias"},
  {name: "Keet", sex: nil, source: "hatched"},
  {name: "", sex: nil, source: "na"}
]
result = input.map{ |row| xform.process(row) }
expected = [
  {name: "Weddy, m", source: "adopted"},
  {name: "Niblet, f", source: "hatched"},
  {name: "Tiresias, m", source: ""},
  {name: "Keet", source: "hatched"},
  {name: nil, source: "na"}
]
expect(result).to eq(expected)

Deduplicating combined values

# Used in pipeline as:
# transform CombineValues::FromFieldsWithDelimiter,
#  sources: %i[p1 p2 p3 p4],
#  target: :p,
#  delim: "|",
#  deduplicate: true,
#  dedupe_delim: ";"
xform = CombineValues::FromFieldsWithDelimiter.new(
  sources: %i[p1 p2 p3 p4],
  target: :p,
  delim: "|",
  deduplicate: true,
  dedupe_delim: ";"
)
input = [
  {p1: "a", p2: "b", p3: "b", p4: "a"},
  {p1: "a;b", p2: "b|b", p3: "b;b", p4: "a|a"}
]
result = input.map{ |row| xform.process(row) }
expected = [
  {p: "a|b"},
  {p: "a|b"}
]
expect(result).to eq(expected)

Deduplicate without separate dedupe_delim

# Used in pipeline as:
# transform CombineValues::FromFieldsWithDelimiter,
#  sources: %i[p1 p2 p3 p4],
#  target: :p,
#  delim: "|",
#  deduplicate: true
xform = CombineValues::FromFieldsWithDelimiter.new(
  sources: %i[p1 p2 p3 p4],
  target: :p,
  delim: "|",
  deduplicate: true
)
input = [
  {p1: "a", p2: "b", p3: "b", p4: "a"},
  {p1: "a;b", p2: "b|b", p3: "b|b;b", p4: "a|a"}
]
result = input.map{ |row| xform.process(row) }
expected = [
  {p: "a|b"},
  {p: "a;b|b|b;b|a"}
]
expect(result).to eq(expected)

ERROR when prepending and deduplicating

# Used in pipeline as:
# transform CombineValues::FromFieldsWithDelimiter,
#  sources: %i[p1 p2 p3 p4],
#  target: :p,
#  delim: "|",
#  deduplicate: true,
#  prepend_source_field_name: true
xform = CombineValues::FromFieldsWithDelimiter
params = {
  sources: %i[p1 p2 p3 p4],
  target: :p,
  delim: "|",
  deduplicate: true,
  prepend_source_field_name: true
}
expect{ xform.new(**params) }.to raise_error(
  Kiba::Extend::UnsafeParameterComboError
)

Direct Known Subclasses

FullRecord

Constructor Details

#initialize(sources: :all, target: :index, delim: " ", prepend_source_field_name: false, delete_sources: true, deduplicate: false, dedupe_delim: nil) ⇒ FromFieldsWithDelimiter

Returns a new instance of FromFieldsWithDelimiter.

Parameters:

  • sources (Array<Symbol>, :all) (defaults to: :all)

    Fields whose values are to be combined

  • target (Symbol) (defaults to: :index)

    Field into which the combined value will be written. May be one of the source fields

  • delim (String) (defaults to: " ")

    Value used to separate individual field values in combined target field

  • prepend_source_field_name (Boolean) (defaults to: false)

    Whether to insert the source field name before its value in the combined value. Field names are NOT prepended to nil or blank values. Since 4.0.0

  • delete_sources (Boolean) (defaults to: true)

    Whether to delete the source fields after combining their values into the target field. If target field name is the same as one of the source fields, the target field is not deleted. Since 4.0.0

  • deduplicate (Boolean) (defaults to: false)

    Whether to deduplicate the field values before combining them.

  • dedupe_delim (String) (defaults to: nil)

    Delimiter on which to split individual field values for deduplication. Can be omitted if it is the same as the delim value, because values are also split on delim AFTER they are split on this value.
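The interaction between delim and dedupe_delim can be sketched as follows. This is a guess at the behavior based on the two deduplication examples above, not the library's actual code; `dedupe_combine_sketch` is a hypothetical name, and the delimiters are assumed to be plain strings other than a single space (which `String#split` treats specially).

```ruby
# Hypothetical helper inferred from the documented examples; not part of Kiba::Extend.
def dedupe_combine_sketch(values, delim:, dedupe_delim: nil)
  # Drop nil and empty values first (assumed definition of "blank").
  parts = values.compact.reject(&:empty?)
  # Split on the dedicated dedupe delimiter first, if one was given...
  parts = parts.flat_map { |value| value.split(dedupe_delim) } if dedupe_delim
  # ...then always split on the combining delimiter.
  parts = parts.flat_map { |value| value.split(delim) }
  combined = parts.uniq.join(delim)
  combined.empty? ? nil : combined
end

dedupe_combine_sketch(["a;b", "b|b", "b;b", "a|a"], delim: "|", dedupe_delim: ";")
dedupe_combine_sketch(["a;b", "b|b", "b|b;b", "a|a"], delim: "|")
```

The first call returns "a|b". The second returns "a;b|b|b;b|a", because without dedupe_delim the ";" is never treated as a separator, so "a;b" and "b;b" remain distinct values.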



# File 'lib/kiba/extend/transforms/combine_values/from_fields_with_delimiter.rb', line 187

def initialize(sources: :all, target: :index, delim: " ",
  prepend_source_field_name: false, delete_sources: true,
  deduplicate: false, dedupe_delim: nil)
  @fields = [sources].flatten
  @target = target
  @delim = delim
  @del = delete_sources
  @prepend = prepend_source_field_name
  @deduplicate = deduplicate
  @dedupe_delim = dedupe_delim

  if prepend && deduplicate
    raise Kiba::Extend::UnsafeParameterComboError,
      "Do not run #{self.class.name} with both deduplicate and "\
      "prepend_source_field_name set to true"
  end
end

Instance Method Details

#process(row) ⇒ Object

Parameters:

  • row (Hash{ Symbol => String, nil })


# File 'lib/kiba/extend/transforms/combine_values/from_fields_with_delimiter.rb', line 206

def process(row)
  finalize_fields(row) unless fields_set

  fieldvals = fields.map { |field| field_and_value(row, field) }
    .compact
    .to_h
  fields.each { |src| row.delete(src) } if del
  row[target] = combined_value(fieldvals)

  # if prepend
  #   pvals = []
  #   vals.each_with_index do |val, i|
  #     val = "#{fields[i]}: #{val}" unless val.nil?
  #     pvals << val
  #   end
  #   vals = pvals
  # end
  # val = vals.compact.join(delim)
  # row[target] = val.empty? ? nil : val

  row
end
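The order of operations in process (collect values, delete sources, then write the target) is what allows the target to share a name with a source field without losing data. The following standalone sketch mimics that order. `CombineSketch` is a hypothetical class, not the library's. It assumes that sources: :all is resolved from the keys of the first row processed, which is consistent with the field order seen in the examples but is not confirmed by the source shown here.

```ruby
# Hypothetical stand-in illustrating the order of operations; not part of Kiba::Extend.
class CombineSketch
  def initialize(sources: :all, target: :index, delim: " ", delete_sources: true)
    @fields = [sources].flatten
    @target = target
    @delim = delim
    @del = delete_sources
    @fields_set = @fields != [:all]
  end

  def process(row)
    # Assumption: :all is resolved once, from the first row's keys.
    unless @fields_set
      @fields = row.keys
      @fields_set = true
    end

    # 1. Collect non-blank values while the source fields still exist.
    values = @fields.map { |field| row[field] }
      .reject { |value| value.nil? || value.empty? }
    # 2. Delete the sources (this may include the target field itself).
    @fields.each { |field| row.delete(field) } if @del
    # 3. Write the target last, so it survives the deletion step.
    row[@target] = values.empty? ? nil : values.join(@delim)
    row
  end
end

xform = CombineSketch.new(sources: %i[name sex], target: :name, delim: ", ")
xform.process({name: "Weddy", sex: "m", source: "adopted"})
```

Here :name is both a source and the target: it is deleted in step 2 and rewritten in step 3, so the resulting row holds name "Weddy, m" and the untouched source "adopted".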