Class: Kiba::Extend::Transforms::Deduplicate::FieldValues

Inherits:
Object
Defined in:
lib/kiba/extend/transforms/deduplicate/field_values.rb

Overview

Note:

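This is NOT safe for use with groups of fields whose multi-values are expected to stay the same length, because each field is deduplicated independently.

Removes duplicate values within the given field(s).

Processes one field at a time. Splits each value on sep and keeps only the unique values, in order of first occurrence. Comparison is case-sensitive, so "a" and "A" are both kept.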

Input table:

| foo         | bar       |
|-------------|-----------|
| 1;1;1;2;2;2 | a;A;b;b;b |
|             | q;r;r     |
| 1           | 2         |
| 1           | 2         |

Used in pipeline as:

  transform Deduplicate::FieldValues, fields: %i[foo bar], sep: ';'

Results in:

| foo   | bar     |
|-------|---------|
| 1;2   | a;A;b   |
|       | q;r     |
| 1     | 2       |
| 1     | 2       |
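The note about grouped fields can be made concrete with a small sketch. It does not load the library: the `dedupe` lambda is a stand-in that copies the per-field expression from `#process`, and the `name`/`role` fields are hypothetical. It shows how two fields that start with the same number of values can end up with different lengths.

```ruby
# Stand-in for the per-field logic in #process; not the library class itself.
dedupe = ->(value, sep) { value.to_s.split(sep).uniq.join(sep) }

# Hypothetical grouped fields: the nth name is meant to pair with the nth role.
row = { name: "Ann;Bob;Cy", role: "author;author;editor" }

# Each field is deduplicated on its own, with no knowledge of the other.
result = row.transform_values { |value| dedupe.call(value, ";") }

names = result[:name].split(";")
roles = result[:role].split(";")

# The positional pairing is lost: three names remain, but only two roles.
puts "names: #{names.length}, roles: #{roles.length}"
```

After this runs, "Bob" lines up with "editor" and "Cy" has no role at all, which is why the transform should not be applied to fields that are read position by position.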

Constructor Details

#initialize(fields:, sep:) ⇒ FieldValues

Returns a new instance of FieldValues.

Parameters:

  • fields (Array<Symbol>)

    names of fields in which to deduplicate values

  • sep (String)

    used to split/join multivalued field values



# File 'lib/kiba/extend/transforms/deduplicate/field_values.rb', line 47

def initialize(fields:, sep:)
  @fields = [fields].flatten
  @sep = sep
end
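Because the constructor wraps `fields` in an array and flattens it, a single Symbol appears to be accepted as well as an Array, even though the documented type is `Array<Symbol>`. A minimal sketch of that normalization step on its own (the `normalize` lambda is a name introduced here for illustration):

```ruby
# Same normalization expression as the constructor: wrap, then flatten.
normalize = ->(fields) { [fields].flatten }

single = normalize.call(:foo)           # a bare Symbol becomes a one-element Array
several = normalize.call(%i[foo bar])   # an Array is returned unnested

p single
p several
```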

Instance Method Details

#process(row) ⇒ Object

Parameters:

  • row (Hash{ Symbol => String, nil })

    row to process; modified in place and returned


# File 'lib/kiba/extend/transforms/deduplicate/field_values.rb', line 53

def process(row)
  @fields.each do |field|
    val = row.fetch(field)
    row[field] = val.to_s.split(@sep).uniq.join(@sep) unless val.nil?
  end
  row
end
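The listing above implies a few edge-case behaviors that are easy to miss: nil values are left as nil, matching is case-sensitive, and `row.fetch` raises `KeyError` when a configured field is absent from the row. The sketch below exercises them using `FieldValuesSketch`, a hypothetical stand-in that copies the two methods shown on this page so the example runs without the library installed.

```ruby
# Hypothetical stand-in copying the constructor and #process shown above.
class FieldValuesSketch
  def initialize(fields:, sep:)
    @fields = [fields].flatten
    @sep = sep
  end

  def process(row)
    @fields.each do |field|
      val = row.fetch(field)
      row[field] = val.to_s.split(@sep).uniq.join(@sep) unless val.nil?
    end
    row
  end
end

xform = FieldValuesSketch.new(fields: %i[foo bar], sep: ";")

# A nil value is skipped and stays nil; the other field is still deduplicated.
with_nil = xform.process({ foo: nil, bar: "q;r;r" })

# "a" and "A" are distinct strings, so both survive.
case_sensitive = xform.process({ foo: "1;1;2", bar: "a;A;b;b" })

# A row lacking a configured field raises KeyError from Hash#fetch.
missing = begin
  xform.process({ foo: "1;1" })
  :no_error
rescue KeyError
  :key_error
end

p with_nil
p case_sensitive
p missing
```

Note that the row hash is mutated and then returned, so callers that need the original values should pass in a copy.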