Class: Kiba::Extend::Transforms::Deduplicate::Fields

Inherits:
Object
Defined in:
lib/kiba/extend/transforms/deduplicate/fields.rb

Overview

Removes the value(s) of the source field from each of the target fields. A target left with no values becomes nil.

Input table:

| x   | y   | z   |
|-----+-----+-----|
| a   | a   | b   |
| a   | a   | a   |
| a   | b;a | a;c |
| a;b | b;a | a;c |
| a   | aa  | bat |
| nil | a   | nil |
|     | ;a  | b;  |
| a   | nil | nil |
| a   | A   | a   |

Used in pipeline as:

transform Deduplicate::Fields, source: :x, targets: %i[y z], multival: true, sep: ';'

Results in:

| x   | y   | z   |
|-----+-----+-----|
| a   | nil | b   |
| a   | nil | nil |
| a   | b   | c   |
| a;b | nil | c   |
| a   | aa  | bat |
| nil | a   | nil |
|     | a   | b   |
| a   | nil | nil |
| a   | A   | nil |
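
The fourth row above can be traced by hand in plain Ruby. What follows is a minimal sketch of the split, subtract, and rejoin steps for a non-empty source, not the library's own code. `strip_source_values` is a hypothetical helper name, and nothing here requires kiba-extend.

```ruby
# Hypothetical helper (not part of kiba-extend): removes the source's
# values from one multi-valued target string. Assumes a non-empty source.
def strip_source_values(source, target, sep: ";")
  source_vals = source.split(sep, -1).map(&:strip)
  # Array difference is case sensitive, like the default behavior.
  kept = target.split(sep, -1).map(&:strip) - source_vals
  joined = kept.join(sep)
  # A target with nothing left is reported as nil, as in the results table.
  joined.empty? ? nil : joined
end

# Row 4 of the input table: x = "a;b", y = "b;a", z = "a;c"
p strip_source_values("a;b", "b;a") # y: every value is in x, so nil
p strip_source_values("a;b", "a;c") # z: only "c" survives
```

Matching is on whole values, which is why `aa` and `bat` in the fifth row are left alone even though they contain `a`.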

Input table:

| x | y | z |
|---+---+---|
| a | A | a |
| a | a | B |

Used in pipeline as:

transform Deduplicate::Fields,
   source: :x,
   targets: %i[y z],
   multival: true,
   sep: ';',
   casesensitive: false

Results in:

| x | y   | z   |
|---+-----+-----|
| a | nil | nil |
| a | nil | B   |
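
With `casesensitive: false`, comparison happens on downcased copies while the surviving values keep their original case. The sketch below illustrates that idea in plain Ruby under the same assumptions as the example tables. `strip_source_values_ci` is a hypothetical name, not a kiba-extend method.

```ruby
# Hypothetical helper (not part of kiba-extend): case-insensitive removal
# of source values from one multi-valued target string.
def strip_source_values_ci(source, target, sep: ";")
  source_vals = source.split(sep, -1).map { |val| val.strip.downcase }
  # Compare downcased copies, but keep the original strings.
  kept = target.split(sep, -1).map(&:strip).reject do |val|
    source_vals.include?(val.downcase)
  end
  joined = kept.join(sep)
  joined.empty? ? nil : joined
end

p strip_source_values_ci("a", "A") # "A" matches "a" once downcased
p strip_source_values_ci("a", "B") # no match; "B" keeps its original case
```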

Constructor Details

#initialize(source:, targets:, casesensitive: true, multival: false, sep: Kiba::Extend.delim) ⇒ Fields

Returns a new instance of Fields.

Parameters:

  • source (Symbol)

    name of field containing value to remove from target fields

  • targets (Array<Symbol>)

    names of fields to remove source value(s) from

  • casesensitive (Boolean) (defaults to: true)

    whether matching should be case sensitive

  • multival (Boolean) (defaults to: false)

    whether to treat source and target values as multi-valued, splitting them on sep

  • sep (String) (defaults to: Kiba::Extend.delim)

    used to split/join multi-val field values



# File 'lib/kiba/extend/transforms/deduplicate/fields.rb', line 84

def initialize(source:, targets:, casesensitive: true,
  multival: false, sep: Kiba::Extend.delim)
  @source = source
  @targets = targets
  @casesensitive = casesensitive
  @multival = multival
  @sep = sep
end

Instance Method Details

#process(row) ⇒ Object

Parameters:

  • row (Hash{ Symbol => String, nil })


# File 'lib/kiba/extend/transforms/deduplicate/fields.rb', line 94

def process(row)
  sourceval = row.fetch(@source, nil)
  return row if sourceval.nil?

  targetvals = @targets.map { |target| row.fetch(target, nil) }
  return row if targetvals.compact.empty?

  sourceval = if @multival
    sourceval.split(@sep, -1).map(&:strip)
  else
    [sourceval.strip]
  end
  targetvals = if @multival
    targetvals.map do |val|
      val.split(@sep, -1).map(&:strip)
    end
  else
    targetvals.map { |val| [val.strip] }
  end

  if sourceval.blank?
    targetvals = targetvals.map { |vals| vals.reject(&:blank?) }
  elsif @casesensitive
    targetvals = targetvals.map { |vals| vals - sourceval }
  else
    sourceval = sourceval.map(&:downcase)
    targetvals = targetvals.map do |vals|
      vals.reject do |val|
        sourceval.include?(val.downcase)
      end
    end
  end

  targetvals = if @multival
    targetvals.map { |vals| vals&.join(@sep) }
  else
    targetvals.map(&:first)
  end
  targetvals = targetvals.map { |val| val.blank? ? nil : val }

  targetvals.each_with_index { |val, i| row[@targets[i]] = val }

  row
end
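
The `-1` limit passed to `String#split` in the listing above matters for the rows with empty values in the first example. By default Ruby drops trailing empty strings, while a negative limit keeps them. The sketch below shows the difference; `split_values` is a hypothetical helper that only mirrors the split-and-strip step.

```ruby
# Hypothetical helper mirroring the split-and-strip step in #process.
def split_values(value, sep = ";")
  value.split(sep, -1).map(&:strip)
end

p "b;".split(";")     # default limit drops the trailing empty string
p split_values("b;")  # -1 keeps it, so the empty value is visible
p split_values(";a")  # leading empty strings are kept either way
p split_values("")    # an empty string splits to an empty array
```

An empty source therefore becomes an empty array, which the listing treats as blank and handles by rejecting blank values from each target.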