Class: Kiba::Extend::Transforms::Helpers::PersonNameChecker

Inherits:
Object
Defined in:
lib/kiba/extend/transforms/helpers/person_name_checker.rb

Overview

Returns true or false, indicating whether the given value matches any of the default or added patterns. This works reasonably well on a list of names and name-like values that follow standard inverted name entry patterns (for example, "Surname, Given name"). It will not work at all on directly entered names. When used on a list of subject-like terms or on free text, be wary of false positives, though the patterns and the duplicative anchoring matching try to avoid matching subject-like terms.

The default name list provided is all unique first names from the data set at https://www.ssa.gov/OACT/babynames/limits.html which have been on more than 100 Social Security card applications from 1880-2022. So there is a definite U.S. bias.

Since:

  • 4.0.0

Constant Summary

DEFAULT_PATTERNS =

Since:

  • 4.0.0

[]
ANTIPATTERNS =

Since:

  • 4.0.0

[
  /^\d/,
  /^\w+\.?$/
]
FAMILY_PATTERNS =

Since:

  • 4.0.0

[
  / famil(ies|y)/i
]
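To make the two non-empty constants concrete, the sketch below copies the ANTIPATTERNS and FAMILY_PATTERNS regexps from the listings above and tries them on a few sample strings. The constant names, helper methods, and sample values are illustrative only and are not part of kiba-extend.

```ruby
# Regexps copied from the constant listings; everything else is illustrative.
SKETCH_ANTIPATTERNS = [
  /^\d/,      # value starts with a digit
  /^\w+\.?$/  # value is a single word, optionally ending in a period
]
SKETCH_FAMILY_PATTERNS = [
  / famil(ies|y)/i # "family" or "families", preceded by a space
]

# Hypothetical helper: true if any antipattern matches the value.
def antipattern?(value)
  SKETCH_ANTIPATTERNS.any? { |pattern| value.match?(pattern) }
end

# Hypothetical helper: true if any family pattern matches the value.
def family?(value)
  SKETCH_FAMILY_PATTERNS.any? { |pattern| value.match?(pattern) }
end

antipattern?("1920 census")  # true: leading digit
antipattern?("Smith")        # true: single word
antipattern?("Jr.")          # true: single word with trailing period
antipattern?("Smith, John")  # false: more than one word
family?("Smith family")      # true
family?("Family Dollar")     # false: no space before "Family"
```

Note that the single-word antipattern means a bare surname or given name on its own is never flagged as a person name.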

Class Method Summary

Instance Method Summary

Constructor Details

#initialize(added_patterns: [], family_is_person: false, name_lists: [File.join(Kiba::Extend.ke_dir, "data", "us_names_1880-2022_gt100.txt")], mode: :strict, order: :inverted) ⇒ PersonNameChecker

Returns a new instance of PersonNameChecker.

Parameters:

  • added_patterns (Array<Regexp>) (defaults to: [])

    Non-standard regular expressions to check against. Best practice is to add these to this helper via a pull request if you think they generally indicate person-ness.

  • family_is_person (Boolean) (defaults to: false)

    whether names with terms indicating family-ness are treated as persons (false for CollectionSpace, potentially true for other applications)

  • name_lists (Array<String>) (defaults to: [File.join(Kiba::Extend.ke_dir, "data", "us_names_1880-2022_gt100.txt")])

    paths to file(s) containing known given name Strings

  • mode (:strict, :lenient) (defaults to: :strict)

    :strict requires the given name to be in an expected position (as per the order parameter) for the value to be flagged as a name. :lenient will flag the value as a name if any words in the value match values in the name_lists

  • order (:direct, :inverted) (defaults to: :inverted)

    expected order of names. Has no effect if mode=:lenient

Since:

  • 4.0.0
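The difference between the mode and order options may be easier to see in code. The following is a minimal sketch of one plausible reading of the parameter descriptions above, not the helper's actual strict_check or lenient_check methods, which this page does not show. The name set, the method names, and the exact positions treated as "expected" (first word after the first comma for :inverted, first word for :direct) are all assumptions.

```ruby
require "set"

# Hypothetical stand-in for the names loaded from the name_lists files.
KNOWN_GIVEN_NAMES = Set.new(%w[ansel john mary])

# Split a value into lowercase alphabetic words, dropping punctuation.
def words(value)
  value.downcase.scan(/[[:alpha:]]+/)
end

# :lenient sketch: any word in the value may be a known given name.
def lenient_name?(value)
  words(value).any? { |word| KNOWN_GIVEN_NAMES.include?(word) }
end

# :strict sketch: the given name must sit where the order says it should.
# Assumption: :inverted looks at the first word after the first comma,
# :direct looks at the first word of the value.
def strict_name?(value, order: :inverted)
  candidate =
    case order
    when :inverted
      _surname, rest = value.split(",", 2)
      rest ? words(rest).first : nil
    when :direct
      words(value).first
    else
      raise ArgumentError, "unknown order: #{order.inspect}"
    end
  !candidate.nil? && KNOWN_GIVEN_NAMES.include?(candidate)
end

strict_name?("Adams, Ansel")                  # true
strict_name?("Ansel Adams")                   # false: no inverted form
strict_name?("Ansel Adams", order: :direct)   # true
lenient_name?("Portraits of Mary")            # true: a likely false positive
strict_name?("Portraits of Mary")             # false
```

Under these assumptions, :lenient trades precision for recall, which is why the stricter positional check is the default.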



# File 'lib/kiba/extend/transforms/helpers/person_name_checker.rb', line 66

def initialize(added_patterns: [], family_is_person: false,
  name_lists: [File.join(Kiba::Extend.ke_dir, "data",
    "us_names_1880-2022_gt100.txt")],
  mode: :strict, order: :inverted)
  base = DEFAULT_PATTERNS + added_patterns
  @patterns = family_is_person ? base + FAMILY_PATTERNS : base
  anti = ANTIPATTERNS
  @antipatterns = family_is_person ? anti : anti + FAMILY_PATTERNS
  @names = set_up_names(name_lists)
  @mode = mode
  @order = order
end
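The constructor routes FAMILY_PATTERNS into either the person patterns or the antipatterns, depending on family_is_person. The sketch below mirrors those two assignments outside the class so the effect can be checked in isolation. The constant names and the build_pattern_sets and matches_any? helpers are hypothetical, and DEFAULT_PATTERNS is taken to be empty, as in the constant listing.

```ruby
# Regexps copied from the constant listings.
FAMILY = [/ famil(ies|y)/i]
ANTI = [/^\d/, /^\w+\.?$/]

# Hypothetical helper mirroring the assignments in #initialize.
# DEFAULT_PATTERNS is [] in the listing, so base is just added_patterns.
def build_pattern_sets(family_is_person:, added_patterns: [])
  base = [] + added_patterns
  {
    patterns: family_is_person ? base + FAMILY : base,
    antipatterns: family_is_person ? ANTI : ANTI + FAMILY
  }
end

# Hypothetical helper: true if any regexp in the list matches the value.
def matches_any?(value, regexps)
  regexps.any? { |regexp| value.match?(regexp) }
end

as_person = build_pattern_sets(family_is_person: true)
as_non_person = build_pattern_sets(family_is_person: false)

matches_any?("Smith family", as_person[:patterns])          # true
matches_any?("Smith family", as_person[:antipatterns])      # false
matches_any?("Smith family", as_non_person[:patterns])      # false
matches_any?("Smith family", as_non_person[:antipatterns])  # true
```

With the default family_is_person: false, a value such as "Smith family" is therefore rejected by the antipattern check before any name-list lookup happens.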

Class Method Details

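.call(value:, added_patterns: [], family_is_person: false) ⇒ Object

Convenience method that builds a new PersonNameChecker with the given added_patterns and family_is_person, then calls it on value. It does not accept name_lists, mode, or order, so the constructor defaults apply.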

Since:

  • 4.0.0



# File 'lib/kiba/extend/transforms/helpers/person_name_checker.rb', line 26

def call(
  value:,
  added_patterns: [],
  family_is_person: false
)
  new(
    added_patterns: added_patterns,
    family_is_person: family_is_person
  ).call(value)
end
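Because the class-level call builds a fresh instance on every invocation, the constructor's set_up_names step runs once per value. Whether that step rereads the name files each time is not shown on this page, so treat the cost as a possibility rather than a fact. The stand-in class below (SketchChecker, entirely hypothetical) counts constructor runs to illustrate why building one instance and reusing it may be preferable when checking many values.

```ruby
# Hypothetical stand-in class; not part of kiba-extend.
class SketchChecker
  @loads = 0

  class << self
    attr_accessor :loads

    # Same shape as the class-level call above: new instance per value.
    def call(value:)
      new.call(value)
    end
  end

  def initialize
    # Stands in for set_up_names, which may read the name list files.
    self.class.loads += 1
    @names = %w[ansel mary]
  end

  def call(value)
    @names.any? { |name| value.downcase.include?(name) }
  end
end

values = ["Adams, Ansel", "Shelley, Mary", "Landscapes"]

# Class-level call: one constructor run per value.
values.each { |value| SketchChecker.call(value: value) }
SketchChecker.loads  # 3

# Reused instance: one constructor run in total.
SketchChecker.loads = 0
checker = SketchChecker.new
values.each { |value| checker.call(value) }
SketchChecker.loads  # 1
```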

Instance Method Details

#call(value) ⇒ true, false

Parameters:

  • value (String)

Returns:

  • (true)

    if value matches a person pattern

  • (false)

    otherwise

Since:

  • 4.0.0



# File 'lib/kiba/extend/transforms/helpers/person_name_checker.rb', line 82

def call(value)
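  return false if value.blank?
  # Antipatterns are checked first, so they override any person pattern.
  return false if antipatterns.any? do |pattern|
    value.match?(pattern)
  end
  return true if patterns.any? do |pattern|
    value.match?(pattern)
  end

  # No pattern decided the result, so fall back to the name-list check.
  (mode == :lenient) ? lenient_check(value) : strict_check(value)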
end
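The order of the checks in #call matters: a blank value or an antipattern match returns false before any person pattern is consulted. The sketch below reproduces only that ordering and reports which step decided the outcome. The classify method and its return symbols are hypothetical, and the blank test is a plain-Ruby approximation of ActiveSupport's blank?.

```ruby
# Hypothetical helper: reports which step of the #call flow decides a value.
def classify(value, patterns:, antipatterns:)
  # Plain-Ruby approximation of value.blank?
  return :blank if value.nil? || value.strip.empty?
  return :antipattern if antipatterns.any? { |regexp| value.match?(regexp) }
  return :pattern if patterns.any? { |regexp| value.match?(regexp) }

  # In the real helper, strict_check or lenient_check runs at this point.
  :name_list_check
end

anti = [/^\d/, /^\w+\.?$/]

classify("  ", patterns: [], antipatterns: anti)
# :blank
classify("Smith", patterns: [/Smith/], antipatterns: anti)
# :antipattern, even though an added pattern also matches
classify("Smith family", patterns: [/ famil(ies|y)/i], antipatterns: anti)
# :pattern
classify("Adams, Ansel", patterns: [], antipatterns: anti)
# :name_list_check
```

One practical consequence is that an added pattern cannot rescue a value that an antipattern already rejects, such as a single-word name.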