Class: Kiba::Extend::Utils::StringNormalizer
- Inherits:
- Object
- Defined in:
- lib/kiba/extend/utils/string_normalizer.rb
Overview
Normalizes the given string according to the given parameters.
It can be used in two ways. The preferred way, when used in a transform or any other context where the same normalization settings are applied to many strings, is:
# First initialize an instance of the class as an instance variable in
# your context
@normalizer = Kiba::Extend::Utils::StringNormalizer.new(
  xforms: [:blank]
)
# for the repetitive part:
vals.each { |val| @normalizer.call(val) }
For one-off usage, testing normalization logic, or where the normalization settings vary per normalized value, you can do:
Kiba::Extend::Utils::StringNormalizer.call(
  xforms: [:blank], str: 'Card table'
)
# => 'Cardtable'
The second way is much less performant, as it initializes a new instance of the class every time it is called.
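The difference between the two call styles can be sketched with a small stand-in class. `SketchNormalizer` below is hypothetical and only mirrors the shape described above (a class-level `call` that builds an instance, and an instance-level `call` that reuses stored settings); it is not the library's implementation and supports only `:blank`.

```ruby
# Hypothetical stand-in for illustration only; not the kiba-extend class.
class SketchNormalizer
  # One-off form: builds a new instance on every call.
  def self.call(str:, xforms: [])
    new(xforms: xforms).call(str)
  end

  def initialize(xforms: [])
    @xforms = xforms
  end

  # Reusable form: the settings were fixed once, at initialization.
  def call(val)
    return val if val.nil? || val.empty?

    @xforms.include?(:blank) ? val.gsub(/\p{Blank}/, "") : val
  end
end

vals = ["Card table", "Side  table"]

# Preferred when settings are shared: initialize once, call many times.
normalizer = SketchNormalizer.new(xforms: [:blank])
reused = vals.map { |val| normalizer.call(val) }

# One-off form: same results, but one new instance per string.
one_off = vals.map { |val| SketchNormalizer.call(str: val, xforms: [:blank]) }
```

Both `reused` and `one_off` hold the same normalized strings; the one-off form simply pays for an extra object allocation and setup on each value.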
Class Method Summary
-
.call(str:, mode: nil, replacements: {}, xforms: []) ⇒ Object
Defined xforms.
Instance Method Summary
-
#call(val) ⇒ Object
-
#initialize(mode: nil, replacements: {}, xforms: []) ⇒ StringNormalizer
constructor
Defined xforms.
Constructor Details
#initialize(mode: nil, replacements: {}, xforms: []) ⇒ StringNormalizer
Defined xforms
- :nfkc - ON BY DEFAULT: Applies Unicode compatibility decomposition, followed by canonical composition; See https://unicode.org/reports/tr15/ for more details than you want.
- :replace - ON BY DEFAULT: performs find-and-replace operations specified in the replacements parameter
- :blank - deletes all spaces and tabs, using Ruby /\p{Blank}/ regexp
- :lower - downcase the string
- :nonword - removes ALL characters that are not letters, numbers, or underscores
- :punct - removes all characters matching Ruby /\p{Punct}/ regexp
- :to_ascii - replaces non-ASCII characters with an ASCII approximation, or if none exists, a replacement character which defaults to “?”.
Defined modes
- :cspaceid - replaces weird characters that don’t convert to ASCII properly, and applies the :to_ascii, :nonword, and :lower xforms
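To make the xform descriptions concrete, here is a rough plain-Ruby approximation of several of them. The regular expressions and method calls are assumptions chosen to match the descriptions above, not the library's own code. `:replace` and `:to_ascii` are left out because the format of `replacements` and the transliteration approach are not described here.

```ruby
# Approximations in plain Ruby; the library's own implementations may differ.
# The input starts with fullwidth letters (U+FF23 U+FF41 U+FF52 U+FF44).
str = "\uFF23\uFF41\uFF52\uFF44 table, No. 2"

# :nfkc - compatibility decomposition, then canonical composition
nfkc = str.unicode_normalize(:nfkc)

# :blank - delete spaces and tabs
blank = nfkc.gsub(/\p{Blank}/, "")

# :lower - downcase
lower = nfkc.downcase

# :nonword - assumed pattern; keeps ASCII letters, digits, and underscores
nonword = nfkc.gsub(/\W/, "")

# :punct - delete punctuation
punct = nfkc.gsub(/\p{Punct}/, "")

puts nfkc     # Card table, No. 2
puts blank    # Cardtable,No.2
puts lower    # card table, no. 2
puts nonword  # CardtableNo2
puts punct    # Card table No 2
```

Each line here is applied to the NFKC-normalized string independently, so the outputs show the effect of one xform at a time rather than a combined pipeline.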
# File 'lib/kiba/extend/utils/string_normalizer.rb', line 150
def initialize(mode: nil, replacements: {}, xforms: [])
  @mode = mode
  @replacements = replacements
  @xforms = %i[nfkc replace] + xforms
  apply_mode_settings
end
Class Method Details
.call(str:, mode: nil, replacements: {}, xforms: []) ⇒ Object
Defined xforms
- :nfkc - ON BY DEFAULT: Applies Unicode compatibility decomposition, followed by canonical composition; See https://unicode.org/reports/tr15/ for more details than you want.
- :replace - ON BY DEFAULT: performs find-and-replace operations specified in the replacements parameter
- :blank - deletes all spaces and tabs, using Ruby /\p{Blank}/ regexp
- :lower - downcase the string
- :nonword - removes ALL characters that are not letters, numbers, or underscores
- :punct - removes all characters matching Ruby /\p{Punct}/ regexp
- :to_ascii - replaces non-ASCII characters with an ASCII approximation, or if none exists, a replacement character which defaults to “?”.
Defined modes
- :cspaceid - replaces weird characters that don’t convert to ASCII properly, and applies the :to_ascii, :nonword, and :lower xforms
# File 'lib/kiba/extend/utils/string_normalizer.rb', line 112
def call(str:, mode: nil, replacements: {}, xforms: [])
  new(
    mode: mode,
    replacements: replacements,
    xforms: xforms
  ).call(str)
end
Instance Method Details
#call(val) ⇒ Object
# File 'lib/kiba/extend/utils/string_normalizer.rb', line 157
def call(val)
  return val if val.blank?

  xforms.inject(val) { |res, nv| do_xform(res, nv) }
end
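The `inject` call threads the string through each xform in order, with every step receiving the previous step's result. A minimal sketch of that chaining follows. `XFORM_SKETCH` and `normalize_sketch` are hypothetical names; the body of the library's `do_xform` is not shown in this page, so the lambdas are stand-ins, and the guard clause is only a rough plain-Ruby substitute for `blank?`.

```ruby
# Hypothetical transform table for illustration; the library dispatches
# through its own do_xform method, whose body is not shown here.
XFORM_SKETCH = {
  nfkc: ->(s) { s.unicode_normalize(:nfkc) },
  blank: ->(s) { s.gsub(/\p{Blank}/, "") },
  lower: ->(s) { s.downcase }
}.freeze

def normalize_sketch(val, xforms)
  # Rough stand-in for blank?: nil and whitespace-only values pass through.
  return val if val.nil? || val.strip.empty?

  # Mirror the constructor's idea of prepending default xforms, then chain:
  # each step receives the result of the previous one.
  ([:nfkc] + xforms).inject(val) do |res, name|
    XFORM_SKETCH.fetch(name).call(res)
  end
end

chained = normalize_sketch("Card Table", [:blank, :lower])
empty   = normalize_sketch("", [:blank])
missing = normalize_sketch(nil, [:lower])

puts chained          # cardtable
puts empty.inspect    # ""
puts missing.inspect  # nil
```

Because the steps run in sequence, the order of the `xforms` array can matter whenever one xform changes what a later one would match.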