Class: Kiba::Extend::Sources::JsonDir

Inherits:
Object
  • Object
Extended by:
Sourceable
Defined in:
lib/kiba/extend/sources/json_dir.rb

Overview

Note:

This source may return Hashes with differing keys. That causes problems when writing out to Destinations::CSV, which expects all rows to have the same headers/fields. Using Transforms::Clean::EnsureConsistentFields in any job that has a JsonDir source and a Destinations::CSV destination will protect against these errors. Jobs::JsonToCsvJob runs this transform automatically as the last step before writing out rows.
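To illustrate the problem this note describes, here is a minimal plain-Ruby sketch that pads rows to a shared set of keys. It does not use or reproduce Transforms::Clean::EnsureConsistentFields; `ensure_consistent_fields` is a hypothetical helper name chosen for illustration only.

```ruby
# Hypothetical helper (not part of kiba-extend): pads every row with nil
# values so that all rows end up with the same keys, in the same order.
def ensure_consistent_fields(rows)
  all_keys = rows.flat_map(&:keys).uniq
  rows.map do |row|
    all_keys.to_h { |key| [key, row[key]] }
  end
end

# Two rows with different keys, as a directory of JSON files might produce
rows = [
  {id: "1", title: "A"},
  {id: "2", creator: "B"}
]

consistent = ensure_consistent_fields(rows)
consistent.each { |row| p row }
```

After padding, every row has the keys `:id`, `:title`, and `:creator`, which is the shape a CSV writer with fixed headers needs.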

Given the path to a directory, parses each file matching the given specifications as JSON, and returns each result as a Hash with:

  • empty values converted to nil
  • keys downcased and converted to Symbols
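The two conversions listed above can be sketched roughly as follows. This is an approximation under stated assumptions, not the class's actual parsing code: `normalize` is a hypothetical helper, "empty" is assumed to mean anything that responds to `empty?` with true (empty strings, arrays, hashes), and the top level of the JSON document is assumed to be an object.

```ruby
require "json"

# Hypothetical helper: parses a JSON object, downcases and symbolizes its
# top-level keys, and replaces empty values with nil.
def normalize(json_string)
  JSON.parse(json_string).each_with_object({}) do |(key, value), result|
    empty = value.respond_to?(:empty?) && value.empty?
    result[key.downcase.to_sym] = empty ? nil : value
  end
end

record = normalize('{"Title": "C1", "FILE": "", "Pages": 3}')
p record
```

Only top-level keys are touched in this sketch; nested structures are left as parsed.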

Nothing is done to handle non-String data structures as the values of top-level keys in the JSON documents. If you use a CSV destination, such values are written out as the string versions of their Ruby representations. That is, your CSV field value might be:

{"65"=>{"title"=>"C1", "file"=>"66.jp2"}, "67"=>{"title"=>"C2", "file"=>"67.jp2"}}

If you need to work with such a value in subsequent jobs (i.e. reading the string back in from the CSV), you can do something like:

transform do |row|
  val = row[:codestringfield]
  next row if val.blank?

  # Given the example above, this will convert `val` to a Ruby Hash
  code = instance_eval(val)
  # whatever additional code you need to process the data
  row
end
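The round trip behind this approach can be tried outside of a job. The sketch below assumes the text written to the CSV matches what `Hash#to_s` produces, and it uses `Kernel#eval` in place of `instance_eval` for a standalone script. Evaluating a string runs it as Ruby code, so this is only appropriate for data you trust.

```ruby
nested = {"65" => {"title" => "C1", "file" => "66.jp2"}}

# Assumption: the CSV field holds the Hash's string representation
as_text = nested.to_s

# Evaluating the string as Ruby code rebuilds the original Hash.
# Only do this with trusted data, since any code in the string would run.
restored = eval(as_text)
p restored == nested
```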

Since:

  • 4.0.0

Class Method Summary

Instance Method Summary

Methods included from Sourceable

is_source?

Methods included from Registry::Fileable

#default_args, #default_file_options, #labeled_options, #options_key, #path_key, #requires_path?

Constructor Details

#initialize(dirpath:, recursive: false, filesuffixes: [".json"]) ⇒ JsonDir

Returns a new instance of JsonDir.

Parameters:

  • dirpath (String)

    path of directory containing JSON files

  • recursive (Boolean) (defaults to: false)

    Whether to include eligible JSON files in subdirectories

  • filesuffixes (Array<String>) (defaults to: [".json"])

    Suffixes of the files to read in as JSON records. Include the preceding period

Since:

  • 4.0.0



# File 'lib/kiba/extend/sources/json_dir.rb', line 75

def initialize(dirpath:, recursive: false, filesuffixes: [".json"])
  @path = File.expand_path(dirpath)
  @recursive = recursive
  @filesuffixes = filesuffixes
end
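How `recursive` and `filesuffixes` could drive file selection is sketched below. The class's own file-listing code is not shown in this documentation, so `list_files` is a hypothetical helper built on `Dir.glob`, and the matching rule (path ends with one of the suffixes) is an assumption.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper, not the class's actual implementation: lists files
# under dirpath whose names end with one of the given suffixes.
def list_files(dirpath, recursive: false, filesuffixes: [".json"])
  pattern = recursive ? File.join(dirpath, "**", "*") : File.join(dirpath, "*")
  Dir.glob(pattern)
    .select { |path| File.file?(path) && path.end_with?(*filesuffixes) }
    .sort
end

found_flat = nil
found_deep = nil
Dir.mktmpdir do |dir|
  # Build a small directory tree to search
  FileUtils.mkdir_p(File.join(dir, "sub"))
  File.write(File.join(dir, "a.json"), "{}")
  File.write(File.join(dir, "b.txt"), "x")
  File.write(File.join(dir, "sub", "c.json"), "{}")

  found_flat = list_files(dir).map { |path| File.basename(path) }
  found_deep = list_files(dir, recursive: true).map { |path| File.basename(path) }
end

p found_flat
p found_deep
```

With `recursive: false` only `a.json` is found; with `recursive: true` the file in the subdirectory is included as well.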

Class Method Details

.default_file_options ⇒ Object

Since:

  • 4.0.0



# File 'lib/kiba/extend/sources/json_dir.rb', line 53

def default_file_options
  nil
end

.options_key ⇒ Object

Since:

  • 4.0.0



# File 'lib/kiba/extend/sources/json_dir.rb', line 57

def options_key
  nil
end

.path_key ⇒ Object

Since:

  • 4.0.0



# File 'lib/kiba/extend/sources/json_dir.rb', line 61

def path_key
  :dirpath
end

.requires_path? ⇒ Boolean

Returns:

  • (Boolean)

Since:

  • 4.0.0



# File 'lib/kiba/extend/sources/json_dir.rb', line 65

def requires_path?
  true
end

Instance Method Details

#each {|jsonhash| ... } ⇒ Object

Note:

If a file cannot be read/parsed as JSON, no Hash is yielded and a warning is written to STDERR (via Kernel#warn)

Yield Parameters:

  • jsonhash (Hash)

    of parsed JSON

Since:

  • 4.0.0



# File 'lib/kiba/extend/sources/json_dir.rb', line 84

def each
  file_list.each do |path|
    jsonhash = parse_json(path)
    if jsonhash
      yield jsonhash
    else
      warn("Cannot read/parse #{path}")
    end
  end
end
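The skip-and-warn behavior of `#each` can be tried in isolation with a small stand-in class. `MiniJsonSource` and its `parse_json` are hypothetical; in particular, returning nil on a read or parse failure is an assumption about how the real private method behaves, inferred from the `if jsonhash` check above.

```ruby
require "json"
require "tmpdir"

# Hypothetical stand-in for the source class, reduced to the #each logic
class MiniJsonSource
  def initialize(paths)
    @paths = paths
  end

  # Yields one parsed Hash per readable file; warns and skips the rest
  def each
    @paths.each do |path|
      jsonhash = parse_json(path)
      if jsonhash
        yield jsonhash
      else
        warn("Cannot read/parse #{path}")
      end
    end
  end

  private

  # Assumption: returns nil instead of raising when the file is missing
  # or does not contain valid JSON
  def parse_json(path)
    JSON.parse(File.read(path))
  rescue JSON::ParserError, SystemCallError
    nil
  end
end

yielded = []
Dir.mktmpdir do |dir|
  good = File.join(dir, "good.json")
  bad = File.join(dir, "bad.json")
  File.write(good, '{"id": 1}')
  File.write(bad, "not json")
  MiniJsonSource.new([good, bad]).each { |hash| yielded << hash }
end

p yielded
```

Only the valid file produces a Hash; the invalid one triggers the warning and is skipped, so the job keeps running.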