Emendate’s output

parse output

By default, Emendate.parse returns an Emendate::Result Ruby object.

d = Emendate.parse('[1997, 1999]')
=> #<Emendate::Result:0x00007fdfb4102db8
 @dates=
  [#<Emendate::ParsedDate:0x00007fdfb4103060
    @certainty=[:inferred],
    @date_end=nil,
    @date_end_full="1997-12-31",
    @date_start=nil,
    @date_start_full="1997-01-01",
    @inclusive_range=nil,
    @index_dates=[],
    @original_string=nil>,
   #<Emendate::ParsedDate:0x00007fdfb4102ef8
    @certainty=[:inferred],
    @date_end=nil,
    @date_end_full="1999-12-31",
    @date_start=nil,
    @date_start_full="1999-01-01",
    @inclusive_range=nil,
    @index_dates=[],
    @original_string=nil>],
 @errors=[],
 @original_string="[1997, 1999]",
 @warnings=[]>

To return the result as a Hash:

d = Emendate.parse('[1997, 1999]').to_h
=> {:original_string=>"[1997, 1999]",
 :dates=>
  [{:original_string=>nil,
    :index_dates=>[],
    :date_start=>nil,
    :date_end=>nil,
    :date_start_full=>"1997-01-01",
    :date_end_full=>"1997-12-31",
    :inclusive_range=>nil,
    :certainty=>[:inferred]},
   {:original_string=>nil,
    :index_dates=>[],
    :date_start=>nil,
    :date_end=>nil,
    :date_start_full=>"1999-01-01",
    :date_end_full=>"1999-12-31",
    :inclusive_range=>nil,
    :certainty=>[:inferred]}],
 :errors=>[],
 :warnings=>[]}

Or, to return the result as JSON:

d = Emendate.parse('[1997, 1999]').to_json
=> "{\"original_string\":\"[1997, 1999]\",\"dates\":[{\"original_string\":null,\"index_dates\":[],\"date_start\":null,\"date_end\":null,\"date_start_full\":\"1997-01-01\",\"date_end_full\":\"1997-12-31\",\"inclusive_range\":null,\"certainty\":[\"inferred\"]},{\"original_string\":null,\"index_dates\":[],\"date_start\":null,\"date_end\":null,\"date_start_full\":\"1999-01-01\",\"date_end_full\":\"1999-12-31\",\"inclusive_range\":null,\"certainty\":[\"inferred\"]}],\"errors\":[],\"warnings\":[]}"
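The JSON form is a plain String, so it can be read back with Ruby's standard json library. The following is a minimal sketch, not part of Emendate: it uses a shortened copy of the JSON above so that it runs without the gem, and `restore_result` is a hypothetical helper name. JSON has no Symbol type, so the certainty values come back as Strings and are converted by hand here.

```ruby
require 'json'

# Shortened copy of the JSON shown above (first date only), so this runs without Emendate.
SAMPLE_JSON = '{"original_string":"[1997, 1999]","dates":[{"original_string":null,' \
              '"index_dates":[],"date_start":null,"date_end":null,' \
              '"date_start_full":"1997-01-01","date_end_full":"1997-12-31",' \
              '"inclusive_range":null,"certainty":["inferred"]}],"errors":[],"warnings":[]}'

# Hypothetical helper: parse the JSON and restore the Symbol keys and
# Symbol certainty values used by the Hash form.
def restore_result(json)
  result = JSON.parse(json, symbolize_names: true)
  result[:dates].each { |d| d[:certainty] = d[:certainty].map(&:to_sym) }
  result
end

restored = restore_result(SAMPLE_JSON)
puts restored[:dates].first[:certainty].inspect # prints [:inferred]
```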

The Emendate result Hash has the following keys:

:original_string

the full original string passed in for parsing

:errors

populated only if :dates is empty. Gives some indication of why parsing failed.

:warnings

may be populated if :dates is populated.

:dates

an Array of parsed date values
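As a rough illustration of consuming these keys, the sketch below turns each entry in :dates into a Ruby Range of Date objects. It works on a literal copy of the Hash shown earlier instead of calling Emendate, and `date_ranges` is a hypothetical helper, not something the gem provides.

```ruby
require 'date'

# Literal copy of the to_h output shown above, so this runs without Emendate.
SAMPLE_RESULT = {
  original_string: '[1997, 1999]',
  dates: [
    { original_string: nil, index_dates: [], date_start: nil, date_end: nil,
      date_start_full: '1997-01-01', date_end_full: '1997-12-31',
      inclusive_range: nil, certainty: [:inferred] },
    { original_string: nil, index_dates: [], date_start: nil, date_end: nil,
      date_start_full: '1999-01-01', date_end_full: '1999-12-31',
      inclusive_range: nil, certainty: [:inferred] }
  ],
  errors: [],
  warnings: []
}.freeze

# Hypothetical helper: one Range of Date objects per parsed date.
def date_ranges(result)
  result[:dates].map do |d|
    Date.iso8601(d[:date_start_full])..Date.iso8601(d[:date_end_full])
  end
end

ranges = date_ranges(SAMPLE_RESULT)
puts ranges.first.cover?(Date.new(1997, 6, 15)) # prints true
```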

Comparison with Timetwister result

The Emendate.parse command’s output is based on Timetwister’s output, with some important differences.

Timetwister.parse returns an Array of Hashes, with one Hash per date or date range identified in the parsed date String:

r = Timetwister.parse('[1997, 1999]')
 => [{:original_string=>"[1997", :index_dates=>[1997], :date_start=>"1997", :date_end=>"1997", :date_start_full=>"1997-01-01", :date_end_full=>"1997-12-31", :inclusive_range=>nil, :certainty=>"inferred", :test_data=>"70"}, {:original_string=>" 1999]", :index_dates=>[1999], :date_start=>"1999", :date_end=>"1999", :date_start_full=>"1999-01-01", :date_end_full=>"1999-12-31", :inclusive_range=>nil, :certainty=>"inferred", :test_data=>"70"}]

Emendate’s dates Array is quite similar to Timetwister’s result, with the following differences:

  • Each individual date Hash does not contain an :original_string value, at least initially, because the original string is not split up into discrete date values.

  • Only :date_start_full and :date_end_full are initially provided.

  • :index_dates is not initially provided, given that I don’t currently have a direct use case for it.

  • :test_data is not included

  • :certainty is an array of applicable certainty values: approximate, inferred, uncertain, instead of Timetwister’s one string value

r = Timetwister.parse('[circa 1920?]')
 => [{:original_string=>"[circa 1920?]", :index_dates=>[1920], :date_start=>"1920", :date_end=>"1920", :date_start_full=>"1920-01-01", :date_end_full=>"1920-12-31", :inclusive_range=>nil, :certainty=>"questionable", :test_data=>"70"}]

Emendate.parse('[circa 1920?]')
=> #<Emendate::Result:0x00007fdfd4aaabc0
 @dates=
  [#<Emendate::ParsedDate:0x00007fdfd4aaaee0
    @certainty=[:inferred, :approximate, :uncertain],
    @date_end=nil,
    @date_end_full="1920-12-31",
    @date_start=nil,
    @date_start_full="1920-01-01",
    @inclusive_range=nil,
    @index_dates=[],
    @original_string=nil>],
 @errors=[],
 @original_string="[circa 1920?]",
 @warnings=[]>

If the date cannot be parsed, Timetwister returns a date Hash with all values empty or nil except :original_string. In this case Emendate returns an empty Array for :dates, making it clearer whether a usable parsed date was returned.

Timetwister.parse('n.d.')
 => [{:original_string=>"n.d.", :index_dates=>[], :date_start=>nil, :date_end=>nil, :date_start_full=>nil, :date_end_full=>nil, :inclusive_range=>nil, :certainty=>nil}]

Emendate.parse('n.d.')
=> #<Emendate::Result:0x00007fdfb41682a8 @dates=[], @errors=[], @original_string="n.d.", @warnings=[]>
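The practical difference shows up in how calling code checks for a usable date. The sketch below compares the two checks using literal copies of the 'n.d.' outputs above. The Emendate Hash assumes that to_h mirrors the Result object shown, and both helper methods are hypothetical.

```ruby
# Timetwister-style output for 'n.d.', copied from the example above.
TIMETWISTER_ND = [
  { original_string: 'n.d.', index_dates: [], date_start: nil, date_end: nil,
    date_start_full: nil, date_end_full: nil, inclusive_range: nil, certainty: nil }
].freeze

# Emendate-style Hash for 'n.d.' (assumes to_h mirrors the Result shown above).
EMENDATE_ND = { original_string: 'n.d.', dates: [], errors: [], warnings: [] }.freeze

# Timetwister-style: each date Hash has to be inspected for nil values.
def timetwister_usable?(dates)
  dates.any? { |d| d[:date_start_full] && d[:date_end_full] }
end

# Emendate-style: an empty :dates Array is the whole test.
def emendate_usable?(result)
  !result[:dates].empty?
end

puts timetwister_usable?(TIMETWISTER_ND) # prints false
puts emendate_usable?(EMENDATE_ND)       # prints false
```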

Examples of Emendate errors and warnings

Errors indicate that the date could not be parsed at all. This usually means either that the string is not a date, or that Emendate doesn’t yet understand the given date format.

Warnings indicate that the date was parsed, but that you may want to verify the result is as expected.

Emendate.parse('not a date')
=> #<Emendate::Result:0x00007fdfd4bc4da8
 @dates=[],
 @errors=[#<Emendate::UntokenizableError: “not”, “a”, “date”>],
 @original_string="not a date",
 @warnings=[]>

Emendate.parse('2/3/2021')
=> #<Emendate::Result:0x00007fdfd4ae6c38
 @dates=
  [#<Emendate::ParsedDate:0x00007fdfd4ae6da0
    @certainty=[],
    @date_end=nil,
    @date_end_full="2021-02-03",
    @date_start=nil,
    @date_start_full="2021-02-03",
    @inclusive_range=nil,
    @index_dates=[],
    @original_string=nil>],
 @errors=[],
 @original_string="2/3/2021",
 @warnings=["Ambiguous month/day treated as_month_day"]>
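In a batch workflow, the distinction between errors and warnings might be used to sort results into "failed", "needs review", and "ok". The sketch below does this on Hashes shaped like the to_h output. The `triage` helper is hypothetical, the sample Hashes are abbreviated, and the error entry is a stand-in String, not the real error object.

```ruby
# Abbreviated stand-ins for the to_h form of the two results above.
NOT_A_DATE = {
  original_string: 'not a date',
  dates: [],
  errors: ['UntokenizableError (stand-in String)'],
  warnings: []
}.freeze

AMBIGUOUS = {
  original_string: '2/3/2021',
  dates: [{ date_start_full: '2021-02-03', date_end_full: '2021-02-03', certainty: [] }],
  errors: [],
  warnings: ['Ambiguous month/day treated as_month_day']
}.freeze

# Hypothetical helper: classify a result Hash for follow-up.
def triage(result)
  return :failed if result[:dates].empty?        # nothing parsed; see :errors
  return :review unless result[:warnings].empty? # parsed, but verify the result

  :ok
end

puts triage(NOT_A_DATE) # prints failed
puts triage(AMBIGUOUS)  # prints review
```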

translate output

Important
Still very much under development!

Receiving an Emendate::Result can be useful for integration with your custom data workflow.

However, one of the purposes of Emendate is to translate from some (or no!) date format to another date format. For this purpose, a simpler, pre-processed result is more useful.

For this use case, you can call translate (with the dialect option set), and you will receive an Emendate::Translation object with the following attributes:

orig

the original String given

result

typically a String, but some dialects may return another object type.

warnings

array of warning messages

Note

If the original string could not be parsed or cannot be translated, Emendate::Translation.result will be an empty String (or whatever the equivalent of empty is for the object type returned by the specified dialect). Warnings indicating why a result could not be returned will be in Emendate::Translation.warnings.
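Since translate is still under development, the sketch below does not call it. Instead it uses a Struct stand-in with the three documented attributes to show how calling code might test for an empty result. `FakeTranslation`, `usable_result`, and all sample values (including the warning text) are assumptions made for illustration.

```ruby
# Stand-in with the three documented attributes; NOT the gem's class.
FakeTranslation = Struct.new(:orig, :result, :warnings, keyword_init: true)

# Hypothetical helper: return the translated value, or nil when the result
# is empty (an empty String, or the dialect's equivalent empty object).
def usable_result(translation)
  result = translation.result
  return nil if result.nil?
  return nil if result.respond_to?(:empty?) && result.empty?

  result
end

# Made-up sample values for illustration only.
translated = FakeTranslation.new(orig: '2/3/2021', result: '2021-02-03', warnings: [])
untranslated = FakeTranslation.new(orig: 'not a date', result: '',
                                   warnings: ['could not be parsed (made-up message)'])

puts usable_result(translated).inspect   # prints "2021-02-03"
puts usable_result(untranslated).inspect # prints nil
```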

target dialects for translation

:lyrasis_pseudo_edtf

A pseudo-standard format developed by LYRASIS, documented here. Provides a way to record dates in legacy (currently hosted) Islandora that will be unambiguous to translate into EDTF for migration into the new Islandora.

Emendate::Translation.result is a String

:edtf

Emendate::Translation.result is a String containing a valid EDTF value.

:collectionspace

Emendate::Translation.result is (tbd) ready for passing into collectionspace-mapper