LatinVerb

Having decided to do the conjugation work in Ruby, I first tried the idea out in IRB.

>> %w(s t mus tis nt).collect{ |x| "ama#{x}" }.unshift('amo')
=> ["amo", "amas", "amat", "amamus", "amatis", "amant"]

From this initial naïve implementation, I decided to focus on capturing the macrons and the macron-shortening rules. This required the ability to preserve Unicode macron characters and then operate on them. Ruby 1.8's support for Unicode has some real issues: a number of String methods work on bytes and return a numeric value rather than a character. I therefore wrote a Unicode string abstraction, followed by the macron-handling functions. Once these were in place, I was able to keep a reliable representation of the core “givens” of a verb. This library is called LatinVerb. Here's a pretty-print of what a Latin::LatinVerb looks like:

>> pp @aFirst
#<Latin::LatinVerb:0x3778a0
@collections=[],
@conjugation="1",
@first_pers_perf=
 #<Latin::LatinWord:0x3777ec
  @USS="am\304\201v\304\253",
  @detect_multibyte_array=[0, 0, 1, 0, 1],
  @length=5,
  @multibyte=true,
  @original_string="am\304\201v\304\253",
  @sanitized_array=["a", "m", "\304\201", "v", "\304\253"],
  @uncertain_string_as_array=[97, 109, 257, 118, 299]>,
@first_pers_singular=
 #<Latin::LatinWord:0x377828
  @USS="am\305\215",
  @detect_multibyte_array=[0, 0, 1],
  @length=3,
  @multibyte=true,
  @original_string="am\305\215",
  @sanitized_array=["a", "m", "\305\215"],
  @uncertain_string_as_array=[97, 109, 333]>,
@four_pp=
 [#<Latin::LatinWord:0x377828
   @USS="am\305\215",
   @detect_multibyte_array=[0, 0, 1],
   @length=3,
   @multibyte=true,
   @original_string="am\305\215",
   @sanitized_array=["a", "m", "\305\215"],
   @uncertain_string_as_array=[97, 109, 333]>,
  #<Latin::LatinWord:0x377814
   @USS="am\304\201re",
   @detect_multibyte_array=[0, 0, 1, 0, 0],
   @length=5,
   @multibyte=true,
   @original_string="am\304\201re",
   @sanitized_array=["a", "m", "\304\201", "r", "e"],
   @uncertain_string_as_array=[97, 109, 257, 114, 101]>,
  #<Latin::LatinWord:0x3777ec
   @USS="am\304\201v\304\253",
   @detect_multibyte_array=[0, 0, 1, 0, 1],
   @length=5,
   @multibyte=true,
   @original_string="am\304\201v\304\253",
   @sanitized_array=["a", "m", "\304\201", "v", "\304\253"],
   @uncertain_string_as_array=[97, 109, 257, 118, 299]>,
  #<Latin::LatinWord:0x3774cc
   @USS="amatum",
   @detect_multibyte_array=[0, 0, 0, 0, 0, 0],
   @length=6,
   @multibyte=false,
   @original_string="amatum",
   @sanitized_array=["a", "m", "a", "t", "u", "m"],
   @uncertain_string_as_array=[97, 109, 97, 116, 117, 109]>],
@participial_stem=
 #<Latin::LatinWord:0x3762c0
  @USS="am\304\201",
  @detect_multibyte_array=[0, 0, 1],
  @length=3,
  @multibyte=true,
  @original_string="am\304\201",
  @sanitized_array=["a", "m", "\304\201"],
  @uncertain_string_as_array=[97, 109, 257]>,
@pass_perf_part=
 #<Latin::LatinWord:0x3774cc
  @USS="amatum",
  @detect_multibyte_array=[0, 0, 0, 0, 0, 0],
  @length=6,
  @multibyte=false,
  @original_string="amatum",
  @sanitized_array=["a", "m", "a", "t", "u", "m"],
  @uncertain_string_as_array=[97, 109, 97, 116, 117, 109]>,
@pres_act_inf=
 #<Latin::LatinWord:0x377814
  @USS="am\304\201re",
  @detect_multibyte_array=[0, 0, 1, 0, 0],
  @length=5,
  @multibyte=true,
  @original_string="am\304\201re",
  @sanitized_array=["a", "m", "\304\201", "r", "e"],
  @uncertain_string_as_array=[97, 109, 257, 114, 101]>,
@stem=
 #<Latin::LatinWord:0x376374
  @USS="am\304\201",
  @detect_multibyte_array=[0, 0, 1],
  @length=3,
  @multibyte=true,
  @original_string="am\304\201",
  @sanitized_array=["a", "m", "\304\201"],
  @uncertain_string_as_array=[97, 109, 257]>,
@voice_mood_matrix=
 #<TenseBlock:0x376694
  @boundaries=["voice", "mood"],
  @default_action=#<Proc:0x000247ac@./latin/LatinVerb.rb:156>,
  @definition=
   {:default_p=>#<Proc:0x000247ac@./latin/LatinVerb.rb:156>,
    :mood=>["Indicative", "Subjunctive"],
    :tense=>nil,
    :voice=>["Active", "Passive"],
    :boundaries=>"voice by mood"},
  @matrix=
   ["active_voice_indicative_mood",
    "active_voice_subjunctive_mood",
    "passive_voice_indicative_mood",
    "passive_voice_subjunctive_mood"],
  @tense_wrapping_lambda=#<Proc:0x00016a80@./language/TenseBlock.rb:83>>>	
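
The fields in the dump above (@sanitized_array, @uncertain_string_as_array, @length) suggest how a word is broken into characters and codepoints. What follows is a minimal sketch of that idea, not LatinVerb's actual code: decompose, shorten_macrons and MACRON_TO_PLAIN are hypothetical names, the sketch assumes Ruby 1.9 or later with UTF-8 source, and the shortening rule used (a long vowel shortens before final -m, -r, -t and before -nt or -nd) is the usual textbook rule, assumed here because the text does not spell out which rules the library applies.

```ruby
# Hypothetical helpers for illustration; these are not LatinVerb's API.
MACRON_TO_PLAIN = {
  "ā" => "a", "ē" => "e", "ī" => "i", "ō" => "o", "ū" => "u"
}.freeze

# Split a UTF-8 string into Unicode codepoints and one-character strings.
def decompose(word)
  codepoints = word.unpack("U*")                     # e.g. [97, 109, 257]
  chars      = codepoints.map { |cp| [cp].pack("U") }
  { :codepoints => codepoints, :chars => chars, :length => chars.length }
end

# Assumed rule: shorten a long vowel before final m, r or t,
# and before nt or nd anywhere in the word.
def shorten_macrons(word)
  word = word.sub(/([āēīōū])([mrt])\z/) { MACRON_TO_PLAIN[$1] + $2 }
  word.gsub(/([āēīōū])(n[td])/) { MACRON_TO_PLAIN[$1] + $2 }
end
```

For example, decompose("amāvī")[:codepoints] yields the same numbers shown in @uncertain_string_as_array for the first person perfect, and shorten_macrons("amābām") gives amābam while leaving amābās untouched.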

The next step was to figure out how to access all the individual nodes of every voice/mood/tense vector. These individual nodes are sub-divided into person and number. During an idea jam at the Lone Star Ruby Conference, I decided that a flexible, metaprogramming-based resolution would be best.

  def method_missing(id, *args)    
    # We expect that method calls will be made to the object in the form:
    # V.active_voice_indicative_mood_present_tense.  Instead of having to 
    # create all these methods, we will look for unknown methods containing 
    # the pattern _voice_
 
    if id.to_s =~ /_voice_/
      
      @vector = id.to_s
      
      # This reset needs to be done on every call to the metaprogrammed
      # method, in case a verb is called for a tense vector and then for
      # a node vector.  The @number and @person filled in by the first
      # call would otherwise cause the second call to do the wrong thing.
      
      @number = @person = nil
      evaluate_method(id)

      # In the case that the instance has been used before to get 
      # a specific vector, then we want to clear it out
      @collections=[] if  not @person.nil? and not @collections.nil?
      
      generated_method = [@voice,@mood,@tense].join('_').to_sym
      
      raise("Method #{generated_method} is not responded to!") unless
        self.respond_to?(generated_method)

      raise("FLAMING DETH:  pass to handler method returned nothing!") if
        self.send(generated_method).nil?
      
        
      @collections << 
        conjoin_nodes_with_labels(
          self.send(generated_method),
          TenseBlock.new( {
                              :boundaries => 'numbers by persons',
                              :numbers    => %w(Singular Plural),
                              :persons    => %w(First Second Third),
                              :tense      => 'present',
                            } 
                          )
        )
    else
      super(id, *args)
    end
  end	
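
To show the dispatch idea in isolation, here is a small self-contained sketch. It is an illustration under assumptions, not the library's implementation: ToyVerb, parse_vector and the hard-coded table of forms are hypothetical, and the real LatinVerb computes its forms rather than looking them up. The sketch assumes Ruby 1.9.2 or later, since it relies on respond_to_missing?.

```ruby
# Toy illustration only: ToyVerb and its table of forms are hypothetical
# and stand in for the real conjugation logic.
class ToyVerb
  PERSONS = %w(first second third)
  NUMBERS = %w(singular plural)

  def initialize(forms)
    @forms = forms # e.g. "active_indicative_present" => six forms
  end

  # Break "active_voice_indicative_mood_present_tense" into labelled
  # parts, regardless of the order in which the parts appear.
  def parse_vector(name)
    parts = {}
    name.to_s.scan(/([a-z]+)_(voice|mood|tense|number|person)/) do |value, kind|
      parts[kind.to_sym] = value
    end
    parts
  end

  def method_missing(id, *args)
    # Anything that does not look like a vector goes to the default handler.
    return super unless id.to_s =~ /_voice_/

    parts = parse_vector(id)
    key   = [parts[:voice], parts[:mood], parts[:tense]].join("_")
    forms = @forms[key]
    raise NoMethodError, "no forms known for #{key}" if forms.nil?

    # Without both number and person, return the whole tense.
    return forms unless parts[:number] && parts[:person]

    # Forms are ordered singular 1-2-3, then plural 1-2-3.
    forms[NUMBERS.index(parts[:number]) * 3 + PERSONS.index(parts[:person])]
  end

  # Keep respond_to? consistent with method_missing.
  def respond_to_missing?(id, include_private = false)
    id.to_s.include?("_voice_") || super
  end
end

verb = ToyVerb.new(
  "active_indicative_present" => %w(amō amās amat amāmus amātis amant)
)
verb.active_voice_indicative_mood_present_tense
verb.indicative_mood_active_voice_present_tense_plural_number_first_person
```

Because the parts are matched by their suffix (_voice, _mood, and so on), the caller may list them in any order, which is the flexibility the transcripts on this page rely on.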

By setting up just a few methods and creating a display block class (TenseBlock), I was able to cover all the tenses. In LatinIRB, this lets us flexibly load either an entire tense of nodes, e.g. “present tense,” or a single “fully qualified vector,” e.g. “active voice, indicative mood, present tense, singular number, first person,” i.e. amō.
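
The label lists that TenseBlock holds (the @matrix entries in the dump earlier, and names such as singular_number_first_person) look like a cross product of two labelled axes described by a "boundaries" string. A rough sketch of how such a list could be built follows. label_matrix is a hypothetical helper written for this page, not TenseBlock's actual code, and it assumes Ruby 1.9.2 or later for flat_map.

```ruby
# Hypothetical helper: build "a_by_b" method-name labels from a definition
# hash shaped like the ones passed to TenseBlock.new on this page.
def label_matrix(definition)
  # "voice by mood" => ["voice", "mood"]
  first, second = definition[:boundaries].split(/\s+by\s+/)

  definition[first.to_sym].flat_map do |a|
    definition[second.to_sym].map do |b|
      # Axis names may be plural ("numbers"); the labels use the singular.
      "#{a.downcase}_#{first.chomp('s')}_#{b.downcase}_#{second.chomp('s')}"
    end
  end
end

label_matrix(
  :boundaries => "numbers by persons",
  :numbers    => %w(Singular Plural),
  :persons    => %w(First Second Third)
)
```

With the "voice by mood" definition from the dump, the same helper produces the four names listed under @matrix.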

>> puts @aFirst.definition_string
amō, amāre, amāvī, amatum
=> nil
>> @aFirst.indicative_mood_active_voice_imperfect_tense

=> [#<Latin::LatinTense:0x53c9c4
@verb_methods=["singular_number_first_person",
"singular_number_second_person",
"singular_number_third_person",
"plural_number_first_person",
"plural_number_second_person",
"plural_number_third_person"], @label="Indicative mood active
voice imperfect tense", @aggregate_nodes=[#<Latin::LatinWord:0x546e9c
@sanitized_array=["a", "m", "\304\201",
"b", "a", "m"], @multibyte=true,
@original_string="am\304\201bam", @length=6,
@USS="am\304\201bam", @detect_multibyte_array=[0, 0, 1, 0, 0, 0],
@uncertain_string_as_array=[97, 109, 257, 98, 97, 109]>,
#<Latin::LatinWord:0x546e88 @sanitized_array=["a", "m",
"\304\201", "b", "\304\201", "s"],
@multibyte=true, @original_string="am\304\201b\304\201s", @length=6,
@USS="am\304\201b\304\201s", @detect_multibyte_array=[0, 0, 1, 0, 1,
0], @uncertain_string_as_array=[97, 109, 257, 98, 257, 115]>,
#<Latin::LatinWord:0x54680c @sanitized_array=["a", "m",
"\304\201", "b", "a", "t"],
@multibyte=true, @original_string="am\304\201bat", @length=6,
@USS="am\304\201bat", @detect_multibyte_array=[0, 0, 1, 0, 0, 0],
@uncertain_string_as_array=[97, 109, 257, 98, 97, 116]>,
#<Latin::LatinWord:0x545d1c @sanitized_array=["a", "m",
"\304\201", "b", "\304\201", "m",
"u", "s"], @multibyte=true,
@original_string="am\304\201b\304\201mus", @length=8,
@USS="am\304\201b\304\201mus", @detect_multibyte_array=[0, 0, 1, 0,
1, 0, 0, 0], @uncertain_string_as_array=[97, 109, 257, 98, 257, 109, 117,
115]>, #<Latin::LatinWord:0x5451dc @sanitized_array=["a",
"m", "\304\201", "b", "\304\201",
"t", "i", "s"], @multibyte=true,
@original_string="am\304\201b\304\201tis", @length=8,
@USS="am\304\201b\304\201tis", @detect_multibyte_array=[0, 0, 1, 0,
1, 0, 0, 0], @uncertain_string_as_array=[97, 109, 257, 98, 257, 116, 105,
115]>, #<Latin::LatinWord:0x544598 @sanitized_array=["a",
"m", "\304\201", "b", "a",
"n", "t"], @multibyte=true,
@original_string="am\304\201bant", @length=7,
@USS="am\304\201bant", @detect_multibyte_array=[0, 0, 1, 0, 0, 0,
0], @uncertain_string_as_array=[97, 109, 257, 98, 97, 110, 116]>]]
>> puts @aFirst
Indicative mood active voice imperfect tense:
amābam, amābās, amābat, amābāmus,
amābātis, amābant
#<Latin::LatinVerb:0x377864>
=> nil
>> @aFirst.subjunctive_mood_active_voice_pastperfect_tense_first_person_plural_number
=> [#<Latin::LatinWord:0x375d48 @sanitized_array=["a", "m",
"\304\201", "v", "i", "s", "s", "\304\223", "m", "u", "s"],
@multibyte=true, @original_string="am\304\201viss\304\223mus", @length=11,
@USS="am\304\201viss\304\223mus", @detect_multibyte_array=[0, 0, 1, 0, 0,
0, 0, 1, 0, 0, 0], @uncertain_string_as_array=[97, 109, 257, 118, 105, 115,
115, 275, 109, 117, 115]>]
>> puts @aFirst
amāvissēmus
#<Latin::LatinVerb:0x377864>
=> nil

The above example was produced using the LatinIRB tool, a natural outgrowth of building these libraries.

To make the tool more universally usable, I then ported this application to the Ruby on Rails stack. That project is Lingua Latina in Viis Ferrorum.

LatinVerb was written by Steven G. Harms because he loves Latin and Ruby and wanted to carry one less book in his bookbag. Icons provided by pinvoke.