Machine Translation

My naam is Dwayne en ek is van Suid-Afrika (Afrikaans: "My name is Dwayne and I am from South Africa")

Stage 1: Morphological analysis: each surface form returns the lemma.

So...
My +<Pos>
naam +noun +singular
is (verb; lemma is the infinitive, "to be") +verb +<present tense>
Dwayne +name +<proper noun>
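The lookup described above can be sketched in a few lines of Python. This is only an illustration: the lexicon entries, the tag names (`det`, `n`, `vblex`, `np`, ...) and the `surface/lemma<tag>` output style are assumptions made for the example, not a real tagset or tool. The lemma `wees` is the Afrikaans infinitive "to be".

```python
# Toy lexicon: surface form -> list of (lemma, tags) analyses.
# Entries and tag names are illustrative assumptions, not a real tagset.
LEXICON = {
    "my": [("my", ["det", "pos"])],
    "naam": [("naam", ["n", "sg"])],
    "is": [("wees", ["vblex", "pres"])],  # lemma is the infinitive "to be"
    "dwayne": [("Dwayne", ["np"])],
}


def analyse(sentence):
    """Return, for each token, every (lemma, tags) analysis in the lexicon.

    Tokens missing from the lexicon get a single 'unknown' analysis.
    """
    result = []
    for token in sentence.split():
        analyses = LEXICON.get(token.lower(), [(token, ["unknown"])])
        result.append((token, analyses))
    return result


def render(token, analyses):
    """Format the analyses of one token as 'surface/lemma<tag><tag>'."""
    readings = [
        lemma + "".join("<{}>".format(tag) for tag in tags)
        for lemma, tags in analyses
    ]
    return token + "/" + "/".join(readings)


for token, analyses in analyse("My naam is Dwayne"):
    print(render(token, analyses))
```

A word with several possible analyses would simply carry more than one entry in its list, which is what the later disambiguation step has to resolve.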

Reka koja protice kroz ovaj grad se zove Amstel
The river that runs through this city is called Amstel

1. Morphological analysis: takes 2-3 months to reach close to 80% coverage
   a. List the base forms
   b. List the inflectional patterns of each word
2. Part-of-speech disambiguation
   a. Rule-based: e.g. this preposition governs the accusative, so remove the nominative reading
   b. Statistical: using a manually disambiguated corpus (tags such as accusative, pronoun), disambiguate based on the part of speech
3. Chunk the output into pseudo-phrases (phrase, relative, verb, prepositional phrase, pronoun, verb, noun phrase)
4. Translate the words (word-for-word translation, or multiword units translated as one unit, e.g. "compact disc")
5. So...

reka = river
koja = that/which
protice = to flow
kroz = through
ovaj = this
grad = city/hail/(weapon system): semantic vs. syntactic ambiguity
se = itself (reflexive, "calls itself")
zove = be (is) called
Amstel = Amstel

6. Transfer = inflect, make passive. Leave or add
7. Generation
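The word-translation step in the list above (word for word, with multiword units handled as one) can be sketched as a longest-match dictionary lookup. This is a minimal sketch, not a description of any particular system: the dictionary holds only the glosses from these notes, the relative pronoun is spelled "koja" here, "se zove" is treated as a multiword unit purely for illustration, and marking unknown words with `*` is an assumption.

```python
# Toy bilingual dictionary built from the glosses in these notes.
# Keys are tuples of lower-cased words so multiword units translate as one.
BIDIX = {
    ("reka",): "river",
    ("koja",): "that",
    ("protice",): "flow",
    ("kroz",): "through",
    ("ovaj",): "this",
    ("grad",): "city",
    ("se", "zove"): "is called",  # multiword unit, translated as a whole
    ("amstel",): "Amstel",
}
MAX_LEN = max(len(key) for key in BIDIX)


def translate(tokens):
    """Translate left to right, always trying the longest dictionary match."""
    out = []
    i = 0
    while i < len(tokens):
        for n in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            key = tuple(t.lower() for t in tokens[i:i + n])
            if key in BIDIX:
                out.append(BIDIX[key])
                i += n
                break
        else:
            # No entry of any length: pass the word through, marked unknown.
            out.append("*" + tokens[i])
            i += 1
    return out


source = "Reka koja protice kroz ovaj grad se zove Amstel".split()
print(" ".join(translate(source)))
# river that flow through this city is called Amstel
```

The output is uninflected and has no articles ("river that flow ..."), which is the gap the transfer and generation steps are meant to close.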

Dealing with Ambiguity
1. Tagged corpus: human disambiguation, freely available and distributable
2. Rule-based
3. Statistical
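As a rough illustration of the statistical option, the sketch below picks, for an ambiguous word, the reading that a human chose most often in a hand-disambiguated corpus. The corpus contents and tag names are invented for the example, and real taggers also use the surrounding context rather than simple per-word counts.

```python
from collections import Counter

# Tiny hand-disambiguated corpus of (surface form, chosen tag) pairs.
# The contents are invented for this example.
TAGGED_CORPUS = [
    ("grad", "n.acc"), ("grad", "n.nom"), ("grad", "n.acc"),
    ("zove", "vblex"), ("zove", "vblex"),
]

counts = Counter(TAGGED_CORPUS)


def pick_reading(word, candidates):
    """Choose the candidate tag seen most often with this word.

    For unseen words or ties, max() keeps the first candidate listed.
    """
    return max(candidates, key=lambda tag: counts[(word, tag)])


print(pick_reading("grad", ["n.nom", "n.acc"]))     # n.acc
print(pick_reading("zove", ["n.pl.nom", "vblex"]))  # vblex
```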

Recommendation
1. A project to create one linguistic data resource, "Free Linguistic Data", to address the challenges of accessing data that in some cases may be taxpayer-funded anyway.
2. Crafting a policy advocacy statement by bringing together activists and advocates for languages
3. Open is not enough: education and research clauses in contracts may be restrictive; the data should be completely open source

Interest in working with smaller languages
Looking for available morphological resources (texts and analyzers), bilingual dictionaries
Tagged bilingual word lists

Lemma = citation form, base form of the word, the word you look up in the dictionary

reka: N F Nom (noun, feminine, nominative)
koja: Rel
protice: Verb + lemma
kroz: Preposition
ovaj: Demonstrative, masculine, singular, nominative and accusative, determiner
grad: Masculine noun (5 or 6 cases depending on the Slavic language)
  "The boy hit the dog" or "the dog hit the boy": the word order is not important because of the cases, while in English it does change the meaning.
grad: Noun, masculine, singular, nominative, accusative
zove: verb, singular; could be noun, plural, nominative
Amstel: Noun, masculine, singular,
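The rule-based disambiguation mentioned in these notes (a preposition governs the accusative, so the nominative reading is removed) can be sketched on the "kroz ovaj grad" fragment. This is a simplified illustration: the tag names, the `GOVERNS` table and the way the rule stops at the first word without any case reading are assumptions made for the example.

```python
CASES = {"nom", "acc"}
# Hypothetical table of prepositions and the case each one governs.
GOVERNS = {"kroz": "acc"}


def disambiguate(tokens):
    """Drop readings whose case conflicts with a preceding preposition.

    tokens is a list of (surface, readings), each reading a set of tags.
    The rule applies to the words after the preposition until a word with
    no case-bearing reading ends the phrase. If filtering would remove
    every reading, the original readings are kept.
    """
    result = []
    required = None  # case demanded by the most recent preposition
    for surface, readings in tokens:
        if required is not None:
            if any(reading & CASES for reading in readings):
                kept = [r for r in readings if required in r]
                readings = kept or readings
            else:
                required = None  # the prepositional phrase has ended
        if surface.lower() in GOVERNS:
            required = GOVERNS[surface.lower()]
        result.append((surface, readings))
    return result


sentence = [
    ("kroz", [{"pr"}]),
    ("ovaj", [{"det", "m", "sg", "nom"}, {"det", "m", "sg", "acc"}]),
    ("grad", [{"n", "m", "sg", "nom"}, {"n", "m", "sg", "acc"}]),
    ("se", [{"prn", "ref"}]),
]
for surface, readings in disambiguate(sentence):
    print(surface, [sorted(r) for r in readings])
```

After the rule runs, "ovaj" and "grad" each keep only their accusative reading, while "kroz" and "se" are unchanged.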