By Anna Feldman

While supervised corpus-based methods are highly accurate for various NLP tasks, including morphological tagging, they are difficult to port to other languages because they require resources that are expensive to create. As a result, many languages have no realistic prospect of morpho-syntactic annotation in the foreseeable future. The method presented in this book aims to overcome this problem by significantly limiting the necessary data and instead extrapolating the relevant information from another, related language. The approach has been tested on Catalan, Portuguese, and Russian. Although these languages are only relatively resource-poor, the same method can in principle be applied to any inflected language, as long as an annotated corpus of a related language is available. The time needed to adjust the system to a new language is a fraction of the time needed for systems with extensive, manually created resources: days instead of years. This book touches upon a number of topics: typology, morphology, corpus linguistics, contrastive linguistics, linguistic annotation, computational linguistics, and Natural Language Processing (NLP). Researchers and students who are interested in these scientific areas, as well as in cross-lingual studies and applications, will greatly benefit from this work. Scholars and practitioners in computer science and linguistics are the prospective readers of this book.



Best study & teaching books

Advanced Mathematical Thinking (Mathematics Education Library)

This book is the first major study of advanced mathematical thinking as performed by mathematicians and taught to students in senior high school and university. Its three main parts focus on the nature of advanced mathematical thinking, the theory of its cognitive development, and reviews of cognitive research.

How to Teach English Language Learners: Effective Strategies from Outstanding Educators, Grades K-6 (Jossey-Bass Teacher)

This hands-on book offers teachers a much-needed resource that will help maximize learning for English Language Learners (ELLs). How to Teach English Language Learners draws on wide-ranging teacher quality studies and profiles eight educators who have achieved exceptional results with their ELL students.

501 Geometry Questions

This comprehensive guide is designed for anyone needing additional practice while studying or refreshing geometry skills. Like existing titles in the series, 501 Geometry Questions helps prepare for academic exams and builds problem-solving skills. Each question is accompanied by a complete answer explanation with a fully displayed solution.

A Grammar of Dolakha Newar

A Grammar of Dolakha Newar is the first fully comprehensive reference grammar of a Newar variety. Dolakha Newar is of particular interest because it is a member of the mutually unintelligible eastern branch of the family, and so allows an important comparative perspective on this significant Tibeto-Burman language.

Additional resources for A Resource-Light Approach to Morpho-Syntactic Tagging

Example text

Both stacked classifiers and voting schemes behave similarly in that they mainly correct uncommon error types. An error reduction of 15% to 18% was achieved in the experiments with stacked classifiers. Sjöbergh (2003a) concludes that combining taggers by voting or training a new stacked classifier increases the number of errors of some of the common error types, but removes many more errors of uncommon types. This leads to fewer total errors and a concentration of errors in fewer error types. This property is useful.
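As a rough illustration of combining taggers by voting, the sketch below takes the per-token output of several taggers and keeps the most frequent tag. The function name `vote`, the tie-breaking rule (prefer the tagger listed first), and the toy tag sequences are assumptions made for this example; they are not taken from Sjöbergh's experiments.

```python
from collections import Counter


def vote(tag_sequences):
    """Combine per-token tags from several taggers by plurality vote.

    tag_sequences: list of tag lists, one per tagger, all the same length.
    Ties are broken in favor of the tagger listed first (an assumption).
    """
    combined = []
    for token_tags in zip(*tag_sequences):
        counts = Counter(token_tags)
        best = max(counts.values())
        # Pick the first tagger's proposal among the most frequent tags.
        combined.append(next(t for t in token_tags if counts[t] == best))
    return combined


# Hypothetical outputs of three taggers for the same four tokens.
tagger_a = ["DET", "NOUN", "VERB", "ADJ"]
tagger_b = ["DET", "VERB", "VERB", "NOUN"]
tagger_c = ["DET", "NOUN", "NOUN", "ADV"]

print(vote([tagger_a, tagger_b, tagger_c]))
```

On the last token all three taggers disagree, so the vote falls back to the first tagger. A stacked classifier would instead learn from held-out data which tagger to trust in such cases.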

However, it is worth noting that an ensemble can be more accurate than its component classifiers only if the individual classifiers disagree with one another (Hansen and Salamon 1990). Many methods for constructing ensembles have been developed. Some methods are general, and they can be applied to any learning algorithm. Other methods are specific to particular algorithms. What follows is an overview of various approaches to constructing ensembles.

1 Subsampling of training examples

One of the general techniques is subsampling the training examples.
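One common way to subsample training examples is bootstrap resampling (bagging): each ensemble member is trained on a sample drawn with replacement from the training set, so the members see slightly different data and can disagree. The sketch below assumes this particular scheme and uses a toy most-frequent-tag learner; the function names and the `UNK` default are hypothetical choices for the example, not the book's setup.

```python
import random
from collections import Counter, defaultdict


def bootstrap_sample(examples, rng):
    """Draw len(examples) items with replacement (one bootstrap replicate)."""
    return [rng.choice(examples) for _ in examples]


def train_unigram_tagger(examples):
    """Toy learner: map each word to its most frequent tag in the sample."""
    counts = defaultdict(Counter)
    for word, tag in examples:
        counts[word][tag] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def train_ensemble(examples, n_models=5, seed=0):
    """Train n_models taggers, each on its own bootstrap replicate."""
    rng = random.Random(seed)
    return [
        train_unigram_tagger(bootstrap_sample(examples, rng))
        for _ in range(n_models)
    ]


def predict(ensemble, word, default="UNK"):
    """Plurality vote over the ensemble; unseen words vote for `default`."""
    votes = Counter(model.get(word, default) for model in ensemble)
    return votes.most_common(1)[0][0]


# Hypothetical training data: (word, tag) pairs.
data = [("the", "DET"), ("run", "VERB"), ("run", "NOUN"), ("run", "VERB")]
ensemble = train_ensemble(data, n_models=5, seed=0)
print(predict(ensemble, "run"))
```

Because each replicate may contain a different mix of the `run` examples, individual members can learn different tags for it, which is the disagreement an ensemble needs in order to improve on its components.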

They treat morphological analysis as an alignment task in a large corpus, combining four similarity measures based on expected frequency distributions, context, morphologically weighted Levenshtein distance, and an iteratively bootstrapped model of affixation and stem-change probabilities. They divide this task into three separate steps:

1. Estimate a probabilistic alignment between inflected forms and root forms in a given language.
2. Train a supervised morphological analysis learner on a weighted subset of these aligned pairs.
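To make the idea of a weighted Levenshtein distance concrete, the sketch below computes an edit distance in which vowel-for-vowel substitutions cost less than other substitutions, and uses it to pick the closest candidate root for an inflected form. The cost values (0.5 and 1.0), the vowel set, and the function names are assumptions for illustration only. The method described above combines this kind of measure with three others, and its actual weights are not given here.

```python
VOWELS = set("aeiou")


def sub_cost(a, b):
    """Substitution cost: vowel-to-vowel changes are cheaper (assumed weights)."""
    if a == b:
        return 0.0
    if a in VOWELS and b in VOWELS:
        return 0.5  # hypothetical weight for stem-vowel changes
    return 1.0


def weighted_levenshtein(s, t, indel=1.0):
    """Edit distance with weighted substitutions, computed row by row."""
    prev = [j * indel for j in range(len(t) + 1)]
    for i, a in enumerate(s, 1):
        cur = [i * indel]
        for j, b in enumerate(t, 1):
            cur.append(min(
                prev[j] + indel,               # delete a from s
                cur[j - 1] + indel,            # insert b into s
                prev[j - 1] + sub_cost(a, b),  # substitute a with b
            ))
        prev = cur
    return prev[-1]


def closest_root(inflected, roots):
    """Return the candidate root with the smallest weighted distance."""
    return min(roots, key=lambda r: weighted_levenshtein(inflected, r))


print(weighted_levenshtein("sang", "sing"))
print(closest_root("sang", ["sing", "send", "sit"]))
```

With these weights, an irregular pair such as "sang" and "sing" ends up closer than it would under the unweighted distance, which is the intuition behind weighting edits by morphological plausibility.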

