A CORPUS-BASED METHOD FOR AUTOMATIC MORPHOLOGICAL ANALYSIS OF INFLECTIONAL LANGUAGES

Olga Ivanovna Babina, Nikita Yuryevich Dyumin

Abstract


A method of automatic morphological analysis for inflectional languages is
proposed. Its distinguishing feature is that it works without a lexicon of
stems or pseudo-stems; this is achieved by using a text corpus in the language
being analyzed.
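
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of replacing a stem lexicon with corpus evidence, not the authors' method. It assumes a small hypothetical list of endings (`ENDINGS`) and treats a candidate stem as plausible when the corpus attests it with at least `min_support` distinct endings; the function names and the threshold are illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical toy inventory of inflectional endings (an assumption,
# not taken from the paper). The empty string stands for a zero ending.
ENDINGS = ["", "а", "у", "ы", "ой", "е"]


def build_stem_index(corpus_tokens, endings=ENDINGS):
    """Map each candidate stem to the set of endings it occurs with in the corpus."""
    index = defaultdict(set)
    for token in set(corpus_tokens):
        for ending in endings:
            # Require a non-empty stem after stripping the ending.
            if token.endswith(ending) and len(token) > len(ending):
                stem = token[: len(token) - len(ending)]
                index[stem].add(ending)
    return index


def analyze(word, index, endings=ENDINGS, min_support=2):
    """Return (stem, ending) splits whose stem the corpus attests
    with at least min_support distinct endings."""
    results = []
    for ending in endings:
        if word.endswith(ending) and len(word) > len(ending):
            stem = word[: len(word) - len(ending)]
            if len(index.get(stem, ())) >= min_support:
                results.append((stem, ending))
    return results


corpus = ["стена", "стену", "стены", "стеной", "дом", "дома", "дому"]
index = build_stem_index(corpus)
print(analyze("стене", index))  # word form absent from the corpus itself
```

In this toy setting the corpus plays the role that a stem lexicon would otherwise play: a split is accepted only if other corpus word forms share the same candidate stem. A real system would also need grammatical tags for the endings and some way to rank competing splits.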


Keywords


automatic morphological analysis, automatic text processing, inflectional language, corpus-based methods.



