Morfologik converted

This site hosts two versions of the converted Morfologik 1.7 dictionary: a lowercased one and an original-case one. The original Morfologik project is hosted at http://morfologik.blogspot.com/ and its author is Marcin Miłkowski.

The dictionary has been converted to the IKIPI tagset. The conversion was performed by Adam Radziszewski and Marek Maziarz. The data are licensed on the same terms as the original Morfologik data: the user is free to choose between the GNU LGPL and Creative Commons Share Alike.

The target of the conversion, “IKIPI”, is an intermediate tagset. Working analyser configurations are bundled with Maca: install Maca and use the morfo1222-ikipi configuration. Example usage:

echo "Zjadłaś dwa śledzie. Znikły bez śladu." | maca-analyse morfo1222-ikipi --split -o xces > out-ikipi.xml
echo "Zjadłaś dwa śledzie. Znikły bez śladu." | maca-analyse morfo1222-ikipi --split -o xces | maca-convert ikipi2kipi.conv -o xces > out-kipi.xml

The first command produces output in the IKIPI tagset; the second converts that output to KIPI (that is, the unchanged IPIC tagset). Maca can also process pre-morph-like XML files (maca -i premorph-stream).
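Once an output file exists, a quick sanity check is to count how often each tag occurs in it. The sketch below is not part of Maca: count_ctags is a hypothetical helper name, and it assumes that the XCES output stores each interpretation's tag in a <ctag>...</ctag> element without nested markup. It also relies on grep -o, which GNU and BSD grep provide.

```shell
# Hypothetical helper (not shipped with Maca): tag frequency list for an XCES file.
# Assumption: every interpretation's tag appears as <ctag>TAG</ctag>.
count_ctags() {
    grep -o '<ctag>[^<]*</ctag>' "$1" |                 # one <ctag> element per line
        sed -e 's/^<ctag>//' -e 's/<\/ctag>$//' |       # strip the surrounding markup
        sort | uniq -c | sort -rn                       # count, most frequent first
}

# Usage, with the file names from the commands above:
#   count_ctags out-ikipi.xml
#   count_ctags out-kipi.xml
```

Comparing the two listings gives a rough view of how the ikipi2kipi conversion changes the tags.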

Documentation

  • Adam Radziszewski, Marek Maziarz, Developing free morphological data for Polish (draft). To appear in Cognitive Studies, vol. 11, ed. Violetta Koseska-Toszewa, SOW, Warszawa 2011: morfol.pdf
  • Adam Radziszewski, Conversion of Morfologik data into the IKIPI tagset: conversion.pdf
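  • Marek Maziarz, Adam Radziszewski, Opis morfo-syntaktyczny liczebników zbiorowych oraz zaimków osobowych na potrzeby konwersji danych słownika Morfologik do tagsetu KIPI (in Polish; "A morphosyntactic description of collective numerals and personal pronouns for the conversion of Morfologik dictionary data to the KIPI tagset"): MACA_liczebniki_zaimki.pdf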

Acknowledgement

The project is financed by the National Centre for Research and Development (NCBiR) under agreement SP/I/1/77065/10.

morfol_1222_lc.txt.bz2 - Morfologik 1.7 converted into the KIPI tagset, lowercased version of 22.12.2010 (16 MB) Adam Radziszewski, 22 Dec 2010 12:25

morfol_1222_case.txt.bz2 - Morfologik 1.7 converted into the KIPI tagset, original-case version of 22.12.2010 (16.1 MB) Adam Radziszewski, 22 Dec 2010 12:25

tag-checker.cpp - Utility that validates morphological dictionary entries read from stdin (requires Maca; compile with: g++ tag-checker.cpp -o tag-checker -lmaca) (3.33 KB) Adam Radziszewski, 26 Jan 2011 13:40

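MACA_liczebniki_zaimki.pdf - Opis morfo-syntaktyczny liczebników zbiorowych oraz zaimków osobowych na potrzeby konwersji danych słownika Morfologik do tagsetu KIPI (in Polish; "A morphosyntactic description of collective numerals and personal pronouns for the conversion of Morfologik dictionary data to the KIPI tagset") (127 KB) Adam Radziszewski, 26 Jan 2011 16:19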

conversion.pdf - Conversion of Morfologik data into the IKIPI tagset (476 KB) Adam Radziszewski, 27 Jan 2011 20:18

morfol.pdf - Developing free morphological data for Polish (113 KB) Adam Radziszewski, 04 Apr 2011 15:28
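
The dictionary archives listed above can be inspected, or piped into another tool, without unpacking them to disk. The following is a minimal sketch: dict_stream is a hypothetical helper name, and the commented tag-checker call assumes the compiled binary needs no extra arguments, which this page does not confirm.

```shell
# Hypothetical helper: stream a bzip2-compressed dictionary to stdout.
dict_stream() {
    bzip2 -t "$1" || return 1   # fail early on a truncated or corrupt download
    bzip2 -dc "$1"              # decompress to stdout, leaving the archive untouched
}

# Usage with the files listed above:
#   dict_stream morfol_1222_case.txt.bz2 | head -n 5
#   dict_stream morfol_1222_case.txt.bz2 | ./tag-checker   # assumption: no arguments required
```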