Evaluation¶
The evaluation was performed following the methodology proposed in Radziszewski and Acedański, 2012. Some additional details are also described in this dissertation (in Polish).
The procedure treats the whole tagger as a black box, so the reported accuracy values include errors made at any level: tokenisation errors, deficiencies of the morphological analyser, and unknown words tagged incorrectly.
The evaluation was performed on the National Corpus of Polish (NKJP).
To reproduce these results, see Evaluation_procedure.
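As a rough illustration of how the lower and upper accuracy bounds reported on this page relate to each other, the sketch below assumes the definitions from the cited methodology as understood here: the lower bound counts every reference token whose segmentation the tagger changed as an error, while the upper bound counts every such token as correct. The `RefToken` class and the `accuracy_bounds` function are hypothetical names introduced for this example only; they are not part of WCRFT or of the actual evaluation scripts.

```python
from dataclasses import dataclass


@dataclass
class RefToken:
    """One reference-corpus token as seen by a (hypothetical) evaluator."""
    seg_match: bool  # the tagger output has a token with the same boundaries
    tag_match: bool  # the tagger's tag equals the reference tag


def accuracy_bounds(tokens):
    """Return (lower, upper) accuracy bounds as fractions of reference tokens."""
    total = len(tokens)
    # Correct only if both segmentation and tag agree with the reference.
    correct = sum(1 for t in tokens if t.seg_match and t.tag_match)
    # Tokens whose segmentation was changed cannot be compared tag-to-tag.
    seg_changed = sum(1 for t in tokens if not t.seg_match)
    lower = correct / total                  # segmentation changes count as errors
    upper = (correct + seg_changed) / total  # segmentation changes count as correct
    return lower, upper


# Toy data: 7 fully correct tokens, 2 wrong tags, 1 segmentation change.
toy = [RefToken(True, True)] * 7 + [RefToken(True, False)] * 2 + [RefToken(False, False)]
lower, upper = accuracy_bounds(toy)
print(f"lower={lower:.2%} upper={upper:.2%}")
```

The gap between the two bounds is therefore the share of reference tokens affected by segmentation changes, which is why it stays small (about 0.3 percentage points) in the tables on this page.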
NKJP 1.0¶
This evaluation was performed using NKJP 1.0, with exactly the same data set and set-up as described in the methodology cited above.
In this experiment WCRFT was evaluated in the older configuration (nkjp.ini). The newer configuration (nkjp_s2.ini) performs slightly better and, more importantly, requires less memory, so it is recommended.
Tagger | Re-analysis | Acc lower bound | Acc upper bound | Acc lower known | Acc lower unknown |
---|---|---|---|---|---|
PANTERA | no | 88.79% | 89.09% | 91.08% | 14.70% |
PANTERA | yes | 88.99% | 89.28% | 91.27% | 14.74% |
WMBT no guess | no | 87.50% | 87.82% | 89.78% | 13.57% |
WMBT no guess | yes | 88.75% | 89.08% | 91.07% | 13.62% |
WMBT + guess | no | 88.44% | 88.76% | 89.89% | 41.43% |
WMBT + guess | yes | 89.71% | 90.04% | 91.20% | 41.45% |
WCRFT | yes | 90.34% | 90.67% | 91.89% | 40.13% |
PANTERA is the morphosyntactic tagger based on Brill's algorithm adapted for morphologically rich languages, run with a threshold of 6 (recommended by the author).
WMBT no guess corresponds to WMBT with no guessing (as described in the LTC'11 paper).
WMBT + guess is the most recent version, which includes guessing of unknown words.
WCRFT is the tagger available on this site.
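The differences between taggers in the table above are easier to compare when expressed as relative error reduction, a figure that is not reported on this page but follows directly from the accuracy values. The short sketch below computes it for WCRFT against WMBT + guess (both with re-analysis), using the accuracy lower bounds from the table; `relative_error_reduction` is a hypothetical helper written for this example.

```python
def relative_error_reduction(baseline_acc, new_acc):
    """Fraction of the baseline's errors removed by the new system.

    Both arguments are accuracies in percent (0-100).
    """
    baseline_err = 100.0 - baseline_acc
    new_err = 100.0 - new_acc
    return (baseline_err - new_err) / baseline_err


# Accuracy lower bounds taken from the NKJP 1.0 table (re-analysis: yes).
wmbt_guess = 89.71
wcrft = 90.34

rer = relative_error_reduction(wmbt_guess, wcrft)
print(f"WCRFT removes about {rer:.1%} of the WMBT + guess errors")
```

Under these inputs the reduction is roughly 6%. The table values are rounded to two decimals, so the result should be read as approximate.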
nkjp_s2.ini: best accuracy, but a large model and somewhat slow
nkjp_e2.ini: slightly worse accuracy, but a very small model that works faster
Tagger | Re-analysis | Acc lower bound | Acc upper bound | Acc lower known | Acc lower unknown | Full log |
---|---|---|---|---|---|---|
WCRFT nkjp_s2.ini | yes | 90.79% | 91.12% | 91.95% | 53.17% | r-wcrft-095-s2.txt |
WCRFT nkjp_e2.ini | yes | 90.26% | 90.58% | 91.54% | 48.52% | r-wcrft-095-e2.txt |
All the figures reported on this site have been obtained using Morfeusz SGJP (using Maca config morfeusz-nkjp).
NKJP 1.1¶
Using the nkjp_s2.ini tagger configuration.
Morfeusz SGJP¶
Version: Dane lingwistyczne (linguistic data) <2013/04/13>
Maca config: morfeusz-nkjp
Tagger | Re-analysis | Acc lower bound | Acc upper bound | Acc lower known | Acc lower unknown |
---|---|---|---|---|---|
WCRFT | yes | 90.79% | 91.13% | 91.95% | 53.08% |
The same tagger config (nkjp_s2.ini) evaluated on NKJP 1.0 yielded a 90.80% accuracy lower bound.
Morfeusz Polimorf¶
Version: Polimorf inflectional dictionary <2013/07/07>
Maca config: polimorf-nkjp
Tagger | Re-analysis | Acc lower bound | Acc upper bound | Acc lower known | Acc lower unknown |
---|---|---|---|---|---|
WCRFT | yes | 90.70% | 91.04% | 91.78% | 55.63% |
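The known and unknown columns can be combined into a quick sanity check on the NKJP 1.1 rows. The sketch below assumes that the accuracy lower bound is the token-weighted mean of the known-word and unknown-word accuracies; this is an assumption made for the example, not something stated on this page. Under that assumption, the share of unknown tokens implied by a row can be recovered from its three values. `implied_unknown_share` is a hypothetical helper.

```python
def implied_unknown_share(acc_all, acc_known, acc_unknown):
    """Solve acc_all = (1 - u) * acc_known + u * acc_unknown for u.

    All arguments are accuracies in percent; the result is a fraction.
    """
    return (acc_known - acc_all) / (acc_known - acc_unknown)


# (Acc lower bound, Acc lower known, Acc lower unknown) from the NKJP 1.1 tables.
rows = {
    "Morfeusz SGJP": (90.79, 91.95, 53.08),
    "Morfeusz Polimorf": (90.70, 91.78, 55.63),
}

shares = {name: implied_unknown_share(*values) for name, values in rows.items()}
for name, share in shares.items():
    print(f"{name}: implied unknown-token share of about {share:.1%}")
```

Both rows come out at roughly 3%. Because the inputs are rounded to two decimals, the estimate is too coarse to tell the two analysers apart on this measure.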