IOBBER is a shallow parser (chunker) developed for Polish. It recognises the boundaries of syntactic phrases as well as the location of their syntactic heads. For instance, IOBBER can assign the following structure to a sentence:
[Dziennikarka]NP [zarzucała]VP [Rutkowskiemu]NP [to]NP, że [całe jego działanie ws. zaginięcia]NP [to]VP [„show”]NP
The name comes from the IOB tags assigned to tokens during the labelling phase. IOBBER employs a machine learning technique, conditional random fields (CRF), and can adapt to various phrase definitions: it learns them from a training corpus under a given configuration. The parser comes with a ready-made model trained on the KPWr corpus. We have also trained it on data from the National Corpus of Polish; the resulting model is available for download at the project site (link below).
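To illustrate what IOB tagging means, the bracketed sentence above can be rewritten as one tag per token. The sketch below is not IOBBER code. The function name `chunks_to_iob`, the tokenisation of the example, and the `B-`/`I-`/`O` tag spelling (the common IOB2 convention) are assumptions made for illustration; IOBBER's actual input and output formats may differ.

```python
def chunks_to_iob(chunks):
    """Convert (label, tokens) chunks into (token, tag) pairs.

    label is a phrase type such as "NP", or None for tokens outside any phrase.
    The first token of a phrase gets B-<label>, the rest get I-<label>,
    and tokens outside phrases get O.
    """
    tagged = []
    for label, tokens in chunks:
        for i, token in enumerate(tokens):
            if label is None:
                tag = "O"
            elif i == 0:
                tag = "B-" + label
            else:
                tag = "I-" + label
            tagged.append((token, tag))
    return tagged


# The example sentence, split into tokens by hand (hypothetical tokenisation).
sentence = [
    ("NP", ["Dziennikarka"]),
    ("VP", ["zarzucała"]),
    ("NP", ["Rutkowskiemu"]),
    ("NP", ["to"]),
    (None, [",", "że"]),
    ("NP", ["całe", "jego", "działanie", "ws.", "zaginięcia"]),
    ("VP", ["to"]),
    ("NP", ["„show”"]),
]

for token, tag in chunks_to_iob(sentence):
    print(f"{token}\t{tag}")
```

In this representation chunking becomes a per-token labelling task, which is the kind of problem a sequence model such as a CRF is trained to solve.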
IOBBER is highly configurable. Thanks to this, it has been successfully applied to shallow parsing of Czech data.
More information and access to the source code are available at the project home page.