WMBT is a morphosyntactic tagger combining tiered tagging and Memory-Based Learning.
The tagger is suited to positional tagsets: a separate case base is gathered for each tagset attribute.
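The idea of one case base per attribute can be illustrated with a minimal sketch. This is not WMBT's actual code and does not use TiMBL: the attribute names, the toy features, the overlap-based nearest-neighbour lookup and all function names below are assumptions made only for illustration.

```python
# Illustrative sketch only: a toy tiered, memory-based disambiguator.
# ATTRIBUTES, features(), CaseBase, train() and disambiguate() are hypothetical
# names; they are not part of WMBT, TiMBL, WCCL or Corpus2.

ATTRIBUTES = ("pos", "number", "case")  # assumed tiers, in disambiguation order


def features(forms, i):
    """Toy context features for token i: its form, 2-letter suffix, previous form."""
    form = forms[i].lower()
    prev = forms[i - 1].lower() if i > 0 else "<s>"
    return (form, form[-2:], prev)


class CaseBase:
    """Stores (features, value) cases and classifies by highest feature overlap."""

    def __init__(self):
        self.cases = []

    def add(self, feats, value):
        self.cases.append((feats, value))

    def classify(self, feats, allowed):
        # Only consider stored cases whose value is still possible for this token.
        scored = [
            (sum(a == b for a, b in zip(feats, stored)), value)
            for stored, value in self.cases
            if value in allowed
        ]
        if not scored:
            return sorted(allowed)[0]  # deterministic fallback for unseen values
        return max(scored, key=lambda pair: pair[0])[1]


def train(sentences):
    """Gather one case base per tagset attribute from tagged sentences."""
    bases = {attr: CaseBase() for attr in ATTRIBUTES}
    for sent in sentences:
        forms = [form for form, _ in sent]
        for i, (_, tag) in enumerate(sent):
            feats = features(forms, i)
            for attr in ATTRIBUTES:
                bases[attr].add(feats, tag[attr])
    return bases


def disambiguate(bases, forms, i, candidates):
    """Tiered tagging: narrow the candidate tags one attribute at a time."""
    feats = features(forms, i)
    for attr in ATTRIBUTES:
        allowed = {tag[attr] for tag in candidates}
        if len(allowed) > 1:
            chosen = bases[attr].classify(feats, allowed)
            candidates = [tag for tag in candidates if tag[attr] == chosen]
    return candidates[0]


def tag(pos, number, case):
    return {"pos": pos, "number": number, "case": case}


TRAIN = [
    [("the", tag("det", "na", "na")), ("dogs", tag("noun", "pl", "nom")),
     ("bark", tag("verb", "pl", "na"))],
    [("the", tag("det", "na", "na")), ("bark", tag("noun", "sg", "nom"))],
]

# Candidate tags for "bark", as a morphological analyser might supply them.
CANDIDATES = [tag("noun", "sg", "nom"), tag("noun", "sg", "acc"),
              tag("verb", "pl", "na")]

if __name__ == "__main__":
    bases = train(TRAIN)
    print(disambiguate(bases, ["the", "bark"], 1, CANDIDATES))
```

Each tier only chooses among values that survive the previous tiers, so the result is always one of the candidate tags supplied for the token.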
WMBT has been implemented in Python, although low-level routines are based on the following C++ libraries:
- TiMBL, a popular MBL framework,
- WCCL, a toolkit for generation of morphosyntactic features,
- Corpus2, a framework for dealing with annotated corpora and configurable tagsets.
WMBT itself is only a disambiguation engine: it expects morphologically analysed input, so to tag plain text, please run MACA first.
A detailed description (including how to use MACA with WMBT), a pointer to the sources (GPL), and installation instructions can be found on the project site.