Heuristic algorithm for zero subject detection in Polish
- Kaczmarek, Adam
- Marcińczuk, Michał
- This article describes a heuristic approach to zero subject detection in Polish. It focuses on zero subject detection as a crucial step in end-to-end coreference resolution. Zero subject verbs are recognized using a set of manually created rules that draw on information from several sources, including a dependency parser, a shallow relational parser, and a valence dictionary. The rules were developed and evaluated on the Polish Coreference Corpus. The experimental results show that the presented method significantly outperforms the only machine learning-based alternative for Polish, i.e., MentionDetector. We also discuss and evaluate the importance of zero subject detection for existing coreference resolution tools for Polish.
- Year:
- 2015
- Type of Publication:
- In Proceedings
- Keywords:
- Zero subject; Anaphora detection; Coreference resolution; Polish
- Editor:
- Pavel Král, Václav Matoušek
- Volume:
- 9302
- Book title:
- Text, Speech, and Dialogue, 18th International Conference, TSD 2015, Pilsen, Czech Republic, September 14-17, 2015, Proceedings
- Series:
- Lecture Notes in Computer Science
- Pages:
- 378-386
- Month:
- December
- ISBN:
- 978-3-319-24032-9
- ISSN:
- 978--3-31
- DOI:
- 10.1007/978-3-319-24033-6_43