NLP tools installation¶
Script description¶
This script was written to speed up the installation of NLP tools on Ubuntu 12.04. The idea was to create a single shell-script installer for all NLP tools made by the WrocUT Language Technology Group G4.19.
The script was tested dozens of times on Ubuntu 12.04 LTS from the official Ubuntu website (http://releases.ubuntu.com/12.04/) and worked as expected. It was tested on a fresh installation with the same success as on a system that had been in use for several months.
If you use this script to install the Question Answering (QA) module, make sure you have permission to download the QA git repository. That repository is not public, so you have to contact a member of the WrocUT Language Technology Group G4.19 to get the required permissions.
System requirements¶
Compiling tools like Corpus2 or WCCL is complex and needs considerable computational power, so it is highly recommended to install the NLP tools on a computer that meets the following specification.
Operating system¶
The tools were tested on the operating systems below (listed in order of G4.19 recommendation):
- Ubuntu 12.04
- Ubuntu 13.10
Warning!
For Ubuntu 13.10 and higher there is no suitable installation package of libsfst1 (ver. 1.2 or 1.3). This library is required by the MACA tool. In some cases installing a libsfst1 version higher than 1.3 worked, in others it did not; we don't know why. It is therefore highly recommended to install libsfst1 version 1.2 or 1.3.
It is highly recommended to use a 64-bit OS.
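A minimal sketch for checking which libsfst packages are already present before installing MACA. It assumes a Debian-style system with `dpkg-query`; the helper names `sfst_version_ok` and `list_sfst_packages` are hypothetical, and the accepted versions (1.2 and 1.3) are taken from the warning above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only for the libsfst1 versions
# recommended in the warning above (1.2.x or 1.3.x).
sfst_version_ok() {
    case "$1" in
        1.2|1.2.*|1.2-*|1.3|1.3.*|1.3-*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: print "<package> <version>" for every installed
# package whose name starts with "libsfst"; prints nothing if none is found.
list_sfst_packages() {
    dpkg-query -W -f='${Package} ${Version}\n' 'libsfst*' 2>/dev/null || true
}

# Report each installed libsfst package and whether its version is recommended.
list_sfst_packages | while read -r pkg ver; do
    if sfst_version_ok "$ver"; then
        echo "$pkg $ver: recommended version"
    else
        echo "$pkg $ver: not a recommended version"
    fi
done
```

The check only looks at the version string, so it cannot tell whether a newer libsfst1 will happen to work with MACA on a given machine.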
Processor:
4 cores
8 threads
~ 3.20 GHz
RAM:
4 GB
Hard disk space:
5 GB for NLP tools
500 GB for big data linked with NLP tools
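The specification above can be compared against a machine with a short check like the following. This is only a sketch under a few assumptions: a Linux system with `nproc`, `/proc/meminfo` and `df`; the helper `meets_minimum` is hypothetical; and the RAM threshold is set slightly below 4 GB because the kernel reports a little less than the nominal memory size.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the measured value ($1)
# reaches the required minimum ($2). Both must be integers.
meets_minimum() {
    [ "$1" -ge "$2" ]
}

# Logical CPUs (threads); the specification asks for 8 threads.
cpus=$(nproc 2>/dev/null || echo 0)

# Total RAM in kB as reported by the kernel.
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo 2>/dev/null || true)
mem_kb=${mem_kb:-0}

# Free space in kB on the filesystem that holds the home directory.
disk_kb=$(df -Pk "${HOME:-/}" 2>/dev/null | awk 'NR==2 {print $4}')
disk_kb=${disk_kb:-0}

meets_minimum "$cpus" 8 \
    || echo "Warning: $cpus logical CPUs found, 8 recommended"
# Assumed threshold: about 3.5 GB, to allow for memory reserved by the kernel.
meets_minimum "$mem_kb" 3500000 \
    || echo "Warning: ${mem_kb} kB RAM found, 4 GB recommended"
# 5 GB for the NLP tools themselves (big data is not counted here).
meets_minimum "$disk_kb" $((5 * 1024 * 1024)) \
    || echo "Warning: ${disk_kb} kB free, 5 GB recommended for NLP tools"
```

The script only prints warnings; it does not check the 500 GB recommended for big data, since that directory may live on a different disk.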
NLP tools description¶
NLP tools that can be installed:
corpus2 (http://www.nlp.pwr.wroc.pl/redmine/projects/corpus2/wiki)
toki (http://www.nlp.pwr.wroc.pl/redmine/projects/toki/wiki)
maca (http://www.nlp.pwr.wroc.pl/redmine/projects/libpltagger/wiki)
wccl (http://www.nlp.pwr.wroc.pl/redmine/projects/joskipi/wiki)
wcrft (http://www.nlp.pwr.wroc.pl/redmine/projects/wcrft/wiki)
iobber (http://www.nlp.pwr.wroc.pl/redmine/projects/iobber/wiki)
solr (http://lucene.apache.org/solr/)
ner-ws (http://www.nlp.pwr.wroc.pl/redmine/projects/ner-ws/wiki)
qa (http://www.nlp.pwr.wroc.pl/redmine/projects/qa/wiki/Instalacja_prototypu#Instalacja-wymaganych-modułów)
Script parameters description¶
Script parameters set at startup (on the command line):
p - path to NLP tools repositories folder (default: ~/nlp-tools/)
r - domain or IP of the NLP tools repository (default: nlp.pwr.wroc.pl)
d - path to the big data directory (e.g. QA documents, solr collections, database dumps) (default: '')
h - IP of the remote server where the tools are installed (default: localhost)
N - name of Ner-ws database (default: nerws)
U - name of Ner-ws database user (default: BorsukG419)
P - user password for Ner-ws database (default: BorsukNekst)
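To illustrate how options like these are typically handled in a shell script, here is a minimal sketch using the standard `getopts` builtin. It is not the code of ubuntu-12.04.sh: the function name `parse_args` and all variable names are hypothetical, and only the option letters and default values are taken from the list above.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; the real installer may parse its options differently.
parse_args() {
    # Defaults as documented in the parameter list above.
    repo_path="$HOME/nlp-tools/"
    repo_host="nlp.pwr.wroc.pl"
    data_dir=""
    server_host="localhost"
    db_name="nerws"
    db_user="BorsukG419"
    db_pass="BorsukNekst"

    OPTIND=1  # reset so the function can be called more than once
    # Every option takes a value, hence the colon after each letter.
    while getopts "p:r:d:h:N:U:P:" opt; do
        case "$opt" in
            p) repo_path="$OPTARG" ;;
            r) repo_host="$OPTARG" ;;
            d) data_dir="$OPTARG" ;;
            h) server_host="$OPTARG" ;;
            N) db_name="$OPTARG" ;;
            U) db_user="$OPTARG" ;;
            P) db_pass="$OPTARG" ;;
            *) echo "Unknown or incomplete option" >&2; return 1 ;;
        esac
    done
}

# Options may come in any order; anything not given keeps its default.
parse_args -U alice -p /tmp/nlp-tools/
echo "$repo_path $db_user $db_name"   # prints: /tmp/nlp-tools/ alice nerws
```

Because `getopts` handles one option at a time and each option sets its own variable, the order on the command line does not matter.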
Example start commands¶
Parameters can be given in any order; the order does not affect how the script processes them.
Default parameters¶
bash ubuntu-12.04.sh
Defined parameters¶
bash ubuntu-12.04.sh -p /home/xxx/nlp-tools/ -r nlp.pwr.wroc.pl -d /home/xxx/big_data/ -h localhost -N nerws -U BorsukG419 -P BorsukNekst