Task #1443

WCCL performance tests

Added by Adam Radziszewski over 9 years ago. Updated over 9 years ago.

Status: New
Start date: 21 Apr 2011
Priority: Normal
Due date:
Assignee: Tomasz Śniatowski
% Done: 20%
Category: -
Target version: -

Description

It would also be worth testing wccl-match. Write the results up on the wiki; they will be useful for publications and documentation.

History

#1 Updated by Tomasz Śniatowski over 9 years ago

  • % Done changed from 0 to 20

I ran a few informal tests and added --progress to wcclrun. So far, on an 88K-token corpus, on an i5 (somewhat slower than the i7 in 446):
- a small number of simple operators reaches throughput on the order of 20-30 ktps (kilotokens per second); reading the input accounts for 30% overhead here
- 100 getsymbol operators reach roughly 6 ktps
- 10 operators of the form if(agrpp(-2, 2, {cas,gnd}),class[0],if(in(class[0],{subst,qub,adj}),gnd,ign)): 17.5 ktps
- 50 operators as above: 7 ktps
- 100 as above: 3.8 ktps

In this last case, about 10% of the time already went to allocation-related work, i.e. malloc, ~shared_ptr, etc. These are still simple operators that do nothing interesting, though; some "real" ones would be useful.
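One way to read the last three figures is to convert throughput into time per token and see how it grows with the number of operators. The sketch below only does that arithmetic; the function names are made up for illustration, and treating the cost as roughly linear in the operator count is an assumption, not something measured in this task.

```cpp
// Hypothetical helper: average processing time per token, in microseconds,
// for a throughput given in kilotokens per second (ktps).
double micros_per_token(double ktps) {
    return 1e6 / (ktps * 1000.0);
}

// Hypothetical helper: extra microseconds per token for each additional
// operator, estimated from two (operator count, ktps) measurements.
// Assumes the cost grows roughly linearly with the operator count.
double marginal_micros_per_operator(int ops_a, double ktps_a,
                                    int ops_b, double ktps_b) {
    return (micros_per_token(ktps_b) - micros_per_token(ktps_a))
           / static_cast<double>(ops_b - ops_a);
}
```

Applied to the 10-operator (17.5 ktps) and 100-operator (3.8 ktps) runs, this gives roughly 57 µs and 263 µs per token, i.e. a bit over 2 µs per token for each additional operator of this kind.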

#2 Updated by Tomasz Śniatowski over 9 years ago

Excerpt from the profiler:

Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000
samples  %        image name               symbol name
8706      4.8919  libicuuc.so.42.1         u_strToUTF8WithSub_4_2
8689      4.8824  libc-2.11.1.so           _int_malloc
7239      4.0676  wccl-run                 boost::detail::shared_count::~shared_count()
6000      3.3714  libc-2.11.1.so           malloc
5968      3.3534  libglib-2.0.so.0.2400.1  _g_utf8_normalize_wc
5006      2.8129  libwccl.so.0.0           Wccl::PointAgreement::apply_internal(Wccl::FunExecContext const&) const
4609      2.5898  libc-2.11.1.so           _int_free
4459      2.5055  libicuuc.so.42.1         icu_4_2::UnicodeString::padTrailing(int, unsigned short)
4365      2.4527  libicuuc.so.42.1         u_strFromUTF8WithSub_4_2
4284      2.4072  libgcc_s.so.1            __popcountdi2
3919      2.2021  libwccl.so.0.0           boost::shared_ptr<Wccl::TSet const> boost::dynamic_pointer_cast<Wccl::TSet const, Wccl::Value const>(boost::shared_ptr<Wccl::Value const> const&)
3916      2.2004  libstdc++.so.6.0.13      __dynamic_cast
3889      2.1852  libc-2.11.1.so           memcpy
3885      2.1830  libwccl.so.0.0           Wccl::Conditional<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
3716      2.0880  libcorpus2.so.1.0        Corpus2::Tagset::tag_to_symbol_string_vector(Corpus2::Tag const&, bool) const
3694      2.0757  libcorpus2.so.1.0        Corpus2::Tagset::get_attribute_mask(signed char) const
3484      1.9577  libicuuc.so.42.1         icu_4_2::UnicodeString::copyFrom(icu_4_2::UnicodeString const&, signed char)
3475      1.9526  libc-2.11.1.so           free
3395      1.9077  libwccl.so.0.0           Wccl::GetSymbols::apply_internal(Wccl::FunExecContext const&) const
2821      1.5851  libwccl.so.0.0           Wccl::Constant<Wccl::Position>::apply_internal(Wccl::FunExecContext const&) const
2733      1.5357  libwccl.so.0.0           Wccl::Constant<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
2508      1.4093  libwccl.so.0.0           Wccl::IsSubsetOf<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
2213      1.2435  libwccl.so.0.0           boost::shared_ptr<Wccl::Position const> boost::dynamic_pointer_cast<Wccl::Position const, Wccl::Value const>(boost::shared_ptr<Wccl::Value const> const&)
2089      1.1738  libwccl.so.0.0           Wccl::Predicate::False(Wccl::FunExecContext const&)
2043      1.1480  libwccl.so.0.0           boost::shared_ptr<Wccl::TSet> boost::make_shared<Wccl::TSet>()
1958      1.1002  libcorpus2.so.1.0        Corpus2::Tagset::attribute_count() const
1923      1.0805  libstdc++.so.6.0.13      __cxxabiv1::__si_class_type_info::__do_dyncast(long, __cxxabiv1::__class_type_info::__sub_kind, __cxxabiv1::__class_type_info const*, void const*, __cxxabiv1::__class_type_info const*, void const*, __cxxabiv1::__class_type_info::__dyncast_result&) const
1864      1.0474  libstdc++.so.6.0.13      std::string::append(char const*, unsigned long)

#3 Updated by Tomasz Śniatowski over 9 years ago

Excerpt from the call graph (requires compiling with -fno-omit-frame-pointer):

samples  %        image name               symbol name
-------------------------------------------------------------------------------
60913     4.5687  libicuuc.so.42.1         u_strToUTF8WithSub_4_2
  60913    100.000  libicuuc.so.42.1         u_strToUTF8WithSub_4_2 [self]
-------------------------------------------------------------------------------
58760     4.4072  libc-2.11.1.so           _int_malloc
  58760    100.000  libc-2.11.1.so           _int_malloc [self]
-------------------------------------------------------------------------------
  3         0.0055  libcorpus2.so.1.0        Corpus2::BufferedChunkReader::get_next_chunk()
  4         0.0073  wccl-run                 main
  7         0.0128  libcorpus2.so.1.0        Corpus2::BufferedChunkReader::get_next_sentence()
  380       0.6946  libwccl.so.0.0           Wccl::Operator<Wccl::TSet>::base_apply(Wccl::SentenceContext const&)
  1601      2.9264  wccl-run                 Runner::do_stream(std::istream&, bool)
  2897      5.2954  libwccl.so.0.0           Wccl::PointAgreement::apply_internal(Wccl::FunExecContext const&) const
  5290      9.6695  wccl-run                 Runner::do_sentence(boost::shared_ptr<Corpus2::Sentence> const&)
  8855     16.1859  libwccl.so.0.0           Wccl::Operator<Wccl::TSet>::apply(Wccl::SentenceContext const&)
  16167    29.5514  libwccl.so.0.0           Wccl::IsSubsetOf<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
  19504    35.6511  libwccl.so.0.0           Wccl::Conditional<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
54037     4.0530  wccl-run                 boost::detail::shared_count::~shared_count()
  54037    98.7735  wccl-run                 boost::detail::shared_count::~shared_count() [self]
  425       0.7769  libcorpus2.so.1.0        boost::detail::sp_counted_impl_pd<Corpus2::Sentence*, boost::detail::sp_ms_deleter<Corpus2::Sentence> >::dispose()
  133       0.2431  libwccl.so.0.0           Wccl::TSet::~TSet()
  105       0.1919  libwccl.so.0.0           boost::detail::sp_counted_impl_pd<Wccl::TSet*, boost::detail::sp_ms_deleter<Wccl::TSet> >::dispose()
  4         0.0073  libc-2.11.1.so           free
  4         0.0073  libcorpus2.so.1.0        Corpus2::Sentence::~Sentence()
-------------------------------------------------------------------------------
43693     3.2771  libc-2.11.1.so           malloc
  43693    99.5375  libc-2.11.1.so           malloc [self]
  203       0.4625  libicuuc.so.42.1         u_strFromUTF8WithSub_4_2
-------------------------------------------------------------------------------
  940       0.6369  libwccl.so.0.0           Wccl::Operator<Wccl::TSet>::apply(Wccl::SentenceContext const&)
  146649   99.3631  libwccl.so.0.0           Wccl::Conditional<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
43021     3.2267  libwccl.so.0.0           Wccl::PointAgreement::apply_internal(Wccl::FunExecContext const&) const
  43021    29.7783  libwccl.so.0.0           Wccl::PointAgreement::apply_internal(Wccl::FunExecContext const&) const [self]
  35360    24.4755  libgcc_s.so.1            __popcountdi2
  13692     9.4773  libwccl.so.0.0           Wccl::Constant<Wccl::Position>::apply_internal(Wccl::FunExecContext const&) const
  13075     9.0503  libwccl.so.0.0           Wccl::Predicate::False(Wccl::FunExecContext const&)
  9452      6.5425  libwccl.so.0.0           boost::shared_ptr<Wccl::Position const> boost::dynamic_pointer_cast<Wccl::Position const, Wccl::Value const>(boost::shared_ptr<Wccl::Value const> const&)
  6565      4.5442  libwccl.so.0.0           Wccl::TSet::matching_categories(Corpus2::Tag const&) const
  5077      3.5142  libwccl.so.0.0           Wccl::Constant<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
  4666      3.2297  libstdc++.so.6.0.13      __dynamic_cast
  4216      2.9182  libwccl.so.0.0           boost::shared_ptr<Wccl::TSet const> boost::dynamic_pointer_cast<Wccl::TSet const, Wccl::Value const>(boost::shared_ptr<Wccl::Value const> const&)
  3902      2.7009  libwccl.so.0.0           Wccl::TSet::categories_count(Corpus2::Tagset const&) const
  2897      2.0052  wccl-run                 boost::detail::shared_count::~shared_count()
  1227      0.8493  libwccl.so.0.0           Wccl::SentenceContext::get_abs_position(Wccl::Position const&) const
  859       0.5946  libwccl.so.0.0           Wccl::Predicate::True(Wccl::FunExecContext const&)
  462       0.3198  libwccl.so.0.0           Wccl::Constant<Wccl::Bool>::apply_internal(Wccl::FunExecContext const&) const
-------------------------------------------------------------------------------
  1         0.1942  libcorpus2.so.1.0        Corpus2::XmlReader::start_chunk(std::deque<xmlpp::SaxParser::Attribute, std::allocator<xmlpp::SaxParser::Attribute> > const&)
  2         0.3883  libxml++-2.6.so.2.0.7    xmlpp::SaxParserCallback::start_element(void*, unsigned char const*, unsigned char const**)
  102      19.8058  libxml++-2.6.so.2.0.7    xmlpp::SaxParserCallback::end_element(void*, unsigned char const*)
  120      23.3010  libcorpus2.so.1.0        Corpus2::XmlReader::on_end_element(Glib::ustring const&)
  290      56.3107  libcorpus2.so.1.0        Corpus2::XmlReader::on_start_element(Glib::ustring const&, std::deque<xmlpp::SaxParser::Attribute, std::allocator<xmlpp::SaxParser::Attribute> > const&)
42205     3.1655  libglib-2.0.so.0.2400.1  _g_utf8_normalize_wc
  42205    100.000  libglib-2.0.so.0.2400.1  _g_utf8_normalize_wc [self]
-------------------------------------------------------------------------------
  405       1.1324  libcorpus2.so.1.0        Corpus2::Tagset::parse_simple_tag(std::vector<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> >, std::allocator<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > > > const&, Corpus2::Tagset::ParseMode) const
  35360    98.8676  libwccl.so.0.0           Wccl::PointAgreement::apply_internal(Wccl::FunExecContext const&) const
35765     2.6825  libgcc_s.so.1            __popcountdi2
  35765    100.000  libgcc_s.so.1            __popcountdi2 [self]
-------------------------------------------------------------------------------
  373       1.0572  libwccl.so.0.0           Wccl::Operator<Wccl::TSet>::base_apply(Wccl::SentenceContext const&)
  4216     11.9494  libwccl.so.0.0           Wccl::PointAgreement::apply_internal(Wccl::FunExecContext const&) const
  5127     14.5315  libwccl.so.0.0           Wccl::Operator<Wccl::TSet>::apply(Wccl::SentenceContext const&)
  9392     26.6198  libwccl.so.0.0           Wccl::IsSubsetOf<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
  16174    45.8421  libwccl.so.0.0           Wccl::Conditional<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
35282     2.6463  libwccl.so.0.0           boost::shared_ptr<Wccl::TSet const> boost::dynamic_pointer_cast<Wccl::TSet const, Wccl::Value const>(boost::shared_ptr<Wccl::Value const> const&)
  35282    100.000  libwccl.so.0.0           boost::shared_ptr<Wccl::TSet const> boost::dynamic_pointer_cast<Wccl::TSet const, Wccl::Value const>(boost::shared_ptr<Wccl::Value const> const&) [self]
-------------------------------------------------------------------------------
  2         0.1825  libc-2.11.1.so           vfprintf
  3         0.2737  libicuuc.so.42.1         _fini
  45        4.1058  libcorpus2.so.1.0        T.2968
  73        6.6606  libcorpus2.so.1.0        boost::algorithm::split_iterator<__gnu_cxx::__normal_iterator<char const*, std::string> >::increment()
  73        6.6606  libcorpus2.so.1.0        std::_Rb_tree<std::string, std::pair<std::string const, std::deque<std::string, std::allocator<std::string> > >, std::_Select1st<std::pair<std::string const, std::deque<std::string, std::allocator<std::string> > > >, std::less<std::string>, std::allocator<std::pair<std::string const, std::deque<std::string, std::allocator<std::string> > > > >::_M_insert_unique_(std::_Rb_tree_const_iterator<std::pair<std::string const, std::deque<std::string, std::allocator<std::string> > > >, std::pair<std::string const, std::deque<std::string, std::allocator<std::string> > > const&)
  186      16.9708  libcorpus2.so.1.0        Corpus2::Tagset::parse_simple_tag(boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > const&, Corpus2::Tagset::ParseMode) const
  714      65.1460  libcorpus2.so.1.0        std::vector<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> >, std::allocator<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > > >& boost::algorithm::iter_split<std::vector<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> >, std::allocator<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > > >, boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > const, boost::algorithm::detail::token_finderF<boost::algorithm::detail::is_any_ofF<char> > >(std::vector<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> >, std::allocator<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > > >&, boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > const&, boost::algorithm::detail::token_finderF<boost::algorithm::detail::is_any_ofF<char> >)
34022     2.5518  libc-2.11.1.so           memcpy
  34022    100.000  libc-2.11.1.so           memcpy [self]
-------------------------------------------------------------------------------
  1         0.0115  libc-2.11.1.so           memmove
  1         0.0115  libcorpus2.so.1.0        Corpus2::BufferedChunkReader::get_next_sentence()
  8         0.0923  libcorpus2.so.1.0        Corpus2::Sentence::append(Corpus2::Token*)
  17        0.1962  libcorpus2.so.1.0        Corpus2::Tagset::tag_to_symbol_string_vector(Corpus2::Tag const&, bool) const
  24        0.2770  libcorpus2.so.1.0        Corpus2::XmlReader::start_lexeme(std::deque<xmlpp::SaxParser::Attribute, std::allocator<xmlpp::SaxParser::Attribute> > const&)
  31        0.3578  libcorpus2.so.1.0        boost::detail::sp_counted_impl_pd<Corpus2::Sentence*, boost::detail::sp_ms_deleter<Corpus2::Sentence> >::dispose()
  109       1.2581  libcorpus2.so.1.0        Corpus2::Tagset::parse_simple_tag(boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > const&, Corpus2::Tagset::ParseMode) const
  129       1.4889  libcorpus2.so.1.0        Corpus2::XmlReader::on_end_element(Glib::ustring const&)
  165       1.9044  wccl-run                 Runner::do_stream(std::istream&, bool)
  463       5.3440  libcorpus2.so.1.0        std::vector<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> >, std::allocator<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > > >& boost::algorithm::iter_split<std::vector<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> >, std::allocator<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > > >, boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > const, boost::algorithm::detail::token_finderF<boost::algorithm::detail::is_any_ofF<char> > >(std::vector<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> >, std::allocator<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > > >&, boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > const&, boost::algorithm::detail::token_finderF<boost::algorithm::detail::is_any_ofF<char> >)
  683       7.8832  libwccl.so.0.0           Wccl::Conditional<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
  1484     17.1283  libwccl.so.0.0           Wccl::Value::to_string_u(Corpus2::Tagset const&) const
  2333     26.9275  libwccl.so.0.0           Wccl::TSet::to_string(Corpus2::Tagset const&) const
  3216     37.1191  wccl-run                 Runner::do_sentence(boost::shared_ptr<Corpus2::Sentence> const&)
31452     2.3590  libc-2.11.1.so           _int_free
  31452    100.000  libc-2.11.1.so           _int_free [self]
-------------------------------------------------------------------------------
  1307     100.000  wccl-run                 Runner::do_sentence(boost::shared_ptr<Corpus2::Sentence> const&)
30582     2.2938  libicuuc.so.42.1         icu_4_2::UnicodeString::padTrailing(int, unsigned short)
  30582    100.000  libicuuc.so.42.1         icu_4_2::UnicodeString::padTrailing(int, unsigned short) [self]
-------------------------------------------------------------------------------
  93        0.8188  wccl-run                 Runner::do_sentence(boost::shared_ptr<Corpus2::Sentence> const&)
  682       6.0046  libwccl.so.0.0           Wccl::Predicate::evaluate(bool, Wccl::FunExecContext const&)
  721       6.3479  libwccl.so.0.0           Wccl::Operator<Wccl::TSet>::apply(Wccl::SentenceContext const&)
  734       6.4624  libwccl.so.0.0           Wccl::GetSymbols::apply_internal(Wccl::FunExecContext const&) const
  1668     14.6857  libwccl.so.0.0           Wccl::IsSubsetOf<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
  2794     24.5994  libwccl.so.0.0           Wccl::Conditional<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
  4666     41.0812  libwccl.so.0.0           Wccl::PointAgreement::apply_internal(Wccl::FunExecContext const&) const
30481     2.2862  libstdc++.so.6.0.13      __dynamic_cast
  30481    100.000  libstdc++.so.6.0.13      __dynamic_cast [self]
-------------------------------------------------------------------------------
  762       0.1391  libwccl.so.0.0           Wccl::Operator<Wccl::TSet>::base_apply(Wccl::SentenceContext const&)
  169102   30.8664  libwccl.so.0.0           Wccl::Conditional<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
  377988   68.9945  libwccl.so.0.0           Wccl::Operator<Wccl::TSet>::apply(Wccl::SentenceContext const&)
28941     2.1707  libwccl.so.0.0           Wccl::Conditional<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
  169102   31.2638  libwccl.so.0.0           Wccl::Conditional<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
  146649   27.1127  libwccl.so.0.0           Wccl::PointAgreement::apply_internal(Wccl::FunExecContext const&) const
  121055   22.3808  libwccl.so.0.0           Wccl::IsSubsetOf<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
  28941     5.3507  libwccl.so.0.0           Wccl::Conditional<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const [self]
  19504     3.6059  wccl-run                 boost::detail::shared_count::~shared_count()
  16174     2.9903  libwccl.so.0.0           boost::shared_ptr<Wccl::TSet const> boost::dynamic_pointer_cast<Wccl::TSet const, Wccl::Value const>(boost::shared_ptr<Wccl::Value const> const&)
  10916     2.0182  libwccl.so.0.0           Wccl::Constant<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
  9840      1.8192  libwccl.so.0.0           boost::shared_ptr<Wccl::Bool const> boost::dynamic_pointer_cast<Wccl::Bool const, Wccl::Value const>(boost::shared_ptr<Wccl::Value const> const&)
  4027      0.7445  libwccl.so.0.0           Wccl::GetSymbols::apply_internal(Wccl::FunExecContext const&) const
  2794      0.5166  libstdc++.so.6.0.13      __dynamic_cast
  2278      0.4212  libwccl.so.0.0           Wccl::Constant<Wccl::Position>::apply_internal(Wccl::FunExecContext const&) const
  2063      0.3814  libc-2.11.1.so           free
  1530      0.2829  libwccl.so.0.0           Wccl::TSet::matching_categories(Corpus2::Tag const&) const
  1318      0.2437  libwccl.so.0.0           Wccl::SentenceContext::get_abs_position(Wccl::Position const&) const
  744       0.1376  libwccl.so.0.0           boost::shared_ptr<Wccl::Position const> boost::dynamic_pointer_cast<Wccl::Position const, Wccl::Value const>(boost::shared_ptr<Wccl::Value const> const&)
  683       0.1263  libc-2.11.1.so           _int_free
  631       0.1167  libwccl.so.0.0           Wccl::Predicate::False(Wccl::FunExecContext const&)
  556       0.1028  libcorpus2.so.1.0        Corpus2::Tag::operator==(Corpus2::Tag const&) const
  481       0.0889  libwccl.so.0.0           Wccl::TSet::categories_count(Corpus2::Tagset const&) const
  394       0.0728  libwccl.so.0.0           boost::detail::sp_counted_base::destroy()
  352       0.0651  libwccl.so.0.0           boost::detail::sp_counted_impl_pd<Wccl::TSet*, boost::detail::sp_ms_deleter<Wccl::TSet> >::~sp_counted_impl_pd()
  331       0.0612  libwccl.so.0.0           boost::detail::sp_counted_impl_pd<Wccl::TSet*, boost::detail::sp_ms_deleter<Wccl::TSet> >::dispose()
  209       0.0386  libwccl.so.0.0           Wccl::Predicate::evaluate(bool, Wccl::FunExecContext const&)
  186       0.0344  libstdc++.so.6.0.13      operator delete(void*)
  102       0.0189  libwccl.so.0.0           Wccl::Predicate::True(Wccl::FunExecContext const&)
  27        0.0050  libwccl.so.0.0           boost::shared_ptr<Wccl::TSet> boost::make_shared<Wccl::TSet>()
-------------------------------------------------------------------------------
  1         0.2481  libwccl.so.0.0           _fini
  2         0.4963  libcorpus2.so.1.0        _fini
  197      48.8834  libicuuc.so.42.1         _fini
  203      50.3722  libc-2.11.1.so           malloc
26572     1.9930  libicuuc.so.42.1         u_strFromUTF8WithSub_4_2
  26572    100.000  libicuuc.so.42.1         u_strFromUTF8WithSub_4_2 [self]
-------------------------------------------------------------------------------
  37        0.0690  libwccl.so.0.0           Wccl::Operator<Wccl::TSet>::apply(Wccl::SentenceContext const&)
  4027      7.5105  libwccl.so.0.0           Wccl::Conditional<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
  49554    92.4205  libwccl.so.0.0           Wccl::IsSubsetOf<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
25241     1.8932  libwccl.so.0.0           Wccl::GetSymbols::apply_internal(Wccl::FunExecContext const&) const
  25241    48.0031  libwccl.so.0.0           Wccl::GetSymbols::apply_internal(Wccl::FunExecContext const&) const [self]
  11938    22.7036  libwccl.so.0.0           boost::shared_ptr<Wccl::TSet> boost::make_shared<Wccl::TSet>()
  5521     10.4998  libwccl.so.0.0           Wccl::Constant<Wccl::Position>::apply_internal(Wccl::FunExecContext const&) const
  4941      9.3968  libwccl.so.0.0           boost::shared_ptr<Wccl::Position const> boost::dynamic_pointer_cast<Wccl::Position const, Wccl::Value const>(boost::shared_ptr<Wccl::Value const> const&)
  2322      4.4160  libwccl.so.0.0           boost::detail::sp_counted_impl_pd<Wccl::TSet*, boost::detail::sp_ms_deleter<Wccl::TSet> >::get_deleter(std::type_info const&)
  844       1.6051  libwccl.so.0.0           boost::detail::sp_enable_shared_from_this(...)
  734       1.3959  libstdc++.so.6.0.13      __dynamic_cast
  689       1.3103  libwccl.so.0.0           Wccl::SentenceContext::get_abs_position(Wccl::Position const&) const
  352       0.6694  libstdc++.so.6.0.13      operator new(unsigned long)
-------------------------------------------------------------------------------
  4         0.0189  wccl-run                 boost::detail::shared_count::~shared_count()
  4         0.0189  libc-2.11.1.so           memmove
  5         0.0237  libcorpus2.so.1.0        Corpus2::BufferedChunkReader::get_next_sentence()
  21        0.0994  libcorpus2.so.1.0        Corpus2::Sentence::append(Corpus2::Token*)
  30        0.1419  libcorpus2.so.1.0        Corpus2::Tagset::tag_to_symbol_string_vector(Corpus2::Tag const&, bool) const
  66        0.3123  libcorpus2.so.1.0        boost::detail::sp_counted_impl_pd<Corpus2::Sentence*, boost::detail::sp_ms_deleter<Corpus2::Sentence> >::dispose()
  67        0.3170  libcorpus2.so.1.0        Corpus2::XmlReader::start_lexeme(std::deque<xmlpp::SaxParser::Attribute, std::allocator<xmlpp::SaxParser::Attribute> > const&)
  241       1.1403  libcorpus2.so.1.0        Corpus2::XmlReader::on_end_element(Glib::ustring const&)
  242       1.1450  wccl-run                 Runner::do_stream(std::istream&, bool)
  242       1.1450  libcorpus2.so.1.0        Corpus2::Tagset::parse_simple_tag(boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > const&, Corpus2::Tagset::ParseMode) const
  665       3.1464  libcorpus2.so.1.0        std::vector<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> >, std::allocator<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > > >& boost::algorithm::iter_split<std::vector<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> >, std::allocator<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > > >, boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > const, boost::algorithm::detail::token_finderF<boost::algorithm::detail::is_any_ofF<char> > >(std::vector<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> >, std::allocator<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > > >&, boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > const&, boost::algorithm::detail::token_finderF<boost::algorithm::detail::is_any_ofF<char> >)
  2063      9.7611  libwccl.so.0.0           Wccl::Conditional<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
  3653     17.2841  libwccl.so.0.0           Wccl::TSet::to_string(Corpus2::Tagset const&) const
  5259     24.8829  libwccl.so.0.0           Wccl::Value::to_string_u(Corpus2::Tagset const&) const
  8573     40.5630  wccl-run                 Runner::do_sentence(boost::shared_ptr<Corpus2::Sentence> const&)
24999     1.8750  libc-2.11.1.so           free
  24999    82.8330  libc-2.11.1.so           free [self]
  2600      8.6150  libstdc++.so.6.0.13      std::string::replace(unsigned long, unsigned long, char const*, unsigned long)
  2399      7.9490  libstdc++.so.6.0.13      std::string::_M_replace_safe(unsigned long, unsigned long, char const*, unsigned long)
  116       0.3844  libc-2.11.1.so           __strlen_sse42
  38        0.1259  libstdc++.so.6.0.13      std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&)
  28        0.0928  libstdc++.so.6.0.13      char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag)
-------------------------------------------------------------------------------
  538       0.6132  libwccl.so.0.0           Wccl::TSet::to_string(Corpus2::Tagset const&) const
  87198    99.3868  libcorpus2.so.1.0        Corpus2::Tagset::tag_to_symbol_string(Corpus2::Tag const&, bool) const
24925     1.8695  libcorpus2.so.1.0        Corpus2::Tagset::tag_to_symbol_string_vector(Corpus2::Tag const&, bool) const
  24925    29.5250  libcorpus2.so.1.0        Corpus2::Tagset::tag_to_symbol_string_vector(Corpus2::Tag const&, bool) const [self]
  18772    22.2364  libcorpus2.so.1.0        Corpus2::Tagset::get_attribute_mask(signed char) const
  11525    13.6520  wccl-run                 std::vector<std::string, std::allocator<std::string> >::_M_insert_aux(__gnu_cxx::__normal_iterator<std::string*, std::vector<std::string, std::allocator<std::string> > >, std::string const&)
  9513     11.2687  libstdc++.so.6.0.13      std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string const&)
  7022      8.3179  libcorpus2.so.1.0        Corpus2::Tagset::attribute_count() const
  5205      6.1656  libcorpus2.so.1.0        Corpus2::Tagset::get_pos_name(std::bitset<64ul>) const
  3014      3.5702  libcorpus2.so.1.0        boost::iterator_range<PwrNlp::set_bits_iterator<std::bitset<64ul> > > PwrNlp::set_bits<64ul>(std::bitset<64ul> const&)
  1912      2.2649  wccl-run                 std::string* std::__uninitialized_copy_a<std::string*, std::string*, std::string>(std::string*, std::string*, std::string*, std::allocator<std::string>&)
  1632      1.9332  libstdc++.so.6.0.13      operator new(unsigned long)
  669       0.7925  libcorpus2.so.1.0        Corpus2::Tagset::get_pos_index(std::bitset<64ul>) const

What stands out to me is the relatively large number of dynamic_casts and ~shared_count calls. It might help to rearchitect Function<T> so that it returns const T& or plain T instead of shared_ptr<T> in cases where the type is known for certain.
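A minimal sketch of the idea, under stated assumptions: the types below (Value, TSet, UntypedConstant, TypedConstant) are simplified stand-ins invented for illustration and are not the actual WCCL classes or signatures. The first style mirrors what the profile suggests (a shared_ptr to the base type plus a cast at the call site); the second returns a reference to a statically known type.

```cpp
#include <cstddef>
#include <memory>
#include <set>
#include <string>

// Hypothetical stand-ins; these are NOT the actual WCCL classes.
struct Value { virtual ~Value() = default; };
struct TSet : Value { std::set<std::string> symbols; };

// Style suggested by the profile: each call allocates a result behind a
// shared_ptr to the base type, and the caller has to cast it back.
struct UntypedConstant {
    TSet value;
    std::shared_ptr<const Value> apply() const {
        return std::make_shared<TSet>(value);  // heap allocation + refcount
    }
};

// Proposed style: the function is parameterised by its result type and
// hands out a const reference, so no allocation or cast is needed.
template <class T>
struct TypedConstant {
    T value;
    const T& apply() const { return value; }
};

// Caller in the first style: pays for dynamic_pointer_cast and for the
// shared_count destructor when the temporary goes away.
inline std::size_t count_untyped(const UntypedConstant& f) {
    std::shared_ptr<const TSet> s =
        std::dynamic_pointer_cast<const TSet>(f.apply());
    return s ? s->symbols.size() : 0;
}

// Caller in the second style: the result type is known at compile time.
inline std::size_t count_typed(const TypedConstant<TSet>& f) {
    return f.apply().symbols.size();
}
```

Both callers compute the same result; the difference is only in what happens per call. Whether returning const T& is safe in the real library depends on result lifetimes, which this sketch does not model.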

#4 Updated by Tomasz Śniatowski over 9 years ago

First quick comparison with JOSKIPI: one operator, 88K-token corpus, the work machine in 446. Results are repeatable to within a few percent.

$ time jtester --input-file /home/local/folds/R09Folds/folds/test01.xml 
and(
equal(orth[0], "postulaty"),
//inter(flex[0], subst)
in(flex[0], {subst,depr,ger})
)

File with sentences...
Processed 5460 sentences.
Operator was equal to given value 2 times...
real    0m8.025s
user    0m7.980s
sys     0m0.020s
$ time wccl-run /home/local/folds/R09Folds/folds/test01.xml 'and(equal(orth[0],"postulaty"),in(class[0],{subst,depr,ger}))'|grep True|wc -l
2

real    0m2.840s
user    0m2.820s
sys     0m0.030s
$ time wccl-run -i xces-fast /home/local/folds/R09Folds/folds/test01.xml 'and(equal(orth[0],"postulaty"),in(class[0],{subst,depr,ger}))' "$a" "$a" |grep True|wc -l
2

real    0m2.119s
user    0m2.100s
sys     0m0.050s
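For reference, the wall-clock ("real") times above can be turned into speedup ratios and approximate throughput. This is only arithmetic on the reported numbers; the helper names are hypothetical, and the token count of 88000 is an assumption based on the "88K tokens" figure quoted for this corpus.

```cpp
// Hypothetical helper: how many times faster the candidate run is
// than the baseline run, given wall-clock times in seconds.
double speedup(double baseline_seconds, double candidate_seconds) {
    return baseline_seconds / candidate_seconds;
}

// Hypothetical helper: throughput in kilotokens per second (ktps).
// The token count is assumed, not taken from the tool output.
double ktps(double tokens, double seconds) {
    return tokens / seconds / 1000.0;
}
```

With the times above, speedup(8.025, 2.840) is roughly 2.8 and speedup(8.025, 2.119) is roughly 3.8. The ratios include input reading and process start-up, so they describe the whole command, not operator evaluation alone.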

#5 Updated by Tomasz Śniatowski over 9 years ago

JTester vs wcclrun on 1 fold of frek (88 ktok) with the following operator:


//--------------------------------------------------------
//verbInfObj
//infinitive as a potential argument (predicative element of an argument) of a verbal element
//the feature is the base form of the verbal element
//
//known errors: "szkoły zostały zamknięte"
//--------------------------------------------------------

and(
    equal(flex[0],{inf}),
    llook(-1,begin,$V,and(
                          in(flex[$V],{fin,praet,inf,imps,impt,pred,winien,pcon,pant,ppas,pact})//,
                          //inter(base[$AR],{"czasownik"})
                      )),
     or(
        in(flex[$V],{fin,praet,pcon,pant,imps,impt,pred,winien}),
        and(
            in(flex[$V],{ppas,pact}),
            not(rlook($+1V,-1,$S, and( in(flex[$S],{subst,ger,depr}),
                                       agrpp($V,$S,{cas,nmb,gnd},3))
            ))
        ),
        and(
            in(flex[$V],{inf}),
            not(rlook($+1V,-1,$C, in(base[$C],{"ani","albo","czy","i","lub","oraz",","})
            ))
        ) 
     )
)//and

JTester: 5.5 s, wcclrun: 2.6 s, wcclrun -i xces-fast: 1.5 s
