Task #1443
WCCL performance tests
Status: | New | Start date: | 21 Apr 2011
---|---|---|---
Priority: | Normal | Due date: |
Assignee: | Tomasz Śniatowski | % Done: | 20%
Category: | - | |
Target version: | - | |
Description
It would also be worth testing wccl-match. Write the results up on the wiki; they will be useful for publications and documentation.
History
#1 Updated by Tomasz Śniatowski over 12 years ago
- % Done changed from 0 to 20
I ran a few informal tests and added --progress to wcclrun. So far, on an 88K-token corpus and an i5 (slightly weaker than the i7 in 446):
- a small number of simple operators reaches roughly 20-30 ktps (kilo tokens per second); about 30% of that is reading overhead
- 100 getsymbol operators reach approximately 6 ktps
- 10 operators of the form if(agrpp(-2, 2, {cas,gnd}),class[0],if(in(class[0],{subst,qub,adj}),gnd,ign)): 17.5 ktps
- 50 operators as above: 7 ktps
- 100 operators as above: 3.8 ktps
In the last case about 10% of the time already went to allocation work, i.e. malloc, ~shared_ptr and so on. These are still simple operators that do nothing interesting, though; some "real" ones would be useful.
#2 Updated by Tomasz Śniatowski over 12 years ago
Profiler excerpt:
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000
samples  %  image name  symbol name
8706  4.8919  libicuuc.so.42.1  u_strToUTF8WithSub_4_2
8689  4.8824  libc-2.11.1.so  _int_malloc
7239  4.0676  wccl-run  boost::detail::shared_count::~shared_count()
6000  3.3714  libc-2.11.1.so  malloc
5968  3.3534  libglib-2.0.so.0.2400.1  _g_utf8_normalize_wc
5006  2.8129  libwccl.so.0.0  Wccl::PointAgreement::apply_internal(Wccl::FunExecContext const&) const
4609  2.5898  libc-2.11.1.so  _int_free
4459  2.5055  libicuuc.so.42.1  icu_4_2::UnicodeString::padTrailing(int, unsigned short)
4365  2.4527  libicuuc.so.42.1  u_strFromUTF8WithSub_4_2
4284  2.4072  libgcc_s.so.1  __popcountdi2
3919  2.2021  libwccl.so.0.0  boost::shared_ptr<Wccl::TSet const> boost::dynamic_pointer_cast<Wccl::TSet const, Wccl::Value const>(boost::shared_ptr<Wccl::Value const> const&)
3916  2.2004  libstdc++.so.6.0.13  __dynamic_cast
3889  2.1852  libc-2.11.1.so  memcpy
3885  2.1830  libwccl.so.0.0  Wccl::Conditional<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
3716  2.0880  libcorpus2.so.1.0  Corpus2::Tagset::tag_to_symbol_string_vector(Corpus2::Tag const&, bool) const
3694  2.0757  libcorpus2.so.1.0  Corpus2::Tagset::get_attribute_mask(signed char) const
3484  1.9577  libicuuc.so.42.1  icu_4_2::UnicodeString::copyFrom(icu_4_2::UnicodeString const&, signed char)
3475  1.9526  libc-2.11.1.so  free
3395  1.9077  libwccl.so.0.0  Wccl::GetSymbols::apply_internal(Wccl::FunExecContext const&) const
2821  1.5851  libwccl.so.0.0  Wccl::Constant<Wccl::Position>::apply_internal(Wccl::FunExecContext const&) const
2733  1.5357  libwccl.so.0.0  Wccl::Constant<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
2508  1.4093  libwccl.so.0.0  Wccl::IsSubsetOf<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
2213  1.2435  libwccl.so.0.0  boost::shared_ptr<Wccl::Position const> boost::dynamic_pointer_cast<Wccl::Position const, Wccl::Value const>(boost::shared_ptr<Wccl::Value const> const&)
2089  1.1738  libwccl.so.0.0  Wccl::Predicate::False(Wccl::FunExecContext const&)
2043  1.1480  libwccl.so.0.0  boost::shared_ptr<Wccl::TSet> boost::make_shared<Wccl::TSet>()
1958  1.1002  libcorpus2.so.1.0  Corpus2::Tagset::attribute_count() const
1923  1.0805  libstdc++.so.6.0.13  __cxxabiv1::__si_class_type_info::__do_dyncast(long, __cxxabiv1::__class_type_info::__sub_kind, __cxxabiv1::__class_type_info const*, void const*, __cxxabiv1::__class_type_info const*, void const*, __cxxabiv1::__class_type_info::__dyncast_result&) const
1864  1.0474  libstdc++.so.6.0.13  std::string::append(char const*, unsigned long)
#3 Updated by Tomasz Śniatowski over 12 years ago
Call graph excerpt (you need to compile with -fno-omit-frame-pointer):
samples  %  image name  symbol name
-------------------------------------------------------------------------------
60913  4.5687  libicuuc.so.42.1  u_strToUTF8WithSub_4_2
  60913  100.000  libicuuc.so.42.1  u_strToUTF8WithSub_4_2 [self]
-------------------------------------------------------------------------------
58760  4.4072  libc-2.11.1.so  _int_malloc
  58760  100.000  libc-2.11.1.so  _int_malloc [self]
-------------------------------------------------------------------------------
  3  0.0055  libcorpus2.so.1.0  Corpus2::BufferedChunkReader::get_next_chunk()
  4  0.0073  wccl-run  main
  7  0.0128  libcorpus2.so.1.0  Corpus2::BufferedChunkReader::get_next_sentence()
  380  0.6946  libwccl.so.0.0  Wccl::Operator<Wccl::TSet>::base_apply(Wccl::SentenceContext const&)
  1601  2.9264  wccl-run  Runner::do_stream(std::istream&, bool)
  2897  5.2954  libwccl.so.0.0  Wccl::PointAgreement::apply_internal(Wccl::FunExecContext const&) const
  5290  9.6695  wccl-run  Runner::do_sentence(boost::shared_ptr<Corpus2::Sentence> const&)
  8855  16.1859  libwccl.so.0.0  Wccl::Operator<Wccl::TSet>::apply(Wccl::SentenceContext const&)
  16167  29.5514  libwccl.so.0.0  Wccl::IsSubsetOf<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
  19504  35.6511  libwccl.so.0.0  Wccl::Conditional<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
54037  4.0530  wccl-run  boost::detail::shared_count::~shared_count()
  54037  98.7735  wccl-run  boost::detail::shared_count::~shared_count() [self]
  425  0.7769  libcorpus2.so.1.0  boost::detail::sp_counted_impl_pd<Corpus2::Sentence*, boost::detail::sp_ms_deleter<Corpus2::Sentence> >::dispose()
  133  0.2431  libwccl.so.0.0  Wccl::TSet::~TSet()
  105  0.1919  libwccl.so.0.0  boost::detail::sp_counted_impl_pd<Wccl::TSet*, boost::detail::sp_ms_deleter<Wccl::TSet> >::dispose()
  4  0.0073  libc-2.11.1.so  free
  4  0.0073  libcorpus2.so.1.0  Corpus2::Sentence::~Sentence()
-------------------------------------------------------------------------------
43693  3.2771  libc-2.11.1.so  malloc
  43693  99.5375  libc-2.11.1.so  malloc [self]
  203  0.4625  libicuuc.so.42.1  u_strFromUTF8WithSub_4_2
-------------------------------------------------------------------------------
  940  0.6369  libwccl.so.0.0  Wccl::Operator<Wccl::TSet>::apply(Wccl::SentenceContext const&)
  146649  99.3631  libwccl.so.0.0  Wccl::Conditional<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
43021  3.2267  libwccl.so.0.0  Wccl::PointAgreement::apply_internal(Wccl::FunExecContext const&) const
  43021  29.7783  libwccl.so.0.0  Wccl::PointAgreement::apply_internal(Wccl::FunExecContext const&) const [self]
  35360  24.4755  libgcc_s.so.1  __popcountdi2
  13692  9.4773  libwccl.so.0.0  Wccl::Constant<Wccl::Position>::apply_internal(Wccl::FunExecContext const&) const
  13075  9.0503  libwccl.so.0.0  Wccl::Predicate::False(Wccl::FunExecContext const&)
  9452  6.5425  libwccl.so.0.0  boost::shared_ptr<Wccl::Position const> boost::dynamic_pointer_cast<Wccl::Position const, Wccl::Value const>(boost::shared_ptr<Wccl::Value const> const&)
  6565  4.5442  libwccl.so.0.0  Wccl::TSet::matching_categories(Corpus2::Tag const&) const
  5077  3.5142  libwccl.so.0.0  Wccl::Constant<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
  4666  3.2297  libstdc++.so.6.0.13  __dynamic_cast
  4216  2.9182  libwccl.so.0.0  boost::shared_ptr<Wccl::TSet const> boost::dynamic_pointer_cast<Wccl::TSet const, Wccl::Value const>(boost::shared_ptr<Wccl::Value const> const&)
  3902  2.7009  libwccl.so.0.0  Wccl::TSet::categories_count(Corpus2::Tagset const&) const
  2897  2.0052  wccl-run  boost::detail::shared_count::~shared_count()
  1227  0.8493  libwccl.so.0.0  Wccl::SentenceContext::get_abs_position(Wccl::Position const&) const
  859  0.5946  libwccl.so.0.0  Wccl::Predicate::True(Wccl::FunExecContext const&)
  462  0.3198  libwccl.so.0.0  Wccl::Constant<Wccl::Bool>::apply_internal(Wccl::FunExecContext const&) const
-------------------------------------------------------------------------------
  1  0.1942  libcorpus2.so.1.0  Corpus2::XmlReader::start_chunk(std::deque<xmlpp::SaxParser::Attribute, std::allocator<xmlpp::SaxParser::Attribute> > const&)
  2  0.3883  libxml++-2.6.so.2.0.7  xmlpp::SaxParserCallback::start_element(void*, unsigned char const*, unsigned char const**)
  102  19.8058  libxml++-2.6.so.2.0.7  xmlpp::SaxParserCallback::end_element(void*, unsigned char const*)
  120  23.3010  libcorpus2.so.1.0  Corpus2::XmlReader::on_end_element(Glib::ustring const&)
  290  56.3107  libcorpus2.so.1.0  Corpus2::XmlReader::on_start_element(Glib::ustring const&, std::deque<xmlpp::SaxParser::Attribute, std::allocator<xmlpp::SaxParser::Attribute> > const&)
42205  3.1655  libglib-2.0.so.0.2400.1  _g_utf8_normalize_wc
  42205  100.000  libglib-2.0.so.0.2400.1  _g_utf8_normalize_wc [self]
-------------------------------------------------------------------------------
  405  1.1324  libcorpus2.so.1.0  Corpus2::Tagset::parse_simple_tag(std::vector<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> >, std::allocator<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > > > const&, Corpus2::Tagset::ParseMode) const
  35360  98.8676  libwccl.so.0.0  Wccl::PointAgreement::apply_internal(Wccl::FunExecContext const&) const
35765  2.6825  libgcc_s.so.1  __popcountdi2
  35765  100.000  libgcc_s.so.1  __popcountdi2 [self]
-------------------------------------------------------------------------------
  373  1.0572  libwccl.so.0.0  Wccl::Operator<Wccl::TSet>::base_apply(Wccl::SentenceContext const&)
  4216  11.9494  libwccl.so.0.0  Wccl::PointAgreement::apply_internal(Wccl::FunExecContext const&) const
  5127  14.5315  libwccl.so.0.0  Wccl::Operator<Wccl::TSet>::apply(Wccl::SentenceContext const&)
  9392  26.6198  libwccl.so.0.0  Wccl::IsSubsetOf<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
  16174  45.8421  libwccl.so.0.0  Wccl::Conditional<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
35282  2.6463  libwccl.so.0.0  boost::shared_ptr<Wccl::TSet const> boost::dynamic_pointer_cast<Wccl::TSet const, Wccl::Value const>(boost::shared_ptr<Wccl::Value const> const&)
  35282  100.000  libwccl.so.0.0  boost::shared_ptr<Wccl::TSet const> boost::dynamic_pointer_cast<Wccl::TSet const, Wccl::Value const>(boost::shared_ptr<Wccl::Value const> const&) [self]
-------------------------------------------------------------------------------
  2  0.1825  libc-2.11.1.so  vfprintf
  3  0.2737  libicuuc.so.42.1  _fini
  45  4.1058  libcorpus2.so.1.0  T.2968
  73  6.6606  libcorpus2.so.1.0  boost::algorithm::split_iterator<__gnu_cxx::__normal_iterator<char const*, std::string> >::increment()
  73  6.6606  libcorpus2.so.1.0  std::_Rb_tree<std::string, std::pair<std::string const, std::deque<std::string, std::allocator<std::string> > >, std::_Select1st<std::pair<std::string const, std::deque<std::string, std::allocator<std::string> > > >, std::less<std::string>, std::allocator<std::pair<std::string const, std::deque<std::string, std::allocator<std::string> > > > >::_M_insert_unique_(std::_Rb_tree_const_iterator<std::pair<std::string const, std::deque<std::string, std::allocator<std::string> > > >, std::pair<std::string const, std::deque<std::string, std::allocator<std::string> > > const&)
  186  16.9708  libcorpus2.so.1.0  Corpus2::Tagset::parse_simple_tag(boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > const&, Corpus2::Tagset::ParseMode) const
  714  65.1460  libcorpus2.so.1.0  std::vector<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> >, std::allocator<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > > >& boost::algorithm::iter_split<std::vector<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> >, std::allocator<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > > >, boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > const, boost::algorithm::detail::token_finderF<boost::algorithm::detail::is_any_ofF<char> > >(std::vector<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> >, std::allocator<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > > >&, boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > const&, boost::algorithm::detail::token_finderF<boost::algorithm::detail::is_any_ofF<char> >)
34022  2.5518  libc-2.11.1.so  memcpy
  34022  100.000  libc-2.11.1.so  memcpy [self]
-------------------------------------------------------------------------------
  1  0.0115  libc-2.11.1.so  memmove
  1  0.0115  libcorpus2.so.1.0  Corpus2::BufferedChunkReader::get_next_sentence()
  8  0.0923  libcorpus2.so.1.0  Corpus2::Sentence::append(Corpus2::Token*)
  17  0.1962  libcorpus2.so.1.0  Corpus2::Tagset::tag_to_symbol_string_vector(Corpus2::Tag const&, bool) const
  24  0.2770  libcorpus2.so.1.0  Corpus2::XmlReader::start_lexeme(std::deque<xmlpp::SaxParser::Attribute, std::allocator<xmlpp::SaxParser::Attribute> > const&)
  31  0.3578  libcorpus2.so.1.0  boost::detail::sp_counted_impl_pd<Corpus2::Sentence*, boost::detail::sp_ms_deleter<Corpus2::Sentence> >::dispose()
  109  1.2581  libcorpus2.so.1.0  Corpus2::Tagset::parse_simple_tag(boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > const&, Corpus2::Tagset::ParseMode) const
  129  1.4889  libcorpus2.so.1.0  Corpus2::XmlReader::on_end_element(Glib::ustring const&)
  165  1.9044  wccl-run  Runner::do_stream(std::istream&, bool)
  463  5.3440  libcorpus2.so.1.0  std::vector<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> >, std::allocator<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > > >& boost::algorithm::iter_split<std::vector<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> >, std::allocator<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > > >, boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > const, boost::algorithm::detail::token_finderF<boost::algorithm::detail::is_any_ofF<char> > >(std::vector<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> >, std::allocator<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > > >&, boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > const&, boost::algorithm::detail::token_finderF<boost::algorithm::detail::is_any_ofF<char> >)
  683  7.8832  libwccl.so.0.0  Wccl::Conditional<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
  1484  17.1283  libwccl.so.0.0  Wccl::Value::to_string_u(Corpus2::Tagset const&) const
  2333  26.9275  libwccl.so.0.0  Wccl::TSet::to_string(Corpus2::Tagset const&) const
  3216  37.1191  wccl-run  Runner::do_sentence(boost::shared_ptr<Corpus2::Sentence> const&)
31452  2.3590  libc-2.11.1.so  _int_free
  31452  100.000  libc-2.11.1.so  _int_free [self]
-------------------------------------------------------------------------------
  1307  100.000  wccl-run  Runner::do_sentence(boost::shared_ptr<Corpus2::Sentence> const&)
30582  2.2938  libicuuc.so.42.1  icu_4_2::UnicodeString::padTrailing(int, unsigned short)
  30582  100.000  libicuuc.so.42.1  icu_4_2::UnicodeString::padTrailing(int, unsigned short) [self]
-------------------------------------------------------------------------------
  93  0.8188  wccl-run  Runner::do_sentence(boost::shared_ptr<Corpus2::Sentence> const&)
  682  6.0046  libwccl.so.0.0  Wccl::Predicate::evaluate(bool, Wccl::FunExecContext const&)
  721  6.3479  libwccl.so.0.0  Wccl::Operator<Wccl::TSet>::apply(Wccl::SentenceContext const&)
  734  6.4624  libwccl.so.0.0  Wccl::GetSymbols::apply_internal(Wccl::FunExecContext const&) const
  1668  14.6857  libwccl.so.0.0  Wccl::IsSubsetOf<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
  2794  24.5994  libwccl.so.0.0  Wccl::Conditional<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
  4666  41.0812  libwccl.so.0.0  Wccl::PointAgreement::apply_internal(Wccl::FunExecContext const&) const
30481  2.2862  libstdc++.so.6.0.13  __dynamic_cast
  30481  100.000  libstdc++.so.6.0.13  __dynamic_cast [self]
-------------------------------------------------------------------------------
  762  0.1391  libwccl.so.0.0  Wccl::Operator<Wccl::TSet>::base_apply(Wccl::SentenceContext const&)
  169102  30.8664  libwccl.so.0.0  Wccl::Conditional<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
  377988  68.9945  libwccl.so.0.0  Wccl::Operator<Wccl::TSet>::apply(Wccl::SentenceContext const&)
28941  2.1707  libwccl.so.0.0  Wccl::Conditional<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
  169102  31.2638  libwccl.so.0.0  Wccl::Conditional<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
  146649  27.1127  libwccl.so.0.0  Wccl::PointAgreement::apply_internal(Wccl::FunExecContext const&) const
  121055  22.3808  libwccl.so.0.0  Wccl::IsSubsetOf<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
  28941  5.3507  libwccl.so.0.0  Wccl::Conditional<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const [self]
  19504  3.6059  wccl-run  boost::detail::shared_count::~shared_count()
  16174  2.9903  libwccl.so.0.0  boost::shared_ptr<Wccl::TSet const> boost::dynamic_pointer_cast<Wccl::TSet const, Wccl::Value const>(boost::shared_ptr<Wccl::Value const> const&)
  10916  2.0182  libwccl.so.0.0  Wccl::Constant<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
  9840  1.8192  libwccl.so.0.0  boost::shared_ptr<Wccl::Bool const> boost::dynamic_pointer_cast<Wccl::Bool const, Wccl::Value const>(boost::shared_ptr<Wccl::Value const> const&)
  4027  0.7445  libwccl.so.0.0  Wccl::GetSymbols::apply_internal(Wccl::FunExecContext const&) const
  2794  0.5166  libstdc++.so.6.0.13  __dynamic_cast
  2278  0.4212  libwccl.so.0.0  Wccl::Constant<Wccl::Position>::apply_internal(Wccl::FunExecContext const&) const
  2063  0.3814  libc-2.11.1.so  free
  1530  0.2829  libwccl.so.0.0  Wccl::TSet::matching_categories(Corpus2::Tag const&) const
  1318  0.2437  libwccl.so.0.0  Wccl::SentenceContext::get_abs_position(Wccl::Position const&) const
  744  0.1376  libwccl.so.0.0  boost::shared_ptr<Wccl::Position const> boost::dynamic_pointer_cast<Wccl::Position const, Wccl::Value const>(boost::shared_ptr<Wccl::Value const> const&)
  683  0.1263  libc-2.11.1.so  _int_free
  631  0.1167  libwccl.so.0.0  Wccl::Predicate::False(Wccl::FunExecContext const&)
  556  0.1028  libcorpus2.so.1.0  Corpus2::Tag::operator==(Corpus2::Tag const&) const
  481  0.0889  libwccl.so.0.0  Wccl::TSet::categories_count(Corpus2::Tagset const&) const
  394  0.0728  libwccl.so.0.0  boost::detail::sp_counted_base::destroy()
  352  0.0651  libwccl.so.0.0  boost::detail::sp_counted_impl_pd<Wccl::TSet*, boost::detail::sp_ms_deleter<Wccl::TSet> >::~sp_counted_impl_pd()
  331  0.0612  libwccl.so.0.0  boost::detail::sp_counted_impl_pd<Wccl::TSet*, boost::detail::sp_ms_deleter<Wccl::TSet> >::dispose()
  209  0.0386  libwccl.so.0.0  Wccl::Predicate::evaluate(bool, Wccl::FunExecContext const&)
  186  0.0344  libstdc++.so.6.0.13  operator delete(void*)
  102  0.0189  libwccl.so.0.0  Wccl::Predicate::True(Wccl::FunExecContext const&)
  27  0.0050  libwccl.so.0.0  boost::shared_ptr<Wccl::TSet> boost::make_shared<Wccl::TSet>()
-------------------------------------------------------------------------------
  1  0.2481  libwccl.so.0.0  _fini
  2  0.4963  libcorpus2.so.1.0  _fini
  197  48.8834  libicuuc.so.42.1  _fini
  203  50.3722  libc-2.11.1.so  malloc
26572  1.9930  libicuuc.so.42.1  u_strFromUTF8WithSub_4_2
  26572  100.000  libicuuc.so.42.1  u_strFromUTF8WithSub_4_2 [self]
-------------------------------------------------------------------------------
  37  0.0690  libwccl.so.0.0  Wccl::Operator<Wccl::TSet>::apply(Wccl::SentenceContext const&)
  4027  7.5105  libwccl.so.0.0  Wccl::Conditional<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
  49554  92.4205  libwccl.so.0.0  Wccl::IsSubsetOf<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
25241  1.8932  libwccl.so.0.0  Wccl::GetSymbols::apply_internal(Wccl::FunExecContext const&) const
  25241  48.0031  libwccl.so.0.0  Wccl::GetSymbols::apply_internal(Wccl::FunExecContext const&) const [self]
  11938  22.7036  libwccl.so.0.0  boost::shared_ptr<Wccl::TSet> boost::make_shared<Wccl::TSet>()
  5521  10.4998  libwccl.so.0.0  Wccl::Constant<Wccl::Position>::apply_internal(Wccl::FunExecContext const&) const
  4941  9.3968  libwccl.so.0.0  boost::shared_ptr<Wccl::Position const> boost::dynamic_pointer_cast<Wccl::Position const, Wccl::Value const>(boost::shared_ptr<Wccl::Value const> const&)
  2322  4.4160  libwccl.so.0.0  boost::detail::sp_counted_impl_pd<Wccl::TSet*, boost::detail::sp_ms_deleter<Wccl::TSet> >::get_deleter(std::type_info const&)
  844  1.6051  libwccl.so.0.0  boost::detail::sp_enable_shared_from_this(...)
  734  1.3959  libstdc++.so.6.0.13  __dynamic_cast
  689  1.3103  libwccl.so.0.0  Wccl::SentenceContext::get_abs_position(Wccl::Position const&) const
  352  0.6694  libstdc++.so.6.0.13  operator new(unsigned long)
-------------------------------------------------------------------------------
  4  0.0189  wccl-run  boost::detail::shared_count::~shared_count()
  4  0.0189  libc-2.11.1.so  memmove
  5  0.0237  libcorpus2.so.1.0  Corpus2::BufferedChunkReader::get_next_sentence()
  21  0.0994  libcorpus2.so.1.0  Corpus2::Sentence::append(Corpus2::Token*)
  30  0.1419  libcorpus2.so.1.0  Corpus2::Tagset::tag_to_symbol_string_vector(Corpus2::Tag const&, bool) const
  66  0.3123  libcorpus2.so.1.0  boost::detail::sp_counted_impl_pd<Corpus2::Sentence*, boost::detail::sp_ms_deleter<Corpus2::Sentence> >::dispose()
  67  0.3170  libcorpus2.so.1.0  Corpus2::XmlReader::start_lexeme(std::deque<xmlpp::SaxParser::Attribute, std::allocator<xmlpp::SaxParser::Attribute> > const&)
  241  1.1403  libcorpus2.so.1.0  Corpus2::XmlReader::on_end_element(Glib::ustring const&)
  242  1.1450  wccl-run  Runner::do_stream(std::istream&, bool)
  242  1.1450  libcorpus2.so.1.0  Corpus2::Tagset::parse_simple_tag(boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > const&, Corpus2::Tagset::ParseMode) const
  665  3.1464  libcorpus2.so.1.0  std::vector<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> >, std::allocator<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > > >& boost::algorithm::iter_split<std::vector<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> >, std::allocator<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > > >, boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > const, boost::algorithm::detail::token_finderF<boost::algorithm::detail::is_any_ofF<char> > >(std::vector<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> >, std::allocator<boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > > >&, boost::iterator_range<__gnu_cxx::__normal_iterator<char const*, std::string> > const&, boost::algorithm::detail::token_finderF<boost::algorithm::detail::is_any_ofF<char> >)
  2063  9.7611  libwccl.so.0.0  Wccl::Conditional<Wccl::TSet>::apply_internal(Wccl::FunExecContext const&) const
  3653  17.2841  libwccl.so.0.0  Wccl::TSet::to_string(Corpus2::Tagset const&) const
  5259  24.8829  libwccl.so.0.0  Wccl::Value::to_string_u(Corpus2::Tagset const&) const
  8573  40.5630  wccl-run  Runner::do_sentence(boost::shared_ptr<Corpus2::Sentence> const&)
24999  1.8750  libc-2.11.1.so  free
  24999  82.8330  libc-2.11.1.so  free [self]
  2600  8.6150  libstdc++.so.6.0.13  std::string::replace(unsigned long, unsigned long, char const*, unsigned long)
  2399  7.9490  libstdc++.so.6.0.13  std::string::_M_replace_safe(unsigned long, unsigned long, char const*, unsigned long)
  116  0.3844  libc-2.11.1.so  __strlen_sse42
  38  0.1259  libstdc++.so.6.0.13  std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&)
  28  0.0928  libstdc++.so.6.0.13  char* std::string::_S_construct<char const*>(char const*, char const*, std::allocator<char> const&, std::forward_iterator_tag)
-------------------------------------------------------------------------------
  538  0.6132  libwccl.so.0.0  Wccl::TSet::to_string(Corpus2::Tagset const&) const
  87198  99.3868  libcorpus2.so.1.0  Corpus2::Tagset::tag_to_symbol_string(Corpus2::Tag const&, bool) const
24925  1.8695  libcorpus2.so.1.0  Corpus2::Tagset::tag_to_symbol_string_vector(Corpus2::Tag const&, bool) const
  24925  29.5250  libcorpus2.so.1.0  Corpus2::Tagset::tag_to_symbol_string_vector(Corpus2::Tag const&, bool) const [self]
  18772  22.2364  libcorpus2.so.1.0  Corpus2::Tagset::get_attribute_mask(signed char) const
  11525  13.6520  wccl-run  std::vector<std::string, std::allocator<std::string> >::_M_insert_aux(__gnu_cxx::__normal_iterator<std::string*, std::vector<std::string, std::allocator<std::string> > >, std::string const&)
  9513  11.2687  libstdc++.so.6.0.13  std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string const&)
  7022  8.3179  libcorpus2.so.1.0  Corpus2::Tagset::attribute_count() const
  5205  6.1656  libcorpus2.so.1.0  Corpus2::Tagset::get_pos_name(std::bitset<64ul>) const
  3014  3.5702  libcorpus2.so.1.0  boost::iterator_range<PwrNlp::set_bits_iterator<std::bitset<64ul> > > PwrNlp::set_bits<64ul>(std::bitset<64ul> const&)
  1912  2.2649  wccl-run  std::string* std::__uninitialized_copy_a<std::string*, std::string*, std::string>(std::string*, std::string*, std::string*, std::allocator<std::string>&)
  1632  1.9332  libstdc++.so.6.0.13  operator new(unsigned long)
  669  0.7925  libcorpus2.so.1.0  Corpus2::Tagset::get_pos_index(std::bitset<64ul>) const
What stands out to me is the relatively large number of dynamic_cast and ~shared_count calls. It might help to re-architect Function<T> so that it returns const T& or T itself, instead of shared_ptr<T>, wherever the type is certain.
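A minimal sketch of the trade-off described above, not the actual WCCL code. The names `Value`, `TSet`, `DynFunction`, `TypedFunction` and the helper functions are hypothetical stand-ins, and `std::shared_ptr` is used here in place of `boost::shared_ptr`. The first style returns a `shared_ptr` to the base class, which the caller has to downcast; the second makes the result type part of the function's type and returns it by value.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <set>
#include <string>

// Hypothetical stand-ins for the value hierarchy; not the real WCCL classes.
struct Value {
    virtual ~Value() = default;
};

struct TSet : Value {
    std::set<std::string> symbols;
};

// Style 1: the result is a shared_ptr to the base class. Each call allocates,
// and the caller pays for a dynamic cast plus reference-count updates.
struct DynFunction {
    virtual ~DynFunction() = default;
    virtual std::shared_ptr<const Value> apply() const = 0;
};

struct DynConstant : DynFunction {
    TSet value;
    std::shared_ptr<const Value> apply() const override {
        return std::make_shared<TSet>(value);  // heap allocation per call
    }
};

// Style 2: the result type is known statically, so it is returned by value
// and no cast or reference counting is needed.
template <typename T>
struct TypedFunction {
    virtual ~TypedFunction() = default;
    virtual T apply() const = 0;
};

struct TypedConstant : TypedFunction<TSet> {
    TSet value;
    TSet apply() const override { return value; }
};

// Caller for style 1: must downcast and check the result.
inline std::size_t count_symbols_dyn(const DynFunction& f) {
    std::shared_ptr<const TSet> s =
        std::dynamic_pointer_cast<const TSet>(f.apply());
    return s ? s->symbols.size() : 0;
}

// Caller for style 2: uses the result directly.
inline std::size_t count_symbols_typed(const TypedFunction<TSet>& f) {
    return f.apply().symbols.size();
}

// Both styles compute the same answer; only the per-call overhead differs.
inline void check_both_styles_agree() {
    DynConstant d;
    d.value.symbols = {"subst", "depr", "ger"};
    TypedConstant t;
    t.value.symbols = d.value.symbols;
    assert(count_symbols_dyn(d) == count_symbols_typed(t));
}
```

Returning by value still copies the set in this sketch; whether that is cheaper than the allocation and cast would have to be measured, and returning `const T&` is only safe where the function owns a value that outlives the call.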
#4 Updated by Tomasz Śniatowski over 12 years ago
A first quick comparison with JOSKIPI: one operator, the 88K-token corpus, the work machine in 446. The results are repeatable to within a few percent.
$ time jtester --input-file /home/local/folds/R09Folds/folds/test01.xml
and(
equal(orth[0], "postulaty"),
//inter(flex[0], subst)
in(flex[0], {subst,depr,ger})
)
File with sentences...
Processed 5460 sentences.
Operator was equal to given value 2 times...
real 0m8.025s
user 0m7.980s
sys 0m0.020s
$ time wccl-run /home/local/folds/R09Folds/folds/test01.xml 'and(equal(orth[0],"postulaty"),in(class[0],{subst,depr,ger}))'|grep True|wc -l
2
real 0m2.840s
user 0m2.820s
sys 0m0.030s
time wccl-run -i xces-fast /home/local/folds/R09Folds/folds/test01.xml 'and(equal(orth[0],"postulaty"),in(class[0],{subst,depr,ger}))' "$a" "$a" |grep True|wc -l
2
real 0m2.119s
user 0m2.100s
sys 0m0.050s
#5 Updated by Tomasz Śniatowski over 12 years ago
JTester vs wcclrun on one fold of the frek corpus (88 ktok), with the following operator:
//--------------------------------------------------------
//verbInfObj
//infinitive as a potential argument (predicative element of the argument) of a verbal element
//the feature is the base form of the verbal element
//
//known errors: "szkoły zostały zamknięte"
//--------------------------------------------------------
and(
  equal(flex[0],{inf}),
  llook(-1,begin,$V,and(
    in(flex[$V],{fin,praet,inf,imps,impt,pred,winien,pcon,pant,ppas,pact})//,
    //inter(base[$AR],{"czasownik"})
  )),
  or(
    in(flex[$V],{fin,praet,pcon,pant,imps,impt,pred,winien}),
    and(
      in(flex[$V],{ppas,pact}),
      not(rlook($+1V,-1,$S,
        and( in(flex[$S],{subst,ger,depr}), agrpp($V,$S,{cas,nmb,gnd},3))
      ))
    ),
    and(
      in(flex[$V],{inf}),
      not(rlook($+1V,-1,$C,
        in(base[$C],{"ani","albo","czy","i","lub","oraz",","})
      ))
    )
  )
)//and
JTester: 5.5 s, wcclrun: 2.6 s, wcclrun -i xces-fast: 1.5 s
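To put these timings in the same units as the earlier ktps figures, a small sketch of the arithmetic follows. The helper names `ktps` and `speedup` are made up for illustration, and 88000 is only the rounded corpus size quoted above. With those inputs the three runs work out to roughly 16, 34 and 59 ktps, i.e. wcclrun is about 2.1 times faster than JTester, and about 3.7 times faster with -i xces-fast. The wall-clock times include startup and corpus reading, so these are end-to-end figures rather than pure operator throughput.

```cpp
#include <cassert>
#include <cmath>

// Throughput in kilo-tokens per second for a run over `tokens` tokens.
inline double ktps(double tokens, double seconds) {
    assert(seconds > 0.0);
    return tokens / seconds / 1000.0;
}

// How many times faster the candidate run is than the baseline run.
inline double speedup(double baseline_seconds, double candidate_seconds) {
    assert(candidate_seconds > 0.0);
    return baseline_seconds / candidate_seconds;
}
```

For example, `ktps(88000, 2.6)` gives the wcclrun throughput and `speedup(5.5, 1.5)` gives the JTester to xces-fast ratio.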