User talk:Kipcool/stats
From OmegaWiki
scripts
list of expressions
sql query, fast
USE omegawiki ; SELECT spelling, language_id FROM uw_expression_ns WHERE expression_id IN ( SELECT DISTINCT expression_id FROM uw_syntrans WHERE remove_transaction_id IS NULL ) ;
Remark: this query returns one row per expression per language: the word "toto" appears twice if it exists in 2 languages, but only once if it has 2 definitions in one language. The stats at http://www.omegawiki.org/extensions/Wikidata/util/stats.php do not take remove_transaction_id into account, and therefore display a number of expressions that is too high, as with this query:
USE omegawiki ; SELECT spelling, language_id FROM uw_expression_ns ;
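The counting rule in the remark can be checked on toy data. The following is only a sketch in Perl that mimics the two queries in memory: the rows, the language ids (85, 86) and the helper names `live_expression_ids` and `live_expressions` are invented for illustration and are not taken from the OmegaWiki database or code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy rows, invented for illustration only (not real OmegaWiki data).
# uw_expression_ns: one row per (spelling, language) pair.
my @expressions = (
    { expression_id => 1, spelling => 'toto', language_id => 85 },
    { expression_id => 2, spelling => 'toto', language_id => 86 },
    { expression_id => 3, spelling => 'tata', language_id => 85 },
);

# uw_syntrans: one row per definition attached to an expression;
# remove_transaction_id is undef (NULL) while the row is still live.
my @syntrans = (
    { expression_id => 1, remove_transaction_id => undef },  # toto, 1st definition
    { expression_id => 1, remove_transaction_id => undef },  # toto, 2nd definition, same language
    { expression_id => 2, remove_transaction_id => undef },  # toto in another language
    { expression_id => 3, remove_transaction_id => 1234 },   # removed row
);

# Mimics: SELECT DISTINCT expression_id FROM uw_syntrans
#         WHERE remove_transaction_id IS NULL
sub live_expression_ids {
    my %live;
    $live{ $_->{expression_id} } = 1
        for grep { !defined $_->{remove_transaction_id} } @syntrans;
    return \%live;
}

# Mimics the filtered query: keep only expressions that still have a live row.
sub live_expressions {
    my $live = live_expression_ids();
    return grep { $live->{ $_->{expression_id} } } @expressions;
}

my @live = live_expressions();
printf "filtered count: %d, unfiltered count: %d\n",
    scalar @live, scalar @expressions;
print "$_->{spelling}\t$_->{language_id}\n" for @live;
```

With these rows, "toto" is listed twice (once per language) even though it has 2 definitions in one of them, and "tata" is only counted by the unfiltered variant, which is why that count comes out higher.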
http queries, slow
Note: the script stops if it reaches an expression containing a "[". In that case, you have to skip past it manually.
#!/usr/bin/perl
use strict ;
use warnings ;
use LWP::Simple ;

open (my $output, '>', "WZlist.txt") or die "Couldn't open WZlist.txt: $!" ;

# 1 : get the list of pages
my $continue = 1 ;
my $starturl = 'http://www.OmegaWiki.org/index.php?title=Special%3AAllpages&from=%21&namespace=16&columns=3' ;
print "start URL: $starturl\n" ;
my $lapage = get $starturl ;
die "Couldn't get $starturl" unless defined $lapage ;
# print $lapage ;

while ( $continue == 1 ) {
    # collect every expression page linked from the current listing page
    my $allarticle = "" ;
    while ($lapage =~ m/title="(OmegaWiki:.*?)">/g) {
        $allarticle .= "http://www.OmegaWiki.org/$1\n" ;
    }
    print $output $allarticle ;

    # follow the "Next page" link, if there is one
    if ( $lapage =~ /(title=Special:Allpages&amp;from=[^&]*&amp;namespace=16&amp;columns=3)" title="Special:Allpages">Next page/ ) {
        my $nexturl = $1 ;
        # the link is HTML-escaped in the page source: turn "&amp;" back into "&"
        $nexturl =~ s/&amp;/&/g ;
        $nexturl =~ s/:/%3A/g ;
        $nexturl = 'http://www.OmegaWiki.org/index.php?'.$nexturl ;
        print "$nexturl\n" ;
        $lapage = get $nexturl ;
        # print $lapage ;
        die "Couldn't get $nexturl" unless defined $lapage ;
    } else {
        $continue = 0 ;
    }
}
close ($output) ;
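Skipping an expression manually amounts to restarting the listing from a later title, that is, changing the `from` parameter of the start URL. A minimal sketch, assuming a hypothetical helper `allpages_url` (not part of the script above) that percent-encodes the title before putting it in the URL; whether encoding alone avoids the stop on "[" has not been checked here.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper, for illustration only: build a Special:Allpages URL
# that starts listing at title $from in namespace $ns.
sub allpages_url {
    my ($from, $ns) = @_;
    # Percent-encode every byte outside the RFC 3986 unreserved set,
    # so characters such as "[" or "&" cannot break the query string.
    my $bytes = encode('UTF-8', $from);
    $bytes =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return 'http://www.OmegaWiki.org/index.php?title=Special%3AAllpages'
         . "&from=$bytes&namespace=$ns&columns=3";
}

# Example: a start URL for the expression namespace (16) at a made-up title.
print allpages_url('a[b', 16), "\n";
```

Calling `allpages_url('!', 16)` reproduces the start URL used in the script above, so the result can be assigned to `$starturl` to resume from any other title.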
list of DMs
sql query
USE omegawiki ; SELECT DISTINCT defined_meaning_id FROM uw_defined_meaning WHERE remove_transaction_id IS NULL ;
http queries, slow
#!/usr/bin/perl
use strict ;
use warnings ;
use LWP::Simple ;

open (my $output, '>', "DMlist.txt") or die "Couldn't open DMlist.txt: $!" ;

# 1 : get the list of pages
my $continue = 1 ;
my $starturl = 'http://www.OmegaWiki.org/index.php?title=Special%3AAllpages&from=%21&namespace=24&columns=3' ;
print "start URL: $starturl\n" ;
my $lapage = get $starturl ;
die "Couldn't get $starturl" unless defined $lapage ;
# print $lapage ;

while ( $continue == 1 ) {
    # collect every DefinedMeaning page linked from the current listing page
    my $allarticle = "" ;
    while ($lapage =~ m/title="(DefinedMeaning:.*?)">/g) {
        $allarticle .= "http://www.OmegaWiki.org/$1\n" ;
    }
    print $output $allarticle ;

    # follow the "Next page" link, if there is one
    if ( $lapage =~ /(title=Special:Allpages&amp;from=[^&]*&amp;namespace=24&amp;columns=3)" title="Special:Allpages">Next page/ ) {
        my $nexturl = $1 ;
        # the link is HTML-escaped in the page source: turn "&amp;" back into "&"
        $nexturl =~ s/&amp;/&/g ;
        $nexturl =~ s/:/%3A/g ;
        $nexturl = 'http://www.OmegaWiki.org/index.php?'.$nexturl ;
        print "$nexturl\n" ;
        $lapage = get $nexturl ;
        # print $lapage ;
        die "Couldn't get $nexturl" unless defined $lapage ;
    } else {
        $continue = 0 ;
    }
}
close ($output) ;
DM's
The number of DMs found by Google is also about the same as your count (even a bit higher).
| date | #Expression | #DefinedMeanings | Yahoo: [1] | Google EXPR: [2] | Google DM: [3] |
| 23 jan 2007 | 179'890 | 12'470 | 21'700 | 172'000 | 12'600 |
HenkvD 15:07, 23 January 2007 (EST)
Updating?
I found that when I was actively adding Khmer synonyms and definitions that seeing the numbers mount up in your statistics was very gratifying. I suppose I could figure out how to run the queries posted here, but that's no real replacement for checking your page weekly. Won't you resume running (and posting) these statistics? Rsperberg 08:54, 11 June 2008 (EDT)

