International Beer Parlour/Archive1

From OmegaWiki

Babel templates

We do ask everybody to enter information on their language proficiency. We have copied many templates (User & Babel) from the English Wikipedia. They do not all conform to the same layout and information. We would appreciate it if you checked the templates that apply to you. When there are issues with particular templates, please report them in the Insect room. GerardM 12:56, 12 March 2006 (CET)
Please pay attention to category:Task list, which lists templates that do not correspond exactly to template:User en, User en-1, User en-2, User en-3, User en-4, User en-5. Thanks in advance! user:Gangleri | Gangleri 23:47, 14 March 2006 (CET)

Template:Babel-1 The font size for some content (including the text in the user templates) needs to be optimized. This is why OmegaWiki will use style templates for each language. In these templates, font families, font size, directionality, etc. should be specified. The format for these templates is "Template:Style xx" where "xx" is the language code.
Please help us implement the most suitable configuration for your native language and let us benefit from the tests you have done before at "your" local wiki.
Without Template:Style ml, the text would look as follows:
മലയാളം മാതൃഭാഷയായുള്ള വ്യക്തി.

{{Style ml|ml='''[[:Category:User ml|മലയാളം]]''' '''[[:Category:User ml-N|മാതൃഭാഷയായുള്ള]]''' വ്യക്തി.}}


Languages with script variants should use the variant after the language code: Template:Style sr-ec, Template:Style sr-jc, Template:Style sr-el, Template:Style sr-jl.
Regards user:Gangleri | Gangleri 22:08, 14 March 2006 (CET)
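
The naming convention described above can be sketched in a few lines of Python. This is only an illustration: the function name style_template_name and its parameters are hypothetical and are not part of OmegaWiki or MediaWiki.

```python
def style_template_name(language_code, script_variant=None):
    """Build a name of the form "Template:Style xx" or "Template:Style xx-yy".

    language_code  -- the language code, e.g. "ml" or "sr"
    script_variant -- optional script variant placed after the code, e.g. "ec"
    """
    code = language_code.strip().lower()
    if not code:
        raise ValueError("language_code must not be empty")
    if script_variant:
        # Languages with script variants put the variant after the language code.
        code = f"{code}-{script_variant.strip().lower()}"
    return f"Template:Style {code}"


if __name__ == "__main__":
    print(style_template_name("ml"))
    for variant in ("ec", "jc", "el", "jl"):
        print(style_template_name("sr", variant))
```

Called with "sr" and each of the four variants, the sketch produces the four Serbian template names listed above.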


When an Expression is in a character set that is different from what is standard for a particular wiktionary, transliterations have often been added. The problem with transliterations is that they are language specific. It means that in order to have transliterations, you have to have values that are language specific. Worse, they should be dialect specific. This is one reason why there is at this moment in time no room reserved for transliterations in OmegaWiki. A word like Լեհաստան (Lehastan) will lose the Lehastan. I wonder if an Armenian would pronounce the word like an American, or an Australian or a Brit. GerardM 20:23, 13 March 2006 (CET)


Besides project talk:Main Page#upper / lower case conventions and the suggestion at the end of Insect room#sort order in category:User en about the place for categories, we should probably agree on more conventions. Maybe Conventions or project:Conventions would be the right place. Regards Gangleri | T 03:10, 14 March 2006 (CET)

Languages and dialects

I have been requested to bring the following discussion to this community portal from my talk page. So, the point is the following: I am interested in a dialect of the "eml" language (arzân), which is described in the following page: link to The full name of the language is "Emiliano-Romagnolo", and, like most Italian regional languages, it derives directly from Latin (not through Italian, I mean). It has an ISO-639-3 code (eml), but no ISO-639-2 code (where it was generically classified as "other Romance", which is however just a bag for what does not fit elsewhere). The problem with this language is that a koine does not exist: "eml" is just a generic name for a set of different dialects of Latin with strong lexical and grammatical links. So, no template is possible for "eml", strictly speaking, only separate templates for each dialect (mine is classified as "Western Emiliano", I suppose, but even this is too broad). How are these dialects with no code supposed to be treated? As subcases of ISO-639-3 languages? Do we need more specific templates, accepting not only a language code and a description, but also a dialect specification? --Bettelli 17:04, 21 March 2006 (CET)

Thank you for bringing this discussion over here. As said before: we will have plenty of these issues. One of the basic requirements for getting the fine subdivision should be the completion of the Swadesh list of approx. 200 words as a basis, since this is the way to really show the differences between one language and another, or one "dialect" and another. You mentioned that there are already dictionaries that include more than 20,000 words - maybe it would be helpful to list them somewhere. (to Gerard: where should we put such resources as long as we don't have a portal page for a dialect/language?) As for the codes: sometimes it is hard to decide how to distinguish them. For Griko Salentino, for now we decided to use el-ita, because it is a language that comes from Greek and is spoken in Italy, but even there we have at least two varieties that differ in terminology. So how can we deal with that? ISO-language code + local attribution (large) + local attribution (more restricted?) - example: el-ita-cal, el-ita-pug? And a similar system for Emiliano? eml-west or, as proposed, arsan or arzan? Hmmmm .... --Sabine 18:32, 21 March 2006 (CET)
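
A rough Python sketch of the layered code scheme suggested above (language code + large local attribution + more restricted local attribution). The helper names dialect_tag and parent_tag are hypothetical, and the sketch assumes nothing about how OmegaWiki would actually store such codes.

```python
def dialect_tag(language, *attributions):
    """Join a language code and zero or more local attributions with hyphens."""
    parts = [part.strip().lower() for part in (language, *attributions)]
    if any(not part for part in parts):
        raise ValueError("empty component in dialect tag")
    return "-".join(parts)


def parent_tag(tag):
    """Return the next broader tag, or None when only the language code is left."""
    head, separator, _ = tag.rpartition("-")
    return head if separator else None


if __name__ == "__main__":
    calabria = dialect_tag("el", "ita", "cal")
    print(calabria)                          # the more restricted variety
    print(parent_tag(calabria))              # the larger attribution
    print(parent_tag(parent_tag(calabria)))  # the bare language code
```

With such a scheme, el-ita-cal and el-ita-pug share the broader tag el-ita, so the two Griko varieties stay grouped while remaining distinguishable.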

Tok Pisin babel template text

I've worked out a (tentative) set of translations for the babel templates for tpi:

  • tpi-1: Dispela manmeri i save rait long Tok Pisin liklik tasol.
    • Literally: This person [knows how to write/can write] in Tok Pisin a small amount only.
  • tpi-2: Dispela manmeri i save rait long Tok Pisin planti liklik.
    • Literally: This person can write in Tok Pisin quite a bit ("planti liklik" = "plenty little")
  • tpi-3: Dispela manmeri i save rait long Tok Pisin planti.
    • Literally: This person can write in Tok Pisin lots.
  • tpi-4: Dispela manmeri i save rait long Tok Pisin klostu olsem wantok.
    • Literally: This person can write in Tok Pisin close to native.
  • tpi-5: Dispela manmeri i wokim bisnis long Tok Pisin.
    • Literally: This person [works in/makes business out of] Tok Pisin.
  • tpi: Dispela manmeri tok bilong em i Tok Pisin.
    • Literally: This person's language is Tok Pisin. ("native" is implied in "tok bilong em")
  • And for the user_tpi category: Dispela manmeri olgeta ol i save tok long Tok Pisin.
    • Literally: These people can talk in Tok Pisin.

Corrections are more than welcome. I've included the literal translations so that you can see some of the decisions I had to make when translating. I know no word for "user"; it's easy to say "person uses" ("manmeri i yusim"), but that would make it difficult to convey "contributes" in the same sentence. (Well, for me. With luck, better speakers can come along and improve on this.) --Wytukaze 21:40, 14 March 2006 (CET)

Hmm, turns out Wikipedia does have some tpi templates after all. You lied to me, Gerard! ;) --Wytukaze 23:56, 14 March 2006 (CET)
I did not find them .. Lying is with intent .. so I was mistaken .. GerardM 00:40, 15 March 2006 (CET)

Occitan language templates

Could someone please have a look at the Occitan language templates and tell us if they are OK? Please have a look at the category. The templates were edited by an anonymous user and, not knowing any Occitan, I cannot check things. Thank you! --SabineCretella 14:54, 20 March 2006 (CET)

According to some web searches, it's ok. But I don't speak Occitan, so I'm not sure. Kipcool 19:43, 20 March 2006 (CET)
I am too primitive an Occitan speaker, so I'll extend the invitation to people who deal with the language professionally. --Bèrto 'd Sèra 21:11, 4 May 2006 (CEST)

Template:User pt-5

A suggestion for the text on the User pt-5 template: "Este usuário pode contribuir com um nível profissional de português."

Já feito. (Already done) --Shibo77 08:32, 21 March 2006 (CET)

It is ok to have user names in other scripts

Well, this says it all :) --ゲラルド・メイセン 09:30, 21 March 2006 (CET)

That would make it very difficult to find a username with non-ASCII characters. --Shibo77 11:25, 21 March 2006 (CET)
And make it impossible to distinguish Ms ??? from Mr ???

Jcwf 16:26, 21 March 2006 (CET)

A user has to be unique. That is basically what it is about.. When a user does not have a Latin background, he is not expected to have a Latin based username. This IS a multi language environment and you cannot expect to have one language or even one script to define the project. GerardM 16:31, 21 March 2006 (CET)
I'm sorry that I have to disagree. It is exactly because this has an international community that we need a "least common denominator". Hypothetically there are equal numbers of users from 200 different language versions of Wiktionary. More than half of them have Latin based scripts, and a great majority of them are able to type ASCII alphanumerics easily, without much trouble or installing new software. If a username were User:ゲラルド・メイセン, it would greatly favour those familiar with the Japanese script, which is 1/200 of the total user population, and heavily disfavour the other 199/200, because the other 199/200 will probably have to install new software in order to even see the characters properly. If I wanted to find the user ゲラルド・メイセン, unless by chance it is in the recentchanges list, it would be next to impossible to find that person. Moreover, there are times when all that is shown from a different script is boxes. On the Chinese Wiktionary, there was a vandal who made about 40 accounts calling XX names (all with 4 character usernames). As you might know, the Chinese projects use character conversion. There are four different sets of characters, and four different language settings. This means every character may have 16 different versions, and a four character username like the vandal's could have 64 different versions depending on the language/character/input settings. In order to reach the vandal, I have to try each of the 64 settings. Fortunately, the vandal's usernames all happened to be shown at the top of the Allusers list. This is why I think usernames here should be easily typable on a standard alphanumeric keyboard. --Shibo77 17:38, 21 March 2006 (CET)
I agree. I have no problem with such usernames using such characters in their signatures, on their userpages, etc... but there has to be a way to find them... Celestianpower Hablame 19:57, 21 March 2006 (CET)
In many countries, including Italy, where I was born, it's illegal to use your script and your language to define yourself. My real name, as you see it here, will never be allowed to appear on my passport, as issued by such a "democratic" country. You are supposed to use a "common denominator" (Italian in this case, but the id of the denominator does not change the problem; I guess Occitans have the same kind of trouble in France). I find this simply insulting and I suppose anyone having his name in a non-Latin script would feel the same, if forced to use an ASCII transliteration. There is no need to be able to pronounce my name, all you need to do is to click on it. And should this really become a problem, then using a progressive ID number coupled to the name would solve any problem whatsoever. With no need to see our names once again blurred into funny foreign scripts. Respect and practicality can walk together --Bèrto 'd Sèra 21:26, 4 May 2006 (CEST)

Gerard, I assume this was only a theoretical question. It should be self-evident that all languages and scripts are allowed in a multilingual project and community. We need to forget most of the issues we experienced before. special:Listusers does not list any non-Latin based account. To see the fun of having mixed scripts in account names, just take a look at [1] and [2].
Don't fear! The content from [3] reads right to left at [4]. Regards Gangleri · T 02:00, 22 March 2006 (CET)

Kirrkirr, a great tool to visualise dictionary content

I found this document for a great tool that does the visualisation of dictionaries. The tool can be downloaded from here. I will get into contact with Stanford and it would be great if we can cooperate :). GerardM 09:27, 22 March 2006 (CET)

I do not believe anyone can read this

Have you seen this:
Arbëreshë -  Afaraf -  Arvanitika -  Аҧсуа -  Afrikaans -  アイヌ イタㇰ -  Akan -  Gheg -  Tosk -  አማርኛ -  Amuzgo -  Englisc -  العربية -  ܐܪܡܝܐ -  aragonés -  Mapudungun -  অসমীয়া -  asturianu -  Авар МацӀ -  Aymar -  Azərbaycan dili -  Башқорт -  Bamanankan -  Boarisch -  Basa Bali -  Беларуская -  বাংলা -  भोजपुरी -  Bislama -  བོད་ཡིག -  bosanski -  ইমার ঠার/বিষ্ণুপ্রিয়া মণিপুরী -  brezhoneg -  Bodo -  буряад хэлэн -  Български -  Basa Ugi -  català -  Sugboanon -  česky -  Chamoru -  Нохчийн -  Chinuk wawa -  Choctaw -  ᏣᎳᎩ -  Чӑваш чӗлхи -  Tsetsêhestâhese -  Chikasha -  国语 -  國語 -  Kernowek -  corsu -  Nēhiyaw/ᓀᐦᐃᔭᐤ -  kaszëbsczi -  Cymraeg -  dansk -  Deutsch -  Thuɔŋjäŋ -  Zazaki -  ދިވެހ -  Dolnoserbski -  རྫོང་ཁ -  Ελληνικά -  Emilià -  English -  Middle Englisce -  Eʋe -  Esperanto -  eesti -  Euskara -  Eʋe -  فارسى -  Føroyskt -  Na Vosa Vakaviti -  suomi -  français -  Frysk -  Arpitan -  Frasch -  Seeltersk -  Fulfulde -  furlan -  Galoli -  Gàidhlig -  Gaeilge -  Ga -  赣语 -  Alemán Coloniero -  Kiribati -  Galego -  گیلک -  Gaelg -  𐌲𐌿𐍄𐌹𐍃𐌺 -  Αρχαία Ελληνική -  avañe'ẽ -  Schwyzerdütsch -  Wayuύ -  ગુજરાતી -  X̲aat Kíl -  هَوُسَ -  Hausa -  客家话 -  客家話 -  Kreyòl ayisyen -  Hawai'i -  עברית -  Otjiherero -  हिन्दी -   -  Hiri Motu -  Hornjoserbšćina -  湘语/湘话 -  magyar -  hrvatski -  Հայերեն -  Interlingua -  Bahasa Indonesia -  Interlingue -  Igbo -  Ido -  ꆇꉙ -  Iñupiak -  ᐃᓄᒃᑎᑐᑦ -  Ilokano -  íslenska -  italiano -  Basa Jawa -  Lojban -  日本語 -  ثاقبايليث -  Kalaallisut -  Kanuri -  कॉशुर -  ქართული -  Қазақша -  ភាសាខ្មែរ -  Gĩkũyũ -  kinyaRwanda -  Кыргызча -  Хакас тілі -  kurmancî -  kiKongo -  कोंकणी -  Kuanyama -  Kurdî/كوردی -  한국어/조선어 -  Hiraya -  Karjalan kieli -  Kölsch -  Kwaya -  Dzhudezmo -  Latina -  ພາສາລາວ -  Latviešu -  Lak -  Líguru -  Limburgs -  Lingála -  Lietuvių -  Ladin -  Lumbaart -  Lëtzebuergesch -  Luganda -  basa Madhura -  Kajin M̧ajeļ -  മലയാളം  -  मराठी -  Baso Minangkabau -  Македонски -  Kituba -  Bahasa Melayu -  
Malti -  闽北语 -  闽北话 -  Māori -  Muskogee -  mirandés -  maxi -  ဗမာစာ -  مَزِروني -  闽南语 -  閩南語 -  Bân-lâm-gú -  Dorerin Naoero -  Diné bizaad -  napulitano -  Nahuatlahtolli -  Plattdüütsch -  Oshiwambo -  नेपाली -  नेपाल भाषा -  Nederlands -  nynorsk -  dönsk tunga -  norsk -  Novial -  Chi-Chewa -  lenga d'òc -  ଓଡ଼ିଆ -  Oromoo -  Иронау -  Pangasinán -  Kapampangan -  پنجابی -  ਪੰਜਾਬੀ -  Papiamentu -  Pennsilfaanisch Deitsch -  Pälzisch -  Norfuk -  पाली -  piemontèis -  Polski -  português -  دری -  Peskotomuhkati -  Runa Simi -  Vlax Romani -  kiRundi -  Romansh -  română -  armâneashti -  Русский -  うちなーぐち -  тыла -  Sängö -  संस्कृत -  sicilianu -  Scots -  සිංහල -  Slovenčina -  Slovenščina -  Gagana Samoa -  ChiShona -  سنڌي، سندھی -  सिन्धी -  af Soomaali -  seSotho -  español -  sardu -  српски -  srpski -  siSwati -  Basa Sunda -  kiSwahili -  svenska -  Schwäbisch -  Sächs'sch -  Reo Mā`ohi‎ -  தமிழ் -  Tatarça -  తెలుగు -  Tetun -  тоҷикӣ -  ไทย -  ትግርኛ -  Tagalog -  ou tokelauien -  tlhIngan Hol  -  lea fakatonga -  Tok Pisin -  Kokborok -  Türkmençe -  chiTumbuka -  Türkçe -  Tuvaluan -  Setswana -  Xitsonga -  Tweants -  Twi -  Тыва дыл -  удмурт кыл -  ئۇيغۇرچه -  Українська -  اردو -  vèneto -  Tshivenda -  tiếng Việt -  Vlaoms -  Volapük -  Walscher -  Wáray-Wáray -  Walon -  faka'uvea -  Wollof -  吴语 -  Хальмг -  isiXhosa -  ייִדיש/Yidish -  ede Yorùbá -  粵語/广东话 -  Zeêuws -  Sawcuengh -  中文 -  isiZulu

It looks great on the main page, but I am sure that no one can read the characters for the as language. Is this something that can be helped? Is this something that needs fixing in MediaWiki? It should be more like this: অসমীয়া

Would the Ethnologue fonts that became available in January help ??? GerardM 23:53, 22 March 2006 (CET)

Style xx templates

{{msgnw:template:Style yi}}

<span style="font-size:{{{size|10pt}}};padding:4pt;line-height:1.25em; font-family: Times New Roman;" dir="rtl" >{{{yi}}}</span>

{{Style yi|yi=דער באַנוצער רעדט '''[[:Category:User yi|ייִדיש]]''' אַלס '''[[:Category:User yi-N|מוטער־שפּראַך]]'''.}}
<br clear="all" />
{{Style yi|yi=דער באַנוצער רעדט '''[[:Category:User yi|ייִדיש]]''' אַלס '''[[:Category:User yi-N|מוטער־שפּראַך]]'''.|size=8pt}}
<br clear="all" />
{{Style yi|yi=דער באַנוצער רעדט '''[[:Category:User yi|ייִדיש]]''' אַלס '''[[:Category:User yi-N|מוטער־שפּראַך]]'''.|size=12pt}}
<br clear="all" />

דער באַנוצער רעדט ייִדיש אַלס מוטער־שפּראַך.
דער באַנוצער רעדט ייִדיש אַלס מוטער־שפּראַך.
דער באַנוצער רעדט ייִדיש אַלס מוטער־שפּראַך.

GerardM, please update the list at Style xx templates with all the language codes and insert the content of the templates already available there. As far as I know, template:Style yi is the only style template using a default size {{{size|10pt}}}.
The "default value" for the optional parameter {{{size}}} should be used inside OmegaWiki to display text in that language both in the predefined MediaWiki namespaces and later also in the GEMET: namespace.
If a language community would like to use another size for the "Babel User templates", they should just add "|size='''yy'''pt" in their "Style xx templates"; they should do as they like. Best regards Gangleri · T 00:55, 23 March 2006 (CET)
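
The behaviour of the optional {{{size|10pt}}} parameter can be mimicked outside MediaWiki. The Python sketch below is only an emulation for illustration: the function render_style_yi is hypothetical, and it copies the span markup of template:Style yi as quoted above rather than calling any MediaWiki code.

```python
# Markup copied from the template source quoted above; {size} and {text}
# stand in for the wikitext parameters {{{size|10pt}}} and {{{yi}}}.
STYLE_YI = (
    '<span style="font-size:{size};padding:4pt;line-height:1.25em; '
    'font-family: Times New Roman;" dir="rtl" >{text}</span>'
)


def render_style_yi(text, size="10pt"):
    """Return the span for the given text, falling back to 10pt when no size is passed."""
    return STYLE_YI.format(size=size, text=text)


if __name__ == "__main__":
    sample = "ייִדיש"
    print(render_style_yi(sample))               # default size, like {{Style yi|yi=...}}
    print(render_style_yi(sample, size="8pt"))   # like |size=8pt
    print(render_style_yi(sample, size="12pt"))  # like |size=12pt
```

The three calls mirror the three template invocations shown above: one relying on the default and two overriding it.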

different Chinese variations

Hi, I want to raise a discussion about the necessity of having so many variations of Chinese here. As you know, although there are a lot of dialects, or even languages according to some, making up what we call Chinese, practically they all have the same written form, with the same vocabulary and the same grammar. So I wonder if it's necessary to have so many Chinese portals here, since we communicate mainly through writing. I suggest reducing the number of these variations to 3 or 4, e.g. zh, zh-wen (or: zh-yue, zh-min-nan). Well, I would still appreciate Shibo77's hard work of bringing them from zh.wikipedia.--Demos-λέγεις 09:38, 23 March 2006 (CET)

I would not presume to limit Greek to only one portal. The Griko spoken in Italy is in essence two distinct variations of Greek. I would not presume to limit that either. When the Chinese are of the opinion that the orthography is distinct, who are we to say otherwise? China is so big, it is a continent in its own right. When you consider how many languages exist in Europe, I would certainly have room for an equal amount of diversity in China. GerardM 10:00, 23 March 2006 (CET)
Sorry, maybe it's my signature which makes you confused. In fact I'm a Chinese and I don't speak Greek at all:) I've thought it over again, and I'm convinced that if the different portals are just for personal identification, I would absolutely agree with it, and that's what we do in Chinese Wikipedia. But if it's a place where we carry on our wiktionaries project related with Chinese, I would like to see everybody cooperate under a unifying zh portal. Perhaps I don't have a clear vision of the function of portals. I'd like to have your opinion. Thx. --Demos- 10:42, 23 March 2006 (CET)
Besides there being two sets of Chinese characters, there are differences among the vocabularies of different dialects. For example: 流動電話(HK),行動電話(TW),移动电话(CN). "今日仔" is found in Min-nan but not found in Hong Kong. --Xiaowei 17:36, 24 March 2006 (CET)
Hi, maybe what Gerard meant did not come across correctly. The thing is that we have similar necessities for other languages as well. China is such a huge country that having varieties that can be considered local languages is normal. He took Griko Salentino as an example - for now this language is not considered a language of its own since it does not even have an ISO code, but it is a separate language. It is neither Greek (many attribute it to it) nor any Italian dialect. It is a language that developed out of ancient Greek ... well I omit many things here. And even in these relatively small geographical regions we have different languages - Griko is not Griko ... there are two varieties ... this example was meant to show that we want to identify these differences. Of course, there are words in common - like you have them with Italian and Neapolitan, like you have them with German and any German local language, but having terms in common does not mean they are the same language ... the differences are the important parts and these differences show if a language is to be considered as such. When you consider it the other way round: in financial texts in Italian you have most words in English ... so should we consider Italian to be a subset of English? For sure not .... (just find that kind of my reasoning funny ... sorry, joking about myself ... I hope it comes through ... and this discussion helped me to find a way to talk about European languages ... thanks!). --Sabine 19:27, 24 March 2006 (CET)
Thank you all for your opinions. I think it is a subject well worth discussing: to what extent a language can be deemed a language, or at least, in this project, what makes the difference between languages. As is in the "language".
If I've not misunderstood, when a language is defined, we have to translate all the [[DefinedMeaning]]s into its proper form. So we'll have to treat as many languages as we commit to. So if we have 20 variations of Chinese, the number of translations needed will be as many as 20 times the number of all the defined meanings.
What's more, I think you might have missed one point in my initial proposal: all the variations of Chinese have almost the same vocabulary and the same grammar as Mandarin. Since the wiktionary is to collect the words, we can only consider the words. Practically I think the resemblance between the vocabularies (defined as the set of words) of different variations is higher than 99% (I give the reasoning below, but it's not important). Which means that we'll have to redo 15.2 times the number of defined meanings in order to have the same words in different variations. It's not a pleasant job, is it?
So I'm wondering if it is worthwhile to accept so many languages just for fun. Or, more interestingly, would it be possible to have an inheritance relationship between languages, so that when a child hasn't defined the word for a meaning, it can just take the value from its mother language? Thus we can have many (sub-)languages without multiplying the data. I'm not a professional, so don't hesitate to correct me if you find it unreasonable.
*Reasoning: We just take orders of magnitude. The vocabulary proper to each variation is of the order of 10^2~10^3 words (maybe TW and HK are higher, but I don't think they're over 10^4; there's also zh-cmn which is almost identical to zh). As for the number of defined meanings, we can conservatively set it to 10^5~10^6 (an ordinary dictionary), not to mention that, given the high ability of combining words in Chinese, the number may be much higher. From that I deduce the result "99%". Again, I'm not a professional, so you are welcome to correct me if you find it unreasonable. --Demos- 23:51, 24 March 2006 (CET)
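
The order-of-magnitude estimate above can be written out as a small Python calculation. The function name shared_fraction is hypothetical, and the inputs are the rough figures from the comment, not measured data.

```python
def shared_fraction(variety_specific_words, defined_meanings):
    """Fraction of defined meanings whose word is NOT specific to one variety."""
    if defined_meanings <= 0:
        raise ValueError("defined_meanings must be positive")
    return 1 - variety_specific_words / defined_meanings


if __name__ == "__main__":
    # Least favourable combination of the quoted ranges:
    # 10^3 variety-specific words against 10^5 defined meanings.
    print(f"{shared_fraction(10**3, 10**5):.2%}")
    # Most favourable combination: 10^2 against 10^6.
    print(f"{shared_fraction(10**2, 10**6):.2%}")
```

Even the least favourable combination of the quoted ranges leaves 1 - 10^3/10^5 = 99% of the vocabulary shared, which is where the "99%" figure comes from.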
I agree, it looks to me as if it's the same difference as between the French spoken in France, the French spoken in Belgium, and the one spoken in Canada. I hope we'll do only one portal for French!
How many of these writing forms are official languages? Kipcool 00:16, 25 March 2006 (CET)
However, I do hope there will be a way to specify if a French word is used specifically in Belgium, or Canada, or Switzerland... Koxinga 00:36, 25 March 2006 (CET)
Perhaps we could have portals only for those that are written significantly differently, and for those differing mainly in pronunciation we would simply have the Babel templates, so we could contact the user when in need of sound files? --Shibo77diskuto 10:51, 25 March 2006 (CET)

(returning to the beginning) I don't agree with that. The differences between these dialects are very important knowledge for humankind. OmegaWiki needs to include this. If a word is the same in all dialects, we need only include an entry for zh; if it's different in yuh we can just add an entry for it. My mother tongue is the Taishan dialect, and if it differs from yuh I will add an entry for it. --地球發動機 16:51, 15 April 2006 (CEST)
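
The idea raised in this thread (one entry for zh, plus entries only where a variety differs) amounts to a lookup with fallback to a parent language. A minimal Python sketch follows. Everything in it is an assumption made for illustration: the PARENT table, the tag zh-hk, the meaning labels and the placeholder word are hypothetical, and only the two words for "mobile phone" are taken from the example given earlier in the discussion.

```python
# Hypothetical parent relation between varieties and the general language.
PARENT = {"yuh": "zh", "zh-hk": "zh"}

# Hypothetical data: each language stores only the words it defines itself.
ENTRIES = {
    "zh": {
        "mobile phone": "移动电话",
        "shared meaning": "<word shared by all varieties>",
    },
    "zh-hk": {"mobile phone": "流動電話"},
}


def lookup(meaning, language, entries=ENTRIES, parent=PARENT):
    """Return (word, language it came from), walking up to parent languages."""
    seen = set()
    while language is not None and language not in seen:
        seen.add(language)  # guards against accidental cycles in the table
        word = entries.get(language, {}).get(meaning)
        if word is not None:
            return word, language
        language = parent.get(language)
    return None, None


if __name__ == "__main__":
    print(lookup("mobile phone", "zh-hk"))    # defined by the variety itself
    print(lookup("shared meaning", "zh-hk"))  # inherited from zh
    print(lookup("shared meaning", "yuh"))    # inherited from zh
```

With this design a variety stores only what differs, so adding a twentieth variety does not multiply the number of translations by twenty.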


There has been a request to remove the templates for Andalusian. Andalusian is, according to Spanish law, not a language. However, OmegaWiki is not about what laws say (that is politics), nor is it only about languages. Andalusian, I have been told, is distinctly different: it has more Arab influences than Spanish has and it is closer to Neapolitan than Spanish is. Many words are specific to Andalusian, and including these under Spanish because they are spoken in Andalusian is the wrong idea.

Another argument used is that it does not have a single orthography. This argument would exclude several languages that are already on the list, languages that have their own Wikipedia, so that is also not an argument not to have Andalusian.

OmegaWiki is about DefinedMeanings with strings attached :) GerardM 10:29, 25 March 2006 (CET)

Moment :-) it is not closer to Neapolitan than Spanish, but knowing Neapolitan you understand quite a lot more when hearing people speak Andalusian. Only knowing Spanish you would not reach the same level of comprehension. The Andalusian project on Wikicities for example is quite understandable to me :-) It is hard to say which language is closer to which in this case ...
Orthography: yes, all Italian minority languages would be excluded, since there is no official standard, all German minority languages would be excluded, nds (has up to 400 varieties in Germany alone) would be excluded etc. etc. etc. Andalusian has its place in the language world (you can find texts online - poems, songs etc.). They are creating an encyclopaedia outside Wikipedia, since things were taken up too much on a political level and not on the language level when discussing this language - and I admire those who are working on the wikicities project since they DO instead of discuss ... yes, there are only a few of them, but even one person can do a lot ... As soon as we have editability it would be important to have the Swadesh list in OmegaWiki and we need it compiled for languages like Andalusian (with defined meanings ... and you will see that those approx 200 terms will become many, many more ...). --Sabine 11:45, 25 March 2006 (CET)
LOL not all of them :) We do have a standard orthography, since the end of the 18th century. But this won't solve the problem. Personally I do not see any problem (let alone a possible offence to nationalistic feelings) in the existence of Andalusian as an independent language/orthography. As long as there is a community willing to use a language, they have the right to do so. And nobody has the right to stop them from doing it. If you do not like Andalusian, simply ignore it, it's not going to bite you, is it? Live and let live... it's so simple --Bèrto 'd Sèra 22:20, 6 May 2006 (CEST)
I am very sorry for being the "black sheep" in the discussion, but Gerard doesn't seem to know the linguistic situation in Andalusia. If you ever go to Andalusia you will see what an innovation creating this new orthographic style is. My claims do not have any political purpose. My claims are not based on any law; they are based on what I know about Andalusia, and about Spanish. Andalusian words in Spanish are regional Spanish words, or am I not reading Spanish when I read Federico García Lorca? Maybe now, as I already said on another occasion, we have to change the name of what Spanish speaking people speak in Latin American countries, as their "Spanish" comes mainly from Andalusian colonizers. You may go to Andalusia and ask on the street if what they speak is another language. --Javier Carro 13:51, 25 March 2006 (CET)
By the way, we see that in many cases deciding what is and what is not a language will provoke discussion. How are we going to decide about it? The voting system used traditionally in Meta for creating new projects has proved to be obsolete. --Javier Carro 14:07, 25 March 2006 (CET)
I'd like to add that according to the Ethnologue Andalusian is a dialect of Spanish, as are the dialects spoken in South America, many of them further apart from standard Spanish than Andalusian. To the English speaking people it would be as if Australian, American, Texan and English should be considered different languages with their own grammar and dictionaries.
This takes us to the question of what languages are going to be accepted and what written variants are going to be accepted. If I say Texan is a language and, with four friends, I make a new written standard and decide we make our own Wikipedia/Dictionary, is it going to be accepted? What about artificial languages? Is Ethnologue going to be the authority or not? We'll have to reach a consensus, because if we don't we are going to have many problems.
Cheers, --Ecelan 13:01, 26 March 2006 (CEST)
Ethnologue is the authority for ISO-639. The procedure for a Wikipedia project is not the same as applying for a portal in OmegaWiki. There is no point to it. You cannot vote on whether there is a reason to have a portal for a language or a dialect. If there is a large body of specific vocabulary, there is enough reason to have a portal. How we are going to present the data is something that we have to sort out.
What I am waiting for is not so much how this discussion develops; I am much more interested in how it will morph into a discussion on how we are going to deal with American, Australian and all those other forms of English.... GerardM 14:12, 26 March 2006 (CEST)
Don't get me wrong, when I ask "is it going to be accepted?" I'm not talking about making a new wiki or portal or something. I'm talking about the criteria for accepting a word/spelling as an entry. Similar examples would be if a group of Texans or Cockneys decided to spell their dialect more according to their pronunciation. How many people have to use the spelling for it to be acceptable in this Wiki?
Let's see if I explain myself better. Ojo (eye) is pronounced /oXo/ in Castilian and /oho/ in other varieties (Andalusian among them). The only written standard is ojo for all pronunciations. So, is the spelling oho (as advocates for Andalusian as a language claim) going to be acceptable as an entry? Where is the limit for a word or spelling to be acceptable? Literature? Google? Ethnologue? How many people have to use a language or a spelling to be on this wiki?
And please, do not interpret this as if I was trying to avoid localisms in the dictionary, that is not the case.
The problem is similar for conlangs. Are the Verdurian(Wikipedia) or Sildavian vocabularies welcome? Wikis usually are not primary sources, but is an external Web page enough to make the entry not a primary source? How many Web pages are necessary?
--Ecelan 17:01, 26 March 2006 (CEST)
Well, we were thinking about not natural languages, or better constructed/designed languages time ago. The thing is: they exist, but they are not natural. This means they are part of the artificial languages like Ido and Esperanto, or better like Klingon. Well if there are people who want to care about these languages, there 's no reason why we should not allow for adding them (it could well happen that sooner or later someone searches such a term), but these languages should not be automatically visible for anyone, or better one should be able to exclude them from being listed.
But one thing is clear: there is a huge difference between a language spoken by people and by one "built" for a comic series or whatever kind of publication. So please let us avoid mixing natural languages and artificially developed languages in this thread. If you want to go ahead with this discussion, please create a thread on artificial languages. Thank you! --Sabine 16:55, 12 April 2006 (CEST)
Yes, the only possible answer is to have people customise the list of languages to be shown. There are going to be far too many of them in any case, even if we accepted ONLY governmental languages. And people on slow connections (have you ever tried using the internet on a boat, connecting via a mobile phone?) will definitely appreciate a lightweight page containing JUST what they need it to contain. I speak some 5-8 languages, and I may be interested in some 15-20 more, because they are close enough to those I know for me to make use of the entries, but apart from curiosity I won't make any use of an Arabic/Chinese/Persian or Hungarian entry. This is going to be the practical case for most users, if we leave ideology aside. --Bèrto 'd Sèra 22:20, 6 May 2006 (CEST)


The organisation that is entrusted with the further development of the ISO-639 codes is Ethnologue. They have a description of each language on their website. I have created a template, {{Ethnologue|nld|nl|nld}}, to link to this article at Ethnologue and to show the values of the ISO-639-1 and ISO-639-3 codes.

I would like to see that this information becomes part of all Portals in one way or another.. The Ethnologue text is in English. GerardM 08:34, 26 March 2006 (CEST)


I have created a new template: {{Ethnologue-x|KUR|ku|kur|kmr}}. The first parameter is the SIL code, then the ISO-639-1, -2 and -3 codes respectively. This is to indicate that the current code used is problematic. For Kurdish as we use it in the WMF, it definitely is.

Can someone help to make the template look better? The ISO-639-3 code goes to a new line for reasons that I fail to understand. GerardM 15:29, 26 March 2006 (CEST)

The new line has been fixed by putting a non-breakable space &nbsp;. The ISO 639-3 for Kurdish should be kur, see [5]. The more specific types of Kurdish are kur-ckb for Central Kurdish, kur-kmr for Northern Kurdish and kur-sdh for Southern Kurdish. ---Moyogo
Moyogo, in ISO-639-3 kur is a group of languages and what you call kur-kmr is not how Ethnologue defines them; it is just kmr. GerardM 19:12, 26 March 2006 (CEST)
OK, I guess I was talking about what is currently being defined by RFC 3066bis. ---Moyogo 20:35, 29 March 2006 (CEST)

Magical conversion[edit]

A multilingual project should use different extensions for magical character conversion and / or transcriptions.

{{convert|eo|Cxu vi kontribuas ankaux esperantlingve?}}

should work like {{subst:}} and change the source code while saving. The result in the page source should change to

Ĉu vi kontribuas ankaŭ esperantlingve?

"Amike" Gangleri · T 14:30, 26 March 2006 (CEST)

Do you mean: accept a string that is formatted in such a way that it CAN be converted and save it in the formatted way. GerardM 14:37, 26 March 2006 (CEST)
Yes, "magical character conversion" is implemented for Esperanto by default; you type "Cx" and the character saved in the source is the UTF-8 character "Ĉ".
As more and more languages use LanguageConverter.php, it will become a common standard that you can type in any script variant of a language.
Search MediaZilla for "LanguageConverter".
Best regards Gangleri · T 06:29, 27 March 2006 (CEST)
Now there's a wonderful idea! Would be great for Vietnamese, too. David 18:43, 26 March 2006 (CEST)
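For readers unfamiliar with the Esperanto "x-system", the conversion discussed in this thread can be sketched in a few lines of Python. This is only an illustration: the names `X_SYSTEM` and `convert_eo` are made up here, it is not how MediaWiki or LanguageConverter.php implements the feature, and it does not handle any escape mechanism for a literal "x".

```python
import re

# Esperanto x-system digraphs and the letters they stand for.
X_SYSTEM = {
    "cx": "ĉ", "gx": "ĝ", "hx": "ĥ",
    "jx": "ĵ", "sx": "ŝ", "ux": "ŭ",
}

# One of c, g, h, j, s, u (either case) followed by x or X.
_DIGRAPH = re.compile(r"([cghjsuCGHJSU])[xX]")


def convert_eo(text: str) -> str:
    """Replace x-system digraphs with the corresponding Unicode letters."""
    def repl(match):
        base = match.group(1)
        letter = X_SYSTEM[base.lower() + "x"]
        # Keep the case of the base letter ("Cx" gives an upper-case letter).
        return letter.upper() if base.isupper() else letter
    return _DIGRAPH.sub(repl, text)


print(convert_eo("Cxu vi kontribuas ankaux esperantlingve?"))
```

Run on the sentence from the example above, this should print the converted form quoted in the original request.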


I have loaded the ERD with the OmegaWiki article; it helps explain the relation between the tables that are in use (I had sent them to Erik for review) so I was able to :) GerardM 13:26, 28 March 2006 (CEST)

nested conditional templates[edit]

  • Halló! Dev_HEAD:template:WISOTestList#test shows an implementation of multilingual text with adaptations specific to the different scripts.
  • Conditional templates are very powerful. Some people like neither nested templates, nor conditional ones, nor extending the template syntax, and claim that it costs "performance".

MediaZilla:02777 / bugzilla:02777 – "{{substall:foo}} beside {{subst:bar}}"

  • This request would allow a bot to update relevant templates "on demand".
  • With conditional templates one can build all "Babel" user templates with a few "tables". For each sentence one will need one basic "repository" / "template" with the translations and the language code.
  • I would like to build this generation of user templates at w:yi:. When they work there we could replace the templates here. The styling parameters can be used at all Wiktionaries. Would be happy about any help. Best regards Gangleri · T 16:43, 28 March 2006 (CEST)
  • Hi! It sounds like a great idea, but the example page seems to be empty. Where can I see an example? --Bèrto 'd Sèra 22:29, 6 May 2006 (CEST)

Friulian portal[edit]

I don't see Friulian/furlan portal (iso code:fur). Could you add it? BTW this project looks promising ;) Mandi Klenje


To what extent are we going to separate this from en? Do we need a Portal:en-us and Template:User en-us? Vildricianus 14:45, 29 March 2006 (CEST)

How to treat American-English eh Australian-English eh Canadian-English eh Jamaican-English, South-African-English. I have discussed this with many people and to be honest, there are several options. No choice has been made. A portal sure; have all the portals you like but please link to them from the Portal:en :) GerardM 19:10, 4 April 2006 (CEST)
In GEMET:biology, there are 2 English languages shown. Is en-us separated from en-uk in GEMET? Kipcool 14:43, 12 April 2006 (CEST)
In GEMET it certainly is, and there are good reasons to make this distinction. The orthography of English is decidedly different depending on which tradition you adhere to. GerardM 16:29, 12 April 2006 (CEST)


Many languages with portals are, like Kurdish, macrolanguages; see [6]. Shall OmegaWiki make allowance for this? --Balû 09:13, 4 April 2006 (CEST)

We need these divisions, so it is clearly a yes for macrolanguages: they are different from each other. Sorry for being short ... --Sabine 21:07, 4 April 2006 (CEST)
Arabic is one of those Macrolanguages, and apparently English will have a very similar status, so portals will be useful. ---Moyogo 00:04, 10 April 2006 (CEST)
Right now the macrolanguage portals are getting links to the various individual languages that they cover, but that is a one-way street. Should the individual languages also get links to the macrolanguages? --Sannab 09:38, 15 May 2006 (CEST)
This is very much wanted.. The same goes for adding links to Wikipedia.. At this time only the English Wikipedia has sufficient coverage of languages that we can deal with this. Also, this will become different when we do have the Multilingual MediaWiki. GerardM 13:46, 15 May 2006 (CEST)
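The two-way linking asked for above could be derived from a single macrolanguage table rather than maintained by hand. The following is a minimal sketch under that assumption; the Kurdish codes are the ones mentioned earlier on this page, while `MACROLANGUAGES`, `build_reverse_index` and `portal_links` are hypothetical names, not part of any OmegaWiki software.

```python
# Macrolanguage code -> individual language codes it covers.
# Only Kurdish is listed, using the codes from the discussion above:
# ckb (Central Kurdish), kmr (Northern Kurdish), sdh (Southern Kurdish).
MACROLANGUAGES = {
    "kur": ["ckb", "kmr", "sdh"],
}


def build_reverse_index(macro_to_members):
    """Derive the individual-language -> macrolanguage direction."""
    reverse = {}
    for macro, members in macro_to_members.items():
        for code in members:
            reverse[code] = macro
    return reverse


MEMBER_TO_MACRO = build_reverse_index(MACROLANGUAGES)


def portal_links(code):
    """Return the language codes that the portal for `code` should link to."""
    if code in MACROLANGUAGES:
        return list(MACROLANGUAGES[code])   # macrolanguage -> its members
    if code in MEMBER_TO_MACRO:
        return [MEMBER_TO_MACRO[code]]      # member -> its macrolanguage
    return []                               # not part of any listed group
```

With this shape, adding a language to the table once is enough for both directions of the link to appear.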

Macrolanguage portal format[edit]

The thing is, IMO we have not really come up with a good format for macrolanguage portals, and it might be that we will not be able to find a standardized format. For Nahuatl (not a macrolanguage, but a language collective), Gerard did a long list in English (since none of us currently working on OmegaWiki speak Nahuatl), in combination with a Babel-like box collection for all the included individual languages. For Chinese, there is another list format, in Chinese only.

If I understand it correctly, portals for macrolanguages and language collectives have been/should be included in Category:Ambiguous user language.

Another question is whether there should be user categories for macrolanguages or not, I am not sure. One option would be to have an undifferentiated user category (that is no 1-5,N) for the macrolanguage, since such specification seems less helpful for macrolanguages, but it might still be desirable to be able to identify as and locate speakers of a macrolanguage. If this is desirable, then the various user templates for the individual languages should most likely add the macrolanguage category too.

One thing to remember though, is that if we want people to start moving to the individual language codes in ISO-639-3, a macrolanguage portal should not be too useful *smile* --Sannab 10:33, 17 May 2006 (CEST)

special:Watchlist and special:Watchlist/edit[edit]

  • Halló! Behaviour in MediaWiki version 1.6: If you return from a "Wikibreak" and look at "Watchlist" you may notice that a lot of pages have disappeared. Deleted messages have disappeared from there.
CVS:/includes/SpecialWatchlist.php‎ and SVN:/includes/SpecialWatchlist.php?view=log
MediaZilla:05490 / bugzilla:05490 – "Your watchlist after a "wikibreak""
  • Regards Gangleri · T 14:21, 7 April 2006 (CEST)
Done SVN:/trunk/phase3/includes/SpecialWatchlist.php?r1=13517&r2=13538&pathrev=13538
Thanks! Gangleri · T 09:09, 8 April 2006 (CEST)

What if there are only regional variations?[edit]

Hello, I'd like to point out a problem that applies to Italian regional languages but possibly to all language groups where a common standardized official dialect is missing. Inside the portals of individual languages we've written something like "we would like to take care of regional variations but we have to do the basic work first". Ok, I agree, but... what about when there isn't a standard version of what is considered to be an individual language?

Please consider sicilianu or napulitano. The first should correspond to the so-called "southern-extreme romance language", which includes variations spoken in three different regions of Italy: Sicily, Calabria and Salento. If we look at the corresponding portal on it.wiktionary and the actual babel templates here, it seems that a single variation has become representative of the entire language group: not simply the "Sicilian" variation, but a specific variation within the Sicilian version of this language. Moreover the coat of arms of Sicily has been placed on the portal.

I don't want to criticize the good work done by the authors but... do you really think this is the right way? How would a guy from Salento agree to see his language, which has a good independent literature, arbitrarily annexed to the Palermo dialect? Consider also Napulitano, which should correspond to another Romance language group that also includes Apulian dialects. Here too I'm very doubtful about the pertinence of the name "napulitano" (even if Ethnologue has this name, Ethnologue seems to forget Apulian dialects and limits napulitano to Campania, part of Lucania and Northern Calabria). Moreover I don't believe that an Apulian guy, and possibly also a Lucanian one, would agree to be called a Napulitano speaker, or would agree that his language can be learned through the Napulitano course on Wikibooks.

For the Greek dialects of Italy (portal:el-ITA) we face a similar situation: there are two variations that perhaps could be considered to form a language group apart from Modern Greek, derived from Koinè Greek. But which variation should be used for babel templates and the portal, and tomorrow to give the first version of a term? I know the Salento version and I used it to write the templates, but do you really think this is correct? I'm very skeptical about it.

Maybe in these cases we should create entries for individual regional variations (such as scn-PAL, scn-SAL..), especially when a good literature exists for a single variation, and I think we should avoid creating misleading common portals and babel entries when a common official variation does not exist (or possibly accept common portals with the list of variations but without babel boxes).

Thank you and sorry for this boring reasoning.

Frangisko 15:45, 9 April 2006 (CEST)

In OmegaWiki we can and we should distinguish between the different orthographies and the different dialects. When I asked you to create the Swadesh list for Greek, it was asked in such a way that you could do this for both versions of el-ITA (in effect putting a bomb under el-ITA). When people want to argue about languages, they can do it elsewhere. When people want to make a point about dialects and languages, they can do it here. The Swadesh list is one instrument to demonstrate the difference between dialects, languages and orthographies.. And yes, I want them all :) GerardM 17:02, 9 April 2006 (CEST)
Ok, I've started to work on Swadesh lists for Extreme Southern Romance (variations of the single Romance language classified as ISO-639-3 scn) and on a Swadesh list for Greek languages (with two entries for the main variations). I'm not yet expert enough to create categories and sub-categories, so I placed them on my user page and a new page. I can imagine that we will have to tidy everything up and collect all Swadesh lists somewhere... However, we'll see Calabrian Greek entries to compare the languages and decide whether it would be proper to maintain a unique code for Grecanic, or possibly create a new code with two regional variations such as grk-CAL or grk-SAL, or possibly consider them as variations of el, such as el-SAL, el-CAL; but in this case it would be more proper to consider them variations of Koine Greek (ke) or Byzantine Greek rather than Modern Greek (the problem is that the known written Byzantine Greek is not exactly the corresponding spoken language, with the result that Griko seems more similar to Modern Greek). Frangisko 19:53, 11 April 2006 (CEST)
Just one note on this: we are well aware of the fact that there are Apulian dialects that are very different from Neapolitan. The code nap is often understood as only being for the Neapolitan of Naples, so the Swadesh list is a way to show that nap as the only attribution is not enough, and that in fact there are these differences. Only by showing them and adding further sources can we really talk and say "we need a separate language code for this or that language". For now we can only use the codes that are there and adapt them to our needs. We have the same problem for Emiliano and Romagnolo, and of course for many other languages that are still not considered as such, even if there is plenty of literature and documents. I hope this helps to understand why we are going this way. Thanks! --Sabine 22:31, 11 April 2006 (CEST)

Edit summaries[edit]

Just a quick question: How are edit summaries supposed to work? We can only put them in our native language and in an international project, that's not quite good enough is it? Perhaps in time, we could auto-translate these using the relational data depending on your user preference? Celestianpower Hablame 23:03, 9 April 2006 (CEST)

Perhaps there should be a kind of automated way to fill them in. Vildricianus 15:44, 15 April 2006 (CEST)
Yes - that would work, it would just say what you did ("added language field" or "added synonyms") and these could be auto-translated easily as the software would type them in automagically. Celestianpower Hablame 20:48, 20 April 2006 (CEST)
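The idea in this thread, where the software records what kind of change was made and each reader sees the summary in their own language, might look roughly like the sketch below. Everything in it is an assumption for illustration: the message keys, the `SUMMARY_MESSAGES` table, the `render_summary` function and the Dutch wording are invented here and do not describe existing OmegaWiki behaviour.

```python
# Hypothetical message table: action key -> language code -> summary text.
# The two actions are the ones named in the discussion above; the Dutch
# strings are illustrative translations only.
SUMMARY_MESSAGES = {
    "added-language-field": {
        "en": "added language field",
        "nl": "taalveld toegevoegd",
    },
    "added-synonyms": {
        "en": "added synonyms",
        "nl": "synoniemen toegevoegd",
    },
}


def render_summary(action_key, user_language, fallback="en"):
    """Show a stored action key in the reader's preferred language.

    Falls back to `fallback` when no translation exists for that language.
    """
    translations = SUMMARY_MESSAGES[action_key]
    return translations.get(user_language, translations[fallback])
```

Storing the key instead of free text is what would make the translation automatic; free-text summaries would still need a human translator.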

editing interface[edit]

Ok, a general question on what it will look like. (Yeah I know, I can wait, but...)

Take for example GEMET:Africa, the first one I went to. I saw the Polish section empty, so I naturally clicked [edit]. And what do I see? Nothing. The edit box is empty. Well, one might expect that it would work more like [view source] on Wikipedia...

Anyway, what will the editing interface look like? The aforementioned page is just the same text duplicated 4 times. Assuming I want to insert that "Polish" section, what will I edit? One section? The entire article at once? Will I have to copy the content 4 times? Or even more? (GEMET:Afrika, GEMET:Afryka, GEMET:Afrikka, GEMET:Afrique...)

Yours confused and impatient, Misza13 T C 23:48, 9 April 2006 (CEST)

impatient .. sure, same here .. As to your question: there is ONE DefinedMeaning; all that is needed is to translate the definition into Polish .. :) GerardM 10:20, 10 April 2006 (CEST)

Thanks! Now more comments/questions to make sure I get it right:

  1. A GEMET: entry is just a plain entry for a word spelled out and doesn't carry any meaning by itself.
  2. When viewing a GEMET: entry, for each language in which it has a meaning, the DefinedMeaning is fetched and displayed (Celestianpower said it works by dynamic queries, unlike an ordinary wiki), which explains duplicate entries (a word has a DefinedMeaning in many languages). Likewise, on GEMET:kort there is only a Dansk entry meaning map. However, kort also means court (as in tennis court) in Polish. So when that DefinedMeaning is expanded, will it show up on GEMET:kort as well?
  3. If I'm understanding 2. correctly, is it planned that duplicated DefinedMeanings be merged to reduce clutter (so that only distinct ones are actually shown)?
  4. Finally, if I'm understanding it right, I must say that this is a fantastic idea, great lingual-and-programming challenge (and experiment!) and the guy that came up with this idea is a freakin' genius!

Can't wait to see it in action! Yours, Misza13 T C 19:40, 10 April 2006 (CEST)

A Word does not have a DefinedMeaning. One specific Expression and an accompanying Definition in the same language make up a DefinedMeaning. Translations are NOT themselves DefinedMeanings. When an Expression has multiple meanings, each meaning will be associated with its own DefinedMeaning. This DefinedMeaning does not need to be in Danish.

You are right that we will have to merge MANY DefinedMeanings in order to reduce clutter. When we merge multiple resources into OmegaWiki we will find a lot of redundancy; all this has to be reduced to as few DefinedMeanings as is useful. GerardM 17:10, 12 April 2006 (CEST)


Erik informed me that we are coming closer to the moment where OmegaWiki will have editability. I have seen a screenshot with two DefinedMeanings in there. The Definition is editable and it will be possible to add translations/synonyms (not edit them).

I will be ever so happy when the first people can start editing.. Because that is what it is all about. :) GerardM 19:25, 13 April 2006 (CEST)

As to timing, think this weekend.. GerardM 19:26, 13 April 2006 (CEST)

That's great. I'm very excited to see it. :-) --Tosca 20:41, 13 April 2006 (CEST)
This weekend seems to have passed :(. Celestianpower Hablame 20:50, 20 April 2006 (CEST)
Two weekends seem to have passed. Can we edit already? David 15:27, 24 April 2006 (CEST)
I am not happy about this. The good news is that I saw last Friday a version of software (running on a local host) that had WZ editability.. GerardM 20:10, 24 April 2006 (CEST)

Question on duplication[edit]

Here's a little example to illustrate a question I have about the OmegaWiki structure.

In English, the verb "to cut" refers to cutting with either a knife or a pair of scissors. However, in Dutch, "knippen" is (exclusively) for cutting with a pair of scissors (of any kind), whereas "snijden" is for all other meanings of "to cut" that I can think of.

Consider the following simplified scenario:

In the database, there exists only one DefinedMeaning, for the general action of "cutting". There are words associated with this DefinedMeaning, for different languages (which all have only one word for this DefinedMeaning), but Dutch has not yet been added. Then, at the point where we want to add the Dutch word, we find out about "knippen". At this point, we must add a new DefinedMeaning, because previously, no DefinedMeaning existed for "cutting with a pair of scissors".

What isn't entirely clear to me: what happens to the already existing associations of words with the original concept? Will they be duplicated and added to the new "cut with scissors" DefinedMeaning? Or will the new DefinedMeaning be created empty, and then associated only with the Dutch "knippen"?

Perhaps creating a new, empty DefinedMeaning will be a good idea in this particular example, because several languages have a word like "knippen". However, when OmegaWiki grows, I can imagine that we come across an exotic language that has a word without a fitting one-word translation in any other language. In that case, perhaps duplicating the entire DefinedMeaning and changing the entry for just the one language would be a better idea.

Is there a standard procedure planned to handle this eventuality, or will it be considered on a case-by-case basis? Or was I the first one to think of this (not very likely)? :)

I hope this is the right place to ask! If not, feel free to revert and post an answer to my user page. TIA! Laszlo 23:02, 13 April 2006 (CEST)

I think a new DefinedMeaning should be created in all cases. Adding it as a narrower term of "cutting" will link it to that. If you added "knippen" without a new DefinedMeaning it would become a synonym, without the possibility to explain the difference. HenkvD 00:42, 14 April 2006 (CEST)
The DefinedMeaning associated with "knippen" would have "cut" as a translation. The definition states that it is about cutting with scissors. The English "cut" does not cut it and is therefore marked as a non-endiginous meaning. Using relations it is indeed possible to indicate that "knippen" has a more precise meaning than "cut".
When a new DefinedMeaning is created, translations have to be consciously added. It cannot be assumed that what is good for one DM is also good for a similar DM. GerardM 13:15, 14 April 2006 (CEST)
Okidoki, sounds fair enough. Thanks for the answers! Laszlo 21:09, 15 April 2006 (CEST)
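To make the answers in this thread concrete, here is a small data-model sketch of the "knippen" case. It is a reading of the discussion, not the actual OmegaWiki schema: the class and field names (`Expression`, `Translation`, `DefinedMeaning`, `identical_meaning`, `broader`) are assumptions, and the definition texts are illustrative wording.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Expression:
    language: str   # language code, e.g. "nld"
    spelling: str


@dataclass
class Translation:
    expression: Expression
    # False when the expression does not mean exactly what the definition says.
    identical_meaning: bool


@dataclass
class DefinedMeaning:
    expression: Expression                 # the defining Expression
    definition: str                        # written in the same language
    translations: List[Translation] = field(default_factory=list)
    broader: Optional["DefinedMeaning"] = None   # "narrower term of" relation

    def add_translation(self, language, spelling, identical_meaning):
        """Translations are added deliberately, one by one; nothing is copied."""
        self.translations.append(
            Translation(Expression(language, spelling), identical_meaning)
        )


# The existing general meaning.
cutting = DefinedMeaning(Expression("eng", "cut"),
                         "To divide something with a sharp tool.")

# The new, narrower meaning starts with no translations at all.
knippen = DefinedMeaning(Expression("nld", "knippen"),
                         "Met een schaar in stukken verdelen.",
                         broader=cutting)

# English "cut" is attached, but flagged as not having the identical meaning.
knippen.add_translation("eng", "cut", identical_meaning=False)
```

The point of the `identical_meaning` flag is the one made above: "cut" can be offered as a translation of "knippen" without claiming that the two mean exactly the same thing.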

Porting of Swadesh lists from old Wiktionary[edit]

I noticed that there are some Swadesh lists in active development on this wiki. A lot more exist on the original Wiktionary (they all seem to belong to Category:Swadesh_lists). Shouldn't they be ported to this wiki? Or will this be done in an automated way at some point in the future? If so, should we continue editing/creating Swadesh lists on the old Wiktionary? Laszlo 16:24, 20 April 2006 (CEST)

Well, the Swadesh lists we are creating are being created for a very particular reason: we have languages that have no official rules, where we have "language groups" and not single languages within a certain code. I don't know the policies on the English Wiktionary - would they allow for many versions of Sicilian and many versions of Neapolitan (as well as for Lower Saxon and many other languages that have no official standard)? If yes: go ahead creating them - if no: it would make sense to start them on OmegaWiki. The Swadesh list in all languages will be a project of its own - we will see that the 207 original terms will become many more, because of multiple defined meanings. It cannot be wrong to have Swadesh tables for every language :-)
Does that help? Thanks! --Sabine 20:34, 20 April 2006 (CEST)
I'm not exactly proficient in any of the languages you mentioned (or any other languages that don't have fully filled Swadesh lists yet), it was more of a general question. My main question was what would be done with the existing Swadesh lists, but I guess they will be ported once the project is started here, which I can hardly wait for! :) Any news on that BTW? GerardM? Laszlo 07:36, 21 April 2006 (CEST)
Well yes, I suppose we will import them here. Gerard is not here today ... well, I have the same feelings on "can hardly wait to be able to start doing things" :-) Ciao! --Sabine 08:03, 21 April 2006 (CEST)

Notes on editability for OmegaWiki namespace[edit]

  • When editing please do not use the main edit tab, but the edit links on the right hand side of the screen. The main edit tab does not seem to work properly. --Sabine 19:48, 30 April 2006 (CEST)
    • This seems to be working fine now. —Vildricianus 16:44, 2 May 2006 (CEST)
  • I for myself will note on the discussion page of the article itself what I changed - so that this can be followed up if necessary. --Sabine 19:48, 30 April 2006 (CEST)
  • It is not possible to create wiki-links from within the definition. --Sabine 20:06, 30 April 2006 (CEST)
  • We cannot add a second different defined meaning to an existing one for now. --Sabine 20:48, 30 April 2006 (CEST)
  • Translations cannot be changed/corrected - please add problems with that on the discussion page adding also the template {{attention}}--Sabine 20:50, 30 April 2006 (CEST)
  • Defined meanings can be corrected/changed - please make sure that there are no misinterpretations and note a change of a defined meaning on the discussion page. This is really essential. Thank you! --Sabine 20:52, 30 April 2006 (CEST)
  • Addition of synonyms is possible.--Sabine 21:00, 30 April 2006 (CEST)
  • The number of permissions to edit we can grant is not clear right now. It could well be that the number is limited. Please don't be disappointed if we cannot grant you immediate access to edit the data.--Sabine 22:51, 30 April 2006 (CEST)
    There is no built-in limit to how many people we can accommodate. However, the current level of functionality is not built for concurrent access (so instead of getting an edit conflict, you will simply overwrite what someone else does), and most importantly, we have no way to roll back edits yet when they are bad, or to see exactly what has changed. Especially for the latter reason, I recommend only giving the functionality to people who a) are known and trusted, b) who understand how OmegaWiki is meant to work, e.g. what a DefinedMeaning is.--Erik 00:39, 1 May 2006 (CEST)
  • Be very careful when adding synonyms to make sure that any synonyms that you add with the checkbox ticked are exact synonyms or exact translations. At this stage a link "is more colloquial than" isn't possible. If in doubt, uncheck the box. Celestianpower Hablame 14:21, 2 May 2006 (CEST)
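Erik's note in the list above, that a later save silently overwrites an earlier one instead of raising an edit conflict, can be illustrated with a short sketch. It contrasts "last write wins" with a simple revision check; `DefinitionStore`, `EditConflict` and both save methods are hypothetical names for this illustration and say nothing about how the OmegaWiki software is or will be written.

```python
class EditConflict(Exception):
    """Raised when a save is based on an outdated revision."""


class DefinitionStore:
    def __init__(self, text):
        self.text = text
        self.revision = 0

    def save_unchecked(self, new_text):
        """Last write wins: whatever was saved in between is overwritten."""
        self.text = new_text
        self.revision += 1

    def save_checked(self, new_text, base_revision):
        """Refuse the save if someone else has saved since `base_revision`."""
        if base_revision != self.revision:
            raise EditConflict("the definition changed since it was loaded")
        self.text = new_text
        self.revision += 1


store = DefinitionStore("original wording")
base = store.revision                 # editors A and B both load revision 0

store.save_checked("Editor A's wording", base)
try:
    store.save_checked("Editor B's wording", base)   # B's copy is stale
    conflict_detected = False
except EditConflict:
    conflict_detected = True
```

With `save_unchecked`, editor B's save would have replaced editor A's wording without either of them noticing, which is why limiting access to people who coordinate with each other makes sense until conflicts and rollbacks are supported.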

Minangkabau babel templates[edit]

Any chance of getting Minangkabau babel templates on WZ? They can be found here.... Thanks. Martijn 14:05, 2 May 2006 (CEST)

I did what I could .. Do you know about the Wiktionary IRC channel? GerardM 15:41, 2 May 2006 (CEST)
NB: the code we use is min (in accordance with ISO-639-3).
Thank you! A small additional question: shouldn't categories be created for them? I can't get on IRC right now. When I'm home, I'll look it up. Regards, Martijn 15:49, 2 May 2006 (CEST)

Feature requests[edit]

It is stated quite clearly on top of the Insect room that it is not the place for feature requests, yet I have added 2 items to it that are such, on the recommendation of GerardM. I was thinking of creating a separate page for such, but fear that such a page might lead to the addition of a plethora of vague wishful thinking. Would it be good to have such a page or not? And if yes, what would be a good name for it?--Sannab 15:14, 2 May 2006 (CEST)


Below, I've written a heavy chunk of palaver that I actually don't fully understand anymore myself. I can think of answers myself, yes, but they are mostly theoretic. What I described, however, is what I think will occur very frequently in reality when we will allow complete editability for each and every passer-by. It's meant as a kind of warning, even though it's more like a question that people who understand more about it have to solve.

  • Something I noticed is that quite a number of DefinedMeanings and Expressions are out of sync.
    • For example, the DefinedMeaning of conifer talks about an order of plants, and therefore the attached English Expression should be "conifers" instead of "conifer", is that right? No big deal. But what happens if translators focus on the English Expression they recognize, and translate that one, instead of taking into account what the DefinedMeaning says? The Expressions they add are wrong. For example, the Swedish Expression barrträd got attached to both the DefinedMeanings of "conifer" and "coniferous tree". Put like this it seems correct (because "conifer" is a synonym of "coniferous tree"), were it not that the DefinedMeaning for "conifer" is actually the DefinedMeaning for "conifers"! Confusion!
    • Even worse: what if someone thinks the definition is wrong and changes it to a description of "conifer"? Then not only the other translations of the DefinedMeaning become corrupted, but the formerly correct Expressions attached to it are wrong! More confusion!
    • The worst is actually when someone adds ("translates") a definition that is correct for the Expression "conifer" but wrong for "conifers" (i.e. writes a "new" definition for the erroneously added English Expression "conifer", not taking into account the existing definitions), while the English translation of the DefinedMeaning is exactly the opposite, namely correct for "conifers" and wrong for "conifer". In that situation, what does the DefinedMeaning actually stand for? "Conifer" or "conifers"? What if half of the definitions speak about 1 tree while the other half mentions an order?
    • Now my question here actually is: is there anything else beyond the given definition and expressions that identifies or defines a DefinedMeaning? A kind of meta-definition? Something that cannot be changed about a DefinedMeaning without some kind of permission?
  • Another weird thing: there seem to be two distinct DefinedMeanings for coniferous wood and coniferous forest. How's that?
  • Now, unless it was someone who changed either the English Expression "conifer" or the English definition "an order of plants..." at conifer, it appears to me that the GEMET data is not completely reliable, and it confuses me more than the new software actually does.

Any takers? :-) —Vildricianus 21:13, 2 May 2006 (CEST)

Given a DefinedMeaning, the definition and the expression, this first pair are to be translated literally. The most important thing is that the Expression in translation fits the definition. That is key. With semantic drift you have expressions that do not fit the definition; when this happens, the flag that indicates that the meaning is the same is turned off. This implies that the translation needs to be used with care and basically works only in one direction. It also means that there should be a DefinedMeaning that fits the Expression and its Definition.
Plurals and singulars can happen if this is the correct translation in a given DefinedMeaning. GerardM 21:47, 2 May 2006 (CEST)

Licensing matters[edit]

Cross-posted from the OmegaWiki blog. Discussion and suggestions invited. Also, could somebody please help import the {{cc-by}} template? —Dvortygirl 08:36, 3 May 2006 (CEST)

The trouble with GFDL
GFDL includes a clause that says that attribution must be given in order to copy content. It's true in CC-by, too. GFDL, though, requires that the individual contributors be named. For documentation, which was the original purpose of the GFDL license, that makes sense. For projects with more content and more contributors, the requirement to name individual contributors becomes a burden.
One interpretation of GFDL suggests that the use of a particular Wiktionary entry requires attribution of all its contributors. This approach would paralyze the free re-use of the data in applications such as spell-checkers. Another approach would be to handle the attribution en masse, such as by including a single list of contributors to imported data (perhaps with edit counts) without tracking who contributed what.
The GFDL license also does not consider the attribution of organizations. We do not know, for instance, the names of all the individuals who contributed the GEMET data. OmegaWiki has partners already, and will have many more as it grows.
In order to proceed smoothly, we propose the use of the CC-by license. At this stage, we welcome discussion. If you agree with this direction, please state your agreement to license your own contributions under CC-by on your user pages on OmegaWiki and your home Wiktionary.
  • Comment:
  • I find the introduction of an incompatible license highly suspicious. Is the sole purpose of doing so, simply to prevent Wiktionaries from ever using OmegaWiki data? --Connel MacKenzie 11:22, 3 May 2006 (CEST)

This is a measure that discriminates AGAINST OmegaWiki[edit]

The reason why GFDL was chosen was because of Wikipedia and, it is because what you did. Now there are more Free licenses than the GFDL. With CC-by for OmegaWiki, people can use the WZ data on Wiktionary. It is more problematic to use content from Wiktionary on OmegaWiki... GerardM 12:27, 3 May 2006 (CEST)

  • Answer to Connel's comment: why would a more open license that allows wiktionary to use OmegaWiki data be suspicious??? It is more open and easier to handle ... it is Wiktionary that is more closed down due to licensing issues compared to OmegaWiki. Hmmm ... --Sabine 13:20, 3 May 2006 (CEST)
Wouldn't it be possible to dual-license it for the sake of the old wiktionaries? Or would that give too many problems at a future date? \Mike 09:38, 5 May 2006 (CEST)

I'm no expert in legal issues and I admit I have never read either the GFDL or the CC licence from beginning to end. But I really cannot understand what the problem with the GFDL is. Anyhow, knowledge is not copyrightable, so if someone uses OmegaWiki to help him/herself with a translation there is no need at all to cite wiktionary or the authors, and this project will be freely usable by anyone who needs it (just like wikipedia). On the other hand, if someone wants to take data from here to publish his own free dictionary, why should I be worried about how hard it will be for him/her to respect the GFDL? I will not contribute to this project to allow anyone to download the whole dump, print and resell it; I will contribute to allow people to use freely that small amount of knowledge I have (just like on wikipedia). --Berto 10:05, 9 May 2006 (CEST)

One of the reasons is that if you, for example, download a list for a spellchecker or a dictionary, the GFDL requires the inclusion of the full license text and the contributors. CC-by does not have the same restrictions, even though it still gives attribution. So the GFDL as it is now (as far as I know they are talking about a change that would ease distribution under the GFDL) would make it more difficult to use the data outside OmegaWiki ... and this is against the scope of free content like that provided by OmegaWiki.
As to bi-licensing: this would anyway create the situation that contributors of the old wiktionaries have to agree with bi-licensing, since also their content would be bi-licensed.
Ciao! --Sabine 10:53, 9 May 2006 (CEST)
But the problem with the license of the old wiktionary material would remain if we were to license WZ by some CC-license, right? \Mike 13:21, 9 May 2006 (CEST)


How will this be handled? There are only talk pages for Spellings, not for DefinedMeanings, right? Is there some smart solution for this or will there be need for some kind of regulation? —Vildricianus 22:00, 3 May 2006 (CEST)

I propose the talk page of (for instance) the English Expression should be used as a common talkpage, except of course when the discussion is only related to an other language. What do others think about this? HenkvD 09:20, 4 May 2006 (CEST)
That seems good, though I would rather see it as 'on the talk page for the Expression that is a constituting part of the DefinedMeaning', which happens to be English for the current set.--Sannab 11:49, 4 May 2006 (CEST)
From my perspective we need a talk page on different levels. We need talk pages for the Expression as well as for the DefinedMeaning. This way the notion of a DefinedMeaning gains relevance and consequently it will be easier understood. GerardM 12:22, 4 May 2006 (CEST)
Indeed we do, I understood this as being a suggestion on how to handle it until we do get talk pages for DefinedMeanings.--Sannab 12:54, 4 May 2006 (CEST)
Indeed my suggestion is for the time until a talkpage is available for a DefinedMeaning. Hopefully the history will be on DefinedMeaning and on Expression levels too. HenkvD 20:29, 4 May 2006 (CEST)
Would it perhaps be useful to either create a separate Attention-template for problems with the DefinedMeaning, or to add a param to the existing one. The purpose would be to allow for a separate Category to collect DefinedMeanings needing attention.--Sannab 08:58, 5 May 2006 (CEST)
A template for DefinedMeaning problems will not help with the Talk problem. On the other hand, it might be useful to create a few templates (with or without special categories) for standard problems like
* list of expressions Template:List
* broken links Template:Broken
* mixed DefinedMeanings Template:Mixed
* wrong translations Template:Translation
* etc. etc.
* Template:Attention for all others
When the software allows changing of translations the broken links could be changed more easily. HenkvD 13:34, 5 May 2006 (CEST)
Well, I was bold... I added a second optional parameter to Template:Attention, as well as instructions (noincluded) on how to use it. --Sannab 14:15, 5 May 2006 (CEST)
What about parameters |List of expressions, |Broken link and |Translation ? HenkvD 16:56, 5 May 2006 (CEST)
Well, I am not really convinced they are needed, but if you think they are, feel free to implement them.--Sannab 18:46, 5 May 2006 (CEST)

Please inform on a portal where a font can be found

We are getting languages where even the fonts that I have added to my system in the past are not sufficient for me to see anything in a given script. I would appreciate it if we added information on the portals about where the fonts can be found. Thanks, GerardM 18:38, 4 May 2006 (CEST)

Discussing DefinedMeanings

Even though we as yet should not modify the original Definitions, that is not a valid reason not to discuss improvements of them. Since changing the original Definition will necessitate a verification of all associated Expressions, it is not to be undertaken lightly, and it should surely be discussed.

I think we need a separate forum for such discussions. Perhaps we could model it on the Requests for verification scheme on English Wiktionary, unless someone has a better idea. I am however stumped as to what would be a good name for such a page; Clarifications of Definitions has crossed my mind, but seems unsatisfactory. Also, which namespace should this be in?

Then there is the question of Language. Do we need one page for each Language, or can we handle them all on one page? At the moment (with the GEMET import), all original Definitions are in English, but this will surely change once new DefinedMeanings can be added.

Also, should this forum also have room for the discussion of translations of Definitions, or should they be considered as less crucial (well, they are, since changing them does not cause a chain reaction of possible semantic drift)? I think that once there are talk pages for Definitions, then discussions on translations of Definitions could be held there, but this does not really hold for discussions on the original Definition, since it has ramifications for all Languages.--Sannab 20:48, 6 May 2006 (CEST)

Bulgarian themes

Moved to the Insect room#Bulgarian themes

Proposed BIG portal change

I find more and more problems that have to do with the mix of codes that we currently use for our language codes. There are ISO-639-1 and ISO-639-2 codes; there are the old Ethnologue codes and there are codes that people came up with. Some of the codes currently in use are horrible: the code als, used for Alemannisch, is in actual fact the ISO-639-3 code for Tosk Albanian. The sq code currently in use for Albanian now corresponds to a macrolanguage rather than a single language.

What I am proposing is that we standardise in OmegaWiki on ISO-639-3 for the Babel information. The consequence is that we have to forget about language codes like en, de, fr, it, nl, he, el, etcetera. We will basically use the full three-letter codes. It also means that languages like Kurdish will be split up into the languages that are considered languages under ISO-639-3.

When we are in need of a code where ISO-639-3 does not help us out, we will use codes that can be easily distinguished from the ISO-639-3 codes.

To recapitulate this is ONLY about language proficiency and the naming of portals. When you can select a language in Multilingual MediaWiki, I expect you select the NAME of a language and not a code. Thanks, GerardM 11:57, 10 May 2006 (CEST)
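To make the consequence for the portals concrete, here is a minimal sketch in Python. The seven code pairs are the standard ISO-639-1 to ISO-639-3 equivalents for the languages named above; the "Portal:xxx" title pattern and the function name are assumptions for illustration only.

```python
# Two-letter ISO-639-1 codes named in the proposal, with their ISO-639-3 equivalents.
ISO_639_1_TO_3 = {
    "en": "eng", "de": "deu", "fr": "fra", "it": "ita",
    "nl": "nld", "he": "heb", "el": "ell",
}

def portal_for(code):
    """Return the hypothetical new portal title for an old or new code."""
    code = code.lower()
    # Old two-letter codes are translated; anything else is passed through unchanged.
    return "Portal:" + ISO_639_1_TO_3.get(code, code)

old_english_portal = portal_for("en")
```

A one-to-one table like this cannot cover cases such as Kurdish, which the proposal says would be split into several languages; those would need a manual decision.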

I'd say that we need to come up with a system or list ASAP. If ISO-639-3 suffices, we'll need a complete list of it here. If it doesn't suffice, we also need a complete list of additions to it. OK, "complete" is a bit ambitious, but I'd certainly favour something that can be used as an internal reference, rather than having to hunt down the codes on Ethnologue or —Vildricianus 14:57, 10 May 2006 (CEST)
I fully agree to this portal change, even if it means quite a lot of work right now. But we need a unique system, and the whole mixture of codes from ISO 639-1 to 3 is just too confusing for many. And Vildricianus: yes, you are right, we need that list of languages. Ciao! --Sabine 15:07, 10 May 2006 (CEST)
Would we redirect Portal:en to its new home then, so people who aren't familiar with weird codes and such can still find what they're looking for? Or is this a given? Celestianpower Háblame 18:03, 10 May 2006 (CEST)


In order to be able to do this, I have been adding Ethnologue templates to the existing portals. They can be found on the template:OmegaWikilang. One question: would it make sense to link not only to Ethnologue but also to the articles on the en:wikipedia (and possibly to the local Wikipedia)? GerardM 18:01, 10 May 2006 (CEST)

Yes, I think links to the other Wikimedia (yet-to-be) sister projects would be very useful. --Shibo77diskuto 07:37, 12 May 2006 (CEST)

Sort order of the language names in the Portal list

At this moment the languages are sorted by their language code. Will this still be the best way or is sorting alphabetically better ? GerardM 08:19, 11 May 2006 (CEST)

Yes, I believe there should be one standard only, ISO-639-3, and when ordering we should only use the alphabetical order of the ISO-639-3 codes. It seems to be the most neutral way and ordering them by any other method would simply increase confusion. --Shibo77diskuto 07:37, 12 May 2006 (CEST)
I was told, some time ago, that environment issues, like the interface language and the sorting order, were going to be taken care of by the wiki. That's fine with me; I see no reason to define a preset order if the wiki is capable of sorting as fit for a particular language anyway. Aliter 04:34, 5 June 2006 (CEST)
The Portal list is in a Template. It is therefore not subject to ordering in a database manner. GerardM 11:29, 5 June 2006 (CEST)

Additional codes

We also need to reach consensus on what to do when we see need for a code not in ISO-639-3. There are several cases when this may happen; ISO-639-3 is work in progress.

  • There are speech varieties on par with our understanding of Language that lack a code, possible solutions:
    1. ISO-639-3 contains a range of codes intended for local usage.
    2. Create an OmegaWiki code that cannot be mistaken for an ISO-639-3 code, for example by having a wz suffix or prefix
    3. Disallow such languages on OmegaWiki (it is an option even if I do not like it)
  • Since the concept of Language is tightly knit to spelling, an orthographic alternative will (AFAIK) require another Language.
    1. Use the ISO-639-3 code with a suffix reflecting the script variation.
  • There is also the case for things like the cyrillic alphabet, which we currently have a user box/category for. How should these be treated?
    1. Use a code that cannot be mistaken for an ISO-639-3 code

I am sure I have missed both possible code needs, and possible solutions. *smile*--Sannab 14:16, 12 May 2006 (CEST)

We will not use the code ranges intended for local use.
The codes used will be only used in databases or in templates and we do not really want many templates do we ?
Cyrillic is not a language.. Personally I am not happy that it is in the language codes.
When a "language" is invented where there is no clear linguistic reason to call it a language, it can get a code that is marked as such.. In order to get words in WZ, they have to jump through some hoops.. :) GerardM 14:58, 12 May 2006 (CEST)
I would vote for #2. A "Language Babel" should only contain languages; we can set up another "Script Babel" just for users' mastery of different scripts, and these should have a different set of codes. I would propose lat-script for the Latin script, cyr-script for the Cyrillic script and so on... For codes that have not yet been included in the ISO-639-3 standard, we should invent a three-letter code, but with the -WZ suffix appended. This would make it clear that it is not an actual ISO-639-3 code, in case it is included in a future revision of the ISO-639-3 standard. (At the moment, the Wikipedias use all different sorts of codes; one can't be sure which one is ISO-639-1/3 or invented.) --Shibo77diskuto 15:30, 12 May 2006 (CEST)
Please no inventing codes! There is already an ISO standard for script codes, ISO 15924. And ISO 639-3 has a code range (qaa through qtz) set aside for private use as well. —Muke 13:24, 27 May 2006 (CEST)
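For anyone building on the codes, the two schemes mentioned here can be told apart mechanically. The following Python sketch is an illustration only: the qaa through qtz private-use range comes from the comment above, the "-WZ" suffix is merely the proposal made earlier in this thread, and the function name and return labels are made up.

```python
import re

def classify_code(code):
    """Classify a language code string under the schemes discussed above."""
    if re.fullmatch(r"[a-z]{3}-WZ", code):
        return "wz-suffixed"      # the hypothetical -WZ scheme proposed above
    if re.fullmatch(r"[a-z]{3}", code):
        # The range qaa through qtz is set aside for private use.
        if "qaa" <= code <= "qtz":
            return "private-use"
        # Shaped like an ISO-639-3 code; not checked against the standard itself.
        return "three-letter"
    return "other"

example = classify_code("qab")
```

Note that the check only looks at the shape of the code; whether a three-letter code is actually assigned would still require the ISO-639-3 list itself.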
I absolutely agree that we should not invent codes. There is no real need for it anyway, as the database does not insist on any code. It insists on a name. The codes are there for applications built on top of the WZ data. They are there for datamining and search engines and the like. I am aware of script codes and I am aware of the private use range. There are, however, only a few of those, so given the scope of WZ I do not expect this to suffice.
One thing that is much more interesting is to explore whether we can become an organisation (we hopefully being the WMF) that will be on speaking terms with the standards bodies. This would allow us to define what hoops people have to jump through before we approach people in the standards bodies with "our" recommendation. Given the large corpora we have built of modern language usage and given the relevance of wikipedia as a resource on the Internet, we could become a partner. The thing we need is some organisational maturity to pull this one off. GerardM 19:00, 29 May 2006 (CEST)

Adding translations to GEMET

Well, I myself have a problem when working on GEMET: I often need a lot of time to find which words are in there in Italian so that I can then translate them into Neapolitan. Of course, starting from English poses the same problem. Why? In Special:Allpages you get an endless list of mixed languages. So I would very much like to see a page where I can find all the words in one language only (for example Italian, which would be the preferred one). From there I could simply click on the single words and immediately work on them; that costs much less time and will allow for more results. Hmmmm ... a csv table to create such a list from would be nice. Ciao! --Sabine 15:11, 10 May 2006 (CEST)

On my subpage I mentioned this as well.
Maybe this could be solved by
  • Categorisation
  • additional feature in the MediaWiki software
  • ....
HenkvD 00:12, 11 May 2006 (CEST)
GerardM replied:
Categorisation is not an option for the relational content. We are working on Multilingual MediaWiki. This will bring us the basic functionality to do clever things with respect to languages. Until we have this functionality we have other things with a higher priority. Making things editable and versionable are key at the moment. Another thing that is of a big importance is working towards importing data.
Multilingual MediaWiki would be perfect, but I guess it will take too long to implement. I would opt for categorisation per language soon. HenkvD 00:12, 11 May 2006 (CEST)

MLMW is under active development, but we might be able to hack up a quick index to the GEMET terms in particular languages if the users need it.--Erik 05:17, 11 May 2006 (CEST)

Sure, indices would be nice if that does not inordinately delay the arrival of the marking of the Expression and Definition that constitute the DefinedMeaning, and of editable Relations... *smile* --Sannab 08:23, 11 May 2006 (CEST)

Part of the Language table

Would it not be a good idea to have an indicator that states if there is a Wikipedia in that language ?? GerardM 18:29, 10 May 2006 (CEST)

Yes, it would. Celestianpower Háblame 18:54, 10 May 2006 (CEST)
Indeed it would; however, we must not forget that the wikipedias are not likely to change their language codes even if we do, so the code the wikipedia uses must also be stored. Then of course there is the problem of language families: should the various Kurdish languages all link to the Kurdish Wikipedia (I presume there is only one), or should an attempt be made to identify which Kurdish is used there and only link from that?--Sannab 21:36, 10 May 2006 (CEST)

OmegaWiki licensing analysis

Gerard asked me to write a few words about copyright and licensing issues relating to OmegaWiki. The usual disclaimer: I am not a lawyer, and it may be helpful to consult with the Wikimedia legal committee before taking some steps. I helped develop the copyright policies for Wikinews.

As most of you are aware, when Wikipedia was started, the GNU Free Documentation License was one of relatively few free content, copyleft licenses available. Creative Commons had not yet released its license portfolio. Parts of the FDL were controversial from the beginning, especially its allowance for "invariant sections" (which cannot be modified); Wikipedia explicitly rejects these invariant sections.

The CC licenses were introduced 4 days after the launch of Wiktionary. Naturally, Wiktionary, too, adopted the GFDL. Besides being an available and tested choice, it also had the advantage of allowing relatively straightforward copying from Wikipedia content to Wiktionary and vice versa.

However, the FDL is a very flawed license even for encyclopedic texts, let alone dictionaries. As the name "Documentation License" suggests, it was intended for printed software manuals. Its heavy use in a collaborative wiki environment was never envisioned. Moreover, the Free Software Foundation has been very slow in responding to concerns about problems in the license. The FSF is the only institution which could fix the problems with the license by releasing a newer version. Documents released under terms such as "Released under the GNU FDL 1.2 or higher" could then be smoothly upgraded. I am not aware of any progress in this matter, however. It is also likely that, in spite of the fact that Wikipedia is by many orders of magnitude the largest body of GFDL work, the FSF is reluctant to alter the license in a way which significantly affects its own software manuals which are under the FDL.

The most significant problem of the GFDL with regard to Wiktionary(Z) is that it requires inclusion of a full copy of the license text itself (section 4.H), which is over 19,000 characters in length, whenever you distribute a GFDL-licensed work. This makes printed copies of short dictionary extracts practically unworkable. The requirement to include a full attribution history is also not easily met in practice. From a legal point of view, the GFDL is a US-centric license not adapted to different jurisdictions.

Recent versions of the Creative Commons licenses have been developed specifically with collaborative editing environments in mind. The CC-BY license allows the licensor to designate an entity for attribution, such as "The OmegaWiki community". Instead of having to name every individual who has ever edited a page, content users can simply link back to the site. The CC licenses also allow specifying the URL of the license instead of including its full text. They have been adapted to international jurisdictions through the iCommons project.

It should also be noted that, in practical terms, much of the so-called GFDL content in Wiktionary is not copyrightable at all. While a Wiktionary database as a whole would be considered a copyrightable collection, a list of translations and synonyms or similar relational data is generally not copyrightable (see Feist v. Rural for a U.S. Supreme Court ruling on copyrightability of directories). Definitions are a different story, but given that these are typically very short, even when copyrightable, many uses would be considered legitimate under fair use / fair dealing exemptions to copyright law.

As such, imposing the heavy terms of the GFDL (including the copyleft principle), besides reducing the utility of the work, seems morally questionable as well. Dictionaries are essentially functional works. From a practical point of view, one should ask whether the copyleft principle of requiring that contributions must be re-licensed under the GFDL helps or hurts the primary goal of achieving the highest possible utility of the content.

I propose that we migrate OmegaWiki specifically, to the CC-BY License 2.5 or higher. This license is currently used by the Wikinews project. We decided to use it because of its simplicity, and because of the possibility to attribute stories simply to "Wikinews". Now, the question is how to best achieve such a migration. I propose the following:

  • For the time being, all existing and newly contributed content on OmegaWiki will be labeled as being dual-licensed under the GFDL and CC-BY 2.5.
  • When we start importing content from the old Wiktionaries, we will have to ask all contributors we can reach if they are willing to relicense their contributions. The most effective way to do this would be to show the user a quick "Do you agree to relicense your contributions under the following terms?" message when they next visit the site, and if they do, set a flag in the database. The importer scripts can then recognize this flag, and only import content where no significant changes have been made by users who have not agreed with the license migration.
  • Eventually, we may drop the GFDL entirely as unnecessary.
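The importer check described in the second step could look roughly like the sketch below. It is only an illustration in Python: the `Revision` record, the `significant` marker and the set of agreeing users are assumptions, since the actual database flag and importer scripts do not exist yet, and how to decide what counts as a significant change is left open.

```python
from dataclasses import dataclass

@dataclass
class Revision:
    author: str
    significant: bool   # whether the change counts as significant (criteria left open)

def can_import(revisions, agreed_users):
    """True if every significant revision was made by a user who agreed to relicense."""
    return all(rev.author in agreed_users for rev in revisions if rev.significant)

# Made-up sample data, for illustration only.
agreed = {"Alice", "Bob"}
history = [Revision("Alice", True), Revision("Carol", False), Revision("Bob", True)]
blocked = history + [Revision("Carol", True)]
```

Here `history` would be importable, because Carol's only change is not significant, while `blocked` would not, because Carol has made a significant change without agreeing.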

Yes, we would lose a significant amount of existing old Wiktionary content. But I'd rather have a sane licensing schema for OmegaWiki than one which makes the data effectively unusable. Also note that the simpler our license, the less discussion we'll need to have about licensing issues. ;-) Finally, I think the challenges of converting the relatively unstructured old Wiktionary data will lead to significant loss of content anyway.

The primary reason for dual-licensing during an interim period is that there's some argument about whether CC-BY content can be included in GFDL works. While CC-BY is relatively simple, it includes at least one requirement which is not found, in this form, in the GFDL, namely the restriction 4.a that users of licensed content have to remove attribution if the licensor requests it (essentially a clause which exists to protect writers from being associated with works they want to have nothing to do with). However, CC has been much more cooperative in resolving compatibility issues than the FSF, so I think we can get clear one-way compatibility relatively soon.

Unless there are strong objections, I'll simply relabel the existing content here as being CC-BY/GFDL dual-licensed. We'll need a proper process for the migration of existing Wiktionary content, but there's relatively little content here yet, so I hope we can avoid some bureaucracy. --Erik 08:49, 12 May 2006 (CEST)

Well, I fully agree to this - we need to make it easy to re-use the data present on OmegaWiki, it would not make sense to have thousands of possibilities and then not being able to do ... Ciao! --Sabine 10:17, 12 May 2006 (CEST)
Sorry, to put a dampener in here , but if once licensed under GFDL, how can we ever legally remove that license?--Sannab 10:29, 12 May 2006 (CEST)
At this stage we are testing software. When we have to lose all our data because of technical reasons, we would. At this moment we cannot attribute who did what anyway. The statement of a license is very much a statement of intent at this moment in time. Given the explicit aims of this project, it is unlikely that the trusted people that can edit would object on principle. When they do, their work will be removed when they can indicate what they have done. After the license change they can choose to edit again (and look foolish). GerardM 11:31, 12 May 2006 (CEST)
I guess the only way would be to have all the contributors multilicense their work. Of course someone could still use the content under the GFDL, since it remains in effect perpetually, but CC-BY-SA seems similar enough to the GFDL that it wouldn't be a real problem. I noticed a while back that Creative Commons has a "wiki" license scheme that's really just BY-SA. [7] – Minh Nguyễn (talk, contribs) 10:53, 12 May 2006 (CEST)
GFDL is not compatible with CC-BY if I understand correctly, and CC-BY-SA has so far not been mentioned.--Sannab 10:57, 12 May 2006 (CEST)
CC-by is the one thing that I require. People have to know where they can add / edit / improve stuff. When OmegaWiki is going to be the success I expect it to be, the sa bit will become largely irrelevant as more data makes the WZ more information rich. The point of WZ is very much that doing it alone is more expensive and, it does not make you friends. GerardM 11:31, 12 May 2006 (CEST)
CC-BY-SA is not compatible with the GFDL, though Creative Commons is working on one-way compatibility (including CC-BY-SA content in GFDL, not the other way around). I agree with Gerard that copyleft is not necessary for OmegaWiki, the strength is in the community and the amount of content that is already there. If others want to create proprietary commercial dictionaries from the collections in WZ, they can try, but they will have to compete with freely available solutions and WZ itself.--Erik 00:20, 13 May 2006 (CEST)
So, I educated myself a little on licensing (reading license texts sure is fun ;-)). My understanding so far is that neither CC-BY nor CC-BY-SA are fully compatible with the GFDL at this point. There might be one-way compatibility between CC-BY and GFDL, meaning that CC-BY content can be relicensed under GFDL, but not the other way around. So we can't take GFDL content from the old wiktionaries unless we multi-license perpetually or rewrite everything.
I think Creative Commons has many advantages (no invariant sections, no need to attach a lengthy license text, attribution to the whole community) but I'm not convinced that it's a good idea to abandon the copyleft principle by using CC-BY and not CC-BY-SA. I see a potential problem here: How much do people care about copyleft? Do they see it as an essential requirement or not and would they still contribute with the knowledge that modifications of their work can be relicensed under more restrictive or proprietary terms? --Tosca 02:50, 30 May 2006 (CEST)
Well, personally, I'd rather release it into the public domain, so anyone can use it for any purpose with or without attribution. However, I live in the real world, so I would summarise my views thus:
  1. It's very important to be able to copy text verbatim from all of the other Wiktionaries. Therefore, in my view, we need to license under a license compatible with the GFDL.
  2. From what I've read above, Creative Commons seems a better license overall. Less paperwork, etc...
  3. Since we want to become an official Wikimedia project, I suspect that the Wikimedia Foundation would need us to be GFDL compatible at least.
The perpetual multi-license idea sounds interesting, in my opinion. Would it work? Regards, Celestianpower Háblame 11:31, 30 May 2006 (CEST)
Alas, public domain is not possible, because in some countries (France, Germany, ...?) it is forbidden to release your work under such license (you can't abandon your rights).
You can be in Wikimedia without being compatible with GFDL. See Wikinews for example. GFDL is not really convenient, so, at some point, we'll have to move to Creative Commons, and the sooner the better. The best would be to move all wikimedia projects to CC, but it's not really possible (Jimbo, can you help?). What can be done for now is to say somewhere that you agree to make your old contributions CC-by. I've done that there: [8] (see at the bottom). The perpetual multi-licensing would be good for exporting the data to either CC or GFDL, but wouldn't help with importing GFDL-only (Wiktionary) content here, I think. Kipcool 11:00, 31 May 2006 (CEST)

One way would be to ask the FSF to drop the requirement to reproduce the license in its entirety. As stated above, Wikimedia projects have the greatest amount of GFDL-licensed text, so I hope we have some right or some possibility to ask for a change. We must contact the legal committee about this. The Doc 15:51, 2 June 2006 (CEST)

For I don't know how long now, the Wiktionaries have been told to enter information in a certain way, as this would make the conversion to the one-database wiktionary easier. After all the trouble put into the wiktionaries, are you going to tell them to go to hell because you don't like their license? Aliter 22:19, 5 June 2006 (CEST)
Yes, and it has been really worthwhile. The standardisation that has taken place was as much as anything a contributing factor to the great quality improvements that have been achieved. As to telling people to go to hell? Hardly; I would rather ask people to reconsider the licensing of their contributions. I would not be surprised if people do not mind a license that is even more Free. Furthermore, given the technical possibilities and impossibilities, it is impossible, and I have stated this often enough, to export the wiktionary content and include it in any other context. I know for a fact that this has also not been done in the past. Also, given the peculiarities of copyright law, it is extremely difficult to prevent the use of data anyway. You cannot copyright facts. GerardM 00:34, 6 June 2006 (CEST)

Language tags

Sabine pointed me to this article. It explains really well how languages are to be indicated on the world wide web. It also explains how ISO-639-3 can be used within the new RFC 3066bis tags.

We do not follow RFC 3066bis because we standardise on ISO-639-3. However, we will be able to create tags that are RFC 3066bis compliant, I think. One thing I do want to do is follow the way they extend the codes with things like orthography and regional information. :) GerardM 17:19, 15 May 2006 (CEST)

I am very interested in ISO 15924 but am a non-techie. I really wish there were a way to tell browsers to, for example, display an individual sinographeme in a non-simplified Chinese, Chinese simplified, Japanese simplified (shinjitai) or non-simplified Korean (hanja) font even when there is no difference in appearance, or when there is only a minor difference that is not reflected by separate Unicode identities assigned to the character.
ISO 15924 seems promising, but why isn't there a distinct code for shinjitai as opposed to simplified Chinese? Are we supposed to use zh-Hans to have a character displayed in a simplified Chinese font, ja-Hans for a shinjitai font, zh-Hant for a traditional Chinese font and ko-Hant for a hanja font?
We might also need a guideline for words or sentences in mixed script such as “匂い” and “글字”.
(Currently, many WP users add zh-TW or zh-CN even to old Chinese names/words/characters that have nothing to do with either Taiwan or the mainland. Merely having zh is sometimes not a problem for readers using fonts that contain both variants, but is not a good solution IMHO. Without proper handling, characters might unnecessarily appear in an “odd” font in cases where there is a separate Unicode value.)
Do any fonts or browsers support ISO 15924 yet? I have no clue as to how and when all this works.
(WP talk page)
The ISO 15924 list provides you with a list of names of scripts that are supported in Unicode. Unicode is rendered through fonts, so what you see depends on which fonts you have loaded in your operating system and on the level of Unicode support provided by the application.
What we hope to do in OmegaWiki is to identify many things as being in a specific language. Typically there will be one script associated with that language, and within a language only a set number of characters are accepted. This is something that we want to check in our content (feature request). A WZ Expression will explicitly be in only one script, as we will not allow Expressions to be intermingled with other scripts. This means in effect that we will have both the traditional and the simplified script.
When people say zh-CN it is essentially wrong. It is Hans and Hant when you identify the script. In WZ the language will not be zh or zho but cmn for Mandarin. GerardM 11:48, 1 June 2006 (CEST)
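The "one script per Expression" check mentioned above could be sketched roughly as follows. This is only an illustration under stated assumptions: Python's standard library does not expose the Unicode Script property, so the sketch uses the first word of each character's Unicode name as a crude stand-in for its script, and the function names `rough_scripts` and `is_single_script` are hypothetical, not part of OmegaWiki.

```python
import unicodedata


def rough_scripts(expression):
    """Return rough script labels for the letters in `expression`.

    Assumption: the first word of the Unicode character name
    (e.g. "HIRAGANA" in "HIRAGANA LETTER I") is used as a proxy for
    the script. This is a heuristic, not the real Script property.
    """
    labels = set()
    for ch in expression:
        # Only letters are considered; spaces, digits and punctuation
        # are shared between scripts and are skipped.
        if not unicodedata.category(ch).startswith("L"):
            continue
        labels.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return labels


def is_single_script(expression):
    """True if all letters in `expression` share one rough script label."""
    return len(rough_scripts(expression)) <= 1


if __name__ == "__main__":
    for word in ["badvatten", "匂い", "글字"]:
        print(word, sorted(rough_scripts(word)), is_single_script(word))
```

With this heuristic the mixed-script examples raised earlier in the thread, “匂い” and “글字”, are both reported as using two scripts. Japanese in particular routinely mixes kanji and kana, so a real check would need per-language rules rather than a blanket single-script test.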
It would be cool if we could at some point in the future somehow mark romanisations as foreign-language but at the same time avoid having the browser use the script we told it to use normally for that language. For example, zh-cmn-Latn for a Chinese expression in pinyin, or zh-Latn-TW for a Taiwanese name in pinyin. The problem is currently that when you have your browser preferences set to use e.g. the font AR PL ZenKai Uni to display Traditional Chinese, it will use the same font to display pinyin romanisations of Traditional Chinese. Depending on the font, tone marks may be fairly ugly, and another non-CJK Unicode font with acceptable tone mark rendering would be preferable. Do you think this will be possible?
I haven't yet found how to mark Kanji. Should characters be individually marked as ja-Hant if they look the same in Chinese and as ja-Hans if they have been simplified (Shinjitai)? I guess not; simply marking them as ja-Hani seems better, whether in Shinjitai or pre-Shinjitai form. I hope it's explained somewhere in the introduction which I am going to read later. I have also noticed that gugyeol (an obsolete Korean script) is not in the list. 17:48, 1 June 2006 (CEST)
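The tags discussed here (zh-Hans, ja-Hani, zh-cmn-Latn, zh-Latn-TW) follow the general shape of IETF language tags: a language code, an optional three-letter extended language code, an optional four-letter ISO 15924 script code, and an optional region code. The following is a minimal sketch that splits a tag into those parts. It deliberately ignores variants, extensions and private-use subtags, does not check the codes against any registry, and the name `split_tag` is only illustrative.

```python
import re

# Simplified pattern: language[-extlang][-script][-region].
# Variants, extensions and private-use subtags are NOT handled.
TAG_PATTERN = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<extlang>[A-Za-z]{3}))?"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?$"
)


def split_tag(tag):
    """Split a simple language tag into its subtags.

    Returns a dict with the keys language, extlang, script and region;
    missing parts are None. Raises ValueError for anything the
    simplified pattern does not cover.
    """
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError("unsupported or malformed tag: " + repr(tag))
    parts = match.groupdict()
    # Conventional casing: language and extlang lower case,
    # script in title case, region upper case.
    parts["language"] = parts["language"].lower()
    if parts["extlang"]:
        parts["extlang"] = parts["extlang"].lower()
    if parts["script"]:
        parts["script"] = parts["script"].title()
    if parts["region"]:
        parts["region"] = parts["region"].upper()
    return parts


if __name__ == "__main__":
    for tag in ["zh-Hant", "ja-Hani", "zh-cmn-Latn", "zh-Latn-TW", "zh-TW"]:
        print(tag, split_tag(tag))
```

Seen this way, zh-TW and zh-CN carry only a region, while Hans, Hant, Hani and Latn sit in the script position, which is the distinction made in the replies above.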
As far as I am aware, there is a difference between a "pinyin" and a transliteration. In a "proper" transliteration it is possible to use standards to go from one script to another. A standard transliteration is "mechanical" in nature and does not need to be stored as such as it can be generated.
I think I have never claimed pinyin were a transliteration. And not being able to automatically switch between Han and pinyin does not mean that they're not useful in a dictionary – to the contrary. 20:12, 1 June 2006 (CEST)
As to using fonts, fonts are part of the installation of a user's computer. I do not know how to get fonts to people who do not have them. I have mentioned often that the WMF should help people by providing the fonts that are free to have. I would even like to buy out some producers of fonts. Many people do not buy/pay for their fonts. It would be cool if we could find a donor to pay for this.
I mean if a user has set their browser to use fonts that are appropriate for the respective script (e.g. hanja, latin pinyin or whatever), why not let WikZ make use of that? 20:12, 1 June 2006 (CEST)
Please, you raise interesting subjects; would you consider creating a user account? That would make the conversation more personal/friendly. Thanks, GerardM 18:20, 1 June 2006 (CEST)
Sorry for that, I'm in a hurry now but shall create an account later. Thank you for your replies. 20:12, 1 June 2006 (CEST)

Language-specific problems with articles

A couple of separate, but related issues.

  1. I recently came across Expression:bathing water / Expression:badvatten, where the direct translation of the English definition to Swedish matches the Swedish expression badly - in Swedish, we have one word covering both "bath" and "bathe". Should all language-specific problems like this be collected in one big category - or should we create language-specific categories? IMHO a category for articles where there is a problem with the Swedish expression-definition pair makes perfect sense. People who would like to eventually fix those should have an easy way to find them.
  2. When the problem lies in one language where the definition badly matches the expression, like in the example above - would it not make more sense to write the comment on the talk page in the language in question? I can well imagine languages where most native speakers have other languages than English as their best second language. IMHO we should not encourage that all the work be done with English as a major tool - that would exclude lots of people (although in the case of Swedish, the group of people who are interested in languages but not so good in English is a small minority).
  3. In the same vein, "attention"-templates regarding problems with a specific language could IMHO better be phrased, and maybe even named, in the language in question.

All these things combined make me want to create a copy of Template:Attention in Swedish, Template:Kolla upp, Template:Attention/sv or similar, which would include a category for pages where the Swedish expression/definition couple is problematic.

Thoughts? // Habj 23:42, 17 May 2006 (CEST)

It could be handled by an additional parameter to the {Attention} template, or it could even be handled by the current 2nd parameter like {Attention| <description> |Translation in swe} which will put it in Category:Translation in swe needing attention. HenkvD 00:13, 18 May 2006 (CEST)
Well, that would handle one part of the problem, grouping the Swedish articles, however it would not solve the other one, which would be to allow a language specific interface through which to mark articles. For the set of people working on OmegaWiki right now, using English as the language of communication does not seem to be a great problem, but do we wish to exclude contributors that feel less comfortable with English? One way to handle it would be language specific templates, either simple redirects to Attention, or calls for Attention that add the language parameter. How much of this problem is going to go away with multilingual? Not that I suggest we wait for it... Then on the other hand, this is most likely not a major problem while we still are so much in testing stages, and while editing is restricted, but I must confess that the template being in English, as well as the categories, makes it feel awkward for me to describe the problem in Swedish, which I might be able to do to a more specific degree, and also more usefully for those eventually aiming to fix the problem. --Sannab 10:25, 18 May 2006 (CEST)
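The two suggestions above (a language parameter on Attention, and localized templates that simply call Attention with that parameter) can be modelled in a few lines. This is a sketch under assumptions: the category naming follows HenkvD's example ("Category:Translation in swe needing attention"), the mapping from "Kolla upp" to "swe" is hypothetical, and the real template may behave differently.

```python
# Hypothetical mapping from localized template names to language codes.
LOCAL_TEMPLATES = {
    "Kolla upp": "swe",
}


def attention_call(description, language_code):
    """Build the Attention call and the category it is assumed to populate."""
    topic = "Translation in " + language_code
    # Wikitext template call with the topic as second parameter.
    wikitext = "{{Attention|" + description + "|" + topic + "}}"
    # Category name pattern taken from the example in the thread.
    category = "Category:" + topic + " needing attention"
    return wikitext, category


def expand_local(template_name, description):
    """Treat a localized template as a wrapper that adds the language."""
    return attention_call(description, LOCAL_TEMPLATES[template_name])


if __name__ == "__main__":
    wikitext, category = expand_local("Kolla upp", "definitionen passar inte uttrycket")
    print(wikitext)
    print(category)
```

The point of the wrapper approach is that the description can be written in Swedish under a Swedish template name, while the page still lands in the same per-language category as a direct call to Attention.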

At this time, we do not yet have Multilingual MediaWiki. That is when we will get the ability to identify content as being in a language. An Expression is in a language ... When the Attention template is added to the Expression (SynTrans) that is problematic, these texts will become identified by language in the "near" future. I would not invest much effort in this at this moment in time. GerardM 23:48, 19 May 2006 (CEST)
