Wednesday, March 23, 2011

illumos has Serbian Family Language Support



I just integrated:

changeset: 13312:537259ad27f6
tag: tip
user: Garrett D'Amore
date: Wed Mar 23 08:35:14 2011 -0700
description:
324 need serbian locale support
Reviewed by: Rich Lowe
Approved by: Garrett D'Amore

This is a bit unusual relative to most of the locales, because Serbo-Croatian is a language fraught with some unique political considerations:

There is a common root language, that everyone speaks and understands. But speakers of it rarely agree on what to call it. In Serbia it's Serbian. In Bosnia it's Bosnian. And so on for Croatian and Montenegrin.

In illumos, we have followed the Unicode CLDR example, and we now have these locales:

hr_HR.UTF-8 - Croatian in Croatia
sr_BA.UTF-8 - Serbian in Bosnia and Herzegovina
sr_ME.UTF-8 - Serbian in Montenegro
sr_RS.UTF-8 - Serbian in Serbia
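
For anyone who wants to check whether these locales are installed on their own system, here is a minimal sketch in Python. It only asks the C library whether it accepts each locale name via the standard `locale.setlocale()` call; the helper name `locale_available` is my own invention for illustration, and the result depends entirely on what is installed on the machine where it runs.

```python
import locale

# Locale names taken from the list above.
LOCALES = ["hr_HR.UTF-8", "sr_BA.UTF-8", "sr_ME.UTF-8", "sr_RS.UTF-8"]


def locale_available(name):
    """Return True if setlocale() accepts `name` on this system.

    Hypothetical helper, not part of illumos or of the changeset above.
    """
    saved = locale.setlocale(locale.LC_ALL)  # query the current setting
    try:
        locale.setlocale(locale.LC_ALL, name)
        return True
    except locale.Error:
        return False
    finally:
        locale.setlocale(locale.LC_ALL, saved)  # always restore the old setting


if __name__ == "__main__":
    for name in LOCALES:
        status = "available" if locale_available(name) else "not installed"
        print(name, "-", status)
```

On a system without these locales the script simply reports them as not installed; it does not say anything about the contents of the locale data.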

I want to apologize to anyone offended by this decision, but rather than make a contentious decision on our own, I decided it was best to simply follow the decisions of an international standards body. I believe that there is no fundamental difference in the languages, although some national variances appear to be present in the data files. If someone has more accurate names for these, or believes that some aliased locales will assist with compatibility with other operating systems, then I would be happy to hear suggestions, ideally from someone familiar with accepted practice in these locations.

There is another wrinkle in all this too. This language -- thanks largely to occupation by Soviet forces as part of the SFR Yugoslavia, is commonly represented using two different alphabets -- Cyrillic and Latin. Generally most locations use Latin, but within Serbia, Cyrillic is mandated by law. So sr_RS uses Cyrillic, while the others use Latin.

Here are the two alphabets:

А Б В Г Д Ђ Е Ж З И Ј К Л Љ М Н Њ О П Р С Т Ћ У Ф Х Ц Ч Џ Ш
A B C Č Ć D Dž Đ E F G H I J K L Lj M N Nj O P R S Š T U V Z Ž
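
The two rows above are each in their own alphabetical order, so they do not line up letter for letter. As a rough illustration of how the letters correspond, here is a small Python sketch that maps Serbian Cyrillic to Latin one character at a time. The names `CYR_TO_LAT` and `to_latin` are made up for this example, and it has nothing to do with how the illumos locale data is built.

```python
# Serbian Cyrillic -> Latin, one entry per letter of the 30-letter alphabet.
CYR_TO_LAT = {
    "А": "A", "Б": "B", "В": "V", "Г": "G", "Д": "D", "Ђ": "Đ",
    "Е": "E", "Ж": "Ž", "З": "Z", "И": "I", "Ј": "J", "К": "K",
    "Л": "L", "Љ": "Lj", "М": "M", "Н": "N", "Њ": "Nj", "О": "O",
    "П": "P", "Р": "R", "С": "S", "Т": "T", "Ћ": "Ć", "У": "U",
    "Ф": "F", "Х": "H", "Ц": "C", "Ч": "Č", "Џ": "Dž", "Ш": "Š",
}

# Add the lowercase forms (for example "љ" -> "lj").
CYR_TO_LAT.update({c.lower(): l.lower() for c, l in list(CYR_TO_LAT.items())})


def to_latin(text):
    """Transliterate Serbian Cyrillic to Latin; other characters pass through.

    Simplification: an uppercase digraph letter always becomes title case
    ("Њ" -> "Nj"), even inside an all-caps word.
    """
    return "".join(CYR_TO_LAT.get(ch, ch) for ch in text)


print(to_latin("мере безбедности"))  # prints "mere bezbednosti"
```

Going the other way (Latin to Cyrillic) is harder, because the Latin digraphs Lj, Nj and Dž have to be recognized before the single letters are mapped.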

Anyway, if someone sees room for corrections or improvements here, especially if they are familiar with the language(s) and/or region(s), I would appreciate hearing back from you.


8 comments:

James said...

Good call on the names, and good work on making the OS more accessible.

ihosama said...

"This language -- thanks largely to occupation by Soviet forces as part of the SFR Yugoslavia, is commonly represented using two different alphabets -- Cyrillic and Latin."

This sentence is patently WRONG!!!

1) Cyrillic was the FIRST written form for the Slavonic languages, ever. It was used in Serbia BEFORE it spread to the Russian region. See http://en.wikipedia.org/wiki/Cyrilic

2) Soviet troops NEVER occupied Yugoslavia. Not even for a day. Not even during WW2.

I am sure this text was just the inevitable consequence of the US education system, so please correct it.

Keep up the good work!

Cheers from Slovakia, the place where Cyrillic originated 1100+ yrs ago.

Milan Niznansky

Garrett D'Amore said...

ihosama:

Thanks for the clarification!

As much as I'd like to blame someone else for my errors, I think I have to blame my own ignorance, and my own incorrect assumptions. The errors in the statement are wholly my own fault, and I'm grateful for the correction.

biant92 said...

Thank you very much for the Serbian Language Family Support. As for the political and historical facts, in my opinion they are less important than the fact that Serbian and the other languages of the former Yugoslavia are supported by the operating system (illumos), which continues the tradition of OpenSolaris.

Davor said...

1) Cyrillic was the FIRST written form for the Slavonic languages, ever.

To be exact: the Glagolitic alphabet was the FIRST written form for the Slavic languages, ever. Its successor, Cyrillic, became more popular later.
:D

Cheers from Slovakia, the place where Cyrillic originated 1100+ yrs ago.

The Cyrillic alphabet was developed in Bulgaria. The Glagolitic alphabet, on the other hand, was created, or at least formalized and expanded with new letters for non-Greek sounds, by Saint Cyril during his visit to Great Moravia in 862. So this must be the Slovakian connection.

There is a common root language, that everyone speaks and understands.

OK, I'll try to explain this.

No. There is a common "language standard" that everyone speaks and understands in Croatia, Bosnia and Herzegovina, Serbia and Montenegro. The root languages (the Chakavian, Kajkavian and Shtokavian dialects) are in some cases so different and so varied that people within the same ethnic group have difficulty understanding each other. Croats speak all three of them, and the other nations speak different versions of Shtokavian. But even that Serbo-Croatian standard (Shtokavian) was always pluricentric in the days of Yugoslavia, which means that it was one language with several standard versions, both in spoken and in written form. So differences exist, and if there is nothing unusual about en_US and en_UK locales or fr_CA and fr_FR locales, there shouldn't be anything unusual about this matter.

Keep up the good work, Garrett...

Cheers from Croatia.

Royal said...

CROATIAN LANGUAGE AND SERBIAN LANGUAGE ARE TWO DIFFERENT NATIONAL STANDARD LANGUAGES !

EXAMPLE TEXT:

CROATIAN LANGUAGE TEXT:
Glede ispušnih plinova i zagađivanja zraka u Jeruzalemu, bilo bi potrebito poduzeti mjere sigurnosti!

SERBIAN LANGUAGE TEXT:
У погледу издувних гасова и загађивања ваздуха у Јерусалиму, било би потребно предузети мере безбедности!

Royal said...

Serbo-Croatian is not a language; it is a group of different languages.
See document:
*[http://www.nsk.hr/UserFiles/File/Slu%C5%BEeno%20prihva%C4%87anje%20izmjena%20ISO%20639-2%20Registration%20Authority.pdf Different Serbo-Croatian]
*[http://www.danshort.com/ie/iesatem.htm Croatian language, Serbian language and Bosnian language are independent language groups 1]
*[http://www.ethnologue.com/show_family.asp?subid=292-16 Croatian language, Serbian language and Bosnian language are independent language groups 2]


== Example of translation into different languages ==
{| class="wikitable"
! English
! Croatian
! Serbian
|-
! '''[[English language]]''': With regard to the general air quality in the city of Jerusalem is necessary to take urgent security measures to prevent poisoning of the population of the exhaust gases.
Potential application mijera that affect the flow of personal health status and which may contribute to better overall care of the population in the area of the city center would certainly be to reduce traffic stopper on fundamental Kostanjica where intersections are located tourist islands, such as the Christian church, these disciplinary measures are proposed primarily in the holiday season such as Easter and Christmas. Officers and NCOs from the army barracks in the city should contribute to the precise use of the prescribed measures, surveillance of traffic chaos at the check point and looking through use of aviation.
|-
! '''[[Croatian language]]''': Glede opće kvalitete zraka u gradu Jeruzalemu potrebito je poduzeti žurne mjere sigurnosti kako bismo spriječili trovanje pučanstva ispušnim plinovima.
Možebitna primjena mijera koje utjeću na tijek osobnoga zdravstvenoga stanja a koje mogu pridonjeti boljoj općoj skrbi pučanstva na prostoru gradskoga središta svakako bi bilo smanjenje prometnoga čepa na bitnim raskrižjima gdije su smješteni turistički otoci, poput kršćanskih crkvi, ove stegovne mjere predlažu se ponajprije u vrijeme blagdana kao što su Uskrs i Božić. Časnici i dočasnici iz vojarni u gradu trebaju pridonjeti točnoj uporabi propisanih mjera, nadzorom prometnoga kaosa na nadzornim točkama i promatranjem kroz uporabu zrakoplovstva.
|-
! '''[[Serbian language]]''':
*(an official letter to the constitutional)
У погледу опште квалитета ваздуха у граду Јерусалиму потребно је да се предузму хитне мере безбедности како би спречили отровање становништва издувним гасовима. Евентуална примена мера које утичу на ток личног здравственог стања које могу да допринесу бољем општем збрињавању становништва на простору градског центра свакако би била смањење саобраћајног колапса на суштинским раскрсницама где су смештена туристичка острва, попут хришћанских цркава, ове дисциплинске мере предлажу се најпре у време празника као шта су Васкрс и Божић. Официри и подофицири из касарни у граду треба да допринесу тачној употреби прописаних мера, контролом саобраћајног хаоса на контролним тачкама и посматрањем преко употребе ваздухопловства.
*(transcript of the latin alphabet)
(U pogledu opšte kvaliteta vazduha u gradu Jerusalimu potrebno je da se preduzmu hitne mere bezbednosti kako bi sprećili otrovanje stanovništva izduvnim gasovima. Eventualna primena mera koje utiću na tok ličnog zdravstvenog stanja koje mogu da doprinesu boljem opštem zbrinjavanju stanovništva na prostoru gradskog centra svakako bi bila smanjenje saobračajnog kolapsa na suštinskim raskrsnicama gde su smeštena turistička ostrva, poput hrišćanskih crkava, ove disciplinske mere predlažu se najpre u vreme praznika kao šta su Vaskrs i Božić. Oficiri i podoficiri iz kasarni u gradu treba da doprinesu tačnoj upotrebi propisanih mera, kontrolom saobračajnog haosa na kontrolnim tačkama i posmatranjem preko upotrebe vazduhoplovstva.)
|}

Garrett D'Amore said...

Well, I don't know enough of any of these languages to be able to tell the difference, although I can see the Cyrillic vs. the Latin.

My understanding from sources like this one:

http://en.wikipedia.org/wiki/Comparison_of_standard_Bosnian,_Croatian_and_Serbian

is that even linguists cannot decide whether or not the languages are the same, although in recent years there have been political forces that have created different national standards (apparently for reasons relating to politics, rather than language itself).

As I said, I've used the ISO's treatment of this in categorizing the languages. I have no desire to get into a political debate about this. Please don't make me regret integrating support for these languages by attempting to draw me into a political debate about their classification.