[publican-list] Publican 1.0 nearly here

Miloš Komarčević kmilos at gmail.com
Mon Nov 2 00:57:27 UTC 2009


Thanks for responding so promptly, Rudi.

On Sun, Nov 1, 2009 at 11:09 PM, Ruediger Landmann
<r.landmann at redhat.com> wrote:
> Unfortunately, as you note, Serbian labelling is really a mess right now. As
> a friend of mine likes to say: "Standards are a good thing. There are so
> many to choose from!" :)

Good to hear you're aware of the issue.

> A brief recap:
>
> Prior to Publican and Transifex: sr_Latn (what standard is that anyway?).
> There are still many files and directories with this name in the various
> repos.

I chose sr_Latn for docs as it was the only option supported by
DocBook XML. I also remember reading somewhere that the separators
(dash vs underscore) should be treated identically (DocBook seemed to
convert dashes to underscores anyway; sorry, I can't find the reference),
and CLDR uses underscores as well, so I was hoping glibc would
eventually follow suit by extending the existing sr_RS, hence the choice
of underscore. So, in short, the underscore is of POSIX heritage, which
DocBook crossbred with the XML/W3C/IETF one. Anyway, this is the least
of our issues ;)

> Transifex 0.5 strictly insists on the locales as defined in glibc or it
> won't render the language name (although it will still serve the file),
> hence sr@latin in the Transifex branches of many F-11 docs. The situation is
> complicated because if a PO file with a particular name /ever/ existed,
> Transifex seems to remember it, so some repos have ended up with Serbian
> (Latin) files under both names.

Yes, but I (as the team coordinator) was fine with that, and I wanted
to keep UI (glibc) and docs translations separate on purpose (because
I consider glibc's script modifiers to be "broken"). sr_Latn
translations just didn't show up in Transifex stats, and we didn't
care as long as we could download and submit, while the stats were
assumed to be identical to those of sr at all times (I was making sure
they're in sync; the Latin version is "automagically" transliterated from
Cyrillic, while the other way doesn't work so well). The confusion
started when sr@latin started showing up in new publicanized docs out
of our control. For a while there was a feature in Transifex (pre
0.3-0.4?) to hide a language, and ideally we would have liked to have
all of the Latin translations hidden because they should track the
original sr anyway, and we didn't want people submitting and tracking
them separately as an independent "language", so it was all perfectly
fine and intentional.
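
To illustrate why the Cyrillic-to-Latin direction is the easy one, here is a
minimal sketch (not the tool we actually use; the function name cyr_to_lat is
made up for this example). Every Cyrillic letter maps to exactly one Latin
letter or digraph, so a per-character lookup is enough. The reverse is
ambiguous, because a Latin "nj" can come from either "њ" or "нј".

```python
# The 30 letters of the Serbian Cyrillic alphabet and their Latin counterparts.
CYRILLIC = "абвгдђежзијклљмнњопрстћуфхцчџш"
LATIN = ["a", "b", "v", "g", "d", "đ", "e", "ž", "z", "i",
         "j", "k", "l", "lj", "m", "n", "nj", "o", "p", "r",
         "s", "t", "ć", "u", "f", "h", "c", "č", "dž", "š"]
TABLE = dict(zip(CYRILLIC, LATIN))


def cyr_to_lat(text):
    """Transliterate Serbian Cyrillic text to Latin, one character at a time."""
    out = []
    for ch in text:
        lower = ch.lower()
        if lower in TABLE:
            latin = TABLE[lower]
            # Capital letters become "Lj", "Nj", "Dž"; an all-caps word would
            # need "LJ" etc., which this sketch does not attempt.
            out.append(latin.capitalize() if ch != lower else latin)
        else:
            # Anything that is not a Serbian Cyrillic letter passes through.
            out.append(ch)
    return "".join(out)
```

Both cyr_to_lat("њива") and cyr_to_lat("инјекција") produce an "nj" in the
output ("njiva", "injekcija"), which is exactly the information a
Latin-to-Cyrillic pass would have to guess back.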

On a side note, I have also noticed that Transifex exhibits some
strange "memory" behavior and doesn't always reflect the true state of
the upstream repo when a file is removed.

> On the other hand, Publican 0.x never handled script subtags properly, but
> by chance (not design!) happened to handle sr-LATN correctly, hence its use
> in the actual doc directories.

Understood.

> Moving forward:
>
> As soon as F-12 is released, we can move all Publican sr-LATN directories to
> sr-Latn-RS, which Publican 1.0 handles correctly.
>
> As soon as Fedora's Transifex instance is upgraded to 0.7 (or later), we can
> also move all sr_Latn and sr@latin to sr-Latn-RS, which Transifex 0.7
> handles correctly.

Agreed for docs, but don't know if moving UI stuff away from glibc's
sr@latin is a good idea atm. From what I hear, Transifex 0.7 will
support both ways just fine.
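
For the docs side, the renaming could be scripted along these lines. This is
only a sketch under my own assumptions: the function name to_bcp47 and the
default_region parameter are hypothetical, and the modifier table covers just
the two glibc modifiers relevant to Serbian. It is not how Publican or
Transifex do it internally.

```python
import re

# glibc-style "@modifier" names mapped to ISO 15924 script codes.
# Assumption: only the two modifiers relevant to Serbian are listed.
MODIFIER_TO_SCRIPT = {"latin": "Latn", "cyrillic": "Cyrl"}


def to_bcp47(name, default_region=None):
    """Normalise a legacy name (sr_Latn, sr@latin, sr-LATN) to language-Script-REGION."""
    base, _, modifier = name.partition("@")
    parts = re.split(r"[-_]", base)
    language = parts[0].lower()
    # An unknown or missing modifier simply yields no script.
    script = MODIFIER_TO_SCRIPT.get(modifier.lower())
    region = None
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            script = part.title()   # four letters: script subtag, e.g. LATN -> Latn
        elif len(part) == 2 and part.isalpha():
            region = part.upper()   # two letters: region subtag, e.g. rs -> RS
    region = region or default_region
    return "-".join(p for p in (language, script, region) if p)
```

With default_region="RS", all three of sr_Latn, sr@latin and sr-LATN come out
as sr-Latn-RS, while a plain sr or sr_RS is left without a script subtag.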

> Hopefully, this will end the madness! And, like you say, prevent any
> confusion that we might cause up- or downstream.
>
> It might be worth filing a bug against Fedora-Localization : l10n-requests
> so that we can track this work; feel free to assign it to me.

Thanks, will do as soon as F12 is out and infra is updated as you mentioned.

> Finally, I note that the IANA register of subtags does not set any default
> script for Serbian (ie, the "Suppress-Script" parameter is not set).
> Therefore, Serbian, when written with the Cyrillic script is probably most
> accurately tagged sr-Cyrl; so perhaps we should move sr and sr-RS to
> sr-Cyrl-RS as well?

True, we have intentionally tipped slightly in favor of the Cyrillic
default as it is the script of official government documents etc. as
prescribed by the Constitution (see Article 10 of [1]), although both scripts
are probably used equally in practice. We would probably like to keep
the Cyrillic default preference, as there is consensus with other L10n
teams such as Gnome, KDE and OOo on this as well. (Curiously enough,
Google pages also default to Cyrillic script when you set up your
browser for sr only.)

Thanks again for taking an interest in this, I hope other languages
using multiple scripts will benefit from these discussions as well in
the long run.

Regards,
Miloš

[1] http://www.parlament.gov.rs/content/eng/akta/ustav/ustav_1.asp



