[opensuse] collation bug in locales using UTF-8 (cur= a<A<b<B<z<Z; should be A<B<C<Z<a<b<c<z)
- From: Linda Walsh <suse@xxxxxxxxx>
- Date: Sun, 27 May 2012 13:07:13 -0700
It seems that openSUSE suffers from this bug as well:
Somewhere along the line, because people were paying attention to POSIX, they thought
they could change the collation order for any locale other than the 'C' locale to anything they wanted.
What they didn't realize is that Unicode also specifies a collation order that
is roughly (maybe exactly, for the characters the C locale covers) equivalent to the C locale's order.
Localized character sets such as iso-8859-xx and others might have different collating orders. But for those who have UTF-8 as the default encoding, or
more so, who set their locale to lang_CO.UTF-8, the character set's collation order
should take precedence, __AT LEAST__ in character ranges as used in regexes.
Those are pure characters with no words involved, so only the ordering of the character set should be used.
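
To make the two orderings concrete, here is a rough sketch in Python (not part of the original report). It sorts the same six letters once by Unicode code point, which is the A<B<Z<a<b<z order asked for above, and once with the collation rules of a named locale. The helper names `codepoint_order` and `locale_order` are made up for illustration, and "en_US.UTF-8" is only an assumed locale name that may not be installed on a given system, in which case the helper returns None.

```python
import locale

letters = ["a", "A", "b", "B", "z", "Z"]


def codepoint_order(items):
    # Python compares str values by Unicode code point, which for ASCII
    # letters is the same order the C locale uses.
    return sorted(items)


def locale_order(items, name):
    # Sort with the LC_COLLATE rules of the named locale.
    # Returns None if that locale is not available on this system.
    old = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, name)
    except locale.Error:
        return None
    try:
        return sorted(items, key=locale.strxfrm)
    finally:
        # Restore whatever collation setting was active before.
        locale.setlocale(locale.LC_COLLATE, old)


print(codepoint_order(letters))              # ['A', 'B', 'Z', 'a', 'b', 'z']
print(locale_order(letters, "en_US.UTF-8"))  # depends on the installed locale data
```

On a system showing the behaviour described here, the second line would be expected to interleave upper and lower case (a, A, b, B, ...), but that depends entirely on the locale definitions installed.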
Maybe this needs to be fixed in the GNU sorting functions?
Unicode sources are quoted with links in the above bug...