Re: utf-8 national character input in Konsole/KDE/X?
- From: "Enrique Perez-Terron" <enrio@xxxxxxxxx>
- Date: Sat, 07 Jan 2006 22:32:46 +0100
On Sat, 07 Jan 2006 01:15:29 +0100, Mikko Harjula <harjula@xxxxxxxxxxx> wrote:
I'm running KDE on Fedora Core 4:
$ konsole -v
Qt: 3.3.4
KDE: 3.4.2 Level "b"
Konsole: 1.5.2
I get:
$ konsole -v
Qt: 3.3.4
KDE: 3.5.0-0.2.fc4 Red Hat
Konsole: 1.6
My problem is that I cannot input national characters from keyboard in KDE
Konsole. This used to work fine with my old RedHat 9 using iso-8859-1. I
have the adiaeresis etc. keys on my keyboard. I have googled around, and
others having problems like this are trying to type Chinese or something much more
complicated, so this must be something that should work right out of the box!
Yes, it has always done for me. FC1, FC2, FC3 and FC4.
In a virtual console (the text mode you get with alt-FN, N=1-6, outside of X) I can
type and see utf-8 characters. If XXX is a string with national characters I
can say:
$ cat >XXX
XXX
^D
$ cat XXX
XXX
$
Also less works if I set LESSCHARSET to utf-8. However even in virtual
console 'ls' shows utf-8 national characters in filenames as two question
marks.
If LESSCHARSET matters, I think the applications are doing something strange,
such as running with an incorrect environment (LANG, LC_ALL).
In X (KDE) the default xterm (konsole) is worse. I cannot type
national characters, not even as dead key sequences. National
character keys and 'dead' keys print nothing and also swallow the
following characters. The display shows only some small dots, smaller
and positioned a bit lower than a period. The ls command prints the
"??" for each national char.
It is not clear if ls thinks the utf chars are non-printable, or if it
is the terminal program that does not know what to do with them.
I would like to spy a little, but "ls | od" is not good, because
ls behaves differently when the stdout is not a terminal. Use the
command "script":
cd
script
ls
exit
od -c typescript
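What to look for in that dump, as a rough sketch: if ls itself replaced the
characters, the typescript holds literal question marks (octal 077) and no
bytes above 0177 for those names; if ls wrote UTF-8, the high bytes are there
and it is the terminal that mangles them. The helper name count_high_bytes is
mine, and it assumes a tr that accepts octal ranges (GNU tr does).

```shell
# Hypothetical helper: count the bytes in a file that have the high bit set.
# Every byte of a multi-byte UTF-8 character lies in the range 0200-0377.
count_high_bytes() {
    LC_ALL=C tr -cd '\200-\377' < "$1" | wc -c | tr -d ' '
}

# Usage, after the "script" session above:
#   count_high_bytes typescript
```

A count of 0 for a directory that is known to contain national characters
would point at ls (or its locale); a non-zero count would point at the
terminal.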
The 'cat XXX' shows the contents of the
file OK - of course I have to use wildcards to match the filename.
Less works too. So it seems konsole can show utf-8 but the input
method is wrong.
"script" will capture your keypresses as seen by the shell.
In GNU Emacs 21.4.1 under X and in konsole (option -nw) entering national
characters produces nothing, dired shows national chars in filenames correctly
and displays file content correctly.
XEmacs 21.4 (patch 17) under X seems to work correctly with national
characters, but with -nw option input does not work anymore.
Xterm is a little better than Konsole because there I can type these letters
and they show up correctly, but every time I type one letter the next letter
is not displayed before I type something else. And this behaviour sticks even
through returns, so it seems to be deep inside some input processing as if
these would be like dead keys. By the way the dead-key mechanism (pressing
the diaeresis followed by a produces adiaeresis) does not work either.
Currently I have:
$ env|egrep '^(LANG|LC_)'
LC_COLLATE=posix
LANG=fi_FI.UTF-8
There are stages:
1. You reboot, the kernel starts "init". This program has no initial environment. It sets a few
variables like RUNLEVEL.
2. "init" starts "prefdm". This is a shell script, but it does not load any .bashrc.
It does, however, source /etc/profile.d/lang.sh
3. prefdm starts kdm. I don't know how many steps are involved here. I think it starts
X and a program to display a login screen. This program returns a username, and
kdm runs a session manager, makes it owned by the user, and gives it an initial
environment, including, I think, HOME. The session manager is responsible for
setting up the session programs, starting the window manager, the panel, and all
the other programs that are needed to run a desktop. I don't know what environment
e.g. the panel program gets.
4. The panel or the window manager starts Konsole. Konsole inherits an environment from
the program that starts it.
5. Konsole starts a shell, possibly as a "login" shell, certainly as an interactive shell.
This determines what files the shell sources. See man bash, section INVOCATION.
The files are /etc/profile, ~/.bash_profile, ~/.bash_login, ~/.profile, or ~/.bashrc.
The point is that when you run "env" you see the environment that the shell sets up after
reading these files. Konsole has a different environment. However, you can find out
what environment running programs are seeing, using
tr '\0' '\012' </proc/1234/environ
with the appropriate process id instead of "1234".
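To pick out just the locale settings from such a dump, something like the
following should do. The function names are made up for this message, and note
that /proc/PID/environ shows the environment the process was started with, not
any changes it made to itself later.

```shell
# Hypothetical helper: read a NUL-separated environment block on stdin
# and print only the locale-related variables, one per line.
filter_locale_vars() {
    tr '\0' '\n' | grep -E '^(LANG|LANGUAGE|LC_[A-Z_]+)='
}

# Hypothetical wrapper: locale variables of a running process (Linux /proc).
locale_env_of() {
    filter_locale_vars < "/proc/$1/environ"
}

# Example: the shell you are typing in. Replace $$ with the pid of
# konsole, xev, or whatever program is misbehaving.
locale_env_of $$ || echo "no locale variables found for pid $$"
```

Comparing the output for the shell with the output for konsole itself shows
whether the two really run with different settings.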
and locale is:
$ locale
locale: Cannot set LC_ALL to default locale: Tiedostoa tai hakemistoa ei ole
LANG=fi_FI.UTF-8
LC_CTYPE="fi_FI.UTF-8"
LC_NUMERIC="fi_FI.UTF-8"
LC_TIME="fi_FI.UTF-8"
LC_COLLATE=posix
LC_MONETARY="fi_FI.UTF-8"
LC_MESSAGES="fi_FI.UTF-8"
LC_PAPER="fi_FI.UTF-8"
LC_NAME="fi_FI.UTF-8"
LC_ADDRESS="fi_FI.UTF-8"
LC_TELEPHONE="fi_FI.UTF-8"
LC_MEASUREMENT="fi_FI.UTF-8"
LC_IDENTIFICATION="fi_FI.UTF-8"
LC_ALL=
The message "Tiedostoa tai hakemistoa ei ole" is "no such file" in Finnish.
On my system there is no missing file:
$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
I can do:
$ LANG=no_NO.UTF-8
$ locale
LANG=no_NO.UTF-8
LC_CTYPE="no_NO.UTF-8"
LC_NUMERIC="no_NO.UTF-8"
LC_TIME="no_NO.UTF-8"
LC_COLLATE="no_NO.UTF-8"
LC_MONETARY="no_NO.UTF-8"
LC_MESSAGES="no_NO.UTF-8"
LC_PAPER="no_NO.UTF-8"
LC_NAME="no_NO.UTF-8"
LC_ADDRESS="no_NO.UTF-8"
LC_TELEPHONE="no_NO.UTF-8"
LC_MEASUREMENT="no_NO.UTF-8"
LC_IDENTIFICATION="no_NO.UTF-8"
LC_ALL=
I suggest you do "strace -e trace=file locale". I get:
$ strace -e trace=file locale
execve("/usr/bin/locale", ["locale"], [/* 47 vars */]) = 0
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=79670, ...}) = 0
open("/lib/libc.so.6", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0755, st_size=1485672, ...}) = 0
open("/usr/lib/locale/locale-archive", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=49608448, ...}) = 0
fstat64(1, {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 1), ...}) = 0
LANG=no_NO.UTF-8
LC_CTYPE="no_NO.UTF-8"
etc., as before.
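A cheaper check to run alongside strace, as a sketch: ask whether the system
lists each locale name that appears in LANG and the LC_* variables (for
instance the LC_COLLATE=posix in the quoted output). This assumes glibc's
"locale -a", which prints codesets in the form utf8; locale_known is a name I
made up. It only compares names against that list and does not prove what
setlocale() will accept.

```shell
# Hypothetical helper: succeed if the given locale name appears in the
# output of "locale -a". ".UTF-8" and ".utf8" are treated as the same
# spelling; everything else is compared exactly, including case.
locale_known() {
    name=$(printf '%s\n' "$1" | sed 's/\.UTF-8$/.utf8/')
    locale -a 2>/dev/null | sed 's/\.UTF-8$/.utf8/' | grep -qxF -- "$name"
}

for l in fi_FI.UTF-8 posix POSIX; do
    if locale_known "$l"; then
        echo "$l: listed"
    else
        echo "$l: not listed"
    fi
done
```

Any name reported as not listed is a candidate for the file that locale
fails to find.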
My /etc/sysconfig/i18n contains:
LANG="fi_FI.UTF-8"
SYSFONT="latarcyrheb-sun16"
SUPPORTED="fi_FI.UTF-8:fi_FI:fi"
Looks good, I have
LANG="en_US.UTF-8"
SUPPORTED="en_US.UTF-8:en_US:en:fr_FR.UTF-8:fr_FR:fr:no_NO.UTF-8:no_NO:no:es_ES.UTF-8:es_ES:es"
SYSFONT="latarcyrheb-sun16"
My /etc/X11/xorg.conf has:
Section "InputDevice"
Identifier "Keyboard0"
Driver "kbd"
Option "XkbModel" "pc105"
Option "XkbLayout" "fi"
EndSection
Looks good, I have
Section "InputDevice"
Identifier "Keyboard0"
Driver "kbd"
Option "XkbModel" "pc105"
Option "XkbLayout" "no"
EndSection
I don't have ~/.i18n or ~/.Xmodmap and I don't know if I should have one.
When I start xev and press and release 'ä' I get:
KeyPress event, serial 31, synthetic NO, window 0x2600001,
root 0x5f, subw 0x0, time 36690043, (434,546), root:(438,567),
state 0x10, keycode 48 (keysym 0xe4, adiaeresis), same_screen YES,
XLookupString gives 1 bytes: (e4) "�
XmbLookupString gives 1 bytes: (e4) "�
What does the string in quotes look like? In my news reader it is coming across
as double-quote, i-with-dieresis, inverted-question-mark(spanish),
vulgar-fraction-one-half, and there is no terminating double quote.
XFilterEvent returns: False
KeyRelease event, serial 31, synthetic NO, window 0x2600001,
root 0x5f, subw 0x0, time 36690142, (434,546), root:(438,567),
state 0x10, keycode 48 (keysym 0xe4, adiaeresis), same_screen YES,
XLookupString gives 1 bytes: (e4) "�
The adiaeresis seems to be correct but XLookupString gives something funny.
It should be like this:
KeyPress event, serial 28, synthetic NO, window 0x2400001,
root 0x40, subw 0x0, time 361871676, (105,108), root:(1169,1113),
state 0x10, keycode 48 (keysym 0xe6, ae), same_screen YES,
XLookupString gives 2 bytes: (c3 a6) "æ"
XmbLookupString gives 2 bytes: (c3 a6) "æ"
XFilterEvent returns: False
KeyRelease event, serial 31, synthetic NO, window 0x2400001,
root 0x40, subw 0x0, time 361871754, (105,108), root:(1169,1113),
state 0x10, keycode 48 (keysym 0xe6, ae), same_screen YES,
XLookupString gives 2 bytes: (c3 a6) "æ"
that is, XLookupString should return two bytes, not one. (I don't know how the
"ae" character displays in your news reader or browser; on my screen it is the
standard Danish/Norwegian ligature.)
Check what is in the environment of the xev program! It's this program
that runs the XLookupString function and gets a single byte instead of two.
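The one byte versus two bytes difference is just the encoding. The character
with keysym 0xe4 is the single byte e4 in ISO-8859-1 and the two bytes c3 a4
in UTF-8, so an xev that reports (e4) is consistent with a program that is not
running in a UTF-8 locale. A quick way to see the two forms side by side,
assuming iconv and od are installed (to_hex is a throwaway helper of mine):

```shell
# Hypothetical helper: print stdin as a compact hex string.
to_hex() { od -An -tx1 | tr -d ' \n'; echo; }

# adiaeresis as the single ISO-8859-1 byte
printf '\344' | to_hex                                  # prints e4

# the same character converted to UTF-8
printf '\344' | iconv -f ISO-8859-1 -t UTF-8 | to_hex   # prints c3a4

# ae, as in my xev output
printf '\346' | iconv -f ISO-8859-1 -t UTF-8 | to_hex   # prints c3a6
```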
I have spent several days on this without any luck (including several
reinstalls) and would appreciate any help. Also, pointers to in-depth
documentation thoroughly explaining the mechanisms involved when keycodes go
through kernel, X, KDE, konsole, bash and whatever would be most welcome.
Been there, all forgotten. Repeatedly. Each time, started over afresh. Arghh.
When you find the origin of the problem, you have probably changed a few things
that were actually correct, and you may fail to realise that you have found it.
Double arghhh.
-Enrique