Unicode keyboard input on Windows NT #31

Closed
SDLBugzilla opened this issue Feb 10, 2021 · 0 comments

This bug report was migrated from our old Bugzilla tracker.

These attachments are available in the static archive:

Reported in version: 1.2.9
Reported for operating system, platform: Windows (All), x86

Comments on the original bug report:

On 2006-01-10 13:18:02 +0000, Alex Volkov wrote:

Is there any interest in adding Unicode input on NT (and above)? I realize that Win9x does not support the ToUnicode() API, but we can at least support it on NT/2k/XP/etc. while keeping the ASCII input on Win9x.
Would such a patch be desired?

On 2006-01-10 15:39:58 +0000, Ryan C. Gordon wrote:

This would be nice to have...in these cases, you usually end up having to use LoadLibrary() on system DLLs to see if the Unicode entry points exist, and falling back to ASCII behaviour when they don't...otherwise apps using SDL.dll will refuse to start up on Win95 or whatever.

--ryan.

On 2006-01-10 15:48:14 +0000, Alex Volkov wrote:

Quite so. I'll write up a patch, hopefully soon.

On 2006-01-11 22:18:05 +0000, Alex Volkov wrote:

Created attachment 20
NT Unicode input patch

Here is the patch. Successfully tested on Win2k and Win95, but there is no reason why this would not work on other NT or 9x versions. It should not interfere with the current WinCE code; however, Unicode input might work on WinCE if specifically enabled. I do not know if WinCE supports the ToUnicode() API, and the MSDN docs are silent about it.

It turns out the ToUnicode() API is present in the Win9x versions of user32.dll but is essentially a no-op, so the LoadLibrary+GetProcAddress trick would not work. Instead, we have to check which platform we are running on with GetVersionEx().
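
The platform check described above could be sketched roughly as follows. This is a minimal illustration and not the code from the attached patch: the helper name `running_on_nt` and the cached static are assumptions, and the non-Windows branch exists only so the fragment compiles elsewhere.

```c
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: returns 1 on the NT family (NT/2k/XP/...),
 * 0 on Win9x or when the version cannot be determined. The result
 * is cached so GetVersionEx() is queried only once. */
static int running_on_nt(void)
{
    static int cached = -1;   /* -1 means "not checked yet" */

    if (cached < 0) {
#ifdef _WIN32
        OSVERSIONINFO info;
        ZeroMemory(&info, sizeof(info));
        info.dwOSVersionInfoSize = sizeof(info);
        cached = (GetVersionEx(&info) &&
                  info.dwPlatformId == VER_PLATFORM_WIN32_NT) ? 1 : 0;
#else
        cached = 0;           /* not Windows at all: no ToUnicode() */
#endif
    }
    return cached;
}
```

Probing for the entry point with GetProcAddress() would report success on Win9x as well, which is why the sketch looks at dwPlatformId instead.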

On 2006-01-14 02:03:00 +0000, Ryan C. Gordon wrote:

An alternate patch was just posted to the mailing list:

http://www.devolution.com/pipermail/sdl/2006-January/072030.html

Alex, can you comment on which direction to go from here? Maybe take parts of each patch, or favor one completely?

--ryan.

On 2006-01-14 09:10:19 +0000, John Popplewell wrote:

Hi,

the main improvements that could be made to my patch would be to make the test for the platform a one-off, and to handle WM_INPUTLANGCHANGE so that the calls to GetKeyboardLayout() and GetLocaleInfo() could be minimized.

However, I don't know whether this matters much, given the rate of WM_KEYDOWN events.

I've not been able to test this on Win95 or WinME, only 98, 2K and XP. I could wheel out an old Win95 CD and a spare HD if necessary. I've only ever seen one ME machine, although I might be able to arrange testing there.

It's also worth pointing out that I'm testing with a UK and US keyboard only, although I do have a UK keyboard with sticky labels on that pretends to be a German keyboard!

My main (test) Win98 machine (Western locale) has all the international support (code-pages, fonts etc.) installed, so I've not tried on a typical, minimalist install.

Unless it crashes, I don't see a problem with requiring appropriate international support for input to work correctly. This will probably only affect development/test machines as international users will almost certainly have all the right stuff installed for their keyboard to work in the first place,

regards,
John.

On 2006-01-14 09:59:48 +0000, Alex Volkov wrote:

There are advantages and disadvantages in both patches. Both patches will not work 100% correctly if the translation to Unicode returns more than 1 Unicode char (like with non-spacing accent chars), though I have yet to see a keyboard that produces those. And this is no more broken than the equivalent X11 code.
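
To make the multi-character limitation concrete, here is a small sketch of one possible policy for a ToUnicode() result. It is not taken from either patch; the helper name `first_unit` and the `dropped` counter are assumptions for illustration. ToUnicode() returns a negative value for a dead key, 0 for no translation, and otherwise the number of UTF-16 units it wrote.

```c
#include <stddef.h>

/* Hypothetical helper, not taken from either patch. 'rc' is the value
 * returned by ToUnicode() and 'buf' the buffer it filled:
 *   rc <  0  dead key (non-spacing accent), nothing to report yet
 *   rc == 0  the key has no translation
 *   rc == 1  one UTF-16 unit in buf[0]
 *   rc >= 2  several units in buf[0..rc-1]
 * A single 16-bit 'unicode' field can carry only one unit, so this
 * sketch keeps buf[0] and counts the rest as dropped. */
static unsigned short first_unit(int rc, const unsigned short *buf,
                                 int *dropped)
{
    if (dropped != NULL) {
        *dropped = (rc > 1) ? rc - 1 : 0;
    }
    return (rc > 0) ? buf[0] : 0;
}
```

Any keyboard layout that produces two units for one key press would lose the second one under this policy, which is the case the comment above describes.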

John's patch has a single copy of SDL_ToUnicode, so it's easier to maintain; however, I really would not want to make all those calls (GetVersionEx, etc.) on every keypress and release.
I cannot testify to John's Left/Right Shift key detection, but I am not very fond of using hardcoded scancodes for that, as scancodes are keyboard-specific. However, in John's defense, Win9x will most likely not even run on exotic hardware that has really weird scancodes, though nothing is guaranteed.

As for Unicode input on Win9x in general, it's a great idea using MultiByteToWideChar() to translate the codepage chars to Unicode, but this will break any current Win9x SDL app that is already relying on codepage chars (and not Unicode). Unlike WinNT, Win9x does not really support Unicode, so the input is done by replacing the 0x80-0xff range with a locale-specific codepage, and some SDL apps may already be abusing this. Myself, I do not care about Win9x all that much, and if those apps break, it's the app vendor's fault, and all the more reason for ppl to switch to NT, but from a maintainer's perspective, this may be better deferred to SDL 1.3 to keep the ABI intact.

On 2006-01-14 11:36:27 +0000, John Popplewell wrote:

(In reply to comment # 6)

<snip!>
> John's patch has a single copy of SDL_ToUnicode, so it's easier to maintain;
> however, I really would not want to make all those calls (GetVersionEx, etc.)
> on every keypress and release.

That can be fixed. I'll put another patch together.

> I cannot testify to John's Left/Right Shift key detection, but I am not very
> fond of using hardcoded scancodes for that, as scancodes are keyboard-specific.
> However, in John's defense, Win9x will most likely not even run on exotic
> hardware that has really weird scancodes, though nothing is guaranteed.

The scancodes for the shift-keys appear to be the same on all keyboards I've tried, and if they are different it will just act like it does now. Support for an arbitrary number of valid scancodes could be added I suppose.

> As for Unicode input on Win9x in general, it's a great idea using
> MultiByteToWideChar() to translate the codepage chars to Unicode, but this
> will break any current Win9x SDL app that is already relying on codepage
> chars (and not Unicode). Unlike WinNT, Win9x does not really support Unicode,
> so the input is done by replacing the 0x80-0xff range with a locale-specific
> codepage, and some SDL apps may already be abusing this. Myself, I do not care
> about Win9x all that much, and if those apps break, it's the app vendor's
> fault, and all the more reason for ppl to switch to NT, but from a maintainer's
> perspective, this may be better deferred to SDL 1.3 to keep the ABI intact.

I don't think this is accurate. My tests suggest that the same problem applies to NT systems with the current version - they also currently receive character codes in "the 0x80-0xff range with a locale-specific codepage" - so the application will break on Win9x and NT.

I've not found an application that does anything like this, but it is difficult to believe that it hasn't happened :-)

Regarding Unicode support in Win9x, once you've got Unicode characters from SDL it would be possible for the application to link to the MS provided 'unicows.dll' if they need transparent Unicode handling,

best regards,
John.

On 2006-01-14 20:04:13 +0000, John Popplewell wrote:

Created attachment 21
ToUnicode() patch for Win9x/ME/2K/XP

This modified patch removes the windib left/right-shift key stuff and minimizes the call overhead when handling WM_KEYDOWN events.

Whilst testing the windib driver I discovered that the left and right shift keys aren't independent: for example, hold down the left-shift key, then toggle the right-shift key, and nothing happens. This isn't how the directx and x11 drivers behave, so I dumped those changes.

On 2006-01-15 19:03:51 +0000, John Popplewell wrote:

Created attachment 22
Improved ToUnicode() patch

Use of a function pointer makes the code simpler and more run-time efficient.
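
The function-pointer idea can be illustrated with a portable sketch. None of this is the patch's actual code: `key_translator`, `translate_nt`, `translate_9x` and `init_translator` are hypothetical names, and the two translators are trivial stand-ins for the real ToUnicode() path and the ToAscii() plus MultiByteToWideChar() path.

```c
/* Hypothetical translator signature: raw key data in, one 16-bit
 * character out. */
typedef unsigned short (*key_translator)(unsigned int raw);

/* Stand-in for the NT path (the real one would call ToUnicode()). */
static unsigned short translate_nt(unsigned int raw)
{
    return (unsigned short)raw;
}

/* Stand-in for the Win9x path (the real one would call ToAscii() and
 * then MultiByteToWideChar()); here it just keeps the low byte. */
static unsigned short translate_9x(unsigned int raw)
{
    return (unsigned short)(raw & 0xFF);
}

/* The pointer is assigned once at start-up, so the key-event handler
 * never has to repeat the platform test. */
static key_translator translate_key = translate_9x;

static void init_translator(int is_nt)
{
    translate_key = is_nt ? translate_nt : translate_9x;
}
```

After `init_translator()` has run, the event handler only needs `translate_key(raw)`, with no per-keypress branching on the platform.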

On 2006-01-15 20:37:42 +0000, John Popplewell wrote:

Thought some explanation of the GetCodePage() function and the use of MultiByteToWideChar() was in order. MS examples of MultiByteToWideChar() usage use the flag CP_ACP or the value returned by GetACP(). This works (translates an 8-bit code-page relative character into a 16-bit Unicode character) as long as the keyboard mapping matches the code-page of your system. This is probably the case for a lot of users, but not for developers or users who work with multiple languages.

For example, my UK Win98 systems have a system code-page identifier of 1252, which means that MultiByteToWideChar(CP_ACP,...) will work for other countries with the same code-page e.g. German, Spanish etc. If I set the keyboard mapping to Polish (1250) or Greek (1253) I get rubbish.

I wondered how 'notepad' did it and discovered that 'notepad' stops you changing to a keyboard mapping that doesn't share the same code-page as the system! This is how I discovered the WM_INPUTLANGCHANGE / WM_INPUTLANGCHANGEREQUEST messages.

However, 'Wordpad' lets you change the keyboard mapping, and handles all the characters fine - so it can be done!

The call to GetLocaleInfo() using an LCID made from the language identifier returned by GetKeyboardLayout() is just the most direct route I found for getting a code-page identifier that changes with the keyboard mapping; there may be a single function call that does this...
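
The lookup chain described above (keyboard layout, then language identifier, then LCID, then GetLocaleInfo()) might be sketched like this. The helper names `keyboard_codepage` and `byte_to_unicode` are made up for illustration and are not the patch's GetCodePage(); the non-Windows branches are only there so the fragment compiles elsewhere.

```c
#include <stdlib.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: ANSI code page of the calling thread's active
 * keyboard layout, or 0 if it cannot be determined. */
static unsigned int keyboard_codepage(void)
{
#ifdef _WIN32
    char buf[8];
    HKL layout = GetKeyboardLayout(0);            /* 0 = current thread */
    LANGID lang = LOWORD((DWORD_PTR)layout);      /* low word = language id */
    LCID lcid = MAKELCID(lang, SORT_DEFAULT);

    if (GetLocaleInfoA(lcid, LOCALE_IDEFAULTANSICODEPAGE,
                       buf, (int)sizeof(buf)) > 0) {
        return (unsigned int)atoi(buf);           /* e.g. "1250" -> 1250 */
    }
#endif
    return 0;
}

/* Hypothetical helper: converts one code-page byte to a UTF-16 unit,
 * returning 0 when the conversion is unavailable or fails. */
static unsigned short byte_to_unicode(unsigned int codepage, unsigned char ch)
{
#ifdef _WIN32
    WCHAR wide = 0;
    char narrow = (char)ch;

    if (MultiByteToWideChar(codepage, 0, &narrow, 1, &wide, 1) == 1) {
        return (unsigned short)wide;
    }
#else
    (void)codepage;
    (void)ch;
#endif
    return 0;
}
```

Since CP_ACP is defined as 0, passing a failed lookup straight through to `byte_to_unicode()` falls back to the system code page, which is the behaviour of the MS examples mentioned earlier.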

Anyway, sorry for rambling and hope this helps,

cheers,
John.

On 2006-01-16 01:56:45 +0000, Alex Volkov wrote:

Patch looks good, John, except I think the dx5 part won't compile with Visual C 6. VC6 cannot handle C99-style variable declarations interspersed with code, so you should put 'BYTE keystate[256];' back where it was, or simply change vkey above it to
UINT vkey = MapVirtualKey(scancode, 1);

On 2006-01-16 02:07:22 +0000, Alex Volkov wrote:

(In reply to comment # 7)

> > As for Unicode input on Win9x in general, it's a great idea using
> > MultiByteToWideChar() to translate the codepage chars to Unicode, but this
> > will break any current Win9x SDL app that is already relying on codepage
> > chars (and not Unicode). Unlike WinNT, Win9x does not really support Unicode,
> I don't think this is accurate. My tests suggest that the same problem applies
> to NT systems with the current version - they also currently receive character
> codes in "the 0x80-0xff range with a locale-specific codepage" - so the
> application will break on Win9x and NT.

On my Win2k with a Russian keyboard, all the Cyrillic keys get translated to '?' by the ToAscii() function, which is what SDL is currently using. Perhaps you were testing with ToAsciiEx()?
On Win95, however, ToAscii() translates the Cyrillic keys to the 0xa1-0xff range (codepage 1251, I think).

On 2006-01-16 07:13:10 +0000, John Popplewell wrote:

Created attachment 23
ToUnicode() patch - bug fix for VC6

(In reply to comment # 11)

> Patch looks good, John, except I think the dx5 part won't compile with Visual C
> 6. VC6 cannot do C99 var declaration interspersed with code, so you should put
> 'BYTE keystate[256];' back where it was, or simply change vkey above it to
> UINT vkey = MapVirtualKey(scancode, 1);

Oops! Thanks for prompting me to try VC6; it didn't like my pointer-to-function declaration either. I've been using MSYS/MinGW ...

cheers,
John.

On 2006-01-16 08:07:34 +0000, John Popplewell wrote:

(In reply to comment # 12)

> (In reply to comment # 7)
>
> > > As for Unicode input on Win9x in general, it's a great idea using
> > > MultiByteToWideChar() to translate the codepage chars to Unicode, but this
> > > will break any current Win9x SDL app that is already relying on codepage
> > > chars (and not Unicode). Unlike WinNT, Win9x does not really support Unicode,
> > I don't think this is accurate. My tests suggest that the same problem applies
> > to NT systems with the current version - they also currently receive character
> > codes in "the 0x80-0xff range with a locale-specific codepage" - so the
> > application will break on Win9x and NT.
>
> On my Win2k with a Russian keyboard, all the Cyrillic keys get translated to
> '?' by the ToAscii() function, which is what SDL is currently using. Perhaps
> you were testing with ToAsciiEx()?

That's possible, I was flailing around for a while :-)

> On Win95, however, ToAscii() translates the Cyrillic keys to the 0xa1-0xff
> range (codepage 1251, I think).

I stand corrected. Interesting. I didn't test with Russian. With Polish, ToAscii() maps the barred-L character to ASCII L and the o-acute is mapped to 0xF3 which is o-acute in code-page 1252 (my default). I get the same effect as you with Russian though.

This is quite good news: doesn't it mean that no application can use a work-round to handle international characters when running on Windows?

I've had a look at some SDL-based projects and haven't (yet) found any that would be affected by this fix. Typically:

  • no use of the unicode field, often with a GUI keyboard for name entry
  • characters >= 128 are ignored (Tux Paint used to do this)
  • the application works internally with Unicode, so it will just start working the same as on X11 (PyGame)

Advice on the SDL documentation Wiki contains this code fragment:

    char ch;
    if ( (keysym.unicode & 0xFF80) == 0 ) {
        ch = keysym.unicode & 0x7F;
    }
    else {
        printf("An International Character.\n");
    }

which I believe will still work.
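
A self-contained version of that check shows why it is unaffected by the change. The function name `ascii_or_zero` is an assumption for illustration, and a plain `unsigned short` stands in for the 16-bit unicode field. The mask 0xFF80 rejects every value of 0x80 or above, whether that value is a codepage byte such as 0xF3 or a real Unicode code point such as U+0142.

```c
/* Hypothetical helper wrapping the wiki's test: returns the 7-bit
 * ASCII character, or 0 for any "international" value. */
static char ascii_or_zero(unsigned short unicode)
{
    if ((unicode & 0xFF80) == 0) {
        return (char)(unicode & 0x7F);
    }
    return 0;
}
```

Only values below 0x80 pass, and those are the same before and after the patch.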

Do you think it's worth trying to identify affected applications?

best regards,
John.

On 2006-01-16 13:59:05 +0000, Alex Volkov wrote:

(In reply to comment # 14)

> This is quite good news: doesn't it mean that no application can use a
> work-round to handle international characters when running on Windows?

Technically, yes. No Windows SDL application can abuse the unicode field on Win9x and WinNT right now. And since the unicode field gets non-1252 codepage chars only on Win9x, it should be relatively safe. I seriously doubt anyone is writing and maintaining apps that run only on Win9x right now.

And for languages in the codepage 1252 the behavior will not change with this patch (one of the beauties of UCS/Unicode). Also the code fragment from the wiki that you mentioned will still stand.
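
The reason codepage 1252 text is unaffected can be shown with a small sketch; the function names are assumptions for illustration. Windows-1252 bytes 0x00-0x7F and 0xA0-0xFF have the same numeric value as their Unicode code points, whereas the Windows-1251 Cyrillic letters at 0xC0-0xFF correspond to U+0410 through U+044F. One caveat worth keeping in mind: the 1252 range 0x80-0x9F (the euro sign, curly quotes and so on) does not share this property.

```c
/* Returns 1 when a Windows-1252 byte has the same numeric value as its
 * Unicode code point: ASCII (0x00-0x7F) and Latin-1 (0xA0-0xFF). */
static int cp1252_same_as_unicode(unsigned char b)
{
    return b < 0x80 || b >= 0xA0;
}

/* Windows-1251 stores the basic Cyrillic letters contiguously at
 * 0xC0-0xFF, matching U+0410..U+044F. Other bytes are outside the
 * scope of this sketch and yield 0. */
static unsigned short cp1251_letter_to_unicode(unsigned char b)
{
    if (b >= 0xC0) {
        return (unsigned short)(0x0410 + (b - 0xC0));
    }
    return 0;
}
```

So the o-acute at 0xF3 keeps the value 0xF3, while a Cyrillic letter that Win9x reported as 0xE0 would become 0x0430.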

> Do you think it's worth trying to identify affected applications?

All things considered -- I do not think so. As codepage 1252 remains unchanged, if there are any others, they are abusing the Win9x behavior and should correct their code.

I found this issue with SDL myself while trying to add support for Russian input in a game. Let's just assume that I was the first, since I did not find any other bug reports re this ;-)

On 2006-01-19 03:51:05 +0000, Ryan C. Gordon wrote:

Reassigning this bug to Sam for final deliberation.

Sam, please note that the commit of Bug # 47 makes the latest version of this patch apply with offset warnings, but it otherwise applies cleanly, still.

Alex, John, thank you for all the discussion and cooperation on this bug...the collaboration is really exciting to see!

--ryan.

On 2006-01-19 04:10:30 +0000, Sam Lantinga wrote:

Thanks guys! This patch is now in CVS.

On 2006-01-27 11:23:11 +0000, Ryan C. Gordon wrote:

Setting Sam as "QA Contact" on all bugs (even resolved ones) so he'll definitely be in the loop to any further discussion here about SDL.

--ryan.
