| Summary: | changeset 8142/8143 breaks reception of external WM_(UNI)CHAR messages | ||
|---|---|---|---|
| Product: | SDL | Reporter: | Andreas Ertelt <bugzilla-sdl> |
| Component: | events | Assignee: | Sam Lantinga <slouken> |
| Status: | RESOLVED ABANDONED | QA Contact: | Sam Lantinga <slouken> |
| Severity: | normal | ||
| Priority: | P2 | ||
| Version: | HG 2.1 | ||
| Hardware: | All | ||
| OS: | Windows (All) | ||
| See Also: |
https://bugzilla.libsdl.org/show_bug.cgi?id=1876 https://bugzilla.libsdl.org/show_bug.cgi?id=2287 |
||
| Attachments: |
TranslateMessage / unicode workaround
SDL_windowsevents.c.patch |
||
|
Description
Andreas Ertelt
2014-02-19 16:12:42 UTC
Created attachment 1567 [details]
TranslateMessage / unicode workaround
After thinking about this issue some more I realized that the problem requires the translation being done in SDL_PumpEvents().
My suggestion would be moving the manual translation code down (it turned out the repetition loop isn't actually necessary) to PumpEvents and calling TranslateMessage() for messages other than WM_KEYDOWN just in case.
For me this patch works fine, but I cannot account for your Qt case.
Just noticed that doing it this way causes complex Unicode input through WM_KEYDOWN to only return question marks again, so this needs more work.
Turns out the character code in wParam gets truncated when sending the message as WM_CHAR; using WM_UNICHAR fixes this problem. Therefore we would have to distinguish between translations that fit into the UTF-16 supported by WM_CHAR and those that need UTF-32, which have to be sent as WM_UNICHAR.
For testing purposes I sent codes from 0 to 6000 through WM_CHAR events with different system locales and keyboard layouts set. The results showed a seemingly fixed translation table, with smaller and larger gaps returning 3F (?) all throughout this range. This means the problem is not a truncation issue as previously assumed.
It was also pointed out on the forums that the result from ToUnicode() is actually a UTF-16 encoded string and not a Unicode code point. This means that the conversion (at least for the higher Unicode tables) is incorrect (as is the function naming, which states UTF32 instead of Unicode). This does not cause any ill effect, however, because the affected characters have never been accessible anyway.
Another oddity I found is that some emoji characters, when sent through the Windows 8 touch keyboard (which is slightly different from the onscreen keyboard), seem to produce not one but two WM_KEYDOWN events: the first one is translated to a lone UTF-16 high surrogate, and the following one is the continuation of the UTF-16 code.
I looked at the conversion in WIN_ConvertUTF32toUTF8() again and the name is actually fine; it should, however, only be used for WM_UNICHAR. WM_CHAR has to use WideCharToMultiByte() instead. Should conversions in general be using SDL_iconv() even if they're system specific? I found other locations in SDL using WideCharToMultiByte() already, so I'm guessing no?
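The split surrogate behaviour described above can be illustrated with a small, portable sketch. It is not the code from the attached patch: `Utf16Joiner`, `joiner_feed()` and `encode_utf8()` are hypothetical names introduced only for this example, and no Windows API is called. It assumes each incoming message delivers one UTF-16 code unit, keeps a high surrogate until its low partner arrives, and then emits the combined code point as UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical state: a high surrogate waiting for its partner (0 = none). */
typedef struct {
    uint16_t pending_high;
} Utf16Joiner;

/* Encode one code point as NUL-terminated UTF-8.
 * Returns the number of bytes written, or 0 for an invalid code point. */
static size_t encode_utf8(uint32_t cp, char out[5])
{
    size_t n = 0;
    if (cp < 0x80) {
        out[n++] = (char)cp;
    } else if (cp < 0x800) {
        out[n++] = (char)(0xC0 | (cp >> 6));
        out[n++] = (char)(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF) {
            out[0] = '\0';              /* lone surrogate: not encodable */
            return 0;
        }
        out[n++] = (char)(0xE0 | (cp >> 12));
        out[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[n++] = (char)(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {
        out[n++] = (char)(0xF0 | (cp >> 18));
        out[n++] = (char)(0x80 | ((cp >> 12) & 0x3F));
        out[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[n++] = (char)(0x80 | (cp & 0x3F));
    }
    out[n] = '\0';
    return n;
}

/* Feed one UTF-16 code unit (for example, the value carried by one message).
 * Returns the number of UTF-8 bytes written to out, or 0 if the unit was
 * buffered (high surrogate) or dropped (unpaired low surrogate). */
static size_t joiner_feed(Utf16Joiner *j, uint16_t unit, char out[5])
{
    assert(j != NULL);
    out[0] = '\0';

    if (unit >= 0xD800 && unit <= 0xDBFF) {
        j->pending_high = unit;         /* wait for the low surrogate */
        return 0;
    }
    if (unit >= 0xDC00 && unit <= 0xDFFF) {
        uint16_t high = j->pending_high;
        uint32_t cp;
        j->pending_high = 0;
        if (high == 0) {
            return 0;                   /* low surrogate without a partner */
        }
        cp = 0x10000 + (((uint32_t)(high - 0xD800) << 10)
                        | (uint32_t)(unit - 0xDC00));
        return encode_utf8(cp, out);
    }

    j->pending_high = 0;                /* discard any stale high surrogate */
    return encode_utf8(unit, out);
}
```

Keeping the pending unit in a small struct rather than a file-level variable is only a choice made for this sketch; it makes the state explicit and easy to test in isolation.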
Regarding the separate events from wider characters: other libraries do indeed seem to store and drop WM_CHAR messages containing lone surrogates, to be used together with the next WM_CHAR event. Is there a particular place where such state should be stored in SDL, or should a simple global variable be introduced for this?
Regarding the WM_CHAR filtered character issue: the application itself was simply not compiled with -DUNICODE and -D_UNICODE. So despite IsWindowUnicode() returning true, WM_CHAR was not receiving UTF-16 codes.
Created attachment 1573 [details]
SDL_windowsevents.c.patch
To sum things up: TranslateMessage() was never the actual issue in bug #1876. Rather, it was the fact that the application itself was not compiled with -DUNICODE that made it behave "unexpectedly" to me. The patch was practically trying to force ANSI applications to receive Unicode characters. This information really ought to make it into the wiki entry describing SDL_TEXTINPUT in the form of a remark like "To receive Unicode characters on Windows your application has to be compiled with the macro UNICODE defined (or _UNICODE when using VC++). Failing to do so will result in receiving question marks (? = 0x3f) for certain characters."
The attached patch allows WM_CHAR and WM_UNICHAR to function again, with WM_CHAR now also supporting the wider range of UTF-16 encodings sent using split events. The patch defines a global variable "Uint16 last_surrogate;" to store the first part of a split-event WM_CHAR message. I figured this was okay, as it had already been done this way further down the file (right in front of SDL_RegisterApp()).
The key handling / IME text input code has changed so many times, because it's broken in different ways, that I'm very hesitant to change what's there. However, I'm marking this bug assigned for review once I'm brave enough to change it. :)
Hello, and sorry if you're getting dozens of copies of this message by email.
We are closing out bugs that appear to be abandoned in some form. This can happen for lots of reasons: we couldn't reproduce it, conversation faded out, the bug was noted as fixed in a comment but we forgot to mark it resolved, the report is good but the fix is impractical, we fixed it a long time ago without realizing there was an associated report, etc.
Individually, any of these bugs might have a better resolution (such as WONTFIX or WORKSFORME or INVALID) but we've added a new resolution of ABANDONED to make this easily searchable and make it clear that it's not necessarily unreasonable to revive a given bug report.
So if this bug is still a going concern and you feel it should still be open: please feel free to reopen it! But unless you respond, we'd like to consider these bugs closed, as many of them are several years old and overwhelming our ability to prioritize recent issues.
(Please note that hundreds of bug reports were sorted through here, so we apologize for any human error. Just reopen the bug in that case!)
Thanks, --ryan. |