We are currently migrating Bugzilla to GitHub issues.
Any changes made to the bug tracker now will be lost, so please do not post new bugs or make changes to them.
When we're done, all bug URLs will redirect to their equivalent location on the new bug tracker.

Bug 3211

Summary: Complex text layout support
Product: SDL_ttf Reporter: anood almuharbi <anood.almuharbi>
Component: miscAssignee: Sam Lantinga <slouken>
Status: NEW --- QA Contact: Sam Lantinga <slouken>
Severity: enhancement    
Priority: P2 CC: dr.khaled.hosny, fahad.alsaidi, grumbel, philipp.wiesemann, sylvain.becker
Version: unspecified   
Hardware: All   
OS: All   
Attachments: patch
new patch
new patch
new patch
new patch
new patch
new patch

Description anood almuharbi 2015-12-22 06:36:59 UTC
Created attachment 2342 [details]
patch

This patch adds support for languages that require "complex text layout" (https://en.wikipedia.org/wiki/Complex_text_layout).

We are using "libraqm" (https://github.com/HOST-Oman/libraqm), a small source code-only library that wraps FriBidi (for bidirectional text support) and HarfBuzz (for text shaping), and does proper BiDi and script itemization.

The CTL support is enabled by default but can be disabled at compiling time, and we provide a fallback function that uses your original code without CTL support.
Comment 1 Sylvain 2015-12-22 10:41:02 UTC
Hi, 

You might be interested by this patch:
https://bugzilla.libsdl.org/show_bug.cgi?id=3046

This is SDL_ttf on HarfBuzz + FreeType.
But there is no dependency to libraqm neither FreeBidi.

It's up to the user to do BiDi, ie to reverse the string if needed, inverse parenthesis, so on.
For instance, for RTL language, I only reverse the string, then tell harfbuzz to shape as LTR, and let SDL_tff renders naturally as LTR. It works correctly.

Also, I notice, it was important to be able to tell the "script" used to HarfBuzz, otherwise the text shaping was incomplete.
Cheers,
Comment 2 Khaled Hosny 2015-12-22 11:21:52 UTC
(In reply to Sylvain from comment #1)
> Hi, 
> 
> You might be interested by this patch:
> https://bugzilla.libsdl.org/show_bug.cgi?id=3046
> 
> This is SDL_ttf on HarfBuzz + FreeType.
> But there is no dependency to libraqm neither FreeBidi.
> 
> It's up to the user to do BiDi, ie to reverse the string if needed, inverse
> parenthesis, so on.

I think it is better to do BiDi in one place i.e in SDL_ttf, experience have shown that most developers don’t have enough experience to do it correctly.

> For instance, for RTL language, I only reverse the string, then tell
> harfbuzz to shape as LTR, and let SDL_tff renders naturally as LTR. It works
> correctly.

This won’t work for Arabic or any RTL language that require shaping, so I’m surprised it worked for you.

> Also, I notice, it was important to be able to tell the "script" used to
> HarfBuzz, otherwise the text shaping was incomplete.

True, and libraqm does this. It handles script itemisation and resolve the script of common characters, and even handling paired characters.
Comment 3 Khaled Hosny 2015-12-22 11:25:19 UTC
Further more, linraqm is a source only library so it adds no dependency of its own since it should be included with SDL_ttf. We wrote it as a collection of common code we are using to added complex text support to several projects.
Comment 4 Sylvain 2015-12-22 12:33:34 UTC
For RTL, it works.
but this is the sum of 3 awkward things :
reverse the string buffer.
ask hb to shape for LTR
SDL_ttf naturally draw as LTR.


I don't really see where you set the HB_SCRIPT_. E.g. HB_SCRIPT_LATIN, HB_SCRIPT_ARABIC, HB_SCRIPT_HAN, HB_SCRIPT_HEBREW, etc.
without this, I am absolutely sure I got incomplete text shaping. Maybe not for latin, arabic, but for some there was missing glyph substitutions.


Anyway, I would also prefer to have everything in SDL_tff ...
Comment 5 Khaled Hosny 2015-12-22 17:46:20 UTC
(In reply to Sylvain from comment #4)
> For RTL, it works.
> but this is the sum of 3 awkward things :
> reverse the string buffer.
> ask hb to shape for LTR
> SDL_ttf naturally draw as LTR.

I’m pretty sure this would have several problems, it might seem to work at the surface but it is not the way how RTL text should be passed to HarfBuzz and will fail in many situations.

> I don't really see where you set the HB_SCRIPT_. E.g. HB_SCRIPT_LATIN,
> HB_SCRIPT_ARABIC, HB_SCRIPT_HAN, HB_SCRIPT_HEBREW, etc.
> without this, I am absolutely sure I got incomplete text shaping. Maybe not
> for latin, arabic, but for some there was missing glyph substitutions.

The code in `itemize_by_script` does the script detection, and we then use it in `harfbuzz_shape`.
Comment 6 Sylvain 2015-12-23 14:52:47 UTC
I agree with you there should be issues (like context pattern substitution that wouldn't match) but I think I doubled check my renderings when I wrote that, and I just found it was a good workaround for some other issue.

In doubt, I changed to the correct way. Not a single difference in rendering! pixels are equals.

In fact, this is even easier:

For a plain RTL string, just set the direction as RTL to HarfBuzz, and SDL_ttf+HarfBuzz is just able to render it browsing from 0 to len-1 the "hb_glyph_info_t".

On top of SDL_ttf, I have also changed a little bit my thin layer that allows to render strings containing both RTL + LTR.



I think, there is another issue in text rendering that would be important to take care of.
Many times, the string to render cannot be rendered with only one font.
It can also happens with RTL. For instance, you have a RTL string that contains some latin LTR sub-string.
Assuming the RTL font is not able to render the LTR string.

If SDL_ttf handles the shaping of bi-direction text support, it would be also important to render from a set of fonts.
Comment 7 Khaled Hosny 2015-12-23 15:38:15 UTC
(In reply to Sylvain from comment #6)
> I agree with you there should be issues (like context pattern substitution
> that wouldn't match) but I think I doubled check my renderings when I wrote
> that, and I just found it was a good workaround for some other issue.
> 
> In doubt, I changed to the correct way. Not a single difference in
> rendering! pixels are equals.

Try an RTL string containing parenthesis (or other characters with Bidi_Mirroring property), they will not be mirrored if text is shaped LTR. Mirroring them in the input string is not enough as it will not handle characters that has no encoded mirror in Unicode, and there is also features like ltra/rtla and ltrm/rtlm that depend on passing the correct direction to HarfBuzz.

> I think, there is another issue in text rendering that would be important to
> take care of.
> Many times, the string to render cannot be rendered with only one font.
> It can also happens with RTL. For instance, you have a RTL string that
> contains some latin LTR sub-string.
> Assuming the RTL font is not able to render the LTR string.
> 
> If SDL_ttf handles the shaping of bi-direction text support, it would be
> also important to render from a set of fonts.

Current SDL_ttf API works with single font at a time, I agree that supporting font fallback is important, but it will require a new API and an entirely different set of dependencies (if one wants to integrate with system font management APIs).
Comment 8 anood almuharbi 2015-12-28 05:12:07 UTC
Created attachment 2348 [details]
new patch

I have attached new patch , that fix small issue in the fall back function
Comment 9 Philipp Wiesemann 2015-12-28 19:47:10 UTC
Instead of malloc()/free() the wrappers SDL_malloc()/SDL_free() should be used (to improve portability).
Comment 10 anood almuharbi 2015-12-30 05:22:55 UTC
Created attachment 2352 [details]
new patch

I have added a new patch , to replace free() and malloc() with SDL_free() and SDL_malloc().
Comment 11 anood almuharbi 2015-12-30 05:24:00 UTC
(In reply to Philipp Wiesemann from comment #9)
> Instead of malloc()/free() the wrappers SDL_malloc()/SDL_free() should be
> used (to improve portability).

Thank you sir :)
Comment 12 anood almuharbi 2016-01-05 10:38:35 UTC
Created attachment 2353 [details]
new patch

I have uploaded a new patch , I forgot to add raqm.c and raqm.h in the previous patch.
Comment 13 anood almuharbi 2016-01-05 10:52:00 UTC
Created attachment 2354 [details]
new patch

added a new patch to support changes in libraqm.
Comment 14 anood almuharbi 2016-01-24 08:48:35 UTC
Created attachment 2368 [details]
new patch

 We are using now a standalone library 'Raqm', that wraps FriBidi (for bidirectional
    text support) and HarfBuzz (for text shaping), and does proper BiDi and scrip itemization.
    
Raqm support is enabled by default but can be disabled at compiling time.

https://github.com/HOST-Oman/libraqm
Comment 15 Philipp Wiesemann 2016-01-26 21:43:39 UTC
The function text_layout() from the patch calls exit(). This makes it more difficult for applications to handle errors (e.g. show a message or save files before abort).
Comment 16 anood almuharbi 2016-01-27 07:09:11 UTC
Created attachment 2369 [details]
new patch

Replace exit with return.
Comment 17 anood almuharbi 2016-01-27 07:10:21 UTC
(In reply to Philipp Wiesemann from comment #15)
> The function text_layout() from the patch calls exit(). This makes it more
> difficult for applications to handle errors (e.g. show a message or save
> files before abort).

You are right :), thanx.
Comment 18 Khaled Hosny 2016-02-26 18:08:21 UTC
Is there any interest from SDL_ttf maintainer(s) in this work, or what be the better course of action is to work out some other way (a different library?) to support complex text scripts in SDL?
Comment 19 grumbel 2018-09-10 08:43:13 UTC
The patch still has a bunch of issue. Namely it mixes up char/index when accessing the cache thus leading incorrect characters being displayed and it doesn't free raqm_t properly.

I attempted to fix those issue:

https://github.com/SuperTux/SDL_ttf/commit/be4e16698237cf83d65e86fb83655eee9f29ac12

https://github.com/SuperTux/SDL_ttf/commit/25588c5b32af5d26e140e5df181b27f664f0e47f

But given that there is still a bunch of Find_Glyph() in the code further fixing might be needed.
Comment 20 Sylvain 2018-10-19 13:15:38 UTC
I have also integrated a text shaping library, and I have extracted the part where I replaced the SDL_ttf Glyph cache system by an Index cache system.

I proposed the patch as I thought it was reducing the number of collision (hence an improvement) but that was a mistake. In the end, it's similar.

Nevertheless, the Index cache system is required to run a text shaping engine. 
So you can have a look, maybe it will help you to fix your issue (bug 4307).