Skip to content

Add language parameter to Text objects #29794

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: text-overhaul
Choose a base branch
from

Conversation

QuLogic
Copy link
Member

@QuLogic QuLogic commented Mar 22, 2025

PR summary

Along with #29695, this language parameter is an additional setting that may affect text layout. As with the other PR, this is not complete, but rather something to get the API decided upon (and likely will be rebased on raqm PR after it's done.)

There are two parts to the API:

  1. the language parameter attached to Text, and any other API that wraps it (i.e., Axes.text or Axes.annotate)
  2. the language parameter in backend's Renderer.draw_text

At the moment, for both, I have set these to str | list[tuple[str, int, int]] where the latter denotes a list of (language, start, end)-tuples. However, this was before I found out about the font feature machinery, and it is possible that we may wish to use that syntax instead, at least for API part 1. For part 2, it's a little nicer to have the explicit types, just for passing to the FT2Font extension, but this may not be relevant to other implementations.

PR checklist

@anntzer
Copy link
Contributor

anntzer commented Mar 22, 2025

re: adding language to draw_text: the general discussion about extensibility of renderer API applies, i.e.

  • if you're going to add language as a parameter to draw_text then it should be at least keyword-only.
  • it's likely that the more maintainable API would likely to just embed language as part of fontproperties (it's not really different from e.g. font features, I'd think) and tell the backend to read that info.
  • for draw_text we have actually more or less given up on having a "clean" renderer API because we pass the whole Text instance to draw_text (per Propagate mpl.text.Text instances to the backends and fix documentation #1081) anyways so perhaps backends that care could also just read that info off the Text instance. It feels a bit dirty, but heh... ("dirty" because we as well may just have a renderer API that's renderer.draw_text(textobj) to start with...)

@anntzer
Copy link
Contributor

anntzer commented Mar 28, 2025

Side point regarding allowing list[tuple[str, int, int]] as language setting: this may be a problem if one uses somelanguage[idx:] to set the language from an index up to the end of the string (whose length could technically be modified later, even though it's not clear this necessarily makes much practical sense). Likely you need list[tuple[str, int, int | None]] (and then you may possibly support None (=0) for the start as well).

@QuLogic
Copy link
Member Author

QuLogic commented Apr 16, 2025

In the call a couple weeks ago, we decided to drop the new language parameter to draw_text, because that already had a mtext parameter which contained the whole Text object. This would avoid any signature introspection.

However, one followup is that Renderer.get_text_width_height_descent would also depend on font features/language settings, and really should get these passed through somehow as well.

@@ -65,7 +65,7 @@ def layout(string, font, *, kern_mode=Kerning.DEFAULT):
"""
x = 0
prev_glyph_idx = None
char_to_font = font._get_fontmap(string)
char_to_font = font._get_fontmap(string) # TODO: Pass in language.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note, this function is getting rewritten with libraqm, so this is just a note for later.

@@ -43,7 +43,7 @@ def warn_on_missing_glyph(codepoint, fontnames):
f"Matplotlib currently does not support {block} natively.")


def layout(string, font, *, kern_mode=Kerning.DEFAULT):
def layout(string, font, language, *, kern_mode=Kerning.DEFAULT):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These params should be kwonly? (throughout most of the PR, I think)

@QuLogic QuLogic force-pushed the text-language branch 2 times, most recently from 3c188df to 931423b Compare May 13, 2025 03:28
@QuLogic
Copy link
Member Author

QuLogic commented May 22, 2025

I've added a text.language rcParam as discussed on the call last week.

@QuLogic
Copy link
Member Author

QuLogic commented May 23, 2025

Also added a What's new note, though as with the other PR, it won't show the difference until libraqm is added.

@QuLogic QuLogic changed the base branch from main to text-overhaul June 5, 2025 01:38
@QuLogic QuLogic moved this to Ready for Review in Font and text overhaul Jun 5, 2025
fig.text(0.5, 0.3, f'\\U{ord(char):08x}', fontsize=40, horizontalalignment='center')
fig.text(0, 0.1, f'English: {char}', fontsize=40, language='en')
fig.text(1, 0.1, f'Inari Sámi: {char}', fontsize=40, language='smn',
horizontalalignment='right')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a (str, int, int) example just to cover all the forms?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're not ready to expose that just yet, so it's not been documented.

Comment on lines +780 to +787
with pytest.raises(TypeError):
font.set_text('foo', language=[1, 2, 3])
with pytest.raises(TypeError):
font.set_text('foo', language=[(1, 2)])
with pytest.raises(TypeError):
font.set_text('foo', language=[('en', 'foo', 2)])
with pytest.raises(TypeError):
font.set_text('foo', language=[('en', 1, 'foo')])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you parameterize so there are labels to make it clear what's wrong with each case?

font = ft2font.FT2Font(file, hinting_factor=1, _kerning_factor=0)
font.set_text('foo')
font.set_text('foo', language='en')
font.set_text('foo', language=[('en', 1, 2)])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this just testing it doesn't blow up? is there a .get_language to test that the setting didn't do anything strange?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, smoke testing only; full tests come with libraqm.

Comment on lines +1196 to +1203
with pytest.raises(TypeError, match='must be list of tuple'):
Text(0, 0, 'foo', language=[1, 2, 3])
with pytest.raises(TypeError, match='must be list of tuple'):
Text(0, 0, 'foo', language=[(1, 2)])
with pytest.raises(TypeError, match='start location must be int'):
Text(0, 0, 'foo', language=[('en', 'foo', 2)])
with pytest.raises(TypeError, match='end location must be int'):
Text(0, 0, 'foo', language=[('en', 1, 'foo')])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parameterize on match and language?

_api.check_isinstance((list, str, None), language=language)
language = mpl._val_or_rc(language, 'text.language')

if isinstance(language, list):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

restricted to list or is any iterable fine?

with pytest.raises(TypeError, match='must be list of tuple'):
Text(0, 0, 'foo', language=[1, 2, 3])
with pytest.raises(TypeError, match='must be list of tuple'):
Text(0, 0, 'foo', language=[(1, 2)])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a list of tuple

Copy link
Member Author

@QuLogic QuLogic Jun 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but the tuple is the wrong length (I just didn't write out the whole message in the match.)

_api.check_isinstance((list, str, None), language=language)
language = mpl._val_or_rc(language, 'text.language')

if isinstance(language, list):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so does this mean that if language is a list it has to be a list where each element is a 3 tuple? so you can't do language = ['en', 'sr', 'ru']? which what would the list of tuple be used for?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, that would not make sense, but it's understandable that it's not clear since it's purposely undocumented. If you provide multiple languages, then they would need to specify a range over which they apply.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Ready for Review
Development

Successfully merging this pull request may close these issues.

3 participants