PEP 501: improvements inspired by PEP 750's tagged strings #3904

ncoghlan · 2024-08-14T02:59:58Z

Accumulating ideas prompted by the PEP 750 discussion at https://discuss.python.org/t/pep-750-tag-strings-for-writing-domain-specific-languages/60408 before working on an update to the PEP 501 text:

To adjust TemplateField for eager evaluation:

getvalue -> value (expression is eagerly evaluated at template definition time)
no conv field (conversions are applied at template definition time)

This gives the following interface for the concrete type:

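from typing import Any, NamedTuple

class TemplateLiteralField(NamedTuple):
    value: Any  # result of the eagerly evaluated expression
    expr: str  # source text of the expression
    format_spec: str | None = None  # format specifier, if one was given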
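
A minimal sketch of what eager evaluation means for a field like this. The t-string syntax is only a proposal here, so the field is built by hand; the `price` function and the `calls` list are hypothetical names used only to make the evaluation order visible.

```python
from typing import Any, NamedTuple, Optional


class TemplateLiteralField(NamedTuple):
    value: Any                         # result of the expression, computed up front
    expr: str                          # source text of the expression
    format_spec: Optional[str] = None  # text after ":" in the field, if any


calls = []


def price():
    # Hypothetical function; records each call so the timing can be checked.
    calls.append("price")
    return 3.14159


# Hand-built stand-in for the field in t"{price():.2f}".
# The call happens here, at definition time, not when the field is rendered.
field = TemplateLiteralField(price(), "price()", ".2f")
assert calls == ["price"]

# Rendering only formats the stored value; it does not re-evaluate the expression.
rendered = format(field.value, field.format_spec or "")
print(rendered)
```

Because only the value is stored, rendering the same field twice would not call `price` again.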

Based on the discussions with @warsaw in the PEP 750 thread (e.g. https://discuss.python.org/t/pep-750-tag-strings-for-writing-domain-specific-languages/60408/122 and https://discuss.python.org/t/pep-750-tag-strings-for-writing-domain-specific-languages/60408/135 ), it's looking like neither template literals nor tagged strings would be particularly beneficial for i18n use cases.

It's definitely possible to integrate them:

string.Template could support construction from the native template syntax (extracting the template's field names from the interpolation fields, together with a string-keyed dict mapping the field names to their eagerly interpolated values)
string.Template could implement the native template interpolation protocol, rendering itself in a normalised form (the simplest version would always render the fields as ${name}, but a slightly nicer version would emit $name when it is unambiguous to do so)
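
The normalised rendering described above might look roughly like the following sketch. The `("text", ...)` / `("field", ...)` segment layout and the `to_dollar_template` helper are assumptions made for illustration; only `string.Template` itself is an existing API.

```python
from string import Template


def _needs_braces(next_text):
    # "$name" is ambiguous when the following text could continue the identifier.
    if not next_text:
        return False
    ch = next_text[0]
    return ch.isascii() and (ch.isalnum() or ch == "_")


def to_dollar_template(segments):
    """Render ("text", str) / ("field", name) pairs as a PEP 292 template string."""
    parts = []
    for i, (kind, payload) in enumerate(segments):
        if kind == "text":
            parts.append(payload.replace("$", "$$"))  # escape literal dollar signs
            continue
        nxt = segments[i + 1] if i + 1 < len(segments) else None
        next_text = nxt[1] if nxt is not None and nxt[0] == "text" else ""
        if _needs_braces(next_text):
            parts.append("${" + payload + "}")
        else:
            parts.append("$" + payload)
    return "".join(parts)


segments = [
    ("text", "Hello "),
    ("field", "name"),
    ("text", ", you owe $"),
    ("field", "amount"),
    ("text", "USD"),
]
normalised = to_dollar_template(segments)

# The string-keyed dict of eagerly interpolated values mentioned above.
values = {"name": "Ana", "amount": 5}
result = Template(normalised).substitute(values)
print(normalised)
print(result)
```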

However, the integration would have significant caveats:

you’d either be limited to the ${...} substitution form (since the compiler wouldn’t see $... as defining an interpolation field), or else $... substitutions would still need to use dynamic name lookups at rendering time. Whether the $ was required or optional in the ${...} form would be up to the i18n templating support functions.
to allow interpolating more than simple references to named variables, you’d need to adapt the specifier string to include a way of naming fields for i18n substitution (for example, repurpose the specifier string as naming the field such that i18n"The result of adding ${x} to ${y} is ${x+y:expr_result}" or _(t"The result of adding ${x} to ${y} is ${x+y:expr_result}") would map to the English translation catalog entry "The result of adding $x to $y is $expr_result". A regular specifier string could still be allowed after a second :, since colons are permitted in specifier strings)
any runtime normalisation performed prior to catalog entry lookup would also need to be supported in the tools that extract the translation catalog entries from the source code. This normalisation wouldn't be readily reversible in the general case, so you'd need to also generate a separate reverse index to allow catalog entries to be mapped back to the places where they're used (rather than being able to just search directly for the catalog string appearing in the code)
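
As a rough illustration of the specifier-as-field-name idea in the caveats above, here is a sketch that derives the catalog entry from hand-built segments. The segment layout, the `catalog_key_and_values` helper, and the toy catalog are all assumptions; brace disambiguation and `$` escaping are ignored for brevity.

```python
from string import Template


def catalog_key_and_values(segments):
    """segments: str for literal text, (value, expr, spec) tuples for fields.

    The literal text is assumed to already end with the "$" written before
    each field in the source, as in "...adding ${x} to ${y}...".
    """
    key_parts = []
    values = {}
    for seg in segments:
        if isinstance(seg, str):
            key_parts.append(seg)
            continue
        value, expr, spec = seg
        # First part of the specifier names the field; anything after a
        # second ":" is treated as a regular format specifier.
        name, _, regular_spec = (spec or expr).partition(":")
        if not name.isidentifier():
            raise ValueError(f"field {expr!r} needs a name in its specifier")
        key_parts.append(name)
        values[name] = format(value, regular_spec)
    return "".join(key_parts), values


x, y = 2, 3
segments = [
    "The result of adding $", (x, "x", None),
    " to $", (y, "y", None),
    " is $", (x + y, "x+y", "expr_result"),
]
key, values = catalog_key_and_values(segments)

# Toy catalog with a reordered entry, standing in for a real translation.
catalog = {key: "$expr_result is what you get from $x plus $y"}
message = Template(catalog[key]).substitute(values)
print(key)
print(message)
```

Any extraction tool would have to apply the same naming rule to the source text to produce the same key, which is the tooling burden the last caveat describes.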

You'd presumably get a minor performance win by replacing dynamic variable name lookups with compiler supported field interpolations, but even that is questionable since many (most?) i18n templates are interpolating local variable values that can be retrieved with a single dict lookup.

Instead, to get i18n use cases away from using dynamic string lookups, we'd likely need to define a dedicated "$-string" (dollar string) syntax that used PEP 292 interpolation syntax to define a TemplateLiteral instance. Such a syntax could also be really interesting for shell command execution.

When discussing support for building lazy template field evaluation on top of the f-string inspired eager field evaluation, consider the following points:

describe callable fields, where the updated format builtin, and hence the default template renderer, supports () as a format specifier on a field definition to indicate that the result should be called when rendering (allowing for convenient lazy evaluation with either a lambda: prefix or passing in a reference to an existing zero-argument callable).
describe named fields, where the template renderer produces an object that allows the field names given by the field expression values to be bound to replacement values in a later method call (akin to str.format and str.format_map) rather than producing a fully resolved object in the initial rendering operation
note that a future PEP could add explicit syntactic support for lazy fields, where {-> expr} is equivalent to {(lambda: expr)} (syntax idea inspired by the syntax for return type annotations)
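
The callable-field idea in the list above can be emulated today with plain data. In this sketch, `render_field`, `render`, and `debug` are hypothetical helpers (the proposal would put the `()` handling into the updated `format` builtin instead), and fields are `(value, format_spec)` pairs.

```python
def render_field(value, format_spec=""):
    """Format one field, treating a "()" specifier as "call the value first"."""
    if format_spec == "()":
        return format(value())
    return format(value, format_spec)


def render(segments):
    """Join literal strings and rendered (value, format_spec) fields."""
    return "".join(
        seg if isinstance(seg, str) else render_field(*seg) for seg in segments
    )


def debug(segments, enabled):
    # Stand-in for a logger: rendering only happens when the level is enabled.
    return render(segments) if enabled else None


calls = []


def expensive_call():
    calls.append(1)
    return 42


def add(a, b):
    return a + b


# The field stores the callable itself, so nothing runs at definition time.
lazy = ["value: ", (expensive_call, "()")]
skipped = debug(lazy, enabled=False)
calls_while_disabled = list(calls)
shown = debug(lazy, enabled=True)

# A lambda wraps calls that need arguments.
with_args = render(["sum: ", (lambda: add(2, 3), "()")])
print(shown, with_args)
```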

Give examples, such as delaying expensive function calls when logging:

logging.debug(t"This is a log message with eager evaluation of {expensive_call()}")
logging.debug(t"This is a log message with lazy evaluation of {expensive_call!()}")

logging.debug(t"This is a log message with eager evaluation of {expensive_call_with_args(x, y, z)}")
logging.debug(t"This is a log message with lazy evaluation of {(lambda: expensive_call_with_args(x, y, z))!()}")

and naming fields in reusable SQL statements:

stmt = sql(t"INSERT INTO table (column1, column2) VALUES ({"column1"}, {"column2"})")
new_entries = [{"column1": c1, "column2": c2} for c1, c2 in get_entry_data()]
results = db.executemany(stmt, new_entries)

(SQL is an interesting case, since executemany specifically wants to give the DB API control of repeated substitutions so it can optimise things. Parameter substitution isn't just about avoiding SQL injections)
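
A sketch of how the named-field SQL example could behave, using `sqlite3` named placeholders. The `sql` helper and the segment layout are assumptions, and the table is called `entries` here because `table` is a reserved word in SQL.

```python
import sqlite3


def sql(segments):
    """Turn text/field segments into a statement with named placeholders.

    Literal text is passed as str; each field is a 1-tuple holding the
    field name (the eagerly evaluated value of a {"column1"} style field).
    """
    parts = []
    for seg in segments:
        if isinstance(seg, str):
            parts.append(seg)
            continue
        (name,) = seg
        if not name.isidentifier():
            raise ValueError(f"invalid field name: {name!r}")
        parts.append(":" + name)  # sqlite3 named-placeholder style
    return "".join(parts)


stmt = sql([
    "INSERT INTO entries (column1, column2) VALUES (",
    ("column1",), ", ", ("column2",), ")",
])

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (column1 TEXT, column2 INTEGER)")

# The DB API receives the statement once and the parameter sets separately,
# so it stays in control of the repeated substitution.
new_entries = [{"column1": "a", "column2": 1}, {"column1": "b", "column2": 2}]
conn.executemany(stmt, new_entries)

rows = conn.execute(
    "SELECT column1, column2 FROM entries ORDER BY column2"
).fetchall()
print(stmt)
print(rows)
```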

(cc @nhumrich )


ncoghlan · 2024-08-14T04:28:39Z

@nhumrich Since both our names are on the PEP, let me know if there's anything here that you'd prefer we didn't add (the lazy evaluation stuff in particular seems potentially controversial, although I must say I like the way that second logging example looks - just 4 extra characters to say "this is evaluated at rendering time, not template definition time")

nhumrich · 2024-08-14T05:05:43Z

Awesome. I will look in more depth at the details required to add these things.

In the meantime, I am not convinced that lazy evaluation is something this PEP needs to deal with. It is not the purpose of the PEP to solve every possible case, but rather, allow the user to solve those problems in their own programs.

I think it's more likely that expensive_call in a template literal string would be the format itself, rather than some arbitrary function wanting to be delayed.
Since the format function is already a callable, I think the normal use case of deferred execution is still easy and expected.

If for some reason a user needed deferred execution of a function call inside the template itself, they could always add support for callables inside their own template renderer, then use that in conjunction with lambdas or partials. That is no different from how you would handle it anywhere else in Python.

Passing in your own custom format function that handles callables feels so easy, I don't see why we would need to overcomplicate the PEP to handle it.

We could potentially add that delayed evaluation of the values themselves could be considered in a follow-up PEP if the need arises.

ncoghlan · 2024-08-14T05:48:44Z

Your thoughts on declaring lazy evaluation out of scope make sense to me, so I've reworded the initial post accordingly (I also moved the suggested () for rendering time function invocation to the format specifier rather than having it outside the field specifier where render_field implementations wouldn't be able to see it)

Edit: I also updated my notes on mentioning string.Template
Edit 2: added details on the changes to TemplateField and TemplateLiteralField to account for eager vs lazy evaluation

ncoghlan · 2024-08-19T02:25:19Z

Added further notes on the potential utility for i18n use cases based on the PEP 750 discussion with @warsaw (short version: while some level of integration is theoretically possible, the potential benefits are small enough and the related challenges significant enough that it likely won't be worth spending anyone's time to actually make that integration happen).

This isn't in the main set of notes yet, but writing those up did give me an idea in relation to this comment @warsaw made in the PEP 750 thread:

Another thing that occurs to me though, is that to be a more-or-less drop in replacement for flufl.i18n style translatable strings, you’d need to be able to support the $-string (PEP 292) style placeholders. Is the PEP 750 placeholder syntax restricted to f-string style {placeholder} syntax?

It is specifically PEP 750 that commits all possible tagged strings to using f-string style interpolation fields, with no scope for future syntactic variations.

By contrast, PEP 501 only commits t-strings to directly aligning with the f-string syntax. That means it would leave the door open to a future PEP proposing a dedicated _ string prefix that instead used the PEP 292 syntax to define interpolation fields. I think that's a genuine design benefit arising from the more limited proposal, so it's likely worth mentioning.

ncoghlan · 2024-08-24T12:10:02Z

Adding in another change inspired by PEP 750: making the interpretation of conversion specifiers lazy. While template literals will make a, r, and s work as they do in f-strings by default, other template renderers may handle them differently. Template literals will also support () as a "call the result of the expression at rendering time" conversion specifier, giving a basic level of support for delayed rendering (enough for logging to support lazy field rendering without needing custom specifier processing).
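
The default handling of conversion specifiers described above could be sketched as follows. This `convert_field` is a hypothetical stand-in written for illustration, not an existing function; a custom renderer would be free to map the same specifiers to different behaviour.

```python
def convert_field(value, conversion):
    """Apply a conversion specifier at rendering time (default behaviour)."""
    if conversion is None:
        return value
    if conversion == "a":
        return ascii(value)
    if conversion == "r":
        return repr(value)
    if conversion == "s":
        return str(value)
    if conversion == "()":
        return value()  # call the expression result when rendering
    raise ValueError(f"unknown conversion specifier: {conversion!r}")


as_ascii = convert_field("caf\u00e9", "a")
as_repr = convert_field("x", "r")
as_str = convert_field(10, "s")
called = convert_field(lambda: 7, "()")
print(as_ascii, as_repr, as_str, called)
```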

I also added some further notes about the i18n use case, including the potential benefits of leaving the door open to a future syntactic proposal for "dollar-strings" (which would use the PEP 292 substitution syntax, but the same runtime interpolation machinery).

I'm going to start a draft PR for these changes tomorrow (Aug 25 AEST).

ncoghlan · 2024-08-25T07:46:41Z

WIP PR on my fork of the PEPs repo: ncoghlan#8

(I won't merge it there, but this potentially allows comments on the amendments while they're in progress, rather than having to wait until I'm ready to submit the full PR to the main PEPs repo)

ncoghlan · 2024-09-01T05:55:44Z

Decent progress today: WIP PR now has a first draft of the updated proposal and specification sections. I also added some notes on sections that should either be moved around, or else referenced earlier than they are.

Next step is to review the design discussion section and update it as needed.

ncoghlan · 2024-09-01T07:43:09Z

First pass at updating the design discussion section (including adding a "How To Teach This" suggestion).

Still a few specific TODOs to fill in.

ncoghlan · 2024-09-01T12:43:06Z

In filling out the design discussion section, I noticed a problem with the way custom conversion specifiers were defined: actually using them would break the default renderer. For now, I'm going to tweak the syntax to allow the default renderer to cleanly ignore them, but it may be simpler to only define the new lazy evaluation specifier. The reason I'm not going down the latter path yet is that I have concerns that omitting custom conversion specifier support will lead to ad hoc conventions in the format specifier field that serve the same purpose as custom conversion specifiers.

ncoghlan · 2024-09-01T14:01:29Z

OK, first pass at the update is done. Next steps before creating the PR against the PEPs repo:

proofreading and copyediting (I should be able to get to that some time this week)
preferably a first pass design review from @nhumrich (with the lazy field evaluation support and the related changes to conversion specifier handling being the most likely aspect to cause concerns)
seeing if there are any more sections that can be outright deleted, or at least cut down heavily (e.g. PEP 675 only gets a passing reference now)
consider whether the section order should change further (I already moved a few things around)

ncoghlan · 2024-09-04T10:07:23Z

Just noting an idea that I considered adding, but decided it didn't add enough to the PEP to be worth the extra up-front complexity (vs just adding it later as a regular feature request): a from_raw_segments class method that handles wrapping plain strings and 4-tuples into TemplateLiteralText and TemplateLiteralField objects. The basic idea is simple enough, but the implementation gets fiddly if you want to make it do the right thing when the passed in values are already instances of those classes. It's also not clear yet if that's the right programmatic API to provide (for example, building on https://docs.python.org/3/library/string.html#string.Formatter is another potential candidate).
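
To show where the fiddliness comes from, here is a rough sketch of such a helper. The class layouts below (including the 4-tuple field order of value, expression text, format specifier, conversion) are assumptions made for this example, and the helper is written as a plain function rather than a class method.

```python
from typing import Any, NamedTuple, Optional


class TemplateLiteralText(NamedTuple):
    raw: str


class TemplateLiteralField(NamedTuple):
    value: Any
    expr: str
    format_spec: Optional[str] = None
    conversion: Optional[str] = None


def from_raw_segments(segments):
    """Wrap plain strings and 4-tuples; pass existing instances through."""
    result = []
    for seg in segments:
        # Check for existing instances first: in this sketch both classes are
        # tuples, so a field would otherwise also match the 4-tuple branch.
        if isinstance(seg, (TemplateLiteralText, TemplateLiteralField)):
            result.append(seg)
        elif isinstance(seg, str):
            if seg:  # drop empty text segments
                result.append(TemplateLiteralText(seg))
        elif isinstance(seg, tuple) and len(seg) == 4:
            result.append(TemplateLiteralField(*seg))
        else:
            raise TypeError(f"unsupported segment: {seg!r}")
    return result


wrapped = from_raw_segments(
    ["x = ", (1, "x", None, None), "", TemplateLiteralText("!")]
)
print(wrapped)
```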

* switch to lazy conversion specifier processing (includes adding `operator.convert_field`)
* added proposal for `!()` as a new conversion specifier that invokes `__call__` (rather than `__repr__` or `__str__`)
* add `render_text` callback to `TemplateLiteral.render` signature (default value: `str`)
* new protocol: `typing.InterpolationTemplate` (protocol corresponding to the concrete `types.TemplateLiteral` type)
* new protocol: `typing.TemplateText` (equivalent to `Decoded` from PEP 750)
* new protocol: `typing.TemplateField` (inspired by `Interpolation` from PEP 750, with adjustments for eager field evaluation)
* new concrete type: `types.TemplateLiteralText` (equivalent to `DecodedConcrete` from PEP 750)
* new concrete type: `types.TemplateLiteralField` (inspired by `InterpolationConcrete` from PEP 750, with adjustments for eager field evaluation)
* added iteration support to `TemplateLiteral`, producing `TemplateLiteralText` and `TemplateLiteralField` instances in their order of appearance (keeping the "no empty TemplateLiteralText entries" rule from PEP 750)
* change the way `TemplateLiteral` works based on the way PEP 750 works
* added or updated discussion notes about several included and deferred features

Closes #3904

ncoghlan · 2024-09-04T13:32:30Z

No ETA on a new discussion thread yet. There's heaps of time until Python 3.14's beta cycle starts, and I think discussions will be more productive if we wait until the next iteration of PEP 750 has been published so folks can more easily compare the latest versions of both proposals (they look more different than they actually are at the moment, since the PEP 501 update is based on the upcoming PEP 750 amendments that were already announced in the discussion thread rather than on the current version).

ncoghlan mentioned this issue Aug 14, 2024

PEP 501: Re-open and revise in consideration of PEP 701 #3047

Merged

ncoghlan mentioned this issue Sep 4, 2024

PEP 501: update for tagged strings #3944

Merged

4 tasks

ncoghlan closed this as completed in #3944 Sep 4, 2024
