Deprecate locale.getdefaultlocale() function #90817

Closed
vstinner opened this issue Feb 6, 2022 · 19 comments
Labels
3.11 stdlib

Comments

@vstinner
Member

@vstinner vstinner commented Feb 6, 2022

BPO 46659
Nosy @malemburg, @vstinner, @serhiy-storchaka, @eryksun
PRs
  • #31166
  • #31167
  • #31168
  • #31206
  • #31214
  • #31218
Files
  • cal_locale.py
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2022-02-24.13:41:35.585>
    created_at = <Date 2022-02-06.17:33:14.432>
    labels = ['library', '3.11']
    title = 'Deprecate locale.getdefaultlocale() function'
    updated_at = <Date 2022-02-24.14:53:20.800>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2022-02-24.14:53:20.800>
    actor = 'lemburg'
    assignee = 'none'
    closed = True
    closed_date = <Date 2022-02-24.13:41:35.585>
    closer = 'vstinner'
    components = ['Library (Lib)']
    creation = <Date 2022-02-06.17:33:14.432>
    creator = 'vstinner'
    dependencies = []
    files = ['50606']
    hgrepos = []
    issue_num = 46659
    keywords = ['patch']
    message_count = 19.0
    messages = ['412647', '412652', '412664', '412666', '412667', '412668', '412687', '412800', '412819', '412825', '412826', '412827', '412829', '412842', '413744', '413745', '413907', '413910', '413915']
    nosy_count = 4.0
    nosy_names = ['lemburg', 'vstinner', 'serhiy.storchaka', 'eryksun']
    pr_nums = ['31166', '31167', '31168', '31206', '31214', '31218']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue46659'
    versions = ['Python 3.11']

    @vstinner
    Member Author

    @vstinner vstinner commented Feb 6, 2022

    The locale.getdefaultlocale() function only relies on environment variables. At Python startup, Python calls setlocale() to set the LC_CTYPE locale to the user preferred encoding.

    Since Python 3.7, if the LC_CTYPE locale is "C" or "POSIX", PEP-538 sets the LC_CTYPE locale to a UTF-8 variant if available, and PEP-540 ignores the locale and forces the usage of the UTF-8 encoding. The *effective* encoding used by Python is inconsistent with environment variables.

    Moreover, if setlocale() is called to set the LC_CTYPE locale to a locale different than the user locale, again, environment variables are inconsistent with the effective locale.

    In these cases, locale.getdefaultlocale() result is not the expected locale and it can lead to mojibake and other issues.
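
The mismatch described above can be observed directly. The following is a minimal sketch, not part of the original report: the helper name `encoding_report` is hypothetical, and it assumes the usual POSIX precedence (LC_ALL, then LC_CTYPE, then LANG) when reading the environment. It puts the locale named by the environment next to what the running interpreter actually uses.

```python
import locale
import os
import sys


def encoding_report(environ=None):
    """Compare the locale named by the environment with Python's effective state.

    Hypothetical helper for illustration only.
    """
    if environ is None:
        environ = os.environ
    # Assumption: POSIX precedence for LC_CTYPE is LC_ALL, LC_CTYPE, LANG.
    env_locale = next(
        (environ[name] for name in ("LC_ALL", "LC_CTYPE", "LANG") if environ.get(name)),
        None,
    )
    return {
        # What the environment variables claim.
        "env_locale": env_locale,
        # True when the Python UTF-8 Mode (PEP 540) is enabled.
        "utf8_mode": bool(sys.flags.utf8_mode),
        # Query (not change) the current LC_CTYPE locale of the process.
        "lc_ctype": locale.setlocale(locale.LC_CTYPE),
        # Encoding Python uses for text, without calling setlocale().
        "preferred_encoding": locale.getpreferredencoding(False),
    }


if __name__ == "__main__":
    for key, value in encoding_report().items():
        print(f"{key}: {value}")
```

Running this with `LC_ALL=C` on Python 3.7 or later is one way to see the environment value and the effective values disagree; the exact output depends on the platform and on which locales are installed.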

    For these reasons, I propose to deprecate locale.getdefaultlocale(): setlocale(), getpreferredencoding() and getlocale() should be used instead.
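
As a rough illustration of the proposed replacement, the sketch below reads the effective LC_CTYPE locale with the functions named above instead of calling locale.getdefaultlocale(). The helper name `effective_ctype_locale` is hypothetical, and the `ValueError` fallback is an assumption made for robustness on platforms with locale names that `getlocale()` cannot parse.

```python
import locale


def effective_ctype_locale():
    """Return (language code, encoding) for the locale Python is really using.

    Hypothetical helper: one possible way to avoid getdefaultlocale().
    """
    try:
        # Reads the current LC_CTYPE locale, not the environment variables.
        language, _ = locale.getlocale(locale.LC_CTYPE)
    except ValueError:
        # Assumption: treat an unparsable locale name as "unknown".
        language = None
    # False: do not call setlocale(); report the encoding already in effect.
    encoding = locale.getpreferredencoding(False)
    return language, encoding


if __name__ == "__main__":
    print(effective_ctype_locale())
```

In the "C" locale the language code is `None`, so callers should be prepared for that value.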

    For the background on these issues, see recent issue:

    @vstinner vstinner added 3.11 stdlib labels Feb 6, 2022
    @vstinner
    Member Author

    @vstinner vstinner commented Feb 6, 2022

    cal_locale.py: Test calendar.LocaleTextCalendar() default locale, manual test for #75349.

    @vstinner
    Member Author

    @vstinner vstinner commented Feb 6, 2022

    New changeset 04dd60e by Victor Stinner in branch 'main':
    bpo-46659: Update the test on the mbcs codec alias (GH-31168)
    04dd60e

    @vstinner
    Member Author

    @vstinner vstinner commented Feb 6, 2022

    New changeset 06b8f16 by Victor Stinner in branch 'main':
    bpo-46659: test.support avoids locale.getdefaultlocale() (GH-31167)
    06b8f16

    @malemburg
    Member

    @malemburg malemburg commented Feb 6, 2022

    > For these reasons, I propose to deprecate locale.getdefaultlocale(): setlocale(), getpreferredencoding() and getlocale() should be used instead.

    Please see the discussion on https://bugs.python.org/issue43552: locale.getpreferredencoding() needs to be deprecated as well. Instead we should have a single locale.getencoding() as outlined there... perhaps in a separate ticket ?! Thanks.

    @vstinner
    Member Author

    @vstinner vstinner commented Feb 6, 2022

    > Please see the discussion on https://bugs.python.org/issue43552: locale.getpreferredencoding() needs to be deprecated as well. Instead we should have a single locale.getencoding() as outlined there... perhaps in a separate ticket ?! Thanks.

    Yeah, I read this issue. But these things are too complicated :-) I prefer to move step by step.

    Once locale.getencoding() (or a similar function) is added, we can update the deprecation message.

    I hope to be able to deprecate getdefaultlocale() and to add such new function in Python 3.11.

    @vstinner
    Member Author

    @vstinner vstinner commented Feb 6, 2022

    > New changeset 04dd60e by Victor Stinner in branch 'main':
    > bpo-46659: Update the test on the mbcs codec alias (GH-31168)

    This change is not correct, I created bpo-46668 to fix it.

    @vstinner
    Member Author

    @vstinner vstinner commented Feb 7, 2022

    New changeset 7a0486e by Victor Stinner in branch 'main':
    bpo-46659: calendar uses locale.getlocale() (GH-31166)
    7a0486e

    @serhiy-storchaka
    Member

    @serhiy-storchaka serhiy-storchaka commented Feb 8, 2022

    getdefaultlocale() falls back to LANG and LANGUAGE. It also allows specifying the list of environment variables to look up. How could this use case be covered with getlocale()?
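
For readers who depend on the environment-variable lookup mentioned above, here is a minimal sketch of doing that lookup explicitly, without the deprecated function. The helper name `locale_from_env` is hypothetical. The default search order mirrors the documented default of getdefaultlocale() (`LC_ALL`, `LC_CTYPE`, `LANG`, `LANGUAGE`), and taking only the first entry of the colon-separated `LANGUAGE` value follows that function's behavior. The sketch returns the raw value and does not normalize or validate it.

```python
import os

# Same default search order as the documented getdefaultlocale() signature.
DEFAULT_ENVVARS = ("LC_ALL", "LC_CTYPE", "LANG", "LANGUAGE")


def locale_from_env(envvars=DEFAULT_ENVVARS, environ=None):
    """Return (variable name, raw locale value) for the first variable that is set.

    Hypothetical helper; returns None when no listed variable has a value.
    """
    if environ is None:
        environ = os.environ
    for name in envvars:
        value = environ.get(name)
        if not value:
            # Unset or empty variables are skipped.
            continue
        if name == "LANGUAGE":
            # LANGUAGE holds a colon-separated priority list; keep the first entry.
            value = value.split(":")[0]
        return name, value
    return None


if __name__ == "__main__":
    print(locale_from_env())
```

Note that this only reports what the environment requests. It says nothing about the locale the process is actually running with, which is the inconsistency this issue is about.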

    @eryksun
    Contributor

    @eryksun eryksun commented Feb 8, 2022

    > getdefaultlocale() falls back to LANG and LANGUAGE.

    _Py_SetLocaleFromEnv(LC_CTYPE) (e.g. setlocale(LC_CTYPE, "")) gets called at startup, except for the isolated configuration [1].

    I think calendar.Locale*Calendar should try the LC_CTYPE locale if LC_TIME is "C", i.e. (None, None). Otherwise, it's introducing new default behavior. For example, with LC_ALL set to "ru_RU.utf8":

    3.8:

        >>> locale.getlocale(locale.LC_TIME)
        (None, None)
        >>> locale.getlocale(locale.LC_CTYPE)
        ('ru_RU', 'UTF-8')
        >>> cal = calendar.LocaleTextCalendar()
        >>> cal.formatweekday(0, 15)
        '  Понедельник  '

    3.11.0a5+:

        >>> locale.getlocale(locale.LC_TIME)
        (None, None)
        >>> locale.getlocale(locale.LC_CTYPE)
        ('ru_RU', 'UTF-8')
        >>> cal = calendar.LocaleTextCalendar()
        >>> cal.formatweekday(0, 15)
        '     Monday    '
        >>> locale.setlocale(locale.LC_TIME, '')
        'ru_RU.utf8'
        >>> cal = calendar.LocaleTextCalendar()
        >>> cal.formatweekday(0, 15)
        '  Понедельник  '

    [1] https://docs.python.org/3/c-api/init_config.html?#isolated-configuration

    @vstinner
    Member Author

    @vstinner vstinner commented Feb 8, 2022

    Serhiy: "getdefaultlocale() falls back to LANG and LANGUAGE. It also allows specifying the list of environment variables to look up. How could this use case be covered with getlocale()?"

    What's your use case to use env vars rather than the current LC_CTYPE locale?

    My concern is that when setlocale() is called, the current LC_CTYPE locale is inconsistent with the environment variables, and you can get mojibake and other issues.

    See for example:
    https://bugs.python.org/issue43552#msg389069

    Marc-Andre Lemburg wants to deprecate it:
    https://bugs.python.org/issue43552#msg389076

    @vstinner
    Member Author

    @vstinner vstinner commented Feb 8, 2022

    > I think calendar.Locale*Calendar should try the LC_CTYPE locale if LC_TIME is "C", i.e. (None, None). Otherwise, it's introducing new default behavior. For example, with LC_ALL set to "ru_RU.utf8": (...)

    Oh. Serhiy asked me to use LC_TIME rather than LC_CTYPE.

    See also my example in the PR:
    #31166 (comment)

    @eryksun
    Contributor

    @eryksun eryksun commented Feb 8, 2022

    > Oh. Serhiy asked me to use LC_TIME rather than LC_CTYPE.

    Since Locale*Calendar is documented as not being thread safe, __init__() could get the real default via setlocale(LC_TIME, "") when locale=None and the current LC_TIME is "C". Restore it back to "C" after getting the default. That should usually match the behavior from previous versions that called getdefaultlocale(). In cases where it differs, it's fixing a bug because the default LC_TIME is the correct default.
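
The approach described above can be sketched as follows. This is an illustration of the idea only, not the code that was merged into the calendar module. The helper name `default_lc_time` is hypothetical, and falling back to the current locale when the user's default cannot be set is an assumption.

```python
import locale


def default_lc_time():
    """Return the user's default LC_TIME locale name without leaving it changed.

    Hypothetical helper. Not thread safe: the locale is process-wide state.
    """
    # Calling setlocale() with only a category queries the current setting.
    current = locale.setlocale(locale.LC_TIME)
    if current not in ("C", "POSIX"):
        # A locale was already chosen explicitly; respect it.
        return current
    try:
        # An empty string selects the user's default from the environment.
        return locale.setlocale(locale.LC_TIME, "")
    except locale.Error:
        # Assumption: if the default is unsupported, keep using the current one.
        return current
    finally:
        # Restore the original setting in every case.
        locale.setlocale(locale.LC_TIME, current)


if __name__ == "__main__":
    print(default_lc_time())
```

Because the `finally` clause always restores the previous setting, the caller sees no lasting change to LC_TIME, but another thread could observe the temporary switch.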

    @vstinner
    Member Author

    @vstinner vstinner commented Feb 8, 2022

    Eryk: I created #75397 which uses the user preferred locale if the current LC_TIME locale is "C" or "POSIX".

    Moreover, it no longer gets the current locale when the class is created. If locale=None is passed, just use the current LC_TIME locale (or the user preferred locale if the current locale is "C" or "POSIX").

    @vstinner
    Member Author

    @vstinner vstinner commented Feb 22, 2022

    New changeset ccbe804 by Victor Stinner in branch 'main':
    bpo-46659: Fix the MBCS codec alias on Windows (GH-31218)
    ccbe804

    @vstinner
    Member Author

    @vstinner vstinner commented Feb 22, 2022

    New changeset b899126 by Victor Stinner in branch 'main':
    bpo-46659: Deprecate locale.getdefaultlocale() (GH-31206)
    b899126

    @vstinner
    Member Author

    @vstinner vstinner commented Feb 24, 2022

    New changeset 4fccf91 by Victor Stinner in branch 'main':
    bpo-46659: Enhance LocaleTextCalendar for C locale (GH-31214)
    4fccf91

    @vstinner
    Member Author

    @vstinner vstinner commented Feb 24, 2022

    locale.getdefaultlocale() is now deprecated.

    calendar now uses locale.setlocale() instead of locale.getdefaultlocale().

    The ANSI code page alias to MBCS now has better tests and better comments.

    Thanks Eryk Sun for your very useful feedback!

    @malemburg
    Member

    @malemburg malemburg commented Feb 24, 2022

    Thanks, Victor.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    vstinner added a commit to vstinner/cpython that referenced this issue May 25, 2022
    The function was already deprecated in Python 3.11 since it calls
    locale.getdefaultlocale() which was deprecated in Python 3.11.
    vstinner added a commit that referenced this issue May 25, 2022
    The function was already deprecated in Python 3.11 since it calls
    locale.getdefaultlocale() which was deprecated in Python 3.11.
    miss-islington pushed a commit to miss-islington/cpython that referenced this issue May 25, 2022
    …3196)
    
    The function was already deprecated in Python 3.11 since it calls
    locale.getdefaultlocale() which was deprecated in Python 3.11.
    (cherry picked from commit bf58cd0)
    
    Co-authored-by: Victor Stinner <vstinner@python.org>
    miss-islington added a commit that referenced this issue May 25, 2022
    The function was already deprecated in Python 3.11 since it calls
    locale.getdefaultlocale() which was deprecated in Python 3.11.
    (cherry picked from commit bf58cd0)
    
    Co-authored-by: Victor Stinner <vstinner@python.org>