[doc] open builtin function: specifying the size of buffer has no effect for text files #74903

Closed
direprobs mannequin opened this issue Jun 20, 2017 · 2 comments
Labels
3.9 only security fixes 3.10 only security fixes 3.11 only security fixes docs Documentation in the Doc dir easy topic-IO type-bug An unexpected behavior, bug, or error

Comments

@direprobs
Mannequin

direprobs mannequin commented Jun 20, 2017

BPO 30718
Nosy @izbyshev, @nitishch, @slateny
PRs
  • bpo-30718: Add information about text buffering #32351

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2017-06-20.20:18:48.999>
    labels = ['easy', 'type-bug', '3.9', 'expert-IO', '3.11', '3.10', 'docs']
    title = '[doc] open builtin function: specifying the size of buffer has no effect for text files'
    updated_at = <Date 2022-04-06.04:56:54.398>
    user = 'https://bugs.python.org/direprobs'

    bugs.python.org fields:

    activity = <Date 2022-04-06.04:56:54.398>
    actor = 'slateny'
    assignee = 'docs@python'
    closed = False
    closed_date = None
    closer = None
    components = ['Documentation', 'IO']
    creation = <Date 2017-06-20.20:18:48.999>
    creator = 'direprobs'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 30718
    keywords = ['patch', 'easy']
    message_count = 2.0
    messages = ['296482', '307779']
    nosy_count = 5.0
    nosy_names = ['docs@python', 'izbyshev', 'nitishch', 'direprobs', 'slateny']
    pr_nums = ['32351']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue30718'
    versions = ['Python 3.9', 'Python 3.10', 'Python 3.11']

    @direprobs
    Mannequin Author

    direprobs mannequin commented Jun 20, 2017

    This behavior was tested on a Linux system with Python 3.5 and 3.6.

    Passing the buffer size for the builtin function open has no effect for files opened in text mode:

    >>> import sys
    >>> sys.version
    '3.5.3 (default, Jan 19 2017, 14:11:04) \n[GCC 6.3.0 20170118]'
    
    >>> f = open("/home/user/Desktop/data.txt", "r+", buffering=30)
    >>> f.write("A" * 40)
    40

    My assumption is that f is a text buffer and f.buffer is the binary buffer, so the buffering argument to open sets the buffer size of the binary buffer f.buffer. Confusingly, f.write("A" * 40) did not trigger a flush: although the 40 ASCII characters (40 bytes) written to f exceed the 30-byte buffer size, Python flushed nothing and the data stayed in the f object.

    The problem is that f seems to act as a text buffer with its own buffer size and its own flushing behavior, which makes the effect of the buffering argument confusing. Here are the main points:

    A) Despite the buffer size passed to open, the f object acts as a text buffer whose size is set by f._CHUNK_SIZE.

    B) The default buffer size of f renders the buffering argument to open virtually useless, because the programmer might think that Python flushes the data according to the binary buffer size passed to open. That is, when the programmer codes something like:

    f = open("/home/user/Desktop/data.txt", "r+", buffering=30)
    f.write("A" * 40) 

    for a file opened by open, the programmer's assumption would most likely be that Python flushes the buffer once it holds more than 30 bytes. But there is really another buffer on top of the binary buffer: the buffering argument sets the size of the binary buffer f.buffer, not of the text buffer f, and f relies on a default size that can be seen through f._CHUNK_SIZE or io.DEFAULT_BUFFER_SIZE.

    C) Calling f.flush() flushes both buffers (f and f.buffer) all the way to f.buffer.raw, which further supports the point that the buffering argument is technically useless for text files.
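
Points A to C can be checked with a small script. This is only a sketch, assuming CPython's C implementation of io.TextIOWrapper; the temporary file and the variable names are made up for illustration, and f._CHUNK_SIZE is a private attribute that may change between versions.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
fd, path = tempfile.mkstemp()
os.close(fd)

try:
    # buffering=30 sizes the binary buffer (f.buffer), not the text layer (f).
    with open(path, "w", buffering=30, encoding="ascii") as f:
        text_chunk_size = f._CHUNK_SIZE  # private attribute; threshold of the text layer
        f.write("A" * 40)                # 40 bytes, more than the 30-byte binary buffer
        size_before_flush = os.path.getsize(path)
        f.flush()                        # pushes the data through f and f.buffer to the raw file
        size_after_flush = os.path.getsize(path)
finally:
    os.remove(path)

print("text chunk size:", text_chunk_size)
print("bytes on disk before flush:", size_before_flush)
print("bytes on disk after flush:", size_after_flush)
```

If the text layer really holds the data back as described, nothing should reach the file until f.flush() is called, even though the write is larger than the requested buffer size.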

    From Python Documentation for open:

    "buffering is an optional integer used to set the buffering policy. Pass 0 to switch buffering off (only allowed in binary mode), 1 to select line buffering (only usable in text mode), and an integer > 1 to indicate the size in bytes of a fixed-size chunk buffer. When no buffering argument is given, the default buffering policy works as follows: ..."

    "and an integer > 1 to indicate the size in bytes of a fixed-size chunk buffer." if this behavior was intentional in the implementation of Python, then I think the documentation should say something like this:

    and an integer > 1 sets the default buffer size.

    @direprobs direprobs mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-IO type-bug An unexpected behavior, bug, or error labels Jun 20, 2017
    @izbyshev
    Mannequin

    izbyshev mannequin commented Dec 7, 2017

    Yes, clarifying buffering for text mode in open() would be nice.

    @direprobs: just in case you didn't know, you can achieve what you want with something like the following in pre-3.7:

    import io

    with open("/dev/null", "wb", buffering=10) as outb, \
            io.TextIOWrapper(outb, write_through=True) as out:
        out.write("x" * 20)

    Sadly, write_through can't be passed to open(), but it can be changed on an existing TextIOWrapper since 3.7 (via the new reconfigure() method).
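
For 3.7 and later, the reconfigure() route might look roughly like the following. It is a sketch rather than a recommendation, assuming CPython; the temporary file and variable names are hypothetical and exist only so the result can be inspected.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
fd, path = tempfile.mkstemp()
os.close(fd)

try:
    with open(path, "w", buffering=10, encoding="ascii") as out:
        # Python 3.7+: make the text layer hand data straight to out.buffer.
        out.reconfigure(write_through=True)
        out.write("x" * 20)  # larger than the 10-byte binary buffer
        size_without_flush = os.path.getsize(path)
finally:
    os.remove(path)

print("bytes on disk without an explicit flush:", size_without_flush)
```

With write_through enabled, the text layer no longer holds the data back, so the size passed as buffering is what decides when the bytes reach the file. A write smaller than that size would still sit in out.buffer until it fills or is flushed.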

    @iritkatriel iritkatriel added easy 3.9 only security fixes 3.10 only security fixes 3.11 only security fixes docs Documentation in the Doc dir and removed interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Dec 10, 2021
    @iritkatriel iritkatriel changed the title open builtin function: specifying the size of buffer has no effect for text files [doc] open builtin function: specifying the size of buffer has no effect for text files Dec 10, 2021
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @slateny slateny closed this as completed May 12, 2022