
Implementation changes to BufWriter #78551

Status: Open. Wants to merge 7 commits into base: master.
Conversation

@Lucretiel (Contributor) commented Oct 30, 2020

This PR contains some proposed implementation updates to BufWriter, with a focus on reducing total device write calls in the average case. These updates are based on lessons learned from the design of LineWriterShim, as well as some discussions with @workingjubilee on this topic.

There are three main changes to how BufWriter works:

  • write_all now fully fills the buffer before flushing. Previously, it would fill the buffer as much as possible without splitting any incoming writes. However, we assume that a caller of write_all is foremost interested in minimizing write calls to the underlying device, which we can best achieve by maximizing the size of buffers we flush. Of course, incoming data that exceeds the total size of the buffer is still forwarded directly to the underlying device.
    • Conversely, we assume that a caller of write would prefer to have unbroken writes if possible (that is, would prefer Ok(n) where n == buf.len()). The implementation of write is therefore left unchanged.
  • write_vectored is unchanged when inner.is_write_vectored(). However, in the case where the underlying device doesn't offer any specialization, we now take care to buffer together all the incoming sub-bufs, even if the total size exceeds our buffer, and only fall back to directly forwarded writes in the case that an individual slice exceeds our buffer size.
  • Added flush_buf_vectored(), a new internal method that is similar to flush_buf, but additionally attempts to use vectored operations to send the new incoming data along with the existing buffered data in a single operation. I took care to ensure that this never results in more system calls than it would have otherwise; in particular, it immediately stops forwarding as soon as the existing buffer is fully forwarded, even if 0 new bytes have been sent. write and write_all were refactored to take advantage of it, and write_vectored was slightly modified to forward to write if exactly 1 buffer is being written. (A sketch of the vectored-flush idea follows this list.)
  • Additionally, as a much more minor change, removed several unnecessary uses of the self.panicked guard fences. BufWriter::panicked is about preventing duplicate writes of buffered data, so it isn't necessary to fence writes to the underlying device when the buffer is known to be empty.
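To make the vectored-flush idea concrete, here is a minimal sketch written as a free function over any Write type; the real method lives inside BufWriter and uses its private fields, so the name, signature, and error handling here are illustrative assumptions rather than the PR's code (retry-on-Interrupted and the panicked bookkeeping are omitted for brevity):

    use std::io::{self, IoSlice, Write};

    // Flush `buffered`, opportunistically sending the front of `incoming` in
    // the same vectored calls. Returns how many bytes of `incoming` were
    // forwarded. The loop is bounded by the buffered bytes only, so this never
    // makes more write calls than a plain flush of `buffered` would.
    fn flush_buf_vectored<W: Write>(
        writer: &mut W,
        buffered: &mut Vec<u8>,
        incoming: &[u8],
    ) -> io::Result<usize> {
        let mut flushed = 0; // bytes of `buffered` forwarded so far
        let mut forwarded = 0; // bytes of `incoming` forwarded so far

        while flushed < buffered.len() {
            let n = writer.write_vectored(&[
                IoSlice::new(&buffered[flushed..]),
                IoSlice::new(&incoming[forwarded..]),
            ])?;
            if n == 0 {
                return Err(io::ErrorKind::WriteZero.into());
            }
            // write_vectored consumes the slices in order, so attribute written
            // bytes to the buffered data first and to `incoming` second.
            let from_buf = n.min(buffered.len() - flushed);
            flushed += from_buf;
            forwarded += n - from_buf;
        }

        buffered.clear();
        Ok(forwarded)
    }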

Follow up items

Tasks

  • Test the new behavior

Open questions

Stuff I want to call out and ensure is addressed in review:

  • Do the assumptions underpinning the BufWriter changes make sense?
  • Would it make sense for BufWriter to unconditionally return true for is_write_vectored, since it specializes vectored writes by buffering them together, even when the underlying device offers no such specialization?

This was split off to a separate PR from #78515

Changes to some BufWriter Write methods, with a focus on reducing total
device write calls by fully filling the buffer in some cases
@rust-highfive (Collaborator) commented Oct 30, 2020

r? @joshtriplett

(rust_highfive has picked a reviewer for you, use r? to override)

+    // Unlike with `write`, we assume that a caller of `write_all` is
+    // interested in minimizing system calls even if the buffer is split.
+    // This method tries to fill up the buffer as much as possible before
+    // flushing, whereas `write` prefers not to split incoming bufs.
+
+    // Bypass the buffer if the incoming write is larger than the whole
+    // buffer.
+    if buf.len() >= self.capacity() {
+        self.flush_buf()?;
+        return self.get_mut().write_all(buf);
+    }
-    // FIXME: Why no len > capacity? Why not buffer len == capacity? #72919
-    if buf.len() >= self.buf.capacity() {
-        self.panicked = true;
-        let r = self.get_mut().write_all(buf);
-        self.panicked = false;
-        r
-    } else {
-        self.buf.extend_from_slice(buf);
-        Ok(())
+
+    // In order to reduce net writes in aggregate, we buffer as much as
+    // possible, then forward, then buffer the rest
+    let amt_buffered = self.write_to_buf(buf);
+    if amt_buffered < buf.len() {
+        self.flush_buf()?;
+        // At this point, because we know that buf.len() < self.buf.capacity(),
+        // we know that this will succeed in totality
+        self.buf.extend_from_slice(&buf[amt_buffered..]);
+    }
+    Ok(())
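For context, write_to_buf (used above) is not shown in this excerpt; it presumably copies as many bytes as fit into the buffer's spare capacity and reports how many were copied. A minimal sketch of that idea, written as a free function over a Vec<u8> since the real method is internal to BufWriter:

    // Copy as much of `data` as fits in `buf`'s spare capacity, returning the
    // number of bytes copied (possibly zero when the buffer is already full).
    fn write_to_buf(buf: &mut Vec<u8>, data: &[u8]) -> usize {
        let spare = buf.capacity() - buf.len();
        let amt = data.len().min(spare);
        buf.extend_from_slice(&data[..amt]);
        amt
    }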
Review comment on lines 309 to 330
@m-ou-se (Member) commented Oct 30, 2020:

Lgtm.

This only changes behaviour in the case where buf is smaller than the total capacity, but bigger than the available space in the buffer. Before: flush the buffer, then add all of buf to the buffer. After: add as much of buf to the buffer as fits before flushing, then put the remainder of buf in the buffer.

Makes sense to me. All of buf gets copied anyway and this still results in a single write call, but this might reduce the overall number of write calls.
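To put hypothetical numbers on that (an 8-byte capacity chosen purely for illustration): with 5 bytes already buffered, a write_all of 6 bytes previously flushed the 5 buffered bytes (one write of 5) and then copied all 6 new bytes into the empty buffer; now it copies 3 bytes to top the buffer up to 8, flushes that full buffer (one write of 8), and buffers the remaining 3 bytes. Either way there is a single write call here, but consistently flushing full buffers tends to lower the total number of writes over a long sequence of calls.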

+    // We assume that callers of `write` prefer to avoid split writes where
+    // possible, so we pre-flush the buffer rather than doing a partial
+    // write to fill it.
     if self.buf.len() + buf.len() > self.buf.capacity() {
         self.flush_buf()?;
     }
     // FIXME: Why no len > capacity? Why not buffer len == capacity? #72919
     if buf.len() >= self.buf.capacity() {
-        self.panicked = true;
-        let r = self.get_mut().write(buf);
-        self.panicked = false;
-        r
+        self.get_mut().write(buf)
Review comment on lines 294 to 301
@m-ou-se (Member) commented Oct 30, 2020:

Lgtm.

This is just comments and removing unnecessary panicked = true/false.
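For background on why dropping the guard is safe only when the buffer is empty: the panicked flag exists so that the flush attempted on drop does not write buffered bytes a second time if the inner writer panicked partway through a write. A standalone illustration of the pattern with a hypothetical type (field names mirror BufWriter's, but this is not std's code):

    use std::io::Write;

    struct Guarded<W: Write> {
        inner: W,
        buf: Vec<u8>,
        panicked: bool,
    }

    impl<W: Write> Drop for Guarded<W> {
        fn drop(&mut self) {
            // If `inner` panicked while buffered data was being forwarded, some
            // of `buf` may already have reached the device; writing it again
            // here would duplicate it, so the final flush is skipped.
            if !self.panicked {
                let _ = self.inner.write_all(&self.buf);
            }
        }
    }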

    let total_len: usize = bufs.iter().map(|buf| buf.len()).sum();

    if total_len + self.buffer().len() > self.capacity() {
        self.flush_buf()?;
    }
    if total_len >= self.buf.capacity() {
        self.get_mut().write_vectored(bufs)
    } else {
        // Correctness note: we've already verified that none of these
        // will overflow the buffer, because total_len < capacity
        bufs.iter().for_each(|buf| self.buf.extend_from_slice(buf));
        Ok(total_len)
    }
Review comment on lines 335 to 347
@m-ou-se (Member) commented Oct 30, 2020:

Lgtm.

Does not change behaviour in the is_write_vectored() case.

@m-ou-se (Member) commented Oct 30, 2020

Do the assumptions underpinning the BufWriter changes make sense?

I'm slightly worried about splitting things from write_all when the data could fit in the buffer if it was flushed first.

The documentation here says:

This method will continuously call write until there is no more data to be written or an error of non-ErrorKind::Interrupted kind is returned.

Strictly speaking it doesn't specify that it will call write on the whole buffer, but I think that's what it implies. It's not a weird assumption to make that a write_all() to a BufWriter with less than .capacity() bytes will not be split. It might be fine to break this assumption, but right now there's no way to avoid it with write_all, as BufWriter does not expose anything like a write_out_the_internal_buffer() function that makes room for the next write_all() without also calling flush() on the underlying writer.
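Concretely, with the same hypothetical 8-byte capacity as above: 5 bytes already buffered plus a write_all of 6 bytes means 3 of the caller's bytes go out in the first 8-byte device write and the other 3 only leave in whatever write later flushes the buffer, so a payload smaller than capacity is no longer guaranteed to reach the device as one contiguous write.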


Would it make sense for BufWriter to unconditionally return true for is_write_vectored, since it specializes vectored writes by buffering them together, even when the underlying device offers no such specialization?

The documentation says:

Determines if this Writer has an efficient write_vectored implementation.

If a Writer does not override the default write_vectored implementation, code using it may want to avoid the method all together and coalesce writes into a single buffer for higher performance.

So I'd say yes, this write_vectored implementation qualifies. You wouldn't want a user of BufWriter to manually coalesce their buffers and call write instead of write_vectored.
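To make the documentation's point concrete, here is a hypothetical caller-side helper (not from the PR or std) that branches on is_write_vectored(); if BufWriter reported false, such a caller would pay for the manual coalescing below even though BufWriter's new fallback already buffers the pieces together. Note that is_write_vectored is a nightly-only API (feature can_vector) at the time of this discussion.

    use std::io::{self, IoSlice, Write};

    // Hypothetical helper: use a vectored write when the writer advertises an
    // efficient implementation, otherwise coalesce the pieces into one buffer.
    // (Requires #![feature(can_vector)] on nightly for is_write_vectored.)
    fn write_parts<W: Write>(writer: &mut W, parts: &[&[u8]]) -> io::Result<()> {
        if writer.is_write_vectored() {
            let slices: Vec<IoSlice<'_>> = parts.iter().map(|p| IoSlice::new(p)).collect();
            // For brevity this assumes the vectored write is not short; real
            // code would loop until every slice has been fully written.
            writer.write_vectored(&slices)?;
        } else {
            let coalesced: Vec<u8> = parts.concat();
            writer.write_all(&coalesced)?;
        }
        Ok(())
    }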

Lucretiel added 6 commits Oct 30, 2020
- BufWriter now makes use of vectored writes when flushing; it attempts to write both buffered data and incoming data in a single operation when it makes sense to do so.
- LineWriterShim takes advantage of BufWriter's new "vectored flush" operation in a few places
- Fixed a failing test. No new tests yet.
- Added BufWriter::available; used it to simplify various checks
- Refactored write to make it simpler; extensively re-commented it
- Fixed bugs in the write implementation; expanded it to make the flow clearer.
- Replaced several uses of .capacity() with .available(); in these cases they're identical (because they always occur after a completed flush) but available makes more sense conceptually (a sketch of this helper follows the list).
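A guess at the shape of the BufWriter::available helper mentioned in the commit list; the body is an assumption inferred from the description (the spare capacity left in the buffer) and is shown as it might appear inside BufWriter rather than as standalone compilable code:

    // Bytes that can still be buffered before the buffer is full. Right after
    // a completed flush this equals capacity(), which is why the two are
    // interchangeable at those call sites.
    fn available(&self) -> usize {
        self.capacity() - self.buf.len()
    }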
         }
     }

     fn is_write_vectored(&self) -> bool {
-        self.get_ref().is_write_vectored()
+        true

@the8472 (Contributor) commented Oct 31, 2020:

Is that really an improvement? It would result in the caller going through the effort of managing IoSlices only for BufWriter to disassemble them into the buffer, assuming they consist of small writes. And in the case of large writes we would only try to pass through the first one.

The caller might as well call write() multiple times to have things aggregate into the buffer.

@Lucretiel (Author, Contributor) commented Oct 31, 2020:

I'm certainly assuming that no one is going out of their way to use vectored writes if it isn't readily convenient to do so, and in particular they probably aren't dynamically doing their own "buffering" of IoSlice objects to eventually forward. For instance, write_fmt isn't going to be creating a buffer of IoSlice for writing the constituent parts. I'm instead assuming that people are using vectored writes in cases where they happen to already have a fixed set of buffers available; for instance, an HTTP implementation might use it to forward HTTP headers, or a variant of println might use it to append a newline to the base content.

This reflects my own usage of write_vectored in BufWriter; I only call it in cases where I conveniently have a fixed set of buffers I'm interested in writing.

The quoted documentation from @m-ou-se supports this "always return true" behavior: #78551 (comment)
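For illustration, a usage example along the lines described above, with made-up data and output path; the point is only the calling pattern of handing BufWriter a small, fixed set of IoSlices:

    use std::fs::File;
    use std::io::{self, BufWriter, IoSlice, Write};

    fn main() -> io::Result<()> {
        // Hypothetical output file and payload, purely for illustration.
        let file = File::create("response.bin")?;
        let mut out = BufWriter::new(file);

        let header = b"Content-Length: 5\r\n\r\n";
        let body = b"hello";

        // The two pieces already live in separate buffers, so write_vectored
        // is the convenient call. With this PR's fallback, BufWriter copies
        // both slices into its internal buffer in one call even when the
        // underlying writer has no vectored specialization, instead of
        // degrading to the default write_vectored behaviour of writing only
        // the first non-empty slice.
        let bufs = [IoSlice::new(header), IoSlice::new(body)];
        let n = out.write_vectored(&bufs)?;
        debug_assert_eq!(n, header.len() + body.len(), "short write not handled in this sketch");

        out.flush()
    }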
