Implementation changes to BufWriter #78551
Conversation
Changes to some BufWriter Write methods, with a focus on reducing total device write calls by fully filling the buffer in some cases
(rust_highfive has picked a reviewer for you, use r? to override)
```diff
+// Unlike with `write`, we assume that a caller of `write_all` is
+// interested in minimizing system calls even if the buffer is split.
+// This method tries to fill up the buffer as much as possible before
+// flushing, whereas `write` prefers not to split incoming bufs.
+
+// Bypass the buffer if the incoming write is larger than the whole
+// buffer.
+if buf.len() >= self.capacity() {
+    self.flush_buf()?;
+    return self.get_mut().write_all(buf);
+}
-// FIXME: Why no len > capacity? Why not buffer len == capacity? #72919
-if buf.len() >= self.buf.capacity() {
-    self.panicked = true;
-    let r = self.get_mut().write_all(buf);
-    self.panicked = false;
-    r
-} else {
-    self.buf.extend_from_slice(buf);
-    Ok(())
-}
+// In order to reduce net writes in aggregate, we buffer as much as
+// possible, then forward, then buffer the rest.
+let amt_buffered = self.write_to_buf(buf);
+if amt_buffered < buf.len() {
+    self.flush_buf()?;
+    // At this point, because we know that buf.len() < self.buf.capacity(),
+    // we know that this will succeed in totality.
+    self.buf.extend_from_slice(&buf[amt_buffered..]);
+}
+Ok(())
```
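To make the new `write_all` strategy concrete, here is a standalone model of the fill/flush/buffer-the-rest logic under review. All names (`ModelBufWriter`, `write_to_buf`, `flushes`) are hypothetical stand-ins, not the real `std` internals; the point is only to show that a partial fill before flushing produces full-sized device writes.

```rust
// Hypothetical standalone model of the `write_all` strategy: fill the
// buffer to capacity, flush it, then buffer the remainder.
struct ModelBufWriter {
    buf: Vec<u8>,
    capacity: usize,
    flushes: Vec<Vec<u8>>, // each entry simulates one device write
}

impl ModelBufWriter {
    fn new(capacity: usize) -> Self {
        Self { buf: Vec::with_capacity(capacity), capacity, flushes: Vec::new() }
    }

    fn flush_buf(&mut self) {
        if !self.buf.is_empty() {
            self.flushes.push(std::mem::take(&mut self.buf));
        }
    }

    // Copy as much of `data` as fits into the buffer, returning the amount.
    fn write_to_buf(&mut self, data: &[u8]) -> usize {
        let amt = data.len().min(self.capacity - self.buf.len());
        self.buf.extend_from_slice(&data[..amt]);
        amt
    }

    fn write_all(&mut self, data: &[u8]) {
        if data.len() >= self.capacity {
            // Bypass the buffer entirely for oversized writes.
            self.flush_buf();
            self.flushes.push(data.to_vec());
            return;
        }
        let amt = self.write_to_buf(data);
        if amt < data.len() {
            self.flush_buf();
            self.buf.extend_from_slice(&data[amt..]);
        }
    }
}

fn main() {
    let mut w = ModelBufWriter::new(8);
    w.write_all(b"abcde"); // fits: buffered, no device write
    w.write_all(b"fghij"); // fills to 8, flushes, buffers the rest
    assert_eq!(w.flushes.len(), 1);
    assert_eq!(w.flushes[0].len(), 8); // the flushed chunk is full-sized
    assert_eq!(w.buf, b"ij");
}
```

Note how the single flushed chunk is exactly capacity-sized, which is the whole point of splitting the incoming buf.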
m-ou-se (Member), Oct 30, 2020

Lgtm. This only changes behaviour in the case `buf` is smaller than the total capacity, but bigger than the available space in the buffer. Before: flush the buffer, then add all of `buf` to the buffer. After: add as much of `buf` to the buffer as fits before flushing, then put the remainder of `buf` in the buffer.

Makes sense to me. All of `buf` gets copied anyway and this still results in a single `write` call, but this might reduce the overall number of `write` calls.
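The reviewer's "might reduce the overall number of `write` calls" can be checked with a small simulation. The two functions below are hypothetical models (not the PR's code) of the flush-counting behaviour of the old strategy (flush, then buffer the whole chunk) versus the new one (fill to capacity, flush, buffer the remainder):

```rust
// Old strategy: if the chunk doesn't fit, flush first, then buffer it whole.
fn flushes_old(capacity: usize, chunks: &[usize]) -> usize {
    let (mut buffered, mut flushes) = (0, 0);
    for &c in chunks {
        if buffered + c > capacity {
            flushes += 1; // flush whatever is buffered
            buffered = 0;
        }
        buffered += c; // then buffer the whole chunk (each c < capacity)
    }
    flushes
}

// New strategy: fill the buffer completely, flush, buffer the remainder.
fn flushes_new(capacity: usize, chunks: &[usize]) -> usize {
    let (mut buffered, mut flushes) = (0, 0);
    for &c in chunks {
        let fits = capacity - buffered;
        if c > fits {
            flushes += 1; // buffer fills completely, then is flushed
            buffered = c - fits; // remainder goes into the empty buffer
        } else {
            buffered += c;
        }
    }
    flushes
}

fn main() {
    // Four 5-byte `write_all` calls against an 8-byte buffer:
    // the old strategy flushes three 5-byte chunks, the new one
    // flushes two full 8-byte chunks.
    assert_eq!(flushes_old(8, &[5, 5, 5, 5]), 3);
    assert_eq!(flushes_new(8, &[5, 5, 5, 5]), 2);
}
```

Both strategies copy every byte into the buffer exactly once; the difference is purely in how many device writes the same workload produces.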
```diff
+// We assume that callers of `write` prefer to avoid split writes where
+// possible, so we pre-flush the buffer rather than doing a partial
+// write to fill it.
 if self.buf.len() + buf.len() > self.buf.capacity() {
     self.flush_buf()?;
 }
-// FIXME: Why no len > capacity? Why not buffer len == capacity? #72919
 if buf.len() >= self.buf.capacity() {
-    self.panicked = true;
-    let r = self.get_mut().write(buf);
-    self.panicked = false;
-    r
+    self.get_mut().write(buf)
```
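The `write` path's contract, as described in the comment above, is that an incoming buf is never split across two device writes. A hypothetical free-standing model of that path (names invented for illustration) makes the invariant easy to state and test:

```rust
// Hypothetical model of the `write` path: pre-flush so the incoming buf is
// never split across two device writes, then either forward it whole or
// buffer it whole.
fn model_write(
    buffered: &mut Vec<u8>,
    capacity: usize,
    device: &mut Vec<Vec<u8>>, // each entry simulates one device write
    buf: &[u8],
) -> usize {
    if buffered.len() + buf.len() > capacity {
        device.push(std::mem::take(buffered)); // pre-flush (flush_buf)
    }
    if buf.len() >= capacity {
        device.push(buf.to_vec()); // forward directly, unbuffered
    } else {
        buffered.extend_from_slice(buf);
    }
    buf.len() // `Ok(n)` with n == buf.len(): the caller's buf stays unbroken
}

fn main() {
    let mut buffered = Vec::new();
    let mut device = Vec::new();
    assert_eq!(model_write(&mut buffered, 8, &mut device, b"hello"), 5);
    assert_eq!(model_write(&mut buffered, 8, &mut device, b"world"), 5);
    assert_eq!(device.len(), 1); // pre-flush wrote only the old 5 bytes
    assert_eq!(buffered, b"world"); // second buf was buffered unbroken
}
```

The trade-off versus `write_all` is visible here: pre-flushing emits a 5-byte device write rather than an 8-byte one, in exchange for never splitting the caller's buf.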
m-ou-se (Member), Oct 30, 2020

Lgtm. This is just comments and removing unnecessary `panicked = true`/`false`.
```rust
let total_len: usize = bufs.iter().map(|buf| buf.len()).sum();

if total_len + self.buffer().len() > self.capacity() {
    self.flush_buf()?;
}
if total_len >= self.buf.capacity() {
    self.get_mut().write_vectored(bufs)
} else {
    // Correctness note: we've already verified that none of these
    // will overflow the buffer, because total_len < capacity
    bufs.iter().for_each(|buf| self.buf.extend_from_slice(buf));
    Ok(total_len)
}
```
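The fallback path above coalesces all the sub-bufs into the internal buffer when their total size fits. A minimal standalone sketch of that coalescing step (the `coalesce` helper is hypothetical, not part of the PR) using `std::io::IoSlice`:

```rust
use std::io::IoSlice;

// Hypothetical sketch of the fallback: coalesce all sub-bufs into one
// buffer when their total size fits, otherwise signal that the slices
// should be forwarded to the device instead.
fn coalesce(bufs: &[IoSlice<'_>], capacity: usize) -> Option<Vec<u8>> {
    let total_len: usize = bufs.iter().map(|b| b.len()).sum();
    if total_len >= capacity {
        return None; // too big to buffer; forward to the device
    }
    let mut out = Vec::with_capacity(total_len);
    for b in bufs {
        out.extend_from_slice(b); // IoSlice derefs to &[u8]
    }
    Some(out)
}

fn main() {
    let bufs = [IoSlice::new(b"GET / HTTP/1.1"), IoSlice::new(b"\r\n")];
    // Fits in a 64-byte buffer: both slices are buffered together.
    assert_eq!(coalesce(&bufs, 64).unwrap(), b"GET / HTTP/1.1\r\n");
    // Too big for a 4-byte buffer: would be forwarded instead.
    assert!(coalesce(&bufs, 4).is_none());
}
```

This is what lets the non-specialized path still report `Ok(total_len)` for a single logical vectored write.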
I'm slightly worried about splitting things from … The documentation here says: …

Strictly speaking it doesn't specify that it will call …

The documentation says: …

So I'd say yes, this …
- BufWriter now makes use of vectored writes when flushing; it attempts to write both buffered data and incoming data in a single operation when it makes sense to do so.
- LineWriterShim takes advantage of BufWriter's new "vectored flush" operation in a few places.
- Fixed a failing test. No new tests yet.
- Added `BufWriter::available`; used it to simplify various checks.
- Refactored `write` to make it simpler; extensively recommented it.
- Fixed bugs in the `write` implementation; decompressed it to make the flow clearer.
- Replaced several uses of `.capacity()` with `.available()`; in these cases they're identical (because they always occur after a completed flush), but `available` makes more sense conceptually.
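The `available` helper mentioned in the changelog is presumably just the spare room in the buffer, as opposed to its total `capacity`. A minimal sketch, assuming that reading of the commit message (the `Buffered` struct here is an invented stand-in, not `BufWriter` itself):

```rust
// Hypothetical sketch of the `available` helper: spare room in the buffer.
struct Buffered {
    buf: Vec<u8>,
    capacity: usize,
}

impl Buffered {
    // After a completed flush, buf is empty, so available() == capacity —
    // which is why the two were interchangeable in the replaced call sites.
    fn available(&self) -> usize {
        self.capacity - self.buf.len()
    }
}

fn main() {
    let b = Buffered { buf: vec![0; 3], capacity: 8 };
    assert_eq!(b.available(), 5);
    let empty = Buffered { buf: Vec::new(), capacity: 8 };
    assert_eq!(empty.available(), empty.capacity);
}
```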
```diff
     }
 }

 fn is_write_vectored(&self) -> bool {
-    self.get_ref().is_write_vectored()
+    true
 }
```
the8472 (Contributor), Oct 31, 2020

Is that really an improvement? It would result in the caller going through the effort of managing `IoSlice`s only for `BufWriter` to disassemble them into the buffer, assuming they consist of small writes. And in the case of large writes we would only try to pass through the first one.

The caller might as well call `write()` multiple times to have things aggregate into the buffer.
Lucretiel (Author, Contributor), Oct 31, 2020

I'm certainly assuming that no one is going out of their way to use vectored writes if it isn't eminently convenient to do so, and in particular they probably aren't dynamically doing their own "buffering" of `IoSlice` objects to eventually forward. For instance, `write_fmt` isn't going to be creating a buffer of `IoSlice` for writing the constituent parts. I'm instead assuming that people are using vectored writes in cases where they happen to already have a fixed set of buffers available; for instance, an HTTP protocol might use it to forward HTTP headers, or a variant of `println` might use it to append a newline to the base content.

This reflects my own usage of `write_vectored` in `BufWriter`; I only call it in cases where I conveniently have a fixed set of buffers I'm interested in writing.

The quoted documentation from @m-ou-se supports this "always return true" behavior: #78551 (comment)
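The usage pattern described above, a caller that already has a fixed set of buffers (say, content plus a trailing newline), looks like this. The example writes to a `Vec<u8>`, whose `Write` impl accepts all slices in one vectored call; against a `BufWriter` over a non-vectored device, the change under discussion would buffer both slices together rather than forwarding only the first:

```rust
use std::io::{IoSlice, Write};

fn main() -> std::io::Result<()> {
    let mut out: Vec<u8> = Vec::new(); // stands in for any Write target
    let content = b"hello";
    // One vectored call instead of two `write` calls: the caller already
    // has both buffers on hand, so no IoSlice bookkeeping is needed.
    let n = out.write_vectored(&[IoSlice::new(content), IoSlice::new(b"\n")])?;
    assert_eq!(n, content.len() + 1); // Vec's impl consumes every slice
    assert_eq!(out, b"hello\n");
    Ok(())
}
```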
This PR contains some proposed implementation updates to `BufWriter`, with a focus on reducing total device `write` calls in the average case. These updates are based on lessons learned from the design of `LineWriterShim`, as well as some discussions with @workingjubilee on this topic.

There are three main changes to how `BufWriter` works:

- `write_all` now fully fills the buffer before flushing. Previously, it would fill the buffer as much as possible without splitting any incoming writes. However, we assume that a caller of `write_all` is foremost interested in minimizing write calls to the underlying device, which we can best achieve by maximizing the size of the buffers we flush. Of course, incoming data that exceeds the total size of the buffer is still forwarded directly to the underlying device. By contrast, we assume that a caller of `write` would prefer to have unbroken writes if possible (that is, would prefer `Ok(n)` where `n == buf.len()`). The implementation of `write` is therefore left unchanged.
- `write_vectored` is unchanged when `inner.is_write_vectored()`. However, in the case where the underlying device doesn't offer any specialization, we now take care to buffer together all the incoming sub-bufs, even if the total size exceeds our buffer, and only fall back to directly forwarded writes in the case that an individual slice exceeds our buffer size.
- `flush_buf_vectored()`: a new internal method that is similar to `flush_buf`, but additionally attempts to use vectored operations to send the new incoming data along with the existing buffered data in a single operation. I took care to ensure that this never results in more system calls than it would have already; in particular, it immediately stops forwarding as soon as the existing buffer is fully forwarded, even if 0 new bytes have been sent. `write` and `write_all` were refactored to take advantage of it, and `write_vectored` was slightly modified to forward to `write` if exactly 1 buffer is being written.

Additionally, there are now fewer `self.panicked` guard fences. `BufWriter::panicked` is about preventing duplicate writes of buffered data, so it isn't necessary to fence writes to the underlying device when the buffer is known to be empty.

Follow up items

Tasks

Open questions

Stuff I want to call out and ensure is addressed in review:

- Whether the `BufWriter` changes make sense.
- Is it correct for `BufWriter` to unconditionally return `true` for `is_write_vectored`, since it specializes vectored writes by buffering them together, even when the underlying device offers no such specialization?

This was split off to a separate PR from #78515.
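The `flush_buf_vectored` idea described above can be sketched as follows. This is a hedged reconstruction from the PR description, not the PR's actual code: the function name, signature, and error handling are all assumptions. The key property it illustrates is that forwarding stops as soon as the previously buffered bytes are fully written, so it never issues more device calls than a plain `flush_buf` would:

```rust
use std::io::{IoSlice, Write};

// Hypothetical sketch: flush the buffered data, opportunistically sending
// the new incoming data in the same vectored call. Returns how many bytes
// of `incoming` were sent; the caller buffers or forwards the rest.
fn flush_buf_vectored<W: Write>(
    buf: &mut Vec<u8>,
    incoming: &[u8],
    inner: &mut W,
) -> std::io::Result<usize> {
    let mut buffered_remaining = buf.len();
    let mut incoming_written = 0;
    while buffered_remaining > 0 {
        let start = buf.len() - buffered_remaining;
        let n = inner.write_vectored(&[
            IoSlice::new(&buf[start..]),
            IoSlice::new(&incoming[incoming_written..]),
        ])?;
        if n == 0 {
            return Err(std::io::ErrorKind::WriteZero.into());
        }
        if n <= buffered_remaining {
            buffered_remaining -= n;
        } else {
            incoming_written = n - buffered_remaining;
            buffered_remaining = 0; // buffer drained: stop forwarding here,
                                    // even if incoming_written == 0
        }
    }
    buf.clear();
    Ok(incoming_written)
}

fn main() -> std::io::Result<()> {
    let mut buf = b"buffered:".to_vec();
    let mut device: Vec<u8> = Vec::new(); // Vec accepts everything at once
    let n = flush_buf_vectored(&mut buf, b"incoming", &mut device)?;
    assert!(buf.is_empty());
    assert_eq!(n, b"incoming".len());
    assert_eq!(device, b"buffered:incoming");
    Ok(())
}
```

Against a device that accepts only part of each call, the loop keeps retrying until the old buffer is drained, but makes no extra call purely for the incoming data; that remains the caller's decision.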