Re: [Libguestfs] [PATCH 1/2] spec: Recommend cap on NBD_REPLY_TYPE_BLOCK_STATUS length
On Fri, Apr 08, 2022 at 09:25:01AM +0200, Wouter Verhelst wrote:
> Hi Eric,
>
> On Thu, Apr 07, 2022 at 04:37:19PM -0500, Eric Blake wrote:
> > The spec was silent on how many extents a server could reply with.
> > However, both qemu and nbdkit (the two server implementations known to
> > have implemented the NBD_CMD_BLOCK_STATUS extension) implement a hard
> > cap, and will truncate the amount of extents in a reply to avoid
> > sending a client a reply larger than the maximum NBD_CMD_READ response
> > they are willing to tolerate:
> >
> > When qemu first implemented NBD_CMD_BLOCK_STATUS for the
> > base:allocation context (qemu commit e7b1948d51, Mar 2018), it behaved
> > as if NBD_CMD_FLAG_REQ_ONE were always passed by the client, and never
> > responded with more than one extent. Later, when adding its
> > qemu:dirty-bitmap:XXX context extension (qemu commit 3d068aff16, Jun
> > 2018), it added a cap to 128k extents (1M+4 bytes), and that cap was
> > applied to base:allocation once qemu started sending multiple extents
> > for that context as well (qemu commit fb7afc797e, Jul 2018). Qemu
> > extents are never smaller than 512 bytes (other than an exception at
> > the end of a file whose size is not aligned to 512), but even so, a
> > request for just under 4G of block status could produce 8M extents,
> > resulting in a reply of 64M if it were not capped smaller.
> >
> > When nbdkit first implemented NBD_CMD_BLOCK_STATUS (nbdkit 4ca66f70a5,
> > Mar 2019), it did not impose any restriction on the number of extents
> > in the reply chunk. But because it allows extents as small as one
> > byte, it is easy to write a server that can amplify a client's request
> > of status over 1M of the image into a reply over 8M in size, and it
> > was very easy to demonstrate that a hard cap was needed to avoid
> > crashing clients or otherwise killing the connection (a bad server
> > impacting the client negatively); unique to nbdkit's situation is the
> > fact that because it is designed for plugin server implementations,
> > not capping the length of extent also posed a problem to nbdkit as the
> > server (a client requesting large block status could cause the server
> > to run out of memory depending on the plugin providing the server
> > callbacks). So nbdkit enforced a bound of 1M extents (8M+4 bytes,
> > nbdkit commit 6e0dc839ea, Jun 2019).
> >
> > Since the limit chosen by these two implementations is different, and
> > since nbdkit has versions that were not limited, add this as a SHOULD
> > NOT instead of MUST NOT constraint on servers implementing block
> > status. It does not matter that qemu picked a smaller limit that it
> > truncates to, since we have already documented that the server may
> > truncate for other reasons (such as it being inefficient to collect
> > that many extents in the first place). But documenting the limit now
> > becomes even more important in the face of a future addition of 64-bit
> > requests, where a client's request is no longer bounded to 4G and
> > could thereby produce even more than 8M extents for the corner case
> > when every 512 bytes is a new extent, if it were not for this
> > recommendation.
>
> It feels backwards to me to make this a restriction on the server side.
> You're saying there are server implementations that will be inefficient
> if there are more than 2^20 extents, and therefore no server should send
> more than those, even if it can do so efficiently.
>
> Isn't it up to the server implementation to decide what can be done
> efficiently?
>
> Perhaps we can make the language about possibly reducing length of
> extens a bit stronger; but I don't think adding explicit limits for a
> server's own protection is necessary.
I agree, but for a different reason.
I think Eric should add language that servers can consider limiting
response sizes in order to prevent possible amplification issues
and/or simply overwhelming the client with work (bad server DoS
attacks against clients are a thing!), but I don't think it's
necessarily a "SHOULD" issue.
BTW attached is an nbdkit plugin that creates an NBD server that
responds with massive numbers of byte-granularity extents, in case
anyone wants to test how nbdkit and/or clients respond:
$ chmod +x /var/tmp/lots-of-extents.py
$ /var/tmp/lots-of-extents.py -f
$ nbdinfo --map nbd://localhost | head
0 1 3 hole,zero
1 1 0 data
2 1 3 hole,zero
3 1 0 data
4 1 3 hole,zero
5 1 0 data
6 1 3 hole,zero
7 1 0 data
8 1 3 hole,zero
9 1 0 data
$ nbdinfo --map --totals nbd://localhost
524288 50.0% 0 data
524288 50.0% 3 hole,zero
>From experimentation I found this really hurts my laptop :-(
Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
nbdkit - Flexible, fast NBD server with plugins
https://gitlab.com/nbdkit/nbdkit
#!/usr/sbin/nbdkit python
API_VERSION = 2
def open(readonly):
return 1
def get_size(h):
return 1024*1024
def pread(h, buf, offset, flags):
buf[:] = bytearray(len(buf))
def typ(offset):
if offset % 2:
return 0 # data
else:
return 3 # zero + hole
def extents(h, count, offset, flags):
e = [(i, 1, typ(i)) for i in range(offset, offset + count + 1)]
return e
Reply to: