On 8/31/23 00:21, Eric Blake wrote:
I hit another transient failure in libnbd CI when a poorly-written
eval script did not consume all of stdin during .pwrite. As behaving
as a data sink can be a somewhat reasonable feature of a
quickly-written sh or eval plugin, we should not be so insistent as
treating an EPIPE failure as an immediate return of EIO to the client.
Signed-off-by: Eric Blake <eblake(a)redhat.com>
---
I probably need to add unit test coverage of this before committing
(although proving that I win the data race on a client process exiting
faster than the parent can write enough data to hit EPIPE is hard).
plugins/sh/nbdkit-sh-plugin.pod | 8 +++++++
plugins/sh/call.c | 38 ++++++++++++++++++++++-----------
2 files changed, 34 insertions(+), 12 deletions(-)
diff --git a/plugins/sh/nbdkit-sh-plugin.pod b/plugins/sh/nbdkit-sh-plugin.pod
index b2c946a0..8b83a5b3 100644
--- a/plugins/sh/nbdkit-sh-plugin.pod
+++ b/plugins/sh/nbdkit-sh-plugin.pod
@@ -505,6 +505,14 @@ Unlike in other languages, if you provide a C<pwrite> method
you
B<must> also provide a C<can_write> method which exits with code C<0>
(true).
+With nbdkit E<ge> 1.36, this method may return C<0> without consuming
+any data from stdin, and without producing any output, in order to
+behave as an intentional data sink. But in older versions, nbdkit
+would treat any C<EPIPE> failure in writing to your script as an error
+condition even if your script returns success; to avoid unintended
+failures, you may want to include C<"cat >/dev/null"> in a script
+intending to ignore the client's write requests.
+
=item C<flush>
/path/to/script flush <handle>
diff --git a/plugins/sh/call.c b/plugins/sh/call.c
index 888c6459..79c67a04 100644
--- a/plugins/sh/call.c
+++ b/plugins/sh/call.c
@@ -34,6 +34,7 @@
#include <assert.h>
#include <fcntl.h>
+#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>
@@ -130,6 +131,7 @@ debug_call (const char **argv)
*/
static int
call3 (const char *wbuf, size_t wbuflen, /* sent to stdin (can be NULL) */
+ bool *pipe_full, /* set if wbuf not fully written */
string *rbuf, /* read from stdout */
string *ebuf, /* read from stderr */
const char **argv) /* script + parameters */
@@ -275,15 +277,8 @@ call3 (const char *wbuf, size_t wbuflen, /* sent to stdin (can be
NULL) */
r = write (pfds[0].fd, wbuf, wbuflen);
if (r == -1) {
if (errno == EPIPE) {
- /* We tried to write to the script but it didn't consume
- * the data. Probably the script exited without reading
- * from stdin. This is an error in the script.
- */
- nbdkit_error ("%s: write to script failed because of a broken pipe:
"
- "this can happen if the script exits without "
- "consuming stdin, which usually indicates a bug "
- "in the script",
- argv0);
+ *pipe_full = true;
+ r = wbuflen;
}
else
nbdkit_error ("%s: write: %m", argv0);
@@ -555,7 +550,7 @@ call (const char **argv)
CLEANUP_FREE_STRING string rbuf = empty_vector;
CLEANUP_FREE_STRING string ebuf = empty_vector;
- r = call3 (NULL, 0, &rbuf, &ebuf, argv);
+ r = call3 (NULL, 0, NULL, &rbuf, &ebuf, argv);
return handle_script_error (argv[0], &ebuf, r);
}
@@ -568,7 +563,7 @@ call_read (string *rbuf, const char **argv)
int r;
CLEANUP_FREE_STRING string ebuf = empty_vector;
- r = call3 (NULL, 0, rbuf, &ebuf, argv);
+ r = call3 (NULL, 0, NULL, rbuf, &ebuf, argv);
r = handle_script_error (argv[0], &ebuf, r);
if (r == ERROR)
string_reset (rbuf);
@@ -584,7 +579,26 @@ call_write (const char *wbuf, size_t wbuflen, const char **argv)
int r;
CLEANUP_FREE_STRING string rbuf = empty_vector;
CLEANUP_FREE_STRING string ebuf = empty_vector;
+ bool pipe_full = false;
- r = call3 (wbuf, wbuflen, &rbuf, &ebuf, argv);
+ r = call3 (wbuf, wbuflen, &pipe_full, &rbuf, &ebuf, argv);
+ if (pipe_full && r == OK) {
+ /* We allow scripts to intentionally ignore data, but they must
+ * have no output when doing so.
+ */
+ if (rbuf.len > 0 || ebuf.len > 0) {
+ nbdkit_error ("%s: write to script failed because of a broken pipe: "
+ "this can happen if the script exits without "
+ "consuming stdin, which usually indicates a bug "
+ "in the script",
+ argv[0]);
+ r = ERROR;
+ }
+ else
+ nbdkit_debug ("%s: write to script failed because of a broken pipe; "
+ "assuming this was an intentional data sink, although it
"
+ "may indicate a bug in the script",
+ argv[0]);
+ }
return handle_script_error (argv[0], &ebuf, r);
}
The patch appears to do what it says on the tin, but I'm unconvinced we
should relax the requirement. The "may indicate a bug" debug message is
easy to miss. A hard failure upon EPIPE is noticeable, and as you write
the script can just drain unwanted input with "cat >/dev/null".
I think I may not be appreciating this patch due to not understanding
how difficult it could be to update such a script. Can you elaborate on
that? Can you provide an example where adding "cat >/dev/null" to the
pwrite script is a disproportionate chore?
Thanks,
Laszlo