nmh-workers
[Top] [All Lists]

Re: [Nmh-workers] proposed patch for shell metacharacter failure in nmh-1.7

2018-02-11 17:57:20
[I wrote this ages ago when catching up on unread list email and never
quite finished it.  But if it doesn't leave +drafts now then...  It's a
distraction from the current 1.7.x issue, but the list archive might
find it useful.]

Hi David,

If you want to try the attached patch to mhshowsbr.c, it works for me
with:

    mhshow-show-application/pdf: %pmime_helper %F %s
        "`fmttest -raw -format '%(decode{text})' '%{name}'`"
...
As far as whether this is the right approach, I'd like to hear from
others.

Replying to multiple branches here, having caught up.

I think correct quoting of the expansions is the right solution.  As
Paul said of Ken's suggestion to nuke dodgy characters, it's not the
least surprising.

The existing code around uip/mhshowsbr.c's parse_display_string()
obviously has problems.  In trying to deal with quoting it only spots
single quotes, not double or backticks, for example.  A bit of it had
the right idea at one time regarding single quoting, and currently says

    /* Escape existing quotes */
    while ((pp = strchr (pp, '\'')) && buflen > 3) {
        len = strlen (pp++);
        if (quoted) {
            /* Quoted.  Let this quote close that quoting.
               Insert an escaped quote to replace it and
               another quote to reopen quoting, which will be
               closed below. */
            memmove (pp + 2, pp, len);
            *pp++ = '\\';
            *pp++ = '\'';
            buflen -= 2;
            bp += 2;
            quoted = false;

Given the user wrote «'%{name}'», that expands to «'foo'bar'», the
comment is turning «'bar'» found in the quoted string into «'\''bar'»,
and that is correct.  But the code changed after the comment was written
to no longer insert the second single quote.  The comment didn't change,
nor did the buflen test at the top from 3 to 2.

Speaking of buflen, the code seems to struggle in places with tracking
what's going on.  We start off...

    static int
    parse_display_string(CT ct, char *cp, int *xstdin, int *xlist,
        char *file, char *buffer, size_t buflen, int multipart)
    {
        char *bp = buffer;

        bp[0] = bp[buflen] = '\0';

That looks like an overflow, as a buffer[4] with buflen 4 would have
indexes [0, 4), but the caller ensures buflen isn't `buffer length', but
`maximum index'.  Cognitively tracking this detracts attention, probably
leading to things like

    len = strlen(bp);
    bp += len;
    buflen -= len;
    *bp = '\0';

Finally, if the MIME's charset isn't the locale's, then another $MH
entry is consulted, e.g. `mhshow-charset-iso-8859-1', and its value is
another template.  This time, snprintf(3) does the work so the only
escape is `%s' and we assume the user complies.

    if (ct->c_termproc) {
        char term[BUFSIZ];
        strncpy(term, buffer, sizeof(term));
        snprintf(buffer, buflen, ct->c_termproc, term);
    }

But they need to make sure they don't attempt to quote that escape,
perhaps to pass it as a single argument, as it may already have quotes.
And if the main template's expansion was `foo | bar' then they need to
ensure their `charset' template can handle sh, e.g. not use «xterm -fn
iso-font -e %s».

BTW, buflen is actually used as a buffer length there, so shortening the
available space by one, or it would be except it's not the original
`maximum index' buflen passed into the function, but the whittled down
one as the buffer was consumed.  This suggests if more than roughly half
the buffer was consumed that the result would be truncated when copied
back under sprintf().

My point is, years of edits, for enhancement and fixes, have grown a bit
of a monster.  They also make the behaviour surprising sometimes, e.g.
if $MH has

    mhshow-show-text/foo: foo %p %q

then `foo' is passed just `%q'.  Neither is mentioned in mhshow(1), nor
does it say what happens to unknown escapes, but `%p' meant something
once so the code silently ignores it now, whereas never-used `%q' is the
default case of passing it on.  Old ones should be removed, with
deprecation if they're still likely to be around.

And the addition of « "$@"» by argsplit() at the end of the string given
to sh's -c is a bad idea for this use.  I realise no arguments are
passed to sh this time, but it can still be set by the template, e.g.
as a simple means of parsing output.  This intentionally runs the second
word from grep's output, and then, unintentionally, the whole line.

    mhshow-show-text/foo: set `grep %s subtypes`; $2;

Enough moaning.  Having studied it to see what it's trying to do,
I'm just trying to point out more than '-quoting needs considering.

An alternative to escapes that expand to content is environment
variables.  Uppercase, of course, not bastard ones.  They can hold all
bytes bar NUL, and it's then up to the user to refer to them in a
suitable manner for their unpredictable content, either in the template
or the program that runs.  Documentation could guide the user, but
they'd probably make mistakes sometimes.

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy

-- 
Nmh-workers
https://lists.nongnu.org/mailman/listinfo/nmh-workers

<Prev in Thread] Current Thread [Next in Thread>
  • Re: [Nmh-workers] proposed patch for shell metacharacter failure in nmh-1.7, Ralph Corderoy <=