
[Nmh-workers] Date Parsing Problems.

2017-04-23 04:32:46
Hi,

One rabbit hole led me to sbr/dtimep.l;  why is lost in time.  It looked
odd to me so I gave it a poke.

    $ uip/fmttest -date '2017-04-23 09:16:42 +0100 Sun'
    04 Sun 2023 09:16:42 +0100

Then I tried nmh 1.6-3's fmttest.

    $ fmttest -date '2017-04-23 09:16:42 +0100 Sun'
    04 context 2023 09:16:42 +0100

For comparison, here's the output when it can grok the input.

    $ fmttest -date 'Sun, 23 Apr 2017 09:16:42 +0100'
    Sun, 23 Apr 2017 09:16:42 +0100

The lexer is ignoring the leading `20' because it has a catchall rule
for regexp `.' when nothing else matches.  The remaining `17-04-23' is
taken as the reverse-podium 4th of month 17, year 23, except the 17 is
stored as 16 because months are zero based.  The array of month names
has only twelve entries but is indexed with 16;  "context" and "Sun"
are whatever happens to sit in memory past its end, and they just
happen to prevent a SEGV.

I didn't examine why the leading "Sun, " is missing from the output;  I
guess the date conversion isn't happy because it spots that 16 > 11.
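
To make the out-of-range indexing concrete, here is a minimal sketch of
a bounds-checked lookup.  The table and the month_name() helper are
hypothetical names for illustration only, not nmh's actual code.

```c
#include <stddef.h>

/* Hypothetical month table for this sketch; nmh has its own array. */
static const char *const month_names[] = {
    "Jan", "Feb", "Mar", "Apr", "May", "Jun",
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec",
};

/* Return the name for a zero-based month, or NULL when the index is
 * outside 0..11, rather than reading past the end of the array. */
static const char *month_name(int mon)
{
    int count = (int)(sizeof month_names / sizeof month_names[0]);

    if (mon < 0 || mon >= count)
        return NULL;
    return month_names[mon];
}
```

With a check like this, month_name(16) yields NULL and the caller can
report that no date was parsed instead of printing a stray string.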

2017-04-23 isn't valid input, and I don't think it should be, but should
the lexer be less forgiving so the user learns that no date could be
parsed?

Who here knows about the lexer's

    /* If the following is #defined, a timezone given as a numeric-only
     * offset will be treated specially if it's in a zone that observes
     * Daylight Saving Time.  For instance, during DST, a Date: like
     * "Mon, 24 Jul 2000 12:31:44 -0700" will be printed as "Mon, 24 Jul
     * 2000 12:31:44 PDT".  Without the code activated by the following
     * #define, that'd be incorrectly printed as "...MST". */
    #define     ADJUST_NUMERIC_ONLY_TZ_OFFSETS_WRT_DST 1

As it's defined, the lexer knocks an hour off the timezone offset if DST
is in effect.

    tm = localtime (&tw->tw_clock);
    if (tm->tm_isdst) {
        tw->tw_flags |= TW_DST;
        tw->tw_zone -= 60;
    }

But I think formatting back to a string happens through dtimezone(),
which spots the set TW_DST bit and bumps the hours up by one.

    /* Get the timezone for given offset.
     * This used to return a three-letter abbreviation for some offset
     * values.  But not many.  Until there's a good way to do that,
     * return the string representation of the numeric offset. */

    char *dtimezone (int offset, int flags)
    {
        int hours, mins;
        static char buffer[10];

        if (offset < 0) {
            mins = -((-offset) % 60);
            hours = -((-offset) / 60);
        } else {
            mins = offset % 60;
            hours = offset / 60;
        }

    #ifdef ADJUST_NUMERIC_ONLY_TZ_OFFSETS_WRT_DST
        if (flags & TW_DST)
            hours += 1;
    #endif /* ADJUST_NUMERIC_ONLY_TZ_OFFSETS_WRT_DST */
        snprintf (buffer, sizeof(buffer), "%s%02d%02d",
                    offset < 0 ? "-" : "+", abs (hours), abs (mins));
        return buffer;
    }

So are all of ADJUST_NUMERIC_ONLY_TZ_OFFSETS_WRT_DST's effects cancelled
out?  Setting it to zero passes all tests, which suggests it can be
deleted along with the code it wraps.  I don't think "PST", etc., will
ever return.  If a user wants something like that then they can use
%(tzone) in the format string and compare the result with known numeric
offsets to produce names instead.
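
The round trip can be sketched in isolation.  This is a simplified
stand-in built only from the two excerpts quoted in this message:
lexer_adjust() and format_zone() are hypothetical names, and the TW_DST
value is a placeholder, not necessarily the one nmh uses.

```c
#include <stdio.h>
#include <stdlib.h>

#define TW_DST 0x01   /* placeholder bit for this sketch only */

/* Mirrors the lexer excerpt: during DST, set the flag and take
 * 60 minutes off the zone offset. */
static int lexer_adjust(int zone_mins, int isdst, int *flags)
{
    if (isdst) {
        *flags |= TW_DST;
        zone_mins -= 60;
    }
    return zone_mins;
}

/* Mirrors the dtimezone() excerpt with the #ifdef branch enabled:
 * the hour is added back when TW_DST is set. */
static const char *format_zone(int offset, int flags)
{
    static char buffer[10];
    int hours, mins;

    if (offset < 0) {
        mins = -((-offset) % 60);
        hours = -((-offset) / 60);
    } else {
        mins = offset % 60;
        hours = offset / 60;
    }
    if (flags & TW_DST)
        hours += 1;
    snprintf(buffer, sizeof(buffer), "%s%02d%02d",
             offset < 0 ? "-" : "+", abs(hours), abs(mins));
    return buffer;
}
```

Feeding -420 (that is, -0700) through lexer_adjust() with DST set gives
-480 plus the TW_DST flag, and format_zone() turns that back into
"-0700".  In this sketch the cancellation only holds while the adjusted
offset keeps its sign;  +0000 during DST comes back as "-0000".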

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy

