
[nmh-workers] Downloading googlegroup messages

2019-05-07 17:15:05
I used https://github.com/icy/google-group-crawler to download
messages from a group I am perusing. Each message is put in a
separate file, so it is easy to just link/copy them to numbered
files. I discovered that plain text messages work fine but MIME
messages don't.

For instance:

    $ cd $_GROUP
    $ find  mbox -type f|head|cat -n|awk '{print "ln ",$2,$1;}'|sh
    $ mhlist 3
     msg part  type/subtype              size description
       3       multipart/alternative     7888
    $ show 3
    mhshow: bogus multipart content in message 3
    ...

I finally tracked it down to this line:
uip/mhparse.c:1191:            if (strcmp (bufp + 2, m->mp_start))

m->mp_start is "0000000000008a6f8e0585b620ff--\n"
while bufp+2 is "0000000000008a6f8e0585b620ff--\r\n"
 
So the test fails. Manually removing \r fixed this.
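
For anyone who wants to automate that cleanup, here is a minimal sketch
in C of the idea, assuming the whole message has already been read into
a NUL-terminated buffer. The function name crlf_to_lf is made up for
illustration and is not part of nmh.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of nmh): rewrite every "\r\n" in buf
 * as "\n", in place.  A CR that is not followed by LF is left alone.
 * Returns the new length of the string.
 */
static size_t
crlf_to_lf (char *buf)
{
    size_t in, out = 0;

    for (in = 0; buf[in] != '\0'; in++) {
        if (buf[in] == '\r' && buf[in + 1] == '\n')
            continue;           /* drop the CR; the LF is copied next */
        buf[out++] = buf[in];
    }
    buf[out] = '\0';
    return out;
}
```

Running each downloaded file through something like this before linking
it into the folder should give mhshow the LF-only lines it expects.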

This seems to be a bug.  The boundary text as per the spec
doesn't include CRLF or LF or CR. What is interesting is that
the message header containing the boundary text also ends with
\r\n so nmh stripped that and then tacked on \n!
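
To illustrate what a more tolerant test might look like, here is a
rough sketch of a comparison that ignores trailing CR/LF on both sides.
The names len_without_eol and boundary_eq are hypothetical, and this is
not the actual nmh code or a proposed patch, just the idea.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of s with any trailing CR/LF characters excluded. */
static size_t
len_without_eol (const char *s)
{
    size_t n = strlen (s);

    while (n > 0 && (s[n - 1] == '\n' || s[n - 1] == '\r'))
        n--;
    return n;
}

/*
 * Hypothetical helper (not part of nmh): returns 1 if line and delim
 * are identical once trailing line terminators are ignored, else 0.
 */
static int
boundary_eq (const char *line, const char *delim)
{
    size_t ln, dn;

    assert (line != NULL && delim != NULL);
    ln = len_without_eol (line);
    dn = len_without_eol (delim);
    return ln == dn && memcmp (line, delim, ln) == 0;
}
```

With the two strings quoted above, boundary_eq (bufp + 2, m->mp_start)
would report a match, whereas the strcmp there returns nonzero.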

-- 
nmh-workers
https://lists.nongnu.org/mailman/listinfo/nmh-workers
