nmh-workers
[Top] [All Lists]

[Nmh-workers] fgets bug in mhbuildsbr.c

2005-05-11 12:09:52
Hello,

I'm new to this list.  I've been a nhm user for almost 7 years now.  I love the
simplicity and flexibility of it's command line interface. 

I recently  came upon  a bug  in "mhbuildsbr.c" that  can cause  certain binary
attachment to  be detected as non-binary.   The result of this  is that mhbuild
will then  attempt to include a truncated  version of the binary  into the mime
composition as plain text. 

I stumbled upon this while trying to send a very small ZIP file.  I've attached
a  copy of  the  zip file  for inspection  and  verification of  the bug.   I'm
currently using "nmh 1.0.4".  I've reviewed  the latest version of the code and
this bug still appears to exist.   I've also searched through the mailing lists
and bug lists and don't see any mention of this issue.  Hopefully I'm not being
redundant.

The problem  is the use  of the "fgets"  function for reading files  inside the
"mhbuildsbr.c,scan_content" function.  A few lines  below the fgets call, a for
loop is used to  iterate over all the character read in,  in order to determine
if there are any 8-bit characters:

-------------------------------------------------------------------------------
                for (cp = buffer; *cp; cp++) {
                    if (!isascii (*cp)) {
                        contains8bit = 1;
                        check8bit = 0; /* no need to keep checking */
                    }
                    . . . 
-------------------------------------------------------------------------------

The expression "*cp" is used to terminate the loop.  But if the data happens to
be binary  data, then this  loop will terminate  upon the first  0x00 character
that  is hit.  For  sufficiently large  binary files,  the probability  of this
resulting in  a false  negative for "contains8bit"  is certainly low.   But the
binary file I was trying to send  was only 195 bytes long.  _And_ it happens to
have a  sequence of characters such  that there are no  8-bit characters before
the first occurrence of  a 0x00 character.  So it got detected  as a plain text
document.

I  patched  my version  of  mhbuildsbr.c  by adding  a  wrapper  to fgets  that
determines the  proper number of  characters read in.   Then I changed  the for
loop to  make use of this information  instead of simply looking  for the first
0x00 character.  This is probably an  inefficient solution.  My guess is that a
better solution  would be to  make use  of fread instead,  and do all  the line
detection stuff manually.

I've attached my patch to this email as well. 

To test the bug, include the zip file into a message using something like:

---[msg.txt]-------------------------------------------------------------------
To:
cc:
From: 
Subject:
--------
some text

Attachment: foo.zip
Description: Zip archive

-------------------------------------------------------------------------------

Then run "mhbuild msg.txt" and verify that the file was not included correctly.

Again, nmh is awesome and I hope this hasn't already been covered.

David E. West

Attachment: foo.zip
Description: Zip archive

--- mhbuildsbr.c.orig   Wed May 11 13:44:12 2005
+++ mhbuildsbr.c        Wed May 11 13:46:40 2005
@@ -3599,6 +3599,45 @@
     return OK;
 }
 
+/*
+ * Effects: fgets wrapper that insures that the buffer is always filled with
+ *   0xff characters before the call to fgets.  This is used later on in
+ *   determining exactly how many characters where read into the buffer using
+ *   fgets.  The implimentation here is probably not optimal. 
+ * Modifies: <*buffer>[size]
+ */
+char* myfgets( int *pnumread, char *buffer, int size, FILE* stream ) {
+    char *cp, *fgets_result;
+
+    // Dezero the buffer. 
+    for ( cp = buffer; cp < ( buffer + size ); cp++ ) {
+        *cp = 0xff;
+    }
+
+    // Call the fgets function. 
+    fgets_result = fgets( buffer, size, stream );
+
+    // If at least one character was read. 
+    if ( fgets_result ) {
+        // Scan from right to left looking for the first 0x00 character. 
+        for ( cp = buffer + size - 1; cp > ( buffer - 1 ); cp-- ) {
+            // If we found one,
+            if ( *cp == 0x00 ) {
+                // Calculate the number of characters read. 
+                *pnumread = ( cp - buffer );
+
+                // Quit searching. 
+                break;
+            }
+        }
+    } else {
+        // Zero characters where read in. 
+        *pnumread = 0;
+    }
+
+    // Return the origonal fgets result. 
+    return fgets_result;
+}
 
 /*
  * Scan the content.
@@ -3626,6 +3665,7 @@
     struct text *t;
     FILE *in;
     CE ce = ct->c_cefile;
+    int fgets_len;
 
     /*
      * handle multipart by scanning all subparts
@@ -3716,12 +3756,12 @@
            adios (ce->ce_file, "unable to open for reading");
        len = strlen (prefix);
 
-       while (fgets (buffer, sizeof(buffer) - 1, in)) {
+        while (myfgets( &fgets_len, buffer, sizeof(buffer) - 1, in)) {
            /*
             * Check for 8bit data.
             */
            if (check8bit) {
-               for (cp = buffer; *cp; cp++) {
+                for (cp = buffer; cp < ( buffer + fgets_len ); cp++) {
                    if (!isascii (*cp)) {
                        contains8bit = 1;
                        check8bit = 0;  /* no need to keep checking */
_______________________________________________
Nmh-workers mailing list
Nmh-workers(_at_)nongnu(_dot_)org
http://lists.nongnu.org/mailman/listinfo/nmh-workers
<Prev in Thread] Current Thread [Next in Thread>