fetchmail-friends

[fetchmail] [PATCH] fetching sizes of mails

2003-09-19 02:52:12
Currently, fetchmail gets the sizes of all mails right at the start.
This causes problems when the mailbox holds a very large number of
mails, especially over a flaky connection. The current transaction
looks like this:

IMAP> A0005 SELECT "INBOX"
IMAP< * 10000 EXISTS                    # there are 10000 mails!
IMAP< A0005 OK [READ-WRITE] Ok
IMAP> A0007 FETCH 1:10000 RFC822.SIZE
IMAP< * 1 FETCH (RFC822.SIZE 1000)
IMAP< * 2 FETCH (RFC822.SIZE 2000)
IMAP< * 3 FETCH (RFC822.SIZE 3000)
...
IMAP< * 10000 FETCH (RFC822.SIZE 67890)
IMAP< A0007 OK FETCH completed.

The average size of a line is 33 bytes (including CRLF), so nearly 330
kbytes are transferred here before the first mail is downloaded! If a
socket error occurs in the meantime, all of that bandwidth is wasted.

The situation for POP3 is similar.

POP3> STAT 
POP3< +OK 10000 99319873                # there are 10000 mails!
POP3> LIST 
POP3< +OK POP3 clients that break here, they violate STD53. 
POP3< 1 1000 
POP3< 2 2000 
POP3< 3 3000 
...
POP3< 10000 67890 
POP3< . 

The only difference is that, because the lines are shorter on average,
the total data transferred here before the first mail is around 110
kbytes.
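
As a rough check of the two estimates above, here is a minimal Python
sketch (not part of the patch) that adds up the bytes of each size
listing. It assumes that every mail has the same 4-digit size and that
lines carry no trailing spaces; the function names are made up for
illustration.

```python
def imap_size_line(msgnum, size):
    # One untagged FETCH response line, CRLF included.
    return f"* {msgnum} FETCH (RFC822.SIZE {size})\r\n"


def pop3_list_line(msgnum, size):
    # One line of a multi-line LIST response, CRLF included.
    return f"{msgnum} {size}\r\n"


def listing_bytes(count, line_fn, assumed_size=5000):
    # Assumption: every mail has the same (4-digit) size.
    return sum(len(line_fn(n, assumed_size)) for n in range(1, count + 1))


if __name__ == "__main__":
    for name, fn in (("IMAP", imap_size_line), ("POP3", pop3_list_line)):
        total = listing_bytes(10000, fn)
        print(f"{name}: {total} bytes, {total / 10000:.1f} bytes per line")
```

Under these assumptions the totals land close to the 330 and 110 kbyte
figures; larger mails add a byte per line for every extra digit in the
size.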

There is no need for fetchmail to get all the sizes right at the
start. The attached patch changes this transaction in the following
way:

For IMAP, the following commands are sent (the 'keep' option is assumed
here; the actual transaction will depend on the options in use):

SELECT INBOX
FETCH 1:100 RFC822.SIZE                 # get sizes of first hundred mails only
FETCH 1 RFC822.HEADER
FETCH 1 BODY.PEEK[TEXT]
STORE 1 +FLAGS (\Seen)
...
FETCH 100 RFC822.HEADER
FETCH 100 BODY.PEEK[TEXT]
STORE 100 +FLAGS (\Seen)
FETCH 101:200 RFC822.SIZE               # get sizes of next hundred mails
FETCH 101 RFC822.HEADER
FETCH 101 BODY.PEEK[TEXT]
STORE 101 +FLAGS (\Seen)
...
FETCH 200 RFC822.HEADER
FETCH 200 BODY.PEEK[TEXT]
STORE 200 +FLAGS (\Seen)
FETCH 201:300 RFC822.SIZE               # get sizes of next hundred mails
...
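
The batching above can be sketched in Python as a generator of the
(untagged) commands that follow SELECT. This is only an illustration
of the ordering, not the code in the patch: the function name is
hypothetical, 'keep' is assumed, and each STORE is assumed to apply to
the mail that was just fetched.

```python
def imap_commands(count, fetchsizelimit=100):
    """Yield the commands sent after SELECT for `count` mails ('keep' set)."""
    if count <= 0:
        return
    # A limit of 0 means "get all sizes right at the start".
    batch = count if fetchsizelimit == 0 else fetchsizelimit
    for first in range(1, count + 1, batch):
        last = min(first + batch - 1, count)
        # Sizes are requested for one batch only, just before it is fetched.
        yield f"FETCH {first}:{last} RFC822.SIZE"
        for n in range(first, last + 1):
            yield f"FETCH {n} RFC822.HEADER"
            yield f"FETCH {n} BODY.PEEK[TEXT]"
            yield f"STORE {n} +FLAGS (\\Seen)"


if __name__ == "__main__":
    for command in imap_commands(3, fetchsizelimit=2):
        print(command)
```

With a limit of 0 the generator emits a single FETCH 1:count
RFC822.SIZE first, which corresponds to the existing behaviour.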

POP3 does not support getting the sizes of a range of mails. So, for POP3:

STAT
LIST 1                                  # get size of first mail
RETR 1
LIST 2                                  # get size of second mail
RETR 2
LIST 3                                  # get size of third mail
...
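
When LIST is given a message number, the server answers with a single
line of the form "+OK <msgnum> <size>" (RFC 1939) instead of a
multi-line listing. A minimal Python sketch of reading such a reply
follows; the function name is made up for illustration and this is not
how the patch itself parses the reply.

```python
def parse_list_reply(line):
    """Parse the one-line reply to 'LIST n' and return (msgnum, size)."""
    fields = line.strip().split()
    if len(fields) < 3 or fields[0] != "+OK":
        raise ValueError(f"unexpected LIST reply: {line!r}")
    # Anything after the size is ignored.
    return int(fields[1]), int(fields[2])


if __name__ == "__main__":
    print(parse_list_reply("+OK 2 2000\r\n"))
```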

The advantages of this patch are:

* the first mail is downloaded quickly.

* a huge memory allocation is no longer required to save the sizes and
message codes.

* with 'no fetchall', getting the size of old mails is avoided.

A configurable option "fetchsizelimit" has been added. The default
value is 100 (for POP3, any non-zero value is converted to 1). To get
all sizes right at the start (the current behaviour), fetchsizelimit
should be explicitly set to 0.
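
The rule for the effective value can be written down as a small Python
sketch. The function name and the protocol strings are assumptions
made for illustration; the patch itself may implement this
differently.

```python
def sizes_per_request(protocol, fetchsizelimit=100):
    """Return how many sizes are requested at a time (0 = all at the start)."""
    if fetchsizelimit < 0:
        raise ValueError("fetchsizelimit must not be negative")
    # POP3 cannot ask for a range, so any non-zero limit becomes 1.
    if protocol.upper() == "POP3" and fetchsizelimit != 0:
        return 1
    return fetchsizelimit


if __name__ == "__main__":
    for proto, limit in (("IMAP", 100), ("POP3", 100), ("POP3", 0)):
        print(proto, limit, sizes_per_request(proto, limit))
```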

-- 
Sunil Shetye.

Attachment: fetchmail-6.2.4-partialsize.patch
Description: Text document
