namazu-users-en

[Namazu-users-en] Fix for odd sorting of search results

2007-04-10 21:45:54

Tadamasa Teranishi <yw3t-trns(_at_)asahi-net(_dot_)or(_dot_)jp> [2007-04-09 23:05]:
rex wrote:
 
Now, I'm trying to understand why sorting by date appears not to work
correctly with files created with MHonArc and indexed with mknmz with
the --mhonarc option.

Please search the mailing-list archive for the earlier discussion of
sorting on the UTC field.

Date sorting uses the time stamp of each file. It does not use the
date displayed in the search results.

After reading the thread I understand what it does, but not why it does
that with mbox-type files (especially if they have been converted with
MHonArc, which creates a new file with a new time stamp).


This is how I fixed it:

Added utc to .mknmzrc

# $SEARCH_FIELD = "utc|message-id|subject|from|date|uri|newsgroups|to|summary|size";


Added
<option value="field:utc:descending">by date in late order</option>
<option value="field:utc:ascending">by date in early order</option>

to NMZ.head


Ran mailutime (as root) to set each message's modification time to the
time in its Date: field.

/usr/local/src/namazu-2.0.17/scripts/mailutime /srv/www/htdocs/ffarchive/ff_msgs/*

There were many warnings about "...is not rfc822 format! trying fuzzy
mode..." The timestamps were changed, but the time order does
not match the message order. That's not surprising considering that the
messages came from a Yahoogroups list, and messages from them often
arrive out of order.

ls -alF

[...]
... 4948 2007-01-30 20:59 msg01090.html  Date: Tue, 30 Jan 2007 23:59:35 EST
... 5119 2007-01-30 21:05 msg01091.html  Date: Wed, 31 Jan 2007 00:05:07 EST
... 8411 2007-01-31 06:21 msg01092.html  Date: Wed, 31 Jan 2007 06:21:47 -0000
... 2857 2007-01-31 19:05 msg01093.html  Date: Wed, 31 Jan 2007 19:05:38 -0000
... 4687 2007-01-31 16:55 msg01094.html  Date: Wed, 31 Jan 2007 16:55:22 -0800
... 4848 2007-01-31 18:54 msg01095.html  Date: Wed, 31 Jan 2007 18:54:28 -0800
... 4530 2007-01-31 19:12 msg01096.html  Date: Wed, 31 Jan 2007 19:12:59 -0800
... 3868 2007-01-31 19:16 msg01097.html  Date: Wed, 31 Jan 2007 19:16:20 -0800
[...]

Also, the timestamp does not always match the time in the message. I've manually
added the Date line contained in each message above to make it easy to
compare them. Looking at the results, it appears the program is
adjusting the timestamp to PST (which is my local time). That's good if
the program does it correctly.
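
The comparison above could be automated rather than done by hand. The
following is only a sketch under several assumptions: GNU date is
available (for -r and -d), and MHonArc has stored the original header in
an "<!--X-Date: ... -->" comment with "-" escaped as "&#45;". The
function name compare_dates is made up for illustration. It compares at
minute resolution, in whatever time zone TZ selects.

```shell
#!/bin/sh
# compare_dates FILE...
# Print a status, the file's mtime, the local time implied by its Date
# header, and the file name. Hypothetical helper, not part of Namazu.
compare_dates() {
    for f in "$@"; do
        # First X-Date comment, with MHonArc's "&#45;" escape decoded.
        hdr=$(sed -n 's/^<!--X-Date: \(.*\) -->$/\1/p' "$f" \
              | head -n 1 | sed 's/&#45;/-/g')
        mtime=$(date -r "$f" '+%Y-%m-%d %H:%M')
        if [ -z "$hdr" ]; then
            want='no header'
        else
            # GNU date converts the header time to the local zone.
            want=$(date -d "$hdr" '+%Y-%m-%d %H:%M' 2>/dev/null) \
                || want='unparsed'
        fi
        if [ "$mtime" = "$want" ]; then status=ok; else status=DIFF; fi
        printf '%s  %s  %s  %s\n' "$status" "$mtime" "$want" "$f"
    done
}

# Example: compare_dates msg010*.html
```

Lines marked DIFF are the ones where the file time and the header time
disagree once the header's zone offset is taken into account.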

Unfortunately, mailutime does not recurse into subdirectories, so it was
necessary to cd into each month's directory to change the timestamps.
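
Something like the following might avoid the per-directory cd by letting
find walk the tree. It is a sketch, not a tested recipe: fix_tree is a
made-up name, and the msg*.html pattern is an assumption based on the
file names in the listing above. The command to run is passed as
arguments, so nothing is assumed about mailutime beyond the fact that it
accepts a list of files, as in the invocation shown earlier.

```shell
#!/bin/sh
# fix_tree ROOT COMMAND [ARGS...]
# Run COMMAND on every msg*.html file at any depth below ROOT.
# Hypothetical helper; the file-name pattern is an assumption.
fix_tree() {
    root=$1
    shift
    # "{} +" hands many file names to each invocation of the command.
    find "$root" -type f -name 'msg*.html' -exec "$@" {} +
}

# Example (as root, paths as used in this message):
# fix_tree /srv/www/htdocs/ffarchive/ff_msgs \
#     /usr/local/src/namazu-2.0.17/scripts/mailutime
```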

Then I created new index files:

/srv/www/htdocs/ffarchive/ff_msgs # mknmz -O /tmp/mknmz --mhonarc *

It now appears that sorts in chronological and reverse chronological
order work correctly.

Thanks,

-rex



_______________________________________________
Namazu-users-en mailing list
Namazu-users-en(_at_)namazu(_dot_)org
http://www.namazu.org/cgi-bin/mailman/listinfo/namazu-users-en
