[Namazu-users-en] Re: html data not indexed in text/html mails
2005-08-05 00:10:44
Hello all,
Sorry for yet another mail.
I just have one small question: when will the new version of Namazu
(namazu-2.0.15) be formally released?
Thanks and regards,
Swati
swati wrote:
Hello,
Thank you very much for that patch. I am now able to index and search
those text/html mails, and everything seems to work fine.
Thanks again
Regards,
Swati
Yukio USUDA wrote:
swati wrote:
Hello all,
This is a sample mail that I was trying to index and search.
snip
In this mail I am able to search for words like rober and streams, which
exist in the header part. But words like fear, member, or primers, which
exist inside the HTML part of the mail, are not indexed and cannot be searched.
I also tried the new version of Namazu (namazu-2.0.15pre1), but with that I am
still not able to index or search this type of mail.
Can anyone suggest how I can get these mails indexed and searchable as well?
I made a patch for this type of mail (against namazu-2.0.15pre1).
bash$ diff -ub filter/mailnews.pl.org filter/mailnews.pl
--- filter/mailnews.pl.org Mon Jun 6 14:41:42 2005
+++ filter/mailnews.pl Thu Aug 4 21:13:53 2005
@@ -65,7 +65,7 @@
  util::vprint("Processing mail/news file ...\n");
  uuencode_filter($cont);
- mailnews_filter($cont, $weighted_str, $fields);
+ mailnews_filter($cont, $weighted_str, $headings, $fields);
  mailnews_citation_filter($cont, $weighted_str);
  gfilter::line_adjust_filter($cont);
@@ -79,11 +79,12 @@
  # Original of this code was contributed by <furukawa(_at_)tcp-ip(_dot_)or(_dot_)jp>.
  sub mailnews_filter ($$$) {
- my ($contref, $weighted_str, $fields) = @_;
+ my ($contref, $weighted_str, $headings, $fields) = @_;
  my $boundary = "";
  my $line = "";
  my $partial = 0;
+ my $htmlmail = "";
  $$contref =~ s/^\s+//;
  # Don't handle if first like does'nt seem like a mail/news header.
@@ -125,6 +126,10 @@
  # contributed by Hiroshi Kato <tumibito(_at_)mm(_dot_)rd(_dot_)nttdata(_dot_)co(_dot_)jp>
  $partial = $1;
  util::dprint("((partial: $partial))\n");
+ } elsif ($line =~ m!text/html!i) {
+ # The simplest form of an HTML email message.
+ util::dprint("text/html mail\n");
+ $htmlmail = "yes";
  } elsif ($line !~ m!text/plain!i) {
  $$contref = '';
  return;
@@ -161,6 +166,9 @@
  multipart_process($contref, $boundary, $weighted_str, $fields);
  }
+ if ($htmlmail) {
+ html::html_filter($contref, $weighted_str, $fields, $headings);
+ }
  }
  # Prototype declaration for avoiding
Yukio USUDA
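
For readers following the patch, here is a rough, standalone sketch of the Content-Type decision that the third hunk changes. It is not Namazu code: classify_content_type is a hypothetical helper invented for illustration, and the multipart and message/partial branches that precede this test in the real filter are left out. It only shows the order of the two regex tests, assuming the header line is passed in as a plain string.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Namazu): mirrors the order of the two
# regex tests in the patched elsif chain of mailnews_filter.
sub classify_content_type {
    my ($line) = @_;
    if ($line =~ m!text/html!i) {
        # New branch added by the patch: remember that the body is HTML
        # so that it can be passed through the HTML filter later.
        return "html";
    } elsif ($line !~ m!text/plain!i) {
        # Pre-existing branch: any other non-plain type is dropped.
        return "skip";
    }
    # text/plain falls through and is handled as before.
    return "plain";
}

for my $header ('Content-Type: text/html; charset="us-ascii"',
                'Content-Type: text/plain',
                'Content-Type: application/pdf') {
    print "$header => ", classify_content_type($header), "\n";
}
```

The point of testing text/html before the negated text/plain test is that, before the patch, a text/html mail matched the "not text/plain" branch and its content was emptied, which is why words in the HTML body were never indexed.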
------------------------------------------------------------------------
_______________________________________________
Namazu-users-en mailing list
Namazu-users-en(_at_)namazu(_dot_)org
http://www.namazu.org/cgi-bin/mailman/listinfo/namazu-users-en