namazu-users-en

Re: Symbol matching

To: namazu-users-en@namazu.org
Subject: Re: Symbol matching
From: knok@daionet.gr.jp (NOKUBI Takatsugu)
Date: Mon, 7 May 2001 10:09:48 JST
Reply-to: namazu-users-en@namazu.org
Message-id: <200105070352.MAA27220@ns1.eal.or.jp>
In article <Pine.LNX.4.21.0105041552380.31987-100000@bigears.ncst.ernet.in>
philip@konark.ncst.ernet.in writes:

However, looking through the source, I see no evidence of this
happening.  Could anyone provide pointers on where this would be done?

This is handled by the wordcount_sub() function in mknmz.

sub wordcount_sub ($$\%) {
    my ($text, $weight, $word_count) = @_;

    # Count frequencies of words in a current document.
    # Handle symbols as follows.
    #
    # tcp/ip      ->  tcp/ip,     tcp,      ip
    # (tcp/ip)    ->  (tcp/ip),   tcp/ip,   tcp, ip
    # ((tcp/ip))  ->  ((tcp/ip)), (tcp/ip), tcp
    #
    # Don't do processing for nested symbols.
    # NOTE: When -K is specified, all symbols are already removed.
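
To make the first two rows of that table concrete, here is a minimal
standalone sketch. It is not the code from mknmz: symbol_variants() is a
hypothetical helper written only for illustration, it assumes lowercase
ASCII input, and it does not try to reproduce the special handling of
nested symbols shown in the third row.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of mknmz: list the index terms that the
# comment table suggests for one whitespace-delimited token.
# Assumption: the token is already lowercase ASCII.
sub symbol_variants {
    my ($token) = @_;
    my @variants = ($token);    # the token itself is always kept

    # Strip non-word characters from both ends: "(tcp/ip)" -> "tcp/ip".
    (my $trimmed = $token) =~ s/^[^a-z0-9_]+|[^a-z0-9_]+$//g;
    push @variants, $trimmed if $trimmed ne '' && $trimmed ne $token;

    # Split on the inner symbols: "tcp/ip" -> "tcp", "ip".
    my @parts = grep { $_ ne '' } split /[^a-z0-9_]+/, $trimmed;
    push @variants, @parts if @parts > 1;

    return @variants;
}

print join(', ', symbol_variants($_)), "\n" for 'tcp/ip', '(tcp/ip)';
```

Run as is, this prints "tcp/ip, tcp, ip" and then
"(tcp/ip), tcp/ip, tcp, ip", matching the first two rows of the table.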
-- 
NOKUBI Takatsugu
E-mail: knok@daionet.gr.jp
        knok@namazu.org / knok@debian.org

