mhonarc-dev

[Bug #2971] spammode option interferes with iso-2022-jp

2003-03-28 22:26:21

=================== BUG #2971: LATEST MODIFICATIONS ==================
http://savannah.nongnu.org/bugs/?func=detailbug&bug_id=2971&group_id=1968

Changes by: Earl Hood <earl(_at_)earlhood(_dot_)com>
Date: Fri 03/28/03 at 23:26 (US/Central)

            What     | Removed                   | Added
---------------------------------------------------------------------------
            Severity | 5 - Major                 | 1 - Ordinary
          Resolution | None                      | Later
    Platform Version | Linux                     | All


------------------ Additional Follow-up Comments ----------------------------
[Limitation]
Currently, there are workarounds for this.  SPAMMODE
is just a convenience for setting other resources.  For example,
ADDRESSMODIFYCODE can be set specifically to work around
iso-2022-jp encoding.

As for Unicode, the TEXTENCODE resource allows a user
to preconvert all data to UTF-8.  However, as the original
report suggests, Unicode is not used as an intermediate
format for processing, with the ability to then re-encode
to another format when writing pages.  I'd prefer to avoid
round-tripping at this time, but it is something to consider
for the future.
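
For illustration only, the round-trip idea discussed above (decode to
Unicode, process, re-encode) might look like the following Python sketch.
It is not how MHonArc works, and the pattern and function names are
hypothetical.  The point it shows is that an '@' byte inside an
iso-2022-jp multi-byte character disappears once the data is decoded, so
only genuine '@' characters remain visible to an address filter.

```python
import re

# Hypothetical, deliberately naive address pattern (not MHonArc's own).
ADDR = re.compile(r"[\w.+-]+@[\w.-]+")


def mask_addresses(text: str) -> str:
    """Replace the part after '@' in each match with 'x' characters."""
    def _hide(match):
        user, domain = match.group(0).split("@", 1)
        return user + "@" + "x" * len(domain)
    return ADDR.sub(_hide, text)


def mask_via_unicode(raw: bytes, charset: str = "iso2022_jp") -> bytes:
    """Decode to Unicode, mask addresses, then re-encode (the round trip)."""
    text = raw.decode(charset)
    return mask_addresses(text).encode(charset)


# The JIS X 0208 code for the hiragana "だ" is 0x24 0x40, so its second
# byte is the same as ASCII '@'.
raw = "そうだ user@example.org".encode("iso2022_jp")
text = raw.decode("iso2022_jp")

print(raw.count(b"@"), text.count("@"))   # more '@' bytes than '@' characters
print(mask_via_unicode(raw).decode("iso2022_jp"))
```

The cost is the one noted above: every message is decoded and re-encoded,
and any character that does not survive the round trip would need separate
handling.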

Technically, I would consider this bug a limitation of
SPAMMODE, hence I dropped the severity, since Japanese users
can explicitly set related resources to work around the
problem.



=================== BUG #2971: FULL BUG SNAPSHOT ===================


Submitted by: kkawa                   Project: MHonArc                      
Submitted on: Thu 03/27/03 at 22:33
Category:  Character Sets             Severity:  1 - Ordinary               
Bug Group:  Undesired Behavior        Resolution:  Later                    
Assigned to:  None                    Status:  Open                         
Platform Version:  All                Perl Version:  5.6.0                  
Component Version:  2.6.2             Fixed Release:                        

Summary:  spammode option interferes with iso-2022-jp

Original Submission:  The iso-2022-jp encoding is the most commonly used 
encoding in Japan.
The problem is that the encoded text often contains the '@' mark,
which apparently triggers the spam mode filter and as a result,
surrounding characters will be incorrectly replaced by 'x'.


A "proper" fix would probably require all of MHonArc to
work on Unicode: the entire processing pipeline should operate
on Unicode rather than on raw input. But this change might be
too big.

The easy change is perhaps for the spammode to be made smarter.
A typical iso-2022-jp sequence is something like "B@n8}9L2p",
so if you require a dot on the right-hand side of the address,
this problem can be avoided.
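
As a rough sketch of the suggestion above, the following Python snippet
compares a naive address pattern with one that requires a dot on the
right-hand side.  Both patterns and the mask() helper are hypothetical
and are not MHonArc's actual expressions; the sample string is the one
from this report, assumed to contain a literal '@'.

```python
import re

# Hypothetical patterns for illustration; not MHonArc's own expressions.
NAIVE_ADDR = re.compile(r"[\w.+-]+@[\w.-]+")
# Stricter: the right-hand side must contain at least one dot-separated label.
DOTTED_ADDR = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def mask(text: str, pattern: re.Pattern) -> str:
    """Replace the part after '@' in each match with 'x' characters."""
    def _hide(match):
        user, domain = match.group(0).split("@", 1)
        return user + "@" + "x" * len(domain)
    return pattern.sub(_hide, text)


sample = "B@n8}9L2p"                       # 7-bit iso-2022-jp payload
real = "write to user@example.org today"   # a genuine address

print(mask(sample, NAIVE_ADDR))    # payload is altered
print(mask(sample, DOTTED_ADDR))   # payload is left alone
print(mask(real, DOTTED_ADDR))     # real address is still masked
```

Requiring a dot reduces false matches but cannot rule them out, since
'.' is itself a byte value that can occur inside iso-2022-jp multi-byte
text.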


Follow-up Comments
*******************

-------------------------------------------------------
Date: Fri 03/28/03 at 23:26         By: ehood
[Limitation]
Currently, there are workarounds for this.  SPAMMODE
is just a convenience for setting other resources.  For example,
ADDRESSMODIFYCODE can be set specifically to work around
iso-2022-jp encoding.

As for Unicode, the TEXTENCODE resource allows a user
to preconvert all data to UTF-8.  However, as the original
report suggests, Unicode is not used as an intermediate
format for processing, with the ability to then re-encode
to another format when writing pages.  I'd prefer to avoid
round-tripping at this time, but it is something to consider
for the future.

Technically, I would consider this bug a limitation of
SPAMMODE, hence I dropped the severity, since Japanese users
can explicitly set related resources to work around the
problem.


CC list is empty


No files currently attached


For detailed info, follow this link:
http://savannah.nongnu.org/bugs/?func=detailbug&bug_id=2971&group_id=1968

---------------------------------------------------------------------
To sign-off this list, send email to majordomo(_at_)mhonarc(_dot_)org with the
message text UNSUBSCRIBE MHONARC-DEV
