Re: [Nmh-workers] Non-ASCII Characters in bodies and subjects

Hello Ken,

The Unix kernel stores filenames as a run of bytes, not including
`/' and NUL.


That's not universally true anymore.  Some newer filesystems are
mandating that filenames are UTF-8 and enforcing normalization rules
(MacOS X and Solaris are two notable examples).


Thanks, I didn't know.  Haven't used Solaris in years, and never bought
Apple.

The only way of resolving this is to use the normalization rules for
Unicode and do filename searching that way;


Sure.

MacOS X actually rewrites all of the filenames using Normalization
Form D (all characters in decomposed form, which means the regular
character followed by the combining accents) and I think that sucks,
but they didn't ask me.


I think I agree with you.
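
For anyone who wants to see what Normalization Form D does to a name, here is a small Python sketch. It is only an illustration using the standard unicodedata module, not anything taken from Mac OS X itself, and the variable names are made up for the example.

```python
import unicodedata

# Precomposed (NFC) spelling of "résumé": each é is the single code point U+00E9.
name = "r\u00e9sum\u00e9"

# NFD splits each é into a plain "e" followed by U+0301 COMBINING ACUTE ACCENT.
nfd = unicodedata.normalize("NFD", name)

print([f"{ord(c):04X}" for c in nfd])
print(len(name), len(nfd))                                    # 6 8 (code points)
print(len(name.encode("utf-8")), len(nfd.encode("utf-8")))    # 8 10 (bytes)

# Normalizing back to NFC recovers the precomposed spelling.
print(unicodedata.normalize("NFC", nfd) == name)              # True
```

The two spellings look identical on screen but differ in both code-point count and byte count, which is why a byte-oriented kernel treats them as different filenames.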

Solaris is better; the original bytes are preserved, but lookup is
done using normalized names so you can't have two filenames with the
same characters.
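
The "preserve the bytes, compare the normalized form" behaviour described above can be sketched as a toy directory in Python. This is an assumption-laden model, not Solaris's implementation: the Directory class, its method names, and the choice of NFD as the comparison form are all hypothetical.

```python
import unicodedata


class Directory:
    """Toy directory: stores names byte-for-byte, compares them normalized."""

    def __init__(self):
        self._entries = {}  # normalized name -> original bytes as created

    @staticmethod
    def _key(raw: bytes) -> str:
        # Assumption: names are valid UTF-8; invalid bytes raise UnicodeDecodeError.
        return unicodedata.normalize("NFD", raw.decode("utf-8"))

    def create(self, raw: bytes) -> bool:
        key = self._key(raw)
        if key in self._entries:
            return False  # an equivalent name already exists
        self._entries[key] = raw
        return True

    def lookup(self, raw: bytes):
        return self._entries.get(self._key(raw))

    def listing(self):
        return list(self._entries.values())


d = Directory()
nfc = "r\u00e9sum\u00e9".encode("utf-8")    # precomposed spelling
nfd = "re\u0301sume\u0301".encode("utf-8")  # decomposed spelling

print(d.create(nfc))         # True
print(d.create(nfd))         # False: same characters, different bytes
print(d.lookup(nfd) == nfc)  # True: found, and the original bytes come back
```

The point of the design is that a listing returns exactly what the creator wrote, while two canonically equivalent spellings can never coexist.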


What about globbing, especially on Mac OS X?  Given your two examples on
Linux with bash,

    $ touch résumé résumé
    $ ls r?sum?
    résumé
    $ ls r?sum? | recode ..dump
    UCS2   Mne   Description

    0072   r     latin small letter r
    00E9   e'    latin small letter e with acute
    0073   s     latin small letter s
    0075   u     latin small letter u
    006D   m     latin small letter m
    00E9   e'    latin small letter e with acute
    000A   LF    line feed (lf)
    $
    $ ls r??sum??
    résumé
    $

Do you think NFKC would be better, so ? often matches what appears as a
single rune and fi matches ligature ﬁ?
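
To make the question concrete, here is a rough Python sketch of matching after NFKC normalization. It uses the standard fnmatch and unicodedata modules; nfkc_match is a made-up helper for illustration, not something bash or any filesystem provides.

```python
import fnmatch
import unicodedata


def nfkc_match(name: str, pattern: str) -> bool:
    """Glob-match after normalizing both name and pattern to NFKC."""
    norm_name = unicodedata.normalize("NFKC", name)
    norm_pattern = unicodedata.normalize("NFKC", pattern)
    return fnmatch.fnmatchcase(norm_name, norm_pattern)


decomposed = "re\u0301sume\u0301"  # "résumé" with combining accents

# Plain matching sees 8 code points, so "?" cannot cover "e" plus its accent.
print(fnmatch.fnmatchcase(decomposed, "r?sum?"))  # False

# After NFKC each e + accent composes to one code point.
print(nfkc_match(decomposed, "r?sum?"))           # True

# Compatibility normalization expands the U+FB01 ligature to "fi".
print(nfkc_match("\ufb01le", "fi*"))              # True

# "Often", not "always": x + combining acute has no precomposed form.
print(nfkc_match("x\u0301", "?"))                 # False
```

The last case is why ? would only often match what appears as a single rune: a base letter with no precomposed accented form stays as two code points under any normalization form.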

Cheers, Ralph.

_______________________________________________
Nmh-workers mailing list
Nmh-workers(_at_)nongnu(_dot_)org
https://lists.nongnu.org/mailman/listinfo/nmh-workers
