perl-unicode

Re: byte order mark

1999-10-06 15:47:08
On Wed, 6 Oct 1999 17:39:15 -0400 (EDT), Ilya Zakharevich
<ilya(_at_)math(_dot_)ohio-state(_dot_)edu> wrote:

That's what some newer file systems let you achieve: green bits.  On
*nixish systems each file has its "contents", plus several additional
bits (which are not intermixed with the contents in any way), like
timestamps and permission bits.  "Green bits" are additional arbitrary
bits of data which you may associate with a file, but which do not show
up in the "contents" (there is a separate API to access them).

Say, on HPFS you can associate an arbitrary hash (a set of key/value
pairs) with an arbitrary file (similarly with FAT-under-OS/2).  I think
DGUX has something similar in its file systems.
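
To make the "separate API" point concrete, here is a minimal sketch in
Python.  It uses Linux extended attributes as a stand-in, which is an
assumption on my part: they are an analogue of the HPFS attributes
described above, not the same mechanism.  The helper names
roundtrip_attribute and demo, and the attribute name "user.key1", are
made up for illustration.  Where the platform or file system has no
such support, the sketch returns None rather than failing.

```python
import os
import tempfile


def roundtrip_attribute(path, name, value):
    """Attach `value` (bytes) to `path` under `name`, then read it back.

    Returns the bytes read back, or None when extended attributes are
    not available (non-Linux platform, or a file system without them).
    """
    if not hasattr(os, "setxattr"):
        return None
    try:
        os.setxattr(path, name, value)
        return os.getxattr(path, name)
    except OSError:
        return None


def demo():
    """Write a small file, attach one attribute, report (attribute, size)."""
    with tempfile.NamedTemporaryFile() as f:
        f.write(b"Some text\n")
        f.flush()
        attr = roundtrip_attribute(f.name, "user.key1", b"This is key1")
        # The size only counts the ordinary contents, not the attribute.
        size = os.path.getsize(f.name)
    return attr, size
```

The attribute is written and read through calls that never touch the
file's ordinary contents, so the reported size stays at the length of
"Some text\n" whether or not the attribute could be stored.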

"Multiple streams" are also a standard feature of NTFS (son-of-HPFS),
although few people seem to know about it:

    Microsoft(R) Windows NT(TM)
    (C) Copyright 1985-1996 Microsoft Corp.

    H:\>cat > test
    Some text
    ^Z
    H:\>cat > test:key1
    This is key1
    ^Z
    H:\>cat > test:key2
    The second key
    ^Z

    H:\>cat test
    Some text
    H:\>cat test:key1
    This is key1
    H:\>cat test:key2
    The second key

    H:\>ls -l test
    -rw-r--r--   1 544      everyone       11 Oct  6 23:20 test

One "problem" with this is that standard tools don't account for the size
taken by these additional streams (see C<ls> above).  This wasn't a
problem with permission bits and timestamps because those were just a
few bytes.  But the streams can be arbitrarily large.  A more serious
problem is that many tools don't preserve the additional streams.
Microsoft's C<copy> copies them:

    H:\>copy test xyzzy
            1 Datei(en) kopiert.
    H:\>cat xyzzy
    Some text
    H:\>cat xyzzy:key1
    This is key1
    H:\>cat xyzzy:key2
    The second key


But stdio knows nothing about them:

    H:\>cp test abc
    H:\>cat abc
    Some text
    H:\>cat abc:key1
    cat: abc:key1: No such file or directory

Same problems with zip and tar :-(
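
A tool that wants to keep such out-of-band data has to copy it
explicitly, in addition to the contents.  The following is a rough
sketch of that idea in Python, again using Linux extended attributes as
an analogue (an assumption; it does not handle NTFS streams).  The
function name copy_preserving_attributes is hypothetical.  On platforms
without this API it degrades to a plain contents-only copy, which is
exactly the behaviour of C<cp> shown above.

```python
import os
import shutil


def copy_preserving_attributes(src, dst):
    """Copy the contents of `src` to `dst`, then any extended attributes.

    Returns the list of attribute names that were copied.  The list is
    empty when the platform or file system offers no such attributes.
    """
    # Step 1: the ordinary contents, which is all a naive copy does.
    shutil.copyfile(src, dst)

    copied = []
    if not hasattr(os, "listxattr"):
        return copied
    try:
        names = os.listxattr(src)
    except OSError:
        return copied

    # Step 2: the out-of-band data, one attribute at a time.
    for name in names:
        try:
            os.setxattr(dst, name, os.getxattr(src, name))
            copied.append(name)
        except OSError:
            # Some attributes cannot be set by ordinary users; skip them.
            continue
    return copied
```

Python's own shutil.copy2 is documented to carry extended attributes
across on Linux where possible, so in practice one would probably reach
for that first; the sketch only spells out the two separate steps.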

They would still be useful to hold e.g. the compiled bytecode of Perl
modules (just like OS/2 stores compiled REXX code in extended attributes).
But to keep "precious" out-of-band information they need much better
general support.

-Jan
