Re: reproducible URLs

1998-09-10 10:29:31

Eight base-46 characters are sufficient to keep the collision
probability minuscule for archives of any reasonable size.

That's still only 44 bits of namespace. I guess it depends on
what you call reasonable risk; to me it feels a little high.

  Risk         # of Messages (approximately)
  1:100,000    18,000
  1:3,000      100,000
  3:100        1,000,000
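
For anyone who wants to check these figures, here is a rough sketch of
the birthday-bound arithmetic behind them. It assumes message names are
drawn uniformly at random from the namespace, and it uses the usual
approximation p = 1 - exp(-n(n-1)/2N). The function names are mine, for
illustration only; nothing here is MHonArc code.

```python
import math


def namespace_bits(num_chars: int, alphabet_size: int) -> float:
    """Bits of namespace provided by num_chars symbols from an alphabet."""
    return num_chars * math.log2(alphabet_size)


def collision_probability(num_messages: int, bits: float = 44.0) -> float:
    """Approximate chance that at least two messages share a name.

    Assumes names are uniformly random over 2**bits possible values
    (birthday approximation: p = 1 - exp(-n(n-1) / 2N)).
    """
    namespace_size = 2.0 ** bits
    pairs = num_messages * (num_messages - 1) / 2.0
    # -expm1(-x) computes 1 - exp(-x) without losing precision for tiny x.
    return -math.expm1(-pairs / namespace_size)


if __name__ == "__main__":
    print(f"8 base-46 characters: {namespace_bits(8, 46):.1f} bits")
    for n in (18_000, 100_000, 1_000_000):
        p = collision_probability(n)
        print(f"{n:>9,} messages: roughly 1 in {1 / p:,.0f}")
```

The results land close to the rounded odds in the table, and the risk
grows with the square of the archive size, which is why going from
100,000 to 1,000,000 messages costs about a factor of 100.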

A 100,000 message archive seems two orders of magnitude too high for
MHonArc's basic design; anything that large using a filesystem as its
database needs to be organized hierarchically.  That would add a
subdirectory namespace into the quota.

Two orders of magnitude? I am running two archives that will exceed
100,000 messages within the next two years at their current growth
rate. Their current size, 50,000 messages apiece, works fine under
the ext2 (Linux native) filesystem. I think a statistical limit of one
million is a better target, as it reflects the largest lists out there
stored over many years.

While many filesystems bog down with a large number of files in a
particular directory, not all do. Performance with lots of files in a
directory is not an inherent problem; it is directly tied to the
design of the filesystem.

An argument could be made that it doesn't make sense to compensate
for broken filesystems, whether due to some crazy 8.3 namespace
limitation, or due to braindead performance with lots of files. The
place for the fix would be in the filesystem and/or underlying OS, not
MHonArc. (Kind of like it didn't really make sense to convolute Java
applet code, just so the applet would work on a broken Netscape 2.01

