nmh-workers

Re: [Nmh-workers] GCC 8 pre-releases have escaped...

2018-02-07 20:42:53
On Wed, 07 Feb 2018 13:39:50 -0500 Ken Hornstein <kenh@pobox.com> wrote:
Ken Hornstein writes:
On creating a library of routines for other programs, that is
worth it only for things where performance is critical (for
other uses access via popen() may be good enough).

We have had people express an interest in that before; one problem
we run into is that reading a large mailbox just takes a while even
if the directory is cached in memory, because of the small buffer size used
by the readdir() call.

Such a
library can be written so that it can be used from programs
that use gc or not. BTW, gc can be used for leak detection too
(not sure how it compares with valgrind).

I am not sure how you could feasibly do this, but it's possible I am
misunderstanding things.  A LOT of nmh's work happens in the library;
what is defined as the "library" has changed a bit over time; for instance
the MIME routines are technically not part of libmh.a, but our plan is
to move that into there when we get full MIME integration.  A lot of the
issues in terms of defining memory allocation rules are not between
library and nmh programs, but between different parts of the library.

As I understand it, we either go "full memory management" model, where
we explicitly define who owns what memory and free() appropriate memory
at the right points, or basically malloc() to our heart's content and
let the GC system take care of cleaning it all up.  To me the
advantage of going the GC route is we don't need to do all of the hard
work with explicit memory management; we can just allocate memory
and the garbage collector takes care of it all.  But if we support
BOTH, well ... we still have to do all of the hard work of defining
memory allocating policies and calling free() (I'm not sure if you're
thinking we should make GC use optional).  So I'm wondering what the gain is; now
if you were thinking that we should just use explicit memory management
in the library and just use GC in the user programs, okay, that's
fine, and as I understand it nothing we are planning to do precludes any
of that.

Now if I am misunderstanding anything, then please correct me; I would
like to be sure my understanding is correct.

What I meant is this: Boehm GC provides plugin replacements
for malloc() & free(). You just link with the Boehm gc library
before libc. The main difference is that GC_malloc() will
attempt to reclaim inaccessible space automatically.

So your portable library can continue using malloc/free and it
should be usable with or without gc. You just have to
ensure it has no internal memory leaks and provides an API so
users can avoid leaking memory. I think you have to do this in
any case.
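
A minimal sketch of what such a library might look like, under stated
assumptions: the names msg_list, msg_list_new(), msg_list_add(),
msg_list_free() and the USE_BOEHM_GC build flag are hypothetical and are
not part of nmh. It also selects the allocator with compile-time macros
rather than by link order, which is a different mechanism from the one
described above. The point is only that the library frees everything it
allocates and gives callers one function to release an object, so it
behaves the same with or without a collector.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifdef USE_BOEHM_GC               /* hypothetical build flag */
#include <gc.h>                   /* the program should call GC_INIT() at startup */
#define lib_alloc(n)      GC_MALLOC(n)
#define lib_realloc(p, n) GC_REALLOC((p), (n))
#define lib_free(p)       GC_FREE(p)
#else
#include <stdlib.h>
#define lib_alloc(n)      malloc(n)
#define lib_realloc(p, n) realloc((p), (n))
#define lib_free(p)       free(p)
#endif

/* A growable list of strings; the list owns copies of everything added. */
struct msg_list {
    char **items;
    size_t count;
    size_t cap;
};

struct msg_list *msg_list_new(void)
{
    struct msg_list *l = lib_alloc(sizeof *l);
    if (l == NULL)
        return NULL;
    l->items = NULL;
    l->count = 0;
    l->cap = 0;
    return l;
}

/* Copies s into the list. Returns 0 on success, -1 on allocation failure. */
int msg_list_add(struct msg_list *l, const char *s)
{
    assert(l != NULL && s != NULL);
    if (l->count == l->cap) {
        size_t ncap = l->cap ? l->cap * 2 : 4;
        char **n = lib_realloc(l->items, ncap * sizeof *n);
        if (n == NULL)
            return -1;
        l->items = n;
        l->cap = ncap;
    }
    size_t len = strlen(s) + 1;
    char *copy = lib_alloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, s, len);
    l->items[l->count++] = copy;
    return 0;
}

/* The one call a non-GC caller needs in order not to leak. */
void msg_list_free(struct msg_list *l)
{
    if (l == NULL)
        return;
    for (size_t i = 0; i < l->count; i++)
        lib_free(l->items[i]);
    lib_free(l->items);
    lib_free(l);
}
```

A caller pairs each msg_list_new() with a msg_list_free(); in a GC build
the explicit free is still valid and merely returns the memory earlier.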

Now if this library is huge and complicated and you want to
use it with non GC code, then yes, in a sense you won't really
benefit from gc since making it leakproof requires all the
analysis you are talking about.

To find out if this is the case I did a git pull and rebuilt
everything.  I see there are mts/libmta.a and sbr/libmh.a
libraries. Corresponding .c files add up to about 29K lines.
uip/*.c is another 44K lines. That is a lot of code for manual
analysis. And valgrind or gc will only point out those leaks
they encounter during a run (as Dijkstra said, testing reveals
the presence of bugs but never their absence).

This is why I think it makes sense to a) just use boehm gc, b)
not expose libmh to third party apps (or insist they use gc),
c) provide a small shallow library that popens as needed. If
you are touching lots of files, the cost of popen will not be
significant.
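
To make (c) concrete, here is a rough sketch of the kind of shallow
wrapper I have in mind. The names run_lines(), line_cb and
count_nonempty() are made up for illustration; nothing like this exists
in nmh today. It runs a command through popen(), hands each output line
to a callback, and owns no long-lived memory, so the caller's choice of
gc or malloc/free does not matter. In real use the command string would
be an nmh command such as scan.

```c
#define _POSIX_C_SOURCE 200809L   /* for popen() and pclose() */
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Called once per line of output, with the trailing newline removed. */
typedef void (*line_cb)(const char *line, void *ctx);

/*
 * Runs cmd via the shell and feeds its standard output to cb.
 * Returns the number of chunks delivered, or -1 if the command could
 * not be started or exited with a non-zero status.
 * Lines longer than the buffer are delivered in several chunks.
 */
int run_lines(const char *cmd, line_cb cb, void *ctx)
{
    assert(cmd != NULL && cb != NULL);

    FILE *fp = popen(cmd, "r");
    if (fp == NULL)
        return -1;

    char buf[4096];
    int n = 0;
    while (fgets(buf, sizeof buf, fp) != NULL) {
        buf[strcspn(buf, "\n")] = '\0';   /* strip the newline, if any */
        cb(buf, ctx);
        n++;
    }

    int status = pclose(fp);
    return status == 0 ? n : -1;
}

/* Example callback: counts non-empty lines; ctx points to an int. */
void count_nonempty(const char *line, void *ctx)
{
    if (line[0] != '\0')
        ++*(int *)ctx;
}
```

The fork and exec behind each popen() is a fixed cost per call, so the
wrapper is best used for commands that process many messages at once.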

-- 
Nmh-workers
https://lists.nongnu.org/mailman/listinfo/nmh-workers
