
Re: Best practice for data encoding?

2006-06-12 05:11:39
On 12-jun-2006, at 13:31, Carsten Bormann wrote:

> In your original question to the list, you didn't quite make clear that your question was with respect to BGP-style transfer of large-scale routing information.

I didn't want to limit the scope of the discussion to one particular type of protocol.

> Right now, you seem to focus on decoding performance. How much of the CPU time spent for BGP is decoding? Does the CPU time spent for the entirety of BGP even matter*? If yes, can a good data structure/encoding help with the *overall* problem?

I can't answer the first question, because the only BGP we have uses binary. But I'm pretty confident that doing the same thing with a text-based encoding isn't going to do us any favors performance-wise.
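For concreteness, here is a minimal sketch of the kind of decoding BGP does (the function name and error handling are mine, not from any real implementation): an NLRI entry in an UPDATE message is a one-byte prefix length in bits followed by just enough bytes to hold the prefix.

/* Sketch of BGP NLRI decoding: a one-byte prefix length in bits,
   followed by (len+7)/8 prefix bytes. Illustrative only. */
#include <stdint.h>
#include <string.h>

static int decode_nlri(const uint8_t *buf, size_t buflen,
                       uint32_t *prefix, uint8_t *plen)
{
    if (buflen < 1)
        return -1;
    *plen = buf[0];
    size_t nbytes = (*plen + 7) / 8;
    if (*plen > 32 || buflen < 1 + nbytes)
        return -1;
    uint8_t tmp[4] = {0, 0, 0, 0};
    memcpy(tmp, buf + 1, nbytes);   /* only the bytes on the wire */
    *prefix = (uint32_t)tmp[0] << 24 | (uint32_t)tmp[1] << 16 |
              (uint32_t)tmp[2] << 8  | (uint32_t)tmp[3];
    return (int)(1 + nbytes);       /* bytes consumed */
}

The work per prefix is a length check, a small copy and a few shifts; the field widths are fixed by the wire format, so there is nothing to search for.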

> The results from your test programs are not at all surprising.
> Of course, a hand-coded loop where all the data is already in the right form (data type, byte order, number of bits), where no decisions need to be made, and where you even know the number of data items beforehand, is going to be faster than calling the generic, pretty much neglected, parameterized, tired library routine fscanf that doesn't get much use outside textbooks.

Byte order and the like aren't much of an issue compared to the time required for memory access. And I guess fscanf could be a slow implementation, but this is just reading a value from a line of text, with none of the hunting for tags that HTML or XML requires. Also, the performance gap is so huge that I don't think the details matter much.
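To make the comparison concrete (a toy version, not the actual test programs mentioned above): the text side calls fscanf once per value, the binary side does an alignment-safe load and a byte swap.

/* Toy version of the two decoding styles under discussion;
   not the original test programs from this thread. */
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* ntohl() */

/* Text: one decimal number per line, parsed with fscanf. */
static uint32_t sum_text(FILE *f, size_t n)
{
    uint32_t v, sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (fscanf(f, "%u", &v) != 1)
            break;
        sum += v;
    }
    return sum;
}

/* Binary: network-order 32-bit words straight from a buffer. */
static uint32_t sum_binary(const uint8_t *buf, size_t n)
{
    uint32_t v, sum = 0;
    for (size_t i = 0; i < n; i++) {
        memcpy(&v, buf + 4 * i, sizeof v);  /* alignment-safe load */
        sum += ntohl(v);
    }
    return sum;
}

Time both over a few million values: fscanf has to examine and classify every character, while the binary loop touches each word exactly once.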

> What this example shows nicely is that performance issues are non-trivial, and, yes, you do want to run measurements, but at the system level and not at the level of "test cases" that have little or no relationship to the performance of the real system.

Sure, but how are you going to do that kind of testing when designing a protocol? Creating two implementations just to see which variation is faster would be a good idea, but I don't really see that happening...

> If you really care about the performance of text-based protocols, you cannot ignore modern tools like Ragel.

Don't know it.

> If, having used them, you still manage to find the text processing overhead in your profiling data, I'd like to hear from you.

The problem with text is that you have to walk through memory comparing characters. A LOT. This is pretty much the worst thing you can do to a modern CPU: you don't use the logical word width and barely use the physical one, and all those compares are hard to predict, so you get massive numbers of incorrectly predicted branches.
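Even a good generated parser, which is, as I understand it, what tools like Ragel produce, comes down to a loop like this (a hand-written sketch, so the details are mine, not Ragel's output):

/* Hand-written sketch of a table-driven recognizer for a decimal
   number, roughly the shape of a generated text parser. Even with
   a branch-light inner loop it still touches every input byte. */
#include <stdint.h>
#include <stddef.h>

enum { S_START, S_DIGITS, S_ACCEPT, S_ERROR, N_STATES };

static uint8_t next_state[N_STATES][256];

static void init_table(void)
{
    for (int s = 0; s < N_STATES; s++)
        for (int c = 0; c < 256; c++)
            next_state[s][c] = S_ERROR;
    for (int c = '0'; c <= '9'; c++) {
        next_state[S_START][c]  = S_DIGITS;
        next_state[S_DIGITS][c] = S_DIGITS;
    }
    next_state[S_DIGITS]['\n'] = S_ACCEPT;
}

static int is_number(const uint8_t *p, size_t len)
{
    uint8_t s = S_START;
    for (size_t i = 0; i < len; i++)
        s = next_state[s][p[i]];    /* one dependent load per byte */
    return s == S_ACCEPT;
}

The table lookup replaces the unpredictable compares, but each step still depends on the previous one, one byte at a time; compare that with pulling a 32-bit field out of a fixed binary layout in a single load.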

But I guess this discussion can go on forever...
