
Re: Best practice for data encoding?

2006-06-12 05:11:39
On 12-jun-2006, at 13:31, Carsten Bormann wrote:

> In your original question to the list, you didn't quite make clear that your question was with respect to BGP-style transfer of large-scale routing information.

I didn't want to limit the scope of the discussion to one particular type of protocol.

> Right now, you seem to focus on decoding performance. How much of the CPU time spent for BGP is decoding? Does the CPU time spent for the entirety of BGP even matter*? If yes, can a good data structure/encoding help with the *overall* problem?

I can't answer the first question, because the only BGP we have uses binary. But I'm pretty confident that doing the same thing with a text-based encoding isn't going to do us any favors performance-wise.
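For concreteness, here is a minimal sketch of the kind of decoding BGP does (the function name and error handling are mine, not from any real implementation): an NLRI entry in an UPDATE message is a one-byte prefix length in bits followed by just enough bytes to hold the prefix.

/* Sketch of BGP NLRI decoding: a one-byte prefix length in bits,
   followed by (len+7)/8 prefix bytes. Illustrative only. */
#include <stdint.h>
#include <string.h>

static int decode_nlri(const uint8_t *buf, size_t buflen,
                       uint32_t *prefix, uint8_t *plen)
{
    if (buflen < 1)
        return -1;
    *plen = buf[0];
    size_t nbytes = (*plen + 7) / 8;
    if (*plen > 32 || buflen < 1 + nbytes)
        return -1;
    uint8_t tmp[4] = {0, 0, 0, 0};
    memcpy(tmp, buf + 1, nbytes);   /* only the bytes on the wire */
    *prefix = (uint32_t)tmp[0] << 24 | (uint32_t)tmp[1] << 16 |
              (uint32_t)tmp[2] << 8  | (uint32_t)tmp[3];
    return (int)(1 + nbytes);       /* bytes consumed */
}

The work per prefix is a length check, a small copy and a few shifts; the field widths are fixed by the wire format, so there is nothing to search for.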

> The results from your test programs are not at all surprising.
> Of course, a hand-coded loop where all the data is already in the right form (data type, byte order, number of bits), where no decisions need to be made, and where you even know the number of data items beforehand, is going to be faster than calling the generic, pretty much neglected, parameterized, tired library routine fscanf that doesn't get much use outside textbooks.

Byte order and the like aren't much of an issue compared to the time required for memory access. And I guess fscanf could be a slow implementation, but this is just reading a value from a line of text, with none of the hunting for tags that HTML or XML requires. Also, the performance gap is so huge that I don't think the details matter much.
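To make the comparison concrete (a toy version, not the actual test programs mentioned above): the text side calls fscanf once per value, the binary side does an alignment-safe load and a byte swap.

/* Toy version of the two decoding styles under discussion;
   not the original test programs from this thread. */
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* ntohl() */

/* Text: one decimal number per line, parsed with fscanf. */
static uint32_t sum_text(FILE *f, size_t n)
{
    uint32_t v, sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (fscanf(f, "%u", &v) != 1)
            break;
        sum += v;
    }
    return sum;
}

/* Binary: network-order 32-bit words straight from a buffer. */
static uint32_t sum_binary(const uint8_t *buf, size_t n)
{
    uint32_t v, sum = 0;
    for (size_t i = 0; i < n; i++) {
        memcpy(&v, buf + 4 * i, sizeof v);  /* alignment-safe load */
        sum += ntohl(v);
    }
    return sum;
}

Time both over a few million values: fscanf has to examine and classify every character, while the binary loop touches each word exactly once.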

> What this example shows nicely is that performance issues are non-trivial, and, yes, you do want to run measurements, but at the system level and not at the level of "test cases" that have little or no relationship to the performance of the real system.

Sure, but how are you going to do that kind of testing when designing a protocol? Creating two implementations just to see which variation is faster would be a good idea, but I don't really see that happening...

> If you really care about the performance of text-based protocols, you cannot ignore modern tools like Ragel.

Don't know it.

> If, having used them, you still manage to find the text processing overhead in your profiling data, I'd like to hear from you.

The problem with text is that you have to walk through memory comparing characters. A LOT. This is pretty much the worst thing you can do to a modern CPU: you don't use the logical word width and barely use the physical one, and all those compares are hard to predict, so you get massive numbers of incorrectly predicted branches.
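Even a good generated parser, which is, as I understand it, what tools like Ragel produce, comes down to a loop like this (a hand-written sketch, so the details are mine, not Ragel's output):

/* Hand-written sketch of a table-driven recognizer for a decimal
   number, roughly the shape of a generated text parser. Even with
   a branch-light inner loop it still touches every input byte. */
#include <stdint.h>
#include <stddef.h>

enum { S_START, S_DIGITS, S_ACCEPT, S_ERROR, N_STATES };

static uint8_t next_state[N_STATES][256];

static void init_table(void)
{
    for (int s = 0; s < N_STATES; s++)
        for (int c = 0; c < 256; c++)
            next_state[s][c] = S_ERROR;
    for (int c = '0'; c <= '9'; c++) {
        next_state[S_START][c]  = S_DIGITS;
        next_state[S_DIGITS][c] = S_DIGITS;
    }
    next_state[S_DIGITS]['\n'] = S_ACCEPT;
}

static int is_number(const uint8_t *p, size_t len)
{
    uint8_t s = S_START;
    for (size_t i = 0; i < len; i++)
        s = next_state[s][p[i]];    /* one dependent load per byte */
    return s == S_ACCEPT;
}

The table lookup replaces the unpredictable compares, but each step still depends on the previous one, one byte at a time; compare that with pulling a 32-bit field out of a fixed binary layout in a single load.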

But I guess this discussion can go on forever...
