Re: [Asrg] About that e-postage draft [POSTAGE]


On Feb 21, 2009, at 4:07 PM, Bill Cole wrote:

Steve Atkins wrote, On 2/20/09 7:26 PM:
On Feb 20, 2009, at 3:25 PM, Bill Cole wrote:
The reliability (SPoF), economics, business and usability issues are likely to be much more of a problem.
Those are all problems. It seems to me that any attempt to seriously address the SPoF problem makes the race resolution problem harder.

Almost by definition. If your definition of which of two redemption attempts is valid is the one that arrives at a particular place first, then that place is a single point of failure. If that's not your definition, things get a lot more complex.

At each redemption machine, look up the stamp in an "I've seen this" associative array. If you've seen it, reject the stamp, otherwise accept it. This is arbitrarily scalable, just by adding enough redemption machines that the memory access time to look up the entry in the associative array is low enough to meet your throughput goal, and the number of outstanding stamps fits in the storage space of the machine. Assuming there's a serial number in each stamp, your associative array could simply be 250 gigabits of RAM, so again it's not going to be many machines, maybe one, to do in software.
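
As a rough illustration of the "I've seen this" lookup (a sketch, not anything taken from the draft): assuming each stamp carries a dense integer serial number, the associative array can be reduced to one bit per serial. The class name SeenBitmap and its redeem method are made up for this example, and the capacity used here is tiny compared with the 250 gigabits mentioned above.

```python
class SeenBitmap:
    """One bit per stamp serial number: 0 = unseen, 1 = already redeemed.

    Hypothetical helper; assumes serials are integers in [0, capacity).
    """

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        # Round up to whole bytes; every bit starts at 0 (unseen).
        self.bits = bytearray((capacity + 7) // 8)

    def redeem(self, serial: int) -> bool:
        """Return True the first time a serial is presented, False afterwards."""
        if not 0 <= serial < self.capacity:
            return False  # out-of-range serials are treated as invalid
        byte_index = serial >> 3
        mask = 1 << (serial & 7)
        if self.bits[byte_index] & mask:
            return False  # seen before: reject the stamp
        self.bits[byte_index] |= mask
        return True  # first sighting: accept and remember it


seen = SeenBitmap(1000)
first = seen.redeem(42)   # accepted
second = seen.redeem(42)  # rejected as a reuse
```

Note that this sketch has only the two states "unseen" and "seen", so it does nothing about a lost reply to the client.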
I think it is a bit of a hand-wave to call this arbitrarily scalable, but I'm happy to stipulate for the test of my hypothesis that the entire server-side decision process for any one redemption transaction can be reliably done correctly in uniform sub-millisecond time if the stamp is either available for redemption or already redeemed. The problem is that logically you need a third intermediate state that will last for the RTT of the network connection to the redeeming client, and that state will defer the decision for other attempts to redeem the stamp.
To me it feels like the hard bit of this is handling a million packets in and out per second reliably, along with the overhead of providing robustness and redundancy, rather than the redemption itself.
That was my point, because it seems to me that a redemption cannot be done with just one packet in and one out, but really needs two in and one out.

One in and one out will do it. The client MTA can include a nonce or serial number in the request. The redemption machine can keep track of the nonce value of the request that was accepted as valid, at a cost of a few times the number of machines needed. We're already keeping that state for a month anyway.

If any client doesn't receive a response (positive or negative) after a while, it can retry with the same nonce and it will still get a correct answer. (While the system may well need to deal with malicious clients, it only needs to give correct results to non-malicious clients). This eliminates any need to keep state for anything resembling packet RTT time.
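
One way to read the nonce scheme, sketched under assumptions: the server stores, per stamp, either nothing or the nonce of the request it accepted, and a retry carrying the same nonce gets the same positive answer. The names RedemptionServer and redeem are hypothetical, stamps and nonces are plain strings here, and stamp validation and the month-long expiry are left out.

```python
from typing import Dict, Iterable, Optional


class RedemptionServer:
    """Two states per stamp: unredeemed (None) or redeemed for a nonce."""

    def __init__(self, issued: Iterable[str]) -> None:
        # None means unredeemed; otherwise the nonce of the accepted request.
        self.state: Dict[str, Optional[str]] = {s: None for s in issued}

    def redeem(self, stamp: str, nonce: str) -> bool:
        """Answer one request packet; safe to call again on a retry."""
        if stamp not in self.state:
            return False  # unknown stamp
        winner = self.state[stamp]
        if winner is None:
            self.state[stamp] = nonce  # first request wins
            return True
        # Same nonce: a retry of the winning request, so repeat "yes".
        # Different nonce: someone else already redeemed this stamp.
        return winner == nonce


server = RedemptionServer({"stamp-1"})
accepted = server.redeem("stamp-1", "nonce-a")  # True
retried = server.redeem("stamp-1", "nonce-a")   # True again: reply was lost
reused = server.redeem("stamp-1", "nonce-b")    # False: different redeemer
```

The retry is answered from stored state alone, so nothing has to be held open while waiting for the client.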

The similarities of this to DNS might just betray my background rather than being anything particularly meaningful.

A legitimate stamp needs to have 3 possible states in the server's map: redeemed, unredeemed, and pending acknowledgment of redemption. If the server only has two states for a stamp, then it would end up with one of two flaws by design:


Or "unredeemed" and "redeemed for nonce value X".

1. If the stamp is marked as redeemed when a successful redemption attempt completes on the server, it is possible that the success will not be successfully communicated to the client. If the client then retries the redemption, it will fail.
2. If the stamp is left as unredeemed while waiting for the client ack of success, stamp "reuse" becomes a question of how many redemption decisions can be made per client RTT.
The server may have to defer many thousands of clients for scores of milliseconds while waiting for one to send an ack. Handling a few dozen such events per second seems to me to be a really hard problem to address, but maybe I'm missing something. If the average transaction lifetime is 100ms, then a million-TPS system needs to be able to handle an average of 100k concurrent pending transactions and spikes probably twice that. I think the ways to handle that all include dividing the front end between multiple machines, but that creates a tougher problem keeping the back end recordkeeping fast and coherent from the viewpoints of all of the front ends.
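
The 100k figure follows from multiplying arrival rate by average time in the system (Little's law). A minimal check of that arithmetic, with a made-up function name and the 2x spike factor taken from the estimate above:

```python
def concurrent_pending(tps: int, lifetime_ms: int) -> float:
    """Average number of in-flight transactions: rate times lifetime."""
    return tps * lifetime_ms / 1000


average = concurrent_pending(1_000_000, 100)  # 100000.0
spike = 2 * average                           # 200000.0
```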
I suspect that the "solution" that will be chosen if anyone tries to create a real e-postage system instead of hand-waving about it will be to open it to lost packet damage as the cost of scalability. Network latency with clients becomes irrelevant if the server assumes that its redemption messages are always delivered. That allows for a lot of optimization. Once in a while a stamp that should work will fail to do so, and if such a system ever gets into the real world I'm sure its users will be shocked at how much higher the real-world failures are than in their tests...

Don't forget that there is no state that needs to be kept or transferred other than that keyed on the stamp value itself. You can spread the load needed across as many systems as you need to keep up with demand. At an extreme, you can have the client MTAs use the cookie in the stamp to decide where to check the stamp, and spread load across multiple networks and locations - I check this stamp by sending a packet to whatever IP address <last 16 bits of cookie>.stampcheck.foo resolves to. (That's what I mean when I describe it as embarrassingly parallelizable.)
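
A sketch of how a client MTA might derive that hostname, under assumptions: the cookie is available as raw bytes, the last 16 bits are written as four lowercase hex digits, and stampcheck.foo is only the placeholder zone used above. The function name check_host is invented for this example.

```python
def check_host(cookie: bytes, zone: str = "stampcheck.foo") -> str:
    """Map a stamp cookie to the name of the machine that should check it."""
    if len(cookie) < 2:
        raise ValueError("cookie must be at least 16 bits long")
    # Take the last two bytes (16 bits) as an unsigned big-endian integer.
    shard = int.from_bytes(cookie[-2:], "big")
    # Hex labels are an assumption; any stable encoding would do.
    return f"{shard:04x}.{zone}"


host = check_host(bytes.fromhex("0123456789abcdef"))
# host == "cdef.stampcheck.foo"
```

The client would then resolve that name and send its one request packet to the resulting address, so every stamp with the same last 16 bits lands on the same redemption machine, and up to 65536 shards can be run independently.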


That might end up being expensive to operate, of course.

Cheers,
  Steve


