
Re: [mpls] Last Call: <draft-ietf-mpls-in-udp-04.txt> (Encapsulating MPLS in UDP) to Proposed Standard

2014-01-16 10:08:01

In message 
<290E20B455C66743BE178C5C84F1240847E63346C9(_at_)EXMB01CMS(_dot_)surrey(_dot_)ac(_dot_)uk>
l(_dot_)wood(_at_)surrey(_dot_)ac(_dot_)uk writes:
 
Curtis
 
http://lmgtfy.com/?q=Jonathan+Stone+CRC+checksum

That is the Sigcomm 2000 paper that a number of people have said is no
longer relevant, in some cases giving quite a bit of detail about the
error causes in the paper and why these will be rare today.  I thought
you were referring to something more recent.

HDLC is just whatever is over the last hop. You said HDLC, I reused that as
an example.

I gave HDLC as an example because in 2000 it was in use and a source
of errors but in 2014 it is likely not to be in use just about
anywhere in North America and Europe at least.

Any link technology could be substituted - 10Mbps Ethernet, say, though
you'd criticise that as not being 10Gbps Ethernet and therefore out of date.

My point was that if you substituted 10Mbps Ethernet with a 32 bit
FCS, for HDLC which often just counted and passed along errored
packets, you would have far fewer errors, almost none.

Again, the point is that the link check is not end-to-end, and that errors
can creep in from the most unexpected places. By analogy with security,
if I have security across each hop, why would I need security end-to-end?
I already have it across each hop! Each link is highly and absolutely
unbreakably secure! What is this end-to-end of which you speak?

You have security end to end because someone may have a motivation to
monitor or alter your packet.

If you have robust L2 FCS, then there is no one with a motivation to
disable that FCS and corrupt your data.  If they did for MPLS carrying
IP, then there is a check, albeit not a great one, that might catch
it.  If the MPLS payload is a PW carrying an L2 carrying IP, the same
applies.

Note that if the PW payload is TDM, Ethernet, or most other L2, at
least a checksum if not more is available in the payload so the errors
would be detected and the customer of the PW would complain.  Those
types of complaints have been a complete non-issue.

If you don't get that point because it's a bit abstract and timeless, that's
fine. (and if you have to explain the joke, the joke wasn't funny.)
The link CRC doesn't apply across the entire path; do the maths for
the path and a series of concatenated links.

The joke is that you didn't realize any mention of HDLC being used
now, just like X.25 being used now, was a joke.  It actually isn't
funny at all, except perhaps until someone doesn't get it.

[As it happens, I'm familiar with UDP/IP/HDLC internet infrastructure
installed within the decade and in daily operational use to deliver
imagery from orbit.
But the paper I wrote on that dates from 2007, so is old and
won't be of interest to you.]

That is clearly the exception.  TDM is still alive and well in the
third world and in access parts of the developed world that have not
been upgraded, but elsewhere alternatives exist for terrestrial Internet,
and TDM is gone from any non-third world network core that I am aware
of.

So don't run MPLS over UDP without a UDP checksum *on that
infrastructure* if you can avoid it.  But for most modern
infrastructures MPLS over UDP without a UDP checksum would be fine.

We are asking for a SHOULD, not a MUST NOT wrt UDP checksum.

Wait, this thread is all about putting 90s MPLS technology over UDP
technology specified in 1980. Clearly, if MPLS has to rely on an older
technology in this way, the MPLS crowd should give up and go home. 

And MPLS carried nothing but IP in those days so from an error
detection standpoint it was and still is a NOOP.

We've learned repeatedly that zero checksums are a bad idea. IPv6
RFC2460:
 
        Unlike IPv4, when UDP packets are originated by an IPv6 node,
         the UDP checksum is not optional.  That is, whenever
         originating a UDP packet, an IPv6 node must compute a UDP
         checksum over the packet and the pseudo-header, and, if that
         computation yields a result of zero, it must be changed to hex
         FFFF for placement in the UDP header.  IPv6 receivers must
         discard UDP packets containing a zero checksum, and should log
         the error.
 
which RFC6935 simply rewrote as inconvenient to tunnelers without
considering how it affected everything else in the network.
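
The RFC 2460 rule quoted above can be sketched in a few lines.  This is
only an illustration, not code from the draft or from any router: the
function names, addresses and port numbers are made up for the example.
It computes the one's-complement checksum over the IPv6 pseudo-header,
UDP header and payload, and substitutes hex FFFF when the result is zero.

```python
import ipaddress
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum (the style used by UDP and TCP)."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole 16-bit word
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def udp6_checksum(src: str, dst: str, sport: int, dport: int,
                  payload: bytes) -> int:
    """UDP checksum over the IPv6 pseudo-header, UDP header and payload."""
    length = 8 + len(payload)                # UDP header is 8 bytes
    pseudo = (ipaddress.IPv6Address(src).packed
              + ipaddress.IPv6Address(dst).packed
              + struct.pack("!I3xB", length, 17))   # next header 17 = UDP
    header = struct.pack("!HHHH", sport, dport, length, 0)
    checksum = internet_checksum(pseudo + header + payload)
    return checksum or 0xFFFF                # zero must be sent as FFFF


# Hypothetical example values (documentation prefix, arbitrary ports).
print(hex(udp6_checksum("2001:db8::1", "2001:db8::2", 49152, 50000, b"hello")))
```

Because a computed zero is sent as FFFF, a zero in the checksum field
can only mean "no checksum", which is the case the quoted text tells
IPv6 receivers to discard.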

How it affects everything else in the network should be
considered before ignoring a SHOULD, as is always the case.

Those that do not learn from history are doomed to repeat it.
(Unless it's recent history, in which case they're doomed to reject it
as irrelevant.)
 
Lloyd Wood
http://about.me/lloydwood

Lloyd.  There are a lot of papers that are still relevant.  Almost all
of the causes of errors studied in that 2000 paper are gone from
today's networks.  Apparently an exception is your HDLC
infrastructure, and that would be gone too as an issue if 32 bit CRC
is enabled for HDLC and packets dropped on error, not just counted.

That paper reported:

   2,209 M packets

   468,434 errors
   389,934 ACK-of-FIN bug (old Windows NT bug, fixed)
    78,500 remaining
   Other errors cited but often not quantified
     CRLF replacement (thought to be Solaris bug)
     VJHC or other header compression bug
     possible PowerMac OS-X bug (fixed by publication)
     router memory error (ECC is used now)
     bad host DMA (PCIe has CRC32 today)
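
The counts listed above can be turned into per-packet rates with a few
lines of arithmetic.  This is only a sketch using the figures as quoted
in this message; reading "2,209 M packets" as 2.209 billion packets is
an assumption.

```python
# Figures as quoted above; "2,209 M" is read as 2,209 million packets.
TOTAL_PACKETS = 2_209_000_000
TOTAL_ERRORS = 468_434
ACK_OF_FIN_ERRORS = 389_934          # attributed to one fixed host bug

remaining = TOTAL_ERRORS - ACK_OF_FIN_ERRORS

# Express each count as "one checksum error per N packets".
packets_per_error_all = TOTAL_PACKETS / TOTAL_ERRORS
packets_per_error_remaining = TOTAL_PACKETS / remaining

print(remaining)
print(f"all errors:       1 in {packets_per_error_all:,.0f} packets")
print(f"without that bug: 1 in {packets_per_error_remaining:,.0f} packets")
```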

The paper has a breakdown by type of error and discussion of the
causes.  In some cases a particular type of error came mostly from a
small set of hosts but the cause is only likely to be host related
and the entire type of error category can't with certainty be
attributed to host error.

   Bad Hosts

      Another surprise in our traces was the large fraction of errors
      which are due to persistently-misbehaving hosts.

In fact the paper says:

   In general, link errors should be caught by the CRC.  However there
   are cases where the link level protocol can interact to cause
   higher level checksum errors.  The most notable situation is header
   compression and we looked vigorously for errors of this sort.

Link errors are really not considered in the paper as a significant
source of errors.  Router memory and other hardware errors are cited
but with today's hardware those types of errors should also be gone.

I discussed the sources of errors in this paper previously in
  http://www.ietf.org/mail-archive/web/mpls/current/msg11247.html
You did respond to that but only by top posting and ignoring the
discussion in that email of the causes of errors in the paper.

"No longer relevant" does not mean "not good work".  It would be worth
it for the Stone/Partridge study to be repeated.  For this context it
would have to be with cooperation of a provider and over the type of
infrastructure that MPLS is intended to be run.

Curtis

________________________________________
From: Curtis Villamizar [curtis(_at_)ipv6(_dot_)occnc(_dot_)com]
Sent: 15 January 2014 01:42
To: Wood L  Dr (Electronic Eng)
Cc: curtis(_at_)ipv6(_dot_)occnc(_dot_)com; jmh(_at_)joelhalpern(_dot_)com; 
lars(_at_)netapp(_dot_)com; xuxiaohu(_at_)huawei(_dot_)com; 
mpls(_at_)ietf(_dot_)org; ietf(_at_)ietf(_dot_)org
Subject: Re: [mpls] Last Call: <draft-ietf-mpls-in-udp-04.txt> (Encapsulating 
MPLS in UDP) to Proposed Standard
 
In message 
<290E20B455C66743BE178C5C84F1240847E63346C4(_at_)EXMB01CMS(_dot_)surrey(_dot_)ac(_dot_)uk>
l(_dot_)wood(_at_)surrey(_dot_)ac(_dot_)uk writes:
 
The HDLC part here is last link, not the scope of the whole path.  Any
'low' bit error rate given actually becomes quite high once you
consider the number of bits per packet and the line rate...

Do read Jonathan Stone's papers on where errors creep in - not just in
the link, but at any point along the path, including regeneration

Lloyd Wood
 
 
Lloyd,
 
There is no HDLC hop.  No one has used HDLC for internet
infrastructure in ages.  It was a joke, like Scott's comment on
wanting to use X.25.  HDLC was disappearing when the Stone/Partridge
Sigcomm 2000 paper was written.
 
Links please.  And how old is that paper?  Not another 15 year old
work is it?
 
If you have one bit error per day, how many packets do you lose that
day?  (hint: one).
 
If you have one bit error per day, how many undetectable packet errors
do you have?  (hint: crc32 gets all one bit errors, therefore zero).
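
The claim that CRC32 catches every one bit error can be checked by
brute force.  The sketch below uses Python's zlib.crc32, which
implements the same CRC-32 polynomial as the Ethernet FCS; the function
name and the test payload are made up for the example.

```python
import zlib


def every_single_bit_flip_detected(payload: bytes) -> bool:
    """Flip each bit in turn and check that the CRC-32 always changes."""
    reference = zlib.crc32(payload)
    data = bytearray(payload)
    for byte_index in range(len(data)):
        for bit in range(8):
            data[byte_index] ^= 1 << bit          # inject one bit error
            changed = zlib.crc32(bytes(data)) != reference
            data[byte_index] ^= 1 << bit          # restore the original bit
            if not changed:
                return False
    return True


frame = bytes(range(256))       # arbitrary 256-byte stand-in for a frame
print(every_single_bit_flip_detected(frame))
```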
 
10^-12 bit errors is one per 10 seconds on 100 Gb/s, one per 100 seconds
on 10 Gb/s and is generally considered high enough to take a link down
immediately.  A 1500 byte packet is 12,000 bits, about 10^4.  That
would yield a packet error rate as high as 10^-8 if bit errors were
mostly one bit error per packet.  In that case all errors would be
detectable.  It is only when there are many bit errors per packet
that the CRC can be defeated and then it's about a 10^-9 chance.
 
So at an error rate much less than 10^-8 packets (tightly bunched
errors with multiple bit errors per packet) some 10^-9 might be
undetectable with a CRC32.  One packet every 10^6 seconds at 100 Gb/s
could have an undetectable error.  About one undetectable error a day
or one a week for a continuously full 100 Gb/s link.
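
The back-of-envelope numbers above can be reproduced with a short
script.  This is a rough sketch under stated assumptions, not a
measurement: the 2^-32 miss probability and the multi_bit_fraction
parameter are assumptions introduced here.  The final interval is very
sensitive to those assumptions and to whether rates are taken per bit
or per packet, so treat it as an order-of-magnitude aid rather than a
check on the round figures quoted above.

```python
# Inputs from the discussion above; CRC_MISS is an assumption.
LINE_RATE_BPS = 100e9          # 100 Gb/s link, assumed fully loaded
BER = 1e-12                    # bit error rate under discussion
PACKET_BITS = 1500 * 8         # 1500-byte packets, 12,000 bits
CRC_MISS = 2.0 ** -32          # assumed miss chance for a mangled packet


def packet_error_rate(ber: float, packet_bits: int) -> float:
    """Upper bound: every bit error lands in a different packet."""
    return ber * packet_bits


def undetected_interval_s(line_rate_bps: float, ber: float, packet_bits: int,
                          crc_miss: float, multi_bit_fraction: float) -> float:
    """Seconds between undetected errors if the given fraction of errored
    packets carry enough bit errors to have a chance of beating the CRC."""
    packets_per_s = line_rate_bps / packet_bits
    errored_per_s = packets_per_s * packet_error_rate(ber, packet_bits)
    return 1 / (errored_per_s * multi_bit_fraction * crc_miss)


print(packet_error_rate(BER, PACKET_BITS))          # about 1.2e-08
# Pessimistic case: every errored packet is a multi-bit error.
print(undetected_interval_s(LINE_RATE_BPS, BER, PACKET_BITS, CRC_MISS, 1.0))
```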
 
Note that the same low error rate does not apply to a GbE or 10GbE
over colored optics over ROADM in the metro since there is no FEC
there.  It also may not apply to the enterprise or campus Ethernets.
In those hops the error rate is likely to be higher.  Needless to say,
wireless hops can have very high error rates.
 
This is why it could make sense to have the UDP checksum optional in
MPLS over UDP.  It wouldn't hurt to provide the checksums but in some
cases it might be OK to disable them.  That is what SHOULD is for in
an IETF document.
 
Curtis
 
 
From: Curtis Villamizar [curtis(_at_)ipv6(_dot_)occnc(_dot_)com]
Sent: 14 January 2014 20:54
To: Wood L  Dr (Electronic Eng)
Cc: jmh(_at_)joelhalpern(_dot_)com; lars(_at_)netapp(_dot_)com; 
xuxiaohu(_at_)huawei(_dot_)com; mpls(_at_)ietf(_dot_)org; 
ietf(_at_)ietf(_dot_)org
Subject: Re: [mpls] Last Call: <draft-ietf-mpls-in-udp-04.txt> 
(Encapsulating MPLS in UDP) to Proposed Standard

In message 
<290E20B455C66743BE178C5C84F1240847E63346C3(_at_)EXMB01CMS(_dot_)surrey(_dot_)ac(_dot_)uk>
l(_dot_)wood(_at_)surrey(_dot_)ac(_dot_)uk writes:

It stands to reason that if tunnelers can turn off udp checksums
because their performance is degraded, they can turn off
congestion control because it will degrade their performance.

Rest of the internet getting congested and getting
misdelivered corrupted packets? Really not their problem.

There are important vendors trying to sell products here,
and they need performance to do so.
Get with the program!

Lloyd Wood
http://about.me/lloydwood


OK, perhaps if you are running MPLS/UDP/IP over HDLC and the HDLC
configuration is set to count FCS errors but not drop you will still
*really* need the UDP checksum.  Otherwise it isn't going to do much
for you.  Any checksum is really bad for some types of errors such as
chunk reordering and multiple bit errors.
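
The point about chunk reordering is easy to demonstrate.  The sketch
below is illustrative only (the payload and function name are made up):
because the Internet checksum is a plain sum of 16-bit words, swapping
whole words around leaves it unchanged, whereas a CRC would normally
change.

```python
import struct
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum (the style used by UDP and TCP)."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole 16-bit word
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = b"MPLS-in-UDP payload!"           # 20 bytes, i.e. ten 16-bit words
words = [original[i:i + 2] for i in range(0, len(original), 2)]
reordered = b"".join(reversed(words))        # same words, different order

print(internet_checksum(original) == internet_checksum(reordered))
print(zlib.crc32(original) == zlib.crc32(reordered))
```

The first comparison is always true for word-aligned reordering; the
second is expected to be false except by rare coincidence.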

Maybe on HDLC or PPP with 16 bit CRC you may see a low error rate, but
in theory that would be much less than 10^-5 since few multiple bit
errors will by coincidence match the CRC, even for a 16 bit CRC.

I suspect most routers would be able to do the checksum anyway and for
modern links if they come up with a zero error count that's fine.

<ot>

Modern OTN based transport networks use forward error correction (FEC)
which accounts for a fair amount of overhead and a lot of processing
gates on the receiving end.  The measure of effectiveness of a given FEC
is in dB with 10 dB being a reduction of a factor of 10 in bit errors
and typical FEC in the high tens of dB.  The target corrected error
rate is often 10^-15 or one bit error in 24 hours for 10 Gb/s, one bit
error in 2.5 hours for 100 Gb/s.  Any link with corrected bit error
rates approaching 10^-12 is taken out of service.  This is roughly
equivalent to the old ES (errored seconds) and SES (severely errored
seconds) metric where an ES is one second with any bit errors and an
SES is one second with 10 or more errors (I think it's 10).  More than
some number of ES or SES and a link is taken down.  The uncorrected
errors are passed through.
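
For reference, the dB and error-interval figures above follow from two
one-line formulas.  This is a sketch with made-up function names; it
assumes a fully loaded link.  The exact results (about 28 hours and
about 2.8 hours) are the same order as the round figures quoted above.

```python
def db_to_factor(gain_db: float) -> float:
    """Convert a gain in dB to a linear factor (10 dB is a factor of 10)."""
    return 10 ** (gain_db / 10)


def hours_between_bit_errors(ber: float, line_rate_bps: float) -> float:
    """Mean time between bit errors on a fully loaded link, in hours."""
    errors_per_second = ber * line_rate_bps
    return 1 / errors_per_second / 3600


print(db_to_factor(10))                                   # 10.0
print(round(hours_between_bit_errors(1e-15, 10e9), 1))    # 10 Gb/s
print(round(hours_between_bit_errors(1e-15, 100e9), 1))   # 100 Gb/s
```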

A packet may traverse an entire continent with 2-3 such links
separated by regeneration or could stop at a number of routers along
the way.  Typically today the router uses 10GbE or 100GbE (growing
use) which are then passed as a bit stream in the transport network.
At the other end the uncorrected errors from transport are picked up
by Ethernet 32 bit FCS.  Since a 32 bit FCS picks up 100% of single
bit errors and most instances where a small number of bits are in
error, and all but 1 in 2^32 where many bits are in error, few errors
are going to get through.  If GFP is used, the per packet FCS is
checked at each hop and for GFP-T also checked end to end.

A bad local ethernet is more likely to contribute an error (again
better than 1 in 2^32 detection is expected) due to something like a
bad CAT-{5,5e,6} connection or too many sharp turns.  A DSL or DOCSIS
link is also more likely to contribute an error.  With CRC32 on all
links and no bad hardware in between (ie: circa 1990s equipment with
no parity RAM and no correction on DMA, buses, etc) you would expect
on the order of 10^-8 errors (10^-9 per hop, a few errored hops).

For example, two hosts on my home LAN had non-zero TCP checksum error
counts.  Each had < 10^-6 packet error rate.  It is hard to tell if this
is host errors at the other end.  The only hosts I have with non-zero
counts are on the service provider DMZ LAN so that would include any bot
attacks, etc, where sending hosts could be old junk.  Hosts behind those
have zero UDP and TCP checksum errors.  This seems similar to Stewart's
quick check.

In the T1/T3 days the transport layer just had parity and just counted
parity errors.  Providers in those days were notorious for ignoring ES
and SES counters until the customer complained.  HDLC then had its 16
bit CRC, optional 32 bit.  If an ISP wasn't paying attention to their
HDLC error counters then it was up to the IP end customer to complain
and hope the problem got escalated rather than dropped.

</ot>

As to whether congestion control is in practice needed see
http://www.ietf.org/mail-archive/web/mpls/current/msg11222.html

It's fine to make them both optional and to make congestion control
mechanisms out of scope and the topic of a later document if needed.

Curtis



________________________________________
From: ietf [ietf-bounces(_at_)ietf(_dot_)org] On Behalf Of Joel M. 
Halpern [jmh(_at_)joelhalpern(_dot_)com]
Sent: 10 January 2014 15:36
To: Eggert, Lars; Xuxiaohu
Cc: mpls(_at_)ietf(_dot_)org; IETF
Subject: Re: Last Call: <draft-ietf-mpls-in-udp-04.txt> (Encapsulating 
MPLS in UDP) to Proposed Standard

Maybe I am completely missing things, but this looks wrong.
If the MPLS LSP is carrying fixed rate pseudo-wires, adding congestion
control will make it more likely that the service won't work.  Is that
really the goal?

We do not perform congestion control on MPLS LSPs.
Assuming that a UDP tunnel is carrying just MPLS and was established
just for MPLS, why would we expect it to behave differently than an MPLS
LSP running over the exact same path, carrying the exact same traffic?

Yours,
Joel

On 1/10/14 3:47 AM, Eggert, Lars wrote:
Hi,

that sounds good. What congestion control are you going to be 
specifying for your tunnel?

Lars

On 2014-1-10, at 4:46, Xuxiaohu <xuxiaohu(_at_)huawei(_dot_)com> wrote:

Hi Lars,

Thanks a lot for your comments.

I wonder whether the following modified text for Congestion 
Consideration section is OK from your point of view:

Since the MPLS-in-UDP encapsulation causes MPLS packets to be 
forwarded through "UDP tunnels", the congestion control guidelines for 
UDP tunnels as defined in Section 3.1.3 of [RFC5405] SHOULD be 
followed. Specifically, MPLS can carry a number of different protocols 
as payloads. When the payload traffic is IP-based and 
congestion-controlled, the UDP tunnel SHOULD NOT employ its own 
congestion control mechanism, because congestion losses of tunneled 
traffic will already trigger an appropriate congestion response at the 
original senders of the tunneled traffic. When the payload traffic is 
not known to be IP-based, or is known to be IP-based but not 
congestion-controlled, the UDP tunnel SHOULD employ an appropriate 
congestion control mechanism. Furthermore, because UDP tunnels are 
usually bulk-transfer applications as far as the intermediate routers 
are concerned, the guidelines as defined in Section 3.1.1 of [RFC5405] 
SHOULD apply.

Best regards,
Xiaohu

-----Original Message-----
From: mpls [mailto:mpls-bounces(_at_)ietf(_dot_)org] On Behalf Of Eggert, Lars
Sent: 8 January 2014 18:22
To: IETF
Cc: mpls(_at_)ietf(_dot_)org
Subject: Re: [mpls] Last Call: <draft-ietf-mpls-in-udp-04.txt>
(Encapsulating MPLS in UDP) to Proposed Standard

Hi,

On 2014-1-2, at 16:14, The IESG <iesg-secretary(_at_)ietf(_dot_)org> wrote:
- 'Encapsulating MPLS in UDP'
<draft-ietf-mpls-in-udp-04.txt> as Proposed Standard


this document needs to describe how it addresses the issues raised in
BCP145 (RFC5405). It already contains some text about message sizes
and congestion considerations, which is great. Unfortunately, the text
about congestion considerations is not fully in line with RFC5405.

Lars

_______________________________________________
mpls mailing list
mpls(_at_)ietf(_dot_)org
https://www.ietf.org/mailman/listinfo/mpls
