right now, the folks doing the choosing pretty much have to guess what
the folks doing the using want/need. open discussion could eliminate
the guessing.
if instead the community feels that reliability for a core set of
functions is paramount, then new features can only be added after they
are viewed as stable and low-risk.
As if the community speaks with one mind...
After the *less than great* wireless deployment for IETF 55 the NOC
team did an internal crit/self crit which you may find interesting -
note that the first thing we talk about is the sometimes conflicting
requirements of the communities of interest:
*disclaimer* these notes are very old - and as all Marlowe fans know
"that was in another country, And besides, the wench is dead."
NOC55 crit/self-crit
Human Factors
- 4 groups of players / 4 sets of needs
  - Host
    - responsible for providing the network, user connectivity (wireless
      & TR), t-shirts & any toys, & hosting a Social if there is one.
    - Expect to pay in excess of 100K and order local loop early!
    - Needs:
      - since the host name is attached to network performance, that
        performance needs to be exceptional
      - Given the amount of money & staff time needed, the host should
        expect to be treated like a major donor and given enough
        cooperation to create a positive outcome for everyone.
  - Secretariat/Foretec
    - responsible for booking hotel & arranging room access
    - responsible for power distribution in meeting rooms and
      deployment of power strips for end users
    - hire AV staff & gear
    - on site registration & temp staff
    - all catering for reception/breaks/etc.
    - manage meeting schedule (agenda & rooms) - communicate needs
      to the host
    - Needs:
      - cost containment?
      - agenda & meeting requirements info from WGs
      - direction from IETF on required services
  - IESG/WG/etc.
    - determine WG agendas & room needs (size, multicast, time frames
      etc.)
    - convey meeting requirements (services, network access, etc.) to
      Foretec
    - additional meetings (ISOC/IAB/IESG)
    - Needs:
      - infrastructure to support the work of the IETF
        (network, power, meeting rooms, AV, lists, multicast, etc.)
      - Support for admin functions (editor, meeting planners, etc.)
      - accurate accounting (revenue generated, admin costs & meeting
        expense) information for decision making.
  - End Users
    - participate in WGs
    - network & TR users require support, need to provide USEFUL
      feedback
    - Needs:
      - infrastructure
      - enough meeting & hotel space
    - Wants:
      - ubiquitous network & power
      - toys & food???
The interests & duties of the four groups overlap and there are some
tensions where those interests compete (e.g., if the host wants to deploy
a well-tested network but the Secretariat wants to keep hotel costs low,
there will be scheduling problems that preclude early in-building
deployment).
Constraints (55)
- non-local host
- no near site space to bench test network elements
- short time frames for deployment (room access)
- in building access to the NOC & TR on Friday (network required SAT)
- early access (Thurs.) to telco space to rack routers & core servers
- late access to meeting rooms (in some cases just hours before a
meeting)
http://darkwing.uoregon.edu/~llynch/noc55/mech/11-11-55thMeetingRoomSchedule.html
- AP selection
- based on internal factors & lack of support for some Nokia wireless
cards in the Foretec-owned Cisco APs. The (16) Cisco APs were carried
as backups.
- marginal vendor support for reported problems.
- Operational "dogfood" (worthy but experimental)
- dynamic DNS
- Bro/IDS
Problem Areas
- Host/Sec communication
- Noc/End user communication
- Loss of core staff (Randy/Fenner & then Rob to other duties)
Technical Problems
- Wireless
- long story - Joel & Bill?
- In building infrastructure/switch failure
- large room wireless failure
- first mile multicast problems
Marquis I-IV & the multicast sub-net both ran off two cat5 runs from the
basement electrical level to the Ballroom control booth on the lobby
level (3 floors). 2 [8 port] switches were then deployed in the control
booth and in-building cat5 for Marquis was patched. Under load (Mon.
AM) these switches failed (too long a run?). They were replaced with a
24-port Cisco switch (managed) and the problem did not recur.
(Note: heas says "managed is always better")
- Juniper/DHCP relay issues
- bug reported, DHCP relay moved to server
- ARP white noise
- large amount of ARP traffic seen through the routers from turn-up;
never fully diagnosed, but didn't seem to affect performance
Strengths
- Noc Team depth -
- Enough folks to cover & second all key roles
- Fenner & co. (wireless debugging/monitoring)
- Joel/Cisco AP deployment under stress
- TR staff
- excellent volunteer participation (GA Tech & Nokia!)
Lessons Learned
- choose the best technical solution w/out regard to politics
- or be prepared to punt if things get messy
- understand & advertise that running experimental services will lead to
experimental, NOT production-style, service.
- test all existing in building infrastructure under load
- managed is better at the edge when using other people's stuff
(managed is just better)
- Don't let outside considerations distract from inside problems
- multicast loss problems lost in the multicast deploy "noise"
- provide easily accessible data on network status
- Provide a direct channel for wireless users to help sort info
Nits
- more communications tools may not be better...
(list/web/RT/radios/cell phones etc.)
- NOC needs faster access to changes on the host page, or a nested NOC
page (user updates, publicizing monitoring data, etc.)
Needed in Future:
- better information about how & when wireless worked at past IETFs
(I suspect it worked when there were many fewer users)
- A "WiNUrs" BOF (Wireless Network Users) at either NANOG or IETF
- w/ a clinic to demonstrate dense deployment issues
(attractive nodes/sticky nodes/adhoc users/overloading failures/
association refused/etc.)
- & user education & tools
- meetings need a quick report mechanism that reports:
user mac address|OS (version)|card type & driver|settings
& an FAQ of known problems.
- users continue to send clear text passwords (Fenner data)
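
For what it's worth, the quick report format suggested above
(user mac address|OS (version)|card type & driver|settings) is easy to
handle mechanically. What follows is only a rough sketch in Python of
how a NOC might parse and re-emit such a line; the names WirelessReport,
parse_report and format_report, and the sample values, are made up for
illustration and are not from any tool we actually ran.

```python
from dataclasses import dataclass

# Field order follows the proposed report line:
#   user mac address|OS (version)|card type & driver|settings
FIELD_COUNT = 4


@dataclass
class WirelessReport:
    """One user-submitted problem report (hypothetical structure)."""
    mac: str
    os_version: str
    card_driver: str
    settings: str


def parse_report(line: str) -> WirelessReport:
    """Split a '|'-separated report line into its four fields."""
    parts = [p.strip() for p in line.split("|")]
    if len(parts) != FIELD_COUNT:
        raise ValueError(
            f"expected {FIELD_COUNT} '|'-separated fields, got {len(parts)}"
        )
    return WirelessReport(*parts)


def format_report(report: WirelessReport) -> str:
    """Re-emit a report in the same '|'-separated form."""
    return "|".join(
        (report.mac, report.os_version, report.card_driver, report.settings)
    )


if __name__ == "__main__":
    # Sample values are invented; the MAC is from the documentation range.
    sample = "00:00:5e:00:53:01 | Linux (2.4.20) | example-card, example-driver 1.0 | channel 6, WEP off"
    print(format_report(parse_report(sample)))
```

Rejecting lines with the wrong number of fields is a deliberate choice
here: a malformed report is probably better bounced back to the user
than silently filed.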
Lucy E. Lynch Academic User Services
Computing Center University of Oregon
llynch @darkwing.uoregon.edu (541) 346-1774