
Re: Gen-art review of draft-hartman-mailinglist-experiment-01.txt

2006-03-03 14:43:01
"Elwyn" == Elwyn Davies <elwynd(_at_)dial(_dot_)pipex(_dot_)com> writes:

    Elwyn> I was selected as General Area Review Team reviewer for
    Elwyn> this specification (for background on Gen-ART, please see
    Elwyn> http://www.alvestrand.no/ietf/gen/art/gen-art-FAQ.html).

Hi.
I'm sorry it has taken me so long to get back to your comments.
I've been busy trying to clear documents before the Dallas IETF.


    Elwyn> Summary: [Note: This is the first time that I have done a
    Elwyn> Process Experiment review and I will have to stretch the
    Elwyn> usual review criteria a bit.  Basically I believe I should
    Elwyn> be looking for internal self-consistency, consistency with
    Elwyn> associated processes and likelihood of damage to the
    Elwyn> functioning of the IETF.]

That seems like a good approach.  I'm also doing this for the first
time, so your cooperation in helping me is greatly appreciated.

    Elwyn> I think this draft is NOT currently in a suitable state to
    Elwyn> produce a well-defined experiment.  The main reason for
    Elwyn> this conclusion is that the experiment consists of enabling
    Elwyn> a meta-mechanism and suggesting something the IESG might
    Elwyn> possibly do with this power rather than explicitly stating
    Elwyn> that this is what the IESG should put in place.  

I'd like to focus on your general comments about the draft not being
ready.  I think we can come back to specific textual comments after
we've reached general agreement.  I'd like to first take up your
point about the IESG charter and then come back to the meta-issue:
that all this experiment does is enable the IESG to do things.


    Elwyn> My reading
    Elwyn> of s7.2 of the IESG Charter [RFC3710] is that the IESG has
    Elwyn> the power to do this sort of thing already anyway.  

Note that the IESG charter is an informational document.  It cannot
give the IESG power that it does not otherwise have.

It was not clear to the participants on the IETF list that the IESG
has the power to create mechanisms between 3683 and 3934 without a BCP
or experiment.  If the consensus of the IETF community is that the
IESG already has that power and that the IESG's power is already well
documented then I will withdraw my draft.  My personal opinion is that
the power is not well documented and thus is subject to a higher
probability of successful appeal.

    Elwyn> Were
    Elwyn> the suggested mechanisms eventually adopted, I would have
    Elwyn> some qualms about the possibility of indefinite bans being
    Elwyn> possible without allowing a wider (possibly IETF as opposed
    Elwyn> to IESG) consensus, but that point is currently moot as the
    Elwyn> actual proposals that would be put in place are not
    Elwyn> specified by this document.

I don't think the point is moot.  If there are specific limits on the
IESG's power that should be put in place, here is the place to do it.
Alternatively when we evaluate the experiment we could decide
additional limits are needed.

But let's come back to the question of whether meta-experiments are a
good idea.  I think that in order for 3933 to be a valuable tool many
of the experiments are going to be meta-experiments.  So let me first
explain why I think that's the case and then discuss how to evaluate
a meta-experiment.


The primary reason you want to encourage meta-experiments is that a
lot of the hypotheses you want to test involve delegation.  For
example I want to test the hypothesis that the right way to solve the
mailing list mess is to delegate it to the IESG.  I could delegate it
to the IESG as part of an experiment along with an initial
procedure.  But if I do that, I'm testing a different hypothesis:
whether delegating something to the IESG, with the whole IETF
designing the initial conditions, is a way to solve the problem.  As
you might imagine, that's a different hypothesis.  Since I'm on the
IESG I'm actually in a
reasonably good position to negotiate an initial procedure that the
IESG will be happy with and that would be similar to a procedure the
IESG would come up with on its own.  However we want 3933
experiments--even experiments delegating things to the IESG--to be
documents that anyone can write.  So we should require that authors of
3933 experiments demonstrate stakeholder buy-in for experiments but
not require that they take actions as if they were the stakeholders.


The second reason that you want to allow meta-experiments is that we
want to encourage RFC 3933 as the first step in process change.
Process change often results in BCPs.  You want the 3933 experiment to
be reasonably similar to a BCP so that when appropriate you can easily
convert a successful experiment into a BCP.  You would probably
replace any evaluation criteria with the results of the evaluation
and replace the sunset clause with something else.  However, you
want the
operative language to remain the same.  A significant result of the
mailing list discussion is the concern that our BCPs are too specific
and encode operational details.  If you disallow meta-experiments,
you strongly push us in the direction of overly specific BCPs.  I
think that would be a very bad idea.


Finally, we want 3933 experiments to be easy to write.  One of my
personal goals with this particular experiment was to see how easy I
could make it to write the experiment.  I think we want to come away
from this process with the conclusion that writing the document is
easy.  The hard part of process change should be building consensus,
recruiting stakeholders, educating the community and actually trying
to use a process to do superior technical work.  We should not make it
hard for people who clearly know what they want to try to express it.
So I'd like to resist the temptation to raise the bar for experiments
beyond what is necessary.  Any bar I'm asked to meet now will
probably be applied at least as stringently to future experiments.

In conclusion, the hypothesis I'm testing is meta, so my experiment
is meta.  I think allowing this is desirable because it allows us to
test the hypothesis, begins to align the experiment with an eventual
BCP if the test is successful, and supports the cultural engineering
goal of making experimentation the preferred path for process change.

You are absolutely right that we need to evaluate this at the end.
I'll first point to RFC 3933 and remind you that specific criteria for
evaluation are not required.  In that case the basic evaluation will
be based on community reaction.  Let us take a look at how that might
work in this case.

One possibility is that the IESG adopts no procedures based on this
experiment.  That sounds like a failed experiment to me: why is the
power needed if it is never used?  There might be some arguments
that having
a power prevents future problems, but these arguments would be
unlikely to be compelling.

Another case is that the IESG adopts some procedures and these
procedures never get used.  We'd actually learn a lot in this case.
The IESG would get a feel for whether this specification was clear
enough to describe what procedures could be adopted and what steps
need to be followed to adopt a procedure.  It's a strong
understatement to say that there tend to be rough edges around any
process document the first time it is run.  There are things I'd word
more clearly in 3683, 3934, 3933, 3932 and other process documents.
These tend to pop out the first time you try and use them.  We would
get a significant part of that feedback from just adopting procedures
under this experiment.  We'd need to have a community discussion about
whether we wanted to extend the experiment, codify it in a BCP or drop
it.  If the reason no procedures were used was that no problems popped
up on any mailing lists, I'd argue against dropping it.  If the
community felt comfortable, we could publish a BCP.  Otherwise we
could renew the experiment.  If the discussion proved contentious then
the next round of the experiment would probably need more evaluation
criteria.


[As an aside, RFC 3933 seems to have a rough edge in that you cannot
renew an experiment without publishing a new RFC.  I'm not at all
sure that's desirable.  However, if you end up needing to add
evaluation criteria most times you renew an experiment, perhaps that
is OK.]

If procedures are adopted and used, I think we're in a reasonable
position to evaluate the results.  I know one criterion I'd apply:
was the situation less disruptive to the work of the IETF and the
IETF leadership than an RFC 3683 action would have been?  If so,
then we made progress.  Others might have different criteria to
apply.  One possible outcome of the evaluation is that we need to
renew the experiment with explicit criteria because we cannot agree on
an evaluation.  Another is that the experiment should be the basis for
a BCP.  Another is of course that the experiment made things worse and
should terminate.

In conclusion I've proposed evaluation criteria that I believe cover
the possible outcomes.  These criteria are fuzzy, but I believe they
are no fuzzier than they would be if a specific procedure were
included.  Clearly such fuzzy criteria are allowed by RFC 3933.

All that said, if the community wants changes then the draft will
change.  We're looking for an IETF consensus here.  If there is
significant demand for specific evaluation criteria we can include
those.  If there is a community requirement for a specific initial
procedure, I can include one.  I think that particular change would
set a harmful precedent, but I would make the change if that is
what the
community wants.  However, having explained that I'm explicitly trying
to experiment with delegation and having proposed possible ways of
thinking about evaluating the result, I would want to see a much
stronger call for change than one review.  If the experiment actually
isn't viable then one comment should be sufficient.  However I don't
think that is the case.




    Elwyn> Review:

    Elwyn> s1: I think that this section of the last paragraph of s1
    Elwyn> overstates/mis-states what this document actually does:
    >>  This memo is an RFC 3933[RFC3933] experiment to provide the
    >> community with additional mechanisms to manage its mailing
    >> lists while the community decides what mailing list guidelines
    >> are appropriate.  In particular this experiment creates a level
    >> of sanction between RFC 3934 and RFC 3683 for working group
    >> lists and creates sanctions other than RFC 3683 for
    >> non-working-group lists.
    >> 
    Elwyn> In practice all that s4 mandates is:
    >>  During the experiment period, the IESG MAY approve other
    >> methods of mailing list control besides those outlined in RFC
    >> 3683 and RFC 3934 to be used on a specified set of IETF mailing
    >> lists.
    >> 
    Elwyn> i.e., it enables a meta-mechanism.  s4 then goes on to
    Elwyn> suggest some things the IESG *might* adopt (second half of
    Elwyn> para 1 and all of para 2 of s4) and the procedures that it
    Elwyn> must go through to get them adopted.  The result of this is
    Elwyn> (in the first instance) merely the provision to allow the
    Elwyn> IESG to decide to do something.  I don't think this
    Elwyn> constitutes a well-formed experiment.


Often, the community creates mechanisms by delegating the power to
someone.  I'd be interested in other comments on whether the last
sentence of s1 overstates things.

I think I'd rather hear your responses to these general comments
before going through the rest of your detailed comments.

--Sam


_______________________________________________
Ietf mailing list
Ietf@ietf.org
https://www1.ietf.org/mailman/listinfo/ietf