The second thing on my mind regards the role of a control condition
in experimental design. I wonder if I've done an effective job of
articulating just how important it is to make sure that the *only*
systematic difference between (an) experimental group(s) and a
control group is/are the independent variable(s). I'm thinking here
of the posting that suggested that control group addresses should
open spam, click-thru, etc.
*Exactly the opposite is true*. A control group is just like a
"sugar pill" condition in a drug study. The "job" of a control group
is to show the effects of doing *nothing*. Now, if one is interested
in testing the effects of click-thru, it's easy to add multiple
experimental conditions. But under no circumstances should the
control group "do" anything (except "sit there").
Wasn't the original thread about the difference between permanent negative
responses to spam (550 rejections) and other, positive responses?
The independent variable in this case is "550ing" versus "not550ing", and
consequently both sets should otherwise behave identically.
It might seem that if the 550 set opens and clicks through any spam it
receives, then so should the not550 set. But this assumes that both sets are
receiving the same spam, which is where we came in (i.e. that's what we're
trying to determine).
It's clearly simpler if both sets behave identically by doing
nothing of the sort. Neither set should open spam, click through, or whatever.
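To make "the only systematic difference is the 550" concrete, here is a minimal sketch of how the two sets might be drawn. It assumes random assignment of otherwise identical trap addresses, which is my assumption about how we'd do it rather than anything agreed in the thread. The function name, the seed, and the example.org addresses are all made up for illustration.

```python
import random

def assign_groups(addresses, seed=0):
    """Randomly split addresses into a '550' set and a 'not550' set.

    Hypothetical helper: a fixed seed keeps the split reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(addresses)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    # The two sets differ only in how the server will answer for them.
    return {"550": sorted(shuffled[:half]), "not550": sorted(shuffled[half:])}

# Made-up trap addresses; neither set opens, clicks or replies.
addresses = [f"trap{i:02d}@example.org" for i in range(20)]
groups = assign_groups(addresses)
```

Because the split is random, any pre-existing difference between addresses should be spread across both sets rather than lining up with the 550 treatment.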
There's a practical issue here: to get decent volumes of spam, it may be
thought useful to make both sets avid for spam for a period before the
study starts, by accepting mail for both sets and clicking, replying, you
name it. Once the study starts, it will be conducted as above. However, this
gives us two groups, one whose 550 behaviour is stable (not550->not550) and
one in which we have changed the behaviour (not550->550), so the hypothesis
is somewhat different.
(A) In the first instance, the null hypothesis is that 550ing does not give
us less spam.
(B) In the second instance, the null hypothesis is that applying a 550
strategy doesn't reduce spam.
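Either null hypothesis could be checked with a one-sided test on per-address spam counts. The sketch below uses an exact permutation test, which is my choice for illustration and not something the thread has settled on. The function name is hypothetical and the counts are invented, not measurements.

```python
from itertools import combinations

def exact_permutation_p(not550_counts, counts_550):
    """One-sided exact permutation test.

    H0: 550ing does not reduce spam. Returns the fraction of relabellings
    in which the not550 total is at least as large as the observed one.
    """
    pooled = list(not550_counts) + list(counts_550)
    k = len(not550_counts)
    observed = sum(not550_counts)
    total = 0
    extreme = 0
    # Enumerate every way of labelling k of the pooled addresses "not550".
    for idx in combinations(range(len(pooled)), k):
        total += 1
        if sum(pooled[i] for i in idx) >= observed:
            extreme += 1
    return extreme / total

# Invented spam counts per address over the study period.
p_value = exact_permutation_p([50, 60, 55, 58, 62], [10, 12, 9, 11, 13])
```

A small p-value would be evidence against the null. With group sizes fixed, comparing totals is equivalent to comparing means, and exhaustive enumeration is only practical for small sets of addresses.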
These both seem perfectly good experiments; I don't know which one we
should be doing. Although "A" seems easier to implement, it may not attract
enough spam to produce usable numbers.
Or have I got this wrong?