ietf-smtp

Re: [ietf-smtp] Dombox - A Zero Spam Mail System

2019-10-04 22:59:48
Viruthagiri,

What would you like to be extracted from your extensive work? Do you have a protocol in mind? a method? an Extended SMTP proposal? a rule to be applied at SMTP? At the MSA, the MDA, the MFA (Mail Filtering Agent) or the MUA?

There are SMTP documents for Sender Rewriting rules, to rewrite the reverse-path. I recall one was used with the idea to circumvent the SPF Node Transition problems when the IP changes but the reverse-path domain does not. The reverse-path is a persistent identity.

Keep in mind that SMTP does not do what your MUA is doing. SMTP is about the transport of mail. Filtering, classification, and RBM (the once-patented concept of "Rule-Based Messaging") are generally implementation (product) specific things, and would normally be something that separates us.

What you can do, in my opinion, is write an individual draft proposal with a status of "Informational" that describes your method. Try to isolate it to a general algorithm without mentioning your product or implementation method (e.g., the Chrome extensions that make your UI web-based); your protocol and your rules should be independent of that UI.

This is one way I could see this getting some traction. Give us protocol rules for handling the SMTP Process Entities, which are:

CIP - Connection IP
CDN - Client Domain Name (presented at EHLO/HELO)
RP  - Reverse Path
FP  - Forwarding Path
DATA - RFC5322 payload

Those are the SMTP process parameters. Present a proposal for using these parameters for the betterment of email-kind.
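
As a rough illustration of the five entities listed above, they can be carried through a filter as one record. This is only a sketch: the class name, field names, and the helper `domain_of` are hypothetical and do not come from any SMTP document, and the sample IP and host name are made up.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SmtpTransaction:
    """The five SMTP process entities (hypothetical field names)."""
    cip: str            # CIP: IP address of the connecting client
    cdn: str            # CDN: client domain name presented at EHLO/HELO
    rp: str             # RP: reverse-path given in MAIL FROM
    fp: List[str]       # FP: one path per RCPT TO command
    data: bytes = b""   # DATA: RFC5322 payload, empty before the DATA phase


def domain_of(path: str) -> str:
    """Return the lower-cased domain of 'user@domain', or '' if there is none."""
    if "@" not in path:
        return ""
    return path.rpartition("@")[2].lower()


txn = SmtpTransaction(
    cip="192.0.2.10",
    cdn="mail.acme.com",
    rp="someuser@acme.com",
    fp=["normal@example.com", "restricted@example.com"],
)
```

A proposal could then be stated as rules over these fields, for example comparing `domain_of(txn.rp)` against `txn.cdn`.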


--
HLS

On 10/2/2019 1:59 AM, Viruthagiri Thirumavalavan wrote:
Sorry for the late reply. There was a family function which I had to
attend and I couldn't find enough time to respond.

    So how does your system classify this email you're reading right now?
    Is it spam, or bulk mail, or something the user wanted?


We decide whether it's a human-to-human mail or website-related-mail
based on the RCPT TO address type. Not based on sender.

MAIL FROM:<someuser(_at_)acme(_dot_)com>
250 OK
RCPT TO:<quora(_dot_)com(_at_)test123(_dot_)example(_dot_)com>
250 OK
RCPT TO:<normal(_at_)example(_dot_)com>
250 OK
RCPT TO:<restricted(_at_)example(_dot_)com>
250 OK

Mail will be accepted for quora(_dot_)com(_at_)test123(_dot_)example(_dot_)com
only if acme.com is whitelisted in _sad.quora.com => "v=sad1 acme.com -all"
Mail will be accepted for normal(_at_)example(_dot_)com without any issues.
Mail will be accepted for restricted(_at_)example(_dot_)com only if acme.com
is a "verified stranger" [MX, SPF, A, +SPF]

So we decide how to proceed based on the RCPT TO address.
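
The three rules above can be sketched roughly as follows. This is an illustration, not the Dombox implementation. Several things are assumptions: that a service address is recognised by a dotted local part on a subdomain of the hosting domain, that restricted mailboxes are known from a settings table, and that a SAD record lists allowed domains as bare tokens (only the one example record from this thread is known). The DNS lookup and the "verified stranger" check are passed in as functions so that nothing here touches the network.

```python
from typing import Callable, Optional, Set, Tuple

HOSTING_DOMAIN = "example.com"            # assumption: the hosted mail domain
RESTRICTED = {"restricted@example.com"}   # assumption: mailboxes in restricted mode


def parse_sad(record: str) -> Set[str]:
    """Parse a record shaped like 'v=sad1 acme.com -all' into allowed domains."""
    tokens = record.split()
    if not tokens or tokens[0].lower() != "v=sad1":
        return set()
    # Tokens with a qualifier prefix (such as '-all') are not domain entries.
    return {t.lower() for t in tokens[1:] if not t.startswith(("-", "+", "~", "?"))}


def classify_rcpt(rcpt: str) -> Tuple[str, Optional[str]]:
    """Return (kind, service_domain) for a RCPT TO address."""
    address = rcpt.lower()
    local, _, domain = address.rpartition("@")
    if domain.endswith("." + HOSTING_DOMAIN) and "." in local:
        return ("service", local)          # e.g. quora.com@test123.example.com
    if address in RESTRICTED:
        return ("restricted", None)
    return ("normal", None)


def accept(mail_from: str,
           rcpt: str,
           sad_lookup: Callable[[str], Optional[str]],
           is_verified_stranger: Callable[[str], bool]) -> bool:
    """Decide acceptance from the RCPT TO address type, as described above."""
    sender_domain = mail_from.rpartition("@")[2].lower()
    kind, service = classify_rcpt(rcpt)
    if kind == "service":
        record = sad_lookup("_sad." + service)
        return record is not None and sender_domain in parse_sad(record)
    if kind == "restricted":
        return is_verified_stranger(sender_domain)
    return True                            # normal mailbox: accept
```

For example, with `sad = {"_sad.quora.com": "v=sad1 acme.com -all"}`, calling `accept("someuser@acme.com", "quora.com@test123.example.com", sad.get, lambda d: False)` accepts the mail, while the same recipient with a sender from any other domain is refused.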

The owner of "restricted(_at_)example(_dot_)com" agrees to use that
email address only for "human-to-human" mails when they enable
restricted mode.

While I can warn end users when they enable restricted mode, I can't
police them. What happens when an end user gives the address
"restricted(_at_)example(_dot_)com" to a mailing list like ietf and
enables restricted mode? I definitely don't want to send challenge
mails to a big mailing list. That's why I proposed looking for
"List-Unsubscribe" headers to detect non-human mails.

Unlike other C/R-based systems, we actually send our challenge mails to
the MAIL FROM address. In ietf.org mails, the MAIL FROM address is
"ietf-smtp-bounces(_at_)ietf(_dot_)org". So in most cases, our
challenge mails are not going to cause any issues.
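
The two points above (skip the challenge when a "List-Unsubscribe" header is present, otherwise challenge the MAIL FROM address) might look roughly like this. It is a sketch using Python's standard `email` package. The function name is hypothetical, and the null reverse-path check is an extra safeguard assumed here, not something described in this thread.

```python
from email import message_from_string
from typing import Optional


def challenge_target(mail_from: str, raw_message: str) -> Optional[str]:
    """Return the address to challenge, or None if no challenge should be sent."""
    msg = message_from_string(raw_message)
    if msg.get("List-Unsubscribe") is not None:
        return None        # looks like list or bulk mail: do not challenge
    if not mail_from:
        return None        # assumption: never challenge a null reverse-path
    return mail_from       # envelope sender, not the From: header
```

Header lookup in the `email` package is case-insensitive, so `list-unsubscribe` and `List-Unsubscribe` are treated the same.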

    That breaks one of the big advantages of having a fixed address -
    recognizability by others.  The address I'm using in this email has
    been in use for at least 2 decades now, and allows others to tell
    that the "me" posting on this list is the same "me" posting to a
    Linux mailing list.
    That's why '+' addressing was invented.


You are going to create an isolated mail address only for
website-related-mails. The proper term for "website-related-mails" in
our system is "service-mails". A service is identified via a domain,
e.g. facebook.com

When you give an isolated mail address to a service, you generally
don't care about "recognizability by others". You need that only in
special cases like mailing lists, public directory websites etc.

    1) Many users will *not* think to create a new address for each
    mailing list before they subscribe.


Only a tiny percentage of email users use mailing lists. Our isolated
mail addresses are created for a service domain. IETF has multiple
mailing lists like SMTP, UTA etc. You don't have to create a separate
mail address for each mailing list. You create only one address here,
which is tied to the domain ietf.org

Your statement already implies that a few users are okay with that.
Those few users are security-conscious folks. So the real question is
"How can we educate the rest?"

I have checked your email address "valdis(_dot_)kletnieks(_at_)vt(_dot_)edu"
on haveibeenpwned.com <https://haveibeenpwned.com/>. Your email address
already appears in 10 breaches. You either used the same password for
all 10 breached sites or used a unique password for each site. In the
latter case, you relied on software to remember the passwords.

If creating a unique password seems normal to you, then why does
creating a unique email address seem abnormal? Probably because you
don't see the benefits yet. My paper clearly talks about the benefits
of having a unique email address for each website.

My white paper is a plan for the next 10 years. One cannot change the
world overnight. If you take a look at Everett Rogers's "Diffusion of
innovations" theory, he classifies adopters as innovators, early
adopters, early majority, late majority and laggards.

He defined them like this.

Innovators:
Innovators are willing to take risks, have the highest social status,
have financial liquidity, are social and have closest contact to
scientific sources and interaction with other innovators. Their risk
tolerance allows them to adopt technologies that may ultimately fail.
Financial resources help absorb these failures.

Early adopters:
These individuals have the highest degree of opinion leadership among
the adopter categories. Early adopters have a higher social status,
financial liquidity, advanced education and are more socially forward
than late adopters. They are more discreet in adoption choices than
innovators. They use judicious choice of adoption to help them
maintain a central communication position.

Early Majority:
They adopt an innovation after a varying degree of time that is
significantly longer than the innovators and early adopters. Early
Majority have above average social status, contact with early adopters
and seldom hold positions of opinion leadership in a system.

Late Majority:
They adopt an innovation after the average participant. These
individuals approach an innovation with a high degree of skepticism
and after the majority of society has adopted the innovation. Late
Majority are typically skeptical about an innovation, have below
average social status, little financial liquidity, in contact with
others in late majority and early majority and little opinion leadership.

Laggards:
They are the last to adopt an innovation. Unlike some of the previous
categories, individuals in this category show little to no opinion
leadership. These individuals typically have an aversion to
change-agents. Laggards typically tend to be focused on "traditions",
lowest social status, lowest financial liquidity, oldest among
adopters, and in contact with only family and close friends.

So in the early days, we have to focus only on the Innovators
and Early adopters. They will probably be okay with creating a unique
mail address as long as they see the value. Then we have to improve
the user experience to bring the rest of the adopters on board.

    1a) Sometimes, it's not directly under the user's control.  For
    example, I end up using the same address for email from Scouting
    USA National, and from my local troop (because the troop roster is
    fed from National's records..)


Restricted mode is an optional feature. You don't have to turn it on.
Also, my system supports multiple mailboxes. You can enable restricted
mode only for selected mailboxes.


    2) You need to Do The Right Thing if somebody you've never
    interacted with does an off-list reply to a post to a list.


If necessary, we can provide an option like "Allow Off-List Replies"
in the mailbox settings. If enabled, the system would treat
non-service mails like human-to-human mails. e.g. You create a dombox
address for ietf.org. The isolated mail address looks like
ietf(_dot_)org(_at_)test123(_dot_)example(_dot_)com

ietf.org - The system allows all mails from ietf.org + its alias
domains by default.
When "Allow Off-List Replies" mode is OFF, it rejects mail from all
other domains.
When "Allow Off-List Replies" mode is ON, it treats mail from all other
domains just like human-to-human mail, i.e. the sender may have to
solve a CAPTCHA to get the mail delivered.
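
The "Allow Off-List Replies" behaviour described above reduces to a small decision function. This is a sketch only: the function name, the three return values, and the alias domain in the usage example are hypothetical, and "challenge" stands for the human-to-human path in which the sender may have to solve a CAPTCHA.

```python
from typing import Iterable


def service_mailbox_action(sender_domain: str,
                           service_domain: str,
                           alias_domains: Iterable[str],
                           allow_off_list_replies: bool) -> str:
    """Return 'accept', 'challenge' or 'reject' for mail to a service mailbox."""
    sender = sender_domain.lower()
    allowed = {service_domain.lower()} | {d.lower() for d in alias_domains}
    if sender in allowed:
        return "accept"                  # the service itself or one of its aliases
    if allow_off_list_replies:
        return "challenge"               # treat like human-to-human mail
    return "reject"
```

With `service_domain="ietf.org"` and a made-up alias `"ietf.example"`, mail from `vt.edu` is rejected when the option is off and challenged when it is on.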

    3) If you want your scheme to work in the real world, it needs to
    play nice with browser form autofill.


Page 58 of my paper says,


    If the second structure is compatible with 99.99% of the domains,
    why do we need the first one? Why not just stick with the second?
    Well… We need the first structure to provide good user experience.
    Since the first address structure starts with the {Domkey}, we can
    offer autocomplete feature.
    When a user trying to login to a third party website, instead of
    typing the whole email address like
    “testkey123$twitter(_dot_)com(_at_)domboxmail(_dot_)com”, the user
    now can type “testkey123$” and then press tab.
    Our autocomplete feature will behave like this.
    Form Email Field => Alphanumeric characters + The dollar symbol +
    Tab key press = capture the current domain and then autocomplete
    the isolated email address
    Although we can add “Autocomplete” feature in subdomain-based
    structure too, it won’t be as good as Dollar-based structure.


I was already focusing on the user experience issues. That's the
reason why we have dollar-based address structures.
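
The autocomplete rule quoted above (alphanumeric characters, then the dollar symbol, then Tab) can be sketched as a pure function. A real version would live in a browser extension; Python is used here only to show the string handling. The function name is hypothetical, and the mail domain is taken from the example address in the paper quote.

```python
import re
from typing import Optional

MAIL_DOMAIN = "domboxmail.com"              # from the example address quoted above
_TRIGGER = re.compile(r"[A-Za-z0-9]+\$")    # alphanumerics followed by one '$'


def autocomplete(typed: str, current_domain: str) -> Optional[str]:
    """Expand '<domkey>$' into the isolated address for the current site.

    Returns None when the typed text is not a Domkey followed by '$'.
    """
    if not _TRIGGER.fullmatch(typed):
        return None
    return f"{typed}{current_domain.lower()}@{MAIL_DOMAIN}"
```

Typing `testkey123$` on twitter.com and pressing Tab would correspond to `autocomplete("testkey123$", "twitter.com")`, which yields `testkey123$twitter.com@domboxmail.com`.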

Password managers like 1Password offer extensions
<https://chrome.google.com/webstore/detail/1password-extension-deskt/aomjjhallfgjeglblehebfpbcfeobpgk>
for browsers. We can bring such extensions to our system. Password
managers focus on the "password" field. We focus on the "email" field.
Our address structures are not based on random strings. They follow a
standardised structure, so you don't have to rely on browser
extensions to remember our email addresses.

Besides, our system already offers two buttons to provide a better user
experience: Teleport and Telescribe. Teleport works the same way as
"Sign In With Apple": the Teleport button handles user signups and
logins without the user having to remember email addresses and
passwords. Telescribe handles email newsletter subscriptions.

    So... is there actually an architecture or proposal in that 300
    page document, or is it just a cargo-cult collection of things that
    people have used in the past, to varying levels of success?


This is what you posted 8 months back when I asked for feedback.

    /(a) every aspect we could understand from your writing has
    already been tried and failed, and (b) you've repeatedly proven
    that you're totally unaware of the state of the art on both the
    spammer side and the anti-spammer side. Oh, and (c) you appear to
    be totally unaware of just how little you know./


It's been 8 months. Not much has changed. You still have no idea
what's in my paper, yet you keep posting insulting comments. I'm
surprised that you come from an educational background. "Spray and
Pray" is not a quality of a teacher.

To answer your question: if my words had any value to you, you would
have already gone through my paper.

    And are there any production workload numbers showing that it (a)
    scales and (b) actually gets anywhere closer to "Zero" spam than
    most providers are already managing?


I don't have any numbers. But I have an outdated prototype video
<https://www.youtube.com/watch?v=VK2eSfCurx4> which I uploaded last
year. That video may not make any sense unless you read my white paper.

PS: I have read up on the fast-flux technique.

To quote wikipedia, "The basic idea behind Fast flux is to have
numerous IP addresses associated with a single fully qualified domain
name, where the IP addresses are swapped in and out with extremely
high frequency, through changing DNS records".

Since my system is a domain-based reputation system, fresh domains
will be rate limited. Aged domains used for fast flux will probably be
blacklisted quickly.

On Mon, Sep 30, 2019 at 3:47 AM Valdis Klētnieks
<valdis(_dot_)kletnieks(_at_)vt(_dot_)edu> wrote:

    On Sun, 29 Sep 2019 11:27:15 +0530, Viruthagiri Thirumavalavan said:

    > You are talking about bulk mailing here. Our "injection" phase is not
    > designed for bulk mails. That's why we throw this warning when the user
    > enabled restricted mode.

    So how does your system classify this email you're reading right now?

    Is it spam, or bulk mail, or something the user wanted?

    > I want them to create an isolated mail address which would look like
    > ietf(_dot_)org(_at_)test123(_dot_)example(_dot_)com and then give this
    > address while signing up to ietf mailing list.

    That breaks one of the big advantages of having a fixed address -
    recognizability by others.  The address I'm using in this email has
    been in use for at least 2 decades now, and allows others to tell
    that the "me" posting on this list is the same "me" posting to a
    Linux mailing list.

    That's why '+' addressing was invented.

    A few user experience issues:

    1) Many users will *not* think to create a new address for each
    mailing list before they subscribe.

    1a) Sometimes, it's not directly under the user's control.  For
    example, I end up using the same address for email from Scouting
    USA National, and from my local troop (because the troop roster is
    fed from National's records..)

    2) You need to Do The Right Thing if somebody you've never
    interacted with does an off-list reply to a post to a list.

    3) If you want your scheme to work in the real world, it needs to
    play nice with browser form autofill.

    So... is there actually an architecture or proposal in that 300
    page document, or is it just a cargo-cult collection of things that
    people have used in the past, to varying levels of success?

    And are there any production workload numbers showing that it (a)
    scales and (b) actually gets anywhere closer to "Zero" spam than
    most providers are already managing?

--
Best Regards,

Viruthagiri Thirumavalavan
Dombox, Inc.



_______________________________________________
ietf-smtp mailing list
ietf-smtp(_at_)ietf(_dot_)org
https://www.ietf.org/mailman/listinfo/ietf-smtp