[Glyph-discuss] meta spam

Glyph Lefkowitz glyph-discuss@twistedmatrix.com
Fri, 15 Nov 2002 14:03:15 -0600 (CST)



Lately I've been getting a lot of e-mail forwards and IMs about UCE
(unsolicited commercial e-mail, better known as spam).  Rather than respond to
each one individually I figured I'd post a little something here.

Spam is a solvable problem.  It works because the cost of email falls entirely
on the receiver, and not the sender.  Despite the stories of incredible wealth
that results from this (all media outlets seem to focus on spammers who retire
before 35), spammers' margins are razor-thin.  They make money because their
cost of operations is effectively zero.

Let's say a spammer sends out a hundred messages, which takes about 1 second on
their DSL (which they are buying for home use anyway, so it effectively costs
nothing beyond their normal cost of living) and they get 0 responses.  So, they
try again: this time they send out a hundred million messages: it takes a
little longer... maybe a few days, if they've got a slow connection.  Still,
they've basically given up nothing, since they were not otherwise using that
bandwidth.

Now they get 500 responses to buy a $50 product: that's $25,000, in the space
of a few days.  This is a vanishingly small response rate: about 0.0005%.
Still, it's half a year's salary, in one mailing.  Most of those messages don't
have to get through in order to justify that kind of payback, so the spammers
will keep spamming.
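
The arithmetic in that example can be checked with a few lines of Python.  This
is only a sketch using the figures quoted above; the variable names are made up
for illustration.

```python
# Figures from the example above; the variable names are illustrative only.
messages_sent = 100_000_000   # "a hundred million messages"
responses = 500               # buyers who respond
price_usd = 50                # price of the product

revenue_usd = responses * price_usd
response_rate_pct = responses / messages_sent * 100

print(f"revenue:       ${revenue_usd:,}")
print(f"response rate: {response_rate_pct:.4f}%")
```

The point is that the income depends only on the absolute number of buyers,
so a tiny response rate is no obstacle as long as sending costs nothing.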

Once the transaction of sending a message costs them anything, anything at all,
multiplying it by a hundred million becomes a serious enterprise.  For example,
if it cost a penny, it would cost a million dollars to do that mailing: far
more than the $25,000 that they recouped.  Charging a penny has problems with
it, since financial transactions on the 'net are not trivial, but if you can
translate that cost into, say, a minute of human time, you need somebody
working for $5/hr to sit there and do whatever task it is that takes a minute.
That person has to work for a hundred million minutes (or about 190 years,
working around the clock) to get out the mailing.  Interest in the product
will most likely have declined by 2192.
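
The cost side can be sketched the same way.  The numbers come from the
paragraph above; the break-even figure and the total wage bill are derived
from them and are not stated in the original, and all names here are
hypothetical.

```python
MESSAGES = 100_000_000        # size of the mailing
REVENUE_USD = 25_000          # 500 responses at $50 each

# Option 1: charge one cent per message (integer cents avoid float error).
penny_cost_usd = MESSAGES * 1 // 100

# Derived, not in the original: the per-message cost at which the
# mailing stops paying for itself.
break_even_usd = REVENUE_USD / MESSAGES

# Option 2: one minute of human time per message, paid at $5/hr.
minutes_needed = MESSAGES
years_nonstop = minutes_needed / (60 * 24 * 365)   # one person, no breaks
wage_bill_usd = minutes_needed / 60 * 5            # derived from $5/hr

print(f"penny-per-message cost: ${penny_cost_usd:,}")
print(f"break-even cost/message: ${break_even_usd:.5f}")
print(f"years of nonstop work:  {years_nonstop:.0f}")
print(f"wage bill at $5/hr:     ${wage_bill_usd:,.0f}")
```

Under these assumptions, any per-message cost above a small fraction of a cent
already makes the mailing a loss.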

This is old news, though.  More interesting is the economic model behind spam
*prevention*, and the lack thereof.  In reality, it is in no one's interest to
effectively and permanently block spam.

The reason spam exists in the first place is that there is a basic flaw in the
internet mail infrastructure.  I believe it's possible (ahem, with Twisted) to
solve this problem pretty handily.  In order to demonstrate why I think the
infrastructure is not hard to fix, I will outline my design for a solution.  It
is an email server with a whitelist that works according to a few simple rules:

   1. Your "main" email address always bounces messages.  The bounce goes
      something like this: "Hi!  This is the first time you've sent mail to
      glyph@no-spam.com.  Please visit http://no-spam.com/email/glyph and enter
      your authentication info here."  On that URL is a fairly simple
      non-machine-readable image with some pseudo-random noise on it that makes
      it dirty enough that OCR doesn't work, and a text-entry field.  The user
      is prompted to copy the sequence of letters and numbers into that field,
      proving that they are in fact a person.

   2. Every user that goes through this approval procedure gets a different
      semi-secret email address for you.  So, you don't send email to
      glyph@no-spam.com, you send email to x884nfygj2@no-spam.com.  Assuming
      your mail client has a semi-sane address book, you can't tell the
      difference: you just type "glyph" and it works.  This gives the server a
      sanity check: a spammer gains nothing by discovering just one of your
      semi-secret addresses, or just the address of someone who is allowed to
      send you mail; they have to match the two up.

   3. There is one exception to the main-email-address bounce: any message that
      is correctly signed by a PGP key to which a trust path can be calculated
      gets through automatically.
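
Rule 3 depends on being able to calculate a trust path.  As a rough sketch of
what that could mean, the following Python does a plain breadth-first search
over a graph of "key A has signed key B" edges.  The function name, the
`signatures` mapping and the `max_depth` limit are all assumptions made for
illustration; real OpenPGP implementations apply more elaborate validity rules
than a simple path search.

```python
from collections import deque

def find_trust_path(signatures, my_key, signer_key, max_depth=4):
    """Return a list of key IDs leading from my_key to signer_key, or None.

    signatures maps a key ID to the set of key IDs that key has signed.
    max_depth limits how many signature hops are allowed.
    """
    if my_key == signer_key:
        return [my_key]
    queue = deque([[my_key]])
    seen = {my_key}
    while queue:
        path = queue.popleft()
        if len(path) - 1 >= max_depth:
            continue  # this path already uses the maximum number of hops
        for next_key in sorted(signatures.get(path[-1], ())):
            if next_key in seen:
                continue
            if next_key == signer_key:
                return path + [next_key]
            seen.add(next_key)
            queue.append(path + [next_key])
    return None

# Hypothetical web of trust: I signed alice, alice signed bob, and so on.
web = {"me": {"alice"}, "alice": {"bob"}, "bob": {"carol"}}
print(find_trust_path(web, "me", "carol"))
print(find_trust_path(web, "me", "mallory"))
```

A mail server following rule 3 would only need the yes/no answer: deliver if a
path exists, otherwise fall back to the bounce.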

#1 prevents any automated messages from getting through in the simple way they
do now.

#2 allows you to still get notifications from stupid web services that will
never provide any security (e.g. Amazon).

#3 makes secure email more convenient than non-secure email: when all servers
(and services) obey this rule, then there will be no need for either of the
first two (though it may be necessary to issue time-limited addresses for
obnoxious stores that you nevertheless want to order from).
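
The three rules can be sketched as a small delivery policy.  This is not an
implementation of the proposed server, only an illustration of the decision
logic: the class name, method names and return strings are hypothetical, the
challenge page itself is left out, and whether a message carries a trusted PGP
signature is simply passed in as a flag.

```python
import secrets

class WhitelistMailbox:
    """Toy model of the three rules for a single user."""

    def __init__(self, main_local_part, domain):
        self.main = main_local_part
        self.domain = domain
        self.alias_to_sender = {}   # semi-secret local part -> approved sender

    def approve_sender(self, sender):
        """Rule 2: after the sender passes the challenge, mint their address."""
        alias = secrets.token_hex(5)
        while alias in self.alias_to_sender or alias == self.main:
            alias = secrets.token_hex(5)
        self.alias_to_sender[alias] = sender
        return f"{alias}@{self.domain}"

    def decide(self, recipient, sender, signed_by_trusted_key=False):
        local, _, domain = recipient.partition("@")
        if domain != self.domain:
            return "reject"
        if local == self.main:
            # Rule 3: a trusted signature bypasses the bounce.
            if signed_by_trusted_key:
                return "deliver"
            # Rule 1: the main address always bounces with a challenge.
            return "bounce-with-challenge"
        # Rule 2: the alias must match the sender it was issued to.
        if self.alias_to_sender.get(local) == sender:
            return "deliver"
        return "reject"

box = WhitelistMailbox("glyph", "no-spam.com")
print(box.decide("glyph@no-spam.com", "stranger@example.org"))
secret = box.approve_sender("friend@example.org")
print(box.decide(secret, "friend@example.org"))
print(box.decide(secret, "spammer@example.net"))
```

Time-limited addresses for stores would fit the same shape, with an expiry
time stored next to each alias.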

So let's say I'm right and this plan could in fact eliminate all spam on the
internet.  Why isn't anybody doing it?  I am sure that I'm not the only person
who has figured it out.

The answer is simple: if we fixed the internet mail infrastructure in this way,
we would be destroying not one, but *two* industries.  Spammers would obviously
be hurt, but less obvious is that spam-*blockers* would be hurt.  If
NoMoreSpam, Inc charges you $5000 for a server blacklist, they're sure as heck
not going to want to give you a whitelist for free.  And, unless a few
offensive spams a year get through their blocking software, you'll forget about
spam blocking entirely and their service will seem much less important.  With a
spam epidemic, spam blocking is a highly competitive industry.

This is similar to the problem with virus-protection software.  Most people
with a clue about computer security know that the fact that viruses exist is
largely due to problems with the operating system.  The operating system
manufacturers don't have an economic incentive to fix the OS: that's not what
people pay them for.  They get paid on the basis of features and network
effects, not security.  The fact that Linux is generally more resistant to
viruses than Windows owes more to the weakening of these economic effects than
to any real advantage in the system's design.

In a world where the operating system core is free and anyone could fix it,
virus protection companies are similarly not motivated to provide fixes for the
core OS; it is far more profitable to release new virus definition files every
few months.

In both cases, a group of experts is profiting by creating large databases
about a problem and selling the database, rather than fixing the problem.  On
the face of it, that might sound fraudulent, but can you really blame them?
Creating a real solution would instantly devalue all of their expertise, and
they would have to find other jobs.  In fact, the people with the expertise to
really fix the problem are probably employed by an industry that works under
this economic model, so there's no way for them to develop a real solution and
get their company to accept it.

DISCLAIMER: this is just a theory.  If my as-yet-unverified assumption that
spam is *easily* and reliably blockable turns out to be false, then it's really
a problem of not being able to fund the research because the problem isn't big
enough yet.  I think that the solution will cost about $50,000 to build and
deploy on a reasonable scale, so it's not "trivial" by the standards of an
individual's budget, although it's certainly less than the amount of money
spent on spam prevention in the USA every year.

The current mail situation on the 'net isn't a good one though, and it's the
consumer's responsibility to demand a real solution to the problem rather than
a stream of half-measures.

And now back to the code-mines for me.  I really didn't have the time today to
write this email :-)

-- 
 |    <`'>    |  Glyph Lefkowitz: Traveling Sorcerer   |
 |   < _/ >   |  Lead Developer,  the Twisted project  |
 |  < ___/ >  |      http://www.twistedmatrix.com      |
