Internet Epidemiology

by Jacob Malthouse on March 20, 2007

[Figure: figure14_snowmap.gif]

In London in 1854, Dr. John Snow mapped the incidence of cholera and was able to connect the outbreak to the city’s water system.

The recent New York Times article “Research Tracks Down a Plague of Fake Web Pages” covers research by Microsoft and the University of California, Davis that traces some of the roots of spam on the World Wide Web and the Internet. Conclusions included:

The two top non-commercial TLD spam sources are .edu and .gov

Additional TLD spam sources are as follows:

Registry    Percentage of spam
.com        4%
.org        11%
.net        12%
.biz        53%
.info       68%
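
As a rough illustration of how per-TLD percentages like these could be computed, here is a minimal sketch. It assumes you already have a list of URLs labeled as spam or not; the function name `spam_rate_by_tld` and the sample URLs are hypothetical and do not come from the paper.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def spam_rate_by_tld(labeled_urls):
    """Return {tld: percentage of URLs under that TLD labeled as spam}."""
    totals = defaultdict(int)
    spam = defaultdict(int)
    for url, is_spam in labeled_urls:
        host = urlsplit(url).hostname or ""
        # Take the last dot-separated label of the host as the TLD.
        tld = "." + host.rsplit(".", 1)[-1] if "." in host else host
        totals[tld] += 1
        if is_spam:
            spam[tld] += 1
    return {tld: 100.0 * spam[tld] / totals[tld] for tld in totals}


# Hypothetical labeled sample, for illustration only.
sample = [
    ("http://a.example.info/x", True),
    ("http://b.example.info/", True),
    ("http://c.example.info/", False),
    ("http://d.example.com/", False),
]
rates = spam_rate_by_tld(sample)
```

The hard part in practice is the labeling itself, which this sketch takes as given.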

Additional results of the paper included:

That for doorway domains, the free blog-hosting site blogspot.com had an order of magnitude more spam appearances in top search results than other hosting domains in both benchmarks, and was responsible for about one in every four spam appearances (22% and 29% in the two benchmarks, respectively).

That over 60% of unique .info URLs in the researchers’ search results were spam, an order of magnitude higher than the spam percentage for .com URLs.

That the domain topsearch10.com was behind over 1,000 spam appearances in both benchmarks, and the 209.8.25.150 ~ 209.8.25.159 IP block where it resided hosted multiple major redirection domains that collectively were responsible for 22-25% of all spam appearances.

That for aggregators, two IP blocks, 66.230.128.0~66.230.191.255 and 64.111.192.0~64.111.223.255, appeared to be responsible for funneling an overwhelmingly large percentage of spam-ad click-through traffic.

That for advertisers, even ads for well-known websites had a significant presence on spam pages.
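
To check whether a given address falls inside one of the IP ranges quoted above, something like the following sketch could be used. It relies only on Python's standard `ipaddress` module; the block labels and the helper names `classify` and `to_cidr` are assumptions made for this example, not terms from the paper.

```python
import ipaddress

# Inclusive (start, end) ranges as quoted in the post; labels are hypothetical.
BLOCKS = {
    "redirection": ("209.8.25.150", "209.8.25.159"),
    "aggregator-1": ("66.230.128.0", "66.230.191.255"),
    "aggregator-2": ("64.111.192.0", "64.111.223.255"),
}


def classify(ip):
    """Return the label of the block containing ip, or None."""
    addr = ipaddress.ip_address(ip)
    for label, (start, end) in BLOCKS.items():
        if ipaddress.ip_address(start) <= addr <= ipaddress.ip_address(end):
            return label
    return None


def to_cidr(start, end):
    """Express an inclusive address range as a list of CIDR networks."""
    networks = ipaddress.summarize_address_range(
        ipaddress.ip_address(start), ipaddress.ip_address(end)
    )
    return [str(net) for net in networks]
```

For example, `to_cidr("66.230.128.0", "66.230.191.255")` gives a single /18 network, while the ten-address redirection range does not align to one CIDR prefix and comes back as two networks.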

{ 3 comments }

John McCormac 03.22.07 at 1:27 pm

Fascinating research. The .eu ccTLD is a good example of how such practices can nearly destroy a TLD. From work I’ve been doing recently mapping .eu domains and websites, some of the same activity can be seen in the .eu ccTLD. The ineptly structured .eu registrar system and the incompetence of EURid in dealing with bogus registrations have reduced the credibility of .eu considerably. The onus was on the registrars to check the entitlement to a domain registration. Many did not. It is not uncommon to see complete US postal addresses along with an obligatory EU country in some registrant details.

The top ten IPs for websites, based on a sample of approximately 2 million .eu domains, are as follows:

217.111.100.213 DE 68336 (United A&G)
212.227.34.3 DE 66538 (Schlund)
83.149.74.172 NL 51152 (Ovidio Syndicate)
85.186.159.109 RO 43649 (Cybersquatting/Parking)
212.79.243.140 NL 39617 (Blixem.nl)
216.34.131.135 AU 35980 (Fabulous.com)
62.149.128.40 IT 26553 (Technorail ISP)
195.110.124.133 IT 21070 (Register.it)
68.178.232.100 US 18012 (Godaddy parking page)
217.111.100.215 DE 17694 (United A&G)

Some are clearly parking/monetisation operations. Others are simply cases where the website points to the hoster’s “coming soon” page. Further down the list, the pattern of parking on US monetisation sites becomes more apparent. The natural development of .eu websites has effectively been stunted by the sheer size of the parking problem.
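
A ranking like the one above could be produced along these lines. This is only a sketch: it assumes the domain-to-IP mapping has already been collected, and the function name `top_hosting_ips` and the sample data are hypothetical, not the actual survey data.

```python
from collections import Counter


def top_hosting_ips(domain_to_ip, n=10):
    """Return the n IPs hosting the most domains as (ip, count) pairs."""
    counts = Counter(domain_to_ip.values())
    return counts.most_common(n)


# Hypothetical resolved domains, using documentation-range addresses.
sample = {
    "one.example.eu": "192.0.2.10",
    "two.example.eu": "192.0.2.10",
    "three.example.eu": "192.0.2.10",
    "four.example.eu": "198.51.100.7",
    "five.example.eu": "198.51.100.7",
    "six.example.eu": "203.0.113.5",
}
ranking = top_hosting_ips(sample, n=2)
```

A very high count on a single IP is only a hint of parking; it still needs a manual look at what the IP actually serves.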

jacob malthouse 03.28.07 at 4:14 pm

Hi John,

I’m not too sure about the mechanics of how this works and its impact on the Internet as a whole. I know Joi Ito was looking into it some time back. It would be interesting to see a concerted global approach to collating and developing this type of research. Do you know of any groups who are working on this at the moment?

Marc 09.08.07 at 9:23 am

I had my blog on a .info site for years, but when my friend’s SpamAssassin flagged my email for having a .info TLD, that was the last straw. About 10 minutes ago, I registered a .com site.
