Definitive Guide to Removing All Google Analytics Spam

This is a PROVEN WORKING SOLUTION with filter expressions updated regularly! There are a lot of partial solutions and misinformation out there about clearing out so-called referral spam (and organic search and event spam too), so here’s the Definitive Guide to removing all of that junk! This article has been constantly updated since January 2015 and has helped over 250,000 people clean up their Google Analytics reporting.

All the information you need is provided below in this step-by-step guide, but you will need to update the filters as they change. There is also a free utility from Quantable.com that implements filters similar to what is described here, but you will need to update them over time.

Don’t Want To Deal With It? 12 months spam-free: only $75
I will install and maintain the filters! Have a professional do it!


The Google Analytics Referral Spam Solution

The process in summary, proven effective for over 18 months:

  1. Have a new website? Use a ‘-2’ or higher property
  2. Implement a Valid Hostname Filter to eliminate ghost visits
  3. Implement Spam Crawler Filters to eliminate the targeted spam visits
  4. Create a Custom Segment with these filters to use for reporting
  5. Turn on Google’s bot & spider filter option

Urban Myths and Bad Advice:

  • DO NOT use the Referral Exclusion List – Why?
  • Google Analytics bounce rate DOES NOT affect search rankings
  • Using .htaccess rules or WordPress plugins will NOT eliminate any of the ghost referrals

This is a long article — I have included a lot of background and detail to explain why this solution is so effective, and to dispel a number of urban myths around referral spam. Heed the advice to set up an unfiltered view before you begin — there is no recovery from a bad filter; do not risk your analytics to a typo.



NEW: spam now appears as organic search keywords
like ‘www.get-free-social-traffic.com’ and ‘eu-cookie-law.info’

Most Google Analytics spam is created by ‘Ghost Visits‘ that can be prevented
with a single ‘Valid Hostname’ filter described below. 

Top spammers past 30 days (2016-07-11)

GHOST REFERRALS:
bestchoice.cf
bestofferswalkmydogouteveryday.gq
botd.wordpress.com
eu-cookie-law.blogspot.com
exchangeit.gq
forum.topic#.ilovevitaly.xyz
free-share-buttons.blogspot.com
free-share-buttons-aaa.xyz
free-share-buttons-bbb.xyz
free-share-buttons-ccc.xyz
free-share-buttons-ddd.xyz
free-share-buttons-eee.xyz
free-share-buttons-fff.xyz
kiwi237au.tk
law-enforcement-check-nine.xyz
law-enforcement-five.xyz
law-enforcement-four.xyz
law-enforcement-nine.xyz
law-enforcement-seven.xyz
law-enforcement-six.xyz
law-enforcement-three.xyz
law-five.xyz
law-four.xyz
law-six.xyz
law-three.xyz
law-two.xyz

luxmagazine.cf
monetizationking.net
pokemongooo.ml
ranking2017.ga
site-auditor.online
social-buttons-bbb.xyz
social-buttons-ccc.xyz
social-buttons-ddd.xyz
social-buttons-eee.xyz
social-buttons-fff.xyz
social-buttons-ggg.xyz
social-buttons-hhh.xyz

ORGANIC KEYWORDS:
eu-cookie-law.info eu cookie law
sharebutton.org share buttons
eu cookie law eu-cookie-law.info
share buttons sharebutton.org

SPAM CRAWLERS:
checkpagerank.net
fix-website-errors.com
free-video-tool.com
seo-2-0.com

A full historical listing is included at the end of the article.


Background: The Many Faces of Referral Spam

The problem of fake referrals in Google Analytics has changed significantly over the past 18 months. In 2014, we had some bots from semalt and buttons-for-website that visited your website and left fake referrals in your analytics. In December 2014, the ilovevitaly attacks began, taking advantage of a weakness in Google’s new Measurement Protocol that allowed direct attacks on the Google Analytics tracking servers without having to actually visit your website. The attacks rotated quickly through many referral sources, leading to a lot of confusion in the industry.

Ranksonic joined in on the fun in March 2015, and they all enjoyed playing with new domains and techniques. We have had fake organic search terms (www.get-free-social-traffic.com) and fake events (event-tracking) injected into our analytics, too. Enterprising individuals popped up on Fiverr offering hundreds or even thousands of visits from real webmasters obtained using these techniques.

At the beginning of 2016, there were still lots of players pushing through Google’s defenses, even if only for a few days (see image). As of June, 2016, the trend continues with most ghost spam changing in a few days, but the crawlers seem to run for months.

Why hasn’t Google stopped it all yet? Well, they have said they are working on it, but it is a tough fight. Trust me — it would be a LOT worse than it is if they were doing nothing. They just don’t talk about what they are doing, and that is a widely adopted security best practice: never talk about what protections you have in place. It makes it harder for the bad guys to work around your protection systems.


 Preface: Follow Best Practices

Before you start hacking away at your Google Analytics settings, the best practice for implementing new filters starts like this: create an UNFILTERED VIEW.

1. Make sure you always have an Unfiltered view in your property, one that has absolutely zero filters.

2. Don’t implement a new filter immediately in your main view. Create a new Test view that mirrors your main one in every other respect, and then add the filter(s) there first.

3. If you’re happy with the new filter based on this test, then go ahead and implement it in your main view.


1. New Website? Use a ‘-2’ Property

I coined the term “ghost referral” to identify the worst offenders like darodar because they actually NEVER VISIT YOUR SITE. Using some software magic, they post fake hits to Google’s tracking service using a random series of tracking IDs. When they pick a series that includes your tracking ID, Google records a referral visit from their source in your reports.

When you create a Google Analytics Account, you also create a Property in the Account, and a View in that Property. The Property gives you the tracking ID (e.g. UA-1234567-1) that you use in the code snippet on your website. You can create 50 Properties in your Account, and they are given -1, -2, -3, … -50 extensions. Most ghost referral spam hits the default ‘-1’ Property.

You can significantly reduce the spam simply by creating and using a second, third or fourth (or tenth) Property. You don’t have to use them all. Changing your tracking code will lose any previous traffic, so this is really only useful for new websites or if you are willing to abandon your old data.
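As a rough illustration of the numbering described above, here is a minimal Python sketch that reads the property number out of a tracking ID. The pattern is inferred from the example ID format in this article (UA-1234567-1), and the function names are hypothetical helpers, not part of any Google tool.

```python
import re

# Pattern inferred from the example ID format used in this article.
TRACKING_ID = re.compile(r"^UA-(\d+)-(\d+)$")

def property_index(tracking_id: str) -> int:
    """Return the property number (the digits after the last dash)."""
    match = TRACKING_ID.match(tracking_id)
    if match is None:
        raise ValueError(f"not a UA tracking ID: {tracking_id!r}")
    return int(match.group(2))

def is_default_property(tracking_id: str) -> bool:
    """True when the ID points at the '-1' property that most ghost spam hits."""
    return property_index(tracking_id) == 1

print(is_default_property("UA-1234567-1"))  # True
print(is_default_property("UA-1234567-2"))  # False
```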


2. Implement a Valid Hostname Filter for Ghost Visits

THIS IS THE SINGLE MOST EFFECTIVE SOLUTION TO ELIMINATE  FAKE SPAM TRAFFIC!

Some variants of these ghost visits use fake google / organic search visits with keywords for you to investigate (like “google officially -recommends ilovevitaly.com search shell“). Some are even showing up as direct visits and events.

Since they never actually visited your site, you can’t block their visits at the server using any website Javascript (WordPress plugins) or .htaccess methods. You have no choice but to create a filter to exclude them. The biggest problem with these ghost referrals is that they change as quickly as they appear, so you could be continuously building filters for them.

Source versus Hostname

Real visits to your website from a referral link have TWO server names available: the Source that the link is from, and the Hostname that the landing page is pointing to (your server). For real visits, the Hostname is almost always your server, regardless of where the traffic came from.


For example, here is a sample of the Source and Referral Path (page with the link on it) pointing to this article. Notice the Hostname is always my server.


Ghost visits send traffic to a random series of tracking ID numbers — they don’t know your server name! They use blank (“(not set)“) or fake hostname values (like ‘google.com’). That means you can eliminate ALL of them simply by filtering to INCLUDE only the valid hostname — your server.

A. Identify Your Valid Hostnames

STEP CAREFULLY.  Valid hostnames are websites that you have configured to use your Google Analytics tracking ID (e.g. UA-12345678-1). They may include ecommerce shopping carts or call tracking services linked from your website.

Start with a multi-year report showing just hostnames (Audience > Technology > Network > hostname), then identify the valid ones: the servers that you REMEMBER configuring with your tracking ID (hint: google.com is NOT one of them).

UPDATE: if you have alternate domains that redirect to your main website domain, do NOT include those redirected domains. If you can type in one hostname/URL and it changes to display a page on a different domain, then it is NOT a valid hostname.

Many people have a problem with this step; here’s what I picked and why:

  • www.analyticsedge.com – my main website
  • help.analyticsedge.com – my help site configured with the same tracking ID
  • www.youtube.com – I have a YouTube channel with videos that I track using Google Analytics. I had to configure my tracking code in YouTube (what a pain that was! and it only tracks channel visits, not videos). NOT RECOMMENDED for general use.
  • sites.fastspring.com – I use FastSpring as my eCommerce provider to process payments. I configured it with my tracking code.

There are a number of translation and proxy services that may also record visits to your tracking ID because they display your original content through their servers. If the traffic is low, IGNORE THEM. Spammers have started using translate.googleusercontent.com as a hostname to bypass people’s filters, so don’t add that one. The most recent example is organic search spam with the keywords “beat with a shovel the weak google spots addons.mozilla.org/en-us/firefox/addon/ilovevitaly/”.

FYI – googleweblight is a new service from Google that serves your pages to mobile networks in some parts of the world. It usually appears with your hostname in front, and it’s ok.

I do not have any tracking codes installed on google.com, mozilla.org, huffingtonpost.com or any of the other sites that appear in the report. I never configured my tracking code ON those sites — they are ghost visits!

IMPORTANT: If you see GOAL CONVERSIONS or REVENUE from (not set) hostnames, you need to dig into why. Maybe they are Event-based call logging hits that are not associated with a pageview (which has a hostname value). You may need to adjust your filters and/or tracking code snippets.
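If you export the hostname report, a few lines of Python can help you see how much traffic sits on hostnames you never configured. This is only a sketch: the rows and session counts are made-up illustration data, and `MY_HOSTNAMES` is a hypothetical name for the list you build in this step.

```python
from collections import Counter

# Hypothetical (hostname, sessions) rows, as they might appear in an
# exported hostname report. The counts are made up for illustration.
rows = [
    ("www.analyticsedge.com", 5200),
    ("help.analyticsedge.com", 830),
    ("(not set)", 412),
    ("google.com", 97),
    ("sites.fastspring.com", 64),
    ("huffingtonpost.com", 12),
]

# Hostnames you remember configuring with your tracking ID.
MY_HOSTNAMES = {
    "www.analyticsedge.com",
    "help.analyticsedge.com",
    "sites.fastspring.com",
}

# Add up sessions for configured hostnames versus everything else.
totals = Counter()
for hostname, sessions in rows:
    bucket = "valid" if hostname in MY_HOSTNAMES else "suspect"
    totals[bucket] += sessions

suspects = sorted(h for h, _ in rows if h not in MY_HOSTNAMES)
print(totals["valid"], totals["suspect"])  # 6094 521
print(suspects)  # ['(not set)', 'google.com', 'huffingtonpost.com']
```

Anything in the suspect list deserves a second look before you decide it is a ghost; a forgotten shopping cart or help site would show up there too.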

B. Create the Filter Expression

Create a filter expression that captures all of the domains that you consider to be valid. TEST, TEST, TEST! Then move to production when you are sure you have it all. You may find it easier to play with an Advanced Segment, so you can see the effect of your filter without risking any data loss. See #4 below.

Many people have a problem composing the filter expression because it is Regex (regular expressions), so let’s keep it really simple in this case.

For your filter expression, simply enter your valid hostname. If you have more than one, separate them by a vertical bar ( | ). If you have a third-party payment service like checkout.shopify.com, you may need to enter it as well.

It is not necessary to enter all of the subdomains (like www and help) – Regex will perform a partial match by default, so I keep the expression shorter by simplifying to just the root domains.

Note: in proper Regex, you should ‘escape’ the dots (\.), but since a dot matches any character and the likelihood and impact of a false match is negligible, I sometimes leave them out to keep it simple.

analyticsedge.com|youtube.com|fastspring.com

IMPORTANT: do NOT end the expression with a vertical bar ( | ); use them only between domains.
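To see why these rules matter, you can try the expression in Python before touching your filters. This is a minimal sketch that assumes the filter behaves like an unanchored search (`re.search`); Google's regex engine may differ in details, and `is_valid_hostname` is a hypothetical helper name.

```python
import re

# The expression from this article, plus a stricter variant with escaped dots.
SIMPLE = "analyticsedge.com|youtube.com|fastspring.com"
ESCAPED = "|".join(
    re.escape(h) for h in ["analyticsedge.com", "youtube.com", "fastspring.com"]
)

def is_valid_hostname(expression: str, hostname: str) -> bool:
    """Partial (unanchored) match, assumed to mirror how the filter behaves."""
    return re.search(expression, hostname) is not None

# Subdomains match because the expression can match anywhere in the hostname.
print(is_valid_hostname(SIMPLE, "help.analyticsedge.com"))  # True
# Ghost hostnames do not match.
print(is_valid_hostname(SIMPLE, "(not set)"))   # False
print(is_valid_hostname(SIMPLE, "google.com"))  # False
# An unescaped dot matches any character; the escaped form does not.
print(is_valid_hostname(SIMPLE, "youtubeXcom"))   # True
print(is_valid_hostname(ESCAPED, "youtubeXcom"))  # False
# A trailing bar adds an empty alternative that matches every hostname.
print(is_valid_hostname(SIMPLE + "|", "google.com"))  # True
```

The last line is the reason for the warning above: with a trailing bar, the include filter lets everything through, ghosts included.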


3. Implement Spam Crawler Filters

Some spammers actually crawl the web and visit your site, and others have figured out what your hostname is, so the Valid Hostname filter won’t keep everything out. For these, you will need to specifically exclude their visits by naming them in a filter.

Note: if you are technically capable, you could block these sources using classic spam blocking techniques like using .htaccess rules. To learn a little more about these alternatives, you can read the article by Carlos Escalera.

Do NOT visit the referring site, since this is an invitation to get a virus or Trojan infection on your computer, or otherwise satisfy the desire of the spammer. I recommend you do a quick Google search first, to see if you can trust it. Spammers are quickly identified, and you’ll usually see indications in the first page of search results.

Creating a New Filter

You can exclude them from your reports in Google Analytics by creating a filter. You identify a “unique signature” that identifies them (and only them), and then create a filter based on that.

Most spam can be eliminated by filtering on Campaign Source. Most people try filtering on Referral, and that filter doesn’t always work because some spammers have used utm codes to stuff values into the Source and Medium, imitating a referral.
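The sketch below shows, in a deliberately simplified model, how utm codes on a landing URL can set the Source and Medium with no real referring link involved. The URLs are made up, `campaign_source` is a hypothetical helper, and the precedence shown (utm values first, then the referrer) is an assumption for illustration, not a description of Google's actual processing.

```python
from typing import Tuple
from urllib.parse import parse_qs, urlsplit

def campaign_source(landing_url: str, referrer_host: str) -> Tuple[str, str]:
    """Return (source, medium) in a simplified model: utm values win."""
    params = parse_qs(urlsplit(landing_url).query)
    if "utm_source" in params:
        source = params["utm_source"][0]
        medium = params.get("utm_medium", ["(not set)"])[0]
        return source, medium
    if referrer_host:
        return referrer_host, "referral"
    return "(direct)", "(none)"

# No referring link at all, yet the hit reads as semalt.com / referral.
print(campaign_source(
    "https://www.example.com/?utm_source=semalt.com&utm_medium=referral", ""))
# An ordinary link from another site.
print(campaign_source("https://www.example.com/", "blog.example.org"))
```

Both kinds of hit end up with a value in Campaign Source, which is why a filter on that field catches spam that a filter on Referral can miss.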


Read Google’s instructions on making filters. If you use Google Tag Manager, Lunametrics has a nice option.

The latest filter expressions I recommend are listed under “Current Spam Crawler Filter Expressions” below. Note that I take some shortcuts in my expressions to save space (there is a limit to the number of characters). I have not yet found any false matches for valid referrals in any of the web properties I have worked with, but you should be cautious of being too aggressive. I have seen some people recommend filtering on simple words like buy|cheap|motor|money|seo, which will simply match far too many valid domains to be recommended.

As you discover new spammers you will have to add to the filters. Remember: these filters will exclude everything that matches, so be careful with your expressions, and TEST, TEST, TEST first.
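A quick way to check whether an expression is too aggressive is to run it against a list of sources you know are legitimate. The sketch below uses made-up `.example` domains and assumes the filter behaves like an unanchored `re.search`.

```python
import re

# The overly broad expression some people recommend.
TOO_BROAD = "buy|cheap|motor|money|seo"

# Made-up sources that are not spam but happen to contain one of the words.
legitimate_sources = [
    "buyersguide.example",
    "motorsport-news.example",
    "moneyblog.example",
    "seoul-travel.example",
    "gardening.example",
]

# Every source listed here would be thrown away by an exclude filter.
false_matches = [s for s in legitimate_sources if re.search(TOO_BROAD, s)]
print(false_matches)
```

Four of the five made-up sources match, including a travel site that only shares three letters with "seo". Swap in your own top referrers to test an expression before you use it.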

Current Spam Crawler Filter Expressions

Spam Crawlers Filter 1: [2015-06-01]

semalt|anticrawler|best-seo-offer|best-seo-solution|buttons-for-website|buttons-for-your-website|7makemoneyonline|-musicas*-gratis|kambasoft|savetubevideo|ranksonic|medispainstitute|offers.bycontext|100dollars-seo|sitevaluation|dailyrank

Spam Crawlers Filter 2: [2015-12-07]

videos-for-your-business|success-seo|rankscanner|doktoronline.no|adviceforum.info|video--production|sharemyfile.ru|seo-platform|justprofit.xyz|127.0.0.1|nexus.search-helper.ru|rankings-analytics.com|dbutton.net|o00.in|wordpress-crew.net

Spam Crawlers Filter 3: [2016-06-12]

fast-wordpress-start.com|top1-seo-service.com|^scripted.com|uptimechecker.com|uptimebot.net|rankings-analytics.com|^uptime.com|.responsive-test.net|dogsrun.net|free-video-tool.com|keywords-monitoring(-your)?-success.com|a.pr-cy.ru

Spam Crawlers Filter 4: [2016-07-11]

fix-website-errors.com|seo-2-0.com
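Before adding expressions like these to a view, you can sanity-check them in Python. This sketch uses Filters 3 and 4 as written above and assumes unanchored `re.search` matching. The 255-character limit is an assumption, so check the actual limit in your own account; `is_spam_source` is a hypothetical helper.

```python
import re

# Filters 3 and 4, copied verbatim from this article.
FILTER_3 = (
    "fast-wordpress-start.com|top1-seo-service.com|^scripted.com|"
    "uptimechecker.com|uptimebot.net|rankings-analytics.com|^uptime.com|"
    ".responsive-test.net|dogsrun.net|free-video-tool.com|"
    "keywords-monitoring(-your)?-success.com|a.pr-cy.ru"
)
FILTER_4 = "fix-website-errors.com|seo-2-0.com"

# Assumed field limit for one filter pattern; verify it in your account.
ASSUMED_MAX_LENGTH = 255

def is_spam_source(source: str) -> bool:
    """True when any of the expressions matches the Campaign Source."""
    return any(re.search(expr, source) for expr in (FILTER_3, FILTER_4))

for expr in (FILTER_3, FILTER_4):
    assert len(expr) <= ASSUMED_MAX_LENGTH, "expression too long for one filter"

samples = [
    "seo-2-0.com",
    "keywords-monitoring-your-success.com",
    "scripted.com",
    "blog.scripted.com",
    "google.com",
]
for source in samples:
    print(source, is_spam_source(source))
```

The first three samples match and the last two do not. The `^` in front of scripted.com limits the match to sources that start with that name, so a subdomain such as blog.scripted.com is left alone.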

Editorial note: please do not take the presence of a domain in this article to mean the related businesses are all ‘spammers’. In several cases, unsuspecting businesses have had their domains referenced by spammers. e.g. scripted.com 

What is YOUR time worth?
Let me maintain your filters
for a whole year for only $75.


4. Create a Custom Segment

To eliminate spam immediately from all of your reports, even historical reporting, you need to use a Custom Segment. If you have prepared the filter expressions above, you’ve already done all the hard work.

In Google Analytics, open your Reporting view, and click +Add Segment.


Then click New Segment and enter a name like “All Sessions (No Spam)“. If you have multiple websites in your account, you should include the website in the name, like “All Sessions (AnalyticsEdge)“.


Select the Advanced > Conditions tab on the left. Create a new entry for the valid hostnames:

  • Sessions > Include
  • Hostname > matches regex: your valid hostnames expression (#2 above)

Then click + Add Filter and add two expressions for the Spam Crawlers:

  • [+Add Filter]
  • Sessions > Exclude
  • Source > matches regex: spam crawler expression #1    OR
  • Sessions > Exclude
  • Source > matches regex: spam crawler expression #2    OR
  • repeat for the rest of the filter expressions

Save and Apply the Segment

The easiest way to test is to use your new segment in combination with the default All Sessions segment, comparing the Sessions counts. You can find your new segment listed in the Custom grouping. You can select BOTH your new segment AND the All Sessions segment to compare.
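The same comparison can be sketched in Python: include sessions on a valid hostname, exclude any whose source matches a spam crawler expression, and compare the counts. The sessions are made-up illustration data, the first crawler expression is abbreviated from Filter 1, and `in_segment` is a hypothetical helper.

```python
import re

# Substitute your own expressions from the earlier steps.
VALID_HOSTNAMES = "analyticsedge.com|youtube.com|fastspring.com"
SPAM_CRAWLERS = [
    "semalt|ranksonic|dailyrank",          # abbreviated from Filter 1
    "fix-website-errors.com|seo-2-0.com",  # Filter 4
]

def in_segment(session: dict) -> bool:
    """Include on hostname, then exclude on any spam crawler source."""
    if not re.search(VALID_HOSTNAMES, session["hostname"]):
        return False
    return not any(re.search(expr, session["source"]) for expr in SPAM_CRAWLERS)

# Made-up sessions for illustration.
sessions = [
    {"hostname": "www.analyticsedge.com", "source": "google"},
    {"hostname": "www.analyticsedge.com", "source": "semalt.semalt.com"},
    {"hostname": "(not set)", "source": "law-enforcement-five.xyz"},
    {"hostname": "help.analyticsedge.com", "source": "seo-2-0.com"},
    {"hostname": "sites.fastspring.com", "source": "(direct)"},
]

all_sessions = len(sessions)
no_spam = sum(in_segment(s) for s in sessions)
print(all_sessions, no_spam)  # 5 2
```

The ghost visit is dropped by the hostname condition and the two crawler visits by the source conditions, leaving the two real sessions.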

Sharing your Segment Definition

Segments are associated with the account you log in with, not in the web property. If you want others to share the segment you made, they need to make their own copy of it.

Here’s a copy of mine from the Google Analytics Solution Gallery. 2016-06-06

To let others make their own copy, share your segment with them. On the segment you want to share, click the little down arrow and pick Share from the menu, then copy and send the link to your associate. When they click the link, they will be asked what Google Analytics View they want the segment to be associated with. If they have access to multiple web properties, they should only associate it with views for the property that matches the Valid Hostname filter.


5. Turn On Google’s Bots & Spiders Option

Google Analytics has a simple checkbox you can use to exclude easy-to-identify bots and spiders, but you have to enable it for every View you use. In your Google Analytics Admin section, navigate to each View you use, select View Settings, and check the box to Exclude all hits from known bots and spiders.

This feature has recently started affecting referral spam as well (e.g. horoskop-baran.pl / referral), so TURN IT ON!


That’s It – You Are Spam-Free For Now!


What made me an expert about spam?

My name is Mike Sullivan; I am a Google Analytics Certified professional, and a Top Contributor in the Google Analytics community forum. I have been working extensively with the Google Analytics API since 2010, providing customized reporting solutions. I founded Analytics Edge in 2013, making a suite of free and inexpensive Excel report automation add-ins and connectors.

Spam was hounding my customers, so I dug into the problem with all the tools at my disposal and thought I’d share what I learned. I wrote this Definitive Guide, coining the term “ghost referrals”, to help resolve the confusion surrounding the various spam types and the different techniques required to deal with them.  I hope this article has helped you, too.

Trust your Google Analytics to the expert – I can solve your spam problem for you.


Historical list of spam sources detected:

  • 2016-07-28 – biketank.ga / referral ghost spam
  • 2016-07-27 – doyouknowtheword-flummox.ml / referral ghost spam
  • 2016-07-26 – ######.social-s-bbb-xyz / referral ghost spam
  • 2016-07-25 – itrevolution.cf / referral ghost spam
  • 2016-07-21 – kiwi237au.tk / referral ghost spam
  • 2016-07-21 – social-buttons-aaa.xyz, social-buttons-bbb.xyz, social-buttons-ccc.xyz, social-buttons-ddd.xyz, social-buttons-eee.xyz, social-buttons-fff.xyz, social-buttons-ggg.xyz, social-buttons-hhh.xyz, social-buttons-iii.xyz / referral ghost spam
  • 2016-07-21 – law-enforcement-one.xyz, law-enforcement-two.xyz, law-enforcement-three.xyz, law-enforcement-four.xyz, law-enforcement-five.xyz, law-enforcement-six.xyz, law-enforcement-seven.xyz, law-enforcement-eight.xyz, law-enforcement-nine.xyz, law-enforcement-ten.xyz/ referral ghost spam
  • 2016-07-20 – ranking2017.ga / referral ghost spam
  • 2016-07-20 – bestofferswalkmydogouteveryday.gq / referral ghost spam
  • 2016-07-18 – luxmagazine.cf / referral ghost spam
  • 2016-07-17 – ranking2017.ga / referral ghost spam
  • 2016-07-16 – exchangeit.gq / referral ghost spam
  • 2016-07-16 – “eu-cookie-law.info …” organic keyword ghost spam
  • 2016-07-15 – pokemongooo.ml / referral ghost spam
  • 2016-07-13 – bestchoice.cf / referral ghost spam
  • 2016-07-11 – seo-2-0.com / referral SPAM CRAWLER
  • 2016-07-08 – botd.wordpress.com / referral ghost spam
  • 2016-07-08 – eu-cookie-law.blogspot.* / referral ghost spam
  • 2016-07-08 – free-share-buttons.blogspot.* / referral ghost spam
  • 2016-07-04 – law-one.xyz, law-two.xyz, law-three.xyz, law-four.xyz, law-five.xyz, law-six.xyz, law-seven.xyz, law-eight.xyz, law-nine.xyz, law-ten.xyz / referral ghost spam
  • 2016-07-01 – free-share-buttons-???.xyz / referral ghost spam
  • 2016-06-27 – forum.topic#.ilovevitaly.xyz / referral ghost spam
  • 2016-06-19 – slow-website.xyz / referral ghost spam
  • 2016-06-19 – law-enforcement-check-*.xyz / referral ghost spam
  • 2016-06-19 – free-social-buttons-???.xyz / referral ghost spam
  • 2016-06-17 – site-auditor.online / referral ghost spam
  • 2016-06-16 – eu-cookie-law.info organic keywords ghost spam
  • 2016-06-11 – law-enforcement-bot-??.xyz / referral ghost
  • 2016-06-08 – social-buttons-??.xyz / referral ghost spam
  • 2016-06-04 – law-enforcement-??.xyz / referral ghost spam
  • 2016-06-04 – fix-website-errors.xyz / referral SPAM CRAWLER
  • 2016-05-31 – cookie-law-enforcement-ii.xyz / referral ghost spam
  • 2016-05-31 – cookie-law-enforcement-hh.xyz / referral ghost spam
  • 2016-05-31 – cookie-law-enforcement-gg.xyz / referral ghost spam
  • 2016-05-29 – cookie-law-enforcement-ff.xyz / referral ghost spam
  • 2016-05-28 – magicdiet.gq / referral ghost spam
  • 2016-05-28 – forum.topic#.ghostvisitor.com / referral ghost spam
  • 2016-05-28 – ghostvisitor.com / referral ghost spam
  • 2016-05-27 – burn-fat.ga / referral ghost spam
  • 2016-05-27 – cookie-law-enforcement-??.xyz / referral ghost spam
  • 2016-05-27 – eu-cookie-law-enforcement#.xyz / referral ghost spam
  • 2016-05-27 – http://link.web-list.xyz/ / referral ghost spam
  • 2016-05-27 – keywords-monitoring-success.com / referral SPAM CRAWLER
  • 2016-05-27 – monetizationking.net / referral ghost spam
  • 2016-05-27 – popads.net / referral
  • 2016-05-25 – www.get-free-social-traffic.com organic keywords ghost spam
  • 2016-06-23 – ownshop.cf / referral ghost spam
  • 2016-05-20 – a.pr-cy.ru / referral spam crawler
  • 2016-05-20 – eu-cookie-law-enforcement-#.xyz / referral ghost spam
  • 2016-05-18 – getlamborghini.ga / referral ghost spam
  • 2016-05-16 – dominateforex.ml / referral ghost spam
  • 2016-05-08 topquality.cf / referral ghost spam
  • 2016-05-05 share-button.xyz / referral ghost spam
  • 2016-05-04 marketland.ml / referral ghost spam
  • 2016-04-30 unpredictable.ga / referral ghost spam
  • 2016-04-28 increasewwwtraffic.info / referral ghost spam
  • 2016-04-26 website-stealer-warning-alert.hdmoviecams.com / referral ghost spam
  • 2016-04-25 ‘i came up with a method and 1,5 years forcing…’ ghost organic keywords
  • 2016-04-25 lots of other ghost organic keywords, including share-button.xyz, m-google.xyz, socialbutton.xyz, and others
  • 2016-04-25 social-traffic-#.xyz / referral ghost spam
  • 2016-04-18 smartphonediscount.info / referral ghost spam
  • 2016-04-18 free-social-buttons#.xyz / referral ghost spam (# is 2, 3, 6, 7, etc)
  • 2016-04-17 forum.topic#.6hopping.com and free-social-buttons6.xyz / referral ghost spam
  • 2016-04-14 keywords-monitoring-your-success.com / referral crawler
  • 2016-04-11 makeprogress.ga / referral ghost spam
  • 2016-04-10 m-google.xyz and fuck-paid-share-buttons.xyz / referral ghost spam
  • 2016-04-09 free-video-tool.com / referral spam crawler added to filter expressions
  • 2016-04-08 getrichquickly.info / referral ghost spam
  • 2016-04-01 getrichquick.ml /referral and яндех-херня.рф / referral ghost spam
  • 2016-03-23 magnet-to-torrent.com / referral and torrent-to-magnet.com / referral added to spam crawler filters
  • 2016-03-23 adtiger.tk / referral ghost spam
  • 2016-03-21 wordpresscore.com/ referral ghost spam
  • 2016-03-18 feedback.sharemyfile.ru /referral ghost spam
  • 2016-03-15 rank-checker.online / referral ghost spam
  • 2016-03-10 dogsrun.net / referral spam crawler
  • 2016-03-05 #.responsive-test.net / referral spam crawler
  • 2016-03-03 why.does.spacebarnot.work? / organic search from o-o-11-o-o.com hostname ghost spam
  • 2016-03-03 uptime.com / referral spam crawler
  • 2016-02-28 hostgator.com  / referral ghost spam
    stablehost.com  / referral ghost spam
    digitalfaq.com  / referral ghost spam
    bluehost.com  / referral ghost spam
    site5.com  / referral ghost spam
    cutalltheshit.com  / referral ghost spam
    veerotech.com  / referral ghost spam
    mddhosting.com  / referral ghost spam
    siteground.com  / referral ghost spam
  • 2016-02-27 domain-tracker.com  / referral ghost spam
  • 2016-02-16 go.ekatalog.xyz / referral ghost spam
  • 2016-02-10 китай.с.новым.годом.рф / referral ghost spam
  • 2016-01-26 how.to.travel.and.make.money.with.maps.ilikevitaly.com / referral ghost spam
  • 2016-01-22 web-revenue.xyz / referral ghost spam (traffic2cash.xyz)
  • 2016-01-22 free-traffic.xyz / referral ghost spam (sharebutton.to)
  • 2016-01-20 social-widget.xyz / referral ghost spam (sharebutton.to)
  • 2016-01-20 free-social-buttons.xyz / referral ghost spam (sharebutton.to)
  • 2016-01-20 net-profits.xyz / referral ghost spam (sharebutton.to)
  • 2016-01-20 traffic-cash.xyz / referral ghost spam (traffic2cash.xyz)
  • 2016-01-13 rankings-analytics.com / referral spam crawlers
  • 2016-01-03 share-buttons.xyz / referral ghost spam (sharebutton.to)
  • 2016-01-01 с.новым.годом.рф / referral (ilovevitaly) ghost spam
  • 2015-12-31 happy.new.yeartwit.com / referral (ilovevitaly) ghost spam
  • 2015-12-29 build-a-better-business.2your.site / referral ghost spam (ontraport.com)
  • 2015-12-25 build-audience.for-your.website/ referral ghost spam  (easyvideosuite.com / easywebinar.com)
  • trafficgenius.xyz / referral ghost spam (publishvault.com)
  • new-look.for-your.website / referral ghost spam (teslathemes.com)
  • onlinetvseries.me / referral ghost spam
  • uptimechecker.com / referral spam crawlers
  • uptimebot.net / referral spam crawlers
  • topseoservices.co / referral ghost spam (www.semrush.com)
  • website-analyzer.info/ referral ghost spam (ranksonic.com)
  • trafficgenius.xyz / referral ghost spam (publishvault.com)
  • smarter-content.for-your.website / referral ghost spam (scribecontent.com)
  • 2015-12-22 traffic2cash.xyz / referral ghost spam
  • 2015-12-21 w3javascript.com  / referral ghost spam
  • 2015-12-20 website-stealer.nufaq.com / referral ghost spam
  • 2015-12-19 website-stealer-warning.hdmoviecamera.net / referral ghost spam
  • 2015-12-14 ^scripted.com /referral added to spam crawlers filters
  • 2015-12-09 googlemare.com / referral (ilovevitaly) ghost spam
  • 2015-12-08 boost-my-site.com / referral (ranksonic.com) ghost spam
  • 2015-12-08 top1-seo-service.com / referral (semalt.com) added to spam crawler filters
  • 2015-12-04 santasgift.ml / referral ghost spam
  • 2015-12-04 rusexy.xyz / referral ghost spam
  • 2015-12-02 quit-smoking.ga / referral ghost spam
  • 2015-12-01 o-o-8-o-o.com / referral ghost spam
  • 2015-11-27 cyber-monday.ga /referral ghost spam
  • 2015-11-27 fast-wordpress-start.com / referral (startwp.org) added to spam crawler filters
  • 2015-11-27 lsex.xyz / referral (http://work-from-home-earn-money-online.com/) ghost spam
  • 2015-11-27 traffic2cash.org / referral ghost spam
  • 2015-11-26 black-friday.ga / referral ghost spam
  • 2015-11-26 kiwe-analytics.com / referral ghost spam
  • 2015-11-24 adf.ly / referral ghost spam
  • 2015-11-23 hosting-tracker.com / referral (syfonix.com) ghost spam
  • 2015-11-20 wordpress-crew.net / referral added to spam crawler filters
  • 2015-11-19 get-your-social-buttons.info / referral (sharebutton.to) ghost spam
  • 2015-11-18 traffic2cash.net / referral ghost spam
  • 2015-11-18 ranksonic.net / referral (ranksonic.com) ghost spam
  • 2015-11-17 snip.to / referral (snip.ly) ghost spam
  • 2015-11-16 alibest.com /referral ghost spam
  • 2015-11-16  claim#######.copyrightclaims.org / referral (ilovevitaly.com) ghost spam
  • 2015-11-16 dbutton.net / referral spam crawler
  • 2015-11-16 o00.in / referral spam crawler
  • 2015-10-08 rankings-analytics.com /referral spam crawler
  • 2015-10-04: nexus.search-helper.ru / referral spam crawler
  • 2015-09-21: rednise.com / referral ghost spam
  • 2015-09-21: 127.0.0.1:80## / referral: some spammers don’t know what they are doing
  • 2015-09-16: best-seo-software.xyz / referral ghost spam
  • 2015-09-15: justprofit.xyz added to spam crawler filters
  • 2015-09-01: qualitymarketzone.com which redirects to www.tkqlhce.com
  • 2015-09-01: seo-platform.com which redirects to affiliate.ranksonic.com
  • 2015-08-26: ghost spam is free from the politics, we dancing like a paralytics / organic keywords
  • 2015-08-15: how-to-earn-quick-money.com / referral
  • 2015-08-13: sexyali.com / referral**
  • 2015-08-13: hongfanji.com / referral
  • 2015-08-09: free-floating-buttons.com / referral
  • 2015-08-09: get-free-social-traffic.com / referral
  • 2015-08-05: satellite.maps.ilovevitaly.com / referral
  • 2015-08-05: chinese-amezon.com / referral
  • 2015-07-29: pops.foundation / referral
  • 2015-07-24: traffic2money.com / referral
  • 2015-07-20: e-buyeasy.com / referral
  • 2015-07-02: site#.floating-share-buttons.com / referral
  • 2015-06-26: erot.co / referral
  • 2015-06-18: webmonetizer.net / referral
  • 2015-06-09: howtostopreferralspam.eu / referral and organic
  • 2015-06-04: trafficmonetizer.org / referral
  • 2015-06-03: непереводимая.рф / referral
  • 2015-06-01: непереводимая.рф / organic
  • 2015-05-29: sanjosestartups.com / organic
  • 2015-05-27: websites-reviews.com / referral
  • 2015-05-26: sanjosestartups.com / referral
  • 2015-05-21: ilovevitaly.com / organic
  • 2015-05-19: s.click.aliexpress.com / organic
  • 2015-05-15: site4.free-share-buttons.com / referral and free-social-buttons.com / referral and webmaster-traffic.com / referral
  • 2015-05-06: www.event-tracking.com / referral, www.kabbalah-red-bracelets.com / referral, guardlink.org / referral and some spikes of direct traffic (direct) / (none).
  • 2015-04-28: (google / organic) search spam with keyword “vitaly rules google…”
  • 2015-04-24: (free-share-buttons.com / referral, pornhub-forum.ga / referral, youporn-forum.ga / referral, rapidgator-porn.ga / referral, domination.ml / referral, torture.ml / referral, www.Get-Free-Traffic-Now.com / referral, buy-cheap-online.info / referral, theguardlan.com / referral)
  • 2015-04-06: (editors.choice#######.hulfingtonpost.com / referral) and (googlsucks.com / referral) and Get-Free-Traffic-Now.com
  • 2015-04-02: addons.mozilla.org / referral
  • 2015-03-26: 4webmasters.org / referral
  • 2015-02-23: www1.social-buttons.com / referral
  • 2015-03-16: s.click.aliexpress.com / referral and simple-share-buttons.com
  • 2015-03-11: ranksonic.org
  • 2015-03-04: humanorightswatch.org / referral
  • 2015-02-25: o-o-6-o-o.com / referral
  • 2015-02-11: message####.cenokos.ru
  • 2015-02-04: bestwebsitesawards.com / referral
  • 2015-01-27: cenoval.ru / referral
  • 2015-01-19: “google officially -recommends ilovevitaly.com search shell” and “resellerclub scam” organic
  • 2015-01-15: hulfingtonpost.com / referral
  • …and more…

Comments: (moderated, no spam)

  1. Sam

    Thank you very much for the information and tutorials! This has really helped us out in determining actual traffic and behaviour! Much appreciated!

    Reply
  2. John

    This is awesome. I don’t need the service you provide, but this is such a valuable resource. I really hope that lots of people out there who don’t have the technical capacity to do this themselves take you up on your offer. Well done! John

    Reply
  3. Jodi

    Great info. Question for you… I have implemented this and am noticing that some valid referrals (backlinks) are filtered out. What can I do about that?

    Reply
    1. mike_sullivan

      There are two ways that valid referrals can be affected. First, your valid hostname filter may not include the hostname that the referral is landing on: maybe a subdomain or alternate domain you actively use (not a redirected domain) has been forgotten. Check the hostname for those visits.
      Second, one of the spam crawler filters may be excluding the valid source by mistake. If this is the case, let me know what the valid source is and I can correct the filter expressions.

      Reply
  4. LynnW

    VERY helpful article, thank you!
    To let clients view past history without the spam, is there a WordPress analytics dashboard plugin that displays a SEGMENT of a view? (I’ve searched and am not finding one…)

    Reply
  5. Donna Duncan

    My sites get a lot of visits from amazon.aws. Do you recommend adding them to your list of exclusions? If not, why?

    Reply
    1. mike_sullivan

      A number of people use amazon.aws to build bots that scan the web for specific information. These are targeted visits — they actually hit your site. Look at the traffic and if there is no valid user intent behind them, go ahead and create a specific exclude filter for that source. I do not recommend adding to an existing spam filter — keep them separate.

      You should revisit these one-off filters from time to time, checking your unfiltered view to see if the traffic is still there and still shows no real user behavior.

      Reply
      1. Donna Duncan

        Thx Mike. I’ve tried to exclude amazon.aws via a hostname filter and that made no difference. Is there another dimension I should be using?

      2. mike_sullivan

        If you see amazon.aws in your hostname report, then a hostname filter on the same expression should work. If you see amazon.aws in your source / medium report, then filter the campaign source. If you don’t actually see amazon.aws in any report, then there is no reason to filter.

  6. Donna Duncan

    I want to thank you for this. Your solution goes one step farther than any others I have seen thus far in that it offers a mechanism by which readers can: (a) keep their profiles clean moving forward; and (b) remove spam immediately from all of their reports, even historical ones. Bravo!

    Reply
  7. Magnus

    Hi Mike,
    First of all, thank you for your great information about spam in Analytics.

    I have added the Spam Crawler Filter and Valid Hostname Filter. I can still see some spammers that are covered by the filters in my referral report, e.g. “social-buttons-gg.xyz”. When I try to verify my filter in Analytics with that spammer added, I get “This filter would not have changed your data. Either the filter configuration is incorrect, or the set of sampled data is too small.” Am I doing something wrong?
    Best regards,
    Magnus

    Reply
    1. mike_sullivan

      The ‘Verify’ function on the filter page usually does not work unless you have a LOT of traffic in your report. The message you got is a warning, not an error. You can ignore it.

      Note that filters do not remove existing spam; they prevent new spam from being collected. Use the segment to see clean reports for dates before you installed the filter.

      A common problem with the valid hostname filter is that people put a vertical bar at the start or end of the expression — this causes the filter to do NOTHING. Remove the extra bar and it will start to work. If you are unsure, TRY THE EXPRESSION IN A SEGMENT FIRST.
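To see why a stray bar disables the filter, here is a minimal Python sketch. It assumes the filter pattern behaves like an unanchored regular expression (a partial match counts), which Python's `re.search` approximates; the hostnames and the `include_filter` helper are made up for illustration and are not part of Google Analytics.

```python
import re

# Hypothetical hostnames: one valid, one injected by ghost spam.
hostnames = ["www.example.com", "spam-host.xyz"]

def include_filter(pattern, values):
    """Keep only values where the pattern matches anywhere (unanchored)."""
    return [v for v in values if re.search(pattern, v)]

good = r"example\.com"
bad = r"example\.com|"  # trailing bar adds an empty alternative

kept_good = include_filter(good, hostnames)
kept_bad = include_filter(bad, hostnames)

print(kept_good)  # only the valid hostname survives
print(kept_bad)   # every hostname survives, so the filter does nothing
```

The empty alternative after the bar matches any string, so an include filter built from it lets everything through.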

      Reply
  8. may007

    Hi
    btw – I love this piece!

    When setting up the filters do you add in:

    free-social-buttons.com / referral|
    or do you need the www in front?

    Is this the correct way of setting it up!
    www3.free-social-buttons.com|www4.free-social-buttons.com|site4.floating-share-buttons.com

    Thanks!
    m

    Reply
    1. mike_sullivan

      First of all, the free-social-buttons stuff is ghost spam – use a valid hostname filter or don’t bother. Most of it lasts only a few days, so by the time you install the filter it has already moved on to a variant. With a valid hostname filter, you will never see it.

      As for the filters, I update the ones I provide within a day or two of new crawlers being identified; just use them as is. If you want to build your own, be specific enough to avoid false-matches from valid sites, and generic enough to keep your expression small. ‘free-social-buttons.com’ would be good enough.

      ALSO be very careful — do NOT end your expression with a vertical bar |

      Reply
  9. Alberto

    I am new to Google Analytics. I recently put my website online, started analyzing it, and found the results infested by the usual spammers, so I apologize if my question is naive.
    I liked the very clear and instructive post but before putting it into practice I have the following question:
    If I build an Include Filter with all the known Hostnames and somebody after that inserts a link to my website in his/her website without telling me and some traffic starts coming from this source, from what I have understood, this website will not show up in the referral list because it is not in the Include Filter list. If I am correct is there a way to avoid this ?

    Also, when writing the hostname in the Filter Pattern field, do I have to put a backslash “\” before the TLD, as in “example\.com”?

    Thank you a lot

    Reply
    1. mike_sullivan

      The “\” is optional here. It ‘escapes’ the dot (period) character, which otherwise means ‘any character’ in regex. Although that means ‘example.com’ could match ‘examplexcom.xyz’, in practice, if a spammer matches the ‘example’ part, they can match the rest, so there is no real need to be overly specific.
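The difference between the escaped and unescaped dot can be checked with a short Python sketch. It uses Python's `re` module as a stand-in for the filter's regex matching, which is an assumption; the hostnames are the illustrative ones from the reply.

```python
import re

unescaped = r"example.com"   # "." matches any single character
escaped = r"example\.com"    # "\." matches only a literal dot

lookalike = "examplexcom.xyz"  # illustrative lookalike hostname

matches_unescaped = bool(re.search(unescaped, lookalike))
matches_escaped = bool(re.search(escaped, lookalike))

# Both forms match the real hostname, so either works for a valid site.
both_match_real = all(
    re.search(p, "www.example.com") for p in (unescaped, escaped)
)

print(matches_unescaped, matches_escaped, both_match_real)
```

Only the unescaped pattern accepts the lookalike, which is the loose match described above.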

      A referral visit comes FROM a referrer TO your website. The ‘hostname’ is your website; the ‘source’ is the referrer. Filtering on your hostname does not stop any (real) referral traffic from being recorded, regardless of where they come from.
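As a rough illustration of the hostname versus source distinction, the sketch below models each hit with the two dimensions and applies an include filter to the hostname only. The `Hit` class, the sample hits, and the `(not set)` hostname on the ghost hit are all hypothetical and only meant to show the idea.

```python
import re
from dataclasses import dataclass

@dataclass
class Hit:
    hostname: str  # the site the hit claims to have landed on (yours)
    source: str    # where the visit came from (the referrer)

VALID_HOSTNAME = r"example\.com"  # hypothetical valid-hostname expression

hits = [
    Hit("www.example.com", "some-blog-you-never-heard-of.net"),  # real backlink
    Hit("www.example.com", "google"),
    Hit("(not set)", "free-social-buttons.com"),  # ghost spam, never hit the site
]

# The include filter looks at the hostname only; the source is never inspected.
kept = [h for h in hits if re.search(VALID_HOSTNAME, h.hostname)]
kept_sources = [h.source for h in kept]

print(kept_sources)
```

The unknown backlink is kept because it landed on a valid hostname, while the ghost hit is dropped without its source ever being listed in a filter.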

      Reply
  10. Laura

    This post has helped me immensely! I wanted to clarify, I created these as segments first to test and see what gets filtered out, but you want to put these filters in place in the view settings, yes? To not even collect this data to begin with, right?

    Reply
    1. mike_sullivan

      Yes, the objective is to place the filters in the views so you can use other segments in your reports.

      Reply
  11. danny

    Just wanted to ask how I can verify that it’s working. Once I enable it, should the spam disappear from my referrals even if I look 30 days into the past, or will it only take effect from the day I put the filter in place?

    Reply
    1. mike_sullivan

      Filters have NO IMPACT on historical reporting: they take effect only on new visits from the day they are installed. The only way you can verify their effect is to compare your unfiltered view with your filtered view after a week or so of operation (which is why I recommend you use a test view first). You can instantly check the filter expressions you are using by putting them into a segment, as described in section 4.

      Reply
  12. Fred Pike

    Hi Mike – just to be clear, in step 3 you should create one filter, using your most recent list of spam sites (spam crawler filter 4), not a filter for each of the three previous filters (spam crawler filter 1-3). Is that correct?
    Thanks for maintaining this list and your great and authoritative article!
    Fred

    Reply
    1. mike_sullivan

      Fred, you actually need all 4 spam crawler filters. There is a limit to the length of a filter expression, and the list is too long to fit into a single filter, so you need 4 filters to get all of the known spam crawlers.
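If you build your own list, splitting it across several filters can be sketched as below. The 255-character default is an assumption about the filter pattern limit, so check the current limit in your own account; `pack_expressions` is a hypothetical helper, and the sample domains come from the change log above.

```python
def pack_expressions(domains, max_len=255):
    """Greedily join domains with "|" into expressions of at most max_len chars."""
    expressions, current = [], ""
    for domain in domains:
        if len(domain) > max_len:
            raise ValueError(f"{domain!r} alone exceeds the limit")
        candidate = domain if not current else current + "|" + domain
        if len(candidate) <= max_len:
            current = candidate
        else:
            # Current expression is full: close it and start a new one.
            expressions.append(current)
            current = domain
    if current:
        expressions.append(current)
    return expressions

sample = ["4webmasters.org", "cenoval.ru", "erot.co", "hongfanji.com"]
filters = pack_expressions(sample, max_len=30)  # tiny limit to force a split

for expression in filters:
    print(len(expression), expression)
```

Because the bar is only ever placed between two domains, no expression starts or ends with one.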

      Reply
  13. George Plumley

    A new member of the Semalt clan I’ve just started filtering:
    fix-website-errors.com

    Reply
  14. Heidi Wise

    Brilliant. Loved and followed every word. Looking forward to having meaningful analytics now!

    Reply
