Definitive Guide to Removing All Google Analytics Spam

This is a PROVEN WORKING SOLUTION with complete filter expressions [2017-05-18].

How to Prevent and Remove Spam:

  1. BEFORE YOU START: Make an Unfiltered View!
  2. Implement a Valid Hostname Filter to eliminate ghost visits (like piulatte.cz and track-rankings.online). Also eliminates fake keywords like cdn site:nodepm.com.
  3. Implement Spam Crawler Filters to eliminate the targeted spam visits (like share-buttons-for-free.com)
  4. Create a Custom Segment with these filters to use for reporting

All the information you need (and more!) is provided below in this step-by-step guide. Filter expressions are updated within a day or two as needed.




Current Spam Filter Expressions

Read the rest of the article below to learn more about how spam has changed over time and how to implement these filters properly. See the bottom of the article for a running list of spam referrals these filters block (updated almost daily).

Valid Hostname Filter: customize to suit your web server domain (see discussion below)

mydomain.com

Spam Crawlers Filter 1: [2015-06-01]  Custom > Exclude > Campaign Source

semalt|anticrawler|best-seo-offer|best-seo-solution|buttons-for-website|buttons-for-your-website|7makemoneyonline|-musicas*-gratis|kambasoft|savetubevideo|ranksonic|medispainstitute|offers.bycontext|100dollars-seo|sitevaluation|dailyrank

Spam Crawlers Filter 2: [2017-03-26]  Custom > Exclude > Campaign Source

videos-for-your-business|success-seo|rankscanner|doktoronline.no|adviceforum.info|video--production|sharemyfile.ru|seo-platform|justprofit.xyz|127.0.0.1|nexus.search-helper.ru|dbutton.net|o00.in|wordpress-crew.net|amazon-seo-service|enbersoft.com

Spam Crawlers Filter 3: [2017-03-20]  Custom > Exclude > Campaign Source

fast-wordpress-start.com|top1-seo-service.com|uptimechecker.com|uptimebot.net|rankings-analytics.com|^uptime.com|.responsive-test.net|dogsrun.net|free-video-tool.com|keywords-monitoring(-your)?-success.com|a.pr-cy.ru|share-buttons-for-free

Spam Crawlers Filter 4: [2017-03-01]  Custom > Exclude > Campaign Source

fix-website-errors.com|seo-2-0.com|platezhka.net|timer4web.com|1-99seo.com|1-free-share-buttons.com|uptime-alpha.net|3-letter-domains.net|datract.com|lifehacĸer.com|top10-way.com|google-liar.ru|motherboard.vice.com|petitions.whitehouse.gov|^vc.ru

Spam Crawlers Filter 6: [2017-05-15]  Custom > Exclude > Campaign Source

slifty.github.io|foxweber.com

Spam Crawlers Filter 5: [2016-12-13]  Custom > Exclude > Language Settings

.{13,}|\.
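
As a rough illustration of what this Language Settings expression catches, the sketch below applies it in Python with `re.search`, which performs a partial match. The sample language values are illustrative assumptions rather than a log of real traffic, and Python's regex engine is only standing in for the one Google Analytics uses.

```python
import re

# Language Settings expression from Spam Crawlers Filter 5.
LANGUAGE_SPAM = re.compile(r".{13,}|\.")


def is_language_spam(language: str) -> bool:
    """True when the value is 13 or more characters long, or contains a dot."""
    return LANGUAGE_SPAM.search(language) is not None


# Assumed sample values: genuine browser languages are short codes like "en-us".
samples = [
    "en-us",
    "zh-hans-cn",
    "vc.ru",
    "Congratulations to Trump and all americans",
]
for value in samples:
    verdict = "exclude" if is_language_spam(value) else "keep"
    print(f"{value!r}: {verdict}")
```

The first alternative catches long sentences stuffed into the Language field; the second catches anything that looks like a domain name.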

Editorial note: please do not take the presence of a domain in this article to mean the related businesses are all ‘spammers’. In several cases, unsuspecting businesses have had their domains referenced by spammers, or have unintentionally left referrals and have since corrected their practices.


 

The Google Analytics Referral Spam Solution


There are a lot of partial solutions and misinformation out there about clearing out so-called referral spam (and organic search and event spam too), so here’s the Definitive Guide to removing all of that junk! This article has been constantly updated since January 2015 and has shown over 330,000 people how to get rid of spam in Google Analytics reporting.

2017-01-01: the spammers continue to adapt to various spam-fighting techniques, which has recently led at least one free service (referrerspamblocker.com) to give up due to the effort required to maintain a working solution. I would like to say thank you to Stijlbreuk for their efforts over the past couple of years. We all believe it is time for Google to make our services unnecessary. For my part, I will continue to maintain this article for as long as it is needed. (Mike Sullivan)

Urban Myths and Bad Advice:

  • DO NOT use the Referral Exclusion List – Why?
  • Google Analytics bounce rate DOES NOT affect search rankings
  • Using .htaccess rules or WordPress plugins will NOT eliminate any of the ghost referrals

This is a long article — I have included a lot of background and detail to explain why this solution is so effective, and to dispel a number of urban myths around referral spam. Heed the advice to set up an unfiltered view before you begin — there is no recovery from a bad filter (filtered traffic is gone forever); do not risk your analytics to a typo.


Background: The Many Faces of Referral Spam

The problem of fake references in Google Analytics has changed significantly over the past 2 years. In 2014, we had some bots from semalt and buttons-for-website that visited your website and left fake referrals in your analytics. In December 2014, the attacks began taking advantage of a weakness in Google’s new Measurement Protocol that allowed direct attacks on the Google Analytics tracking servers without having to actually visit your website. This is a lot easier than crawling the web looking for new websites. There were a lot of different types of attacks, from many referral sources, leading to a lot of confusion in the industry.

Ranksonic joined in on the fun in March 2015, and the spammers enjoyed playing with new domains and techniques. We have had fake organic search terms (www.get-free-social-traffic.com) and fake events (event-tracking) injected into our analytics, too. Enterprising individuals popped up on Fiverr offering hundreds or even thousands of visits from real webmasters obtained using these techniques.

At the beginning of 2016, there were still lots of players pushing through Google’s defenses, even if only for a few days (see image). As of June, 2016, the trend continues with most ghost spam changing in a few days, but the crawlers seem to run for months.

In late 2016, the spam evolved again, this time focusing on inserting a fake Language and using a rotating series of fake and real sources. This latest blitz also uses valid hostnames on some of the traffic, indicating the spammer is working to get around the common protections people have deployed.

Why hasn’t Google stopped it all yet? Well, they have said they are working on it, but it is a tough fight. Trust me — it would be a LOT worse than it is if they were doing nothing. They just don’t talk about what they are doing, and that is a widely adopted security best practice: never talk about what protections you have in place. It makes it harder for the bad guys to work around your protection systems.


BEFORE YOU BEGIN: Create an Unfiltered View

Before you start hacking away at your Google Analytics settings, the best practice for implementing new filters starts like this – create an UNFILTERED VIEW, and a TEST VIEW.


1. Make sure you always have an Unfiltered view in your property, one that has absolutely no filters. This will ensure you always have the raw, unmodified data should things go wrong. There is no ‘undo’ for a bad filter.

2. Don’t create new filters directly in your main view. Create a new Test view that mirrors your main view in every other respect, and then add the filter(s) there first. Watch it for a few days and compare with the Unfiltered view to make sure it is doing what it should.

3. If you’re happy with the new filter based on this test, then go ahead and add the ‘existing filter’ to your main view.


1. New Website? Use a ‘-2’ Property

In early 2015, I coined the term “ghost referral” to identify the worst offenders like darodar because they actually NEVER VISIT YOUR SITE. Using some software magic, they post fake hits to Google’s tracking service using a random series of tracking IDs. When they pick a series of numbers that includes your tracking ID, Google records a referral visit from their source in your reports, even though they know nothing about your website and never visited it. A number of people have seen ‘traffic’ to Google Analytics accounts that have never been used…

When you create a Google Analytics Account, you also get a Property and a View. The Property gives you the tracking id (e.g. UA-1234567-1) that you use in the code snippet on your website. You can create 50 Properties in your Account, and they are given -1, -2, -3, … -50 extensions. Most ghost referral spam hits the default ‘-1’ Property, although some are now hitting -2, and -3 properties as well.
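
To make the numbering concrete, here is a small Python sketch that splits a tracking ID into its account number and property index. The names `parse_tracking_id` and `is_default_property` are hypothetical, and the pattern assumes the UA-account-index layout of the example ID above.

```python
import re
from typing import NamedTuple


class TrackingId(NamedTuple):
    account: str
    property_index: int


# Assumed layout: "UA-<account digits>-<property index>".
TRACKING_ID = re.compile(r"^UA-(\d+)-(\d+)$")


def parse_tracking_id(tracking_id: str) -> TrackingId:
    """Split a tracking ID into its account number and property index."""
    match = TRACKING_ID.match(tracking_id)
    if match is None:
        raise ValueError(f"not a UA tracking ID: {tracking_id!r}")
    return TrackingId(match.group(1), int(match.group(2)))


def is_default_property(tracking_id: str) -> bool:
    """True for the '-1' property, the one most ghost spam is said to hit."""
    return parse_tracking_id(tracking_id).property_index == 1


print(parse_tracking_id("UA-1234567-1"))
print(is_default_property("UA-1234567-2"))
```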

You can significantly reduce the spam simply by creating and using a second, third or fourth (or tenth) Property. You don’t have to actually use them all. Caution: changing your tracking code on your website will leave the historical data in the old property, so this is really only useful for new websites, or if you are willing to abandon your old data.



2. Implement a Valid Hostname Filter for Ghost Visits

THIS IS THE SINGLE MOST EFFECTIVE SOLUTION TO ELIMINATE FAKE SPAM TRAFFIC!

“Ghost” traffic never actually visits your website; it is injected into the Google Analytics tracking servers and appears in their reports. JavaScript filters, WordPress plugins and .htaccess methods are useless at blocking the traffic because there is no traffic to your website. You have no choice but to create a Google Analytics filter to exclude these visits because they ONLY exist in Google Analytics. The biggest problem with ghost traffic is that the sources change as quickly as they appear, so you could be continuously building filters for them.

Source versus Hostname

Real visits to your website from a referral link have TWO server names available: the Source that the link is from, and the Hostname that the landing page is pointing to (your server). In most cases, the Hostname should always be your server, regardless of where the traffic came from.


For example, here is a sample of the Source and Referral Path (page with the link on it) pointing to this article. Notice the Hostname is always my server.

[Image: referral report listing Source, Referral Path and Hostname]

Ghost visits send traffic to a random series of tracking ID numbers, so they don’t know your server name! They use blank (“(not set)”) or fake hostname values (like ‘google.com’). That means you can eliminate ALL of them simply by filtering to INCLUDE only the valid hostname: your server.

A. Identify Your Valid Hostnames

STEP CAREFULLY.  Valid hostnames are websites that you have configured to use your Google Analytics tracking ID (e.g. UA-12345678-1). They may include ecommerce shopping carts or telephone call tracking services linked from your website.

Start with a multi-year report showing just hostnames (Audience > Technology > Network > hostname), then identify the valid ones: the servers you REMEMBER configuring with your tracking ID (hint: google.com is NOT one of them).

UPDATE: if you have alternate domains that redirect to your main website domain, do NOT include those redirected domains. If you can type in one hostname/URL and it changes to display a page on a different domain, then it is NOT a valid hostname.

Many people have a problem with this step; here’s what I picked and why:

  • www.analyticsedge.com – my main website
  • help.analyticsedge.com – my help site configured with the same tracking ID
  • www.youtube.com – I have a YouTube channel with videos that I track using Google Analytics. I had to configure my tracking code in YouTube (what a pain that was! and it only tracks channel visits, not videos). NOT RECOMMENDED for general use.
  • sites.fastspring.com – I use FastSpring as my eCommerce provider to process payments. I configured it with my tracking code.

There are a number of translation and proxy services that may also record visits to your tracking ID because they display your original content through their servers. If the traffic is low, IGNORE THEM. Spammers have started using translate.googleusercontent.com as a hostname to bypass people’s filters, so don’t add that one. The most recent example is organic search spam with keywords “beat with a shovel the weak google spots addons.mozilla.org/en-us/firefox/addon/…”

FYI – googleweblight is a new service from Google that serves your pages to mobile networks in some parts of the world. It usually appears with your hostname in front, and it’s ok.

I do not have any tracking codes installed on google.com, mozilla.org, huffingtonpost.com or any of the other sites that appear in the report. I never configured my tracking code ON those sites — they are ghost visits!

IMPORTANT: If you see GOAL CONVERSIONS or REVENUE from (not set) hostnames, you need to dig into why. They may be Event-based call logging hits that are not associated with a pageview (which has a hostname value). You may need to adjust your filters and/or tracking code snippets.

B. Create the Filter Expression

Create a filter expression that captures all of the domains that you consider to be valid. TEST, TEST, TEST! Then move to production when you are sure you have it all. You may find it easier to play with an Advanced Segment, so you can see the effect of your filter without risking any data loss. See Step 4 below.

Many people have a problem composing the filter expression because it is Regex (regular expressions), so let’s keep it really simple in this case.

For your filter expression, simply enter your valid hostname. If you have more than one, separate them by a vertical bar ( | ). If you have a third-party payment service like checkout.shopify.com, you may need to enter it as well.

Note: if you can’t see the “Include” radio button in the Filter page, look BELOW the Exclude section (which is expanded when it is selected). When you select Include, the Exclude section will collapse and the Include section will expand as in the image.

It is not necessary to enter all of the subdomains (like www and help) – Regex will perform a partial match by default, so I keep the expression shorter by simplifying to just the root domains.

Note: in proper Regex, you should ‘escape’ the dots (\.), but since a dot matches any character and the likelihood and impact of a false match is negligible, I sometimes leave them out to keep it simple.

analyticsedge.com|youtube.com|fastspring.com
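
A quick way to sanity-check an expression like this before touching a view is to run it over the hostnames from your report. The Python sketch below is only a stand-in for the real filter: `re.search` approximates the partial matching described above, matching is made case-insensitive on the assumption that the filter's case-sensitive option is left off, and the last hostname is a made-up look-alike.

```python
import re

# The sample Valid Hostname expression, dots left unescaped as in the article.
VALID_HOSTNAMES = re.compile(
    r"analyticsedge.com|youtube.com|fastspring.com", re.IGNORECASE
)


def keep_hit(hostname: str) -> bool:
    """Include filter: keep only hits whose hostname matches the expression."""
    return VALID_HOSTNAMES.search(hostname) is not None


report = [
    "www.analyticsedge.com",
    "help.analyticsedge.com",
    "sites.fastspring.com",
    "(not set)",
    "google.com",
    "translate.googleusercontent.com",
]
for hostname in report:
    print(f"{hostname}: {'keep' if keep_hit(hostname) else 'drop'}")

# An unescaped dot matches any character, so this hypothetical look-alike
# is also kept. Escaping the dots (\.) would close that small gap.
print(keep_hit("analyticsedgeXcom.example"))
```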

IMPORTANT: do NOT end the expression with a vertical bar ( | ); use them only between domains.
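
A trailing bar is dangerous because it adds an empty alternative, and an empty pattern matches every string, so the Include filter would let everything through. The sketch below shows that behaviour with Python's `re` module; it assumes the Analytics filter engine treats an empty alternative the same way. The helper `has_empty_alternative` is a hypothetical pre-flight check, not part of any Google tool.

```python
import re

good = re.compile(r"analyticsedge.com|youtube.com|fastspring.com")
bad = re.compile(r"analyticsedge.com|youtube.com|fastspring.com|")  # trailing bar


def included(pattern: "re.Pattern[str]", hostname: str) -> bool:
    """True when an Include filter using this pattern would keep the hit."""
    return pattern.search(hostname) is not None


for hostname in ["www.analyticsedge.com", "(not set)", "google.com"]:
    print(hostname, included(good, hostname), included(bad, hostname))


def has_empty_alternative(expression: str) -> bool:
    """Naive check for a leading, trailing, or doubled vertical bar.

    It splits on every '|', so it does not understand escaped bars or groups.
    """
    return "" in expression.split("|")


print(has_empty_alternative("analyticsedge.com|youtube.com|"))
```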


3. Implement Spam Crawler Filters

Some spammers actually crawl the web and visit your site, and others have figured out what your hostname is, so the Valid Hostname filter won’t keep everything out. For these, you will need to specifically exclude their visits by naming them in a filter.

Note: if you are technically capable, you could block these sources using classic spam blocking techniques like using .htaccess rules. To learn a little more about these alternatives, you can read the article by Carlos Escalera.

Do NOT visit the referring site, since this is an invitation to get a virus or Trojan infection on your computer, or otherwise satisfy the desire of the spammer. I recommend you do a quick Google search first, to see if you can trust it. Spammers are quickly identified, and you’ll usually see indications in the first page of search results.

Creating a New Filter

You can exclude them from your reports in Google Analytics by creating a filter. You identify a “unique signature” that identifies them (and only them), and then create a filter based on that.

Most spam can be eliminated by filtering on Campaign Source. Most people try filtering on Referral, and that filter doesn’t always work because some spammers have used utm codes to stuff values into the Source and Medium, imitating a referral. Note that some of the spam now requires you to filter on the Language Settings field (Spam Crawlers Filter 5 at the top of this article).


Read Google’s instructions on making filters.

The latest filter expression I recommend is at the top of this article. Note that I take some shortcuts in my expressions to save space (there is a limit to the number of characters). I have not yet found any false matches for valid referrals in any of the web properties I have worked with, but you should be cautious of being too aggressive. I have seen some people recommend filtering on simple words like buy|cheap|motor|money|seo, which will simply match far too many valid domains to be recommended.
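
To see why short generic words make poor exclude expressions, the Python sketch below runs the broad expression quoted above and one of the specific expressions over a few sources. The legitimate-looking domains are made-up examples, and `re.search` is assumed to approximate the filter's partial matching.

```python
import re

BROAD = re.compile(r"buy|cheap|motor|money|seo")
SPECIFIC = re.compile(r"slifty.github.io|foxweber.com")  # Spam Crawlers Filter 6

# Hypothetical sources: the first three stand in for legitimate referrers.
sources = [
    "bestbuy.example",
    "motorcycle-forum.example",
    "seoul-travel.example",
    "foxweber.com",
]


def excluded(pattern: "re.Pattern[str]", source: str) -> bool:
    """True when an Exclude filter using this pattern would drop the hit."""
    return pattern.search(source) is not None


for source in sources:
    print(
        f"{source}: broad={excluded(BROAD, source)} "
        f"specific={excluded(SPECIFIC, source)}"
    )
```

The broad expression drops all three stand-in referrers and still misses the actual crawler, while the specific one drops only the crawler.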

As you discover new spammers you will have to add to the filters. Remember: these filters will exclude everything that matches, so be careful with your expressions, and TEST, TEST, TEST first.
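
Because each filter pattern has a length limit, newly discovered sources eventually spill into another filter. The following Python sketch packs a list of source fragments into as few expressions as it can. The 255-character limit and the function name `pack_expressions` are assumptions for illustration, so check the limit your own Analytics account enforces.

```python
from typing import List

MAX_PATTERN_LENGTH = 255  # assumed limit; verify in your own account


def pack_expressions(
    fragments: List[str], limit: int = MAX_PATTERN_LENGTH
) -> List[str]:
    """Join fragments with '|' into expressions no longer than `limit`."""
    expressions: List[str] = []
    current = ""
    for fragment in fragments:
        if not fragment:
            continue  # an empty fragment would create a match-everything bar
        if len(fragment) > limit:
            raise ValueError(f"fragment longer than limit: {fragment!r}")
        candidate = fragment if not current else current + "|" + fragment
        if len(candidate) <= limit:
            current = candidate
        else:
            expressions.append(current)
            current = fragment
    if current:
        expressions.append(current)
    return expressions


# A tiny limit is used here only to make the split visible.
new_sources = ["slifty.github.io", "foxweber.com", "enbersoft.com"]
for number, expression in enumerate(pack_expressions(new_sources, limit=30), 1):
    print(f"Filter {number}: {expression}")
```

Joining only between fragments means no expression can start or end with a vertical bar.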

 


4. Create a Custom Segment

To eliminate spam immediately from all of your reports, even historical reporting, you need to use a Custom Segment. If you have prepared the filter expressions above, you’ve already done all the hard work. If you skipped to this section, go back and start at Step 1.

Start with a copy of my segment from the Google Analytics Solution Gallery [2016-12-22], and modify it to suit, or follow these instructions:

In Google Analytics, open your Reporting view, and click +Add Segment.


Then click New Segment and enter a name like “All Users (No Spam)”. If you have multiple websites in your account, you should include the website in the name, like “All Users (AnalyticsEdge)”.


Select the Advanced > Conditions tab on the left. Create a new entry for the valid hostnames:

  • Sessions > Include
  • Hostname > matches regex: your valid hostnames expression (Step 2 above)

Then click + Add Filter and add the expressions for the Spam Crawlers:

  • [+Add Filter]
  • Sessions > Exclude
  • Source > matches regex: spam crawler expression #1    OR
  • Sessions > Exclude
  • Source > matches regex: spam crawler expression #2    OR
  • repeat for the rest of the filter expressions

Save and Apply the Segment

The easiest way to test is to use your new segment in combination with the default All Users segment, comparing the Sessions counts. You can find your new segment listed in the Custom grouping. You can select BOTH your new segment AND the All Users segment to compare.
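
The segment logic boils down to keeping sessions with a valid hostname, then dropping any whose source matches a crawler expression or whose language looks like spam. Here is a minimal Python sketch of that logic on a handful of made-up session rows, comparing the result with the unsegmented total. The `Session` type and the sample rows are hypothetical, and only excerpts of the crawler expressions are included for brevity.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Session:
    hostname: str
    source: str
    language: str


VALID_HOSTNAME = re.compile(r"analyticsedge.com|youtube.com|fastspring.com")
SPAM_SOURCES = [
    re.compile(r"slifty.github.io|foxweber.com"),       # Filter 6
    re.compile(r"share-buttons-for-free|dogsrun.net"),  # excerpt of Filter 3
]
SPAM_LANGUAGE = re.compile(r".{13,}|\.")                # Filter 5


def is_clean(session: Session) -> bool:
    """Include valid hostnames, then exclude spam sources and languages."""
    if not VALID_HOSTNAME.search(session.hostname):
        return False
    if any(pattern.search(session.source) for pattern in SPAM_SOURCES):
        return False
    return SPAM_LANGUAGE.search(session.language) is None


# Made-up rows: two real visits, one ghost, one crawler, one language spam.
sessions: List[Session] = [
    Session("www.analyticsedge.com", "google", "en-us"),
    Session("(not set)", "piulatte.cz", "en-us"),
    Session("www.analyticsedge.com", "share-buttons-for-free.com", "en-us"),
    Session("www.analyticsedge.com", "washingtonpost.com",
            "Congratulations to Trump and all americans"),
    Session("help.analyticsedge.com", "(direct)", "en-gb"),
]

clean = [session for session in sessions if is_clean(session)]
print(f"All Users: {len(sessions)}  All Users (No Spam): {len(clean)}")
```

The gap between the two counts is the spam the segment removes; a gap that looks too large suggests a valid hostname is missing from the expression.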

Sharing your Segment Definition With Coworkers

Google Analytics segments are normally account-specific, but a new feature allows you to share them with other people who have access to the same Google Analytics view. When editing the segment, click the link in the upper right corner and allow Collaborators to apply/edit the segment.


5. Turn On Google’s Bots & Spiders Option

Google Analytics has a simple checkbox you can use to exclude easy-to-identify bots and spiders, but you have to enable it for every View you use. In your Google Analytics Admin section, navigate to each View you use, select View Settings, and check the box to Exclude all hits from known bots and spiders.

This feature has recently started affecting referral spam as well (e.g. horoskop-baran.pl / referral), so TURN IT ON!


That’s It – You Are Spam-Free For Now!


What made me an expert about spam?

My name is Mike Sullivan; I am a Google Analytics Certified professional, and a Top Contributor in the Google Analytics community forum. I have been working extensively with the Google Analytics API since 2010, providing customized reporting solutions. I founded Analytics Edge in 2013, making a suite of free and inexpensive Excel report automation add-ins and connectors.

Spam was hounding my customers, so I dug into the problem with all the tools at my disposal and thought I’d share what I learned. I wrote this Definitive Guide, coining the term “ghost referrals”, to help resolve the confusion surrounding the various spam types and the different techniques required to deal with them.

Due to popular demand, I started a service to update the filters in clients’ accounts, and I continue it to this day, monitoring daily for new spam sources and updating filters as required. If you want to subscribe, click here for instructions.

I hope this article has helped you.


Top spammers past 30 days (2017-04-11)

ACTIVE SPAM REFERRALS:

vc.ru
free-fb-traffic.com
ecommerce-seo.org
amazon-seo-service.com
e-commerce-seo1.com
share-buttons-for-free.com
track-rankings.online
Election.Interferencer.Ru
timer4web.com
slifty.github.io
www1.enbersoft.com

ACTIVE LANGUAGE SPAM:

life.ru/t/%D1%82%D0%B5%D1%85%D0%BD%D0%BE%D0%BB%D0%BE%D0%B3%D0%B8%D0%B8/970904/vladieliets_domiena_googlecom_obvinil_google_inc_v_naghloi_lzhi

COMMON SEARCHES:

amazon-seo-service.com / referral
analytics referral exclusion
analytics referral exclusion list
analytics spam
analytics spam filter
block referral spam google analytics
com.google.android.gm / referral
ecommerce-seo.org / referral
exclude referral google analytics
filter analytics spam
filter google analytics spam
filter out spam google analytics
filter referral spam google analytics
filter spam google analytics
free share buttons top
fuck xyz
google analytics filter referral spam
google analytics filter spam
google analytics referral exclusion list
google analytics referral spam
google analytics spam
google analytics spam filter
google analytics spam referral filter
how to exclude spam from google analytics
how to filter out spam in google analytics
how to filter referral spam in google analytics
how to remove spam from google analytics
referral exclusion list
referral exclusion list analytics
referral exclusion list in google analytics
referral spam blocker
referral spam filter
referral spam google analytics
referral spam list
remove analytics spam
remove referral spam from google analytics
remove spam from google analytics
remove spam google analytics
removing spam from google analytics
spam analytics
spam filter analytics
spam filter google analytics
spam google analytics
spam in google analytics
spam referral google analytics
spam traffic in google analytics
www.fuck-paid-share-buttons.xyz


Historical list of spam sources detected

  • 2017-05-18  <words> cdn site:nodepm.com / keyword spam
  • 2017-05-15 foxweber.com / referral spam crawler
  • 2017-04-27 piulatte.cz / referral ghost spam
  • 2017-04-11 election.interferencer.ru / referral ghost/crawler spam using www.donaldjtrump.com hostname and Election.Interferencer.Ru language settings
  • 2017-04-09 free-fb-traffic.com / referral ghost spam
  • 2017-04-05 track-rankings.online / referral ghost spam
  • 2017-04-02 ecommerce-seo.org / referral ghost spam
  • 2017-03-30 slifty.github.io / referral spam crawler
  • 2017-03-26 enbersoft.com / referral spam crawler
  • 2017-03-20 e-commerce-seo1.com / referral ghost spam
  • 2017-03-20 amazon-seo-service.com / referral spam crawler
  • 2017-03-20 share-buttons-for-free.com / referral spam crawler
  • 2017-03-15 ^vc.ru / referral ghost/crawler spam temporarily added
  • 2017-03-09 e-commerce-seo.com / referral ghost spam
  • 2017-03-01 petitions.whitehouse.gov / referral ghost/crawler spam temporarily added
  • 2017-02-24 motherboard.vice.com / referral ghost/crawler spam temporarily added
  • 2017-02-22 google-liar.ru / referral crawler/ghost spam
  • 2017-02-20 abcacaiberrypills.com / referral ghost spam
  • 2017-02-20 elidadaidaihua.com / referral ghost spam
  • 2017-02-20 lidasale.com / referral ghost spam
  • 2017-02-20 12345678.com / referral ghost spam
  • 2017-02-20 buy2daydietpills.com / referral ghost spam
  • 2017-02-20 mztpills.com / referral ghost spam
  • 2017-02-20 slimming800.com / referral ghost spam
  • 2017-01-25 beepollen-zixitang.com / referral ghost spam
  • 2017-01-25 zixiutangstore.com / referral ghost spam
  • 2017-01-05 #-#.insider.pro / referral ghost spam
  • 2017-01-01 “youtu.be/7td18i0wyey – youtu.be/biblcys8a5i – youtu.be/mmcg3yyacz8 – all world watching these videos” keyword and landing page spam
  • 2016-12-20 top10-way.com / referral SPAM CRAWLER
  • 2016-12-20 “Congratulations to Trump and all americans” language spam using washingtonpost.com referral
  • 2016-12-09 website-analytics.online / referral ghost spam
  • 2016-12-09 “Google officially recommends o-o-8-o-o.com search shell!”  / language spam may have valid hostname, valid or fake referral sources
  • 2016-12-05 “o-o-8-o-o.com search shell is much better than google!”  / language spam may have valid hostname, valid or fake referral sources
  • 2016-11-30 datract.com / referral SPAM CRAWLER
  • 2016-11-30 lifehacĸer.com / referral SPAM CRAWLER
  • 2016-11-29 eyeserp.com / referral ghost spam
  • 2016-11-06 “Secret.ɢoogle.com You are invited! Enter only with this ticket URL. Copy it. Vote for Trump!” / language spam may have valid hostname, valid or fake referral sources
  • 2016-10-26 /sharebutton.to landing page spam
  • 2016-10-26 /www1.free-share-buttons.top landing page spam
  • 2016-10-26 various keyword combinations of sharebutton.to with social, share, buttons, website, this, html, linked, in, add, and other words
  • 2016-10-25 3-letter-domains.net / referral SPAM CRAWLER
  • 2016-10-24 24x7-server-support.site / referral ghost spam
  • 2016-10-12 “cdn front.to” organic keyword ghost spam
  • 2016-10-11 #-#.site-speed-check.site, #-#.site-speed-checker.site, #-#.site-speed-up.site, #-#.site-speed-up.top, #-#.website-speed-check.site, #-#.website-speed-checker.site, #-#.website-speed-up.top / referral ghost spam
  • 2016-10-02 – uptime-alpha.net / referral SPAM CRAWLER
  • 2016-09-24 – golden-catalog.pro / referral ghost spam
  • 2016-09-10 – scanner-alex.top, scanner-alexa.top, scanner-andrew.top, scanner-barak.top, scanner-brian.top, scanner-don.top, scanner-donald.top, scanner-elena.top, scanner-fred.top, scanner-george.top, scanner-ivan.top, scanner-jack.top, scanner-jane.top, scanner-jess.top, scanner-jessica.top, scanner-john.top, scanner-josh.top, scanner-julia.top, scanner-julianna.top, scanner-margo.top, scanner-mark.top, scanner-mary.top, scanner-marwin.top, scanner-nelson.top, scanner-olga.top, scanner-viktor.top, scanner-walt, scanner-walter, scanner-willy.top / referral ghost spam
  • 2016-08-30 – compliance-alex.top, compliance-alexa.top, compliance-andrew.top, compliance-barak.top, compliance-brian.top, compliance-don.top, compliance-donald.top, compliance-elena.top, compliance-fred.top, compliance-george.top, compliance-ivan.top, compliance-jack.top, compliance-jane.top, compliance-jess.top, compliance-jessica.top, compliance-john.top, compliance-josh.top, compliance-julia.top, compliance-julianna.top, compliance-margo.top, compliance-mark.top, compliance-mary.top, compliance-nelson.top, compliance-olga.top, compliance-viktor.top, compliance-willy.top / referral ghost spam
  • 2016-08-27 – homemade.gq / referral ghost spam
  • 2016-08-23 – gq-catalog.gq / referral ghost spam
  • 2016-08-22 – eyes-on-you.ga / referral ghost spam
  • 2016-08-18 – familyholiday.ml, executehosting.com, ussearche.cf, bugof.gq / referral ghost spam
  • 2016-08-18 – 1-free-share-buttons.com / referral SPAM CRAWLER
  • 2016-08-18 – 1-99seo.com / referral SPAM CRAWLER
  • 2016-08-12 – timer4web.com / referral SPAM CRAWLER
  • 2016-08-12 – globalscam.ga, wowas31.ucoz.ru, cookielawblog.wordpress.com, expdom.com / referral ghost spam
  • 2016-08-09 – spin2016.cf, turkeyreport.tk, nyfinance.ml / referral ghost spam
  • 2016-08-03 – fashionindeed.ml / referral ghost spam
  • 2016-08-02 – platezhka.net / referral SPAM CRAWLER
  • 2016-08-02 – californianews.cf / referral ghost spam
  • 2016-07-30 – pogodnyyeavarii.gq / referral ghost spam
  • 2016-07-30 – asacopaco.tk / referral ghost spam
  • 2016-07-28 – biketank.ga / referral ghost spam
  • 2016-07-27 – doyouknowtheword-flummox.ml / referral ghost spam
  • 2016-07-26 – ######.social-s-bbb-xyz / referral ghost spam
  • 2016-07-25 – itrevolution.cf / referral ghost spam
  • 2016-07-21 – kiwi237au.tk / referral ghost spam
  • 2016-07-21 – social-buttons-aaa.xyz, social-buttons-bbb.xyz, social-buttons-ccc.xyz, social-buttons-ddd.xyz, social-buttons-eee.xyz, social-buttons-fff.xyz, social-buttons-ggg.xyz, social-buttons-hhh.xyz, social-buttons-iii.xyz / referral ghost spam
  • 2016-07-21 – law-enforcement-one.xyz, law-enforcement-two.xyz, law-enforcement-three.xyz, law-enforcement-four.xyz, law-enforcement-five.xyz, law-enforcement-six.xyz, law-enforcement-seven.xyz, law-enforcement-eight.xyz, law-enforcement-nine.xyz, law-enforcement-ten.xyz/ referral ghost spam
  • 2016-07-20 – ranking2017.ga / referral ghost spam
  • 2016-07-20 – bestofferswalkmydogouteveryday.gq / referral ghost spam
  • 2016-07-18 – luxmagazine.cf / referral ghost spam
  • 2016-07-17 – ranking2017.ga / referral ghost spam
  • 2016-07-16 – exchangeit.gq / referral ghost spam
  • 2016-07-16 – “eu-cookie-law.info …” organic keyword ghost spam
  • 2016-07-15 – pokemongooo.ml / referral ghost spam
  • 2016-07-13 – bestchoice.cf / referral ghost spam
  • 2016-07-11 – seo-2-0.com / referral SPAM CRAWLER
  • 2016-07-08 – botd.wordpress.com / referral ghost spam
  • 2016-07-08 – eu-cookie-law.blogspot.* / referral ghost spam
  • 2016-07-08 – free-share-buttons.blogspot.* / referral ghost spam
  • 2016-07-04 – law-one.xyz, law-two.xyz, law-three.xyz, law-four.xyz, law-five.xyz, law-six.xyz, law-seven.xyz, law-eight.xyz, law-nine.xyz, law-ten.xyz / referral ghost spam
  • 2016-07-01 – free-share-buttons-???.xyz / referral ghost spam
  • 2016-06-27 – forum.topic#.ilovevitaly.xyz / referral ghost spam
  • 2016-06-19 – slow-website.xyz / referral ghost spam
  • 2016-06-19 – law-enforcement-check-*.xyz / referral ghost spam
  • 2016-06-19 – free-social-buttons-???.xyz / referral ghost spam
  • 2016-06-17 – site-auditor.online / referral ghost spam
  • 2016-06-16 – eu-cookie-law.info organic keywords ghost spam
  • 2016-06-11 – law-enforcement-bot-??.xyz / referral ghost
  • 2016-06-08 – social-buttons-??.xyz / referral ghost spam
  • 2016-06-04 – law-enforcement-??.xyz / referral ghost spam
  • 2016-06-04 – fix-website-errors.xyz / referral SPAM CRAWLER
  • 2016-05-31 – cookie-law-enforcement-ii.xyz / referral ghost spam
  • 2016-05-31 – cookie-law-enforcement-hh.xyz / referral ghost spam
  • 2016-05-31 – cookie-law-enforcement-gg.xyz / referral ghost spam
  • 2016-05-29 – cookie-law-enforcement-ff.xyz / referral ghost spam
  • 2016-05-28 – magicdiet.gq / referral ghost spam
  • 2016-05-28 – forum.topic#.ghostvisitor.com / referral ghost spam
  • 2016-05-28 – ghostvisitor.com / referral ghost spam
  • 2016-05-27 – burn-fat.ga / referral ghost spam
  • 2016-05-27 – cookie-law-enforcement-??.xyz / referral ghost spam
  • 2016-05-27 – eu-cookie-law-enforcement#.xyz / referral ghost spam
  • 2016-05-27 – http://link.web-list.xyz/ / referral ghost spam
  • 2016-05-27 – keywords-monitoring-success.com / referral SPAM CRAWLER
  • 2016-05-27 – monetizationking.net / referral ghost spam
  • 2016-05-27 – popads.net / referral
  • 2016-05-25 – www.get-free-social-traffic.com organic keywords ghost spam
  • 2016-06-23 – ownshop.cf / referral ghost spam
  • 2016-05-20 – a.pr-cy.ru / referral spam crawler
  • 2016-05-20 – eu-cookie-law-enforcement-#.xyz / referral ghost spam
  • 2016-05-18 – getlamborghini.ga / referral ghost spam
  • 2016-05-16 – dominateforex.ml / referral ghost spam
  • 2016-05-08 topquality.cf / referral ghost spam
  • 2016-05-05 share-button.xyz / referral ghost spam
  • 2016-05-04 marketland.ml / referral ghost spam
  • 2016-04-30 unpredictable.ga / referral ghost spam
  • 2016-04-28 increasewwwtraffic.info / referral ghost spam
  • 2016-04-26 website-stealer-warning-alert.hdmoviecams.com / referral ghost spam
  • 2016-04-25 ‘i came up with a method and 1,5 years forcing…’ ghost organic keywords
  • 2016-04-25 lots of other ghost organic keywords, including share-button.xyz, m-google.xyz, socialbutton.xyz, and others
  • 2016-04-25 social-traffic-#.xyz / referral ghost spam
  • 2016-04-18 smartphonediscount.info / referral ghost spam
  • 2016-04-18 free-social-buttons#.xyz / referral ghost spam (# is 2, 3, 6, 7, etc)
  • 2016-04-17 forum.topic#.6hopping.com and free-social-buttons6.xyz / referral ghost spam
  • 2016-04-14 keywords-monitoring-your-success.com / referral crawler
  • 2016-04-11 makeprogress.ga / referral ghost spam
  • 2016-04-10 m-google.xyz and fuck-paid-share-buttons.xyz / referral ghost spam
  • 2016-04-09 free-video-tool.com / referral spam crawler added to filter expressions
  • 2016-04-08 getrichquickly.info / referral ghost spam
  • 2016-04-01 getrichquick.ml /referral and яндех-херня.рф / referral ghost spam
  • 2016-03-23 magnet-to-torrent.com / referral and torrent-to-magnet.com / referral added to spam crawler filters
  • 2016-03-23 adtiger.tk / referral ghost spam
  • 2016-03-21 wordpresscore.com/ referral ghost spam
  • 2016-03-18 feedback.sharemyfile.ru /referral ghost spam
  • 2016-03-15 rank-checker.online / referral ghost spam
  • 2016-03-10 dogsrun.net / referral spam crawler
  • 2016-03-05 #.responsive-test.net / referral spam crawler
  • 2016-03-03 why.does.spacebarnot.work? / organic search from o-o-11-o-o.com hostname ghost spam
  • 2016-03-03 uptime.com / referral spam crawler
  • 2016-02-28 hostgator.com  / referral ghost spam
    stablehost.com  / referral ghost spam
    digitalfaq.com  / referral ghost spam
    bluehost.com  / referral ghost spam
    site5.com  / referral ghost spam
    cutalltheshit.com  / referral ghost spam
    veerotech.com  / referral ghost spam
    mddhosting.com  / referral ghost spam
    siteground.com  / referral ghost spam
  • 2016-02-27 domain-tracker.com  / referral ghost spam
  • 2016-02-16 go.ekatalog.xyz / referral ghost spam
  • 2016-02-10 китай.с.новым.годом.рф / referral ghost spam
  • 2016-01-26 how.to.travel.and.make.money.with.maps.ilikevitaly.com / referral ghost spam
  • 2016-01-22 web-revenue.xyz / referral ghost spam (traffic2cash.xyz)
  • 2016-01-22 free-traffic.xyz / referral ghost spam (sharebutton.to)
  • 2016-01-20 social-widget.xyz / referral ghost spam (sharebutton.to)
  • 2016-01-20 free-social-buttons.xyz / referral ghost spam (sharebutton.to)
  • 2016-01-20 net-profits.xyz / referral ghost spam (sharebutton.to)
  • 2016-01-20 traffic-cash.xyz / referral ghost spam (traffic2cash.xyz)
  • 2016-01-13 rankings-analytics.com / referral spam crawlers
  • 2016-01-03 share-buttons.xyz / referral ghost spam (sharebutton.to)
  • 2016-01-01 с.новым.годом.рф / referral (ilovevitaly) ghost spam
  • 2015-12-31 happy.new.yeartwit.com / referral (ilovevitaly) ghost spam
  • 2015-12-29 build-a-better-business.2your.site / referral ghost spam (ontraport.com)
  • 2015-12-25 build-audience.for-your.website/ referral ghost spam  (easyvideosuite.com / easywebinar.com)
  • trafficgenius.xyz / referral ghost spam (publishvault.com)
  • new-look.for-your.website / referral ghost spam (teslathemes.com)
  • onlinetvseries.me / referral ghost spam
  • uptimechecker.com / referral spam crawlers
  • uptimebot.net / referral spam crawlers
  • topseoservices.co / referral ghost spam (www.semrush.com)
  • website-analyzer.info/ referral ghost spam (ranksonic.com)
  • smarter-content.for-your.website / referral ghost spam (scribecontent.com)
  • 2015-12-22 traffic2cash.xyz / referral ghost spam
  • 2015-12-21 w3javascript.com  / referral ghost spam
  • 2015-12-20 website-stealer.nufaq.com / referral ghost spam
  • 2015-12-19 website-stealer-warning.hdmoviecamera.net / referral ghost spam
  • 2015-12-14 ^scripted.com /referral added to spam crawlers filters
  • 2015-12-09 googlemare.com / referral (ilovevitaly) ghost spam
  • 2015-12-08 boost-my-site.com / referral (ranksonic.com) ghost spam
  • 2015-12-08 top1-seo-service.com / referral (semalt.com) added to spam crawler filters
  • 2015-12-04 santasgift.ml / referral ghost spam
  • 2015-12-04 rusexy.xyz / referral ghost spam
  • 2015-12-02 quit-smoking.ga / referral ghost spam
  • 2015-12-01 o-o-8-o-o.com / referral ghost spam
  • 2015-11-27 cyber-monday.ga /referral ghost spam
  • 2015-11-27 fast-wordpress-start.com / referral (startwp.org) added to spam crawler filters
  • 2015-11-27 lsex.xyz / referral (http://work-from-home-earn-money-online.com/) ghost spam
  • 2015-11-27 traffic2cash.org / referral ghost spam
  • 2015-11-26 black-friday.ga / referral ghost spam
  • 2015-11-26 kiwe-analytics.com / referral ghost spam
  • 2015-11-24 adf.ly / referral ghost spam
  • 2015-11-23 hosting-tracker.com / referral (syfonix.com) ghost spam
  • 2015-11-20 wordpress-crew.net / referral added to spam crawler filters
  • 2015-11-19 get-your-social-buttons.info / referral (sharebutton.to) ghost spam
  • 2015-11-18 traffic2cash.net / referral ghost spam
  • 2015-11-18 ranksonic.net / referral (ranksonic.com) ghost spam
  • 2015-11-17 snip.to / referral (snip.ly) ghost spam
  • 2015-11-16 alibest.com /referral ghost spam
  • 2015-11-16  claim#######.copyrightclaims.org / referral (ilovevitaly.com) ghost spam
  • 2015-11-16 dbutton.net / referral spam crawler
  • 2015-11-16 o00.in / referral spam crawler
  • 2015-10-08 rankings-analytics.com /referral spam crawler
  • 2015-10-04: nexus.search-helper.ru / referral spam crawler
  • 2015-09-21: rednise.com / referral ghost spam
  • 2015-09-21: 127.0.0.1:80## / referral: some spammers don’t know what they are doing
  • 2015-09-16: best-seo-software.xyz / referral ghost spam
  • 2015-09-15: justprofit.xyz added to spam crawler filters
  • 2015-09-01: qualitymarketzone.com which redirects to www.tkqlhce.com
  • 2015-09-01: seo-platform.com which redirects to affiliate.ranksonic.com
  • 2015-08-26: ghost spam is free from the politics, we dancing like a paralytics / organic keywords
  • 2015-08-15: how-to-earn-quick-money.com / referral
  • 2015-08-13: sexyali.com / referral
  • 2015-08-13: hongfanji.com / referral
  • 2015-08-09: free-floating-buttons.com / referral
  • 2015-08-09: get-free-social-traffic.com / referral
  • 2015-08-05: satellite.maps.ilovevitaly.com / referral
  • 2015-08-05: chinese-amezon.com / referral
  • 2015-07-29: pops.foundation / referral
  • 2015-07-24: traffic2money.com / referral
  • 2015-07-20: e-buyeasy.com / referral
  • 2015-07-02 site#.floating-share-buttons.com / referral
  • 2015-06-26 erot.co / referral
  • 2015-06-18 webmonetizer.net / referral
  • 2015-06-09 howtostopreferralspam.eu / referral and organic
  • 2015-06-04 trafficmonetizer.org / referral
  • 2015-06-03 непереводимая.рф / referral
  • 2015-06-01 непереводимая.рф / organic
  • 2015-05-29 sanjosestartups.com / organic
  • 2015-05-27 websites-reviews.com / referral
  • 2015-05-26 sanjosestartups.com / referral
  • 2015-05-21  ilovevitaly.com / organic
  • 2015-05-19 s.click.aliexpress.com / organic
  • 2015-05-15 site4.free-share-buttons.com / referral and free-social-buttons.com / referral and webmaster-traffic.com / referral
  • 2015-05-06 www.event-tracking.com / referral, www.kabbalah-red-bracelets.com / referral, guardlink.org / referral and some spikes of direct traffic (direct) / (none).
  • 2015-04-28 (google / organic) search spam with keyword “vitaly rules google…”
  • 2015-04-24 (free-share-buttons.com / referral, pornhub-forum.ga / referral, youporn-forum.ga / referral, rapidgator-porn.ga / referral, domination.ml / referral, torture.ml / referral, www.Get-Free-Traffic-Now.com / referral, buy-cheap-online.info / referral, theguardlan.com / referral)
  • 2015-04-06 (editors.choice#######.hulfingtonpost.com / referral) and (googlsucks.com / referral) and Get-Free-Traffic-Now.com
  • 2015-04-02 addons.mozilla.org / referral
  • 2015-03-26 4webmasters.org / referral
  • 2015-02-23 www1.social-buttons.com / referral
  • 2015-03-16  s.click.aliexpress.com / referral and simple-share-buttons.com
  • 2015-03-11 ranksonic.org
  • 2015-03-04 humanorightswatch.org / referral
  • 2015-02-25: o-o-6-o-o.com / referral
  • 2015-02-11: message####.cenokos.ru
  • 2015-02-04: bestwebsitesawards.com / referral
  • 2015-01-27:  cenoval.ru / referral
  • 2015-01-19:  “google officially -recommends ilovevitaly.com search shell” and “resellerclub scam” organic
  • 2015-01-15: hulfingtonpost.com / referral
  • …and more…

Comments: (moderated, no spam)

  1. Paul de Fombelle

    Hello!
    Fantastic work, thank you. I read a lot of incomplete content about this issue before landing here.

    There is one identified source of Google Analytics spam that you didn’t mention:

    -Direct traffic
    -Correct hostname
    -100% new users, 100% bounce, avg. session duration 00:00:00
    -Service provider: ovh hosting inc
    -All traffic comes from specific cities: in my case Farmington (USA) and Macapa (Brazil)
    -I read that IP address seems to be 158.69.229.6, but I don’t know how to check this (seems to be a known IP: https://www.abuseipdb.com/check/158.69.229.6?page=1#report)

    I’m affected by this, and it looks like a lot of other people are. Are you aware of it?

    Again, thank you for your amazing work.
    Best,
    Paul

    1. mike_sullivan

      Yes, I am aware of it, and a few of my customers have specific filters to remove it, but as I said to Donna, it may be worth blocking the traffic as opposed to just filtering it out of GA. Some of the ecommerce bots do more than just collect data — some try to reserve inventory by putting things into shopping carts. This is all outside of my area of specialty.

  2. Donna Duncan

    Hi Mike,
    I’m wondering if you have a suggestion for how to deal with Amazon’s bots that artificially inflate traffic and bounce rates.

    1. mike_sullivan

      First, I think you mean the amazonaws bots which are running on their cloud server farms. Many are targeting specific websites for various reasons. You can try filtering by network domain or service provider, but you would be better to try blocking by .htaccess file and IP address ranges since these bots can sometimes be used for not-so-nice purposes to mess with your ecommerce platform. This is all outside the scope of this article, and can be a lot more technical.

  3. Dandelion

    Here’s a followup question. I’ve had an issue with sharebutton.to. I see that you reference them on your Historical List several times. Yet they aren’t in your filter expressions. I assume because they aren’t a .com? Should I just create a stand alone exclude by campaign source filter for this one and others that have different domain extensions?

  4. Dandelion

    I’ve wanted to write for a long time and say “THANK YOU” for taking the time to write up this guide! It has meant so much to have reliable information with step by step instructions I can actually apply!!

    I have created new properties for all my young sites and I’ve applied the hostname include filter (first to test view, then to master). Now I’m ready to apply step 3 to my sites but I am a little confused.

    Should I create exclude filters for all 6 of the expressions you have at the top of the article?

    And, is there a reason why they aren’t in any particular order? That confused me a bit. First there’s one from 2015, then a few from March 2017, but not always following reverse chronology. So I thought I might be misunderstanding something.

    1. mike_sullivan

      Yes, all 6 are needed. No order…some were modified to drop old temporary entries and add newer ones.

  5. Paul

    I am getting a number of referrals from a domain adspreview.simpli.fi but it does not seem to be included in the filters. Simpli.fi appears to be an ad platform, but the referrals are 100% bounces. Is this a spam referrer that others are seeing?

    1. mike_sullivan

      I have not seen this in any of the accounts I monitor, so it is not widespread — it could be targeted. Contrary to popular opinion, 100% bounce rate does not automatically mean spam. You need to look at landing pages, city/country of origin, and browser/browser version. Another telltale is whether the referral left in your account takes you somewhere that does not have a link to your site. Since Simpli.fi is an advertising platform, and the server is ‘adspreview’, it could be from an ad that you (a customer?) or an affiliate is running. It is very common for poorly constructed ad campaigns to produce 100% bounce rates.

  6. Tim Rowley | Web 4 Panama

    I’ve looked through many and this is absolutely the best guide to Analytics spam filtering out there. This is so helpful.

  7. Phil

    The last few nights I have been doing some serious research on this topic. All of a sudden my daily visits shot up and I could tell something wasn’t right about the traffic. It is irritating that they started all of a sudden. Oh well. Thank you for the write-up and suggestions.

  8. Jamie

    Why are alternate domains/domains that I redirect to my main website not valid hostnames that I should include in my reports? Even though they are not the primary domain, don’t I still want to track those hits in Google Analytics?

    1. mike_sullivan

      If you type in the alternate domain and your browser automatically redirects you to your main domain (as shown in the browser address bar), then no one would ever actually ‘visit’ the alternate domain — they ‘land’ on your main domain. Your alternate domain should not be included in the valid hostname filter.

      If you type in the alternate domain and your browser shows your website content with the alternate domain in the browser address bar, then people can visit that domain, and it should be included in the valid hostname filter.

  9. Murray Finlayson

    Hi Mike,
    Wondering what you can tell us about MJ12Bot?
    Most of my clients are hosted on Shopify and they all seem to have this listed in the robots.txt. I have tried removing it, but it won’t seem to go away.
    It seems to hammer the (Shopify) server, so when ahrefs (fully legitimate) tries to crawl the site they receive an error: ‘429 Too Many Requests’. The problem is ahrefs then reports no backlinks, which subsequently impacts ahrefs (and I believe) MOZ metrics.
    My theory may be complete BS, so would appreciate your views on MJ12Bot? Perhaps it is completely harmless?
    Murray

    1. mike_sullivan

      Not my area of specialty, but I understand you can’t change the Shopify robots.txt. Crawlers are the responsibility of the organization running them — it is their problem to deal with the 429 error, not yours. If they fail, they should return to crawl another day – that is generally how they work. If they get 429 errors, they should back off and crawl your site slower. If that were the whole situation, it should resolve itself in time. There is nothing to do, especially if the bot is for link metrics usage only (as opposed to search engine discovery).

      But…there are a number of not-so-nice bots hammering ecommerce servers for a variety of reasons, and Shopify likely throttles them to prevent real people from experiencing delays. From a hosting perspective, bots that do not respect the robots.txt file should get throttled or blocked.

  10. thomas

    Mike fantastic Blog thx.

    I just implemented a language spam filter. However, the filter only allowed me to use 255 characters. So do I have to set up several language filters?
    Further, is it just my feeling or did Analytics get rid of quite a lot of language spam by January 1, 2017?

    1. mike_sullivan

      Setting up a language spam filter is a waste of time right now – it all stopped before New Year’s.

      Yes, there is a limit to the length of a filter, but why would you need a long one for language? The short little regex expression I provide is just as (if not more) effective.

    2. Alex Denne

      Hey Thomas,

      The language filter is just “.{13,}|\.”

      So you shouldn’t have an issue with a character limit there.

      Best,

      Alex

  11. Chip

    After applying this, the only issue now is that ALL referral traffic has been removed from my reports and I know that there is legitimate referral traffic being filtered out. How do we rectify this? Thank you.

    1. mike_sullivan

      What can I say….you did something wrong. You have added an extra vertical bar ( “||” ) in an expression, or ended an expression with one (“…domain.com|”) – those are the common mistakes. The expressions in the article are in active use in hundreds of properties and work.
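
To see why a stray vertical bar wipes out all referral traffic, here is a minimal sketch. It assumes Python's `re.search` is a reasonable stand-in for the partial matching that Google Analytics applies to an exclude filter; the `matches` helper and the `news.example.org` referrer are hypothetical. An empty alternative (from "||" or a trailing "|") matches the empty string, which is present in every source.

```python
import re

def matches(expression: str, source: str) -> bool:
    """Partial, case-insensitive match (assumed approximation of a GA filter)."""
    return re.search(expression, source, re.IGNORECASE) is not None

good = "semalt|ranksonic|dailyrank"
double_bar = "semalt||ranksonic|dailyrank"    # empty alternative in the middle
trailing_bar = "semalt|ranksonic|dailyrank|"  # empty alternative at the end

legit = "news.example.org"  # hypothetical legitimate referrer

for name, expr in [("good", good), ("double_bar", double_bar), ("trailing_bar", trailing_bar)]:
    # An exclude filter removes every hit whose source matches the expression.
    print(f"{name:13} excludes {legit}: {matches(expr, legit)}")
```

Only the first expression leaves the legitimate referrer alone; the other two would exclude every source, which is consistent with the symptom Chip describes.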

  12. hatwar

    Hi, thanks for the post.
    I applied all the rules mentioned. After that something strange happened: my hostname source is showing zero from the very beginning of the site, more than a year ago. Has no one ever visited the site?

    In addition, when I set the hostname to (not set) it shows 100% of sessions, with the website URL at 0%. The Unfiltered view and Master view have shown the same statistics for the one day since the filter was set.
    Is there a problem with hosting?

    1. mike_sullivan

      Re-read the section on hostname filter carefully. If you get it wrong, it will let nothing through – practice with the Segment as described in the last section.

      If your hostname report ONLY shows (not set) listed as a hostname, then yes, there is NO REAL TRAFFIC being recorded in that Google Analytics account. This does happen to people. The usual reason is that the tracking code snippet required is not on your website. That is a Google Analytics setup requirement — it often gets dropped or forgotten during website updates.

  13. Michael

    Just realised you can’t set a two-dimensional filter for a view.
    That forces me to exclude the USA entirely.

    1. mike_sullivan

      If the traffic is actually ‘spam’ it would leave something behind that points you to a website. Since this traffic does not seem to do that, I think it may be ‘real’ traffic that is somehow directed to your website…possibly in error, or possibly by someone else for unknown reasons. Facebook traffic typically appears as spikes because of the nature of the Facebook feeds — things appear at the top for a brief time then quickly get pushed down the page.

      To filter any traffic you don’t want, first identify the UNIQUE combination of things — in your case, you mentioned a campaign ID behind a query parameter. I would filter on that. I do not recommend filtering by country unless you are a local-only business. Make sure to keep an UNFILTERED view just in case this turns out to exclude real traffic by accident.

      1. Michael

        Thanks for the answer.

        This FB traffic spikes within a single hour. If it were normal FB traffic, it would have been spread over several more hours.
        Unfortunately, combining does not work.
        But I am afraid that it is real traffic, just a false one.
        I can’t use the parameter, because some of it seems to be OK.
        I will try to find a way to get rid of it. Normally I wouldn’t filter a whole country, but it is a local store so the collateral damage should be very low.

  14. Michael

    Hello Mike,

    thanks for caring about this topic.
    I noticed a lot of Facebook referral traffic. It appears in spikes. The bounce rate is nearly 100%. The hostname is valid. The traffic comes from the USA, while the website is in Europe and has no business in the USA. At first it was just one city, Oshkosh. Now there are 7 cities left, after I filtered Oshkosh.
    When a spike hits, it is about a third of the daily traffic.
    What also marks it as spam is that the traffic appears at midnight.

    The only special thing about this traffic is that it puts a Facebook campaign ID behind a query parameter (‘s=’).
    Do you know anything about that? Maybe it is some kind of spam you haven’t mentioned yet.

    I will start filtering the parameter in combination with the USA. Do you have any other suggestions?

  15. Hans_F

    Is there an easy way to do automated reporting with these spam filter segments? I work with a lot of different accounts and changing the spam segments manually to include the specific hostname is taking up quite some time. I want to automate my reporting so that me and my colleagues can work more efficiently.

    I’ve been looking for an answer but couldn’t really find a good solution to fully automate this process.

  16. Gabriela

    I just loved this post; it is the best spam-removal tutorial I’ve ever read. Thanks for it and for updating it all the time.

    I have a question: when you create a custom segment and you write REGEX you aren’t scaping dots and – … is it right to write it like this:

    127.0.0.1|nexus.search-helper.ru|rankings-analytics.com|dbutton.net

    or it would be correct to scape characters like this:

    127\.0\.0\.1|nexus\.search\-helper\.ru|rankings\-analytics\.com|dbutton\.net….

    I’m not sure how the regex will work if I don’t scape those characters.

    1. mike_sullivan

      The proper term is ‘escape’ the dots, which are special characters in Regular Expressions. By default, they mean ‘any character’. In a lot of my filters, I do not bother with the escape character (“\.” instead of “.”) because there is only a very, very remote possibility that a character other than the dot itself could result in a match. The filter expression also has a character limit, so those extra characters would force extra filters to be made. To be correct and safe, escape the dots.
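
The trade-off described above can be checked with a short sketch. It assumes Python's `re` module behaves like the Google Analytics filter engine for these simple patterns; the sample sources ending in `.example` are hypothetical.

```python
import re

unescaped = re.compile(r"127.0.0.1|dbutton.net")
escaped = re.compile(r"127\.0\.0\.1|dbutton\.net")

samples = [
    "dbutton.net",          # real entry from the filter list
    "127.0.0.1:8080",       # real entry from the filter list, with a port
    "dbutton-net.example",  # hypothetical: '-' sits where the dot would be
    "127-0-0-1.example",    # hypothetical: '-' sits where the dots would be
]

for source in samples:
    loose = unescaped.search(source) is not None
    strict = escaped.search(source) is not None
    print(f"{source:22} unescaped={loose!s:5} escaped={strict}")

# Each escaped dot costs one extra character against the filter length limit.
print(len(escaped.pattern) - len(unescaped.pattern), "extra characters")
```

Both forms catch the intended spam sources. Only the unescaped form also catches the look-alike sources, which is the remote possibility mentioned in the reply.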

  17. Daniel Ndukwu

    I really appreciate this post, That Donald Trump Spam was really skewing my analytics data.

    I feel like a spam fighting boss right now.

  18. Jan

    Hi Mike, thanks a lot for the good work!
    I imported the segment from the Analytics Solutions Gallery and noticed that sometimes you use the top level domain extension (.com, .ru) and sometimes you don’t. Is there a specific reason for this difference?
    Regards!

    1. mike_sullivan

      The expression needs to match spam but not real sources. Sometimes there is a valid source domain with a different TLD.
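
A rough illustration of that choice, assuming Python's `re.search` approximates the partial match of a Campaign Source filter. The `acme-seo` domains and `semalt.net` are invented for the example; only `semalt` comes from the filter expressions in this guide.

```python
import re

def excluded(expression: str, source: str) -> bool:
    """Partial, case-insensitive match (assumed approximation of a GA filter)."""
    return re.search(expression, source, re.IGNORECASE) is not None

# Without a TLD: one short entry covers every variant of the spammer's name.
for source in ["semalt.com", "semalt.semalt.com", "semalt.net"]:
    print(source, excluded("semalt", source))

# With a TLD: needed when a real site shares the name under another TLD.
# Hypothetical case: acme-seo.xyz is spam, acme-seo.com is a legitimate referrer.
for source in ["acme-seo.xyz", "acme-seo.com"]:
    print(source, "name only:", excluded("acme-seo", source),
          "| with TLD:", excluded(r"acme-seo\.xyz", source))
```

Dropping the TLD keeps the expression short and broad; keeping it protects a legitimate source that would otherwise be excluded along with the spam.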

  19. Anita

    Hello Mike,
    thanks for this great article.
    I have two questions:
    A)
    I have found a similar article and for the Language spam filter the other article suggests to use
    \s[^\s]*\s|.{15,}|\.|,

    you suggest
    .{13,}|\.

    Can you explain the difference please?

    B)
    It seems that you are updating the Spam Crawler Filter Expressions on a regular basis. Will you do that in the future, too?
    So all I would have to do is visit your site and apply the new filter expressions (if they suit my site). Is that correct?

    (I am not only applying this to my GA; I am also writing an article in German and I will link to your article as well)
    Thank you!
    Anita

    1. mike_sullivan

      The other expression includes a criterion about spaces, 15 characters instead of 13, and commas. You could use that if you want. The key is that it must match the spam but not a real language.

      Yes, I have been updating the expressions for the past 2 years and will continue to do so.
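
The difference between the two language expressions can be seen by running both against a few values. This is a sketch that assumes Python's `re` module matches these patterns the same way the Google Analytics filter does; the last two sample values are made up to show where the expressions disagree.

```python
import re

SHORT = re.compile(r".{13,}|\.")                # expression from this guide
LONGER = re.compile(r"\s[^\s]*\s|.{15,}|\.|,")  # expression from the other article

def is_spam(pattern: re.Pattern, language: str) -> bool:
    return pattern.search(language) is not None

samples = [
    "en-us",                        # real language code
    "zh-hans-cn",                   # real language code, 10 characters
    "o-o-8-o-o.com search shell!",  # spam text containing a dot
    "abcdefghijklmn",               # hypothetical: 14 characters, no dot
    "en, us",                       # hypothetical: contains a comma
]

for language in samples:
    print(f"{language!r:32} short={is_spam(SHORT, language)!s:5} "
          f"longer={is_spam(LONGER, language)}")
```

Both expressions leave the real language codes alone and catch the spam text. They differ only on edge cases such as a 13 or 14 character value, or a short value containing a comma or two spaces.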

  20. Ron

    Recently I created my Google Analytics account and associated it with my Google Search Console. I have noticed some suspicious activity in my Language section. When I go to Audience > Overview, I find certain suspicious characters in the Language section (beside Demographic). It looks like spam to me. I showed the screenshot to an expert and she told me that those are referral spam. How do I get rid of them?

  21. Vivek Patel

    Hi Mike,

    A few of us have noticed language spam “Vitaly rules google ☆*:。゜゚・*ヽ(^ᴗ^)ノ*・゜゚。:*☆ ¯\_(ツ)_/¯(ಠ益ಠ)(ಥ‿ಥ)(ʘ‿ʘ)ლ(ಠ_ಠლ)( ͡° ͜ʖ ͡°)ヽ(゚Д゚)ノʕ•̫͡•ʔᶘ ᵒᴥᵒᶅ(=^ ^=)oO” and “Google officially recommends o-o-8-o-o.com search shell!”

    The hostnames for these are http://www.usatoday.com, http://www.cnn.com and http://www.telegraph.co.uk.

    Is it a good idea to create an exclude filter for the hostname?

    If anyone can guide me, it will be appreciated. :)

    Thanks in advance!
    Vivek

  22. Barry

    You’ve probably already noticed, but it looks like there’s new language spam which has no ‘.’ in it:

    Vitaly rules google ☆*:。゜゚・*ヽ(^ᴗ^)ノ*・゜゚。:*☆ ¯\_(ツ)_/¯(ಠ益ಠ)(ಥ‿ಥ)(ʘ‿ʘ)ლ(ಠ_ಠლ)( ͡° ͜ʖ ͡°)ヽ(゚Д゚)ノʕ•̫͡•ʔᶘ ᵒᴥᵒᶅ(=^ ^=)oO

    Can you please update with how to get around that one!? Thanks!

    Barry

  23. Julio Siqueira

    Hi, Mike!
    First of all, thanks for the great explanation/tutorial on how to deal with these referral spams. It’s really helpful.
    Secondly, I’ve also noticed some of this spam shows up in the social section of analytics. Any specific thing to do about this or is it already covered in the filters you mentioned?
    Best,

  24. vera...obmana

    I can see that for these fake referrals, like reddit.com, lifehacĸer.com, motherboard.vice.com, the “network domain” is clodo.ru

    Is there a way to filter out “network domain”? I tried but it was not in the selection for the exclude filter

    1. mike_sullivan

      I do not recommend filtering by domain if the domain is a generic ISP such as clodo.ru (that’s a hint…ISP domain). Also, the spammer uses a lot of other ISPs, so the filter is of limited use.

  25. Shailendra Dubey

    Hi Mike,

    Thanks for sharing this useful article.

    My problem is that I am getting spam referrals with a valid hostname. Below is a sample of the report

    Referral Path Hostname
    /new-revolutionary-shell-from-lifehacĸer.com mywebsite.com

    How can i handle this?

    1. mike_sullivan

      Yes, the latest spam attacks have used valid hostnames for some websites. The lifehacker (with a weird ‘k’) would be blocked by the Spam Crawlers 4 expression – use that in your segment.

  26. Irene

    Hi everyone!

    This article is great, I implemented all the filters and I block all the spam for now, thank you very much.

    I only have a doubt. Although I don’t see the spam referral sources anymore in GA Acquisition, I still can see them in Real Time. Is that normal? Maybe I made some mistake implementing the filters? I talk specifically about motherboard.vice.com.

    Thank you in advance!

    1. mike_sullivan

      Yes, it is normal to see some spam in the real time views but not in the main reports. That is your filters (and Google’s) at work.

  27. Ady Jo

    I’ve been trying to filter several ghost referral sites, but the same ‘…would not have changed your data.’ message appears when the filter is verified. I’ve tried putting the tunnel-bar, and even including a backslash before each period/dot. Any suggestions on what to edit to make the filter verifiable?

    Here is the list of our site’s ghosts:[…]

    I’d appreciate any insights on this, thanks.

    1. mike_sullivan

      I do not comment on individual filter expressions since it is too easy to overlook a space or extra character in the wrong place – I publish a set that I know works. The filter verification feature often does not work – try your filter in a test view first.

  28. TrendyInners

    My analytics account got affected by spammers. This is very useful, but I have 50 hostnames; how can I find the spam hostnames?

    1. mike_sullivan

      Re-read the section on valid hostnames — most people have 1 valid hostname, so the other 49 would be fake hostnames used by the spammer. Your valid hostname appears in the browser address bar when you visit your website. My website is NOT http://www.foxnews.com, so that is NOT a valid hostname for my analytics.
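
As a sketch of that triage, the snippet below splits a hostname report into valid and fake entries. It assumes `mydomain.com` (the placeholder used in this guide) is the only valid hostname, and the report rows and session counts are invented for illustration.

```python
import re

# Assumption: the one valid hostname is the guide's placeholder, mydomain.com.
VALID_HOSTNAME = re.compile(r"mydomain\.com", re.IGNORECASE)

# Hypothetical hostname report rows: (hostname, sessions).
hostname_report = [
    ("www.mydomain.com", 1200),
    ("mydomain.com", 300),
    ("www.foxnews.com", 45),
    ("(not set)", 80),
    ("www.usatoday.com", 12),
]

def split_hostnames(report):
    """Separate rows an include filter would keep from rows it would drop."""
    valid, fake = [], []
    for hostname, sessions in report:
        bucket = valid if VALID_HOSTNAME.search(hostname) else fake
        bucket.append((hostname, sessions))
    return valid, fake

valid, fake = split_hostnames(hostname_report)
print("kept:   ", [h for h, _ in valid])
print("dropped:", [h for h, _ in fake])
```

With 50 hostnames in a real report, the same idea applies: everything that is not a hostname you actually serve your site from lands in the dropped list.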

  29. Lars

    Hi Mike,

    I am a beginner in the industry so excuse me if this question seems very basic.

    In the beginning of this post you say that there is no ‘undo’ button for a filter. What do you mean by this?

    In my GA account, next to every filter it says “delete”. Did they recently change it or is there something else I’m missing here?

    1. mike_sullivan

      Once data in your view is processed for the day, there is no changing it. If you create a filter that removes too much data, and don’t fix it (or delete it) for a couple of days, all that data is lost forever — it cannot be ‘undone’ for those days. Google Analytics will not go back and reprocess the data for the past.

    1. mike_sullivan

      Language Settings is the Language field in your reports. My use of the expression \. means anything with a period in it, which covers all web addresses. The dot character is a special character in Regex, so it needs to be ‘escaped’ with a \ in front of it: \.

      1. Jason Bouwmeester

        Appreciate the quick response Mike. And I understand the use of the \. – just not sure how the Language Settings would have a web address in it, or how this only filters out the fake traffic based on Unicode characters in the URL.

      2. mike_sullivan

        Check your Audience > Geo > Language report….. “Secret.ɢoogle.com You are invited! Enter only with this ticket URL. Copy it. Vote for Trump!” is showing up as a language for many fake referrals. If you filter on the dot, it doesn’t matter that the ‘G’ is a Unicode character.

      3. Donna Duncan

        Mike, Are we supposed to add the language settings exclusion to the new reporting segment as well? If yes, would we choose

        add filter
        exclude
        language contains
        \.

      4. mike_sullivan

        No…Be careful with the Segment expression…

        With a Custom Filter, the expressions are always “Regular Expressions” or Regex. In Regex, the dot character is a special character and must be ‘escaped’ by preceding it with a backslash (\.).

        With a Custom Segment, you can choose between “contains” or “matches regex”. If using “contains” the expression is a simple dot (.), but if using “matches regex” the expression needs to be escaped (\.).
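
The two segment modes can be mimicked in a few lines. This is only a sketch: it assumes "contains" is a plain substring test and "matches regex" is a partial regex match, and the helper names are invented.

```python
import re

def contains(value: str, text: str) -> bool:
    """Assumed behaviour of the segment's 'contains' operator: substring test."""
    return text in value

def matches_regex(value: str, pattern: str) -> bool:
    """Assumed behaviour of 'matches regex': partial regular expression match."""
    return re.search(pattern, value) is not None

spam_language = "Secret.ɢoogle.com You are invited! Enter only with this ticket URL."
real_language = "en-us"

print(contains(spam_language, "."), contains(real_language, "."))             # plain dot
print(matches_regex(spam_language, r"\."), matches_regex(real_language, r"\."))  # escaped dot
print(matches_regex(real_language, "."))  # unescaped dot matches any character
```

The last line is the trap: an unescaped dot under "matches regex" also matches a real language such as en-us, so the segment would exclude everything.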

  30. raaahlouf

    Hi, how can you be aware of all these things and details?? How can you be informed of so many things on the subject? It’s still strange …. it makes one wonder if you are not a link in the chain!

    1. mike_sullivan

      I am surprised no one else has voiced that concern yet. I know the things I do because of the business I run: I sell products that download data direct from Google Analytics and other sources for people to build custom reports in Microsoft Excel. My customers saw the spam way back in December 2014 and I could tell that it was different from previous referral spam. With the API, you can download 7 dimensions at the same time, so it is easy for me to see Source/Medium, Hostname, City, Country, Browser, Operating System, Language all in one report, and that makes it easy to identify a unique combination of traffic that ‘spikes’ when these fake referrals appear.

      Since that time, I created my own tracking programs that scan for new sources of traffic daily. I started offering a service to my customers to manage filters for them, and now I scan hundreds of their sites on a daily basis, looking for common traffic amongst them [I build analysis tools for a living]. Because I offer a service to keep my customers spam free, I have to review the data every day and respond to any new threat. I share what I see with everyone through my articles and social media because my spam filtering service is a side business for me.
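
The idea of spotting a unique combination of dimensions that spikes can be sketched with a simple counter. This is not the author's tracking program. The rows, the threshold and the `spike_candidates` helper are all hypothetical, and the data is assumed to have been downloaded already as one row per session.

```python
from collections import Counter

# Hypothetical rows, one per session:
# (source_medium, hostname, city, country, browser, operating_system, language)
spam_row = ("free-social-buttons.xyz / referral", "(not set)", "Samara",
            "Russia", "Chrome", "Windows", "ru")
rows = [spam_row] * 5 + [
    ("google / organic", "www.mydomain.com", "Boston", "United States",
     "Safari", "Macintosh", "en-us"),
    ("google / organic", "www.mydomain.com", "Leeds", "United Kingdom",
     "Chrome", "Windows", "en-gb"),
]

def spike_candidates(rows, min_sessions):
    """Return dimension combinations seen at least min_sessions times."""
    counts = Counter(rows)
    return [(combo, n) for combo, n in counts.most_common() if n >= min_sessions]

for combo, sessions in spike_candidates(rows, min_sessions=3):
    print(sessions, "sessions:", " | ".join(combo))
```

Real visitors rarely share all seven values, so a combination that repeats many times in a short period is worth a closer look.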

  31. Luigi Riccobono

    Very great article! Mike Thanks for your job. I will post in my blog a direct link to this page to share your precious and helpful work!

  32. corrie

    In including those hostnames, did you create another view that includes hostnames, or just create a filter under the test view? I am testing your steps now, and we have a test view for creating changes. Thanks for a very insightful article!

    1. mike_sullivan

      The valid hostname filter is created in the Test view along with the other spam filters, and after confirmation, applied to your main reporting view.

  33. mike_sullivan

    The ghost spam in Google Analytics would have no impact on Google AdWords functionality, since it actually does not exist. AdWords tracks clicks by a different method than Analytics tracks sessions.
