it's a raspberry pi
Precautions, cutback, elimination, prevention
This article is related to Webalizer - Server usage reports.
Referrer spam (also known as referral spam, log spam or referrer bombing[1]) is a kind of spamdexing (spamming aimed at search engines). The technique involves making repeated web site requests using a fake referrer URL to the site the spammer wishes to advertise. Sites that publish their access logs, including referer statistics, will then inadvertently link back to the spammer's site. These links will be indexed by search engines as they crawl the access logs, improving the spammer's search engine ranking. Except for polluting their statistics, the technique does not harm the affected sites.
✂ Wikipedia
... , the technique does not harm the affected sites.
Referrer spam can harm the affected sites after all. Search engines may give your site a bad reputation, and clickable links may even direct your visitors or yourself to infected websites!
You'll need access and full administrative rights to deal with:
/var/www/html                               # http(s) root
/var/www/html/.htaccess                     # directive file, server behaviour
/var/www/html/robots.txt                    # directive file, webcrawlers
/var/www/html/OutputDir                     # Webalizer output directory
/var/www/html/OutputDir/webalizer.current   # Webalizer main database
/etc/webalizer/webalizer.conf               # Webalizer configuration file
/var/log/apache2/access.log                 # Apache server access log file
/var/log/apache2/error.log                  # Apache server error log file
Rules of thumb
Never use so-called bulk submission services, even if they are offered for free. Sooner or later you will be confronted with a lot of scammers and spammers, first of all in your mailbox. Start with the big three: Bing, Yahoo! and Google. Always submit your websites in the ordinary manner.
Reputable search engines obey the robots.txt file. Instruct the search engine crawlers not to crawl your statistics pages, so that they will not publish the links in their directories. Put this at the top of your robots.txt.
Keep in mind: like the .htaccess, the robots.txt is processed from top to bottom, like a batch file.
User-agent: *
...
Disallow: /OutputDir
...
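To check that rules like the ones above really keep well-behaved crawlers away from the statistics pages, they can be run through Python's standard robots.txt parser. This is only a sketch: `/OutputDir` is the article's placeholder, and the helper name `is_crawlable` is made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The two rules from the robots.txt above; /OutputDir is a placeholder path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /OutputDir
"""


def is_crawlable(path, robots_txt=ROBOTS_TXT, agent="*"):
    """Return True if a crawler that obeys robots.txt may fetch the path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ("/OutputDir/index.html", "/index.html"):
        state = "crawlable" if is_crawlable(path) else "blocked"
        print(path, state)
```

This only tells you what a polite crawler would do. Spam bots ignore robots.txt, which is why the remaining steps are still needed.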
Never use Webalizer's standard output directory '/webalizer'. Spammers could search for that.
Move the content from '/webalizer' to your HDD and copy it back into the new directory.
root@raspberry:~# nano /etc/webalizer/webalizer.conf
OutputDir /var/www/html/myserverstats # name it whatever you like
Leave the META-tag with "noindex,nofollow".
HTMLHead <meta name="robots" content="noindex,nofollow">
The LinkReferrer option determines whether entries in the referrer table are plain text or an HTML link.
LinkReferrer no        # standard format, plain text
HideURL /OutputDir
IgnoreURL /OutputDir
Understanding Apache's server access.log file
Read each entry from left to right.
303.202.101.321                      # decimal IP address of the client
ba04d64a                             # hexadecimal IP address of the client or
303-202-101-321.client.example.com   # IP address of the client in combination with a host name
-                                    # identity of the client determined by identd
                                     # on the client's machine. Returns a hyphen (-)
                                     # if this information is not available
[28/Dec/2017:10:34:12]               # time that the request was received
"GET /pic.png HTTP/1.1"              # request line: HTTP method used, requested resource and protocol
200                                  # status code the server sends back
5867                                 # size of the object requested
"http://spam.com.ua/"                # the referral link. Returns a hyphen (-)
                                     # if this information is not available
"Mozilla/5.0 (...)"                  # the user agent. Returns a hyphen (-)
                                     # if this information is not available
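The fields described above can also be pulled apart by a script. The following is a minimal sketch that assumes Apache's "combined" log format, which has one more field (the remote user) than listed above and a time zone in the timestamp. The sample line and its address 203.0.113.7 are invented for illustration.

```python
import re

# One named group per field of the combined log format (assumed here).
LOG_PATTERN = re.compile(
    r'(?P<client>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Invented sample entry, not taken from a real log.
SAMPLE = (
    '203.0.113.7 - - [28/Dec/2017:10:34:12 +0100] '
    '"GET /pic.png HTTP/1.1" 200 5867 '
    '"http://spam.com.ua/" "Mozilla/5.0 (...)"'
)


def parse_line(line):
    """Return the fields of one log entry as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    entry = parse_line(SAMPLE)
    print(entry["client"], entry["status"], entry["referrer"])
```

Entries whose quoted fields contain escaped double quotes are not matched by this simple pattern, so `parse_line` returns `None` for them.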
Once you are infected, what's next? Take the prescription in ❸ first.
Consult /var/log/apache2/access.log and /var/log/apache2/error.log for investigation.
Study "Log Files" in the documentation of the Apache HTTP Server Version 2.4.x.
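For the investigation it helps to see which foreign hosts show up most often as referrers. The sketch below assumes the combined log format, where the referrer is the second quoted field of each entry. The function name, the sample lines and `example.org` as your own host name are placeholders.

```python
from collections import Counter
from urllib.parse import urlsplit

# Invented sample entries; in practice read the lines from access.log.
SAMPLE_LINES = [
    '203.0.113.7 - - [28/Dec/2017:10:34:12 +0100] "GET / HTTP/1.1" 200 512 "http://spam.com.ua/" "bot"',
    '203.0.113.8 - - [28/Dec/2017:10:35:40 +0100] "GET / HTTP/1.1" 200 512 "http://spam.com.ua/page" "bot"',
    '198.51.100.4 - - [28/Dec/2017:10:36:01 +0100] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '198.51.100.5 - - [28/Dec/2017:10:37:15 +0100] "GET /a.png HTTP/1.1" 200 900 "https://example.org/index.html" "Mozilla/5.0"',
]


def referrer_hosts(lines, own_host="example.org"):
    """Count foreign referrer host names in access log lines."""
    counts = Counter()
    for line in lines:
        parts = line.split('"')
        if len(parts) < 6:
            continue  # not a complete combined-format entry
        host = urlsplit(parts[3]).hostname  # None for "-" or garbage
        if host and host != own_host:
            counts[host] += 1
    return counts


if __name__ == "__main__":
    for host, hits in referrer_hosts(SAMPLE_LINES).most_common(10):
        print(hits, host)
```

Hosts that appear at the top of this list without ever linking to you are candidates for the blocking rules that follow.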
Limit the bad bots' activities with your .htaccess directives. This only limits the bandwidth taken!
You may limit the access from various IPs, domains and top level domains (TLD).
Put this at the top of your .htaccess. Blocked clients get the error code 403 Access forbidden. Apache 2.4 accepts negated Require lines only inside a <RequireAll> block.
<Limit GET>
  <RequireAll>
    Require all granted
    Require not ip 111.222.333.444
    Require not ip 555.666.777
    Require not ip 888.999
    Require not host spam.com.ua
    Require not host info
  </RequireAll>
</Limit>
If you also get spam in your forum, guestbook or any other board, then use
<Limit POST>
  <RequireAll>
    Require all granted
    Require not ip 111.222.333.444
    Require not ip 555.666.777
    Require not ip 888.999
    Require not host spam.com.ua
    Require not host info
  </RequireAll>
</Limit>
Once you have changed the OutputDir, redirect unwanted visitors to a harmless external page.
Redirect /OutputDirOld https://duckduckgo.com/about
ErrorDocument 403 https://duckduckgo.com/about
That's not all. Now we try harder and step ahead to evil's core.
root@raspberry:~# nano /etc/webalizer/webalizer.conf
Scroll down and look for the IgnoreReferrer examples.
Here you can set whatever you want to keep out of the referrals.
Reject top level domains (TLD), domains, IPs, or certain expressions appearing in domain names.
IgnoreReferrer *.ru             # rejects a top level domain (TLD)
IgnoreReferrer westio.com       # rejects a domain
IgnoreReferrer essaydates.com   # rejects a domain
IgnoreReferrer casino           # rejects any referrer containing the expression "casino"
I never tested the following method:
IgnoreReferrer http://          # rejects all non-https domains
IgnoreReferrer http://1         # rejects non-https IP addresses
IgnoreReferrer http://2         # ...
IgnoreReferrer ...              # ...
IgnoreReferrer http://8         # ...
IgnoreReferrer http://9         # ...
IgnoreReferrer *                # does this ignore every referrer?
IncludeURL *google.*            # can pass if it's a friendly bot
IncludeURL *yahoo.*
IncludeURL *bing.*
Quite a good question. In fact, it is only a matter of time until you get it solved.
Meanwhile you can tidy up manually. webalizer.current is a normal ASCII file.
Study it carefully from top to bottom until you know what you want to wipe.
Do a backup first!
root@raspberry:~# nano /var/www/html/OutputDir/webalizer.current
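The backup recommended above can be taken with a few lines of Python before the file is opened in the editor. This is a sketch, not a required tool: `backup_file` is a made-up helper name, and the naming scheme with a timestamp and `.bak` suffix is just one possible convention.

```python
import shutil
import time
from pathlib import Path


def backup_file(path):
    """Copy a file to <name>.<YYYYmmdd-HHMMSS>.bak in the same directory.

    Returns the path of the copy. Contents and timestamps are preserved.
    """
    source = Path(path)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = source.with_name(f"{source.name}.{stamp}.bak")
    shutil.copy2(source, target)
    return target


# Example call, using the article's placeholder path:
# backup_file("/var/www/html/OutputDir/webalizer.current")
```

Because the copy stays in the output directory, the script needs the same write permissions there as the editor does.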
28-Dec 2017
env=!dontlog directive
The SetEnvIf and SetEnvIfNoCase directives can be used in your global Apache (2.4) configuration file, e.g. if you get lots of visits from search engine spiders (bots), certain IPs or so-called referrer spammers.
I figured out that another method is very effective against referrer spam.
Please follow the internal link.
Regularly study Apache's access.log and error.log.
04-Mar 2018
ufw is a front end application for iptables. Here you get the basic handling of your personal, but effective, firewall for IPv4 and IPv6. ufw is a comfortable command line application for managing your personal iptables rules in Linux.
Follow this link: Install & configure the so-called "ufw" | Uncomplicated firewall for Linux web servers.
07-Jun 2018