Apache 2.4 web server | robots.txt & xml sitemap, RSS feed


Apache setup on a Raspberry Pi mini computer, used as a dedicated game server (LAMP).
This is simply another HTTP web server, powered by Apache software
and running on a Raspberry Pi with Raspbian Stretch Lite (Debian-based).

Place robots.txt in the top-level directory (document root) of your web server, e.g. /var/www/html.
Use all lower case for the filename: robots.txt, never Robots.TXT or anything similar.

robots.txt tells search engine spiders (bots) how to crawl and index your web content.
If you do not have a robots.txt file, your web server logs a 404 error whenever a bot requests it. To stop these 404 entries, you can upload a blank text file named robots.txt.

Some search engines allow you to specify the address of an XML sitemap, but if your site is small you do not need to create one.

Please also note that if your robots.txt contains errors and spiders cannot recognize your commands, they will continue crawling through your domain.


Simple robots.txt example


        # robots.txt for http://www.yourdomain.tld/
        # you are not allowed to use external URLs in your sitemaps!

        # first, reject some unwanted spider bots
        User-agent: Yandex
        User-agent: Baiduspider
        User-agent: SemrushBot
        User-agent: SemrushBot-SA
        Disallow: /

        # second, admit all other spider bots, but ban certain extensions & sub-folders
        User-agent: *
        Disallow: /cgi-bin/
        Disallow: /documents/
        Disallow: /*.mp3$
        Disallow: /*.mp4$
        Disallow: /*.rar$

        # sitemap locations (must be full URLs)
        # for pages like .htm* .php* .xhtml etc.
        Sitemap: http://www.yourdomain.tld/sitemap.xml
        # for pages & images
        Sitemap: http://www.yourdomain.tld/sitemap-images.xml
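
If you want to check rules like these before uploading them, a small Python sketch can help. It uses the standard-library module urllib.robotparser and a shortened copy of the example above. The wildcard lines (/*.mp3$ and so on) are left out on purpose, because this parser may only do plain prefix matching. The names ROBOTS_TXT and load_rules are made up for this illustration.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the example rules (assumption: wildcard rules omitted,
# since the standard-library parser may not interpret * and $).
ROBOTS_TXT = """\
User-agent: Yandex
User-agent: Baiduspider
Disallow: /

User-agent: *
Disallow: /cgi-bin/
Disallow: /documents/

Sitemap: http://www.yourdomain.tld/sitemap.xml
"""


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)
base = "http://www.yourdomain.tld"

# A rejected bot may not fetch anything.
print(rules.can_fetch("Yandex", base + "/index.php"))
# Any other bot may fetch normal pages ...
print(rules.can_fetch("Googlebot", base + "/index.php"))
# ... but not the banned sub-folders.
print(rules.can_fetch("Googlebot", base + "/cgi-bin/test.cgi"))
```

This only tells you how Python reads the file; real spiders may interpret single lines differently.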
        

Structure of the XML sitemap (for text pages only)


        <?xml version="1.0" encoding="UTF-8"?>
        <!-- sitemap.xml -->
        <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
                xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
        <url> <loc>http://www.yourdomain.tld/index.php</loc> <lastmod>2017-10-13T00:27:25+00:00</lastmod> <changefreq>weekly</changefreq> <priority>1.00</priority> </url>
        <url> <loc>http://www.yourdomain.tld/pages/admin.html</loc> <lastmod>2017-10-13T00:27:25+00:00</lastmod> <changefreq>weekly</changefreq> <priority>0.50</priority> </url>
        </urlset>
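
If you do not want to type the sitemap by hand, the following Python sketch builds the same structure with the standard library. It is only one possible way to do it. The function name build_sitemap and the tuple layout of the page list are assumptions made for this example; the two pages are the ones from the sitemap above.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages) -> bytes:
    """Build a sitemap from (url, last_modified, changefreq, priority) tuples."""
    # Serialize the sitemap namespace as the default xmlns, without a prefix.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, last_modified, changefreq, priority in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        # W3C datetime: date and time joined by "T", with a UTC offset.
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = (
            last_modified.isoformat(timespec="seconds")
        )
        ET.SubElement(url, f"{{{SITEMAP_NS}}}changefreq").text = changefreq
        ET.SubElement(url, f"{{{SITEMAP_NS}}}priority").text = f"{priority:.2f}"
    # xml_declaration=True puts the <?xml ...?> line first in the output.
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


modified = datetime(2017, 10, 13, 0, 27, 25, tzinfo=timezone.utc)
sitemap_bytes = build_sitemap([
    ("http://www.yourdomain.tld/index.php", modified, "weekly", 1.0),
    ("http://www.yourdomain.tld/pages/admin.html", modified, "weekly", 0.5),
])
print(sitemap_bytes.decode("utf-8"))
```

Write the returned bytes to sitemap.xml in your web root. The xsi:schemaLocation attribute is not generated here.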

Structure of the XML sitemap (for images & pages, Google-specific extension)


        <?xml version="1.0" encoding="UTF-8"?>
        <!-- sitemap-images.xml -->
        <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
                xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
        <!-- just one image in your page -->
        <url> <loc>http://www.yourdomain.tld/pages/admin.html</loc> <image:image> <image:loc>http://www.yourdomain.tld/images/loony.jpg</image:loc> </image:image> </url>
        <!-- more than just one image in your page -->
        <url> <loc>http://www.yourdomain.tld/index.php</loc> <image:image> <image:loc>http://www.yourdomain.tld/images/fiddler.png</image:loc> </image:image> <image:image> <image:loc>http://www.yourdomain.tld/images/backpiper.gif</image:loc> </image:image> </url>
        </urlset>
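
To see whether an image sitemap like the one above really says what you think it says, you can read it back with Python. The sketch below lists the images found for each page. The names SAMPLE and images_per_page are invented for this example, and the sample is a shortened copy of the sitemap above.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes used for the lookups below.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "image": "http://www.google.com/schemas/sitemap-image/1.1",
}

SAMPLE = """\
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
<url> <loc>http://www.yourdomain.tld/pages/admin.html</loc>
  <image:image> <image:loc>http://www.yourdomain.tld/images/loony.jpg</image:loc> </image:image> </url>
<url> <loc>http://www.yourdomain.tld/index.php</loc>
  <image:image> <image:loc>http://www.yourdomain.tld/images/fiddler.png</image:loc> </image:image>
  <image:image> <image:loc>http://www.yourdomain.tld/images/backpiper.gif</image:loc> </image:image> </url>
</urlset>
"""


def images_per_page(xml_text: str) -> dict:
    """Map every page URL to the list of image URLs declared for it."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    result = {}
    for url in root.findall("sm:url", NS):
        page = url.findtext("sm:loc", namespaces=NS)
        images = [loc.text for loc in url.findall("image:image/image:loc", NS)]
        result[page] = images
    return result


for page, images in images_per_page(SAMPLE).items():
    print(page, "->", images)
```

If the file is not well-formed, ET.fromstring raises a ParseError instead of returning a result.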

How to ping Bing (MSN)/Yahoo! & Google with your *.xml files?


Assuming your sitemaps are located at http://yourdomain.tld, use the ping URLs below to submit your updated *.xml files.




Bing (MSN) / Yahoo! Slurp


        https://www.bing.com/webmaster/ping.aspx?sitemap=http://yourdomain.tld/sitemap.xml
        


Google Inc.


        https://www.google.com/webmasters/sitemaps/ping?sitemap=http://yourdomain.tld/sitemap.xml
        https://www.google.com/webmasters/sitemaps/ping?sitemap=http://yourdomain.tld/sitemap-images.xml
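
If you have several sitemaps, you may prefer to put the ping URLs together with a few lines of Python. The sketch below only builds the strings and sends nothing. The two endpoints are the ones listed above, and whether they still answer is not checked here. Percent-encoding the sitemap address is an assumption on my side (it is the usual way to pass a URL as a query value); the names PING_ENDPOINTS and ping_url are made up.

```python
from urllib.parse import urlencode

# Ping endpoints as listed in the text above (not verified to be live).
PING_ENDPOINTS = {
    "bing": "https://www.bing.com/webmaster/ping.aspx",
    "google": "https://www.google.com/webmasters/sitemaps/ping",
}


def ping_url(engine: str, sitemap_url: str) -> str:
    """Return the ping URL for one search engine and one sitemap."""
    # urlencode percent-encodes ":" and "/" inside the query value.
    return PING_ENDPOINTS[engine] + "?" + urlencode({"sitemap": sitemap_url})


for engine in PING_ENDPOINTS:
    for name in ("sitemap.xml", "sitemap-images.xml"):
        print(ping_url(engine, "http://yourdomain.tld/" + name))
```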

09-Oct 2017


The structure of a really simple RSS feed


        <?xml version="1.0" encoding="UTF-8"?>
        <!-- the xmlns declarations belong inside the opening rss tag -->
        <rss version="2.0"
             xmlns:content="http://purl.org/rss/1.0/modules/content/"
             xmlns:wfw="http://wellformedweb.org/CommentAPI/"
             xmlns:dc="http://purl.org/dc/elements/1.1/"
             xmlns:atom="http://www.w3.org/2005/Atom">

        <channel>

               <title>Main header</title>
               <link>http://example.com</link>
               <description>Use a short header description</description>
               <language>en-gb</language>

               <item>
               <title>First article</title>
               <description>Use a short description</description>
               <!-- Link to article -->
               <link>http://example.com/one.html</link>
               <!-- Date published -->
               <pubDate>Mon, 30 Apr 2018 00:00:00 +0200</pubDate>
               </item>

               <item>
               <title>Second article</title>
               <description>Use a short description</description>
               <!-- Link to article -->
               <link>http://example.com/second.html</link>
               <!-- Date published -->
               <pubDate>Mon, 30 Apr 2018 00:00:00 +0200</pubDate>
               </item>

        </channel>

        </rss>
        

The <pubDate> element is optional, but be aware that its format is strict (RFC 822 date and time).
The +0200 indicates that the time is 2 hours ahead of GMT.
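
Because the date format is so strict, it is safer to let a program write it. A minimal Python sketch, assuming you have the publication time as a datetime with a time zone; the helper name rss_pub_date is invented for this example.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def rss_pub_date(moment: datetime) -> str:
    """Format an aware datetime the way <pubDate> expects it."""
    if moment.tzinfo is None:
        raise ValueError("pubDate needs a datetime with a time zone")
    return format_datetime(moment)


# +0200: two hours ahead of GMT, as in the feed example.
plus_two = timezone(timedelta(hours=2))
stamp = rss_pub_date(datetime(2018, 4, 30, 0, 0, 0, tzinfo=plus_two))
print(stamp)  # Mon, 30 Apr 2018 00:00:00 +0200

# Reading the string back gives the same moment again.
parsed = parsedate_to_datetime(stamp)
print(parsed.utcoffset())  # 2:00:00
```

The weekday is calculated from the date, so it cannot disagree with it.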


If a single character is missing, e.g. from a closing tag, you get a white screen in your web browser.
The same happens if you add special characters such as & unescaped; write &amp; instead.
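
You can catch both kinds of mistake before the browser does. The Python sketch below checks whether a piece of XML is well-formed and escapes special characters when an item is put together. The function names is_well_formed and make_item are assumptions for this illustration, not part of any feed library.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


def make_item(title: str, link: str) -> str:
    """Build one <item>, escaping &, < and > in the text values."""
    return (
        "<item>"
        f"<title>{escape(title)}</title>"
        f"<link>{escape(link)}</link>"
        "</item>"
    )


# A bare & breaks the feed ...
print(is_well_formed("<item><title>Rum & barrels</title></item>"))
# ... the escaped version is fine ...
print(is_well_formed(make_item("Rum & barrels", "http://example.com/one.html")))
# ... and a missing closing tag breaks it again.
print(is_well_formed("<item><title>First article</title>"))
```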


20-Jul 2018
