or a possibility to add your own meta tags
Maybe it makes sense to set this by default anyway?
If not, maybe rather allow plugins to define meta tags, implement this as a plugin, and publish it on the Marketplace.
By default the anonymous user has no access, so that should already block search engines from finding the content.
I don't think we want it in core as it's good to index the login pages of Piwik (for example they link to piwik.org)
+1 for doing it in a plugin
So why close this issue when we could build a plugin? I still think this is actually very useful for users.
Hi, this issue should be re-evaluated. See my post here: #8036. It's about preventing people from googling for Piwik installations.
<meta name="robots" content="noindex,nofollow"> on installation pages, updater pages, reporting UI pages, admin pages, and others if any
<meta name="robots" content="index,follow"> on the login form.
Just wanted to comment on mattab's comment above:
"it's good to index the login pages of Piwik (for example they link to piwik.org)"
Of course, from an SEO point of view (backlinks) I can understand that.
From a security point of view, an easy-to-find login page is not that great.
One could, for example, easily do the following, FULLY AUTOMATED:
1) search for login page
2) start brute force attack
3) when you are successful: look for ecommerce
4) extract/download everything
5) make a database of ecommerce data
6) sell it to everyone (competitors)
On other systems there is a great effort to hide the login page with the following:
So I would strongly vote for noindex, nofollow, noarchive on the login page as well.
I wonder why it still has index,follow on the login page. After every update I have to change it. If I don't, the page ends up high in the search results for our company name. This is obviously not something anyone wants. The meta tag is for controlling search engines; why would a search engine need to display a login page for admins?
I fully agree with AramVK, I don't want the stats page to show up among my top search results. My workaround for now:
I created a general file norobots.txt outside of the document root:
User-agent: *
Disallow: /
Then I added to the VirtualHost in Apache:
Alias /robots.txt /path/to/norobots.txt
Voila, the site is ignored by search engines and I don't have to change anything after an update.
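For anyone who wants to sanity-check rules like these before deploying them, here is a minimal sketch using Python's standard `urllib.robotparser`. It only models how a well-behaved crawler reads the rules; it says nothing about what a search engine decides to index. The function name `is_allowed` and the example paths (`/index.php`, `/piwik/`, `/blog/`) are my own assumptions for illustration, not anything from Piwik.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, path: str, agent: str = "*") -> bool:
    """Return True if a crawler obeying robots_txt may fetch the given path."""
    parser = RobotFileParser()
    # parse() takes the file content as an iterable of lines
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# The rules from the workaround above: block everything on this host.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

# Hypothetical variant that blocks only one directory (path is an assumption).
BLOCK_PIWIK_ONLY = "User-agent: *\nDisallow: /piwik/\n"

if __name__ == "__main__":
    for name, rules in (("block all", BLOCK_ALL), ("block /piwik/", BLOCK_PIWIK_ONLY)):
        for path in ("/index.php", "/piwik/index.php", "/blog/"):
            print(f"{name:14} {path:18} allowed={is_allowed(rules, path)}")
```

With `Disallow: /` every path on the host is off limits to crawlers, which is why the file must only be served for the host that runs the stats installation.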
This was also my solution until I recently got a warning from Google Search Console:
'Indexed, though blocked by robots.txt'
The related 'Learn more' link leads to
which says that
'robots.txt is not the correct mechanism to avoid being indexed.'
After that I looked a bit more and found
https://forum.matomo.org/t/exclude-piwik-from-being-indexed-by-search-engines/363/8, which says to modify the file '/plugins/Login/templates/login.twig' and change
<meta name="robots" content="index,follow">
to
<meta name="robots" content="noindex,nofollow">
Hopefully it works, but naturally it will break again after the next update.
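Since the login.twig change has to be redone after every update, it could be scripted. Below is a rough sketch, not an official tool: it assumes the template still contains exactly the tag quoted above (the real file may differ between versions), and the names `patch_robots_meta` and `patch_file` are made up for this example.

```python
from pathlib import Path

# Assumption: these are the exact strings quoted in this thread.
OLD_TAG = '<meta name="robots" content="index,follow">'
NEW_TAG = '<meta name="robots" content="noindex,nofollow">'


def patch_robots_meta(template: str) -> str:
    """Return the template text with the index,follow tag switched to noindex,nofollow."""
    return template.replace(OLD_TAG, NEW_TAG)


def patch_file(path: Path) -> bool:
    """Patch the file in place; return True only if something was changed."""
    original = path.read_text(encoding="utf-8")
    patched = patch_robots_meta(original)
    if patched == original:
        return False  # already patched, or the tag was not found
    path.write_text(patched, encoding="utf-8")
    return True


if __name__ == "__main__":
    # In-memory demo; for real use, call patch_file() with the path to
    # plugins/Login/templates/login.twig inside your installation.
    sample = "<head>\n" + OLD_TAG + "\n</head>"
    print(patch_robots_meta(sample))
```

Running it twice is harmless, because the second run finds nothing left to replace. If an update changes the tag's exact wording, `patch_file` returns False and the template stays untouched, so it is worth checking the return value.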
Yes, you need to be careful with robots.txt and Disallow: /. If you don't place it at the right path, it will block your whole website.
After yesterday's update I had to update the login.twig file once again, it's still part of the routine :relieved:
Yes, although (just to be clear) that was not my case. I was only blocking the /piwik directory, but Google had found it anyway.
I also commented on the other related issue and asked if this could be included in the project.
I guess one checkbox in the settings would not be too much to ask.