[SOLVED] prevent google from indexing server

Posted: Wed Jun 08, 2011 1:33 pm
by fyd
Has anyone used the 'robots.txt' approach to prevent Google from indexing links and pages on a ZCS server? I am trying to figure out a way to do this. The closest I could get on this topic is this thread: http://www.zimbra.com/forums/administrators/9088-solved-robots-txt.html#post106525

[SOLVED] prevent google from indexing server

Posted: Fri Jun 10, 2011 10:41 am
by 13335lindsey
You need to set the server attribute 'zimbraMailKeepOutWebCrawlers' to TRUE, i.e.

zmprov ms zimbra.example.com zimbraMailKeepOutWebCrawlers TRUE
Anything listed in

/opt/zimbra/conf/robots.txt
will be appended to the robots.txt file that crawlers see.

[SOLVED] prevent google from indexing server

Posted: Sat Jun 11, 2011 3:59 am
by fyd
[quote user="13335lindsey"]You need to set the domain attribute 'zimbraMailKeepOutWebCrawlers' to TRUE, i.e.

zmprov ms zimbra.example.com zimbraMailKeepOutWebCrawlers TRUE
Anything listed in

/opt/zimbra/conf/robots.txt
will be appended to the robots.txt file that crawlers see.[/QUOTE]
Thanks lindsey for the reply :), but I read that this attribute is only available from 7.0.1 onwards, so I guess I am out of luck with that.
http://www.zimbra.com/forums/announcements/47917-zcs-7-0-1-live.html
Any other way I can get this done? Rich Graves says in the first thread that,
"You can add to the section of /opt/zimbra/jetty/webapps/zimbra/public/login.jsp; zmmailboxd stop; flush the /opt/zimbra/jetty/work/zimbra/jsp/org/apache/jsp/public_/ directory; zmmailboxd start"
Will this help to prevent ANY file from getting indexed by Google?

[SOLVED] prevent google from indexing server

Posted: Mon Jun 13, 2011 7:49 am
by 13335lindsey
Dang. It's never that easy, is it? :D
Unfortunately, the change you describe will only affect the root file. If a spider hits a page at anything but the root level, it won't see the META tag and will happily index away.

[SOLVED] prevent google from indexing server

Posted: Mon Jun 13, 2011 8:44 am
by fyd
[quote user="13335lindsey"]Dang. It's never that easy, is it? :D

[/QUOTE]
Doh .. yeah right :D and these days getting someone to reply here is not easy either :) .. desperate times!
[quote user="13335lindsey"]

Unfortunately, that change that you list will only affect the root file. If a spider hits a page at anything but the root level they won't see the META tag and will happily index away.[/QUOTE]
Okay, you mean that will only cover the login page, huh? I need to keep one user's calendar from showing up in Google search.
Hey, thanks again! ;)

[SOLVED] prevent google from indexing server

Posted: Sat Jun 25, 2011 1:56 am
by fyd
Zimbra has had this feature by default since 6.0.11. A file 'robots.txt' is placed in the /opt/zimbra/jetty/webapps/zimbra/ directory of the mailbox server with this content:
User-agent: *
Disallow: /
I had to do it manually on my 6.0.10: I placed the file /opt/zimbra/jetty/webapps/zimbra/robots.txt with ownership zimbra:zimbra on the mailstore servers. The mailbox service should be restarted after that with 'zmmailboxdctl restart'.
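
For anyone who wants to sanity-check the file content before dropping it on a server, here is a minimal sketch using Python's standard urllib.robotparser. It only parses the two-line robots.txt from this post and asks whether a crawler may fetch a URL; it does not touch Zimbra at all. The helper name is_blocked and the calendar URL are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content described in this post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text forbids user_agent from fetching url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URL, used only as an example of a page below the root.
    url = "https://zimbra.example.com/home/someuser/calendar"
    print(is_blocked(ROBOTS_TXT, "Googlebot", url))
```

Keep in mind this only shows what a well-behaved crawler should conclude from the file; it says nothing about when a search engine will actually drop pages it has already indexed.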

[SOLVED] prevent google from indexing server

Posted: Mon Jun 27, 2011 5:05 am
by fyd
Any thoughts on how long it will take to do its thing? It's been 5 days and I still see the results. Perhaps they are being displayed from cache?

[SOLVED] prevent google from indexing server

Posted: Mon Jun 27, 2011 9:12 am
by fyd
Okay, it will stay like this until Google's crawlers do their next lookup. Any details of crawling should be in the jetty logs.

[SOLVED] prevent google from indexing server

Posted: Tue Feb 25, 2014 9:52 am
by 16113Chewie71
We recently discovered that setting this attribute breaks the ability to share calendars publicly and import them into Google Calendar. Has anyone else noticed this and is there a workaround?
Matt
[quote user="13335lindsey"]You need to set the domain attribute 'zimbraMailKeepOutWebCrawlers' to TRUE, i.e.

zmprov ms zimbra.example.com zimbraMailKeepOutWebCrawlers TRUE
Anything listed in

/opt/zimbra/conf/robots.txt
will be appended to the robots.txt file that crawlers see.[/QUOTE]

[SOLVED] prevent google from indexing server

Posted: Fri Feb 28, 2014 9:05 am
by 13335lindsey
Can you add an Allow directive to your robots.txt file in /opt/zimbra/conf/robots.txt?
Maybe something like
User-agent: Googlebot
Allow: /*.ics$
Disallow: /
Although that opens up all of your calendars to Web indexing as well. Until Google does the right thing and creates a separate UserAgent for calendar features, you're stuck in an all-or-nothing situation.
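
To illustrate the all-or-nothing effect, here is a rough sketch of how wildcard rules like the ones above are commonly evaluated: '*' matches any run of characters, a trailing '$' anchors the end of the path, the longest matching pattern wins, and Allow wins a tie. This is my own simplified matcher, not Google's code and not anything shipped with Zimbra, and the paths are hypothetical. As far as I know, Python's built-in urllib.robotparser does plain prefix matching and would not interpret these wildcards, which is why the matching is written out by hand.

```python
import re

# (allow?, pattern) pairs mirroring the Googlebot rules suggested above.
GOOGLEBOT_RULES = [
    (True, "/*.ics$"),
    (False, "/"),
]


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern ('*' wildcard, optional trailing '$') to a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' where '*' appeared.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(rules, path: str) -> bool:
    """Longest matching pattern wins; Allow wins ties; no match means allowed."""
    best_len = -1
    best_allow = True
    for allow, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and allow):
                best_len, best_allow = length, allow
    return best_allow


if __name__ == "__main__":
    # Hypothetical paths: every .ics path is allowed, not just the one you meant to share.
    for path in ("/home/alice/calendar.ics", "/home/bob/private.ics", "/home/alice/calendar.html"):
        print(path, is_allowed(GOOGLEBOT_RULES, path))
```

Under these assumptions every path ending in .ics comes out as allowed, including calendars you never meant to expose, while everything else stays blocked.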