Thread: [SOLVED] prevent google from indexing server

  1. #1

    Has anyone used the 'robots.txt' approach to prevent Google from indexing links and pages on a ZCS server? I am trying to find a way to do this. The closest I could get on this topic is this thread post: http://www.zimbra.com/forums/adminis...tml#post106525

  2. #2

    You need to set the server attribute 'zimbraMailKeepOutWebCrawlers' to TRUE, e.g.
    zmprov ms zimbra.example.com zimbraMailKeepOutWebCrawlers TRUE

    Anything listed in
    /opt/zimbra/conf/robots.txt

    will be appended to the robots.txt file that crawlers see.

  3. #3

    Quote Originally Posted by lindsey View Post
    You need to set the server attribute 'zimbraMailKeepOutWebCrawlers' to TRUE, e.g.
    zmprov ms zimbra.example.com zimbraMailKeepOutWebCrawlers TRUE

    Anything listed in
    /opt/zimbra/conf/robots.txt

    will be appended to the robots.txt file that crawlers see.
    Thanks for the reply, lindsey, but I read that this attribute is only available from 7.0.1 onwards, so I guess I am out of luck there. http://www.zimbra.com/forums/announc...-0-1-live.html

    Is there any other way I can get this done? Rich Graves says in the first thread that:

    "You can add <META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"> to the <head> section of /opt/zimbra/jetty/webapps/zimbra/public/login.jsp; zmmailboxd stop; flush the /opt/zimbra/jetty/work/zimbra/jsp/org/apache/jsp/public_/ directory; zmmailboxd start"

    Will this help to prevent ANY file from getting indexed by Google?

  4. #4

    Dang. It's never that easy, is it?

    Unfortunately, the change you list will only affect the root file. If a spider hits a page anywhere other than the root level, it won't see the META tag and will happily index away.

  5. #5

    Quote Originally Posted by lindsey View Post
    Dang. It's never that easy, is it?
    Doh... yeah, right, and these days getting someone to reply here is not easy either. Desperate times!

    Quote Originally Posted by lindsey View Post
    Unfortunately, that change that you list will only affect the root file. If a spider hits a page at anything but the root level they won't see the META tag and will happily index away.
    Okay, so you mean that will cover just the login page? I need to keep one user's calendar from showing up in Google search.

    Hey, thanks again!

  6. #6

    Zimbra has had this feature by default since 6.0.11. A 'robots.txt' file is placed in the /opt/zimbra/jetty/webapps/zimbra/ directory of the mailbox server with this content:

    User-agent: *
    Disallow: /

    I had to do it manually on my 6.0.10: I placed the file at /opt/zimbra/jetty/webapps/zimbra/robots.txt, owned by zimbra:zimbra, on the mailstore servers. The mailbox service should then be restarted with 'zmmailboxdctl restart'.
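
    As a quick sanity check of the two rules above, here is a minimal sketch using Python's standard urllib.robotparser. It parses the file content from this post in memory and asks whether a path may be fetched. The sample paths are made up for illustration and are not taken from a real server.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content quoted in the post above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def build_parser(text):
    """Parse robots.txt text held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Hypothetical paths, chosen only to illustrate the check.
for path in ("/", "/home/someuser/calendar.html", "/public/login.jsp"):
    verdict = "allowed" if parser.can_fetch("Googlebot", path) else "blocked"
    print(path, verdict)
```

    Every path should come back as blocked, since 'Disallow: /' under 'User-agent: *' covers the whole site for every crawler that honors robots.txt.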

  7. #7

    Any thoughts on how long it will take to take effect? It's been 5 days and I still see the results. Perhaps they are being served from cache?

  8. #8

    Okay, it will stay like this until Google's crawlers make their next visit. Details of any crawling should be in the jetty logs.

  9. #9

    We recently discovered that setting this attribute breaks the ability to share calendars publicly and import them into Google Calendar. Has anyone else noticed this and is there a workaround?

    Matt

    Quote Originally Posted by lindsey View Post
    You need to set the server attribute 'zimbraMailKeepOutWebCrawlers' to TRUE, e.g.
    zmprov ms zimbra.example.com zimbraMailKeepOutWebCrawlers TRUE

    Anything listed in
    /opt/zimbra/conf/robots.txt

    will be appended to the robots.txt file that crawlers see.

  10. #10

    Can you add an Allow directive to your robots.txt file in /opt/zimbra/conf/robots.txt?

    Maybe something like

    User-agent: Googlebot
    Allow: /*.ics$
    Disallow: /

    Although that opens up all of your calendars to Web indexing as well. Until Google does the right thing and creates a separate UserAgent for calendar features, you're stuck in an all-or-nothing situation.
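
    To see why this is all-or-nothing, here is a rough sketch of how a Google-style matcher would evaluate those three lines. Python's urllib.robotparser only does plain prefix matching and does not interpret '*' or '$', so the sketch hand-rolls the matching: the longest matching pattern wins, and Allow wins a tie. The helper names and the sample paths are hypothetical, and this is an approximation of the documented behavior, not Google's actual code.

```python
import re

# The Googlebot rules suggested above, as (directive, pattern) pairs.
RULES = [("Allow", "/*.ics$"), ("Disallow", "/")]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_allowed(rules, path):
    """Return True if the path may be crawled under the given rules."""
    best = None  # (pattern length, is_allow) of the strongest match so far
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), directive == "Allow")
            # Longer pattern wins; on equal length True (Allow) sorts higher.
            if best is None or candidate > best:
                best = candidate
    # No matching rule means the path is not restricted.
    return True if best is None else best[1]


# Hypothetical paths: two users' calendars and one ordinary page.
for path in ("/home/alice/calendar.ics",
             "/home/bob/calendar.ics",
             "/home/alice/calendar.html"):
    print(path, "allowed" if is_allowed(RULES, path) else "blocked")
```

    Both .ics paths come back as allowed while the .html page is blocked, which is the trade-off described above: the rule cannot single out one user's calendar feed without also exposing everyone else's.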
