Meta tags and robots.txt in Yahoo Search
As a webmaster, you can manage how your website appears in Yahoo Search by using meta tags and robots.txt.
Yahoo Search results come from the Yahoo web crawler (Slurp) and Bing’s web crawler. To learn more about optimizing for Bing, see the Bing Webmaster help center. Below, we’ll cover meta tags and robots.txt directives that the Yahoo web crawler understands.
You can manage how Yahoo indexes your website by adding meta tags to your website’s individual HTML pages, or configuring HTTP headers for them.
Prevent a page from being indexed
When you use the noindex tag on a page, Yahoo crawls the page and extracts links from it, but doesn’t include the page in the Yahoo Search index, so the page won’t appear in search results. The page may still be indexed in two cases: Yahoo hasn’t yet crawled the page and seen the tag (for example, it has only found the page through links on other pages it did crawl), or a robots.txt directive blocks Yahoo from crawling the page.
Make sure that the Yahoo crawler, Slurp, is allowed to crawl pages that you don’t want indexed so it can see the associated noindex tag.
Apply noindex in a robots meta tag
Place the following tag in the head section of an HTML page that you don’t want Slurp to index:
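The tag itself does not appear in this copy of the page. As a minimal sketch, the standard robots meta tag for this directive looks like the following; the name value "robots" addresses all crawlers that honor the tag, not only Slurp.

```html
<!-- Inside <head>: ask crawlers not to include this page in their index -->
<meta name="robots" content="noindex">
```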
Apply noindex in an HTTP header
Instead of adding a meta tag to each page, you can send the directive in the HTTP response header for one or more pages:
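The header line is not shown here. A response carrying the directive might look roughly like this sketch, where the status line and Content-Type are illustrative and only the X-Robots-Tag line matters:

```http
HTTP/1.1 200 OK
Content-Type: text/html
X-Robots-Tag: noindex
```

How the header is added depends on your web server or application framework, so check its documentation for setting custom response headers.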
Use the “robots-nocontent” class to wrap the HTML code and page content that you don’t want the Yahoo web crawler to index:
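The markup example is absent from this copy. One plausible form, assuming a div wrapper and placeholder content, is:

```html
<!-- Content in this element is marked as not to be indexed by the Yahoo crawler -->
<div class="robots-nocontent">
  Site navigation, advertisements, or other boilerplate text
</div>
```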
Yahoo caches snapshots of most of the pages it discovers by crawling. These cached pages are linked to in Yahoo Search results pages. To prevent your website from being cached in this way, apply the noarchive meta tag or HTTP header directive.
Apply noarchive in a robots meta tag
Place the following tag in the head section of an HTML page:
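The tag is missing at this point; the standard robots meta tag for the noarchive directive is sketched below.

```html
<!-- Inside <head>: ask crawlers not to offer a cached copy of this page -->
<meta name="robots" content="noarchive">
```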
Apply noarchive in an HTTP header
Configure your web server to place the following directive in the HTTP header:
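The directive is not reproduced here. Assuming the same X-Robots-Tag header used for other directives, the line your server would send is:

```http
X-Robots-Tag: noarchive
```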
Note - Yahoo Search continues to index pages that you configure with noarchive and to follow the links in them. After the next content refresh cycle, the cached version of those pages is no longer displayed.
Yahoo Search honors the “nofollow” directive for links: it may still follow the link, but excludes it from ranking calculations.
To indicate that a link, or all the links on a page, may not be approved or trusted, you can apply a rel="nofollow" attribute to an individual hyperlink, the “nofollow” robots meta tag to an HTML page, or the X-Robots-Tag: nofollow directive in a page’s HTTP header.
While Yahoo Search may use the "nofollow" link for discovering content, the link won't be considered an approved link when ranking the target page.
This attribute reduces the benefit of comment spamming. For instance, websites with public comment areas can apply a "nofollow" attribute to links that visitors submit, which helps fight comment spam.
Apply nofollow in an HTML “a” element
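No example survives under this heading. As an illustration, with a placeholder URL and link text, a hyperlink marked this way looks like:

```html
<!-- rel="nofollow" marks this single link as not approved or trusted -->
<a href="http://www.example.com/" rel="nofollow">Example link</a>
```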
Apply nofollow in a robots meta tag
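The corresponding page-level tag, which is not shown in this copy, would typically be written as follows and applies to every link on the page:

```html
<!-- Inside <head>: treat all links on this page as nofollow -->
<meta name="robots" content="nofollow">
```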
Apply nofollow in an HTTP header
Configure your web server to place the following directive in the HTTP header used to serve the page:
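The header line is missing here as well; based on the X-Robots-Tag: nofollow directive named earlier on this page, it is:

```http
X-Robots-Tag: nofollow
```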
To tell Yahoo not to use a DMOZ description and title as candidate titles and descriptions for one or more of your URLs, use the “noodp” value in the robots meta tag:
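The tag examples are not present in this copy, and the original page appears to have listed more than one form. Only the generic robots variant is sketched here:

```html
<!-- Inside <head>: don't use DMOZ titles or descriptions for this page -->
<meta name="robots" content="noodp">
```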
When Yahoo finds any of these meta tags in a web document, it won't take DMOZ titles or abstracts into consideration when presenting the title and description for that URL in search results.
Not familiar with DMOZ? - Learn more about DMOZ (formerly the Open Directory Project).
If you'd like to prevent Slurp from reading some portion of your site, create a robots.txt file in the root directory (home folder) of your website, and add a rule for User-agent: Slurp.
Caution - Disallowing crawling of a page doesn’t guarantee that it won’t be indexed. To stop it from being indexed, see the “prevent a page from being indexed” section, above.
Example of code in a robots.txt file:
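The example did not survive in this copy. A minimal sketch, assuming a hypothetical /cgi-bin/ directory that you want to keep Slurp out of, could be:

```text
# Rules in this group apply only to the Yahoo crawler
User-agent: Slurp
# Hypothetical path; replace with the directory you want to block
Disallow: /cgi-bin/
```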
You can add a "Crawl-delay: xx" instruction, where "xx" is a delay value between successive crawler accesses. If the crawler rate is a problem for your server, you can increase the crawl delay.
Setting a crawl-delay of 1 for Yahoo Slurp looks like this:
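The snippet is absent here; following the "Crawl-delay: xx" form described above, it would read:

```text
User-agent: Slurp
Crawl-delay: 1
```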
The preferred way to reduce total crawler activity on your server is to disallow unimportant content with a “disallow” robots.txt rule. If you feel a delay is also necessary, use a small crawl-delay value so that Yahoo Search can still discover and refresh your key content.
You can submit your sitemap to the Yahoo Search crawler, Slurp, through a robots.txt directive. Just add the following to your robots.txt file:
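The directive is not shown in this copy. Using the widely supported Sitemap directive and a placeholder URL, the line would look like:

```text
# Placeholder URL; use the full, absolute URL of your own sitemap
Sitemap: http://www.example.com/sitemap.xml
```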
Submit your sitemap to Bing - learn more about how Bing accepts sitemap submissions.