
Blocking WordPress Pages from Being Indexed by Google with Robots.txt


The robots.txt file is a file every WordPress site comes with that tells bots and search engines which pages you want them to index. It's more of a request than a demand, and the file is publicly accessible at www.yourdomain.com/robots.txt, so be aware that disallowing a page won't hide it from the web. In some cases it even makes the page easier to find. It only means the page won't show up on major search engines.
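For reference, on a typical public WordPress site the default file looks roughly like this (the exact rules vary by version and settings):

User-agent: *
Disallow: /wp-admin/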
In most WordPress themes, the robots.txt file is easily editable from the WP Dashboard's file editor. But not with Roots' Sage theme: that access is restricted. WordPress baked robots.txt generation into core with version 3, so now the best way to edit it is by adding a robots_txt WordPress filter in the theme's functions.php or, preferably, Sage's src/filters.php file.
For this particular use case, I didn't want our AdWords landing pages showing up on Google when people search for our products. I don't want anyone reaching these pages who isn't coming from AdWords, because it can skew the data. So I've added this to our filters.php file:
/**
 * Append a Disallow rule for our AdWords landing pages
 * to the robots.txt output WordPress generates.
 */
add_filter('robots_txt', function ($content, $is_public) {
    // The trailing newline keeps the rule on its own line.
    $content .= "Disallow: /*-gaw01\n";
    return $content;
}, 10, 2);
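The callback's second argument, $is_public, reflects the site's search engine visibility setting; the snippet above doesn't use it. As a small sketch of how it could be used, you could append the rule only when the site is public, since a non-public site already serves a blanket Disallow: /:

add_filter('robots_txt', function ($content, $is_public) {
    // Only append the rule when the site is visible to search engines;
    // WordPress already outputs "Disallow: /" for non-public sites.
    if ($is_public) {
        $content .= "Disallow: /*-gaw01\n";
    }
    return $content;
}, 10, 2);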
Our AdWords page URLs are the same as the regular public product URLs, but end in -gaw01. Adding this filter tells crawlers to ignore any URL ending in -gaw01. Now our robots.txt looks like this:

User-agent: *
Disallow: /
Disallow: /*-gaw01
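Since the file is public, a quick way to confirm the filter took effect is to fetch it directly (substitute your own domain):

curl https://www.yourdomain.com/robots.txt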
