Google does not like duplicate content, whether it is on the same domain or across domains. Duplicates leave Google unsure which page to index: which one is the original version of a webpage, and which one is the copy?
Besides confusing search engines, you also risk a duplicate content penalty if you do not take precautions.
Canonical Tags – The Solution
Enter canonical tags, which explicitly tell search engines which page is the original version of a piece of content and which page to index. The canonical tag looks like this:
<link rel="canonical" href="https://digitalscouts.net/seo/canonical-tags" />
This tag belongs in the <head> section of your HTML document, and it should appear ONLY once on a page. If you define it multiple times, Google will ignore all of them and choose the canonical page on its own (which you don’t want).
The canonical tag shown above tells search engines that the URL in the href attribute is the original version and should be indexed.
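Since the tag should appear exactly once and only inside <head>, it can be worth checking a page before publishing it. The following is a minimal sketch using Python's standard html.parser module; the class name CanonicalFinder and the helper find_canonicals are made up for this illustration and are not part of any SEO tool.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> found inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link" and self.in_head:
            attr_map = dict(attrs)
            # rel may hold several space-separated values, or be missing.
            rel_values = (attr_map.get("rel") or "").lower().split()
            if "canonical" in rel_values:
                self.canonicals.append(attr_map.get("href"))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_canonicals(html):
    """Return the list of canonical hrefs declared in the page's <head>."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals
```

A page is in good shape when find_canonicals(page_html) returns exactly one URL. An empty list means the tag is missing (or sits outside <head>), and two or more entries mean the page declares conflicting canonicals.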
Applications of Canonical Tags
As mentioned earlier, Google does not like duplicate content, whether it is on the same domain or across domains. Canonical tags can be used in both situations.
a) Same Domain Content Duplication
You might be wondering why you would ever publish the same content twice on your own site, so it is easy to assume you don’t have this issue.
But it is more common than you think, primarily because Google’s bot crawls URLs, not webpages. For example, if you are running an e-commerce store, you will probably have URLs like this:
Although both pages represent essentially the same content, search engine bots see them as different pages and crawl them separately (thus consuming your “Crawl Budget”).
To avoid this and spare Google the confusion, we’ll use a self-referencing canonical tag on https://example.com/product/tee-shirt/, which will look like this:
<link rel="canonical" href="https://example.com/product/tee-shirt/" />
This essentially tells Google that the page in the “href” attribute is the one to index, even if the URL contains parameters.
However, this is only one example of same-domain content that search engines can consider duplicate.
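To make the parameter case concrete, here is a small sketch in Python showing how several URL variants collapse to a single canonical target once the query string is dropped. The parameter names (color, size) and the helper strip_parameters are hypothetical, chosen only for illustration. This is not how Google groups URLs internally; it only shows which URL the canonical tag on each variant would point to.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_parameters(url):
    """Return the URL without its query string or fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


# Hypothetical parameterized variants of the same product page.
variants = [
    "https://example.com/product/tee-shirt/",
    "https://example.com/product/tee-shirt/?color=red",
    "https://example.com/product/tee-shirt/?color=red&size=m",
]

# Every variant maps to the same canonical target.
canonical_targets = {strip_parameters(u) for u in variants}
```

Here canonical_targets ends up holding a single URL, https://example.com/product/tee-shirt/, which is the value that belongs in the href of the canonical tag on every variant.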
b) Cross Domain Content Duplication
Cross-domain duplicate content is easier to understand. For example, say you’ve decided to re-post your blog post on an authority site as part of your guest blogging routine.
Now when Google’s bot crawls both pages, it will get confused about which page to index and rank. If you guest blog on a more established site, Google may even penalize your site for duplicate content, even though yours was published first.
So, what can you do in this situation? You have 3 options:
- Talk to the site owner – Ask the webmaster of the guest blogging site to add a canonical tag to their webpage (where your post will be published) referencing the original post.
- Rewrite the content – Rewrite your blog post with different wording and format, then publish it on the guest blogging site, so that it is essentially unique content in the eyes of search engines.
- Post content exclusively on the guest blog – This might be your only option in some cases, because many authoritative websites want exclusive content that is not published anywhere else.
Option #1 is obviously the preferred solution. I rarely consider option #2; if the situation demands it, I resort to option #3.
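If you go with option #1, it helps to send the webmaster the exact tag to paste into their page. The sketch below builds that tag in Python; the helper canonical_link_tag and the URL https://example.com/blog/original-post/ are placeholders for this illustration, so swap in the address of your own original post.

```python
from html import escape


def canonical_link_tag(original_url):
    """Build a canonical <link> element pointing at the original post."""
    # escape() keeps characters such as & and " safe inside the attribute.
    return '<link rel="canonical" href="{}" />'.format(escape(original_url, quote=True))


# Hypothetical URL of the original post on your own site.
tag = canonical_link_tag("https://example.com/blog/original-post/")
print(tag)
```

The printed line is what the guest site would place in the <head> of the page carrying your re-posted article.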
Whatever you do, keep in mind that it is not all right to have duplicate content roaming around the web without a