Duplicate sitemap urls with Ventrian News Articles

Oct 25, 2013 at 1:39 PM
Hey Sacha,

I have DNN 7.1.2, Ventrian NewsArticles 0.9.6, OpenUrlRewriter71_1.0.1 and NewsArticles_OpenUrlRewriter 1.0.1.

Two questions:
1)
In my Sitemap.aspx I see the following:
http://www.mysite.com/[page name]/[article title]
http://www.mysite.com/[page name]/[category name]/[article title]
Which gives me multiple paths to the same article (which I have been told is not good for SEO).
Is there a way to prevent this, so that I have only a single url per article?
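To see which articles are affected, the sitemap can be scanned for urls that end in the same segment. The following is only a rough sketch: it assumes the sitemap follows the standard sitemaps.org format and that the last path segment is the article title, and the function names and sample urls are made up for illustration.

```python
from collections import defaultdict
from urllib.parse import urlparse
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def group_by_last_segment(sitemap_xml):
    """Map the last path segment of every <loc> to the urls ending with it."""
    root = ET.fromstring(sitemap_xml)
    groups = defaultdict(list)
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        url = (loc.text or "").strip()
        path = urlparse(url).path.rstrip("/")
        # Assumption: the last segment is the article title slug.
        slug = path.rsplit("/", 1)[-1].lower()
        if slug:
            groups[slug].append(url)
    return groups


def find_duplicates(sitemap_xml):
    """Return only the segments that are reachable through more than one url."""
    groups = group_by_last_segment(sitemap_xml)
    return {slug: urls for slug, urls in groups.items() if len(urls) > 1}


# Hypothetical sample data, not taken from a real site.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.mysite.com/news/my-article</loc></url>
  <url><loc>http://www.mysite.com/news/sports/my-article</loc></url>
  <url><loc>http://www.mysite.com/news/2013</loc></url>
</urlset>"""

if __name__ == "__main__":
    for slug, urls in find_duplicates(SAMPLE).items():
        print(slug, urls)
```

Two different articles with the same title in different categories would also be reported, so the output is a list of candidates to check rather than a definitive list of duplicates.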

2)
I also see that there are entries for:
http://www.mysite.com/[page name]/[year]
http://www.mysite.com/[page name]/[year]/[month]
http://www.mysite.com/[page name]/[author]
Can I switch this off?

Kind regards,
~Michel
Coordinator
Oct 25, 2013 at 1:59 PM
Hi Michel,

1)
Which gives me multiple paths to the same article (which I have been told is not good for SEO).
You are right about SEO. But Ventrian NewsArticles generates a canonical link that tells Google that all these pages are the same and which one has to be indexed by the search engine. So Google sees a single url.
Example:
<link rel="canonical" href="http://openurlrewriter.satrabel.be/en/modules/ventrian-news-articles/my-first-news-article"/>
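To check which canonical url a given article page announces, the link can be read from the page source. A minimal sketch using only the Python standard library; the class and function names are hypothetical, and the HTML is passed in as a string, so fetching the page is left out.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Self-closing tags (<link ... />) also end up here by default.
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr.get("href")


def find_canonical(html):
    """Return the canonical url announced in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


if __name__ == "__main__":
    page = (
        '<html><head><link rel="canonical" '
        'href="http://openurlrewriter.satrabel.be/en/modules/'
        'ventrian-news-articles/my-first-news-article"/></head></html>'
    )
    print(find_canonical(page))
```

Running this against every alternative url of one article should give the same canonical url each time; if it does not, the duplicates are not being consolidated.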
Is there a way to prevent this, so that I have only a single url per article?
No, not at the moment. And I think it is not really possible without breaking the normal functioning of Ventrian NewsArticles, which can handle multiple categories per article.
But maybe you are right that it makes no sense to add all alternative urls to the sitemap.
At the moment it is not possible in OpenUrlRewriter to remove some urls from the sitemap. I was already thinking about adding this.
2) Can I switch this off?
These links are generated by the "Archive" functionality of Ventrian NewsArticles.
Maybe you are right that it makes no sense to add all list urls to the sitemap, because these are not real content pages.
But I know it is good practice to let Google index these pages, because they confirm the sitemap (the archive section also serves as a natural sitemap).
At the moment it is not possible in OpenUrlRewriter to remove some urls from the sitemap. I was already thinking about adding this.
Regards,
Sacha
Oct 29, 2013 at 9:54 AM
Hey Sacha,

Thanks for your fast reply. I wanted to hold off on replying until after today's meeting with my SEO consultant about these topics.

1)
Multiple paths to the same article can indeed be solved with the canonical link as you mentioned. Although it is normally used to tell Google that the page with the canonical link is a filtered version of the original content, it will work in this case: Google will see only the 'master' article and mark all the other as references to the master.
But Google sees multiple urls in the sitemap and crawls them all. So the downside of this method is that it puts extra unwanted load on the Google bots and also on my site.
So this is not the nicest solution, but it will work. (I don't allow an article to be assigned to multiple categories, so I would rather not see them).

2)
And you are right about the natural sitemap that the "archive" functionality of News Articles produces. Although they are not actual content pages, Google bots are smart enough to see these as a categorization of articles. But the same holds true here: the Google bots will scan what they see in the sitemap.

To conclude:
From a functionality standpoint we are good: it will work.
But we could do better, since the responsiveness of a site is also a factor in the Google ranking and in user experience.
So it is a good idea to add the option to remove some urls from the sitemap.

After using OpenUrlRewriter for a couple of weeks, I can say that it works excellently. My compliments on this fine product.

Kind regards,
~Michel
Oct 31, 2013 at 9:17 AM
Hey Sacha,

I have come across another strange thing related to SEO.
When I enable paging in the News Articles module, DNN generates the urls for the pager. These are in the format http://www.mysite.com/news/currentpage/2 for page 2 and http://www.mysite.com/news/currentpage/3 for page 3 etc.
The issue is that Google indexes the following:
The problem is with http://www.mysite.com/news/currentpage which is a non-existing/virtual page (no clue where it comes from). It redirects to http://www.mysite.com/news, which gives a duplicate for Google, which lowers the ranking force.
Is there anything that you can do in OpenUrlRewriter to solve this?

As a side note: http://www.mysite.com/news/non-existing-page-due-to-a-typo also redirects to http://www.mysite.com/news.
This is not really relevant for SEO, because Google only crawls what is there and it doesn't make up urls and it might even be a feature for the end users, since they get an existing page and not a 404 error. But I would expect the 404 error to show up.

Kind regards,
~Michel
Coordinator
Nov 4, 2013 at 7:14 AM
Hi,

I don't think the redirection behaviour comes from OpenUrlRewriter.
If your site is accessible over the Internet, I can have a look. Or look at the response headers for more info.

The News Articles provider doesn't rewrite currentpage... urls at this moment.
As a side note: http://www.mysite.com/news/non-existing-page-due-to-a-typo also redirects to http://www.mysite.com/news.
This is not really relevant for SEO, because Google only crawls what is there and it doesn't make up urls and it might even be a feature for the end users, since they get an existing page and not a 404 error. But I would expect the 404 error to show up.
OpenUrlRewriter doesn't manage 404 errors. What you see is the default behaviour of DNN for module urls that do not exist.
But what I observe is not a redirect.
http://www.mysite.com/news/non-existing-page-due-to-a-typo shows the content of http://www.mysite.com/news without redirect.
Which is not good for seo.
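The difference between a redirect and content served directly can be read from the status code and the Location header of the response. Below is a small sketch with Python's standard http.client, which does not follow redirects by itself; the function names and the wording of the labels are only illustrative.

```python
import http.client
from urllib.parse import urlparse

REDIRECT_CODES = (301, 302, 303, 307, 308)


def fetch_status(url):
    """Request the url once, without following redirects.

    Returns (status code, Location header or None).
    """
    parts = urlparse(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def classify(status, location=None):
    """Turn a status code into a short description of what the server did."""
    if status in REDIRECT_CODES:
        return "redirect to %s" % location
    if status == 404:
        return "not found"
    if status == 200:
        return "served directly"
    return "other (%d)" % status


if __name__ == "__main__":
    # Placeholder url from this thread; replace it with a real one.
    url = "http://www.mysite.com/news/non-existing-page-due-to-a-typo"
    print(classify(*fetch_status(url)))
```

For a mistyped url, "served directly" means the server answered 200 with the content of the parent page, while "redirect to ..." would mean a real redirect took place.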

Regards,
Sacha