Thread: Preventing duplicate content

  1. #1
    Experienced user
    Join Date
    01-12-10.
    Posts
    814

    Default Preventing duplicate content

    I looked at Google Analytics and the other webmaster tools that Google provides, and saw that we had some duplicate content.

    Duplicate domain
    (domain.com vs http://www.domain.com)
    To prevent duplication of the entire domain, I uncommented the following lines in .htaccess so that the http://www.domain.com and domain.com variants of my website do not show up as duplicates:
    Code:
      RewriteCond %{HTTP_HOST} ^domain\.com [NC]
      RewriteRule (.*) http://www.domain.com/$1 [R=301,L]
    You can also prevent this with a setting in Google Analytics. (By the way, I prefer the domain.com version, so I moved the www. portion to the first line, which achieves the reverse of the default.)
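    In case it helps anyone, the reversed (non-www preferring) variant looks roughly like this; the www. moves from the RewriteRule target into the RewriteCond pattern:
    Code:
      RewriteCond %{HTTP_HOST} ^www\.domain\.com [NC]
      RewriteRule (.*) http://domain.com/$1 [R=301,L]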

    Home page
    The home page (in my case called "home.html") is a duplicate of the website root. It is in reality the exact same page, except that Contao treats it in a special way, allowing both URLs. To prevent this I added an extra rule to the .htaccess:
    Code:
      ##
      # Custom redirect to prevent duplicate content of the home page
      ##
      RedirectMatch 301 home(.*) /$1
    I am by no means an .htaccess expert, so I'd appreciate anyone confirming this is the best method.
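    One thing I noticed myself: the pattern above is unanchored, so it would also match any URL that merely contains "home" (for example /homework.html). An anchored variant that only redirects the exact home page might be safer (untested on my side):
    Code:
      ##
      # Redirect only the exact home page URL to the root
      ##
      RedirectMatch 301 ^/home\.html$ /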

    URL parameters
    In Google analytics you can define which url get parameters should not be considered part of a unique page url. These parameters can exist as ?foo=bar or /foo/bar variations and on another website this led to dozens of duplicates for any product in a catalog.

    Additional duplication
    Is there any other duplicate content I should be aware of that can exist? Also, are there better ways to achieve any of the above?

  2. #2
    User
    Join Date
    06-19-09.
    Posts
    417

    Default Re: Preventing duplicate content

    This is what I use; it can be varied to suit your own purposes:

    Code:
      # ENFORCE USE OF WWW
      RewriteCond %{HTTPS} !=on
      RewriteCond %{HTTP_HOST} ^[a-z-]+\.(eu|co\.uk|me\.uk|org\.uk)$ [NC]
      RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]

    I use this because the domain does not need to be changed for each site.

  3. #3
    Experienced user
    Join Date
    01-12-10.
    Posts
    814

    Default Re: Preventing duplicate content

    Quote Originally Posted by Doublespark
    ...
    # ENFORCE USE OF WWW
    RewriteCond %{HTTPS} !=on
    RewriteCond %{HTTP_HOST} ^[a-z-]+\.(eu|co\.uk|me\.uk|org\.uk)$ [NC]
    RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]
    ...
    I see, so it is easier to reuse. Good one!

    Can't we perhaps make the top-level domain generic as well? Then I'd never have to change it. Or do I get into trouble because of UK-style domains (.co.uk)? I mean, if I had xx.nl, it might rewrite http://www.xx.nl to http://www.www.xx.nl if the generic rule fails. And other subdomains must not be rewritten, so matching anything not starting with www. (like testing.) is no good either. I'm not sure what regular expressions can do in .htaccess?
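    Something like this is what I have in mind (an untested sketch, and it would have exactly the subdomain problem I mentioned, since it would also send testing.domain.com to www.testing.domain.com):
    Code:
      # ENFORCE USE OF WWW (generic, untested)
      RewriteCond %{HTTP_HOST} !^www\. [NC]
      RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]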
