Sunday, March 24, 2013


INTRODUCTION:

.htaccess is one of those things that people often refer to as a dark art, much like reincarnation. However, it shouldn't be overly confusing or hard, so here we'll look at the basics of rewrite rules with htaccess as well as the more important 301 redirects. Before you read on, it's advisable to take a brief look through the Introduction to Regular Expressions, as these feature rather heavily within rewrites.

THE BASICS:

The first thing to grasp is that a .htaccess file has no extension or suffix on its filename. The file is named simply .htaccess, and the first issue many people face when creating these files is naming them incorrectly.
It's also worth mentioning that htaccess files are very powerful, but at the same time can lead to some unwanted side effects if you're not careful, so it's best not to play with these on a live server, as it is possible to maim a kitten. A htaccess file takes effect in the directory that it's located in, as well as its subdirectories.

REWRITING URLs:

These are simple rewrites, not redirections. A rewrite will not change the URL that a user sees in the address bar, but will load the page that you tell it to.
The rules are split into two sections, the first one being what the user/visitor will enter in their browser, and the second being the page that will be served to the user instead.
Simple Rewrite
Let's start with rewriting any request to 'about' to actually load the file about.html for the user to see.
Firstly, we must enable the ability to use rewriting:

RewriteEngine On

That simple line tells the server that we want to do some rewriting. Next we need the actual rewrite: (serve about.html if the user navigates to /about)

RewriteRule ^about$ about.html

Simple as that. Note that we've used some regular expression anchors in the first section: the caret ^ to match the start of the string, and the dollar $ to match the end of the string.

Regular Expression Rewrite

We can use regular expressions in the rewrites, and then use the matched string to load a relevant page. So let's take our previous example but expand it to any named file that has only a-z or 0-9 in the name, and rewrite it to its respective .html file.
(Serve /requestedpath.html if the user visits /requestedpath)
RewriteRule ^([0-9a-z]+)$ $1.html

That's going to match the string, then take what we've matched, append .html to the end and serve it to the user.

What if...
... we don't want to rewrite something if the file already exists, or it's a directory that exists? So for example we have an about directory that already exists which contains a load of files - so anyone going to the directory should be served the directory, not our rewrite.
We can now look at things called Rewrite Conditions, which allow us to set rules for when a rewrite rule will take effect. The below example will check that the requested URI isn't a directory or file that exists already:
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f

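Note that we place these directly before the rewrite rule that we want them to affect. Conditions apply only to the single RewriteRule that immediately follows them, so they must be repeated for each rule that needs them.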
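
As a minimal sketch of how the conditions above pair with a rule, here they are placed in front of the regular expression rewrite from the previous section. The combination is an illustration under that assumption, not the only way to arrange it:

```apache
RewriteEngine On

# Only rewrite when the request is neither an existing directory
# nor an existing file
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
# The two conditions above guard this rule only
RewriteRule ^([0-9a-z]+)$ $1.html
```

With this in place, a request for /about is served from the about directory if that directory exists, and rewritten to about.html otherwise.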

REWRITE FLAGS

There are some handy flags that we can append to the end of rewrite rules that allow us to specify extra behaviour, so for example we can specify that any additional query string that is sent in the request is also sent through to the script we're loading. Or we can tell mod_rewrite that this is the last rule it should apply if it matches, which could solve some unintended rewrite side effects.
There are loads of flags that you can use, but we'll cover just one or two:
  • QSA: This will append the query string that was sent with the original request to any query string set in the rewrite target. eg: with a target of index.php?page=about, /about?foo=bar will become index.php?page=about&foo=bar.
  • R: This will cause an HTTP redirect to be issued to the browser. You can optionally add a redirect code as well - we will cover this later in this article.
  • L: This will apply the matched rule and stop mod_rewrite from running any further rules in the set for the current pass.
  • F: Final simple one, this will return a 403 Forbidden error to the client. This implies a flag of L also.
You can use a flag by putting the letter in square brackets after the rewrite rule, so for example:

RewriteRule ^([0-9a-z]+)$ index.php?page=$1 [QSA]

This will rewrite the following:
  • /about -> index.php?page=about
  • /bacon?foo=bar -> index.php?page=bacon&foo=bar
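
The other flags from the list can be sketched in the same way. The private/ directory below is a hypothetical name used only for illustration, and the rules assume the .htaccess file sits in the directory being requested:

```apache
RewriteEngine On

# F: refuse any request under a hypothetical private/ directory with a 403.
# A substitution of "-" means "leave the URL unchanged".
RewriteRule ^private/ - [F]

# L: apply this rule and skip the remaining rules in this pass
RewriteRule ^about$ about.html [L]

# Several flags can be combined by separating them with commas
RewriteRule ^([0-9a-z]+)$ index.php?page=$1 [QSA,L]
```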


Redirecting Pages

A wise man once said: "He who doesn't use a 301 redirect is destined for SEO failure." Well, I like to think I'm wise. But the point stands: if you are permanently redirecting a user, you will want to ensure that it's a 301 redirect. This specific redirect is super simple to implement, and tells search engines that the page has moved permanently, which helps carry your current ranking over to the new URL.
As mentioned earlier with the flags for rewrites, the R flag takes a number to show the type of redirect. In this case we'll be using 301, so [R=301] is the flag that we use. You can omit the number (and equals) to leave the redirect type unspecified, in which case Apache issues a 302 (temporary) redirect, which isn't ideal for a permanent move.
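
A minimal sketch of what such redirects might look like. The page names and the example.com hosts are placeholders, not taken from a real site:

```apache
RewriteEngine On

# Permanently redirect a hypothetical old page to its new location
RewriteRule ^old-page\.html$ /new-page.html [R=301,L]

# Permanently redirect every path on a hypothetical old host
# to the same path on the new host
RewriteCond %{HTTP_HOST} ^old\.example\.com$ [NC]
RewriteRule ^(.*)$ https://www.example.com/$1 [R=301,L]
```

Unlike the plain rewrites earlier, the browser is told about these and the address bar changes to the new URL.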


ERROR DOCUMENT

You can specify an error document for each of your error pages by simply using the following syntax:
ErrorDocument xxx /xxx


Where xxx is the error number. So, for example, a 404 error that serves the notfound.html file in the errors directory would look like:
ErrorDocument 404 /errors/notfound.html

Not too complex, but watch out for any typos - as you could possibly break everything.
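
The same syntax extends to other error codes. As a sketch, assuming an errors directory that holds one page per code (the file names other than notfound.html are made up for illustration):

```apache
# Custom pages for a few common errors
ErrorDocument 403 /errors/forbidden.html
ErrorDocument 404 /errors/notfound.html
ErrorDocument 500 /errors/servererror.html
```

A local path starting with a slash is served in place, so the visitor keeps the URL they asked for; a full URL would cause the browser to be redirected instead.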

index.html or not index.html

By default a lot of sites look for index.html, or index.php as the file to display when you first load a directory. However, you might find that you actually want the site to load new_index.html for a period, or even maintenance.html.
To do this, we can alter the order in which it looks for files to display, falling back to the next one in the list if the previous doesn't exist.
Let's go with setting this order:
  1. maintenance.html - If there's no maintenance being done, we can rename this to maintenance_none.html (or similar) to fall to the next file.
  2. new_index.html
  3. index.html
This will result in:
DirectoryIndex maintenance.html new_index.html index.html


CONCLUSION

A very brief and simple introduction to .htaccess, hopefully enabling you to take what you have now and implement neater, simpler URLs, while also gracefully and respectfully redirecting people to other locations on your site as you change page names, slugs or even domains.
Just ensure that you test your htaccess files before you shove them live. A '500 Internal Server Error' is in most cases a sign that there might be an issue with your config.


EXAMPLE

Try it:

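ErrorDocument 404 /404
RewriteEngine On
# Serve the RSS feed from a script
RewriteRule ^rssfeed\.rss$ /custom/feed.php [L]
# Strip a trailing slash with a 301, unless it's a real directory
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([0-9a-zA-Z_-]+)/$ $1 [R=301,L]
# Anything else that isn't a real file or directory goes to page.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([0-9a-zA-Z_-]+)$ page.php?id=$1 [QSA,L]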


