[.htaccess | Mod Rewrite] Better URLs the Gamer.Saga way!


FuRom
09-25-2007, 03:07 AM
Have you ever wondered how websites keep their code organized when every address in your browser looks like "topic-2084.html", as if each page had been created by hand? Well, I have too! I knew it was something simple, probably to do with .htaccess, but I never really looked into it until today, when OwnManAtt was looking for a full tutorial on it. This is not a full tutorial on mod rewrite, but it will help you make your website easy for any spider to archive. At least I assume this helps spider crawling; if I'm wrong, please correct me.

You might be asking "what's a spider?" The answer is simple. A spider is a search engine's bot. Search engines like Google have bots that browse your website and archive information to make it more accessible to the public. You might ask "Do I really want a bot on my site?" Commonly the answer would be no, but this is a good bot. Not a bad bot. It'll help your site grow.

I'm going to show you my home-brewed version and explain it a little. This is not a newbie's guide. It is in no way newb-friendly, nor do I intend to make it beginner friendly. This is specifically for those who have achieved enlightenment in the art of web programming. It is also only for Apache web servers that have mod rewrite enabled. If you're running your own server, you can google how to enable it; if you're on cPanel hosting, I don't know what to tell ya, just test it.

Now, this is my home brew. I'll explain a little about it after you've copied it. I'm not going into detail, since you should be experienced with this type of stuff; I'll just say how I handle it personally, and you might get some ideas.

Filename: .htaccess
File Location: folderloc/
RewriteEngine On
RewriteBase /

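# "filename/var/~var2.html" -> folderloc/filename.php?idx=var/~var2
RewriteRule ^([^*/]+)/(.*)\.html$ folderloc/$1.php?idx=$2 [L]
# "filename.html" -> folderloc/filename.php
RewriteRule ^([^*/]+)\.html$ folderloc/$1.php [L]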

Filename: index.php
File Location: folderloc/
<?php
// This is important! It is my way of handling variables.
// The first rule replaces the query string, so you can't just
// put ?blah=def at the end of the URL, and I'm too lazy to
// figure out how when I can just split the idx value on "/~".
$_GET = explode("/~", isset($_GET['idx']) ? $_GET['idx'] : '');

// This is how I display information. I have magic quotes on,
// not that it matters to you. The values come straight from
// the URL, so escape them before printing.
foreach ($_GET as $value) {
    echo htmlspecialchars($value).'<br>';
}
?>


The rewrite module uses a language known as "regex", or regular expressions. It's highly advanced and a pain in the neck to understand, which is why I call it a language of its own.

Now, you'll notice I have two rewrite rules going on here:
RewriteRule ^([^*/]+)/(.*)\.html$ folderloc/$1.php?idx=$2 [L]
RewriteRule ^([^*/]+)\.html$ folderloc/$1.php [L]
One is to handle a URL like "blah.com/folderloc/filename/var/~var2/~var3.html" and the other is just for something like "blah.com/folderloc/filename.html". The first passes everything after "filename/" to the page as variables; the other just gives you the file.
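
To see which rule picks up which address, the two patterns can be tried outside Apache. Here is a rough PHP sketch, not part of my setup: the function name rewrite_target is made up for illustration, the patterns are the two from above with the dot before "html" escaped, and it only imitates the matching, not everything mod rewrite does.

```php
<?php
// Hypothetical helper: imitates the two RewriteRule patterns with preg_match.
// $path is the part of the URL after "folderloc/", which is what a
// .htaccess file inside folderloc/ gets to match against.
function rewrite_target($path)
{
    // Rule 1: "filename/anything.html" -> folderloc/filename.php?idx=anything
    if (preg_match('#^([^*/]+)/(.*)\.html$#', $path, $m)) {
        return 'folderloc/' . $m[1] . '.php?idx=' . $m[2];
    }
    // Rule 2: "filename.html" -> folderloc/filename.php
    if (preg_match('#^([^*/]+)\.html$#', $path, $m)) {
        return 'folderloc/' . $m[1] . '.php';
    }
    // Neither pattern matched, so the URL would be left alone.
    return null;
}

echo rewrite_target('filename/var/~var2/~var3.html'), "\n"; // folderloc/filename.php?idx=var/~var2/~var3
echo rewrite_target('filename.html'), "\n";                 // folderloc/filename.php
```

The first pattern needs a slash after the file name, so a plain "filename.html" always falls through to the second one.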

The one that handles variables ["blah.com/folderloc/filename/var/~var2/~var3.html"] lets you get the variables into your page through the $_GET array: $_GET[0], $_GET[1], and so on, in the order they appear in the URL. I handle it like this for the simple reason that the rule doesn't let you use "?var1=blah" and such at the end of the URL, and I'm just too lazy to figure out how to handle that. You can access "index.php" with something like "blah.com/folderloc/index.html" or "blah.com/folderloc/index/var/~var2/~var3.html".
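
If you'd rather not overwrite $_GET, the same splitting can go in a small function. This is only a sketch under my own assumptions: parse_idx is a made-up name, and trimming a leading "~" is an extra added here so that "~var" and "var" in the first position both come out the same.

```php
<?php
// Hypothetical helper: turn the idx value from the rewrite rule into a
// plain list of values, leaving $_GET itself untouched.
function parse_idx($idx)
{
    if ($idx === null || $idx === '') {
        return array();               // no variables in the URL
    }
    // Assumption: allow an optional "~" in front of the first value.
    $idx = ltrim($idx, '~');
    return explode('/~', $idx);
}

$vars = parse_idx(isset($_GET['idx']) ? $_GET['idx'] : null);
foreach ($vars as $i => $value) {
    // Escape before printing, since the values come straight from the URL.
    echo $i, ': ', htmlspecialchars($value), "<br>\n";
}
```

With "filename/var/~var2/~var3.html" this gives $vars[0] = "var", $vars[1] = "var2" and $vars[2] = "var3", the same order as in the URL.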

Well, good luck understanding this. I really did want to write a full explanatory tutorial, but this is one of those things where you have to learn other things first to really understand it. I can try to help you understand it, but I'm not going to give you step-by-step instructions on how to make this specific script work, nor any other. It's just not that productive for me and it doesn't help you learn.

More technical information about mod rewrite can be found here:
http://httpd.apache.org/docs/1.3/mod/mod_rewrite.html

cpvr
09-25-2007, 03:09 AM
You're also missing the .htaccess code that redirects http://domain.com to http://www.domain.com.

You can do this to make that happen:


RewriteCond %{HTTP_HOST} ^domain\.com$ [NC]
RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]

FuRom
09-25-2007, 03:13 AM
^^ Ah, I see. I never felt the "www" was all that important, but I guess someone could use it. Also, as a side note, "RewriteCond" is like an if statement for the RewriteRule that follows it. You can use it to prevent people from going to the original file, or at least I'd assume so. I don't really use it because I don't care if anyone sees my original file.
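
For anyone who does care, here is a rough sketch of the same idea done from the PHP side instead of with RewriteCond, so it is a different technique from the one mentioned above. It relies on REQUEST_URI still holding the address the browser asked for after an internal rewrite. The name html_address_for is made up, and sending people to the ".html" address while dropping any query string is my own assumption about what you'd want.

```php
<?php
// Hypothetical helper: if the visitor asked for the .php file directly,
// return the matching .html address; otherwise return null.
function html_address_for($requestUri)
{
    // Drop any query string and keep only the path.
    $parts = explode('?', $requestUri, 2);
    $path = $parts[0];
    if (substr($path, -4) === '.php') {
        return substr($path, 0, -4) . '.html';
    }
    return null;
}

// A rewritten request shows up here as ".html"; a direct hit shows ".php".
$uri = isset($_SERVER['REQUEST_URI']) ? $_SERVER['REQUEST_URI'] : '';
$target = html_address_for($uri);
if ($target !== null) {
    header('Location: ' . $target, true, 301);
    exit;
}
```

Put at the top of a page like index.php, this would bounce "blah.com/folderloc/index.php" over to "blah.com/folderloc/index.html" and leave rewritten requests alone.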

Also, CPVR, think this is worthy of a sticky?

cpvr
09-25-2007, 03:17 AM
Well, if you have both of the domains up, then you risk "duplicate content" with the search engines and your site won't rank as high. And yes, it's worthy of a sticky.

FuRom
09-25-2007, 03:22 AM
Oh, I see! I never thought about that. You make a very good point.

cpvr
09-25-2007, 03:23 AM
The same rule applies when you "point" a domain at another site: the search engines see the same content twice, which means another duplicate content penalty.