Search Engine Friendly URLs
Introduction
These days, everyone who is anyone has a dynamic website. Gone are the days of hand-creating hundreds of standard HTML pages; web servers now create thousands of pages on the fly from databases of content or products. With these dynamically created pages comes a problem: how do we pass information from one page to the next? By using “query strings”. There’s nothing new there; we see them every day (e.g. http://www.exampledomain.com/article.php?id=5, where the part after the question mark is the query string).
The only problem with this is that search engines may not spider these URLs, which means you won’t be in their listings and no one will be able to find your site, no matter how beautiful and dynamic it is! (Many search engines WILL spider dynamic pages, but with a limit on how many per site, or how deep into the site, so the crawler may not pick up all your pages.)
How can we fix this problem? By using “Search Engine Friendly URLs”. Throw out that query string, and ‘trick’ the search engines into thinking it’s just an ordinary page (but don’t worry, it’s perfectly legal, and your site WON’T be penalized in any way).
Both of the ideas presented below work using PHP and Apache, and most likely will not work under other languages or web servers. If you do find a comparable method using, for example, ASP, please let me know.
A Simple way to implement Search Engine Friendly URLs using PATH_INFO
Under Apache, you can add any extra information you want after the URL, for example:
http://www.exampledomain.com/mypage.php/test
As long as there is no file /mypage.php/test, Apache will ‘roll back’ the URL, bit by bit, until it finds an existing page (mypage.php). The part of the URL after the page name is not lost: PHP exposes it as $_SERVER['PATH_INFO'] (older PHP versions with register_globals enabled also exposed it as the global $PATH_INFO). In the above example, $_SERVER['PATH_INFO'] contains “/test”.
We can parse out the details after the page name with this code:
$query_array = explode("/", $_SERVER['PATH_INFO']);
After executing this code, $query_array contains:
$query_array[0] -> “” (an empty string, because the path starts with a slash)
$query_array[1] -> “test”
Going back to our original URL: http://www.exampledomain.com/article.php?id=5
It can now become: http://www.exampledomain.com/article.php/5
Adding this code to the top of the page sets the $id variable as it should be:
$query_array = explode("/", $_SERVER['PATH_INFO']);
$id = $query_array[1];
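Anything read from the URL is user input, so it is worth checking before it reaches a database query. The following is a minimal sketch rather than part of the method above: the helper names path_segments() and article_id() are hypothetical, and it assumes that article IDs are whole numbers written in digits. Unlike the two-line version, it drops the empty leading element, so the first value sits at index 0.

```php
<?php
// Hypothetical helper: split a PATH_INFO string such as "/5" into its
// non-empty segments, e.g. array("5").
function path_segments($path_info)
{
    $segments = array();
    foreach (explode('/', $path_info) as $part) {
        if ($part !== '') {
            $segments[] = $part;
        }
    }
    return $segments;
}

// Assumption: an article ID consists of digits only.
// Anything else (missing, empty, or non-numeric) yields null.
function article_id($path_info)
{
    $segments = path_segments($path_info);
    if (isset($segments[0]) && ctype_digit($segments[0])) {
        return (int) $segments[0];
    }
    return null;
}

// PATH_INFO is not set when the URL has nothing after the page name.
$path_info = isset($_SERVER['PATH_INFO']) ? $_SERVER['PATH_INFO'] : '';
$id = article_id($path_info);
```

A page using this sketch would treat a null $id as "no article requested" and show an error or a default page.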
Making your Search Engine Friendly URLs even friendlier!
Enable MultiViews
Edit your .htaccess file and add this line:
Options +MultiViews
MultiViews lets you leave the extension off the end of a filename; Apache will find and display the file anyway. (The leading + adds MultiViews to the options already in effect instead of replacing them.) So our URL of http://www.exampledomain.com/article.php/5 can now become http://www.exampledomain.com/article/5
Again, we have made our link even shorter.
But wait… It gets (even) better!
Normal visitors are accustomed to seeing either a .htm or .php extension at the end of a web URL. With this method the extension is stripped, but we can put it back in. This is purely for that aesthetic reason, creating SURFER friendly Search Engine Friendly pages, although it may also have a small added effect on search engines that think they are indexing static instead of dynamic pages.
So, instead of just http://www.exampledomain.com/article/5, we now want http://www.exampledomain.com/article/5.htm.
Add this code to the top of your page (above the other code discussed in this article):
$_SERVER['PATH_INFO'] = preg_replace('/\.htm$/', '', $_SERVER['PATH_INFO']);
That line of code strips the .htm from the end of the path information ($_SERVER['PATH_INFO'], the superglobal form of $PATH_INFO) and leaves the rest of it untouched.
Now our example URL is http://www.exampledomain.com/article/5.htm, and the journey into the world of Search Engine Friendly URLs is complete!
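To see the pieces working together, here is a small sketch of the whole journey from a friendly path to an ID. The function names strip_extension() and id_from_path() are hypothetical, and the sketch assumes the extension is always .htm. It removes the extension only when it is the very end of the path, because a blanket search-and-replace would also cut ".htm" out of the middle of a value.

```php
<?php
// Hypothetical helper: remove $extension only if the path ends with it.
function strip_extension($path_info, $extension = '.htm')
{
    $length = strlen($extension);
    if ($length > 0 && substr($path_info, -$length) === $extension) {
        return substr($path_info, 0, -$length);
    }
    return $path_info;
}

// Hypothetical helper: "/5.htm" or "/5" both give "5"; no value gives null.
function id_from_path($path_info)
{
    $parts = explode('/', strip_extension($path_info));
    return (isset($parts[1]) && $parts[1] !== '') ? $parts[1] : null;
}

// A trailing .htm is removed, anything else is left alone.
$clean = strip_extension('/5.htm');   // "/5"
$other = strip_extension('/5.html');  // "/5.html" (unchanged)

// For comparison, str_replace() removes ".htm" wherever it appears.
$mangled = str_replace('.htm', '', '/5.html');  // "/5l"

$id = id_from_path('/5.htm');  // "5"
```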
Drawbacks with Search Engine Friendly URLs
Fixed variable order
With a normal URL, you can list the variables in the query string in any order, or leave some out. With this method, you cannot. Every variable is identified only by its position in the URL, so a value in the wrong position ends up in the wrong variable.
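One way to soften this drawback is to give each position a name and a default value, so that variables at the end of the URL can at least be left out. This is only a sketch: the helper name named_segments() and the URL layout /article/<id>/<page> are assumptions made for illustration. The order itself remains fixed, and a variable in the middle still cannot be skipped.

```php
<?php
// Hypothetical helper: pair positional URL segments with variable names.
// $defaults maps each name, in URL order, to the value used when the
// segment is missing.
function named_segments($path_info, array $defaults)
{
    // Drop empty pieces (such as the one before the leading slash)
    // and renumber the rest from 0.
    $values = array_values(array_filter(explode('/', $path_info), 'strlen'));

    $result = array();
    $position = 0;
    foreach ($defaults as $name => $default) {
        $result[$name] = isset($values[$position]) ? $values[$position] : $default;
        $position++;
    }
    return $result;
}

// Assumed layout for illustration: /article/<id>/<page>
$layout = array('id' => null, 'page' => '1');

$both = named_segments('/5/2', $layout);  // id "5", page "2"
$short = named_segments('/5', $layout);   // id "5", page falls back to "1"
```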
Server Strain
Using both the “roll back” and “MultiViews” Apache features means that your server has to do extra work to serve each page to a viewer. On a high-traffic site, the server may be unable to handle this extra work, and you may need to upgrade your hosting plan or dedicated server.