SERVER SIDE INCLUDES

Take advantage of SSI's to create dynamic web pages!

Contents:
1. Client-Server Basics
2. Document Parsing
3. SSI Format
4. Executing the Command
5. Writing the Program
6. Can I start Using Them?
7. How do I enable them?
8. A perfect use
9. Conclusion

Author's Note: This article was written in 1995, so technical specifics may be out of date. However, the concepts still stand and I think this is a good introduction to what SSI's are and how/why they work.
For more information on SSI's, please see the CGI Resources List of articles to continue your learning.


If you've ever seen a counter on someone's page that says "You are visitor #1000" or a greeting that says "Welcome, visitor from yoursite.com", you have seen server-side-includes at work. They can be very useful for a number of different things, and if you've found yourself wanting to learn more about how to do "cool things" on your web pages, this might be your next step.

CLIENT - SERVER BASICS

Before teaching you how to do these tricks, let's first make sure you have a basic understanding of how documents are loaded. Let's say a user somewhere in the world wants to load your page into their WWW browser. They type in the address, and that sends a request out to the machine where your page is stored. On this machine, there is a special program constantly running and looking for incoming requests. The request from the person wanting your page gets sent to this program, whose job is to get the page from the computer, package it up the correct way, and ship the data out to the person who wants it. The program requesting the document (Netscape, Mosaic, Lynx, etc.) is called the client. The program that takes the request and returns the page is called the server. The important thing to remember is that the server is always running, and it is what does the work to return the page that is requested.
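
To make the "package it up and ship it out" step a little more concrete, here is a minimal sketch of what a server does with one request. It is written in Python purely for illustration, the function name build_response and the documents dictionary (standing in for files on disk) are made up for this example, and a real server does far more than this.

```python
def build_response(path, documents):
    """Return the raw bytes a very simple server might send back for `path`.

    `documents` is a hypothetical dict mapping a path to its HTML text;
    it stands in for looking the file up on disk.
    """
    body = documents.get(path)
    if body is None:
        status = "HTTP/1.0 404 Not Found"
        body = "<h1>Not Found</h1>"
    else:
        status = "HTTP/1.0 200 OK"
    payload = body.encode("utf-8")
    headers = [
        status,
        "Content-type: text/html",
        f"Content-length: {len(payload)}",
    ]
    # Headers come first, then one empty line, then the page itself.
    return ("\r\n".join(headers) + "\r\n\r\n").encode("ascii") + payload
```

The client only ever sees the bytes that come back, which is why anything the server does to the page before sending it is invisible to the person viewing it.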

There are many different servers available for many types of computers, and they are all a little different. So something that works correctly with one server may not work with another, and some servers can do things other servers can't. But most servers (that I know of) are able to do server-side-includes.

DOCUMENT PARSING

Now comes the important part. Most servers are not limited to grabbing a home page from the disk and sending it back. They can do other nifty tricks before sending the results back to the client requesting it.

One of these things is searching through the page it is returning for special commands that it recognizes. This is called "parsing" the document. In order for the server to do this, it must be told to do so. By default, most servers do not parse a document before sending it.

If your site gets a lot of accesses to its pages, your server has a lot of work to do. It has to manage all of the incoming requests, find the files, and return them to the right places. This can take a lot of computing power. Now, imagine if the server had to first look through every single document for commands before sending it, even if most of them do not contain any commands at all. This would just put even more load on the server, and it may not be necessary. So there is a method used to tell the server which documents it should parse for commands and which documents it can safely ignore.

The way it decides on this is by the filename of the document it is going to send. Most WWW pages end in .html, which lets the server know it is just an HTML document. Most servers will not look at these for commands. Instead, if you want your document to be parsed, you need to give it an extension of .shtml. When the server is going to send any document that ends with .shtml, it knows that it has to check for hidden commands first.

You can usually set things up the way you want them, though. The WWW administrator decides which filenames will be parsed and which ones will not. On our system, I have enabled parsing on ALL documents that end in .html, so there is no need to use .shtml. This can slow the server down considerably, but if the machine is fast enough and is not overwhelmed, the performance will not suffer.
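
The decision described above boils down to a check on the filename extension. Here is a small sketch of that check in Python (used only for illustration; the function name should_parse and its parsed_extensions parameter are hypothetical, not something a real server exposes).

```python
import os


def should_parse(filename, parsed_extensions=(".shtml",)):
    """Decide whether a document should be searched for commands.

    `parsed_extensions` plays the role of the administrator's setting;
    the default mirrors the common .shtml convention.
    """
    extension = os.path.splitext(filename)[1].lower()
    return extension in parsed_extensions
```

With the default setting, should_parse("index.shtml") is true and should_parse("index.html") is false. Passing (".shtml", ".html") instead models a setup like ours, where every .html page is parsed at the cost of extra work per request.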

So now you know that the server can check for commands. But what does it look for? And what does it do when it finds a command?

SSI FORMAT

The commands that the server looks for are actually kept inside of comment lines. A comment line in an HTML document looks like this:


<!--This is a comment-->

The '<!--' and '-->' are what enclose the comment. Nothing between them will be displayed by the browser that is viewing the page.

By using a special format inside of the comment, the server will recognize it as a command, rather than just something to skip over and ignore. A server-side-include looks something like this:


<!--#exec cgi="filename.cgi"-->

The important part is the '#' at the beginning of the comment, which tells the server that this is a command. This is followed by the keyword 'exec', which is a command telling the server to execute the following program. The cgi="filename.cgi" portion tells the server exactly which program to execute. #exec is just one of a few possible commands, which will be covered below. For now, we'll use this format as a specific example.
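
As a rough illustration of how a server could tell a command apart from an ordinary comment, here is a Python sketch built around one regular expression. The pattern is an assumption that only covers the simple one-attribute form shown above; real servers accept more variations.

```python
import re

# Matches comments of the form <!--#command attribute="value"-->.
# The '#' right after '<!--' is what separates a command from a plain comment.
SSI_DIRECTIVE = re.compile(r'<!--#(\w+)\s+(\w+)="([^"]*)"\s*-->')


def find_directives(page):
    """Return a list of (command, attribute, value) tuples found in `page`."""
    return [match.groups() for match in SSI_DIRECTIVE.finditer(page)]
```

Given a page containing both <!--This is a comment--> and <!--#exec cgi="filename.cgi"-->, only the second one is reported, as the tuple ("exec", "cgi", "filename.cgi").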

EXECUTING THE COMMAND

In the example above, "filename.cgi" is given as the name of the program to execute. So, the server executes this program automatically and waits for it to finish.

Anything printed to the output by the program replaces the server-side-include command in the document. So, if you have a command like:


<!--#exec cgi="filename.cgi"-->

in your document, and filename.cgi said

print "hello"

the output "hello" would be sent back to the server and it would insert it into the document in place of the command. So anyone viewing the source code to your page would simply see "hello" and not even be aware that it was the output of an external program. This is the whole idea behind SSI's - inserting something into the document automatically every time it is loaded.
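
The replacement step can be sketched in a few lines of Python. This is only an illustration: the programs dictionary maps a script name to a plain Python function that stands in for actually running the external program, and both it and the function name expand_exec are invented for this example.

```python
import re

EXEC_DIRECTIVE = re.compile(r'<!--#exec\s+cgi="([^"]*)"\s*-->')


def expand_exec(page, programs):
    """Replace each #exec command in `page` with its program's output.

    `programs` maps a script name to a callable returning the text that
    the script would print. An unknown script name raises KeyError.
    """
    def run(match):
        program = programs[match.group(1)]
        return program()

    return EXEC_DIRECTIVE.sub(run, page)
```

If programs is {"filename.cgi": lambda: "hello"}, then the page '<p><!--#exec cgi="filename.cgi"--></p>' comes back as '<p>hello</p>', with no trace of the command left in the source.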

WRITING THE PROGRAM

Now that you have the correct format for executing a command, all you need is a program to run. If you aren't a programmer at all, you might have a lot of difficulty at this point because things like this just are not that simple. But I'll try to explain the basics.

Programs can be written in any language, as long as they can run on the machine you are using. This means C, C++, Pascal, shell scripts, Fortran, or any other language. The most commonly used language for these purposes is Perl. Perl is probably the fastest, easiest way to accomplish the tasks you want without having to go to great lengths just to do it. It is available for many systems (mainly Unix, but other operating systems have ports available) and you'll be able to find many scripts written in Perl to do what you want. But going into detail about Perl is beyond the scope of this page. Here is a basic Perl script that will execute and return "Hello":


#!/usr/bin/perl
print "Content-type: text/html\n\n";
print "Hello";

Wait...if the only thing this prints is "Hello", then why is it printing that "Content-type" line too? The answer is simple, and often forgotten by people new to programming like this.

The line that reads "Content-type: text/html\n\n" is printed first so that the server knows what kind of data is coming back from the script. When you use an SSI, the server has no idea what is going to be returned by the script and inserted into the document. It can be plain text, HTML, a gif image, a jpg image, etc. So every script must start with a line telling the server what is being returned so it knows how to interpret it correctly. So the complete output of the above script is this:


Content-type: text/html

Hello
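
The output above has two parts: the Content-type line, then one empty line, then the data. As a sketch of how the receiving side might take that apart, here is a small Python function (Python is used only for illustration, and the name split_cgi_output is hypothetical).

```python
def split_cgi_output(output):
    """Split script output into a dict of headers and the body text.

    The headers end at the first empty line; if there is no empty line,
    the output cannot be interpreted and a ValueError is raised.
    """
    header_text, separator, body = output.partition("\n\n")
    if not separator:
        raise ValueError("no empty line after the headers")
    headers = {}
    for line in header_text.splitlines():
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers, body
```

Feeding it the output of the script above yields the header "content-type" with the value "text/html" and the body "Hello". A script that prints only "Hello" fails here, which is the same mistake that trips up many first scripts.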

Notice the empty line between the Content-type line and the data that is to be returned. This is REQUIRED in order for the line to be understood. Also, the Content-type in this case (and in most cases) is text/html, which means the returned data is just text, and it might contain some HTML tags like <B>, <I>, etc.

WHAT ARE ALL THE AVAILABLE COMMANDS?

I'm not going to duplicate all of the possible calls here, because they differ slightly depending on what server you are running, and because there are other places that already do that. Check out NCSA's site for more information.

CAN I START USING THEM?

It's not quite that easy (but what is? :), since server setups differ for different machines and sites. Some things to keep in mind:

  • Not all sites have SSI's enabled at all. You'll have to mail your sysadmin to see if you can use them.
  • Some sites do not allow scripts to be executed in users' home directories. This means that you cannot do any of this simply. Ask your sysadmin, once again.
  • Some sites have #exec disabled, but not other SSI's.
  • Some sites do not have Perl installed on their system.
  • If all else fails, mail your system administrator. Ask if SSI's are enabled, whether you can have scripts in your own directory, whether you can have access to the /cgi-bin directory (where most scripts are stored), whether you have Perl or not, etc. If he does not know how to fix these things, tell him that a REAL sysadmin would know how. Blackmail him to keep his dirty secret private.

HOW DO I ENABLE THEM?

I can only speak for the NCSA httpd server, since that is what I have experience with. There is one file that controls this access: srm.conf. Usually it is located in /usr/local/etc/httpd/conf/srm.conf. If you are curious, you can even look at it yourself. At the bottom of the file, there are these lines:


# ScriptAliased directories, uncomment the following lines.

#AddType text/x-server-parsed-html .shtml
#AddType application/x-httpd-cgi .cgi

These control the things we have been talking about. The first line says what filenames should be parsed - .shtml by default. The second line tells the server which files are programs or scripts to execute. By default it is .cgi, so any program you write to use will need to end in .cgi or it won't be recognized as a program.

In order to enable both of these things (which is probably what you want), get rid of the "#" at the front of each line, which tells the server to ignore that line. Then, you'll have to restart the httpd server in order for it to re-read the configuration files and do what you want. This is accomplished by the following command:


kill -HUP #

# is the process ID of the httpd server running right now. To find out what the PID is, you can do two things:
1. do a 'ps -ax' and see what it says.
2. look in /usr/local/etc/httpd/logs/httpd.pid
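
The second option and the kill command can be combined. Here is a sketch in Python (for illustration only) that reads the PID file and sends the same HUP signal. The function names are made up, the default path is simply the one mentioned above and may differ on your system, and it assumes a Unix machine and an account with permission to signal the server.

```python
import os
import signal


def read_pid(pid_file_text):
    """Turn the contents of the PID file into an integer process ID."""
    return int(pid_file_text.strip())


def send_hup(pid_file="/usr/local/etc/httpd/logs/httpd.pid"):
    """Send HUP to the httpd process named in `pid_file` and return its PID."""
    with open(pid_file) as handle:
        pid = read_pid(handle.read())
    # Same effect as running: kill -HUP <pid>
    os.kill(pid, signal.SIGHUP)
    return pid
```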

A PERFECT USE

Another command available in SSI's is #include, as opposed to #exec. This command does not execute a program that you must write - instead, it just inserts a file into your WWW page. I find this to be one of the most valuable parts of SSI's. If you've browsed any of my pages, you'll notice that most of them have the same exact footer at the bottom. To type each one in separately would take a long time. Plus, when I want to change something I would have to change every single page. So instead, I use an #include statement to insert a separate footer file that I have created.

This way, I can change the one footer file and be done. Each time a page is loaded, the server will go out and insert the footer for me, making it look like I just typed it in on all of them. See the reference above for the syntax on this command, as well as the others.
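
To show why this saves so much work, here is a Python sketch of the idea. It assumes the file="..." form of the #include command as documented by NCSA, so check your own server's reference for the exact syntax. The files dictionary stands in for files on disk, and the names expand_includes and footer.html are hypothetical.

```python
import re

INCLUDE_DIRECTIVE = re.compile(r'<!--#include\s+file="([^"]*)"\s*-->')


def expand_includes(page, files):
    """Replace each #include command in `page` with the named file's text.

    `files` maps a filename to its contents. An unknown filename
    raises KeyError.
    """
    return INCLUDE_DIRECTIVE.sub(lambda match: files[match.group(1)], page)
```

Every page that carries the same #include line picks up whatever is currently stored under "footer.html", so editing that single entry changes the footer on all of them the next time they are loaded.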

CONCLUSION

Hopefully you now have a good understanding of how SSI's work, how to use them, and how to enable them on your server. Now the real work is ahead of you - either finding the scripts you want to use or writing them on your own.

As is so often the case, the things you see on pages that make you say "Cool!" are also the complicated things. It's ridiculous to think that you can learn to create these great things in a week or so, so don't expect to.

I've created some interesting applications for my site, I think. But I've been programming different things for many years. I've been using Unix since '92, including some system administration. I've been using the WWW since '93. I started with CGI and Perl shortly after that. This is by no means a simple, quick process, and the absolute best thing you can remember is that the most difficult and time-consuming things to learn are also the most rewarding :)

Good luck!!

Copyright 1995, Matt Kruse <matt@mattkruse.com>
This document may not be reproduced in any way without the permission of the author