Die-ing on the Web
by brian d foy

Web servers aren't very friendly to misbehaving CGI scripts - they like to complain rather than offer constructive criticism, so the detective work is left to the script maintainer. Often the script maintainer is not the same person as the script author, which makes diagnosing the problem that much harder. I present four methods for dealing with uncooperative CGI scripts. The first two, "Fiddling with die()" and "Redirecting STDERR", show some basic error-handling techniques that you can apply to more than just CGI scripts. The third method illustrates the CGI::Carp module, which does the work of the first two methods for you. The last method should become part of every CGI developer's personal library - a custom-built error-handling routine that not only sends error messages to the right places, but can also mail the appropriate person or take other actions to try to fix the problem.

A respected physicist once consoled me with "An expert is someone who has made every mistake". I defer on expert status, but here's what I've learned from my mistakes.

1. FIDDLING WITH DIE

What happens when you need to fix a CGI script that you did not write? I have had plenty of clients ask me to quickly patch their existing script so they had something in production while I re-coded the project. Amazingly enough, I have found a lot of CGI scripts that use die()! Usually the problem is a failed open() that mucks up the works before an HTTP header can reach the server, which leads to server errors. The first time I had to debug someone else's CGI script, I rushed off to the substitute function, s/die/$another_thing/g, only to REALLY cause problems. Luckily no one was around to see this, and I quickly figured out that I needed to start over with the original file. Now, rather than going through the code to change every instance of die(), I simply redefine what die() does.
Instead of printing to STDERR, I can change die() to print to STDOUT and include a minimal HTTP header. I change what die() does by fiddling with $SIG{__DIE__} - the signal that is sent when die() is invoked. In this case, I use an anonymous subroutine to print die()'s message to STDOUT. (The same thing works for warn() and $SIG{__WARN__}.)

    #!/usr/bin/perl

    # use an anonymous subroutine to replace die()'s
    # default behaviour
    $SIG{__DIE__} = sub {
        my $message = shift;

        print STDOUT "The following message is brought to you anonymously:\n";
        print STDOUT "$message\n";
    };

    print STDOUT "Content-type: text/plain\n\n";
    print STDOUT "This is from STDOUT\n";

    # well, at least my system doesn't have this file :)
    open FILE, '/etc/password' or die "This is from die: $!\n";

    __END__

This is a great tool if you inherit a script from a colleague, but I don't recommend the technique if you are starting a long script from scratch. I notice that Randal likes to use it in his WebTechniques scripts [*]. That's probably fine for short scripts, but a code reviewer might not remember what you did to die() several hundred lines further down. Um, not that I know this from experience or anything.

2. REDIRECTING STDERR TO STDOUT

What happens if I have redefined die(), but I still get output on STDERR? If STDERR flushes before the server receives a proper HTTP header, I get more server errors and more frustration. I once had to debug a complicated mess of CGI scripts whose documentation seemed to be some sort of ASCII-fied Cyrillic language. The tangle of require()'s and files was quite a mess, and I couldn't read the documentation, but something was secretly printing to STDERR. I wanted to see the error message so I could infer where it was coming from by searching for pieces of the message. However, I was on a VT100 terminal at the time (yes, they still exist), so watching the server log was annoying and time-consuming.
Running the script from the command line interwove STDOUT and STDERR into one big HTML mess. I decided to redirect STDERR to STDOUT so that I could see the error message in Lynx, which would also nicely format the HTML. There is an example in "How do I capture STDERR from an external command?" in the Perl FAQ [*], but it gets a bit tricky, since I needed to print things in a certain order for the web server not to give me an error:

    #!/usr/bin/perl

    BEGIN {
        # we want STDOUT to flush right away
        select(STDOUT);
        $| = 1;

        print STDOUT "Content-type: text/plain\n\n";
    }

    open STDERR, ">&STDOUT";

    print STDOUT "This is from STDOUT\n";
    print STDERR "This is from STDERR\n";

    die 'This is from die';

    __END__

Notice that I set STDOUT to autoflush. If I don't do that, the STDERR handle flushes first and I get a "malformed header from script" error, even though I supposedly output the HTTP header in a BEGIN block.

3. USING CGI::Carp

Now that I've shown you the basics of managing fatal errors and STDERR, you can forget about them and use CGI::Carp, which comes with the standard perl distribution. The CGI::Carp module easily traps fatal errors and sends them elsewhere. There are two exportable functions that do simple error handling - carpout() and fatalsToBrowser(). The carpout() function allows me to redirect the output from die, warn, croak, confess, and carp to another file handle. The POD suggests that I set up this redirection in a BEGIN block so I can catch some compile-time errors:

    BEGIN {
        use CGI::Carp qw(carpout);

        open(ERROR_LOG, ">>my_error_log")
            or die("my_error_log: $!\n");

        carpout(\*ERROR_LOG);
    }

Beware, though! We have now redirected STDERR to a file. STDERR is where perl prints the

    carpout.cgi syntax OK

message after you test your script from the command line with

    perl -cw carpout.cgi

which we are all doing, right?
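One way to keep those command-line diagnostics visible - my own sketch, not part of the CGI::Carp documentation - is to save a duplicate of STDERR onto another handle before carpout() takes it over:

```perl
#!/usr/bin/perl

# sketch: keep a copy of the real STDERR before carpout()
# redirects it, so terminal diagnostics stay visible
BEGIN {
    use CGI::Carp qw(carpout);

    # duplicate STDERR onto a new handle first
    open(REAL_STDERR, ">&STDERR")
        or die("can't dup STDERR: $!\n");

    open(ERROR_LOG, ">>my_error_log")
        or die("my_error_log: $!\n");

    carpout(\*ERROR_LOG);
}

# this goes quietly to my_error_log
warn "logged to the file\n";

# this still reaches the terminal
print REAL_STDERR "visible on the tty\n";
```

The handle name REAL_STDERR is my own choice; anything not already taken will do.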
Well, I wasn't thinking about this when I tried it the first time, so my screen looked like

    dog[32] perl -cw carpout.cgi
    dog[33]

I couldn't figure out what I had done wrong, since I had never seen perl not return anything. Furthermore, running the script appeared equally fruitless. Its whole job was to die(), so its output disappeared into the corn fields too.

    dog[36] ./carpout.cgi
    dog[37]

When I start playing with redirection, I usually need to draw myself a picture to avoid such lapses of memory. Despite all of this, I still need to handle sending the HTTP header to the server, since carpout() doesn't do that for me. But today is my lucky day, since CGI::Carp already knows how lazy I am. The second method redirects fatal messages from die() or confess() to the browser, along with a minimal set of HTTP headers so that the server won't complain. Strangely enough, it's called fatalsToBrowser():

    #!/usr/bin/perl

    use CGI::Carp qw(fatalsToBrowser);

    open FILE, "quotes.txt" or die("$!\n");

    print <<"HTTP";
    Content-type: text/plain

    HTTP

    while( <FILE> ) { print }

    close FILE;

    __END__

Some HTML gets sent to my browser:

    Software error:

    No such file or directory

    Please send mail to this site's webmaster for help.

and the error message is neatly recorded in the error log with a time stamp and the source filename:

    [Sat Oct 18 04:20:48 1997] fatalsToBrowser.cgi: No such file or directory

4. ROLLING YOUR OWN

By far the best solution for large projects is a custom error-handling routine. Instead of using die(), I have my own routine, cgi_error(), which I export from a module. I can do various clean-up sorts of things, as well as making sure that I get an intelligent message in the browser if something goes wrong. I can even send myself nasty little email messages:

    sub cgi_error {
        my $message = shift;

        print <<"HTTP";
    Content-type: text/plain

    there was an error:

    $message
    HTTP

        open MAIL, '| /usr/lib/sendmail -t -odq -oi';

        print MAIL <<"MESSAGE";
    To: brian\@sri.net
    From: jimminy_cricket\@sri.net
    Subject: Your groovy CGI, baby.

    something horrible has happened:

    $message
    MESSAGE

        close MAIL;
    }

I can add even more to cgi_error() to collect most of the data that I need to diagnose the problem, and maybe even fix it. This was especially handy with one script that needed to be setuid. If it wasn't setuid, it would loop eternally, trying to get a record lock on a database that the server UID did not have permission to write. The script did not seem to break when it lacked permission; there were no server errors or other indications of failure. It just ate as much processing time as it could, and it looked like a very slow connection from the browser's perspective. Adding a cgi_error() call fixed the problem - if the script could not get the database lock after so many tries, it called cgi_error(), which diagnosed a few things. Is the process that has the lock still running? If not, cgi_error() can call a lock_smith script to release the zombie lock. Does the script have the right permissions? If not, cgi_error() can call a program to restore the proper permissions to the script.
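The liveness check can be sketched like this, assuming - my own convention, not something from the original script - that the lock file's first line is the PID of the process that took the lock:

```perl
#!/usr/bin/perl

# sketch: is the process that holds the lock still alive?
# assumes the lock file's first line is the locker's PID
sub lock_holder_alive {
    my $lock_file = shift;

    open LOCK, $lock_file or return 0;
    chomp( my $pid = <LOCK> );
    close LOCK;

    return 0 unless $pid =~ /^\d+$/;

    # kill with signal 0 delivers nothing, but returns true
    # if the process exists and we may signal it
    return kill 0, $pid;
}
```

cgi_error() could call something like this before deciding whether to invoke the lock_smith script.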
Once cgi_error() has tried to fix the problem, it collects some information and sends me a message saying what it tried to do and what actually happened. I have found this to be a far superior solution to listening to clients say "I don't know - it's broken", or to getting up before three in the afternoon to diagnose the problem by hand. Once I developed a cgi_error() function with which I was satisfied, I found that my time spent in the debugging phase of a project was greatly reduced.

IN SUMMARY

I presented several techniques for dealing with error handling in CGI scripts - some specific to the CGI environment and others that can easily be adapted to other situations. I highly encourage any CGI developer, especially those who distribute their software, to develop robust and sensible methods of error handling.

REFERENCES

The Perl FAQ, Section 8, "System Interaction"
Randal Schwartz's WebTechniques columns

__END__