Foreword to UNIX Relational Database Management

I stumbled into using UNIX in late 1978, in a biomedical research environment. The machine was a PDP-11/34 running V6 UNIX, with custom automated microscopy gear used for image processing of mammalian cells. In those days there was the Kernighan and Ritchie C book and the UNIX manuals themselves; nothing more. Two degrees in computer science prepared me to tackle UNIX, but I was still, perhaps understandably, shy about investigating all of the 100+ tools, including such strange ones as awk, lex, and yacc.

I quickly discovered that the use of the UNIX tools freed me from the great bulk of the drudgery of supporting real end users on a computer system. I could string tools together quickly instead of writing small one-shot programs to do the myriad data manipulation tasks that characterize real life in the scientific computing world. In the course of my work I acquired a relational database package, but found that it did not mesh well with the UNIX tool-kit philosophy. Like the mammoth database systems I had previously used on large IBM systems, one had to drop into the database package, where a whole new command language applied and the UNIX tools could only be reached with extreme difficulty, if at all.

By 1983 I had added an early 16-bit microprocessor-based UNIX timesharing system running V7 UNIX to my computer shop. I had a growing number of users who were doing general purpose data processing and needed a variety of tools to handle their needs. I was fortunate enough to be in attendance at the Usenix conference where Rod Manis first described his /rdb database, which existed as a set of UNIX filters. As he presented his paper I realized, along with numerous other members of the audience, that Rod had grasped and implemented the central abstraction I had been toying with but never formalized: a UNIX database should use ASCII files and operate at the shell level as a series of filters. This simple but brilliant insight of Rod's, brought about by his mastery of a UNIX tool (awk) that I had ignored to that point, provided my researchers with the single most powerful tool I have been able to acquire to this date. Along with many others who remained after his talk to discuss his ideas, I was one of the first users of the /rdb package. The medical research environment is characterized by a chronic lack of funding for computing equipment and a scarcity of programming expertise. Without mega-funding, I had no choice but to find ways to let end users solve their own problems as much as possible, without making them all programmers.

To my mind, the power of this tool is the freedom it gives to the end users. For the first time I could sit down with a novice and create a working sample database using real data in a matter of minutes. Queries could be demonstrated and stored in user-friendly shell scripts. Within hours the end users were fully capable of creating their own databases. Their creativity was amazing: one researcher who had had a single Fortran class eight years earlier wrote a two-page shell script using /rdb filters to completely computerize her elaborate data processing and analysis. Other users discovered that the innocent-appearing simple report writer was in reality a powerful meta-tool that allowed them to write shell scripts that wrote shell scripts, yielding in effect a two-dimensional program. Word of mouth led to demands that the package be bought for at least a dozen other machines at this site. The database for the world's largest melanoma (skin cancer) epidemiology study runs on /rdb, as do many other applications.

I have found this abstraction of a database (regular ASCII files, filter programs, normal file access) to serve well for moderate-size databases and the frequencies of update and query that are common in the scientific research fields. As with most UNIX utilities and indeed UNIX itself, it is always possible to write a single-point solution that would be faster. I have found, however, that the ability to apply the full power of the UNIX toolkit to database problems far outweighs any speed penalties that I might be paying on the current generation of super-micros and super-minis. Even in situations where a "TRADITIONAL DATABA$E" is required, the rapid prototyping of the /rdb approach is often used to provide the early insights into the proper way to tackle problems requiring exceptional speed or size. I find that writing a custom program to manipulate or calculate data is virtually unnecessary, since such problems can nearly always be solved with the /rdb database and regular UNIX filters.

A completely unexpected benefit of this tool-kit approach to databases has been the education of users, many of whom have had no prior computer training or exposure. The early confidence gained in putting up their own simple databases often led to users taking the plunge and developing more complex shell scripts using those tools. In numerous cases, their continued interest led to them delving into awk and writing custom front-end or back-end programs. A few adventurous souls then began writing C code in the awk scripts, as well as learning to use lex, which after all is merely a C program-generator cleverly (?) disguised as a lexical-analyzer. The net result is that I have had several novice users bootstrap themselves up to be excellent applications programmers in C in an amazingly short time, all due to the good first experience with the database tools!

The evolution of personal computers into machines with sufficient power to run UNIX well (IBM AT, Mac II, etc.) puts the ability to use the UNIX tool-kit philosophy within the reach of every research lab and small business. A book like this is as much a guide to using UNIX itself and the UNIX philosophy of problem solving as it is a guide to a specific database. The authors have quietly effected a revolution in the use of databases as a problem-solving tool that is as startling in its clarity and simplicity as the development of the spreadsheet program.

Tom Slezak

Computer Scientist, UNIX Support

Lawrence Livermore National Laboratory