July 2, 1994
by sjvn01

Putting The Pieces Together: Mosaic

So, you think doing stuff on the Web can be hard today, do you? Venture back with me to 1994, when, in the pages of Government Computer News, I described how to get Mosaic, the first popular Web browser, working. This was in the ‘good’ old days when changing your home page meant manually editing an ini file.

There’s no computing subject hotter than the Internet. And, when it comes to Internet tools, nothing is more white-hot than Mosaic. Alas, Mosaic is also one of the most difficult Internet tools to install and certainly the most misunderstood.

Before I go into the down-and-dirty details of how to install Mosaic, let’s go over what Mosaic can and can’t do. First, Mosaic is a freeware World Wide Web (WWW) browser from the National Center for Supercomputing Applications (NCSA). The WWW is a world-wide distributed hypermedia system.

In the WWW, documents, databases, and Internet resources appear as hypertext documents. As you read the document, you can click on highlighted keywords and go to other documents and resources. For example, when you visit the Novell WWW server, you’ll see an illustration of red Novell technical manuals. Selecting the manual on technical support takes you to keywords that move you closer to your information destination.


April 1, 1994
by sjvn01

RAID!

Two well-known laws of computing are that there’s no such thing as enough disk space and no hard drive is fast enough. If you get a gigabyte sized disk, within a year you’ll have more than a gigabyte of data. You’ll also want megabytes of that data in memory faster than your system can provide it. At least nowadays, you can get disks in gigabyte sizes. Unfortunately, their data-transfer speeds are not much different from their 40 MB relatives. A new technology called RAID may change all of that forever.

SLED

One problem with conventional mass storage is that single large expensive disks (SLEDs) are, as the name suggests, pricey. It’s not easy making conventional drives with the tolerance levels that can handle King-Kong sized data loads. This translates into high development and construction costs.

A more important problem with SLEDs is that purely mechanical considerations hinder their output. While CPU and memory speeds continue to improve at a remarkable rate, the raw speeds of secondary storage devices are improving at a much more modest rate.

Most users haven’t noticed that their hot new processors are outracing their disks. That’s because caching makes it possible to hide hard drives’ comparative slowness. Both dynamic RAM (DRAM) and static RAM (SRAM) have gone down in price and up in performance. These trends mean that users have the room needed to use caching software effectively and vendors can produce reasonably priced caching disk controllers.

Caching disguises the problem, but does nothing to cure it. Higher hard disk data density helps by improving transfer rates, but not enough. SLEDs are still hamstrung by the need to move a drive’s heads mechanically to seek data and the delays caused by disk rotation.

RAID

Fortunately, there is another way of getting gigabyte-sized storage and good performance. This method is to take an array of inexpensive disks and attach them to a computer so that the computer views the array as a single drive. It’s simple, it’s slick, and it works. Called RAID, for redundant arrays of inexpensive disks, it’s going to change the way you buy file-server and workstation storage.

RAID is an old and dusty way of viewing mass storage. As our hunger for more space and higher speeds grows, RAID has become more prominent. Equally important, David Patterson, Garth Gibson, and Randy Katz in their seminal paper, “A Case for Redundant Arrays of Inexpensive Disks,” gave computer designers a RAID taxonomy. By defining and classifying as levels the ways that an array of disks can be used to improve performance, Patterson, and his fellow researchers, opened a new vista of mass storage technology.

Pluses

RAID provides several benefits. The first is the best. RAID systems have the potential to deliver vastly increased data transfer rates. In theory, the input/output transmission rate of a RAID system can be more than ten times greater than a SLED.

RAID pulls this trick off by “striping” data across the array’s disks. In English, this means that a file can be distributed across the array so that it can be read or written much more quickly than on a SLED. For instance, a file can be placed so that while the first part of it is being read from disk one of the array, the second portion is already being picked up from disk two.

By enabling parallel data transfers, data throughput can be multiplied by the number of drives in the array. For example, a four-disk RAID could have four times the throughput of an equal-sized SLED. The resulting increase in bandwidth is largely what gives RAID systems their performance kick. The same mechanical factors that slow down SLEDs drag the theoretical performance benefits of RAID back to earth. Nevertheless, RAID designs are still inherently faster than SLED designs.
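To make the striping idea concrete, here is a minimal sketch in Python. It assumes the simplest possible layout, round-robin by block, and the function names are mine for illustration only; real controllers differ in stripe size and mapping.

```python
def stripe_location(block, num_disks):
    """Map a logical block number to (disk index, block offset on that disk)."""
    return block % num_disks, block // num_disks


def stripe_layout(num_blocks, num_disks):
    """List which logical blocks land on each disk of the array."""
    layout = [[] for _ in range(num_disks)]
    for block in range(num_blocks):
        disk, _offset = stripe_location(block, num_disks)
        layout[disk].append(block)
    return layout


if __name__ == "__main__":
    # An eight-block file on a four-disk array: blocks 0 through 3 sit on
    # four different disks, so they can be transferred in parallel.
    for disk, blocks in enumerate(stripe_layout(8, 4)):
        print(f"disk {disk}: blocks {blocks}")
```

Because consecutive blocks never share a disk until the array wraps around, a sequential read keeps every drive busy at once, which is where the "multiply by the number of drives" figure comes from.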

Another plus for RAID designs is that the right kind of RAID can handle multiple small read requests. This can vastly increase the effective speed of disks used in network servers.

In practice, many considerations can drop the RAID performance edge to a less impressive level. Some RAID levels are not well suited for network operating systems (NOS) or multiuser operating systems.

For example, a network file-server with dozens of users requiring access to data scattered hither and yon across the disks seems tailor made for RAID. And, it is, if it’s the right kind of RAID. A RAID implementation that’s meant for large sequential data reads and writes simply won’t cut it on a file-server. Such a RAID controller might work well on a dedicated database engine, but there’s little else in the microcomputer world where such designs can play a role. You must be certain that any particular RAID design fits your needs.

Another concern that limits RAID’s power boost is that operating systems like MS-DOS, OS/2 and most flavors of Unix require every block of a file to be on one drive. This cuts out RAID’s ability to improve throughput by accessing multiple drives concurrently.

The moral of the story is that the operating system can determine how effective a RAID really is. For the maximum in RAID benefit, the drives should be coupled with an operating system like Novell Netware or some types of Unix that can distribute a file’s blocks across the entire array rather than one disk. Another alternative is a RAID controller that can fool the operating system into thinking that the RAIDed disks are one remarkably large disk.

The second RAID advantage is that these drives should prove cheaper than their SLED equivalents. Note that I say, ‘should.’ RAID technology is just emerging from the starting gate, and to date, its prices are not low. At this point, we’re still paying for RAID research and development.

Minuses

It’s real easy to see RAID’s Achilles’ heel. RAID gets its performance and megabyte for the dollar bang from putting multiple cheap disks into a single logical array. Now, to find the Mean Time to Failure (MTTF) for that array, simply take the MTTF of one disk and divide it by the total number of disks in the array. For instance, many hard drives have an MTTF of 20,000 hours. That’s not bad. Now, in a RAID with 10 of these drives, the MTTF drops to an appalling 2,000 hours. In other words, you can be reasonably sure that the RAID will fail within a year of normal business use. Ouch.
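The arithmetic is simple enough to check in a few lines of Python. This is only a sketch of the divide-by-disk-count rule described above; the 40-hour work week is my assumption for what "normal business use" means, and the function names are illustrative.

```python
def array_mttf(disk_mttf_hours, num_disks):
    """MTTF of the whole array, using the divide-by-disk-count rule."""
    return disk_mttf_hours / num_disks


def weeks_until_failure(mttf_hours, hours_per_week=40):
    """Convert an MTTF in hours to working weeks.

    hours_per_week=40 is an assumed figure for normal business use.
    """
    return mttf_hours / hours_per_week


if __name__ == "__main__":
    mttf = array_mttf(20_000, 10)
    print(f"Array MTTF: {mttf:.0f} hours")
    print(f"About {weeks_until_failure(mttf):.0f} working weeks")
```

At 40 hours a week, 2,000 hours works out to 50 working weeks, which is why the array can be expected to fail inside a year.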

Luckily, there is a way of getting around that painful MTTF: fault tolerance. This key concept unlocks RAID’s potential. Fault tolerance’s importance to RAID can be judged by the fact that it’s what Patterson used to define RAID’s levels.

Levels of RAID

The first, or zero, level of RAID doesn’t use fault-tolerance at all. The zero level relies upon high MTTFs on each drive to protect the RAID from disaster. Systems built around this idea tend to be faster than a bat out of hell. Cynics might say that they’re about as reliable.

RAID 0 drives, if carefully designed, can work quite well. The secret is to have only a few drives in the array and to make certain that these drives are highly reliable. This brings the array’s MTTF to industry-acceptable levels.

MicroNet Technology takes this approach with their Macintosh-specific Raven disk array storage system. The Raven SBT-1288NPR uses a pair of Seagate WrenRunner 2 644 MB drives and two of MicroNet’s NuPORT SCSI-2 host adapters.

The result is one of the fastest drives ever for a Macintosh. The WrenRunner 2 drives run at 5,400 RPM, about 50 percent faster than conventional drives. Combine this with the RAID advantage and MicroNet’s ‘Overlapping Seek’ data search algorithms and you get a gigabyte plus of storage with access times that dip as low as 6 milliseconds. Even more impressive, the SBT-1288NPR can sustain 4.4 MB per second data transfers. Can you say “Zoom?”

High-end workstations can make the best use of Level 0 RAIDs. Level 0 sub-systems would not work well on file-servers where the limited number of disks in an array would limit the speed benefits obtainable from multiple read requests.

First Level

Level 1 RAIDs rely upon that old stand-by of fault tolerance, disk mirroring, to protect their data. Level 1 RAIDs are safe, sure, and easy to make. There are probably more Level 1 RAID designs now in production than any other. Disk mirroring, however, means that only half a RAID’s maximum disk space can be used for data storage. That’s too high a price for safety for many users.

Missing disk space is the most clear-cut problem with disk mirroring, but it’s not the only one. Disk mirroring also slows I/O because of the need to read and write from two disks. In uni-processor systems with unintelligent controllers, the I/O performance drop can be as bad as 50 percent. Multiprocessors and controllers with onboard processors go a long way to removing this performance sting.

Are you willing to pay the price of half your disk’s room for RAID’s performance benefits? RAID vendors are betting that you are.

There are two reasons why these companies are making this bet. First, disk mirroring is cheap from a production point of view. Any controller that can handle multiple active drives can be put into service as a RAID 1 controller with the proper software driver. While this keeps ST-506 and ESDI controllers out (they can handle only one active drive at a time), SCSI controllers have little trouble being retrofitted into RAID Level 1 controllers.

The other reason is that RAID Level 1 systems are not really competing with SLEDs. Users currently using disk mirroring are the customers for Level 1. From this perspective, Level 1 makes a great deal of sense. Level 1 RAIDs are faster than pure disk mirroring systems, but provide the same safety net.

Network administrators probably will find RAID 1 an attractive alternative. It’s inexpensive and provides the benefits of disk mirroring with less of a performance hit.

Novell Netware 386, the most popular NOS, supports Level 1 RAID. Data General’s DG/UX operating system for their AViiON workstations also enables Level 1 RAID. Other workstation vendors that are positioning their machines as network servers will be adding RAID 1 to their offerings. By the time this sees print, there is no doubt that other vendors will be producing RAID 1 software.

Second Level

The least interesting RAID Level for microcomputer users is Level 2. At this level, data is safeguarded by bit-interleaving data across the entire disk array with Hamming error-correction codes. This takes up less room than disk mirroring, but that’s about its only virtue for small computer users.

Several disks must be assigned as check disks to store error correction codes. Level 2 eats up about 40% of available disk space. Level 2 also requires the controller or CPU to be constantly generating error-correction codes. Worse yet from a performance standpoint, every disk in the array must be accessed for a single data read or write.

All of this makes Level 2’s small-file data transfer rate, in a word, awful. An unadorned Level 2 array simply isn’t suitable for PC or file-server use. No one makes Level 2 RAIDs for PCs; if anyone did, no one should buy them.

Third Level

Another of Level 2’s problems is that its check disks are redundant. It’s a simple enough task to enable a disk controller to be able to tell when a specific drive in an array has failed. Even detecting sector failures isn’t that much trouble. Level 3 uses the idea that information on a failed disk or sector can be restored with a single check disk.

Level 3 guards against data loss by parity checking. Level 3 error-correction works by calculating a parity value for each byte. In parity checking, an extra bit holds the parity value for each byte. Systems that use ‘even’ parity checking set the parity bit to ‘1’ if the number of 1 bits in the byte is odd, so that the total count of 1s, parity bit included, comes out even. If the count is already even, then the parity bit is ‘0.’

How does this work to restore data if a disk in a RAID bites the big one? Each byte’s parity in the intact data disks and the check disk can be used to determine a new parity. This is compared to the parity of the array before the failure. If the parities are not the same, then the lost bit was a ‘1’, otherwise the missing bit was a ‘0.’
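Here is a minimal Python sketch of that recovery trick. It assumes the check disk holds the bitwise exclusive-OR (XOR) of the corresponding bytes on the data disks, which is one common way to implement even parity across an array; the function names and sample byte values are mine, for illustration only.

```python
from functools import reduce
from operator import xor


def parity_byte(data_bytes):
    """Check-disk byte: the XOR of the matching byte on every data disk."""
    return reduce(xor, data_bytes, 0)


def rebuild_lost_byte(surviving_bytes, parity):
    """Recover the byte from the failed disk using the survivors and parity."""
    return reduce(xor, surviving_bytes, parity)


if __name__ == "__main__":
    data = [0x4D, 0x6F, 0x73, 0x61]      # one byte from each of four data disks
    check = parity_byte(data)

    lost_index = 2                        # pretend the third disk has died
    survivors = data[:lost_index] + data[lost_index + 1:]
    recovered = rebuild_lost_byte(survivors, check)
    print(f"lost {data[lost_index]:#04x}, recovered {recovered:#04x}")
```

XOR does the "compare the parities" step for every bit at once: wherever the survivors' parity disagrees with the stored parity, the missing bit must have been a 1.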

Besides being a neat way of restoring data, this means that up to 85% of the array’s space can be available for storage. Level 3 gives more storage room to users than either Levels 2 or 1.
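Where does a figure like 85 percent come from? A rough sketch, assuming one check disk per array and equal-sized drives. The seven-disk array below is my example, not a configuration from any vendor, and the function names are illustrative.

```python
def mirrored_usable_fraction():
    """Level 1: every data disk has a twin, so half the space holds data."""
    return 0.5


def single_check_disk_usable_fraction(total_disks):
    """Level 3 style: one disk in the array holds parity, the rest hold data."""
    if total_disks < 2:
        raise ValueError("need at least one data disk and one check disk")
    return (total_disks - 1) / total_disks


if __name__ == "__main__":
    print(f"Mirroring: {mirrored_usable_fraction():.0%} usable")
    for disks in (3, 5, 7):
        usable = single_check_disk_usable_fraction(disks)
        print(f"{disks} disks, one check disk: {usable:.0%} usable")
```

The more data disks share a single check disk, the closer the usable fraction creeps toward 100 percent.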

That’s the good news. The bad news is that Level 3 has some of Level 2’s I/O woes. Unlike Level 2, reads can be made at high speed. Writes are another story. Every time data is written to a disk, either the CPU or a controller processor must generate a new parity value.

This really puts a load on the processor. Even a 50 MHz 486 would show signs of overwork in a transaction heavy environment. In practical terms, Level 3 should not be implemented in software destined for any PC’s main processor.

The need to write the parity values to the check disk also slows Level 3 designs. If that wasn’t enough, Level 3 can only perform a single I/O transaction at a time. Level 3 works fine for large data block transfers. Like Level 2, though, it’s really not well suited for LAN, multiuser, or workstation use.

Fourth Level

The primary difference between Level 3 and Level 4 is the level of data interleave and parity checking. In Level 4, data is interleaved between disks by sector instead of by bits.

The results are faster data reads because several reads can be conducted at once if the reads aren’t to the same disk. Write speeds are still hampered because the parity drive must be updated every time there’s a write. Overall effective performance is dramatically better than RAIDs 1 through 3, though. That’s because reads make up the vast majority of primary storage interactions.

Level 4 small data transfer I/O also gets a kick in the pants because the parity calculation is simpler. Level 3’s parity calculations are not difficult, but they are processor killers because every disk in the array must be consulted. Level 4 sidesteps this. In Level 4, only the values of the old data, the new data, and the old parity are used to calculate parity. Write operations take up far less time. Unfortunately, only a single write can be done at a time.
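The shortcut can be sketched in Python, assuming XOR parity. The names and byte values are illustrative; the point is that the update touches three values no matter how many disks are in the array.

```python
from functools import reduce
from operator import xor


def full_parity(data_blocks):
    """Parity computed the long way, by consulting every data disk."""
    return reduce(xor, data_blocks, 0)


def small_write_parity(old_data, new_data, old_parity):
    """Parity computed from only the old data, new data, and old parity."""
    return old_parity ^ old_data ^ new_data


if __name__ == "__main__":
    stripe = [0x11, 0x22, 0x33, 0x44]     # one value per data disk
    old_parity = full_parity(stripe)

    # Overwrite the value on the third disk only.
    new_value = 0x5A
    shortcut = small_write_parity(stripe[2], new_value, old_parity)
    stripe[2] = new_value

    print(f"shortcut parity {shortcut:#04x}, full parity {full_parity(stripe):#04x}")
```

Both routes give the same answer, but the shortcut needs two reads and two writes regardless of array width.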

The right combination of cache and intelligent controller can overcome this slowdown. Dell Computers takes this approach with their Dell Drive Array (DDA).

DDA

The DDA starts with a high-performance 32-bit EISA disk controller. This controller uses a dedicated 16 MHz Intel RISC 960 microprocessor to generate parity values. The processor controls both data access and layout. In turn, the i960 gets its marching orders from instructions stored in 512K of 32-bit firmware ROM. These instructions are supplemented by optional dynamically loaded firmware that can be loaded in a 256K Static RAM (SRAM) storage area. This means that when Dell improves the RAID’s code, the new and improved firmware can be loaded as software. Using DDA means never having to say you’re sorry that you’re locked into obsolete firmware. The SRAM also can be used as a cache.

DDA uses an Intel 82355 bus master interface chip to connect with the EISA bus. This combination can support a burst transfer rate of up to 33 MB per second. In real world applications, the DDA can sustain up to 5 MB per second transfer rates.

The DDA can handle up to ten 200 MB integrated drive electronics (IDE) drives for a total capacity of 2 GB. To work, the DDA must have at least 2 drives. The drives themselves have an average access time of 16 milliseconds. The speed of the DDA itself depends on its configuration.

The DDA can be set up to support simultaneous reads. In this mode, up to five concurrent unrelated data reads can occur at once. While ideal for network servers, this comes at the cost of fault tolerance. While in simultaneous seek mode, the protection of data redundancy is unavailable.

In DDA’s other mode, data striping works with Level 4 data guarding. In this setup, the DDA gains the bandwidth advantages of being able to read data from logically concurrent sectors across the width of the array.

Either mode makes Dell’s disks faster than their SLED counterparts. Your system requirements will determine which setup will work best for you. Workstation users will clearly be better off with full Level 4 protection. Network administrators will have a much harder time deciding which mode to use.

Compatibility shouldn’t be a problem for anyone. From an operating system point of view, the DDA looks like the popular Adaptec 1540 SCSI controller. In addition, DDA directly supports MS-DOS, OS/2, Unix and Novell Netware.

Fifth Level

At Level 5, the parity disk bottleneck is broken. Parity information is stored directly on the data disks. This means that up to 85 percent of the disk can be used for data without the I/O hassles of Level 3. Even more important, Level 5 supports multiple simultaneous reads and writes.
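One way to picture parity living on the data disks is to rotate the parity block from disk to disk, stripe by stripe. The rotation rule below is an assumption on my part, chosen for simplicity; actual Level 5 products may place parity differently, and the function names are illustrative.

```python
def parity_disk(stripe, num_disks):
    """Which disk holds parity for a given stripe (assumed rotation rule)."""
    return (num_disks - 1) - (stripe % num_disks)


def stripe_map(stripe, num_disks):
    """Mark each disk in a stripe as data ('D') or parity ('P')."""
    holder = parity_disk(stripe, num_disks)
    return ["P" if disk == holder else "D" for disk in range(num_disks)]


if __name__ == "__main__":
    for stripe in range(4):
        print(f"stripe {stripe}: {' '.join(stripe_map(stripe, 4))}")
```

Since no single drive holds all the parity, two writes to different stripes usually update parity on different disks and can proceed at the same time.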

The 5th level of RAID promises the most, but it’s also the hardest to create. A dedicated processor on the controller is a must for Level 5. The processor must not only handle making and tracking parity check bytes, it must also be faster than greased lightning to handle the I/O demands.

There are three ways to put Level 5 to work. In the first, the existing data and parity are read and then a transient parity value is generated by removing the old data from the equation. This transient parity is then used with the new data to create the new parity value.

The second method uses data that will not be changed by the write transaction with the new data to create a new parity value. Afterwards, the new data and parity are written to disk.

The final way of obtaining parity values in Level 5 is not to bother reading existing data or parity values. Instead, the controller waits for two new bytes to be written and then creates the parity value from the incoming information. The advantage to this is that the controller doesn’t need to waste time reading from the disk every time a write request comes in.
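The three approaches can be compared side by side in a short Python sketch. It assumes XOR parity and a stripe of three data values; the function names are my labels for the three methods, not terms from any vendor.

```python
from functools import reduce
from operator import xor


def read_modify_write(old_data, old_parity, new_data):
    """Method one: strip the old data out of the parity, then fold in the new."""
    transient = old_parity ^ old_data
    return transient ^ new_data


def reconstruct_write(unchanged_data, new_data):
    """Method two: combine the data the write leaves alone with the new data."""
    return reduce(xor, unchanged_data, new_data)


def full_stripe_write(incoming_data):
    """Method three: build parity purely from incoming data, reading nothing."""
    return reduce(xor, incoming_data, 0)


if __name__ == "__main__":
    old_stripe = [0xA1, 0x0F, 0x3C]
    old_parity = reduce(xor, old_stripe, 0)
    new_value = 0x77                      # replaces the middle value

    one = read_modify_write(old_stripe[1], old_parity, new_value)
    two = reconstruct_write([old_stripe[0], old_stripe[2]], new_value)
    three = full_stripe_write([old_stripe[0], new_value, old_stripe[2]])
    print(f"{one:#04x} {two:#04x} {three:#04x}")
```

All three arrive at the same parity value. What differs is how much has to be read from disk first, which is what decides their speed.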

Well-known hard disk manufacturer Micropolis has been a leader in bringing RAID 5 to the marketplace. At this time, they are not shipping a RAID 5 product. There will soon be, however, a hardware implementation of RAID 5 for their Model 2112 1.085 GB drives.

Future RAID

Make no mistake about it, RAID systems are coming. With the coming of processors like the i960, it’s now possible to make controllers with the necessary smarts to deal with RAID’s processor demands. With that technical barrier out of the way, RAID controllers will enter the marketplace in increasing numbers as design problems are ironed out.

At this time, RAIDs are too expensive for any but the most demanding LAN or workstation users. The technology’s price will drop. As this happens, RAID designs’ speed and safety features will make them the mass storage systems of choice for the rest of the 1990s.

A version of this story was first published in Byte.

January 1, 1994
by sjvn01

Best Buy-Operating System: OS/2 2.1

Who says you can’t teach an old dog new tricks? For years, IBM stayed out of the direct market and was beaten at every turn by Microsoft in the operating systems wars. Not anymore. IBM has entered the direct market with a flourish and OS/2 2.1 leads the way with our Best Buy award for operating systems.

In a year flooded with new operating systems (UnixWare, Windows NT, Solaris and NeXTStep), OS/2 emerged victorious. OS/2 has done more than just beat the newcomers, though; it has broken Microsoft’s iron grip on today’s computers.

It doesn’t take a genius to see why OS/2 emerged triumphant. OS/2 liberated the power in today’s 32-bit processors. Finally, ordinary end-users could really put all their memory and the multitasking ability of 386s and 486s to work.

If that wasn’t enough, OS/2 lets you use your old Windows and MS-DOS programs. The only things users have to lose by switching to OS/2 are the chains of archaic operating systems. With OS/2 you can truly run multiple DOS, Windows and OS/2 programs with OS/2’s pre-emptive multitasking.

OS/2’s graphical user interface, Presentation Manager, is also a winner. Windows users will find it familiar enough so that they won’t suffer from operating system culture shock.

OS/2 is speedy as well. While some 32-bit operating systems have so much overhead that even a Pentium feels like a 12 MHz 286, OS/2 is lean, mean and gets your jobs done in double-quick time.

Another plus for OS/2 is IBM’s legendary support. If your new program goes haywire under OS/2 at midnight on Friday night, you don’t have to wait for Monday morning to get help. IBM’s HelpCenter is open 24 hours a day, seven days a week.

Some critics say that there aren’t enough OS/2 programs and that OS/2 2.1 still lacks drivers. Neither argument held much water for Shopper’s readers. More OS/2 programs and drivers are emerging by the day.

Shopper’s readers have looked at the future, and what they see there is OS/2. The microcomputing world may never be the same.

A version of this story first appeared in Computer Shopper.

August 14, 1993
by sjvn01

NeXTStep Brings Objectivity to Operating Systems

With the arrival of NeXTStep for Intel, object-oriented operating systems are no longer the stuff of science fiction and vaporware for PC users. NeXT Corporation’s $795 workstation operating system brings a distinctly different look and feel to today’s PCs.

Other than a pretty front-end, what do you get from NeXTStep? Well, you get several things. One is an interface, Workspace Manager, that’s a delight to use. With NeXTStep, there’s finally an interface for the PC that rivals, and even surpasses, that of the Macintosh.

The interface is completely object oriented. That means that Workspace Manager’s individual elements (icons, menus and windows) can be taken apart and sewn back together to form an interface that’s custom tailored for the way you work.


April 20, 1993
by sjvn01

WAIS and WEB: the future of Internet data searching.

Every now and again, you hit a home run: you get on the story before anyone else does. My shot at journalistic glory came with this piece when I was the first author in a mass-market magazine to point out that the Web was something special.

Computer Shopper, April 1993 v13 n4
WAIS and WEB: the future of Internet data searching.
Steven J. Vaughan-Nichols.

Lately, the Internet has been bursting with new services. Last time around, I looked at Gopher, an ingenious tool for searching the Internet’s information resources. Today, I’m going to peek at two of the newest programs in the Internet arsenal: WAIS and WEB.

THE WAYS OF WAIS

WAIS (pronounced “wayz”) works something like Gopher in that it’s a tool for finding information and resources on the Internet. With Gopher, though, you need to point it in a certain direction via its menus. Gopher is very easy to use, but you can facilitate things greatly if you have an idea where something is before you go looking for it.

WAIS, on the other hand, does the real leg-work for you. Of course, WAIS can’t do everything. It’s not capable of searching willy-nilly through public directories throughout the Internet universe (which is fortunate; otherwise, it would eat up network bandwidth like peanuts). Instead, WAIS relies on indexed data collections or, as they’re called, libraries.

These libraries are file collections consisting mostly of informational material. For example, if molecular biology is your meat and drink, several journals on the subject are available online via WAIS libraries.

Every WAIS client on your local system or accessible via telnet from a remote site knows where to find the WAIS libraries. Presently, more than 300 free libraries exist. These data collections have been indexed and made available mostly by volunteers at academic sites. Commercial WAIS libraries, like the Dow Jones Information Service, are also available.

Most libraries are free; this means that WAIS data can be very spotty. For example, you won’t be greatly surprised to know that computer-science subjects are well-covered; however, if you want to know something about antique cars, you’re out of luck …for now. WAIS libraries keep springing up at a surprising rate, and there’s no telling what may be available by the time you read this. I was recently bemused to discover a WAIS library of technical documentation on the Musical Instrument Digital Interface (MIDI) during a data hunt.

WAIS itself is mindlessly simple to use. WAIS clients are available for everything from Macintoshes to PCs to supercomputers. In every case, you simply key in words for WAIS to search on and then let her rip. WAIS will respond with a listing of libraries where it thinks the information you’re looking for is hiding out.

You could instruct WAIS to search everywhere, but that would be a waste of time. For instance, searching for information about chess in the WAIS library devoted to the Simpsons (yes, the cartoon) won’t do you any good.

Armed with a library’s listing, you pick out the most likely targets, and WAIS begins to narrow down its search. If all goes well, you’ll be looking at the documents concerning your subject in a matter of seconds.

WAIS SEARCHING IS LIMITED

Then again, maybe you won’t. WAIS searching doesn’t recognize any of the Boolean search terms. In other words, while I can search for references to “Cyrix and 486,” I won’t get just documents that contain both terms. Instead, WAIS uses an internal weighting system that measures the value of each term for the search, including the “and.” It’s possible that an article which contains many mentions of “and” and “486” will be tagged by WAIS as more important than a short document containing “Cyrix” in the title, even if all further references are to c486SLC and c486DLC. It makes my researcher blood boil just thinking about it.
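To see how this can go wrong, here is a toy Python sketch. It is not WAIS’s actual weighting scheme, which I haven’t described here; it simply assumes that every query word, “and” included, adds to a document’s score each time it appears. The scoring function and the sample documents are made up for illustration.

```python
from collections import Counter


def score(query, document):
    """Toy relevance score: count every occurrence of every query word."""
    words = Counter(document.lower().split())
    return sum(words[term] for term in query.lower().split())


if __name__ == "__main__":
    query = "Cyrix and 486"
    rambling = "the 486 and the 386 and the 486 again and again"
    on_target = "Cyrix c486SLC review"

    print("rambling: ", score(query, rambling))
    print("on target:", score(query, on_target))
```

The rambling document wins on sheer repetition of “and” and “486,” while the piece that is actually about Cyrix chips scores only for the single word “Cyrix,” because “c486SLC” doesn’t match “486” as a whole word.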

Why use WAIS then? Because it does just fine at simple, one-term searches, and, more importantly, it has the unique ability to perform “relevance feedback” searches. Say I find an article on Cyrix 486 chips that exactly hits the spot. I can then pull terms from that document and use them to start a new search. WAIS gives me the ability, once I’m on the right trail, to sprint down the path to other relevant articles. For this ability alone, WAIS, even in its current teething stage, is an excellent information-gathering tool.
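The idea behind relevance feedback can be sketched in a few lines of Python. This is a simplified illustration, not how WAIS itself picks terms: it assumes the most frequent words in the good document, minus a tiny made-up list of filler words, make a sensible follow-up query.

```python
from collections import Counter

# A deliberately tiny, assumed list of filler words to ignore.
STOPWORDS = {"the", "a", "of", "and", "on", "in", "to"}


def feedback_terms(document, how_many=3):
    """Pull the most frequent non-filler words from a document that hit the spot."""
    words = [w for w in document.lower().split() if w not in STOPWORDS]
    return [term for term, _count in Counter(words).most_common(how_many)]


if __name__ == "__main__":
    good_article = "Cyrix 486 chips and the Cyrix clock doubling on the Cyrix 486 cache"
    print("next search:", " ".join(feedback_terms(good_article, 2)))
```

Feeding those terms back in as the next query is what lets one good find lead to others like it.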

CAUGHT IN THE WEB

World-Wide Web (WEB) is still a development project, but it is publicly accessible and it provides Internet information hunters with greater power. WEB brings hypertext to the Internet.

What is hypertext, you ask? It’s a way to look at documents that, while not unique to computers, makes full use of a computer’s ability to interconnect data. In a hypertext document, certain words are links to other documents or files. For instance, in a biography of Grace Hopper, you could jump from a description of her inventing COBOL to a manual on the language, and from a reference in it to Unix to an article by yours truly on our favorite operating system.

WEB takes the hypertext idea and applies it to information available on the Internet. The result is potentially the most powerful automated information-gathering tool in existence.

Alas, for now, WEB remains mostly potential. The WEB server is only available by telneting to info.cern.ch or nxo01.cern.ch. Its full hypertext informational resources are limited at this time, but they are growing. WEB is the informational wave of the future.

Like Gopher and WAIS, WEB boasts several easy-to-use interfaces. Of course, the read-only version of WEB really only has two commands, so it’s not hard to make it easy to use. These are: Start a search and follow a link. That’s it. WEB takes care of the rest. This leads to a quite different way of looking at information. For example, you can use WEB to wander about WAIS libraries and leap from term to term, regardless of a document’s format or location.

Unfortunately, since much of the data that WEB deals with isn’t in hypertext format, WEB usually comes across as a slower version of WAIS with a more consistent interface. This is true now, but as more true hypertext documents become available, WEB’s uniquely strong searching capacities will stand out more and more.

October 29, 1992
by sjvn01

Gophering the Internet

Let’s face it, getting the most out of the Internet isn’t easy. Even archie, described in my last visit, only helps with one specific area of net use, finding and ftping files. Probably more than a few of you have been saying, “The net is neat but why does it have to be so hard?” Well, these days, with the right software, it doesn’t have to be so hard. There are two user-friendly programs that make using the Internet’s resources easier than ever before: gopher and wais.

Before this dynamic duo showed up, some of the net’s most valuable resources were only available to a lucky few in the know. The most important of these resources are the online databases. These systems provide public access to everything from library catalogs to technical documentation collections. Unfortunately few people knew how to access these databases. Now, with gopher at your side, you can liberate this information for your own uses.

Gopher

Gopher and wais may sound like Ren and Stimpy, but they’re anything but cartoons. Unlike the other tools I’ve been looking at, ftp and archie, gopher is a general purpose information tool. Gopher builds on the foundation of ftp, archie and other information sources to erect an easy-to-use, menu-driven interface to the net’s file and informational resources.

Gopher was ‘born’ at the University of Minnesota, home of the Golden Gophers. Its name is a bad pun on the University’s sports team name and the program’s purpose, ‘go-fer’ the data gopher!

Unlike archie, which relies on a centralized archie database of ftpable files, gopher doesn’t rely on any particular data collection. To use the analogy of a library, archie is like a card catalog dedicated to publicly available files. Gopher, on the other hand, is like a librarian. Gopher doesn’t know where a particular item is, but it does know where to find out where information is hiding.

The best thing about gopher is that you don’t have to have a clue about where some file or bit of information is. IP addresses, file formats, domain names? Forget ’em. With Gopher you don’t need to know Unix esoterica. Gopher does the dirty work; all you have to do is pose the questions.

Getting a Gopher

To get gopher started you should have a gopher client on your system. If you don’t, you can telnet your way to a site with a publicly accessible gopher client the same way you can access archie.

You shouldn’t have to do this, however. Gopher client programs are free and come in makes and models for almost every architecture and operating system under the sun. While the Unix character-based interface is the most common front-end to gopher, you can also get a HyperCard-style gopher for the Macintosh and a DOS character-based sub-species for PCs. You can get the right one for your system by using archie to find a nearby site with ftpable gopher files. If all else fails, you can always find the gopher clients at boom-box.micro.umn.edu in the pub/gopher directory. Always look for a closer site first, however. Everyone who’s a net expert knows about boom-box, and that site can be very busy.

Once you’ve installed the program, usually a simple task, although you will require system administrator privileges on Unix systems, you type gopher at your command prompt. You’re then presented with a set of menus, and you select the choice that looks like the best path to your informational destination. You could, for instance, decide that you want to find a library with a copy of the newest Tom Clancy thriller.

In the pre-gopher days, you’d do this by telneting to every computerized library catalog you could think of. Right, like everyone knows IP addresses or domain names for automated card catalogs. Even after you found your library and its access point, you then faced the problem of how to log into the system. Some just let you right in, some require a user id of ‘guest,’ others, ‘anonymous’ and so on. And, if you made it that far, you’d have to figure out each system’s particular idiosyncrasies. It’s no wonder that until recently Internet information access has been a black art practiced only by net gurus.

Go-Fer It.

With gopher, however, the gopher server takes care of all this. Your gopher client starts looking for the information. It does this by first checking for local resources, usually a gopher server, or by telneting to a preset gopher service.
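
If you’re curious what that conversation looks like on the wire, here’s a rough Python sketch. It assumes the conventions of the published Gopher protocol description (RFC 1436), which this column doesn’t otherwise cover: the client opens a TCP connection, normally to port 70, sends a selector string followed by a carriage return and line feed, and reads until the server hangs up. The host gopher.example.edu and the function names are placeholders of my own, not part of any real client.

```python
import socket

# Hypothetical default server. A real client ships with its own preset
# server, which you would change to the closest one available.
DEFAULT_SERVER = ("gopher.example.edu", 70)


def build_request(selector=""):
    """Return the bytes a client sends: the selector plus CR LF.

    An empty selector asks the server for its top-level menu.
    """
    return selector.encode("latin-1") + b"\r\n"


def fetch(selector="", server=DEFAULT_SERVER):
    """Send one request and read the reply until the server hangs up."""
    host, port = server
    with socket.create_connection((host, port), timeout=30) as conn:
        conn.sendall(build_request(selector))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # an empty read means the server closed the link
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling fetch() with no selector asks the server for its top-level menu. Nothing here touches the network until you call fetch() yourself.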

Gopher clients come with a pre-coded gopher server they look to for information, but this can, and should, be changed to point to the closest available gopher server. The server, presented with your request, then tries to figure out where to find the information. All you know, sitting at your desk, is that a few seconds after you start your inquiry, gopher has presented you with menu choices that take you closer to your destination.

These choices come in two forms: resources and directories. A directory, marked with a ‘/’ at the end of its menu item, indicates that choosing this item will lead you to a sub-menu. Resources are, as the name indicates, actual sources of information.
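
To make the distinction concrete, here’s a small Python sketch of how a client might turn a server’s raw menu into what you see on screen. It assumes the menu layout from the Gopher protocol description (RFC 1436), where each line starts with a one-character type (‘1’ for a directory, ‘0’ for a plain text file) followed by a tab-separated title, selector, host and port. The sample menu, host name and function names are invented for illustration.

```python
from typing import NamedTuple


class MenuItem(NamedTuple):
    kind: str      # one-character item type, e.g. "0" or "1"
    title: str     # text shown to the user
    selector: str  # string sent back to the server to fetch the item
    host: str
    port: int


def parse_menu(raw):
    """Split a raw menu into items, stopping at the lone "." end marker."""
    items = []
    for line in raw.splitlines():
        if line == ".":
            break
        if not line:
            continue
        fields = line[1:].split("\t")
        if len(fields) < 4:
            continue  # skip anything that isn't a well-formed menu line
        title, selector, host, port = fields[:4]
        items.append(MenuItem(line[0], title, selector, host, int(port)))
    return items


def render(items):
    """Number the items and mark directories with a trailing "/"."""
    return [
        f"{n}. {item.title}{'/' if item.kind == '1' else ''}"
        for n, item in enumerate(items, start=1)
    ]


# Made-up menu: one directory and one text resource.
SAMPLE = (
    "1Libraries\t1/libraries\tgopher.example.edu\t70\r\n"
    "0About this server\t0/about\tgopher.example.edu\t70\r\n"
    ".\r\n"
)

for row in render(parse_menu(SAMPLE)):
    print(row)
```

Run as is, this prints ‘1. Libraries/’ and ‘2. About this server’. The first is a directory that leads to a sub-menu; the second is a resource.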

From the menus, you proceed to narrow down your choices until you reach an appropriate resource. In our search for our Clancy high-tech shoot-’em-up, for example, we can probably live without looking into card catalogs for libraries thousands of miles away.

Eventually, you end up with what you and gopher agree is probably a system or program that can supply you with the information or file you need. At this point, you and gopher go-fer it.

When you access a resource, gopher takes over the job of logging in to the computer and service. Gopher also shields you from the local system. No matter what you’re logged into, you use gopher’s search interface, not the remote system’s.

This has one great advantage: you never have to learn the ins and outs of a database you may only use once. There are, however, two mirror-image problems with gopher’s approach. The first is that while gopher can perform fairly complicated searches, you may not know if the software gopher is talking to can handle them.

Archie, for example, can only search on a single word. Or, you could try searching for, say, ‘386DX and Unix.’ Some systems take that to mean you want to know about books or articles that contain both the words ‘386DX’ and ‘Unix.’ Others assume you really want the phrase ‘386DX and Unix.’ With gopher in the way, you’ll only know that your searches are going wrong.
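
Here’s a toy Python illustration of those two readings. The titles and function names are made up, and no real catalog works exactly this way; the point is only that the same query gives different answers depending on which rule the system behind gopher follows. Both functions ignore case to keep the comparison simple.

```python
import re


def match_all_words(query, text):
    """Boolean reading: 'and' is an operator; every other word must appear."""
    wanted = [w for w in re.findall(r"\w+", query.lower()) if w != "and"]
    present = re.findall(r"\w+", text.lower())
    return all(w in present for w in wanted)


def match_phrase(query, text):
    """Literal reading: the whole query, 'and' included, must appear as typed."""
    return query.lower() in text.lower()


# Invented catalog entries.
TITLES = [
    "Running Unix on the 386DX",
    "386DX and Unix: a buyer's guide",
    "Unix for beginners",
]

QUERY = "386DX and Unix"
boolean_hits = [t for t in TITLES if match_all_words(QUERY, t)]
phrase_hits = [t for t in TITLES if match_phrase(QUERY, t)]

print("Both words:", boolean_hits)
print("Exact phrase:", phrase_hits)
```

Under the first reading, both ‘Running Unix on the 386DX’ and the buyer’s guide match; under the second, only the buyer’s guide does.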

The flip side to this is that gopher defaults to lowest-common-denominator searching. The resource you’re accessing may be capable of very precise searches, but you’ll be limited to gopher’s search capacities. Gopher, for instance, can’t tell the difference between lowercase and uppercase. This may not be a big deal if you’re only occasionally on a data hunt, but big-time information hunters will want to throw gopher out the window. At least they’ll feel that way until they recall how much work hunting for information without gopher is.
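
A tiny Python sketch shows what that case-blindness costs you. The titles are invented, and the two functions are stand-ins of mine for a case-blind search and a case-sensitive one, not anyone’s actual search code.

```python
def case_blind_match(query, text):
    """Match with upper and lower case treated as the same."""
    return query.lower() in text.lower()


def case_sensitive_match(query, text):
    """Match the way a more precise engine might: case must agree."""
    return query in text


# Invented catalog entries.
TITLES = [
    "Setting up SLIP on a PC",
    "How not to slip on the ice",
]

QUERY = "SLIP"
blind_hits = [t for t in TITLES if case_blind_match(QUERY, t)]
exact_hits = [t for t in TITLES if case_sensitive_match(QUERY, t)]

print(len(blind_hits), "case-blind hits,", len(exact_hits), "case-sensitive hit")
```

Hunting for the SLIP protocol, the case-blind search also hands you the advice about ice; the case-sensitive one doesn’t.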

Another thing you should keep in mind is that sometimes gopher may dig up an information resource for you that you can’t access. The most common example of this is the news services. The UPI news feed, for example, is available on many academic sites, but it’s inaccessible from most commercial sites no matter what the gopher menu says.

Another interesting point is that not all gopher servers are the same. Some servers may be much stronger in certain areas than they are in others. That’s because gopher servers tend to be best connected to local resources. The original gopher server at the University of Minnesota, to no surprise, is filled with information resources from that school. One of the neater things about gopher, however, is that you’re not limited to a single server. You can use gopher to hunt for other gopher servers that might give you access to information that’s more your speed.

Flaws and all, and even though you’ll never mistake gopher for such powerful, single-purpose online information retrieval engines as Ziffnet’s Computer Library, gopher does have its good points. Because it brings the net’s almost limitless information resources within your reach, gopher is an invaluable tool for any Internet explorer. Gopher’s not the only tool that’s making the Internet a better place for information hunters. Next time around, I’ll take a look at wais and, still making the transition from the experimental to the essential, the Web.

A version of this story was published in Computer Shopper in 1993.
