WAIS and WEB: the future of Internet data searching.

Every now and again, you hit a home run: you get on the story before anyone else does. My shot at journalistic glory came with this piece, when I was the first author in a mass-market magazine to point out that the Web was something special.

Computer Shopper, April 1993 v13 n4
WAIS and WEB: the future of Internet data searching.
Steven J. Vaughan-Nichols.

Lately, the Internet has been bursting with new services. Last time around, I looked at Gopher, an ingenious tool for searching the Internet’s information resources. Today, I’m going to peek at two of the newest programs in the Internet arsenal: WAIS and WEB.

THE WAYS OF WAIS

WAIS (pronounced “wayz”) works something like Gopher in that it’s a tool for finding information and resources on the Internet. With Gopher, though, you need to point it in a certain direction via its menus. Gopher is very easy to use, but you can facilitate things greatly if you have an idea where something is before you go looking for it.

WAIS, on the other hand, does the real leg-work for you. Of course, WAIS can’t do everything. It’s not capable of searching willy-nilly through public directories throughout the Internet universe (which is fortunate; otherwise, it would eat up network bandwidth like peanuts). Instead, WAIS relies on indexed data collections or, as they’re called, libraries.

These libraries are file collections consisting mostly of informational material. For example, if molecular biology is your meat and drink, several journals on the subject are available online via WAIS libraries.

Every WAIS client on your local system or accessible via telnet from a remote site knows where to find the WAIS libraries. Presently, more than 300 free libraries exist. These data collections have been indexed and made available mostly by volunteers at academic sites. Commercial WAIS libraries, like the Dow Jones Information Service, are also available.

Most libraries are free; this means that WAIS data can be very spotty. For example, you won’t be greatly surprised to know that computer-science subjects are well-covered; however, if you want to know something about antique cars, you’re out of luck... for now. WAIS libraries keep springing up at a surprising rate, and there’s no telling what may be available by the time you read this. I was recently bemused to discover a WAIS library of technical documentation on the Musical Instrument Digital Interface (MIDI) during a data hunt.

WAIS itself is mindlessly simple to use. WAIS clients are available for everything from Macintoshes to PCs to supercomputers. In every case, you simply key in words for WAIS to search on and then let her rip. WAIS will respond with a listing of libraries where it thinks the information you’re looking for is hiding out.

You could instruct WAIS to search everywhere, but that would be a waste of time. For instance, searching for information about chess in the WAIS library devoted to the Simpsons (yes, the cartoon) won’t do you any good.

Armed with a library’s listing, you pick out the most likely targets, and WAIS begins to narrow down its search. If all goes well, you’ll be looking at the documents concerning your subject in a matter of seconds.
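For the programmers in the audience, the two-step hunt described above (first find likely libraries, then search only the ones you pick) can be sketched in a few lines of Python. This is an illustration only, not how any real WAIS client is written: the library names, descriptions, and documents are invented, and plain word overlap stands in for whatever matching a real client performs.

```python
# Hypothetical directory of libraries; names and contents are made up.
libraries = {
    "midi-docs": {
        "description": "technical documentation on the MIDI music interface",
        "documents": ["MIDI message format reference",
                      "MIDI timing and clock signals"],
    },
    "simpsons": {
        "description": "episode guides for the Simpsons cartoon",
        "documents": ["Season one episode guide"],
    },
}

def words(text):
    """Split text into a set of lowercase words."""
    return set(text.lower().split())

def find_libraries(query):
    """Step one: list libraries whose description shares a word with the query."""
    wanted = words(query)
    return [name for name, lib in libraries.items()
            if wanted & words(lib["description"])]

def search_libraries(query, chosen):
    """Step two: look through the documents of the chosen libraries only."""
    wanted = words(query)
    return [doc for name in chosen
            for doc in libraries[name]["documents"]
            if wanted & words(doc)]

likely = find_libraries("MIDI timing")
hits = search_libraries("timing", likely)
```

Here `likely` holds only the MIDI library, so the second step never wastes time on the Simpsons.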

WAIS SEARCHING IS LIMITED

Then again, maybe you won’t. WAIS searching doesn’t recognize any of the Boolean search operators. In other words, while I can search for references to “Cyrix and 486,” I won’t get just documents that contain both terms. Instead, WAIS uses an internal weighting system that measures the value of each term for the search, including the “and.” It’s possible that an article which contains many mentions of “and” and “486” will be tagged by WAIS as more important than a short document containing “Cyrix” in the title, even if all further references are to c486SLC and c486DLC. It makes my researcher blood boil just thinking about it.
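To see why this happens, here is a small Python sketch that compares the two approaches. It rests on a loud assumption: WAIS’s internal weighting isn’t spelled out here, so raw word counts stand in for it. The sample documents and the function names are invented for the illustration.

```python
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into bare words."""
    return [w.strip('.,;:"!?()').lower() for w in text.split()]

def weighted_score(query, document):
    """Add up how often each query word occurs; "and" counts like any other word.
    (Assumption: plain counts stand in for the real weighting.)"""
    counts = Counter(tokenize(document))
    return sum(counts[term] for term in tokenize(query))

def boolean_and(terms, document):
    """A true Boolean AND: every term must appear in the document."""
    present = set(tokenize(document))
    return all(term.lower() in present for term in terms)

# Invented sample documents.
docs = {
    "long_486_roundup": "Intel and AMD and others ship 486 and 486 clones and 486 boards",
    "cyrix_brief": "Cyrix announces the c486SLC and c486DLC",
    "cyrix_486_review": "The Cyrix 486 chip reviewed",
}

query = "Cyrix and 486"
ranked = sorted(docs, key=lambda name: weighted_score(query, docs[name]), reverse=True)
both_terms = [name for name in docs if boolean_and(["Cyrix", "486"], docs[name])]
```

Under these assumptions, the roundup that never mentions Cyrix lands at the top of `ranked` on the strength of its “and” and “486” counts, while a Boolean AND would return only the review that contains both terms.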

Why use WAIS then? Because it does just fine at simple, one-term searches, and, more importantly, it has the unique ability to perform “relevance feedback” searches. Say I find an article on Cyrix 486 chips that exactly hits the spot. I can then pull terms from that document and use them to start a new search. WAIS gives me the ability, once I’m on the right trail, to sprint down the path to other relevant articles. For this ability alone, WAIS, even in its current teething stage, is an excellent information-gathering tool.
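The idea behind relevance feedback can also be shown in a short Python sketch. Again, this is a simplification under stated assumptions rather than WAIS’s actual method: it takes the most frequent non-trivial words from the document I liked and uses them as the next query. The stopword list, sample texts, and function names are all invented.

```python
from collections import Counter

# Assumption: a tiny hand-picked list of words too common to be useful.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on"}

def tokenize(text):
    """Lowercase the text and split it into bare words."""
    return [w.strip('.,;:"!?()').lower() for w in text.split()]

def feedback_terms(document, how_many=3):
    """Pull the most frequent non-stopword terms from a document the reader liked."""
    counts = Counter(w for w in tokenize(document) if w and w not in STOPWORDS)
    return [term for term, _ in counts.most_common(how_many)]

def rank(terms, docs):
    """Order document names by how often the feedback terms appear in each."""
    def score(text):
        counts = Counter(tokenize(text))
        return sum(counts[t] for t in terms)
    return sorted(docs, key=lambda name: score(docs[name]), reverse=True)

# Invented sample texts.
liked = "Cyrix 486 chips: the Cyrix c486SLC is a 486 clone for notebooks"
others = {
    "chess_openings": "A guide to chess openings and endgames",
    "clone_chips": "Cyrix and AMD chips challenge Intel in the 486 market",
}

new_query = feedback_terms(liked)
next_reads = rank(new_query, others)
```

The article that hit the spot supplies its own search terms, and the chip story rises above the chess guide without my typing a new query.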

CAUGHT IN THE WEB

World-Wide Web (WEB) is still a development project, but it is publicly accessible and it provides Internet information hunters with greater power. WEB brings hypertext to the Internet.

What is hypertext, you ask? It’s a way to look at documents that, while not unique to computers, makes full use of a computer’s ability to interconnect data. In a hypertext document, certain words are links to other documents or files. For instance, in a biography of Grace Hopper, you could jump from a description of her inventing COBOL to a manual on the language, and from a reference in it to Unix to an article by yours truly on our favorite operating system.
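In programming terms, a hypertext collection is just a set of documents in which some words point at other documents. The Python sketch below models the biography example; the document names, texts, and helper functions are invented for illustration and say nothing about how WEB itself stores its links.

```python
# Hypothetical documents: each has some text and a table of linked words.
documents = {
    "hopper_bio": {
        "text": "A biography that describes the invention of COBOL.",
        "links": {"COBOL": "cobol_manual"},
    },
    "cobol_manual": {
        "text": "A COBOL manual that mentions running under Unix.",
        "links": {"Unix": "unix_article"},
    },
    "unix_article": {
        "text": "An article on our favorite operating system.",
        "links": {},
    },
}

def follow(doc_id, word):
    """Return the name of the document that `word` links to, or None."""
    return documents[doc_id]["links"].get(word)

def trail(start, linked_words):
    """Follow a series of linked words and record every document visited."""
    path = [start]
    for word in linked_words:
        next_doc = follow(path[-1], word)
        if next_doc is None:  # the word is not a link in this document
            break
        path.append(next_doc)
    return path

visited = trail("hopper_bio", ["COBOL", "Unix"])
```

Two jumps take the reader from the biography to the manual to the Unix article, and a word that isn’t a link simply leaves the reader where they are.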

WEB takes the hypertext idea and applies it to information available on the Internet. The result is potentially the most powerful automated information-gathering tool in existence.

Alas, for now, WEB remains mostly potential. The WEB server is only available by telnetting to info.cern.ch or nxo01.cern.ch. Its full hypertext informational resources are limited at this time, but they are growing. WEB is the informational wave of the future.

Like Gopher and WAIS, WEB boasts several easy-to-use interfaces. Of course, the read-only version of WEB really only has two commands, so it’s not hard to make it easy to use. These are: Start a search and follow a link. That’s it. WEB takes care of the rest. This leads to a quite different way of looking at information. For example, you can use WEB to wander about WAIS libraries and leap from term to term, regardless of a document’s format or location.

Unfortunately, since much of the data that WEB deals with isn’t in hypertext format, WEB usually comes across as a slower version of WAIS with a more consistent interface. This is true now, but as more true hypertext documents become available, WEB’s uniquely strong searching capacities will stand out more and more.
