What is the Internet? That question is tremendously difficult to answer because the Internet is so many things to so many different people. Nonetheless, you need a short answer to give your mother when she asks, so here goes:
The Internet consists of a mind-bogglingly huge number of participants, connected machines, software programs, and a massive quantity of information, spread all around the world.
Now, let's see if I can put those various parts into some kind of meaningful context.
To say the Internet is big -- in terms of people, machines, information, and geographic area included -- is to put it mildly. How big is it? Let's take a look and see.
The Seattle Kingdome seats approximately 60,000 people for a sellout Mariners baseball game (a once-in-a-lifetime experience for the hapless Mariners). That's about the same number of people who read a single, mildly popular newsgroup on the Internet. If all of the estimated 30 million people on the Internet (according to some sources; others estimate much lower numbers) were to get together, they'd need 500 stadiums each the size of the Kingdome to have a party. I could calculate how many times a stack of that many people would reach to the moon and back, but I think I've made my point.
In the infancy of the computer industry, IBM once decided that it did not need to get into the computer business because the entire world needed only six computers. Talk about a miscalculation! Many millions of computers of all sizes, shapes, and colors have been sold in the decades since IBM's incorrect assumption. An estimated 3.9 million of these (3,864,000 as of October, 1994, for those of you who like the digits) are currently connected to the Internet. I keep having trouble with these numbers because they change so frequently. In the first edition, I used 2.2 million computers as the base number -- a figure I had to change at the last minute, since the manuscript I'd originally sent to Hayden used 2.0 million, the count from a few months before. When we published Internet Starter Kit for Macintosh, Second Edition in August of 1994, I updated the number to 2.3 million, and here I am, not all that many months later, updating to 3.9 million.
I can't pretend that the Internet offers more pieces of useful information than a good university library system, but that's only because a university has, in theory, a paid staff and funding for acquisitions and development. Information on the Internet is indeed vast, but finding your way around in it proves a daunting task. However, neither can I pretend that finding a given piece of information in a large research library would be any easier without the help of a skilled reference librarian.
Information on the Internet also changes and seems to appear more quickly than in a physical library, so you never know what's arrived since you last visited. Also, keep in mind that Internet information is more personal and fluid than the sort of information in a library. Although you may not be able to look up something in a reference work on the Internet, you can get 10 personal responses (some useful, some not) to almost any query you pose.
Explaining how large the Internet is geographically is difficult because, in many ways, messages traveling over the network connections don't give a hoot where they are going physically. Almost every industrialized nation (that's some 60 countries) has at least one machine on the Internet, and more countries come online all the time. But geographical distance means little on the net. For example, I mail issues of TidBITS to our mailing list on Monday night. People down the road from me find it in their mailboxes on Tuesday morning, as do subscribers in New Zealand and Norway (Norway apparently has the highest per-capita density of Internet machines). A friend described the Internet as ranging from Antarctica to the space shuttles, from submarines to battle tanks, from a guy riding a bicycle around the globe to others crossing oceans in a yacht, from kids in kindergarten to the most eclectic gathering of brains.... Well, you get the idea.
Perhaps the best way of wrapping your mind around the Internet is to recall the old joke about blind men all giving their impression of an elephant based on what they can feel. Like that elephant, the Internet is too large to understand at one mental gulp (see figure 3.1).
Figure 3.1: The Internet Elephant -- Elephantidae internetus
Some people may think of the Internet in terms of the people who are on the net (this is my favorite way of looking at it). Technical people may insist that the machines and the networks that comprise the physical Internet are the crux of the matter. Software programmers may chime in that none of it works without software. Others may feel that the essence of the Internet lies in the information present on it.
In fact, all of these people are equally right, although as I said, I personally prefer to think of the Internet as millions of people constantly communicating about every topic under the sun. The amount and type of information, as well as the hardware and software, will all change, but the simple fact of people communicating will always exist on the Internet.
The most important part of the Internet is the collection of many millions of people, Homo sapiens, all doing what people do best. No, I don't mean reproducing; I mean communicating.
Communication is central to the human psyche. We are always reaching out to other people, trying to understand them and trying to get them to understand us. As a species, we can't shut up. But that's good! Only by communicating can we ever hope to solve the problems that face the world today. The United Nations can bring together one or two representatives of each nation and sit them down with simultaneous translations. But via the wire and satellite transmissions of the Internet, anyone can talk to anyone else on the Internet at any time -- no matter where he or she lives.
I regularly correspond with friends (most of whom I've never met) in England, Ireland, France, Italy, Sweden, Denmark, Turkey, Russia, Japan, New Zealand, Australia, Singapore, Taiwan, Canada, and a guy who lives about 15 miles from my house. (Actually, my wife and I finally broke down and went to visit the guy down the road, and he's since become one of our best friends in Seattle.) I've worked on text formatting issues over the networks with my friend in Sweden, helped design and test a freely distributable program written by my friend in Turkey, and cowritten software reviews with a friend in New Zealand who'd been one of my Classics professors back at Cornell. On the net, where everything comes down to the least common denominator of ASCII text, you don't worry about where your correspondents live. Although people use many languages on the net, English is the de facto language of the computer industry, and far more people in the world know English than English speakers know other languages.
During the Gulf War, while people in the US were glued to their television sets watching the devastation, people in Israel were sending reports to the net. Some of these described the terror of air raid sirens and worrying about SCUD missiles launched from Iraq. No television shot of a family getting into their gas masks with an obligatory sound bite can compare with the lengthy and tortured accounts of daily life that came from the Israeli net community.
The Internet also helped disseminate information about the attempted coup in the former Soviet Union that led to its breakup. One Internet friend of mine, Vladimir Butenko, spent the nights during the events near the Parliament. When everything seemed to be clear, he went to his office, wrote a message about what he'd seen, and sent it to the Internet. His message was widely distributed at the time and even partially reprinted in the San Jose Mercury News.
Although people on the Internet are sometimes argumentative and contentious, is that entirely bad? Let's face it, not all the events in the world are nice, and people often disagree, sometimes violently. In the real world, people may repress their feelings to avoid conflict, and repression isn't good. Or people may end up at the other extreme, where the disagreement results in physical violence. On the Internet, no matter what the argument (be it about religion, racism, abortion, the death penalty, the role of police in society, or whatever else), there are only three ways for it to end. First, and most likely, all parties involved may simply stop arguing through exhaustion. Second, both sides may agree to disagree (although this usually happens only in arguments where both sides are being rational about the issues at hand). Third, one person may actually convince another that he or she is wrong -- though I doubt this happens all that often, since people hate to admit they're wrong. But notice, in none of these possibilities is someone punched, knifed, or shot. As vitriolic as many of these arguments, or flame wars, can be, there's simply no way to compare them to the suffering that happens when people are unable to settle their differences without resorting to violence.
Most of the time on the net, an incredible sense of community and sharing transcends all physical and geopolitical boundaries. How can we attempt to understand events in other parts of the world when we, as regular citizens, have absolutely no clue what the regular citizens in those other countries think or feel? And what about the simple facts of life such as taxes and government services? Sure, the newspapers print those info-graphics comparing our country's tax burden to that in other countries. But this information doesn't have the same effect as listening to someone work out how much some object, say a PC, costs in France once you take into account the exchange rate and add an 18 percent VAT (value added tax), which of course comes on top of France's already-high income taxes. It makes you think.
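For the arithmetically inclined, that sort of price comparison is easy to sketch. Here is a quick calculation in Python; the sticker price and exchange rate are made up for illustration, and only the 18 percent VAT comes from the example above:

```python
# Rough comparison of what a PC costs in the US versus France.
# The US price and exchange rate are hypothetical figures.

us_price_dollars = 2000            # assumed sticker price of a PC in the US
francs_per_dollar = 5.5            # assumed exchange rate
french_vat = 0.18                  # 18 percent value added tax

# Convert the US price to francs, then add VAT on top.
french_price_francs = us_price_dollars * francs_per_dollar * (1 + french_vat)

# Express the French price back in dollars to compare like with like.
french_price_dollars = french_price_francs / francs_per_dollar

print(f"US price:     ${us_price_dollars:,.2f}")
print(f"French price: ${french_price_dollars:,.2f} (after 18% VAT)")
```

Even before France's higher income taxes enter the picture, the same machine costs 18 percent more. It makes you think.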
If nothing else, that's the tag line I want to convey about the Internet. It makes you think.
Maybe with a little thought and communication, we can avoid some of the violent and destructive conflicts that have marked world affairs. Many of the Internet resources stand as testament to the fact that people can work together with no reward other than the satisfaction of making something good and useful. If we can translate more of that sense of volunteerism and community spirit back into the real world, we stand a much better chance of surviving ourselves.
Let's get down to the technical data: Around 3.9 million computers of all sizes, shapes, and colors make up the hardware part of the Internet. In addition to the computers are various types of network links, ranging from super-fast T-1 and T-3 lines all the way down to slow 2,400 bps modems. T-3 was also the code name for Microsoft Word 6.0 -- but I digress. That often happens when I'm talking about relatively boring things like networks, because what's important is that the Internet works. Just as with your telephone, you rarely notice its technical side unless something goes wrong.
The computers that form the Internet range from the most powerful supercomputers from Cray and IBM all the way down to your personal computer. You can split these machines into two basic types: host computers and client computers. Host computers are generally the more powerful of the two and usually have more disk space and faster connections. However, I don't want to imply that host machines must be fancy, expensive computers; an FTP site can be a lowly 80286 computer.
Similarly, client machines can also be large, powerful workstations from companies like Sun and Hewlett-Packard. Because their task of sending and receiving information for a single person (as opposed to many people) is more limited, clients generally require less processor power and storage space than host machines. Basically, it can be a waste to use a $10,000 Unix workstation as an Internet client (although it can make a great client machine, if you have the money to throw around). In my opinion, microcomputers make the best clients. Why spend lots of money and a large amount of configuration time when an inexpensive PC does the job as well or better?
I look only at client hardware and software in this book. The gritty details of setting up an Internet host and the many programs that run it aren't all that interesting to most people, not to mention the fact that I haven't the foggiest idea of how to configure a Unix workstation to be an Internet host. I'll leave those tasks to the wonderful people who are already doing them. (First rule of the Internet: Be extremely nice to your system administrator.) If you want to get into the administration end of things, O'Reilly & Associates publishes a long line of books on Unix and network administration.
NOTE: You don't absolutely need to use Unix on an Internet host, and in fact you can run an FTP, Gopher, or World Wide Web server under Windows or Windows NT with no trouble. But this setup requires a fast and constant connection to the Internet, and again, I'm aiming this book more at users of Internet information, not information provider wanna-bes. Maybe the next book.
In basic terms, two computers attached together form a local area network, and as that network grows, it may become connected to other independent local area networks. That configuration is called an internet, with a small i. The Internet, with a capital I, is the largest possible collection of interconnected networks. I could spill the gory details of what networks are connected to the Internet and whether they are true parts of the Internet (as defined by using a set of protocols called TCP/IP, or Transmission Control Protocol/Internet Protocol, the language that Internet machines speak), but that information wouldn't be very useful to you.
NOTE: An Internet old-timer once commented that "Internet" technically applies only to machines using TCP/IP protocols. He said the term once proposed for the collection of all the interconnected networks, no matter what protocols they used, was "WorldNet." That term seems to have faded into obscurity -- unfortunately, since it's rather apt. Common usage now includes "Internet" (the safest term, though it's technically inaccurate), "the net" (sometimes capitalized), and "cyberspace" (a heavily overused term from William Gibson's science-fiction novel Neuromancer, which, ironically enough, he wrote on a manual typewriter with images of video games, not the Internet, in his head). Lately, the term "information superhighway" (an unfortunate term that has spawned imagery of toll booths, speed bumps, on-ramps, and road kill, but which means almost nothing in the context in which it's generally used) is in vogue.
Worrying about the specific protocol details is generally pointless these days because many machines speak multiple languages and exist on multiple networks. For instance, my host machine speaks both TCP/IP as an Internet client and UUCP (Unix to Unix CoPy) as a UUCP host. My old machine at Cornell existed in both the Internet and in the BITNET worlds. The distinctions are technical and relatively meaningless to the end user.
For most people using a microcomputer such as a PC, a modem generally makes the necessary link to the Internet. Modem stands for modulator-demodulator (glad you asked?), and it enables your computer to monopolize your phone, much like a teenager. You may not need a modem if you study or work at an institution that has its local area networks attached to the Internet. If you are at one such site, count yourself lucky and ignore the parts of this book that talk about finding connections and using the modem. But remember those sections exist; one day you may leave those connections behind, and nothing is more pitiful than someone pleading on the nets for information on how to stay connected after graduation.
Certain new types of connections, including high-speed ISDN (Integrated Services Digital Network) may be the death of the modem as we know it. However, even ISDN connections require a box called a terminal adapter to enable the computer to appropriately pass data over the ISDN lines. In addition, even for folks with normal telephone line connections to the Internet, the modem itself may fade into the background -- or rather, into the innards of the computer. It's already possible to completely emulate a modem in software using digital signal processors (DSP). Eventually, wireless modems may become common, so the details of making a connection may fade away entirely. Or at least that's what I hope happens.
NOTE: In an exaggerated show of acronym making, normal phone service is known as POTS, or Plain Old Telephone Service. Don't the people who came up with this have anything better to do with their time?
It's beyond the scope of this book to tell you what sort of modem to buy, but most reputable modem manufacturers make fine modems with long or lifetime warranties. Some companies sell extremely cheap modems which often work fine in most cases, but you may also get what you pay for.
NOTE: I don't want to bash specific manufacturers, but many users have had good luck with modems from Supra and Hayes, and I personally have owned modems from Telebit and Practical Peripherals.
Suffice it to say that you want the fastest standard modem you can lay your hands on, and as of this writing, the fastest standard means that you want a modem with the magic letters v.34 stamped prominently on its box. Those letters, which say that the modem supports a certain standard method of transmitting information, ensure that your modem talks to most other modems at a high rate of speed, generally 28,800 bits per second (bps). (This speed, although fast for a modem, doesn't even approach that of a local area network.)
NOTE: Be careful when you purchase a modem that is faster than 14,400 bps (v.32bis). Although modem speeds are becoming faster and faster, this doesn't mean that the computer can keep up. The serial chip on most PCs, called the UART, may limit speeds to about 19,200 bps. You can run a utility program called msd to determine which UART your computer uses. If you have an 8250 UART, you'll need to purchase a new serial I/O card for your machine to achieve faster speeds. Look for a 16550 UART designator. Internal modems do not suffer this problem since they attach directly to the computer's bus.
Modem manufacturers often make claims about maximum throughput being 57,600 bps, but real speeds vary based on several factors, such as phone line quality and the load on the host. Modems never (well, almost never) reach the promised maximum speed. The main point to keep in mind is that it takes two to tango; the modems on either end of a connection fall back to the fastest speed they have in common (often as slow as 2,400 bps) if they don't speak the same high-speed protocols. Just think of this situation as my trying to dance with Ginger Rogers -- there's no way she and I could move as quickly as she and Fred Astaire did.
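To get a feel for what these speeds mean in practice, here's a rough Python sketch of download times. The 1 MB file size and the 10-bits-per-byte overhead figure are assumptions for illustration; real-world compression and line noise will move the numbers around:

```python
# Estimate download times for a 1 MB file at common modem speeds.
# Assumes roughly 10 bits on the wire per byte of data (8 data bits
# plus start/stop framing), and ignores compression and line noise.

FILE_BYTES = 1_000_000
BITS_PER_BYTE_ON_WIRE = 10

for bps in (2_400, 14_400, 28_800):
    seconds = FILE_BYTES * BITS_PER_BYTE_ON_WIRE / bps
    print(f"{bps:>6} bps: about {seconds / 60:.1f} minutes")
```

The jump from 2,400 bps to 28,800 bps turns a download of over an hour into one of under six minutes, which is why the faster standard is worth the money.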
NOTE: You may see such terms as PEP, v.fast, and v.terbo (or v.turbo), but remember what I said about needing two to tango. If your provider doesn't use modems that also support these new semistandards, you'll probably drop to an earlier and slower standard.
Actually, there are more caveats to the modem question than I'd like to admit. Modems work by converting digital bits into analog waves that can travel over normal phone lines, and -- on the other end -- by translating those waves back into bits. Translation of anything is an inherently error-prone process, as you know if you've ever managed to make a fool of yourself by trying to speak in a foreign language.
A large percentage of the problems I've seen people have since the first edition of this book came out have been related to their modems. Modem troubles are exacerbated by the fact that modem manuals are, without a doubt, the worst excuse for technical writing that I've ever seen. They're confusing, poorly written, poorly organized, and usually focused on the commands that the modem understands -- without providing any information as to what might go wrong. So, as much as I'd like to pretend that modems are all compatible, and that setting one up to communicate with an Internet host is a simple process, it might not work right away. If you run into problems, first make sure you've entered everything correctly, and then compare the settings in your modem manual against the ones your provider gives you.
I can't tell you how unhappy I am to have written that last paragraph, but it's just how the world is. You probably didn't get a driver's license without passing a written test, practicing with an adult, taking Driver's Ed, and finally passing a practical test. Perhaps more apt, you probably weren't able to find anything in a school library until one of the teachers or librarians showed you around. If your modem works on the first try, great! If not, don't get depressed -- not everything in this life is as easy as it should be. If it were, we'd have world peace.
Anyway, modems connect to phone lines, of course, and residential phone lines are generally self-explanatory, although at some point you may want to get a second line for your modem. Otherwise, those long sessions reading news or downloading the latest and greatest shareware can irritate loved ones who want to speak with you. (Of course, those sessions also keep telemarketers and loquacious acquaintances off your phone.) I also thoroughly enjoy being able to search the Internet for a file, download it, and send it to a friend who needs it, all while talking to him on the phone.
Not all telephone lines are created equal, and you may find that yours suffers from line noise, which is static on the line caused by any number of things. Modems employ error correction schemes to help work around line noise, but if it's especially bad, you may notice your modem slowing down as it attempts to compensate for all the static. When it's really bad, or when someone induces line noise by picking up an extension phone, your modem may just throw up its little modem hands and hang up on you. You can complain to the phone company about line noise; as I understand it, telephone lines must conform to a certain level of quality for voice transmissions. Unfortunately, that level may not be quite good enough for modems, especially in outlying rural areas; but if you're persnickety enough, you can usually get the phone company to clean up the lines sufficiently.
NOTE: If you connect from home and order a second line from your phone company, don't be too forthcoming about why you want the second line. Business rates are higher than residential rates, although they provide no additional quality or service. Some phone companies are sticky about using modems for nonbusiness purposes, which is why this point is worth mentioning. If you connect to the Internet from your office, there's no way around this situation.
As for the software, the programs that probably come to your mind first are the freeware and shareware files stored on the Internet for downloading -- things such as games, utilities, and full-fledged applications. I'll let you discover those files for yourself though, and concentrate on the software available for connecting to the Internet, much of which is free. Other programs are shareware or commercial, although most don't cost much. I'll talk about pretty much every piece of software I know about for connecting to the Internet in Part III of this book. Although there's no way for the book's discussion to keep up with the rate at which new and updated programs appear, I provide the latest versions of all the freeware and shareware programs on my file site, ftp.tidbits.com. Don't worry about the details now -- I'll get to them later in the book.
For the time being, I want to hammer home a few key points to help you understand, on a more gut level, how this setup all works. First, the Internet machines run software programs all the time. When you use electronic mail or Telnet or most anything else, you are actually using a software program, even if it doesn't seem like it. That point is important because as much as you don't need to know the details, I don't want to mystify the situation unnecessarily. The Internet, despite appearances, is not magic.
Second, because it takes two to tango on the Internet (speaking in terms of host and client machines), a software program is always running on both sides of the connection. Remember the client and host distinctions for machines? That's actually more true of the software, where you generally change the term host to the term server, which gives the broader term client/server computing. So, when you run a program on your PC, say something like FTP, it must talk to the FTP server program that is running continually on the remote machine. The same is true no matter what sort of connection you have. If you're using a Unix command-line account and you run a program called Lynx to browse the World Wide Web, Lynx is a client program that communicates with one or more World Wide Web servers on other machines.
NOTE: Think of a dessert cart filled with luscious pastries. You're not allowed to get your grubby hands on the food itself, so the restaurant provides a pair of dessert tongs that you must use to retrieve your choice of desserts. That's exactly how client/server computing works. The dessert cart is the server -- it makes the data, the desserts in this example, available to you, but only via the client program, the dessert tongs. Hungry yet?
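For the curious, the dessert-cart exchange can be sketched in a few lines of Python using standard network sockets. This is a toy, not a real Internet service -- the "GET dessert" request and "one eclair" reply are invented for the example -- but the shape of it, a server program listening on a port and a client program connecting to it, is exactly the client/server pattern described above:

```python
import socket
import threading

HOST, PORT = "127.0.0.1", 0   # port 0 lets the system pick a free port

def serve_once(server_sock):
    """The server side (the dessert cart): wait for one client's request."""
    conn, _addr = server_sock.accept()
    with conn:
        request = conn.recv(1024).decode()
        if request.strip() == "GET dessert":
            conn.sendall(b"one eclair")
        else:
            conn.sendall(b"no such item")

# Set up the server listening on a port, running in the background.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind((HOST, PORT))
server.listen(1)
port = server.getsockname()[1]
threading.Thread(target=serve_once, args=(server,), daemon=True).start()

# The client (the dessert tongs) connects to that port and asks.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect((HOST, port))
client.sendall(b"GET dessert")
reply = client.recv(1024).decode()
client.close()
server.close()
print(reply)
```

The client never touches the server's data directly; it can only send a request to the right port and take what the server hands back, tongs and all.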
A third thing to remember is this: FTP is the high-level program that you interact with, but low-level software also handles the communications between FTP and an FTP server. This communication at multiple levels is how the Internet makes functions understandable to humans and still efficient for the machines, two goals that seldom otherwise overlap.
So, if you can cram the idea into your head that software makes the Internet work on both a high level that you see and a low level that you don't, you'll be much better off. Some people never manage to understand that level of abstraction, and as a result, they never understand anything beyond how to type the magic incantations they have memorized. Seeing the world as a series of magic incantations is a problem because people who do that are unable to modify their behavior when anything changes, and on the Internet, things change every day.
More so than any other human endeavor, the Internet is an incredible, happy accident. Unlike the library at Alexandria (the one that burned down) or the Library of Congress, the Internet's information resources follow no master plan (although the Library of Congress, like many other large university and public libraries, has its catalog and some of its contents on the Internet). No one works as the Internet librarian, and any free information resources that appear can just as easily disappear if the machine or the staff goes away. And yet, resources stick around; they refuse to die -- in part because when the original provider or machine steps down, someone else generally feels that the resource is important enough to step in and take over.
Andy Williams at Dartmouth, for instance, runs a mailing list devoted to talking about scripting, specifically about Frontier (a scripting program from UserLand Software). Originally, Andy also made sample scripts and other files pertaining to Frontier available, but he was not able to keep up with the files and still do his real job (a common problem). Luckily, Fred Terry at the University of Kansas quickly stepped in and offered to provide a Frontier file site because he was already storing files related to two other programs. (Fred also rescued the Nisus [a word processor] mailing list when Brad Hedstrom, the list creator and administrator, had to bow out. Fred's probably a sucker for stray dogs too.) Fred felt that keeping the information available on the Internet was important, and the sacrifice was sufficiently small that he was able to do it.
Andy's also something of a sucker for resources in need of a home. When a man named Bill Murphy came up with a method of translating our issues of TidBITS into a form suitable for display on the World Wide Web, he ran into the problem of not having a sufficiently capable machine to provide that information to the Internet community. Who should step in but Andy, who offered the use of a World Wide Web server that he runs at Dartmouth. Between Bill's and Andy's selfless volunteer efforts, the Internet had yet another information resource for anyone to use.
These are just a few examples of the way information can appear on the Internet. Damming the Internet's flow of information would be harder than damming the Amazon with toothpicks. In fact, some of the Internet's resiliency is due to the way the networks themselves were constructed, but we'll get into that later on. Next, let's look at the main ways information is provided on the Internet.
You can think of an Internet host machine as a post office, a large post office in a large metropolitan area. In that post office, huge quantities of information are dispensed every day, but it doesn't just gush out the front door. No, you have to go inside, sometimes wait in line, and then go to the appropriate window to talk to the proper clerk to get the information you want. You don't necessarily pick up mail that's been held for you at the same window as you purchase a money order. Internet information works in much the same way. But on an Internet host, instead of windows, information flows through virtual ports (they're more like two-way television channels than physical SCSI ports or serial ports). A port number is, as I said, like a window in the post office -- you must go to the right window to buy a money order, and similarly, you must connect to the right port number to run an Internet application. Luckily, almost all of this happens behind the scenes, so you seldom have to think about it. See table 3.1 for a list of some common port numbers.
Table 3.1: Common Port Numbers

Port Number   Description
-----------   -----------
20, 21        File Transfer Protocol (data on 20, control on 21)
23            Telnet
25            Simple Mail Transfer Protocol
53            Domain Name Server
70            Gopher
79            Finger
80            World Wide Web
110           Post Office Protocol - Version 3
119           Network News Transfer Protocol
123           Network Time Protocol
194           Internet Relay Chat Protocol
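These well-known assignments are baked into most systems' services databases. For instance, a few lines of Python can look up a port by service name; this assumes the standard entries are present in your system's database, which they almost always are:

```python
import socket

# Look up well-known port numbers by service name.
# This consults the system's services database (e.g. /etc/services),
# so it assumes these standard entries are present.
for service in ("ftp", "telnet", "smtp", "http", "pop3", "nntp"):
    port = socket.getservbyname(service, "tcp")
    print(f"{service:>7} -> port {port}")
```

You never need to memorize these numbers; your software goes to the right window of the post office on your behalf.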
NOTE: I found this information in RFC (Request for Comment) 1340 via Gopher at is.internic.net. All of the RFCs that define Internet standards are stored there should you want more technical information about how parts of the Internet work at their lowest levels.
So, in our hypothetical Internet post office, there are seven main windows that people use on a regular basis. There are of course hundreds of other windows, usually used by administrative programs or other things that people don't touch much, but we won't worry about them. The main parts to worry about are email, Usenet news, Telnet, FTP, WAIS, Gopher, and the World Wide Web. Each provides access to different sorts of information, and most people use one or more to obtain the information they need.
Now that I've said how they're all similar, in the sense of all working through connections to the proper ports, there are some distinctions we must make between the various Internet services.
Email and Usenet news (along with MUDs and Internet Relay Chat) are forms of interpersonal communication -- there is always a sender and a recipient. Depending on the type of email message or news posting, you can use different analogies to the paper world, and I'll get to those in a moment.
All of the information made available through other main parts of the Internet, such as Telnet, FTP, WAIS, Gopher, and the World Wide Web, is more like information in libraries than interpersonal communication, in the sense that you must visit the library specifically, and once there, browse or search through the resources to find a specific piece of information. These services have much more in common with traditional publishing than email and news.
NOTE: I should note that, in my eyes, the difference between browsing and searching is merely that when you're browsing, you're not looking for a specific piece of information. Perhaps you only want some background, or simply want to see what's out there. When you're searching, you usually have a particular question that you want answered.
No matter what you use, there is still some sort of communication of information going on. With email and news, it's generally informal and between individuals, whereas with the rest of the Internet services, the information is usually more distilled -- that is, someone has selected and presented it in a specific format and in a specific context. None of these distinctions are hard and fast. For instance, much informal information is available via Gopher, and it's certainly easy enough to find distilled information via email. I'll try to give you a sense of what each service is good for when talking about it later on.
Email is used by the largest number of people on the Internet, although in terms of traffic, the heaviest volumes lie elsewhere. Almost all people who consider themselves connected to the Internet in some way can send and receive email.
As I said above, most personal exchanges happen in email, since email is inherently an interpersonal form of communication. All of your email comes into your electronic mailbox, and unless you allow it, no one else can easily read your mail. When you get a message from a friend via email, it's not particularly different than getting that same message, printed out and stuffed in an envelope, via snail mail. Sure, it's faster and may have been easier to send, but in essence personal email is just like personal snail mail.
Because it's trivial to send the same piece of email to multiple people at once, you can also use email much as you would use snail mail in conjunction with a photocopy machine. If you write up a little personal newsletter about what's happening in your life and send it to all the relatives at Christmas, that's the same concept as writing a single email message and addressing it to multiple people. It's still personal mail, but just a bit closer to a form letter.
The third type of email is that carried on mailing lists. Sending a submission to a mailing list is much like writing for a user group or alumni newsletter. You may not know all of the people who will read your message, but it is a finite (and usually relatively small) group of people who share your interests. Mailing list messages aren't usually aimed at a specific person on the list, but are more intended to discuss a topic of interest to most of the people who have joined that list. However, I don't want to imply that posting to a mailing list is like writing an article for publication, since the content of most mailing lists more resembles the editorial page of a newspaper than anything else. You'll see opinions, rebuttals, diatribes, questions, comments, and even a few answers. Everyone on the list sees every posting that comes through, and the discussions often become quite spirited.
The fourth type of email most resembles those "bingo cards" that you find in the back of many magazines. Punch out the proper holes or fill in the appropriate numbered circle, return the card to the magazine, and several weeks later you'll receive the advertising information you requested. For instance, I've set up my computer to send an informational file about TidBITS automatically to anyone in the world who sends email to a certain address (email@example.com, if you're impatient and want to try something right away). A number of similar systems exist on the Internet, dispensing information on a variety of subjects to anyone who can send them email. A variant of these autoreply systems is the mailserver or fileserver, which generally looks at the Subject line in the letter or at the body of the letter and returns the requested file. Mailservers enable people with email-only access to retrieve files that otherwise are available only via FTP.
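At heart, a mailserver of the sort just described is a simple dispatch: examine the Subject line of the incoming message and mail back the matching file. Here's a minimal sketch, assuming a made-up catalog of files; the `CATALOG` contents and the `handle_request` function are hypothetical, not the actual system mentioned above.

```python
# Sketch of an autoreply mailserver: match the Subject line of an
# incoming message against a catalog of canned replies and return
# the text to mail back. Catalog contents are made up.

CATALOG = {
    "info": "TidBITS is a free weekly electronic newsletter...",
    "subscribe": "To subscribe, send email with your address...",
}

def handle_request(subject):
    """Return the reply body for a request, based on the Subject line."""
    key = subject.strip().lower()
    if key in CATALOG:
        return CATALOG[key]
    return "Unknown request; try 'info' for a list of commands."
```

A real mailserver would also parse the body of the message and attach or encode the requested files, but the lookup idea is the same.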
Like email-based discussion lists, Usenet news is interpersonal information -- it comes from individuals and is aimed at thousands of people around the world. Unlike email, even unlike mailing lists, you cannot find out who makes up your audience. Because of this unknown audience, posting a message to Usenet is more like writing a letter to the editor of a magazine or major metropolitan newspaper with hundreds of thousands of readers. We have ways of estimating how many people read each of the thousands of Usenet groups, but the estimates are nothing more than statistical constructs (though hopefully accurate ones).
Almost everything on Usenet is a discussion of some sort, although a few groups are devoted to regular information postings, with no discussion allowed. The primary difference between Usenet news and mailing lists is that news is more efficient because each machine receives only one copy of every message. If two users on the same machine (generally multiuser mainframes or workstations at this point) read the same discussion list via email, getting the same information in news is twice as efficient. If you have a large mainframe with 100 people all reading the same group, news suddenly becomes 100 times as efficient because the machine stores only the single copy of each message, rather than each individual receiving his or her own copy.
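The storage arithmetic behind that efficiency claim is easy to check. Here's a quick sketch; the 2K message size is a made-up figure for illustration.

```python
# Compare storage for a mailing list (one copy per reader) versus
# Usenet news (one copy per machine). The message size is made up.

message_kb = 2        # size of one posting, in kilobytes (illustrative)
readers = 100         # readers sharing one mainframe

mail_storage = message_kb * readers  # each reader gets a private copy
news_storage = message_kb * 1        # the machine stores a single copy

savings = mail_storage // news_storage  # news is 100 times as efficient
```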
In many ways, Usenet is the kitchen table of the Internet -- the common ground where no subject is taboo and you must discuss everything before implementing it. In great part because of the speed at which Usenet moves (messages appear quickly and constantly, and most machines don't keep old messages for more than a week due to lack of disk space), finding information there can be difficult. Think of Usenet as a river, and you must dip in to see what's available at a specific point in time because that information may disappear downstream within a few days.
NOTE: The speed at which messages disappear from Usenet varies by group and by the machine you use. Each administrator sets how long messages in a group will last before being expired or deleted from the system. Messages in newsgroups with many postings per day may expire after a day or two; messages in groups with only a few postings per week might last a month. Since Usenet traffic is about 200 MB of information each day, you can see why the short expiration times are essential.
You can, of course, always ask your own question, and you usually get an answer (though it may be one you don't like), even if it's the sort of question everyone asks. Common questions are called Frequently Asked Questions, or FAQs, and are collected into lists and posted regularly for newcomers. Luckily, the cost of disk storage is decreasing sufficiently so that some people and organizations are starting to archive Usenet discussions. These archives enable you to use WAIS or Gopher to go back and search for information that flowed past in a mailing list or newsgroup a long time ago.
Telnet is a tough thing to describe. The best analogy I can think of is that Telnet is like an Internet modem. As with a standard modem, Telnet enables your computer to communicate with another computer somewhere else. Where you give your modem a phone number to dial, you give Telnet an Internet address to connect to. And just like a modem, you don't really do anything within Telnet itself other than make the connection -- in the vernacular, you telnet to that remote computer. Once that connection is made, you're using the remote computer over the Internet just as though it were sitting next to you. This process is cool because it enables me to telnet to the mainframes at Cornell University, for example, and use them just as I did when I was actually in Ithaca, and not 3,000 miles away in Seattle.
NOTE: Telnet, FTP, and Gopher can all work both as nouns describing the service or the protocols, and as verbs describing the actions you perform with them. If Telnet or Gopher is capitalized in this book, it's a noun describing the service; if it's in lowercase, it's a verb describing the action. (FTP is always capitalized in the book, since it's an acronym.) Unfortunately, others on the Internet aren't as consistent (and they don't have editors checking on their text) so this isn't a universal convention.
I realize I'm supposed to talk about information in this section, but Telnet is such a low-level protocol that it's impossible to separate the information that's available via Telnet from the protocol itself.
Most people don't have personal accounts on machines around the world (and I never use the Cornell mainframes any more either), but a number of organizations have written special programs providing useful information that anyone can run over the Internet via Telnet.
Say I want to search for a book that's not in my local library system. I can connect via Telnet to a machine that automatically runs the card catalog program for me. I can then search for the book I want, find out which university library has it, and then go back to my local library and ask for an interlibrary loan.
Or, for a more generically useful example, if you telnet to downwind.sprl.umich.edu, you reach the University of Michigan's Weather Underground server, with gobs of data about the weather around the entire country.
FTP feels like it's related to Telnet, but in fact that's an illusion -- the two are basic protocols on the Internet, but are not otherwise related. Where Telnet simply enables you to connect to another remote computer and run a program there, FTP enables you to connect to a remote computer and transfer files back and forth. It's really that simple.
More data is transferred via plain old FTP than by any other method on the Internet, which isn't surprising, since FTP is a least common denominator that almost every machine on the Internet supports. Like Telnet, you must be directly connected to the Internet while using FTP, although a few special FTP-by-mail services enable you to retrieve files stored on FTP sites by sending specially formatted email messages to an FTP-by-mail server.
There are probably millions of files available via FTP on the Internet, although you may discover that many of them are duplicates because people tend to want to give users more than one way to retrieve a file. If a major file site goes down for a few days, it's nice to have a mirror site that has exactly the same files and can take up the slack.
NOTE: Mirror sites are important because as the Internet grows, individual machines become overloaded and refuse to accept new connections. As with anything that's busy (like the phone lines on Mother's Day, the checkout lines at 5:00 p.m. on Friday at the grocery store, and so on) it always seems that you're the one who gets bumped or who has to try over and over again to get through. Don't feel special -- hundreds of other people suffer exactly the same fate all the time. Mirror sites help spread the load.
In the PC world, several sites with lots of disk space (several gigabytes, actually) store a tremendous number of freeware and shareware programs along with commercial demos and other types of PC information.
The vastness of the number of files stored on FTP sites may stun you, but you have access to a tool that helps bring FTP under control. Archie takes the grunt work out of searching numerous FTP sites for a specific file. You ask Archie to find files with a specific keyword in their names, and Archie searches its database of many FTP sites for matches. Archie then returns a listing to you, providing the full file names and all the address information you need to retrieve the file via FTP.
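Conceptually, an Archie search is just a keyword match against a database of file names gathered from many FTP sites. A toy sketch, with a made-up index (the sites, paths, and the `archie_search` function are all illustrative):

```python
# Toy sketch of an Archie-style search: match a keyword against an
# index of (site, path) entries gathered from FTP sites. The index
# contents here are invented for illustration.

INDEX = [
    ("ftp.example.edu", "/pub/pc/games/tetris.zip"),
    ("ftp.example.org", "/mirrors/utils/pkzip204.exe"),
    ("ftp.example.edu", "/pub/pc/utils/pkzip204.exe"),
]

def archie_search(keyword):
    """Return (site, path) pairs whose file names contain the keyword."""
    keyword = keyword.lower()
    return [(site, path) for site, path in INDEX
            if keyword in path.rsplit("/", 1)[-1].lower()]
```

Armed with the matching site and path, you'd then fire up FTP and retrieve the file.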
I mentioned using WAIS to search for information about deforestation in the Amazon rainforest in the preceding chapter, but that's only the tip of the iceberg. WAIS originated at a company called Thinking Machines but has since spun off into its own company, WAIS, Inc. Using the processing power of a powerful server, WAIS can quickly (usually in under a minute) return a list of articles in response to English-language queries, sorted by the likelihood that they are relevant to your question. WAIS is limited only by the information that people feed into it.
Last I counted, there were over 500 sources available for searching within topics as diverse as Buddhism, cookbooks, song lyrics, Supreme Court decisions, science fiction book reviews, and President Clinton's speeches. For all the sources on nontechnical topics, I'm sure an equal number exist about technical topics in many fields.
NOTE: People talking about WAIS (pronounced "ways," I hear) tend to use the terms "source," "server," and "database" interchangeably, and so do I.
Perhaps the hardest part about WAIS is learning how to ask it questions. Even though you can use natural English queries, it takes your question quite literally, and only applies it to the selected sources. So, if you asked about deforestation in the Amazon rainforest while searching in the Buddhism source, I'd be surprised if you found anything.
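You can approximate WAIS-style relevance ranking by counting how many of the query's words appear in each article and sorting on that score. This is a deliberately naive sketch -- the real WAIS software used more sophisticated weighting -- and the `rank` function and sample articles are my own inventions.

```python
# Naive sketch of relevance ranking: score each article by how many
# words it shares with the query, and return titles in score order.
# This is far simpler than what WAIS actually did.

def rank(query, articles):
    """Return article titles sorted by word overlap with the query."""
    words = set(query.lower().split())
    scored = []
    for title, text in articles:
        score = len(words & set(text.lower().split()))
        scored.append((score, title))
    scored.sort(reverse=True)
    return [title for score, title in scored if score > 0]
```

Note how an article that shares no words with the query scores zero and drops out entirely -- which is exactly why asking about rainforests in the Buddhism source comes up empty.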
Gopher, which originated with the Golden Gophers of the University of Minnesota, is an information browser along the same lines as FTP, but with significant enhancements for ease of use and flexibility. Numerous sites -- over 2,300 at last count -- on the Internet run the host Gopher software, placing information in what are colloquially called gopher holes. When you connect to a Gopher site, you can search databases, read text files, transfer files, and generally navigate around the collection of gopher holes, which is itself called Gopherspace.
I find Gopher to be among the most useful of the Internet services in terms of actually making available the information I need to answer specific questions. Part of the reason for my opinion is Veronica, and to a lesser extent Jughead, which enable you to search through Gopherspace much as Archie enables you to search for files on anonymous FTP servers.
NOTE: Veronica and Jughead were both named to match Archie (from the Archie comics), but Veronica's creators at the University of Nevada did come up with an acronym as well -- Very Easy Rodent-Oriented Net-wide Index to Computerized Archives. Jughead stands for Jonzy's Universal Gopher Hierarchy Excavation And Display. Glad you asked?
Veronica searches through all of Gopherspace, which is useful, although badly phrased searches (Veronica doesn't use natural English, as WAIS does) can result in hundreds of inappropriate results. Jughead searches a subset of Gopherspace and can thus be more accurate, though less comprehensive.
One of the special features of Gopher is that it provides access to FTP (and Archie) and WAIS, and can even run a Telnet program to provide access to resources only available via Telnet. Gopher can also work with other programs to provide access to special data types, such as pictures and sounds. When you double-click on a picture listing in Gopher, it downloads the file and then runs another program to display the picture. This sort of integration doesn't generally work all that well if all you have is Unix command-line access to the Internet.
When I wrote the first edition of this book, the World Wide Web existed but lacked a good client program on either the PC or the Macintosh. I managed to write a paragraph or two about NCSA Mosaic, the Web browser that was officially released a few months after I finished the book, but there simply wasn't much I could check out on the Web at that point.
NOTE: You may see the World Wide Web referred to as simply "the Web," "W3," or sometimes as "WWW."
Everything about the Web has changed since last year. It's become much, much larger, and the resources available on it have become incredibly diverse and far more useful.
The Web brings a couple of very important features to the Internet. First, unlike Gopher or anything else, it provides access to full fonts, sizes, and styles for text, and can include images onscreen with no special treatment. Sounds and movies are also possible, though often too large for many people to download and hear or view. Second, the Web provides true hypertextual links between documents anywhere on the Web, not just on a single machine. For those unfamiliar with hypertext, it's a powerful concept that enables the reader to navigate flexibly through linked pieces of information. If you read a paragraph with a link promising more information about the topic, say results from last winter's Olympic Games, simply click on the link, and you'll see the results. It really is that simple, and the World Wide Web is indeed the wave of the future for the Internet. Nothing touches it in terms of pure sexiness, although many Web servers that you see suffer from the same problem that many publications did after desktop publishing became popular: they're designed by amateurs and are ugly as sin.
I've tried to answer one of the harder questions around: "What is the Internet?" The simple answer is that the Internet is a massive collection of people, machines, software programs, and data, spread all around the world and constantly interacting. That definition, and the explication I've provided about the various parts of the Internet elephant, should serve you well as we look next at the history of this fascinating beast.