This article originally appeared in TidBITS on 1997-09-25 at 12:00 p.m.
The permanent URL for this article is: http://tidbits.com/article/4511

Hey, I'm Talking to You! Part 1

by Glenn Fleishman

Recent surveys show that there are roughly 26 million machines connected to the Internet at any given time. Some of these are dialup modem connections, but since those modems are in use most of the time, they count.

Given the number of machines and the number of connections and the size of the Internet, how does any one machine find another in this vast maze? The answer isn't simple, but it's more straightforward than I'd imagined when first trying to figure this out in late 1994.

Back then, the Engsts and I were a few of the Seattle-area "pioneers" of the Internet, and we would puzzle out these issues in order to explain them to our readers and colleagues, and to use them in our day-to-day work on the Net. We spent some time, one Saturday afternoon, trying to understand how the machines "knew" where other machines existed.

Talk Amongst Yourselves -- For starters, look at any local area network (LAN). Most people these days use Ethernet, a method of exchanging data at 10 megabits per second (Mbps) that dates back to the early 1970s and a hand-drawn sketch at a conference. (There are other kinds of networks, but the principles are similar.)

<http://wwwhost.ots.utexas.edu/ethernet/>
<http://wwwhost.ots.utexas.edu/ethernet/10quickref/ch1qr_1.html>

Keep in mind that the basic unit of measure in networking is the packet, which is a small bundle of data capped with information in a header preceding it that usually describes where the data came from, where it's headed, and what kind of data it is.

Ethernet works by controlling how different devices talk to each other; it doesn't care what kind of data it carries. Ethernet can carry TCP/IP packets (the protocols the Internet uses), IPX packets (a Novell NetWare protocol), AppleTalk packets (the primary way Macintoshes talk to each other), and other types of data.
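To make the idea of a header concrete, here is a minimal Python sketch that pulls the "where from, where to, what kind" fields off the front of a packet. It assumes the common Ethernet II layout (six bytes of destination, six of source, two of type); other framings exist, and IPX and AppleTalk were often carried in them. The function names and the sample packet are made up for illustration.

```python
import struct

# Type codes for the protocols named above (Ethernet II framing assumed).
ETHERTYPES = {
    0x0800: "IP (used by TCP/IP)",
    0x8137: "IPX",
    0x809B: "AppleTalk",
}


def format_address(raw):
    """Render six raw bytes as colon-separated hexadecimal."""
    return ":".join(f"{b:02X}" for b in raw)


def parse_header(packet):
    """Split the 14-byte header from the data that follows it."""
    if len(packet) < 14:
        raise ValueError("too short to hold an Ethernet header")
    destination, source, kind = struct.unpack("!6s6sH", packet[:14])
    return {
        "destination": format_address(destination),
        "source": format_address(source),
        "type": ETHERTYPES.get(kind, f"unknown (0x{kind:04X})"),
        "data": packet[14:],
    }


# Hypothetical packet: invented addresses and a placeholder payload.
example = bytes.fromhex("080020AABBCC" "000502C8EA5F" "0800") + b"hello"
header = parse_header(example)
print(header["source"], "->", header["destination"], "carrying", header["type"])
```

Note that the header says nothing about what the data means; that is left to whichever protocol the type field names.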

The main job that Ethernet hardware performs is taking a stream of data from the computer, dividing it into packets (which can vary in length, within fixed minimum and maximum sizes), and waiting for a chance to "speak" on the network. Ethernet devices are polite; they all know to wait until they don't "hear" any traffic before sending a packet. If two devices start talking nearly simultaneously, both of them stop, wait a random interval, and start again.

Devices can often talk at the same time on a busy network, and if you look at a network hub with a "collision detector" light on it, you can see that light flash as those packets "hit" each other. Packets that collide are retransmitted up to 16 times after longer and longer timeout intervals; if the transmission still fails, they're thrown away or "dropped." The busier a network is, the greater the chance that packets will collide, and the more times they must be retransmitted. At times, this can cause networks to bog down and stop working.
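The "longer and longer" waits can be sketched in a few lines of Python. This is a simplified model, not adapter firmware: it assumes the standard scheme for 10 Mbps Ethernet, in which the range of possible waits doubles after each collision (up to the tenth) and is measured in slot times of 51.2 microseconds. The collides callback is a hypothetical stand-in for the wire.

```python
import random

SLOT_TIME_US = 51.2   # one slot time on 10 Mbps Ethernet, in microseconds
MAX_ATTEMPTS = 16     # the packet is dropped after this many failed tries
BACKOFF_LIMIT = 10    # the waiting range stops doubling after 10 collisions


def backoff_delay_us(collisions, rng):
    """Pick the random wait that follows the given number of collisions."""
    exponent = min(collisions, BACKOFF_LIMIT)
    slots = rng.randrange(2 ** exponent)  # 0 through 2**exponent - 1
    return slots * SLOT_TIME_US


def send(collides, rng):
    """Return (sent, total_wait_us). `collides(attempt)` is a hypothetical
    stand-in for the wire: it reports whether that attempt hit a collision."""
    total_wait_us = 0.0
    for attempt in range(1, MAX_ATTEMPTS + 1):
        if not collides(attempt):
            return True, total_wait_us
        if attempt < MAX_ATTEMPTS:
            # A real adapter would pause here; this sketch only adds up the time.
            total_wait_us += backoff_delay_us(attempt, rng)
    return False, total_wait_us  # every attempt collided: packet dropped


rng = random.Random(1997)
print(send(lambda attempt: attempt <= 2, rng))  # collides twice, then gets through
print(send(lambda attempt: True, rng))          # never gets through
```

Because each device picks its own random wait, two devices that just collided are unlikely to collide again on the very next try, and the widening range spreads them further apart when the network is busy.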

(If Ethernet drops a packet, it's up to the protocol sending the packet to know how to respond. With TCP/IP, one kind of packet, TCP, is retransmitted until it succeeds; a second kind, UDP, often used for streaming information like audio or real-time statistics, will just be dropped, since the loss of individual packets in that protocol isn't important.)

An Ethernet network consists of wiring - the "physical medium" - laid out in either a daisy chain (one device connects to the next) or a hub-and-spoke arrangement (each wire goes back to a device that essentially cross-wires all the connections). Either way, it's just like one big continuous electrical connection. A signal on any part of an Ethernet will reach every other part of the Ethernet electrically.

You can divide big Ethernets into segments by using devices such as switches, bridges, or routers, and this is where we fit into place the next piece in the puzzle. If every device on an Ethernet network can "see" every other device (printers, workstations, routers, servers, etc.), it's simple to understand how one machine sends data to another. When you split the network up, how does a machine on one segment know how to find machines on a different segment? It's in the packet.

Finding the Right Address -- Every Ethernet device has a number assigned to it, unique in the entire world. It's called a MAC address - but has nothing to do with the Macintosh. MAC stands for Media Access Control, and it's a six-byte number, usually expressed in hexadecimal (base 16) like 00:05:02:C8:EA:5F. The first three bytes identify the manufacturer. Global Village Communications' devices start with 00:02:88; Sun Microsystems' start with 08:00:20.

<http://standards.ieee.org/db/oui/>
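As a small illustration of the address format, the Python sketch below turns the colon-separated text into its six bytes and looks up the first three in a table. The table holds only the two prefixes quoted above, and the function names are invented for this example; a real lookup would consult the full IEEE list.

```python
# The two manufacturer prefixes mentioned in the text; not a complete list.
KNOWN_PREFIXES = {
    "00:02:88": "Global Village Communications",
    "08:00:20": "Sun Microsystems",
}


def parse_mac(text):
    """Convert 'XX:XX:XX:XX:XX:XX' into six raw bytes."""
    parts = text.split(":")
    if len(parts) != 6:
        raise ValueError("a MAC address has exactly six bytes")
    return bytes(int(part, 16) for part in parts)


def manufacturer(text):
    """Return the maker for a known prefix, or None if it isn't in the table."""
    raw = parse_mac(text)
    prefix = ":".join(f"{b:02X}" for b in raw[:3])
    return KNOWN_PREFIXES.get(prefix)


address = "00:05:02:C8:EA:5F"
print(len(parse_mac(address)), "bytes")
print(manufacturer("08:00:20:12:34:56"))  # made-up device with the Sun prefix
print(manufacturer(address))              # prefix not in this small table
```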

Every packet a device sends carries that device's own MAC address as its source, so anything listening on the Ethernet can learn which addresses are present. Each machine on the Ethernet has to have some notion of what other machines are out there for it to know how to send a packet - it must have a destination address. If the network is split into several pieces, the electrical connection is broken, but machines can still find each other.

The connecting devices tell the different networks about each other. Switches and bridges, for instance, are usually used to split up busy networks so each segment has less overall traffic but can still reach every other segment. (Switches are generally used in single facilities, while bridges often protect restricted resources or connect networks that are physically somewhat distant.)

The switch or bridge passes queries for resources, like printers or other servers, back and forth across the networks it's connected to. When a machine on one segment wants to communicate with a machine on another, the switch "hears" the destination address and rebroadcasts the packet on the other segment.

If you're having trouble wrapping your head around this process, think of it like this. A guy has a telephone handset next to each of his ears. On each phone is a separate conference call. Whenever each group talks among its own members, he's silent. Occasionally, he hears the name of a person who's on the other conference call. He listens to the message - like, "Bill, we need more bananas!" - and then repeats the message into the other receiver. Voila! Bill receives his message, the guy in the middle does only necessary work, and neither call is tied up with the other's business.
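The guy in the middle can be sketched in Python as well. This is a toy model with two segments named A and B. It assumes the bridge learns who is where by noting the source address of every packet it hears, which is a common technique but not something spelled out above; the class, method, and addresses are all invented for illustration.

```python
class Bridge:
    """A two-segment bridge that learns where machines are by listening."""

    def __init__(self):
        self.segment_of = {}  # MAC address -> segment it was last heard on

    def handle(self, arrived_on, source, destination):
        """Return the segment to repeat the packet onto, or None to stay silent."""
        self.segment_of[source] = arrived_on      # remember where the sender lives
        if self.segment_of.get(destination) == arrived_on:
            return None                           # local chatter: keep quiet
        return "B" if arrived_on == "A" else "A"  # other side, or not yet known


# Made-up addresses for illustration.
ANN, CAL = "08:00:20:00:00:01", "08:00:20:00:00:02"
BILL = "00:05:02:C8:EA:5F"

bridge = Bridge()
print(bridge.handle("A", ANN, BILL))  # Bill not heard from yet: repeat onto B
print(bridge.handle("B", BILL, ANN))  # Ann is known to be on A: repeat onto A
print(bridge.handle("A", CAL, ANN))   # both on A: the bridge stays silent
```

When the bridge hasn't yet heard from a destination, this sketch repeats the packet onto the other segment just in case; once the machine answers, its location is known and purely local traffic stays local.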

You should now understand how computers talk to one another over local area networks. What happens when you introduce the Internet into the mix? Tune in next week when I finish putting the pieces of the puzzle together.

[Glenn Fleishman is editor in chief of NetBITS. His goal in life is to take a packet's eye journey over the Internet.]