Networking and Unix X52.9547/Y12.1009
Syllabus

Instructor

Mark Meretzky
mark.meretzky@nyu.edu

Textbook

Every example is online in the X52.9547 Handouts.

The textbook is TCP/IP Network Administration, Third Edition by Craig Hunt; O’Reilly, ISBN 0-596-00297-1.

Assignments and Grading Policy

There are no tests. Grades are based on the homework. Client/server programming can be done in either C or Perl; all examples will be presented in both languages. The end of X52.9547 Handout 1 has a more detailed grading policy.

Description and Objectives

A computer attached to a network is called a host. X52.9547 is a course in configuring your Unix host to talk to a network. We will assume that the network already exists (probably an Ethernet), and that you want to attach your host to it. We’re not going to build the network. The first part of the course is theoretical: binary arithmetic, packets, and protocols. The second part is how to configure your Unix box to talk to a network. In the last part, we write clients and servers in C or Perl.

NYU gives each student a non-root (i.e., non-superuser) account on the Solaris machine i5.nyu.edu. But if you want to perform the actual configuration, you’ll need a machine where you have the root password. Linux running on your PC would be fine.

Prerequisites

You must already know how to use pipes, shellscripts, and regular expressions; after all, Unix Operating System X52.9545 (a.k.a. Y12.1005) is a prerequisite for this course. You’ll pick up a little C or Perl in this course; most people prefer Perl because it’s similar to the shell language.

Lectures 1–3: binary arithmetic and protocols

IP addresses, netmasks, and MAC (i.e., Ethernet) addresses are examples of binary numbers. We learn to count in binary, and use bc to convert between binary, octal, decimal, and hexadecimal. Bit masking is the art of reaching into a binary number and turning selected bits on and off. We perform bit masking with the bitwise operators shared by the languages C, C++, Perl, and Java, and by Unix utilities such as tcpdump and snoop.

A protocol is a set of rules that two communicating programs have agreed to obey. A typical rule is that "data must be divided into packets (segments, datagrams, frames, etc.) for transmission, and reassembled at the receiving end." We describe the relationships between the most important protocols. For example, "each packet of TCP is carried inside of a packet of IP".

Little packets are carried inside big packets, and fragmented if they don’t fit. We’ll eavesdrop on the packets with the packet sniffers snoop and tcpdump, and trace their route with traceroute. We will also cover packet formats and headers, including MAC (Ethernet) addresses, IP addresses, IP address classes, subnetting, and netmasks. Multiple programs on the same machine are distinguished by their TCP/UDP port numbers. Each open port is a potential entrance for a security assault.

Lectures 4–5: booting and configuring

Most networking is set up automatically when the machine is booted. The init process executes the commands in the /etc/inittab table, which run the startup scripts as the machine ascends through successive run levels. These scripts configure the network interfaces with ifconfig.

The Internet Services Dæmon

Some networking programs are spawned on demand by the Internet dæmon inetd, configured by the inetd.conf file. We cover dæmons and background processes.

Lecture 6: routing

Routing directs the packets from one network to another. This is no longer a Unix topic: a dedicated router is used nowadays. But we’ll walk through two routing protocols: the original RIP (Routing Information Protocol), with its “counting to infinity”, “split horizon”, and “poison reverse”; and the far more complicated OSPF (Open Shortest Path First).

Lectures 7–8: the acronyms

DNS: Domain Name System. Every host has both an IP address and a fully qualified domain name. For example, our computer is known as both 128.122.253.152 and i5.nyu.edu. Hardware and software like numbers, but human beings prefer names. We’ll configure a DNS server to endow each computer with a name on top of its native number.

PPP: Point-to-Point Protocol. If you’re using a modem instead of an Ethernet connection, your IP packets will be carried by PPP. Configure your host to be a PPP client using chat and the PPP dæmon pppd. The PPP server will issue your host a temporary IP address.

DHCP: Dynamic Host Configuration Protocol. Instead of writing the IP address of each machine in its startup or configuration files, let a central DHCP server distribute this information to each machine as it boots.

RPC: Remote Procedure Call. RPC is a layer of software that lets a program on one host call procedures (subroutines, functions, etc.) on another host. Our simple example will be written in the language C. We set up RPC because NIS and NFS (see below) are carried by it.

NIS: Network Information System, a.k.a. the Yellow Pages. Each host has a file (/etc/passwd) listing the people who have accounts on that host. But what would you do if you had to give someone an account on all 100 hosts on a network? Instead of editing the /etc/passwd file on each host, NIS will let you automatically distribute one master copy of this file. We select NIS as a source of this information in the name service switch file, nsswitch.conf.

NFS: Network File System. Create the illusion that a file on one host is simultaneously present on another host. The share commands that make the files available are stored in the dfstab configuration table.

sendmail: the mail server. Create the sendmail configuration files with the m4 macro processor. See how local and remote mail addresses are rewritten for different mailers.

Set up a Web Server. We’ll download, decompress, un-tar, configure, compile, link, and install the Apache web server. It speaks HTTP: the Hypertext Transfer Protocol. If there’s any interest, we’ll do some CGI programming in C or Perl.

Lectures 9–10: write a client and server

We’ll write a C or Perl program, whichever you find easier, to communicate with a program on another machine via TCP sockets. Maybe UDP sockets too, if we have time. The program that initiates the conversation is called the client and the other one is the server.

Clients and servers often have to perform two or more tasks simultaneously. For example, a server may be willing to talk to multiple clients, and a client or a server may want to perform input and output at the same time. Our servers and clients will do several tasks at the same time by spawning extra processes via the Unix system calls fork, exec, and wait. We’ll catch the death-of-child signal and harvest zombies.

Extra topic: Expect

Some of the most commonly used networking programs are interactive: they are usually run by a live human being typing at a keyboard. Classic examples are ftp and telnet.

A normal shellscript can only drive a non-interactive program. But what if a network administrator has to run an interactive program on many machines, or communicate with many machines? In this case, we drive the programs with a script in the language Expect, which is an extension of Tcl.