Core Python Programming, Chapter 2 -- Network Programming

2.1 Introduction

In this section, we will take a brief look at network programming using sockets. But before we delve into that, we will present some background information on network programming, explain how sockets apply to Python, and then show you how to use some of Python’s modules to build networked applications.

2.2 What Is Client/Server Architecture?

What is client/server architecture? It means different things to different people, depending on whom you ask as well as whether you are describing a software or a hardware system. In either case, the premise is simple: the server – a piece of hardware or software – provides a “service” that is needed by one or more clients (users of the service). Its sole purpose of existence is to wait for (client) requests, respond to those clients (provide the service), and then wait for more requests.

Clients, on the other hand, contact a server for a particular request, send over any necessary data, and then wait for the server to reply, either completing the request or indicating the cause of failure. The server runs indefinitely, continually processing requests; clients make a one-time request for service, receive that service, and thus conclude their transaction. A client might make additional requests at some later time, but these are considered separate transactions.

The most common notion of the client/server architecture today is illustrated in Figure 2-1, which depicts a user or client computer retrieving information from a server across the Internet. Although such a system is indeed an example of a client/server architecture, it isn’t the only one. Furthermore, client/server architecture can be applied to computer hardware as well as software.

[Figure 2-1: a client computer retrieving information from a server across the Internet]

2.2.1 Hardware Client/Server Architecture

Print(er) servers are examples of hardware servers. They process incoming print jobs and send them to a printer (or some other printing device) attached to such a system. Such a computer is generally network-accessible and client computers would send it print requests.

Another example of a hardware server is a file server. These are typically computers with large, generalized storage capacity, which is remotely accessible to clients. Client computers mount the disks from the server computer as if the disk itself were on the local computer. One of the most popular network operating systems that support file servers is Sun Microsystems’ Network File System (NFS). If you are accessing a networked disk drive and cannot tell whether it is local or on the network, then the client/server system has done its job. The goal is for the user experience to be exactly the same as that of a local disk – the abstraction is normal disk access. It is up to the programmed implementation to make it behave in such a manner.

2.2.2 Software Client/Server Architecture

Software servers also run on a piece of hardware but do not have dedicated peripheral devices as hardware servers do (e.g., printers, disk drives, etc.). The primary services provided by software servers include program execution, data transfer, retrieval, aggregation, update, or other types of programmed or data manipulation.

One of the more common software servers today is the Web server. Individuals or companies desiring to run their own Web server will get one or more computers, install the Web pages and/or Web applications they wish to provide to users, and then start the Web server. The job of such a server is to accept client requests, send back Web pages to (Web) clients, that is, browsers on users’ computers, and then wait for the next client request. These servers are started with the expectation of running forever. Although they do not achieve that goal, they run for as long as possible unless stopped by some external force such as being shut down, either explicitly or catastrophically (due to hardware failure).

Database servers are another kind of software server. They take client requests for either storage or retrieval, act upon those requests, and then wait for more business. They are also designed to run forever.

The last type of software server we will discuss is the window server. These servers can almost be considered hardware servers. They run on a computer with an attached display, such as a monitor of some sort. Window clients are actually programs that require a windowing environment in which to execute. These are generally considered graphical user interface (GUI) applications. If they are executed without a window server, that is, in a text-based environment such as a DOS window or a Unix shell, they are unable to start. Once a window server is accessible, then things are fine.

Such an environment becomes even more interesting when networking comes into play. The usual display for a window client is the server on the local computer, but it is possible in some networked windowing environments, such as the X Window System, to choose another computer’s window server as a display. In such situations, you can be running a GUI program on one computer, but have it displayed on another!

Bank Tellers as Servers?

One way to imagine how client/server architecture works is to create in your mind the image of a bank teller who neither eats, sleeps, nor rests, serving one customer after another in a line that never seems to end (see Figure 2-2). The line might be long or it might be empty on occasion, but at any given moment, a customer might show up. Of course, such a teller was fantasy years ago, but automated teller machines (ATMs) seem to come close to such a model now.

The teller is, of course, the server that runs in an infinite loop. Each customer is a client with a need that must be addressed. Customers arrive and are handled by the teller in a first-come-first-served manner. Once a transaction has been completed, the client goes away while the server either serves the next customer or sits and waits until one comes along.

Why is all this important? The reason is that this style of execution is how client/server architecture works in a general sense. Now that you have the basic idea, let’s adapt it to network programming, which follows the software client/server architecture model.

[Figure 2-2: a bank teller serving customers in a line that never seems to end]

Client/Server Network Programming

Before a server can respond to client requests, some preliminary setup procedures must be performed to prepare it for the work that lies ahead. A communication endpoint is created which allows a server to listen for requests. One can liken our server to a company receptionist or switchboard operator who answers calls on the main corporate line. Once the phone number and equipment are installed and the operator arrives, the service can begin.

This process is the same in the networked world – once a communication endpoint has been established, our listening server can now enter its infinite loop, waiting for clients to connect, and responding to requests. Of course, to keep our corporate phone receptionist busy, we must not forget to put that phone number on company letterhead, in advertisements, or some sort of press release; otherwise, no one will ever call!

Similarly, potential clients must be made aware that this server exists to handle their needs – otherwise, the server will never get a single request. Imagine creating a brand new Web site. It might be the most super-duper, awesome, amazing, useful, and coolest Web site of all, but if the Web address or URL is never broadcast or advertised in any way, no one will ever know about it, and it will never see any visitors.

Now you have a good idea as to how the server works. You have made it past the difficult part. The client side is much simpler than the server side. All the client has to do is to create its single communication endpoint, and then establish a connection to the server. The client can now make a request, which includes any necessary exchange of data. Once the request has been processed and the client has received the result or some sort of acknowledgement, communication is terminated.
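The server and client roles described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the book's own example: the loopback address, the "echo: " reply prefix, and the helper names serve() and request() are all made up here, and the server handles a fixed number of requests instead of looping forever so that the script terminates.

```python
import socket
import threading

HOST = "127.0.0.1"  # loopback address, assumed for this sketch


def serve(listener, max_requests):
    """Handle clients one at a time, like a teller serving a line."""
    for _ in range(max_requests):            # a real server would loop forever
        conn, _addr = listener.accept()      # wait for the next client
        with conn:
            data = conn.recv(1024)           # read the request
            conn.sendall(b"echo: " + data)   # provide the "service"


def request(port, payload):
    """Create an endpoint, connect, make one request, then hang up."""
    with socket.create_connection((HOST, port), timeout=5) as client:
        client.sendall(payload)
        return client.recv(1024)


# Server setup: create the endpoint and start listening before clients call.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind((HOST, 0))                     # port 0: let the OS pick a free port
listener.listen(5)
port = listener.getsockname()[1]

# Run the server in a background thread so the clients can run in this one.
worker = threading.Thread(target=serve, args=(listener, 2), daemon=True)
worker.start()

# Two separate transactions, each with its own short-lived connection.
replies = [request(port, b"hello"), request(port, b"goodbye")]
worker.join(timeout=5)
listener.close()
print(replies)
```

Note how the setup work (bind, listen) happens once on the server, while each client request is a complete transaction: connect, send, receive, disconnect.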

2.3 Sockets: Communication Endpoints

In this subsection, you’ll be introduced to sockets, get some background on their origins, learn about the various types of sockets, and finally, see how they’re used to allow processes running on different (or the same) computers to communicate with each other.

2.3.1 What Are Sockets?

Sockets are computer networking data structures that embody the concept of the “communication endpoint”, described in the previous section. Networked applications must create sockets before any type of communication can commence. They can be likened to telephone jacks, without which, engaging in communication is impossible.

Sockets can trace their origins to the 1970s as part of the University of California, Berkeley version of Unix, known as BSD Unix. Therefore, you will sometimes hear these sockets referred to as Berkeley sockets or BSD sockets. Sockets were originally created for same-host applications where they would enable one running program (a.k.a. a process) to communicate with another running program. This is known as interprocess communication, or IPC. There are two types of sockets: file-based and network-oriented.

Unix sockets are the first family of sockets we are looking at and have a “family name” of AF_UNIX (a.k.a. AF_LOCAL, as specified in the POSIX1.g standard), which stands for address family: UNIX. Most popular platforms, including Python, use the term address family and the abbreviation AF; other perhaps older systems might refer to address families as domains or protocol families and use PF rather than AF. Similarly, AF_LOCAL (standardized in 2000-2001) is supposed to replace AF_UNIX; however, for backward-compatibility, many systems use both and just make them aliases to the same constant. Python itself still uses AF_UNIX.

Because both processes run on the same computer, these sockets are file-based, meaning that their underlying infrastructure is supported by the file system. This makes sense, because the file system is a shared constant between processes running on the same host.
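To make the "file-based" idea concrete, here is a small sketch in which two sockets meet at a path in the file system. The file name demo.sock and the helper name unix_socket_roundtrip() are hypothetical, and both endpoints live in one process only to keep the example short. AF_UNIX is not available on every platform, so the function returns None when the constant is missing.

```python
import os
import socket
import tempfile


def unix_socket_roundtrip(message):
    """Send one message between two sockets that meet at a file system path."""
    if not hasattr(socket, "AF_UNIX"):       # not available on every platform
        return None
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sock")    # hypothetical socket path
        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        server.bind(path)                    # binding creates the socket file
        server.listen(1)
        created = os.path.exists(path)       # the address is visible as a file

        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        client.connect(path)                 # the address is a path, not host/port
        conn, _ = server.accept()

        client.sendall(message)
        received = conn.recv(1024)

        for sock in (conn, client, server):
            sock.close()
    return created, received


result = unix_socket_roundtrip(b"ping")
print(result)
```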

The second type of socket is network-based and has its own family name, AF_INET, or address family: Internet. Another address family, AF_INET6, is used for Internet Protocol version 6 (IPv6) addressing. There are other address families, all of which are either specialized, antiquated, seldom used, or remain unimplemented. Of all address families, AF_INET is now the most widely used.

Support for a special type of Linux socket was introduced in Python 2.5. The AF_NETLINK family of (connectionless [see Section 2.3.3]) sockets allows for IPC between user and kernel-level code using the standard BSD socket interface. It is seen as an elegant and less risky solution over previous and more cumbersome solutions, such as adding new system calls, /proc support, or “ioctl”s to an operating system.

Another feature (new in version 2.6) for Linux is support for the Transparent Interprocess Communication (TIPC) protocol. TIPC is used to allow clusters of computers to “talk” to each other without using IP-based addressing. The Python support for TIPC comes in the form of the AF_TIPC family.

Overall, Python supports only the AF_UNIX, AF_NETLINK, AF_TIPC, and AF_INET{,6} families. Because of our focus on network programming, we will be using AF_INET for most of the remainder of this chapter.
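Which of these family constants actually exist in the socket module depends on the platform and on how Python was built (AF_NETLINK and AF_TIPC are Linux-specific, for example). A quick way to check on your own system, with the list of names taken from the paragraph above:

```python
import socket

# Address families named in the text; availability varies by platform.
FAMILY_NAMES = ["AF_UNIX", "AF_INET", "AF_INET6", "AF_NETLINK", "AF_TIPC"]

available = {name: hasattr(socket, name) for name in FAMILY_NAMES}

for name, present in available.items():
    status = "available" if present else "not available"
    print(f"{name:<11} {status}")
```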

2.3.2 Socket Addresses: Host-Port Pairs

If a socket is like a telephone jack – a piece of infrastructure that enables communication – then a hostname and port number are like an area code and telephone number combination. Having the hardware and ability to communicate doesn’t do any good unless you know to whom and how to “dial”. An Internet address is comprised of a hostname and port number pair, which is required for networked communication. It goes without saying that there should also be someone listening at the other end; otherwise, you get the familiar tones, followed by “I’m sorry, the number is no longer in service. Please check the number and try your call again.” You have probably seen one networking analogy during Web surfing, for example, “Unable to contact server. Server is not responding or is unreachable.”

Valid port numbers range from 0 to 65535, although those less than 1024 are reserved for the system. If you are using a POSIX-compliant system (e.g., Linux, Mac OS X, etc.), the list of reserved port numbers (along with servers/protocols and socket types) is found in the /etc/services file. A list of well-known port numbers is accessible at this Web site: http://www.iana.org/assignments/port-numbers
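In Python, an AF_INET address is written as a (host, port) tuple. The sketch below shows such a pair along with the port ranges just described. The helper names is_valid_port() and is_reserved_port() are made up for this illustration; binding to port 0 asks the operating system to choose any free port, which avoids guessing at one that might already be in use.

```python
import socket


def is_valid_port(port):
    """True if port falls in the valid 0-65535 range."""
    return isinstance(port, int) and 0 <= port <= 65535


def is_reserved_port(port):
    """True if port is in the system-reserved range below 1024."""
    return is_valid_port(port) and port < 1024


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.bind(("127.0.0.1", 0))        # (host, port) pair; 0 means "any free port"
host, port = sock.getsockname()    # the address the OS actually assigned
sock.close()

print(host, port, is_reserved_port(port))
```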

2.3.3 Connection-Oriented Sockets vs. Connectionless

Connection-Oriented Sockets

Regardless of which address family you are using, there are two different styles of socket connections. The first type is connection-oriented. What this means is that a connection must be established before communication can occur, such as calling a friend using the telephone system. This type of communication is also referred to as a virtual circuit or stream socket.

Connection-oriented communication offers sequenced, reliable, and unduplicated delivery of data, without record boundaries. That basically means that each message may be broken up into multiple pieces, which are all guaranteed to arrive at their destination, put back together and in order, and delivered to the waiting application.

The primary protocol that implements such connection types is the Transmission Control Protocol (better known by its acronym, TCP). To create TCP sockets, one must use SOCK_STREAM as the socket type. The SOCK_STREAM name for a TCP socket is based on one of its denotations as stream socket. Because the networked version of these sockets (AF_INET) uses the Internet Protocol (IP) to find hosts in the network, the entire system generally goes by the combined names of both protocols (TCP and IP), or TCP/IP. (Of course, you can also use TCP with local [nonnetworked AF_LOCAL/AF_UNIX] sockets, but obviously there’s no IP usage there.)
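The phrase "without record boundaries" is easier to see in code. In this sketch (loopback address and tiny 4-byte reads chosen purely for illustration), the sender makes two separate sendall() calls, but the receiver just sees one ordered stream of bytes and decides for itself how to slice it up.

```python
import socket

# Set up a connected pair of TCP sockets on the loopback interface.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

sender = socket.create_connection(listener.getsockname(), timeout=5)
receiver, _ = listener.accept()
receiver.settimeout(5)

# Two separate writes on the sending side...
sender.sendall(b"hello")
sender.sendall(b"world")
sender.close()                       # closing marks the end of the stream

# ...arrive as a single stream; the reader picks its own chunk size.
chunks = []
while True:
    chunk = receiver.recv(4)         # deliberately small reads
    if not chunk:                    # empty bytes: the peer closed the stream
        break
    chunks.append(chunk)

stream = b"".join(chunks)
receiver.close()
listener.close()
print(chunks, stream)
```

The exact contents of chunks can vary from run to run, but the reassembled stream is always complete and in order. An application that needs message boundaries over TCP has to add them itself, for instance with a delimiter or a length prefix.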

Connectionless Sockets

In stark contrast to virtual circuits is the datagram type of socket, which is connectionless. This means that no connection is necessary before communication can begin. Here, there are no guarantees of sequencing, reliability, or nonduplication in the process of data delivery. Datagrams do preserve record boundaries, however, meaning that entire messages are sent rather than being broken into pieces first, as is the case with connection-oriented protocols.

Message delivery using datagrams can be compared to the postal service. Letters and packages might not arrive in the order they were sent. In fact, they might not arrive at all! To add to the complication, in the land of networking, duplication is even possible.

So with all this negativity, why use datagrams at all? (There must be some advantage over using stream sockets.) Because of the guarantees provided by connection-oriented sockets, a good amount of overhead is required for their setup as well as in maintaining the virtual circuit connection. Datagrams do not have this overhead and thus are “less expensive.” They usually provide better performance and might be suitable for some types of applications.

The primary protocol that implements such connectionless communication is the User Datagram Protocol (better known by its acronym, UDP). To create UDP sockets, we must use SOCK_DGRAM as the socket type. The SOCK_DGRAM name for a UDP socket, as you can probably tell, comes from the word “datagram.” Because these sockets also use the Internet Protocol to find hosts in the network, this system also has a more general name, going by the combined names of both of these protocols (UDP and IP), or UDP/IP.
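For comparison with stream sockets, here is a minimal datagram sketch, again assuming the loopback interface. No connection is made: the sender simply addresses each message to the receiver, and every recvfrom() call returns exactly one whole message. Delivery is very dependable on loopback, but across a real network either datagram could be lost, duplicated, or reordered, which is why the receiver sets a timeout instead of waiting forever.

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
receiver.settimeout(5)               # do not block forever on a lost datagram
address = receiver.getsockname()

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello", address)     # no connect() or accept() needed
sender.sendto(b"world", address)

messages = []
for _ in range(2):
    data, _peer = receiver.recvfrom(1024)   # one call, one whole datagram
    messages.append(data)

sender.close()
receiver.close()
print(messages)
```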

2.4 Network Programming in Python

Now that you know all about client/server architecture, sockets, and networking, let’s try to bring these concepts to Python. The primary module we will be using in this section is the socket module. Found within this module is the socket() function, which is used to create socket objects. Sockets also have their own set of methods, which enable socket-based network communication.
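As a first taste, the sketch below calls socket.socket() with an address family and a socket type, the two choices discussed in the previous sections, to create one TCP and one UDP socket object. The variable names are arbitrary.

```python
import socket

# socket.socket(family, type) returns a new socket object.
tcp_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)   # TCP/IP
udp_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)    # UDP/IP

for sock in (tcp_sock, udp_sock):
    print(sock.family, sock.type)

# Sockets hold operating system resources, so close them when done.
tcp_sock.close()
udp_sock.close()
```

Nothing is sent or received yet; the socket objects' methods (such as bind(), connect(), send(), and recv()) are what carry out the actual communication.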

The rest is best read directly in the book, so I will stop copying here.