
Professional Java.JDK.5.Edition (Wrox)


Chapter 11

                    break;
                out.print(c);
                out.flush();
            }
        } catch (IOException ioe) {
            ioe.printStackTrace();
        } finally {
            try {
                if (socket != null) {
                    socket.close();
                }
            } catch (IOException ex) {
                ex.printStackTrace();
            }
        }
    }

    public static void main(String[] args) {
        // our default port
        int port = 1500;

        // use port passed in by the command line, if one was
        if (args.length >= 1) {
            try {
                port = Integer.parseInt(args[0]);
            } catch (NumberFormatException nfe) {
                System.out.println("Error: port must be a number -- using 1500 instead.");
            }
        }

        try {
            ServerSocket serverSocket = new ServerSocket(port);
            System.out.println("Echo Server Running...");
            int counter = 0;

            while (true) {
                Socket client = serverSocket.accept();
                System.out.println("Accepted a connection from "
                        + client.getInetAddress().getHostName());

                // use multiple threads to handle simultaneous connections
                Thread t = new Thread(new SocketEcho(client));
                t.setName(client.getInetAddress().getHostName() + ":" + counter++);
                t.start(); // starts up the new thread and SocketEcho.run() is called
            }
        } catch (IOException ioe) {
            ioe.printStackTrace();
        }
    }
}


Communicating between Java Components and Components of Other Platforms

Running the Echo Server

To start up the echo server, simply run it like any other Java application from the command prompt:

java book.SocketEcho

Once the server is started, it will begin accepting connections on port 1500 (or whichever port was specified as a parameter on the command line). Whenever a connection is accepted, information about who connected is printed to the screen, as seen in Figure 11-2.

Figure 11-2

To connect to your server, run Telnet as the client. Because you are running your server on a different port than Telnet’s default, you have to specify the port to which you want Telnet to connect:

telnet localhost 1500

Notice that the welcome message is displayed. Now anything you type will be sent to the server and then echoed back to your screen. If you press the ? character, the server closes the connection. Figure 11-3 shows an example conversation between the client and server.

Figure 11-3

Implementing a Protocol

Sockets provide the building blocks for developing communication languages, or protocols, between two separate applications. TCP sockets provide input and output streams, but any data sent on one end is simply bytes to the other end unless the other end understands its meaning. In the previous echo server example, the server did not understand any of the data sent to it. It only read the data and passed it back to the client. In practice, applications such as these are useful mainly for testing network connectivity. To have any sort of meaningful communication, both a client and a server must speak the same language, or protocol.

Implementing protocols is a difficult task. As you have seen previously, sockets in Java are not difficult to program; they are simply another way of reading from an input stream and writing to an output stream. Many of the hard tasks associated with socket programming are the same hard problems associated with reading certain types of files. Files are structured in some meaningful way; for instance, bitmaps are basically a two-dimensional array of color values. Programs that can read and display bitmaps must understand how to parse the file format. Writing parsers for anything more involved than simple text commands can be a daunting task, and is outside the scope of this chapter.

Implementing a protocol requires agreeing on some form of contract (or file/data format) between the client and server. Once this protocol has been developed, clients and servers can implement it to talk to each other. The protocol needs to be unambiguous for two separate implementations to work correctly with each other, and specifying a protocol that precisely is no trivial task.

In this section, a simple implementation of one of the commands in HTTP will be explored. By implementing just a minute fraction of a simple textual protocol like HTTP, you will appreciate the difficulty in writing and implementing more detailed protocols. Other options will then follow that spare application programmers the need to reinvent the wheel by writing new protocols for every application they develop.
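To make the point about bytes and meaning concrete, here is a minimal sketch (the class name BytesVersusMeaning and its two helper methods are hypothetical, chosen only for illustration). It takes the same four bytes and interprets them once as US-ASCII text, the way a text protocol such as HTTP would, and once as a big-endian 32-bit integer, the way a binary protocol might. Neither reading is "right" until both ends agree on one.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class BytesVersusMeaning {

    // Interpret the bytes as US-ASCII text, as a text protocol would.
    public static String asText(byte[] data) {
        return new String(data, StandardCharsets.US_ASCII);
    }

    // Interpret the first four bytes as one big-endian 32-bit integer,
    // as a binary protocol might. ByteBuffer is big-endian by default.
    public static int asInt(byte[] data) {
        return ByteBuffer.wrap(data).getInt();
    }

    public static void main(String[] args) {
        // The same four bytes arriving on a socket
        byte[] wire = {0x47, 0x45, 0x54, 0x20};

        System.out.println("as text: \"" + asText(wire) + "\""); // "GET "
        System.out.println("as int:  " + asInt(wire));           // 1195725856
    }
}
```

The receiver can only choose between these readings because the protocol tells it which one applies.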

Protocol Specification

During the development of an application that uses sockets, there comes a point where either a custom protocol is defined or an existing protocol is adopted as the foundation for all of the application’s socket logic. A custom protocol is needed only for specialized applications. For example, the communications modules of NASA’s Mars landers probably have to use sockets to issue commands to the robot and receive its status (or if not sockets, some other software abstraction of communication for which you would develop your own protocol). A custom protocol would need to be specified and implemented for the lander’s unique set of commands. In most applications, though, an existing protocol probably suits the application’s needs.

There are many different ways to write a protocol specification, and this chapter will not delve into such matters, as it is a large subject on its own. In this section, HTTP is used as a test case for implementing someone else’s protocol. Only a small portion of the HTTP specification will be examined, and a simple piece of it implemented.

Basic Elements of HTTP

HTTP follows the simple request/response paradigm. A client sends a request to an HTTP server, issuing a particular command. The server then returns a response to the client based upon what command was sent. HTTP is a stateless protocol, meaning that the HTTP server does not need to retain information about a particular client across different requests. Every request is treated the same, no matter what requests a client has previously made.

Note: There are ways to simulate state over HTTP, and this is what all Web applications do. They use session identifiers and cookies to retain information about a particular client across multiple requests. This is how sites like amazon.com can identify particular users and provide one of the building blocks necessary for e-commerce.

HTTP was developed purposely to be a simple protocol and easy to implement. This is why things such as stateful-session support had to be built on top of HTTP later; HTTP was originally designed just to be a mechanism for transferring HTML pages across a network. In HTTP, a client merely connects to a port (usually 80) on a remote machine and issues an HTTP command. The main HTTP commands are:

GET. Retrieves the content found at the URL specified.

POST. Sends data to the HTTP server and retrieves the content found at the URL specified. Oftentimes the content the HTTP server passes back is based on the data sent in by the POST command (for example, form data passed to a server).

PUT. Asks the HTTP server to store the data sent with the request at the URL specified.

HEAD. Retrieves only the HTTP headers of the response, and not the actual content.

DELETE. Asks the HTTP server to delete the content found at the URL specified.

After receiving an HTTP command, an HTTP server returns a response. It returns a response code to indicate something about the response. I’m sure you have seen some of these response codes while simply browsing the Web. Depending on which response code is returned, content may be returned along with the response code. The client can then parse through the content and display it as necessary. Some of the more common HTTP response codes are

200. Response OK, the request was fulfilled.

404. The requested URL could not be found.

403. The request for the URL was forbidden.

500. The server encountered an internal error that prevented it from fulfilling the request.
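The first digit of a response code identifies its general class, which is how a client can react sensibly even to a code it has never seen. The following is a minimal sketch of that idea; the class name StatusClass and the method classify are hypothetical names used only for this illustration.

```java
public class StatusClass {

    // Maps a three-digit HTTP status code to its general class,
    // based on the first digit.
    public static String classify(int code) {
        switch (code / 100) {
            case 1:  return "informational";
            case 2:  return "success";
            case 3:  return "redirection";
            case 4:  return "client error";
            case 5:  return "server error";
            default: return "unknown";
        }
    }

    public static void main(String[] args) {
        int[] samples = {200, 404, 403, 500};
        for (int code : samples) {
            System.out.println(code + " -> " + classify(code));
        }
    }
}
```

With this grouping, 404 and 403 both fall under "client error", while 500 is a "server error".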

See the actual HTTP specification online at the following URL:

http://www.w3.org/Protocols/HTTP/

It is detailed and precise, and gives a good idea of what a specification for even a protocol as simple as HTTP looks like. For this example, you are going to look at a simple implementation of GET and how it is built.

A Simple Implementation of HTTP GET

By implementing a small portion of a protocol, the inherent complexity and difficulty of implementing a full protocol specification will be revealed. Writing custom protocols is no picnic, and often leads to hard-to-maintain systems. Open, published protocols such as HTTP are among the easiest to implement: source code for reference and sample implementations can often be found, and freely available test suites for validating an implementation often exist. In the next example, some of the details of HTTP GET (though by no means all) are examined first. Your implementation of a simple, stripped-down version of GET can then commence, concluding with a look at some methods for testing the validity of the implementation.

Background on HTTP GET

HTTP GET is probably the most commonly used HTTP request operation. Anytime a user types a URL into the address bar of his or her browser and navigates to that URL, GET is used. GET simply asks the server to retrieve a particular file. The server returns a response code indicating whether or not it was successful and, if successful, returns the file. A sample HTTP GET request looks like this:

GET / HTTP/1.1
Accept: */*
Accept-Language: en-nz
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.1.4322)
Host: www.cnn.com
Connection: Keep-Alive

Notice the format of the request. First the HTTP command line is given:

GET / HTTP/1.1

GET signifies the HTTP GET command. The / signifies the file on the server (in this case the root file); for example, it could be /index.html, which would correspond to the URL http://www.cnn.com/index.html. The HTTP/1.1 signifies which version of HTTP is being used by this request; here it is version 1.1 of the protocol. HTTP/1.0 is the other valid entry in this field.
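The three fields of the command line can be pulled apart with very little code. The following is a minimal sketch from the point of view of a program receiving such a line; RequestLine is a hypothetical class written for this illustration, and it assumes the fields are separated by single spaces, as in the sample above.

```java
public class RequestLine {

    public final String method;
    public final String path;
    public final String version;

    private RequestLine(String method, String path, String version) {
        this.method = method;
        this.path = path;
        this.version = version;
    }

    // Splits a line such as "GET / HTTP/1.1" into its three fields.
    // Throws IllegalArgumentException if the line does not have that shape.
    public static RequestLine parse(String line) {
        String[] parts = line.trim().split(" ");
        if (parts.length != 3 || !parts[2].startsWith("HTTP/")) {
            throw new IllegalArgumentException("Malformed request line: " + line);
        }
        return new RequestLine(parts[0], parts[1], parts[2]);
    }

    public static void main(String[] args) {
        RequestLine request = parse("GET / HTTP/1.1");
        System.out.println("method:  " + request.method);
        System.out.println("path:    " + request.path);
        System.out.println("version: " + request.version);
    }
}
```

Rejecting anything that does not have exactly three fields is a deliberate choice here: an ambiguous line is treated as an error rather than guessed at.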

After the HTTP command line, HTTP headers follow. An HTTP header follows the format:

Key: Value

Headers are optional in HTTP 1.0, but in 1.1 certain headers (notably Host) are defined to be required, though most HTTP servers are lenient and do not enforce these requirements. Many of the optional features of HTTP are built on top of headers; features such as compressing responses or setting cookies are all based on HTTP headers. This part of the book will not delve further into the meaning of individual HTTP headers, as this simple implementation of HTTP GET will not make use of them. The headers are terminated by a blank line, that is, two consecutive line terminators (the specification uses a carriage return followed by a line feed for each). This notifies the server that no more HTTP headers will be sent, and the server can begin sending the response.
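As an illustration of the Key: Value format, here is a minimal sketch of a header parser. HeaderLine, parse, and parseAll are hypothetical names for this example only. It splits at the first colon (values such as dates contain colons themselves) and stores the results in a map that ignores case in header names, since HTTP header names are case-insensitive. As a simplification, a repeated header simply overwrites the earlier value.

```java
import java.util.Map;
import java.util.TreeMap;

public class HeaderLine {

    // Splits "Key: Value" at the first colon and trims both halves.
    public static String[] parse(String line) {
        int colon = line.indexOf(':');
        if (colon <= 0) {
            throw new IllegalArgumentException("Malformed header: " + line);
        }
        String name = line.substring(0, colon).trim();
        String value = line.substring(colon + 1).trim();
        return new String[] {name, value};
    }

    // Collects header lines into a map whose keys compare case-insensitively.
    public static Map<String, String> parseAll(String... lines) {
        Map<String, String> headers =
                new TreeMap<String, String>(String.CASE_INSENSITIVE_ORDER);
        for (String line : lines) {
            String[] pair = parse(line);
            headers.put(pair[0], pair[1]);
        }
        return headers;
    }

    public static void main(String[] args) {
        Map<String, String> headers = parseAll(
                "Host: www.cnn.com",
                "Connection: Keep-Alive",
                "Date: Tue, 08 Jun 2004 10:33:25 GMT");
        System.out.println(headers.get("host"));
        System.out.println(headers.get("Date"));
    }
}
```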

An HTTP response is similar in structure to an HTTP request. The first line of a response contains the HTTP response status code. Headers follow, and then the content of the file requested (in the case of a successful HTTP GET). The response you receive from your request in the previous example looks like this:

HTTP/1.1 200 OK
Server: Netscape-Enterprise/6.1 AOL
Date: Tue, 08 Jun 2004 10:33:25 GMT
Last-modified: Tue, 08 Jun 2004 10:33:23 GMT
Expires: Tue, 08 Jun 2004 10:34:23 GMT
Cache-control: private,max-age=60
Content-type: text/html
Transfer-Encoding: chunked

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"><html lang="en"><head><title>CNN.com</title>
... (more html follows)

The first line of the response contains the HTTP protocol version, the status code of the response, and a brief textual message indicating the nature of the response code. Following are headers, and then the actual content of the page requested. An implementation of HTTP GET must be able to read the status code to determine and report back to the user the success or failure to retrieve a page.


HttpGetter: The Implementation

Your implementation of HTTP GET will be a simple command-line Java application that saves a remote HTML file specified by the user to a local file. The application performs four main tasks in sequence:

1. Parse the URL and the local file location for saving the remote file from the command-line parameters.

2. Set up the Socket and InetSocketAddress corresponding to the URL parsed from the command line, and connect to the remote host.

3. Write the HTTP GET request to the Socket’s OutputStream.

4. Read the HTTP GET response from the Socket’s InputStream, and write the remote file to disk at the file location specified on the command line.

To parse the URL from the command line, you will use the java.net.URL class. This class breaks up a URL into its components, such as host, port, and file. The code to parse the URL and local filename to save the URL to disk from the command-line parameters is straightforward:

URL url = new URL(args[0]);

File outFile = new File(args[1]);

Note: Persons experienced with the URL class will note that it already has HTTP protocol capabilities — we will not be using them, as the exercise is to show the HTTP protocol via sockets.

Now that the URL has been successfully parsed, the connection to the remote server can be set up. Using socket programming techniques learned from the previous section, the connection is set up as follows:

Socket socket = new Socket();

int port = url.getPort();
if (port == -1)
    port = url.getDefaultPort();

InetSocketAddress remoteAddress = new InetSocketAddress(url.getHost(), port);
socket.connect(remoteAddress);

One of the idiosyncrasies of the URL class is that getPort() returns -1 when no port is explicitly set in the URL (an explicit port looks like http://www.example.com:1234), so you have to check for that value and fall back to the protocol’s default port. Once you have the port, you can create the InetSocketAddress, representing the endpoint on the remote server, and then connect to it.
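The port check is small enough to isolate in a helper, which makes the behavior of getPort() and getDefaultPort() easy to see side by side. This is only a sketch; the class EffectivePort and its method of are hypothetical names, not part of the chapter's listing.

```java
import java.net.MalformedURLException;
import java.net.URL;

public class EffectivePort {

    // Returns the port written in the URL, or the protocol's default
    // port when the URL does not name one (getPort() returns -1).
    public static int of(URL url) {
        int port = url.getPort();
        return (port == -1) ? url.getDefaultPort() : port;
    }

    public static void main(String[] args) throws MalformedURLException {
        System.out.println(of(new URL("http://www.example.com:1234/"))); // 1234
        System.out.println(of(new URL("http://www.example.com/")));      // 80
    }
}
```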

Now that you are connected to the remote server, you simply write the request to the socket’s output stream, and then read the HTTP server’s response from the input stream. Since HTTP is a text-based protocol, PrintWriter is a convenient class to wrap your Socket’s OutputStream and send character data over the socket. Notice in the code below how the two HTTP headers, User-Agent and Host, are sent. User-Agent tells the HTTP server what client software is making the request. Since your client software is called HttpGetter, that is the value put in the header. This header is mainly a courtesy to the server, since many Web servers return different content based on the value of User-Agent (for example, Netscape-compatible pages or Internet Explorer-compatible pages). The Host value is simply the hostname of the remote server to which you are connecting:


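PrintWriter out = new PrintWriter(socket.getOutputStream());

// write our client's request; HTTP lines end in CRLF, so the line
// terminators are written explicitly rather than with println(),
// which uses the platform's line separator
out.print("GET " + url.getFile() + " HTTP/1.0\r\n");
out.print("User-Agent: HttpGetter\r\n");
out.print("Host: " + url.getHost() + "\r\n");
out.print("\r\n");
out.flush();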
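One way to make the request easier to inspect is to assemble it as a single string before writing it to the socket. The sketch below does that; GetRequestBuilder and build are hypothetical names for this illustration. It also covers one case the code above does not: for a URL such as http://www.example.com with no trailing slash, getFile() returns an empty string, so the sketch substitutes / as the requested file. Lines are terminated with an explicit carriage return and line feed, as the HTTP specification requires.

```java
public class GetRequestBuilder {

    private static final String CRLF = "\r\n";

    // Builds a minimal HTTP/1.0 GET request for the given host and file.
    // An empty or missing file is replaced by "/", the root of the server.
    public static String build(String host, String file) {
        String target = (file == null || file.isEmpty()) ? "/" : file;
        return "GET " + target + " HTTP/1.0" + CRLF
                + "User-Agent: HttpGetter" + CRLF
                + "Host: " + host + CRLF
                + CRLF; // blank line: end of headers
    }

    public static void main(String[] args) {
        // Replace CRLF for display so each line of the request is visible
        String request = build("www.example.com", "");
        System.out.print(request.replace(CRLF, "<CRLF>\n"));
    }
}
```

The resulting string could be written with out.print(request) followed by out.flush(), using the values of url.getHost() and url.getFile() as the two arguments.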

After you send the request, you must read the response. The first line of any HTTP response contains the status code for the request. That is the first thing you must check: if the response code is anything other than 200 (OK), you do not want to save the contents of the input to a file, since whatever content is sent back is typically an error message or redirect notice rather than the page you asked for. In the first line of the response, the status code is the second of the three groups of information:

HTTP/1.1 200 OK

You want to parse out the 200 in the case above and then continue, since 200 is HTTP OK, meaning your request was successfully processed and the content of the page you requested will follow. In the following code, first use a BufferedReader to begin reading character data from the remote server. To parse the status code out of the first line, use a StringTokenizer to separate the three groups of values and then choose the second one to convert to an integer:

Note: Since you are using a BufferedReader, you can only read character data from the remote server. This means that your implementation will not be able to request any file in your HTTP GET command that contains binary data (such as an image file, a zip file, and so on).

InputStream in = socket.getInputStream();
boolean responseOK = true;

BufferedReader br = new BufferedReader(new InputStreamReader(in));
String currLine = null;

// get http response code from first line of result
currLine = br.readLine();

if (currLine != null) {
    System.out.println(currLine);

    StringTokenizer st = new StringTokenizer(currLine, " \t");
    st.nextToken();
    String responseCode = st.nextToken();

    int httpResponseCode = Integer.parseInt(responseCode.trim());

    if (httpResponseCode != 200) {
        // response not OK
        responseOK = false;
    }
} else {
    System.err.println("Server returned no response!");
    System.exit(1);
}
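The code above trusts the server to send a well-formed first line; a line with fewer than two tokens or a non-numeric code would raise an unchecked exception. As a hedged alternative, the sketch below returns -1 for anything that does not look like a status line. StatusLine and parseStatusCode are hypothetical names, and using -1 as the "malformed" marker is an assumption of this example, not something the protocol defines.

```java
public class StatusLine {

    // Extracts the status code from a line such as "HTTP/1.1 200 OK".
    // Returns -1 if the line is missing, does not start with "HTTP/",
    // or does not carry a three-digit code in the second position.
    public static int parseStatusCode(String statusLine) {
        if (statusLine == null) {
            return -1;
        }
        // Limit of 3 keeps a multi-word reason phrase ("Not Found") together
        String[] parts = statusLine.trim().split("\\s+", 3);
        if (parts.length < 2 || !parts[0].startsWith("HTTP/")) {
            return -1;
        }
        if (!parts[1].matches("\\d{3}")) {
            return -1;
        }
        return Integer.parseInt(parts[1]);
    }

    public static void main(String[] args) {
        System.out.println(parseStatusCode("HTTP/1.1 200 OK"));        // 200
        System.out.println(parseStatusCode("HTTP/1.1 404 Not Found")); // 404
        System.out.println(parseStatusCode("not an http response"));   // -1
    }
}
```

A caller could then treat any result other than 200 as "do not save the file", which matches the behavior of the responseOK flag.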


The last step is to print out the headers, and then save the content of the response to the file specified at the command line by the user. The headers follow the status-code line of the response until a blank line is encountered. The first loop in the code below simply prints the headers on the standard output stream for the user to see until it encounters a blank line, at which point it breaks out of the loop, knowing the content will immediately follow. If the status code previously parsed was 200, the remaining content found in the Socket’s InputStream (which is wrapped in a BufferedReader) is saved to the file specified by the user:

// read headers
while ((currLine = br.readLine()) != null) {
    System.out.println(currLine);

    // done reading headers, so break out of loop
    if (currLine.trim().equals(""))
        break;
}

if (responseOK) {
    FileOutputStream fout = new FileOutputStream(outFile);

    // br.read() returns one decoded character; write(int) keeps only its
    // low 8 bits, so non-ASCII characters may be corrupted
    int currByte;
    while ((currByte = br.read()) != -1)
        fout.write(currByte);

    fout.close();

    System.out.println("** Wrote result to " + args[1]);
} else {
    System.out.println("HTTP response code not OK -- file not written");
}

The following is the full listing for the code for HttpGetter:

package book;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetSocketAddress;
import java.net.MalformedURLException;
import java.net.Socket;
import java.net.URL;
import java.util.StringTokenizer;

public class HttpGetter {

    public static void main(String[] args) {
        try {
            if (args.length < 2) {
                System.out.println("Usage");
                System.out.println("\tHttpGetter <Http URL> <file to save>");
                System.out.println("\tExample: HttpGetter http://www.google.com/ google.html");
                System.exit(1);
            }

            URL url = new URL(args[0]);
            File outFile = new File(args[1]);

            Socket socket = new Socket();

            int port = url.getPort();
            if (port == -1)
                port = url.getDefaultPort();

            InetSocketAddress remoteAddress = new InetSocketAddress(url.getHost(), port);
            socket.connect(remoteAddress);

            PrintWriter out = new PrintWriter(socket.getOutputStream());

            // write our client's request; HTTP lines end in CRLF, so the line
            // terminators are written explicitly rather than with println()
            out.print("GET " + url.getFile() + " HTTP/1.0\r\n");
            out.print("User-Agent: HttpGetter\r\n");
            out.print("Host: " + url.getHost() + "\r\n");
            out.print("\r\n");
            out.flush();

            // read remote server's response
            InputStream in = socket.getInputStream();
            boolean responseOK = true;

            BufferedReader br = new BufferedReader(new InputStreamReader(in));
            String currLine = null;

            // get http response code from first line of result
            currLine = br.readLine();

            if (currLine != null) {
                System.out.println(currLine);

                StringTokenizer st = new StringTokenizer(currLine, " \t");
                st.nextToken();
                String responseCode = st.nextToken();

                int httpResponseCode = Integer.parseInt(responseCode.trim());

                if (httpResponseCode != 200) {
                    // response not OK
                    responseOK = false;
                }
            } else {
                System.err.println("Server returned no response!");
                System.exit(1);
            }

            // read headers
            while ((currLine = br.readLine()) != null) {
                System.out.println(currLine);

                // done reading headers, so break out of loop
                if (currLine.trim().equals(""))
                    break;
            }

            if (responseOK) {
                FileOutputStream fout = new FileOutputStream(outFile);

                // br.read() returns one decoded character; write(int) keeps only
                // its low 8 bits, so non-ASCII characters may be corrupted
                int currByte;
                while ((currByte = br.read()) != -1)
                    fout.write(currByte);

                fout.close();

                System.out.println("** Wrote result to " + args[1]);
            } else {
                System.out.println("HTTP response code not OK -- file not written");
            }

            socket.close();
        } catch (MalformedURLException me) {
            me.printStackTrace();
        } catch (IOException ioe) {
            ioe.printStackTrace();
        }
    }
}

Congratulations, you have implemented part of a real protocol. There are a couple of things to note about this simple implementation. First, as noted before, your implementation can only read text, not binary data, which makes it not very robust, since images and other binary files are frequently served from HTTP servers. Second, it does not handle errors gracefully, and in reality would require more of a full-fledged parser than this handyman use of java.io. This implementation is the minimal amount of code and logic needed to implement HTTP GET.
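To give an idea of what lifting the text-only limitation might involve, here is a minimal sketch that never wraps the stream in a Reader. It reads the status line and headers one byte at a time up to the blank line, then copies the remaining bytes unchanged. BinarySafeBody and its three methods are hypothetical names for this illustration, the sketch ignores the status code and headers it skips, and it demonstrates the idea on an in-memory response rather than a live socket.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class BinarySafeBody {

    // Reads bytes up to the next LF and returns them as text without the
    // trailing CR. Returns null if the stream ends before any byte is read.
    public static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            line.write(b);
        }
        if (b == -1 && line.size() == 0) {
            return null;
        }
        String text = line.toString("ISO-8859-1");
        return text.endsWith("\r") ? text.substring(0, text.length() - 1) : text;
    }

    // Consumes the status line and headers, stopping after the blank line.
    public static void skipStatusAndHeaders(InputStream in) throws IOException {
        String line;
        while ((line = readLine(in)) != null && !line.isEmpty()) {
            // a real client would inspect the status line and headers here
        }
    }

    // Copies the rest of the stream byte for byte, with no character decoding.
    public static void copyBody(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
    }

    public static void main(String[] args) throws IOException {
        // A fake response: text headers followed by three non-text bytes
        byte[] head = "HTTP/1.0 200 OK\r\nContent-Type: image/png\r\n\r\n".getBytes("US-ASCII");
        byte[] body = {0x00, (byte) 0xFF, (byte) 0x80};
        ByteArrayOutputStream raw = new ByteArrayOutputStream();
        raw.write(head);
        raw.write(body);

        InputStream in = new ByteArrayInputStream(raw.toByteArray());
        skipStatusAndHeaders(in);
        ByteArrayOutputStream saved = new ByteArrayOutputStream();
        copyBody(in, saved);
        System.out.println("body bytes copied: " + saved.size()); // 3
    }
}
```

With a real socket, the InputStream would typically be wrapped once in a BufferedInputStream so that the byte-at-a-time header reads do not each hit the network, and the OutputStream would be a FileOutputStream.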

The command-line screen shot in Figure 11-4 shows a user downloading the root Web page of http://www.google.com/ to google.html.

Figure 11-4
