Friday 29 March 2013

The Hypertext Transfer Protocol

The Hypertext Transfer Protocol (HTTP) is an application-level protocol with the lightness and speed necessary for distributed, collaborative, hypermedia information systems. It is a generic, stateless, object-oriented protocol which can be used for many tasks, such as name servers and distributed object management systems, through extension of its request methods (commands). A feature of HTTP is the typing and negotiation of data representation, allowing systems to be built independently of the data being transferred.
In the general form HTTP is the protocol that clients and servers use to communicate on the Web. HTTP is the underlying mechanism on which CGI operates, and it directly determines what you can and cannot send or receive via CGI.

HTTP Properties
A comprehensive addressing scheme
The HTTP protocol uses the concept of reference provided by the Universal Resource Identifier (URI) as a location (URL) or name (URN), for indicating the resource on which a method is to be applied. When an HTML hyperlink is composed, the URL (Uniform Resource Locator) is of the general form http://host:port-number/path/file.html. More generally, a URL reference is of the type service://host/file.file-extension and in this way, the HTTP protocol can subsume the more basic Internet services.
Client-Server Architecture
The HTTP protocol is based on a request/response paradigm. The communication generally takes place over a TCP/IP connection on the Internet. The default port is 80, but other ports can be used. This does not preclude the HTTP/1.0 protocol from being implemented on top of any other protocol on the Internet, so long as reliability can be guaranteed.
The HTTP protocol is connectionless and stateless
After the server has responded to the client's request, the connection between client and server is dropped and forgotten. There is no "memory" between client connections. The pure HTTP server implementation treats every request as if it was brand-new, i.e. without context.
An extensible and open representation for data types
HTTP uses Internet Media Types (formerly referred to as MIME Content-Types) to provide open and extensible data typing and type negotiation. When the HTTP Server transmits information back to the client, it includes a MIME-like (Multipart Internet Mail Extension) header to inform the client what kind of data follows the header. Translation then depends on the client possessing the appropriate utility (image viewer, movie player, etc.) corresponding to that data type.

HTTP Header Fields
An HTTP transaction consists of a header followed optionally by an empty line and some data. The header will specify such things as the action required of the server, or the type of data being returned, or a status code.
The header lines received from the client, if any, are placed by the server into the CGI environment variables with the prefix HTTP_ followed by the header name. Any - characters in the header name are changed to _ characters. The server may exclude any headers which it has already processed, such as Authorization, Content-type, and Content-length.
HTTP_ACCEPT
The MIME (Multipurpose Internet Mail Extension) types which the client will accept, as given by HTTP headers. Other protocols may need to get this information from elsewhere. Each item in this list should be separated by commas as per the HTTP spec.
Format: type/subtype, type/subtype
HTTP_USER_AGENT
The browser the client is using to send the request. General format: software/version library/version.
The server sends back to the client:
1).A status code that indicates whether the request was successful or not. Typical error codes indicate that the requested file was not found, that the request was malformed, or that authentication is required to access the file.
2).The data itself. Since HTTP is liberal about sending documents of any format, it is ideal for transmitting multimedia such as graphics, audio, and video files.
3).It also sends back information about the object being returned.
Fields are:
Content-Type
Indicates the media type of the data sent to the recipient or, in the case of the HEAD method, the media type that would have been sent had the request been a GET.
Content-Type: text/html
Date
The date and time at which the message was originated.
Date: Tue, 15 Nov 1994 08:12:31 GMT
Expires
The date after which the information in the document ceases to be valid. Caching clients, including proxies, must not cache this copy of the resource beyond the date given, unless its status has been updated by a later check of the origin server.
Expires: Thu, 01 Dec 1994 16:00:00 GMT
From
An Internet e-mail address for the human user who controls the requesting user agent. The request is being performed on behalf of the person given, who accepts responsibility for the method performed. Robot agents should include this header so that the person responsible for running the robot can be contacted if problems occur on the receiving end.
From: Stars@WDVL.com
If-Modified-Since
Used with the GET method to make it conditional: if the requested resource has not been modified since the time specified in this field, a copy of the resource will not be returned from the server; instead, a 304 (not modified) response will be returned without any data.
If-Modified-Since: Sat, 29 Oct 1994 19:43:31 GMT
Last-Modified
Indicates the date and time at which the sender believes the resource was last modified. Useful for clients that eliminate unnecessary transfers by using caching.
Last-Modified: Tue, 15 Nov 1994 12:45:26 GMT
Location
The Location response header field defines the exact location of the resource that was identified by the request URI. If the value is a full URL, the server returns a "redirect" to the client to retrieve the specified object directly.
Location: http://WWW.Stars.com/Tutorial/HTTP/index.html
If you want to reference another file on your own server, you should output a partial URL, such as the following:
Location: /Tutorial/HTTP/index.html
Referer
Allows the client to specify, for the server's benefit, the address (URI) of the resource from which the request URI was obtained. This allows a server to generate lists of back-links to resources for interest, logging, optimized caching, etc. It also allows obsolete or mistyped links to be traced for maintenance.
Referer: http://WWW.Stars.com/index.html
Server
The Server response header field contains information about the software used by the origin server to handle the request.
Server: CERN/3.0 libwww/2.17
User-Agent
Information about the user agent originating the request. This is for statistical purposes, the tracing of protocol violations, and automated recognition of user agents for the sake of tailoring responses to avoid particular user agent limitations - such as inability to support HTML tables.
User-Agent: CERN-LineMode/2.15 libwww/2.17b3

HTTP Request Methods
HTTP/1.0 allows an open-ended set of methods to be used to indicate the purpose of a request. The three most often used methods are GET, HEAD, and POST.
The GET Method
Information from a form using the GET method is appended onto the end of the action URI being requested. Your CGI program will receive the encoded form input in the environment variable QUERY_STRING.
The GET method is used to ask for a specific document - when you click on a hyperlink, GET is being used. GET should probably be used when a URL access will not change the state of a database (by, for example, adding or deleting information) and POST should be used when an access will cause a change. Many database searches have no visible side-effects and make ideal applications of query forms using GET. The semantics of the GET method changes to a "conditional GET" if the request message includes an If-Modified-Since header field. A conditional GET method requests that the identified resource be transferred only if it has been modified since the date given by the If-Modified-Since header.
The HEAD method
The HEAD method is used to ask only for information about a document, not for the document itself. HEAD is much faster than GET, as a much smaller amount of data is transferred. It's often used by clients who use caching, to see if the document has changed since it was last accessed. If it was not, then the local copy can be reused, otherwise the updated version must be retrieved with a GET.
The POST Method
This method transmits all form input information immediately after the requested URI. Your CGI program will receive the encoded form input on stdin.

No comments:

Post a Comment