Saturday 30 March 2013

Cookies

Cookies !

Cookies are short pieces of data sent by web servers to the client browser. The browser saves them on the client's hard disk as small text files. Cookies help web servers identify returning users, which is how a server tracks a user. Cookies play a very important role in session tracking.
Cookie Class
In JSP, cookies are objects of the class javax.servlet.http.Cookie. This class creates a cookie: a small amount of information sent by a servlet to a Web browser, saved by the browser, and later sent back to the server. A cookie's value can uniquely identify a client, so cookies are commonly used for session management. A cookie has a name, a single value, and optional attributes such as a comment, path and domain qualifiers, a maximum age, and a version number.
The getCookies() method of the request object returns an array of Cookie objects, or null if the request carries no cookies. A cookie can be constructed using the following constructor:
Cookie(java.lang.String name, java.lang.String value)

Methods of Cookie objects
getComment()
Returns the comment describing the purpose of this cookie, or null if no such comment has been defined.
getMaxAge()
Returns the maximum specified age of the cookie.
getName()
Returns the name of the cookie.
getPath()
Returns the prefix of all URLs for which this cookie is targeted.
getValue()
Returns the value of the cookie.
setComment(String)
If a web browser presents this cookie to a user, the cookie's purpose will be described using this comment.
setMaxAge(int)
Sets the maximum age of the cookie. The cookie will expire after that many seconds have passed. Negative values indicate the default behavior: the cookie is not stored persistently, and will be deleted when the user's web browser exits. A zero value causes the cookie to be deleted.
setPath(String)
This cookie should be presented only with requests beginning with this URL.
setValue(String)
Sets the value of the cookie. Values with various special characters (white space, brackets and parentheses, the equals sign, comma, double quote, slashes, question marks, the "at" sign, colon, and semicolon) should be avoided. Empty values may not behave the same way on all browsers.
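To make the maximum-age rules above concrete, here is a small plain-Java sketch that builds the kind of Set-Cookie header line a server sends for a cookie. It is only an illustration under simplifying assumptions: the class SetCookieSketch and its format method are hypothetical, not part of the servlet API, and a real container may emit further attributes.

```java
// Hypothetical helper: a simplified model of how a cookie's attributes
// end up in a Set-Cookie response header. Not part of the servlet API.
class SetCookieSketch {

    // maxAge < 0 : no Max-Age attribute, so the browser drops the cookie on exit.
    // maxAge == 0: Max-Age=0 tells the browser to delete the cookie.
    // maxAge > 0 : the cookie persists for that many seconds.
    static String format(String name, String value, int maxAge, String path) {
        StringBuilder header = new StringBuilder(name).append('=').append(value);
        if (maxAge >= 0) {
            header.append("; Max-Age=").append(maxAge);
        }
        if (path != null) {
            header.append("; Path=").append(path);
        }
        return header.toString();
    }

    public static void main(String[] args) {
        // One-day persistent cookie, session-only cookie, and a deletion.
        System.out.println(format("message", "Hello!", 24 * 60 * 60, "/"));
        System.out.println(format("message", "Hello!", -1, null));
        System.out.println(format("message", "", 0, "/"));
    }
}
```

The three calls in main correspond to the three cases of setMaxAge(int): a positive value, a negative value, and zero.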


Creating & Reading Cookies !

Create a Cookie:
<HTML>
<HEAD>
<TITLE>Reading a Cookie</TITLE>
</HEAD> 

<BODY>
<H1>Reading a Cookie</H1> 

<%
Cookie cookie1 = new Cookie("message", "Hello!");
cookie1.setMaxAge(24 * 60 * 60);
response.addCookie(cookie1);
%>
<P>refresh to see the Cookie</p>
<%
// getCookies() returns null when the request carries no cookies.
Cookie[] cookies = request.getCookies();

if (cookies != null) {
    for (int i = 0; i < cookies.length; i++) {
        if (cookies[i].getName().equals("message")) {
            out.println("The cookie says " + cookies[i].getValue());
        }
    }
}
%>
</BODY>
</HTML>

Read a Cookie:
<HTML>
<HEAD>
<TITLE>Setting and Reading Cookies</TITLE>
</HEAD>

<BODY
<%
Cookie c = new Cookie("message", "Hello!");
c.setMaxAge(24 * 60 * 60);
response.addCookie(c);
%>

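<%
// This scriptlet runs inside the opening BODY tag, so anything it prints
// becomes an attribute of that tag (here: bgcolor).
Cookie[] cookies = request.getCookies();
boolean foundCookie = false;

// getCookies() returns null when the request carries no cookies.
if (cookies != null) {
    for (int i = 0; i < cookies.length; i++) {
        Cookie cookie1 = cookies[i];
        if (cookie1.getName().equals("color")) {
            out.println("bgcolor = " + cookie1.getValue());
            foundCookie = true;
        }
    }
}

if (!foundCookie) {
    // First visit: store the color so that the next request can use it.
    Cookie cookie1 = new Cookie("color", "cyan");
    cookie1.setMaxAge(24 * 60 * 60);
    response.addCookie(cookie1);
}
%>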
>
<H1>Setting and Reading Cookies</H1>
This page will set its background color using a cookie after refreshing.
</BODY>
</HTML>

 
 

Retrieving the Contents of an HTML form

Forms are, of course, the most important way of getting information from the customer of a web site. In this section, we'll just create a simple color survey and print the results back to the user.
First, create the entry form. Our HTML form will send its answers to form.jsp for processing.
For this example, the name="name" and name="color" are very important. You will use these keys to extract the user's responses.
form.html
<form action="form.jsp" method="get">
<table>
<tr><td><b>Name</b>
<td><input type="text" name="name">
<tr><td><b>Favorite color</b>
<td><input type="text" name="color">
</table>
<input type="submit" value="Send">
</form>
JSP keeps the browser request information in the request object. The request object contains the environment variables you may be familiar with from CGI programming. For example, it has the browser type, any HTTP headers, the server name and the browser IP address.
You can get form values using the request.getParameter() method.
The following JSP script will extract the form values and print them right back to the user.
form.jsp
Name: <%= request.getParameter("name") %> <br>
Color: <%= request.getParameter("color") %>

Retrieving a Query String !

An include action executes the included JSP page and appends the generated output onto its own output stream. Request parameters parsed from the URL's query string are available not only to the main JSP page but to all included JSP pages as well. It is possible to temporarily override a request parameter or to temporarily introduce a new request parameter when calling a JSP page. This is done by using the jsp:param action.
In this example, param1 is specified in the query string and is automatically made available to the callee JSP page. param2 is also specified in the query string but is overridden by the caller. Notice that param2 reverts to its original value after the call. param3 is a new request parameter created by the caller. Notice that param3 is only available to the callee and when the callee returns, param3 no longer exists. Here is the caller JSP page:
If the example is called with the URL:
http://hostname.com?param1=a&param2=b
the output would be:


JSP JAVA Server Pages

Java Server Pages !

An extensible Web technology that uses template data, custom elements, scripting languages, and server-side Java objects to return dynamic content to a client. Typically the template data is HTML or XML elements, and the client is often a Web browser.
Java Servlet
A Java program that extends the functionality of a Web server, generating dynamic content and interacting with Web clients using a request-response paradigm.
Static contents
  • Typically static HTML page
  • Same display for everyone
Dynamic contents
  • Content is dynamically generated based on conditions
  • Conditions could be User identity, Time of the day, User entered values through forms and selections
JSP Page
A text-based document capable of returning both static and dynamic content to a client browser. Static content and dynamic content can be intermixed. Static content is HTML, XML or plain text. Dynamic content comes from Java code, from displaying properties of JavaBeans, and from invoking business logic defined in custom tags.
Directives
There are five types of JSP directives and scripting elements. With JSP 1.0, most of your JSP is enclosed within a single tag that begins with <% and ends with %>. With the newer JSP 1.1 specification, there are also XML-compliant versions.
JSP directives are for the JSP engine. They do not directly produce any visible output but instead tell the engine what to do with the rest of the JSP page. They are always enclosed within the <%@ … %> tag. The two primary directives are page and include. The taglib directive will not be discussed but is available for creating custom tags with JSP 1.1.
The page directive is the one you'll find at the top of almost all your JSP pages. Although not required, it lets you specify things like where to find supporting Java classes:
<%@ page import="java.util.Date" %>
where to send the surfer in the event of a runtime Java problem:
<%@ page errorPage="errorPage.jsp" %>
and whether you need to manage information at the session level for the user, possibly across multiple Web pages (more later on sessions with JavaBeans):
<%@ page session="true" %>
The include directive lets you separate your content into more manageable elements, such as those for including a common page header or footer. The page included could be a fixed HTML page or more JSP content:
<%@ include file="filename.jsp" %>
Declarations
JSP declarations let you define page-level variables to save information or define supporting methods that the rest of a JSP page may need. If you find yourself including too much code, it is usually better off in a separate Java class. Declarations are found within the <%! … %> tag. Always end variable declarations with a semicolon, as any content must be valid Java statements: <%! int i=0; %>.
Expressions
With expressions in JSP, the results of evaluating the expression are converted to a string and directly included within the output page. JSP expressions belong within <%= … %> tags and do not include semicolons, unless part of a quoted string:
<%= i %>
<%= "Hello" %>
Code Fragments/Scriptlets
JSP code fragments or scriptlets are embedded within <% … %> tags. This Java code is then run when the request is serviced by the Web server. Around the scriptlets would be raw HTML or XML, where the code fragments let you create conditionally executing code, or just something that uses another piece of code. For example, the following displays the string "Hello" within H1, H2, H3, and H4 tags, combining the use of expressions and scriptlets. Scriptlets are not limited to one line of source code:
<% for (int i=1; i<=4; i++){ %>
<H<%=i%>>Hello</H<%=i%>>
<% } %>
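As a rough sketch of what happens behind the H1 to H4 loop: the JSP engine turns template text into write calls and scriptlets into ordinary Java statements inside one method. The class ScriptletSketch and its render method are hypothetical, a StringWriter stands in for the out object, and the servlet code a real container generates looks different.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical stand-in for the servlet code a JSP engine might generate
// from the H1..H4 loop. Real generated code differs between containers.
class ScriptletSketch {

    static String render() {
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer);
        for (int i = 1; i <= 4; i++) {   // from the scriptlet that opens the loop
            out.print("<H");             // template text
            out.print(i);                // from the expression <%=i%>
            out.print(">Hello</H");      // template text
            out.print(i);                // from the expression <%=i%>
            out.print(">\n");            // template text
        }                                // from the scriptlet that closes the loop
        out.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.print(render());
    }
}
```

Reading the loop this way also explains why a scriptlet may open a block that a later scriptlet closes: both end up in the same Java method.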
Comments
The last of the key JSP elements is for embedding comments. Although you can always include HTML comments in your files, users can view these if they view the page's source. If you don't want users to be able to see your comments, you would embed them within the <%-- … --%> tag:
<%-- comment for server side only --%>

Scripts in JSP

A JSP scriptlet is used to contain any code fragment that is valid for the scripting language used in a page. The syntax for a scriptlet is as follows:
<%
scripting-language-statements
%>
When the scripting language is set to java, a scriptlet is transformed into a Java programming language statement fragment and is inserted into the service method of the JSP page’s servlet. A programming language variable created within a scriptlet is accessible from anywhere within the JSP page.
In the web service version of the hello1 application, greeting.jsp contains a scriptlet to retrieve the request parameter named username and test whether it is empty. If the if statement evaluates to true, the response page is included. Because the if statement opens a block, the HTML markup would be followed by a scriptlet that closes the block.
<%
String username = request.getParameter("username");
if ( username != null && username.length() > 0 ) {
%>
<%@include file="response.jsp" %>
<%
}
%>

JSP Objects and Components !

JSP expressions
If a programmer wants to insert data into an HTML page, then this is achieved by making use of the JSP expression.
The general syntax of JSP expression is as follows:
<%= expression %>
The expression is enclosed between the tags <%= %>
For example, if the programmer wishes to add 10 and 20 and display the result, then the JSP expression written would be as follows:
<%= 10+20 %>

Implicit Objects
Implicit objects in JSP are Java objects that the JSP container makes available automatically on every page, so a developer can use them without declaring or instantiating them. They are called implicit objects because they are automatically instantiated.
There are many implicit objects available. Some of them are:
request
The request object is of type javax.servlet.http.HttpServletRequest. It denotes the data included with the HTTP request. The client first makes a request, which is then passed to the server. The request object is used to read the values that the client's web browser sent to the server, such as headers, cookies and arguments.
response
This denotes the HTTP response data: the result of a request is sent back through this object, in contrast to the request object. The response object is of type javax.servlet.http.HttpServletResponse. It is generally used to send cookies and to set HTTP headers.
session
This denotes the data associated with a specific session of a user. The session object is of type javax.servlet.http.HttpSession. The previous two objects, request and response, are used to pass information from web browser to server and from server to web browser respectively. The session object provides the association between the client and the server. Its main use is to maintain state across multiple page requests. This will be explained in further detail in following sections.
out
This denotes the output stream in the context of the page. The out object is of type javax.servlet.jsp.JspWriter.
pageContext
This is used to access page attributes and also to access all the namespaces associated with a JSP page. The pageContext object is of type javax.servlet.jsp.PageContext.
application
This is used to share data with all pages of the application. The application object is of type javax.servlet.ServletContext.
config
This is used to get information regarding the servlet configuration. The config object is of type javax.servlet.ServletConfig.
page
The page object denotes the JSP page itself, that is, the instance of the page's servlet. That servlet implements javax.servlet.jsp.HttpJspPage, and the page object is declared as java.lang.Object.
The most commonly used implicit objects are request, response and session objects.

JSP Session Object
The session object denotes the data associated with a specific session of a user. The session object is of type:
javax.servlet.http.HttpSession
The previous two objects, request and response, are used to pass information from web browser to server and from server to web browser respectively. The session object provides the association between the client and the server. Its main use is to maintain state across multiple page requests.
The main feature of the session object is that variables stored in it remain available while the user navigates between multiple pages of an application. The session object does not lose these variables; their values remain for the whole user session. Sessions can be maintained by cookies or by URL rewriting. Session handling will be discussed in detail in coming sections.
Methods of session Object
There are numerous methods available for session Object. Some are:
  • getAttribute(String name)
  • getAttributeNames()
  • isNew()
  • getCreationTime()
  • getId()
  • invalidate()
  • getLastAccessedTime()
  • getMaxInactiveInterval()
  • removeAttribute(String name)
  • setAttribute(String, Object)
getAttribute(String name)
The getAttribute method of session object is used to return the object with the specified name given in parameter. If there is no object then a null value is returned.
General syntax of getAttribute of session object is as follows:
session.getAttribute(String name)
The value returned is an object of the corresponding name given as string in parameter. The returned value from the getAttribute() method is an object written: java.lang.Object.
For example
String exforsys = (String) session.getAttribute("name");
In the above statement, getAttribute() returns the object stored under the name given as parameter. The returned value is of type java.lang.Object, so it is typecast to String before being assigned to the string exforsys.
getAttributeNames
The getAttributeNames method of session object is used to retrieve all attribute names associated with the current session. The name of each object of the current session is returned. The value returned by this method is an enumeration of objects that contains all the unique names stored in the session object.
General Syntax
session.getAttributeNames()
The value returned by getAttributeNames() is an Enumeration of the attribute names.
For example
Enumeration exforsys = session.getAttributeNames();
The above statement stores in the enumeration object exforsys all the unique names under which objects are stored in the current session.
isNew()
The isNew() method of session object returns true if the session is new and false otherwise. A session is marked as new when the server has created it but the client has not yet acknowledged it. If the client never joins the session, for example because the user has switched off cookies, the session stays new, and isNew() keeps returning true until the client joins the session. Thus, the isNew() method returns a Boolean value of true or false.
General syntax of isNew() of session object is as follows:
session.isNew()
The value returned by isNew() is a Boolean.


 

JSP Request Objects !

The request object in JSP is used to get the values that the client passes to the web server during an HTTP request, such as headers, cookies or arguments.
The request object is of type javax.servlet.http.HttpServletRequest.
Methods of request Object
There are numerous methods available for request object. Some of them are:
  • getCookies()
  • getHeader(String name)
  • getHeaderNames()
  • getAttribute(String name)
  • getAttributeNames()
  • getMethod()
  • getParameter(String name)
  • getParameterNames()
  • getParameterValues(String name)
  • getQueryString()
  • getRequestURI()
  • getServletPath()
  • setAttribute(String,Object)
  • removeAttribute(String)
getCookies()
The getCookies() method of request object returns all cookies that the client sent with the request. The cookies are returned as an array of Cookie objects, or null if the request carries no cookies. JSP cookies are covered in detail in their own section.
General syntax of getCookies() of request object is as follows:
request.getCookies()
The value returned by getCookies() is an array of Cookie objects.
For example:
Cookie[] onlinemca = request.getCookies();
The above would retrieve all cookies sent with the current request into the array onlinemca.
getHeader(String name)
The method getHeader(String name) of request object is used to return the value of the requested header. The returned value of header is a string.
General syntax of getHeader() of request object is as follows:
request.getHeader(String name)
In the above the returned value is a String.
For example:
String online = request.getHeader("onlinemca");
The above would retrieve the value of the HTTP header whose name is onlinemca in JSP.
getHeaderNames()
The method getHeaderNames() of request object returns all the header names in the request. This method is used to find available headers. The value returned is an enumeration of all header names.
General syntax of getHeaderNames() of request object is as follows:
request.getHeaderNames();
In the above the returned value is an enumeration.
For example:
Enumeration onlinemca = request.getHeaderNames();
The above returns all header names in the enumeration onlinemca.
getAttribute(String name)
The method getAttribute() of request object returns the object stored in the request under the given attribute name. When the attribute is not present, a null value is returned.
General syntax of getAttribute() of request object is as follows:
request.getAttribute(String name)
In the above the returned value is an object.
For example:
Object onlinemca = request.getAttribute("test");
The above retrieves the object stored in the request under the name test and assigns it to onlinemca.
getAttributeNames()
The method getAttribute() of request object returns the object associated with one given attribute. If the user wants the names of all the attributes associated with the current request, then the request object method getAttributeNames() can be used. The returned value is an enumeration of all attribute names.
General syntax of getAttributeNames() of request object is as follows:
request.getAttributeNames()
For example:
Enumeration onlinemca = request.getAttributeNames();
The above returns all attribute names of the current request in the enumeration onlinemca.
getMethod()
The getMethod() method of request object returns the name of the HTTP method with which the request was made, for example GET, POST, or PUT.
General syntax of getMethod() of request object is as follows:
request.getMethod()
For example:
if (request.getMethod().equals("POST"))
{
.........
.........
}
In the above example, the value returned by request.getMethod() is compared with POST, and if they are equal the statements in the if block execute.
getParameter(String name)
getParameter() method of request object is used to return the value of a requested parameter. The returned value of a parameter is a string. If the requested parameter does not exist, then a null value is returned. If the requested parameter exists, then the value of the requested parameter is returned as a string.
General syntax of getParameter() of request object is as follows:
request.getParameter(String name)
The returned value by the above statement is a string.
For example:
String onlinemca = request.getParameter("test");
The above example returns the value of the parameter test passed to the getParameter() method of the request object in the string onlinemca. If the given parameter test does not exist then a null value is assigned to the string onlinemca.
getParameterNames()
The getParameterNames() method of request object is used to return the names of the parameters given in the current request. The names of parameters returned are enumeration of string objects.
General syntax of getParameterNames() of request object is as follows:
request.getParameterNames()
Value returned from the above statement getParameterNames() method is enumeration of string objects.
For example:
Enumeration exforsys = request.getParameterNames();
The above statement returns the names of the parameters in the current request as an enumeration of string object.
getParameterValues(String name)
The getParameter(String name) method of request object returns a single value of the given parameter as a string. If a parameter can have several values, the programmer can use the method getParameterValues(String name) instead. It returns all the values of the given parameter as an array of String objects. If the requested parameter is not found, then a null value is returned by the method.
General syntax of getParameterValues of request object is as follows:
request.getParameterValues(String name)
The value returned by getParameterValues() is an array of String objects.
For example:
String[] vegetables = request.getParameterValues("vegetable");
The above example passes the parameter name vegetable to getParameterValues() and stores all of that parameter's values in the string array vegetables.
getQueryString()
The getQueryString() method of request object is used to return the query string from the request. From this method, the returned value is a string.
General syntax of getQueryString() of request object is as follows:
request.getQueryString()
Value returned from the above method is a string.
For example:
String exforsys = request.getQueryString();
out.println("Result is " + exforsys);
The above example stores the query string returned by getQueryString() in the string exforsys, and the second statement prints it using out.println. If the URL has no query string, getQueryString() returns null.
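The relationship between the raw query string and the parameter methods described earlier can be sketched in plain Java. This is a simplified model under stated assumptions, not the container's parser: the class QueryStringSketch and its methods parse and first are hypothetical names, and only query-string parameters are handled.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the servlet API: a simplified model of how
// a raw query string maps onto getParameter() and getParameterValues().
class QueryStringSketch {

    // Splits "a=1&b=2&b=3" into names and their (possibly repeated) values.
    static Map<String, List<String>> parse(String query) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        if (query == null || query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue; // tolerate stray separators such as "a=1&&b=2"
            }
            int eq = pair.indexOf('=');
            String name = decode(eq < 0 ? pair : pair.substring(0, eq));
            String value = eq < 0 ? "" : decode(pair.substring(eq + 1));
            params.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }
        return params;
    }

    // Mirrors getParameter(): the first value, or null when the name is absent.
    static String first(Map<String, List<String>> params, String name) {
        List<String> values = params.get(name);
        return values == null ? null : values.get(0);
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> params =
                parse("test=1&vegetable=carrot&vegetable=green+peas");
        System.out.println(first(params, "test"));      // single value
        System.out.println(params.get("vegetable"));    // all values of one name
        System.out.println(first(params, "missing"));   // absent parameter
    }
}
```

In the sketch, first plays the role of getParameter() and a lookup in the map plays the role of getParameterValues(); an absent name yields null in both cases, as described for the real methods.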
getRequestURI()
The getRequestURI() method of request object returns the path part of the URL of the current JSP page: everything after the host name up to, but not including, the query string.
General syntax of getRequestURI() of request object is as follows:
request.getRequestURI()
The above method returns this path as a string.
For example:
out.println("URI Requested is " + request.getRequestURI());
Output of the above statement would be:
URI Requested is /Jsp/test.jsp
getServletPath()
The getServletPath() method of request object is used to return the part of request URL that calls the servlet.
General syntax of getServletPath() of request object is as follows:
request.getServletPath()
The above method returns, as a string, the part of the URL that calls the servlet.
For example:
out.println("Path of Servlet is " + request.getServletPath());
The output of the above statement would be:
Path of Servlet is /test.jsp
setAttribute(String,Object)
The setAttribute method of request object is used to set object to the named attribute. If the attribute does not exist, then it is created and assigned to the object.
General syntax of setAttribute of request object is as follows:
request.setAttribute(String, Object)
In the above statement the object is stored under the name given as the string parameter.
For example:
request.setAttribute("username", "onlinemca");
The above example assigns the value onlinemca to username.
removeAttribute(String)
The removeAttribute method of request object is used to remove the object bound with the specified name from the request. If there is no object bound with the specified name, then the method simply returns and does nothing.
General syntax of removeAttribute of request object is as follows:
request.removeAttribute(String);

JSP Response Objects !

The response object denotes the HTTP response data. The result of a request is sent back through this object; it handles the output to the client. This contrasts with the request object.
-The response object is of type javax.servlet.http.HttpServletResponse.
-The response object is generally used to send cookies.
-The response object is also used with HTTP Headers.
Methods of response Object
There are numerous methods available for response object. Some of them are:
  • setContentType()
  • addCookie(Cookie cookie)
  • addHeader(String name, String value)
  • containsHeader(String name)
  • setHeader(String name, String value)
  • sendRedirect(String)
  • sendError(int status_code)
setContentType()
setContentType() method of response object is used to set the MIME type and character encoding for the page.
General syntax of setContentType() of response object is as follows:
response.setContentType(String type)
For example:
response.setContentType("text/html");
The above statement is used to set the content type as text/html dynamically.
addCookie(Cookie cookie)
addCookie() method of response object is used to add the specified cookie to the response, that is, to write a cookie to the response. To add more than one cookie, call this method once for each cookie.
General syntax of addCookie() of response object is as follows:
response.addCookie(Cookie cookie)
For example:
response.addCookie(exforsys);
The above statement adds the cookie exforsys, a previously created Cookie object, to the response.
addHeader(String name, String value)
addHeader() method of response object is used to write the header as a pair of name and value to the response. If the header is already present, then value is added to the existing header values.
General syntax of addHeader() of response object is as follows:
response.addHeader(String name, String value)
Here the value of string is given as second parameter and this gets assigned to the header given in first parameter as string name.
For example:
response.addHeader("Author", "onlinemca");
The output of above statement is as below:
Author: onlinemca
containsHeader(String name)
containsHeader() method of response object is used to check whether the response already includes the header given as parameter. If the named response header is set then it returns a true value. If the named response header is not set, the value is returned as false. Thus, the containsHeader method is used to test the presence of a header before setting its value. The return value from this method is a Boolean value of true or false.
General syntax of containsHeader() of response object is as follows:
response.containsHeader(String name)
Return value of the above containsHeader() method is a Boolean value true or false.
setHeader(String name, String value)
setHeader method of response object is used to create an HTTP Header with the name and value given as string. If the header is already present, then the original value is replaced by the current value given as parameter in this method.
General syntax of setHeader of response object is as follows:
response.setHeader(String name, String value)
For example:
response.setHeader("Content-Type","text/html");
The above statement would send the header
Content-Type: text/html
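The difference between addHeader(), setHeader() and containsHeader() can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the servlet API: the class HeaderSketch and its values method are hypothetical, and header names are treated as case-sensitive here for simplicity.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model, not the servlet API: it only mimics the documented
// difference between setHeader (replace) and addHeader (append).
class HeaderSketch {
    private final Map<String, List<String>> headers = new LinkedHashMap<>();

    // Replaces any values already stored under this name.
    void setHeader(String name, String value) {
        List<String> values = new ArrayList<>();
        values.add(value);
        headers.put(name, values);
    }

    // Keeps existing values and appends the new one.
    void addHeader(String name, String value) {
        headers.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    boolean containsHeader(String name) {
        return headers.containsKey(name);
    }

    // Helper for inspection only; returns an empty list for unknown names.
    List<String> values(String name) {
        return headers.getOrDefault(name, new ArrayList<>());
    }

    public static void main(String[] args) {
        HeaderSketch response = new HeaderSketch();
        response.addHeader("Author", "onlinemca");
        response.addHeader("Author", "exforsys");
        System.out.println(response.values("Author"));   // two values kept
        response.setHeader("Author", "onlinemca");
        System.out.println(response.values("Author"));   // replaced by one value
        System.out.println(response.containsHeader("Content-Type"));
    }
}
```

Checking containsHeader() before calling setHeader() is the pattern the text describes for avoiding an accidental overwrite.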
sendRedirect(String)
sendRedirect method of response object is used to send a temporary redirect response to the client, using the redirect location URL given as parameter. The browser then issues a new request to that URL, so sendRedirect sends the client to a new target. Note that if the executing JSP has already sent page content to the client, that is, the response is already committed, the sendRedirect() method fails with an IllegalStateException.
General syntax of sendRedirect of response object is as follows:
response.sendRedirect(String)
In the above the URL is given as string.
For example:
response.sendRedirect("http://xxx.test.com/error.html");
The above statement would redirect the client to the error.html URL given as the string parameter of sendRedirect().
sendError(int status_code)
sendError method of response object is used to send an error response to the client containing the specified status code given in parameter.
General syntax of sendError of response object is as follows:
response.sendError(int status_code)




Friday 29 March 2013

XML

XML stands for eXtensible Markup Language. It is a markup language much like HTML. It was designed to carry data, not to display data. Its tags are not predefined. You must define your own tags. XML is designed to be self-descriptive.

Why do we need XML?
Data-exchange
1).XML is used to aid the exchange of data. It makes it possible to define data in a clear way.
2).Both the sending and the receiving party will use XML to understand the kind of data that's been sent. By using XML everybody knows that the same interpretation of the data is used.
Replacement for EDI
1).EDI (Electronic Data Interchange) has for several years been the way to exchange data between businesses.
2).EDI is expensive because it uses a dedicated communication infrastructure, and the definitions it uses are far from flexible.
3).XML is a good replacement for EDI. It uses the Internet for the data exchange, and it is very flexible.
More possibilities
1).XML makes communication easy. It's a great tool for transactions between businesses.
2).But it has many more possibilities. You can define other languages with XML. A good example is WML (Wireless Markup Language), the language used in WAP communications. WML is just an XML dialect.




What it can do

With XML you can :
  • Define data structures
  • Make these structures platform independent
  • Process XML defined data automatically
  • Define your own tags
With XML you cannot
  • Define how your data is shown. To show data, you need other techniques.

Define your own tags
In XML, you define your own tags.
If you need a tag <TUTORIAL> or <STOCKRATE>, that's no problem.

DTD or Schema
If you want to use a tag, you'll have to define its meaning. This definition is stored in a DTD (Document Type Definition). You can define your own DTD or use an existing one. Defining a DTD actually means defining an XML language. An alternative to a DTD is an XML Schema.

Showing the results
Often it's not necessary to display the data in an XML document. It's possible, for instance, to store the data in a database right away. XML itself cannot display data, but XML documents can be made visible with the aid of a language that defines the presentation. XSL (eXtensible Stylesheet Language) was created for this purpose, and the presentation can also be defined with CSS (Cascading Style Sheets).

Tags
XML tags are created like HTML tags. There's a start tag and a closing tag.
<TAG>content</TAG>
The closing tag uses a slash after the opening bracket, just like in HTML.
The text between the brackets is the tag name. A start tag, its content and the matching closing tag together are called an element.

Syntax
The following rules are used for using XML tags:
1).Tags are case sensitive. The tag <TRAVEL> differs from the tags <Travel> and <travel>.
2).Starting tags always need a closing tag.
3).All tags must be nested properly.
4).Comments can be used like in HTML: <!-- This is a comment -->
5).Between the starting tag and the end tag XML expects the content. <amount>135</amount> is a valid element named amount that has the content 135.

Empty tags
Besides a starting tag and a closing tag, you can use an empty tag. An empty tag does not have a closing tag. The syntax differs from HTML:
Empty Tag : <TAG/>

Elements and children
With XML tags you define the type of data. But often data is more complex and consists of several parts. You could describe a car with a single element such as <car>mercedes</car>, but to record several properties of the car the model might look like this:
<car>
<brand>Ferrari</brand>
<type>v40</type>
<color>RED</color>
</car>
Besides the element car three other elements are used: brand, type and color. Brand, type and color are sub-elements of the element car. In the XML-code the tags of the sub-elements are enclosed within the tags of the element car. Sub-elements are also called children.
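To see the parent and child relationship from a program's point of view, here is a minimal sketch that parses the car document with the DOM parser shipped in the JDK and lists the sub-elements. The class name CarChildren and the method describeChildren are made up for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
class CarChildren {
    // Returns "name=text" for every child element of the root element.
    static List<String> describeChildren(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        Element root = doc.getDocumentElement();
        List<String> result = new ArrayList<>();
        NodeList nodes = root.getChildNodes();
        for (int i = 0; i < nodes.getLength(); i++) {
            Node node = nodes.item(i);
            // Skip the whitespace text nodes between the tags; keep only elements.
            if (node.getNodeType() == Node.ELEMENT_NODE) {
                result.add(node.getNodeName() + "=" + node.getTextContent().trim());
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<car>\n<brand>Ferrari</brand>\n<type>v40</type>\n<color>RED</color>\n</car>";
        // Prints the three children of car: brand, type and color.
        System.out.println(describeChildren(xml));
    }
}
```

The line breaks between the tags show up in the DOM as text nodes, which is why the loop filters on ELEMENT_NODE.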

Relationship between HTML, SGML, and XML !

First you should know that SGML (Standard Generalized Markup Language) is the basis for both HTML and XML. SGML is an international standard (ISO 8879) that was published in 1986.
Second, you need to know that XHTML is XML. "XHTML 1.0 is a reformulation of HTML 4.01 in XML, and combines the strength of HTML 4 with the power of XML."
Thirdly, XML is NOT a language; it is a set of rules for creating XML-based languages. Thus, XHTML 1.0 uses the tags of HTML 4.01 but follows the rules of XML.

The Document
A typical document is made up of three layers:
  • Structure
  • Content
  • Style
Structure
Structure would be the document's title, author, paragraphs, topics, chapters, head, body etc.
Content
Content is the actual information that composes a title, author, paragraphs etc.
Style
Style is how the content within the structural elements is displayed, such as font color, type and size, text alignment etc.

Markup
HTML, SGML, and XML all mark up content using tags. The difference is that SGML and XML mainly deal with the relationship between content and structure: the structural tags that mark up the content are not predefined (you can make up your own language), and style is kept TOTALLY separate. HTML, on the other hand, is a mix of content marked up with both structural and stylistic tags. HTML tags are predefined by the HTML language.
By mixing structure, content and style you limit yourself to one form of presentation and in HTML's case that would be in a limited group of browsers for the World Wide Web.
By separating structure and content from style, you can take one file and present it in multiple forms. XML can be transformed to HTML/XHTML and displayed on the Web, or the information can be transformed and published to paper, and the data can be read by any XML aware browser or application.

SGML (Standard Generalized Markup Language)
Historically, electronic publishing applications such as Microsoft Word, Adobe PageMaker or QuarkXpress "marked up" documents in a proprietary format that was only recognized by that particular application. The document markup for both structure and style was mixed in with the content and was published to only one medium, the printed page.
These programs and their proprietary markup had no capability to define the appearance of the information for any other media besides paper, and really did not describe very well the actual content of the document beyond paragraphs, headings and titles. The file format could not be read or exchanged with other programs, it was useful only within the application that created it.
Because SGML is a nonproprietary international standard it allows you to create documents that are independent of any specific hardware or software. The document structure (what elements are used and their relationship to each other) is described in a file called the DTD (Document Type Definition). The DTD defines the relationships between a document's elements creating a consistent, logical structure for each document.
SGML is good for handling large-scale, long-term information management needs and has been around for more than a decade as the language of defense contractors and the electronic publishing industry. Because SGML is very large, powerful, and complex it is hard to learn and understand and is not well suited for the Web environment.

XML (Extensible Markup Language)
XML is a "restricted form of SGML" which removes some of the complexity of SGML. XML, like SGML, retains the flexibility of describing customized markup languages with a user-defined document structure (DTD) in a non-proprietary file format for both storage and exchange of text and data both on and off the Web.
As mentioned before, XML separates structure and content from style, and the structural markup tags can actually describe the content because they can be customized for each XML-based markup language. A good example of this is the Mathematical Markup Language (MathML), which is an XML application for describing mathematical notation and capturing both its structure and content.
Until MathML, the ability to communicate mathematical expressions on the Web was limited to mainly displaying images (JPG or GIF) of the scientific notation or posting the document as a PDF file. MathML allows the information to be displayed on the Web, and makes it available for searching, indexing, or reuse in other applications.

HTML (Hypertext Markup Language)
HTML is a single, predefined markup language that forces Web designers to use its limiting and lax syntax and structure. The HTML standard was not designed with other platforms in mind, such as Web TVs, mobile phones or PDAs. The structural markup does little to describe the content beyond paragraph, list, title and heading.
XML breaks the restricting chains of HTML by allowing people to create their own markup languages for exchanging information. The tags can be descriptive of the content and authors decide how the document will be displayed using style sheets (CSS and XSL). Because of XML's consistent syntax and structure, documents can be transformed and published to multiple forms of media and content can be exchanged between other XML applications.
HTML was useful in the part it has played in the success of the Web but has been outgrown as the Web requires more robust, flexible languages to support its expanding forms of communication and data exchange.

In Short
XML will never completely replace SGML because SGML is still considered better for long-term storage of complex documents. However, XML has already replaced HTML as the recommended markup language for the Web with the creation of XHTML 1.0.
Even though XHTML has not made the HTML that currently exists on the Web obsolete, HTML 4.01 was for many years the last W3C version of HTML (work on HTML5 later resumed the HTML line). XHTML (an XML application) was designed as the foundation for a universally accessible, device independent Web.

Ways to use XML !

To use XML you need a DTD (Document Type Definition). A DTD contains the rules for a particular type of XML documents. Actually it's the DTD that defines the language.
Elements
A DTD describes elements. It uses the following syntax:
The text <!ELEMENT, followed by the name of the element, followed by a description of the element, and closed with >.
For example:
<!ELEMENT brand (#PCDATA)>
This DTD description defines the XML tag <brand>.
Data
The description (#PCDATA) stands for parsed character data. It is the text between the tags, which is shown and is also parsed (interpreted) by the program that reads the XML document. CDATA stands for character data that is not parsed for markup; in a DTD it is used as the type of an attribute value rather than as the content description of an element.
Sub elements
An element that contains sub elements is described thus:
<!ELEMENT car (brand, type) >
<!ELEMENT brand (#PCDATA) >
<!ELEMENT type (#PCDATA) >
This means that the element car has two sub elements: brand and type. Each sub element can contain characters.
Number of sub elements
If you use <!ELEMENT car (brand, type) >, the sub elements brand and type must each occur exactly once inside the element car, in that order. To change the number of possible occurrences the following indications can be used:
  • + must occur at least one time but may occur more often
  • * may occur more often but may also be omitted
  • ? may occur once or not at all
The indications are used behind the sub element name.
For example:
<!ELEMENT animal (color+) …
Making choices
With the sign '|' you define a choice between two sub elements. You enter the sign between the names of the sub elements.
<!ELEMENT animal (wingsize|legsize) >
Empty elements
Empty elements get the description EMPTY.
For example:
<!ELEMENT separator EMPTY>
This could define a separator line to be shown if the XML document appears in a browser.
DTD: external
A DTD can be an external document that's referred to. Such a DTD file contains only the element definitions.
In the XML document you make clear that you'll use this DTD with the line:
<!DOCTYPE name of root-element SYSTEM "address">
The address is a URL that points to the DTD.
This line should be typed after the line <?xml version="1.0"?>
DTD: internal
A DTD can also be included in the XML document itself. After the line <?xml version="1.0"?> you must type <!DOCTYPE name of root-element [ followed by the element definitions. The DTD part is closed with ]>
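As a rough sketch of how a program can use such a DTD, the example below embeds an internal DTD for the car element and asks the JDK's validating parser to check two documents against it. The class name DtdCheck, the constant DTD and the method validate are assumptions made for this illustration; only the javax.xml.parsers and org.xml.sax calls come from the JDK.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical helper class, not part of any library.
class DtdCheck {
    // XML declaration followed by an internal DTD: car must contain brand, then type.
    static final String DTD =
        "<?xml version=\"1.0\"?>\n"
      + "<!DOCTYPE car [\n"
      + "<!ELEMENT car (brand, type)>\n"
      + "<!ELEMENT brand (#PCDATA)>\n"
      + "<!ELEMENT type (#PCDATA)>\n"
      + "]>\n";

    // Parses the document with DTD validation switched on and returns the validation errors.
    static List<String> validate(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // check the document against its DTD
        DocumentBuilder builder = factory.newDocumentBuilder();
        final List<String> errors = new ArrayList<>();
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { /* ignored in this sketch */ }
            public void error(SAXParseException e) { errors.add(e.getMessage()); }
            public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        // Follows the DTD: no errors are collected.
        System.out.println(validate(DTD + "<car><brand>Ferrari</brand><type>v40</type></car>"));
        // The type element is missing: the parser reports a validation error.
        System.out.println(validate(DTD + "<car><brand>Ferrari</brand></car>"));
    }
}
```

A document that is merely well-formed but breaks the DTD still parses; the problem is only reported through the error handler, which is why the sketch collects the messages instead of relying on an exception.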

 

Embedding XML into HTML document !

One serious proposal is for HTML documents to support the inclusion and processing of XML data. This would allow an author to embed within a standard HTML document some well delimited, well defined XML object. The HTML document would then be able to support some functions based on the special XML markup. This strategy of permitting "islands" of XML data inside an HTML document would serve at least two purposes:
1).To enrich the content delivered to the web and support further enhancements to the XML-based content models.
2).To enable content developers to rely on the proven and known capabilities of HTML while they experiment with XML in their environments.
The result would look like this:
<HTML>
<body>
<!-- some typical HTML document with
<h1>, <h2>, <p>, etc. -->
<xml>
<!-- The <xml> tag introduces some XML-compliant markup for some specific purpose. The markup is then explicitly terminated with the </xml> tag. The user agent would invoke an XML processor only
on the data contained in the <xml></xml> pair. Otherwise the user agent would process the containing document as an HTML document. -->
</xml>
<!-- more typical HTML document markup -->
</body>
</html>

 

Converting XML to HTML for Display !

There exist several ways to convert XML to HTML for display on the Web.
Using HTML alone
If your XML file is of a simple tabular form only two levels deep then you can display XML files using HTML alone.
Using HTML + CSS
This is a substantially more powerful way to transform XML to HTML than HTML alone, but lacks the full power and flexibility of the methods listed below.
Using HTML with JavaScript
Fully general XML files of any type and complexity can be processed and displayed using a combination of HTML and JavaScript. The advantage of this approach is that any possible transformation and display can be carried out, because JavaScript is a fully general purpose programming language. The disadvantage is that it often requires large, complex, and very detailed programs using recursive functions (functions that call themselves repeatedly), which are very difficult for most people to grasp.
Using XSL and XPath
XSL (eXtensible Stylesheet Language) is considered the best way to convert XML to HTML. The advantages are that the language is very compact, very sophisticated HTML can be displayed with relatively small programs, it is easy to re-purpose XML to serve a variety of purposes, it is non-procedural in that you generally specify only what you wish to accomplish as opposed to detailed instructions as to how to achieve it, and it greatly reduces or eliminates the need for recursive functions. The disadvantages are that it requires a very different mindset to use, and the language is still evolving so that many XSL processors in the Web servers are out of date and newer ones must sometimes be invoked through DOS.
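As a rough illustration of the XSL approach, the sketch below applies a small stylesheet to an employee list using the javax.xml.transform API that ships with the JDK. The stylesheet, the sample data and the class name XslToHtml are made up for this example; a real stylesheet would normally live in its own .xsl file.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, not part of any library.
class XslToHtml {
    // Sample data (an assumption for this sketch).
    static final String XML =
        "<employees>"
      + "<employee id=\"1\"><name><firstName>Girdhar</firstName><lastName>Gopal</lastName></name></employee>"
      + "<employee id=\"2\"><name><firstName>Gopal</firstName><lastName>Girdhar</lastName></name></employee>"
      + "</employees>";

    // One template: every employee becomes an HTML paragraph holding the full name.
    static final String XSL =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"html\" indent=\"no\"/>"
      + "<xsl:template match=\"/employees\">"
      + "<html><body>"
      + "<xsl:for-each select=\"employee\">"
      + "<p><xsl:value-of select=\"name/firstName\"/>"
      + "<xsl:text> </xsl:text>"
      + "<xsl:value-of select=\"name/lastName\"/></p>"
      + "</xsl:for-each>"
      + "</body></html>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Runs the stylesheet over the XML and returns the generated HTML as a string.
    static String transform(String xml, String xsl) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(XML, XSL));
    }
}
```

Note that the stylesheet only states what the output should look like for each employee; the looping over the document is left to the XSL processor, which is the non-procedural style described above.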

 

Displaying XML Document using CSS !

CSS stands for Cascading Style Sheets. Styles define how to display HTML elements. Styles are normally stored in Style Sheets. Styles were added to HTML 4.0 to solve a problem. External Style Sheets can save a lot of work. External Style Sheets are stored in CSS files. Multiple style definitions will cascade into one.
A Cascading Style Sheet is a file that contains instructions for formatting the elements in an XML document.
Creating and linking a CSS to your XML document is one way to tell the browser how to display each of the document's elements. An XML document with an attached CSS can be opened directly in Internet Explorer. You don't need to use an HTML page to access and display the data.
There are two basic steps for using a CSS to display an XML document:
  • Create the CSS file.
  • Link the CSS sheet to the XML document.
Creating CSS file
A CSS file is a plain text file with .css extension that contains a set of rules telling the web browser how to format and display the elements in a specific XML document. You can create a CSS file using your favorite text editor like Notepad, Wordpad or another text or HTML editor, as shown below:
general.css
employees
{
background-color: #ffffff;
width: 100%;
}
employee
{
display: block; margin-bottom: 30pt; margin-left: 0;
}
name
{
color: #FF0000;
font-size: 20pt;
}
city,state,zipcode
{
color: #0000FF;
font-size: 20pt;
}

Linking
To link to a style sheet you use an XML processing instruction to associate the style sheet with the current document. This statement should occur before the root node of the document.
<?xml-stylesheet type="text/css" href="styles/general.css"?>
The two attributes of the instruction are as follows:
href
The URL for the style sheet.
type
The MIME type of the document being linked, which in this case is text/css.
MIME stands for Multipurpose Internet Mail Extensions. It is a standard which defines how to make systems aware of the type of content being included in e-mail messages.
The CSS file is attached to the XML document as shown below:
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!--This xml file represents the details of an employee-->
<?xml-stylesheet type="text/css" href="styles/general.css"?>
<employees>
<employee id="1">
<name>
<firstName>Girdhar</firstName>
<lastName>Gopal</lastName>
</name>
<city>Nissing</city>
<state>Haryana</state>
<zipcode>132024</zipcode>
</employee>
<employee id="2">
<name>
<firstName>Gopal</firstName>
<lastName>Girdhar</lastName>
</name>
<city>Kurukshetra</city>
<state>Haryana</state>
<zipcode>136119</zipcode>
</employee>
</employees>

 

Displaying XML Document using XSL !

XSL (eXtensible Stylesheet Language) is a language for expressing stylesheets. It consists of two parts:
  • A language for transforming XML documents (XSLT)
  • An XML vocabulary for specifying formatting semantics (XSL-FO)
An XSL stylesheet specifies the presentation of a class of XML documents by describing how an instance of the class is transformed into an XML document that uses the formatting vocabulary.
Like CSS, an XSL stylesheet is linked to an XML document and tells the browser how to display each of the document's elements. An XML document with an attached XSL stylesheet can be opened directly in Internet Explorer. You don't need to use an HTML page to access and display the data.
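There are two basic steps for using XSL to display an XML document:
  • Create the XSL file.
  • Link the XSL sheet to the XML document.
Creating XSL file
An XSL file is a plain text file with .xsl extension that contains a set of template rules telling the web browser how to transform and display the elements in a specific XML document. You can create an XSL file using your favorite text editor like Notepad, Wordpad or another text or HTML editor. The file below is one possible XSLT rendering of the styling used in general.css:
general.xsl
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/employees">
<html>
<body style="background-color: #ffffff; width: 100%;">
<xsl:for-each select="employee">
<div style="display: block; margin-bottom: 30pt; margin-left: 0;">
<span style="color: #FF0000; font-size: 20pt;">
<xsl:value-of select="name/firstName"/>
<xsl:text> </xsl:text>
<xsl:value-of select="name/lastName"/>
</span>
<span style="color: #0000FF; font-size: 20pt;">
<xsl:text> </xsl:text>
<xsl:value-of select="city"/>, <xsl:value-of select="state"/>
<xsl:text> </xsl:text>
<xsl:value-of select="zipcode"/>
</span>
</div>
</xsl:for-each>
</body>
</html>
</xsl:template>
</xsl:stylesheet>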

Linking
To link to a style sheet you use an XML processing instruction to associate the style sheet with the current document. This statement should occur before the root node of the document.
<?xml-stylesheet type="text/xsl" href="styles/general.xsl"?>
The two attributes of the instruction are as follows:
href
The URL for the style sheet.
type
The MIME type of the document being linked, which in this case is text/xsl.
MIME stands for Multipurpose Internet Mail Extensions. It is a standard which defines how to make systems aware of the type of content being included in e-mail messages.
The XSL file is attached to the XML document as shown below:
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!--This xml file represents the details of an employee-->
<?xml-stylesheet type="text/xsl" href="styles/general.xsl"?>
<employees>
<employee id="1">
<name>
<firstName>Girdhar</firstName>
<lastName>Gopal</lastName>
</name>
<city>Nissing</city>
<state>Haryana</state>
<zipcode>132024</zipcode>
</employee>
<employee id="2">
<name>
<firstName>Gopal</firstName>
<lastName>Girdhar</lastName>
</name>
<city>Kurukshetra</city>
<state>Haryana</state>
<zipcode>136119</zipcode>
</employee>
</employees>

 

The Future of XML !

The future of XML is still unclear because XML users hold conflicting views. Some say that the future is bright and holds promise, while others say that it is time to take a break from the continuous increase in the volume of specifications.
In the past five years, there have been substantial accomplishments in XML. XML has made it possible to manage large quantities of information which don't fit in relational database tables, and to share labeled structured information without sharing a common Application Program Interface (API). XML has also simplified information exchange across language barriers.
But as a result of these accomplishments, XML is no longer simple. It now consists of a growing collection of complex connected and disconnected specifications. As a result, usability has suffered, because it takes longer to develop XML tools. Some users are now rooting for something simpler. They argue that even though specifications have increased, there is no clear improvement in quality. They think it might be better to let things be, or even to look for alternate approaches beyond XML. In their view this would make XML easier to use in the future, whereas a further increase in specifications would cause instability.
The other side paints a completely different picture. They are ready for further progress in XML. There have been discussions for a new version, XML 2.0. This version has been proposed to contain the following characteristics:
  • Elimination of DTDs
  • Integration of namespaces, XML Base and XML Information Set into the base standard
Research is also being carried out into the properties and use cases for binary encoding of the XML information set.
Future of XML Applications
The future of XML applications lies with the Web and Web publishing. Web applications are no longer traditional. Browsers are now integrating games, word processors and more. XML is rooted in Web publishing, so its use is expected to grow as well.

 

Proxies

Web browsers do not always interact directly with web servers; instead they may communicate via a proxy. HTTP proxies are often used to reduce network traffic, allow access through firewalls, provide content filtering, etc. Proxies have their own functionality that is defined by the HTTP standard.
A proxy server is a server that acts as an intermediary between a workstation user and the Internet so that the enterprise can ensure security, administrative control, and caching service. A proxy server is associated with or part of a gateway server that separates the enterprise network from the outside network and a firewall server that protects the enterprise network from outside intrusion.
A proxy server receives a request for an Internet service (such as a Web page request) from a user. If it passes filtering requirements, the proxy server, assuming it is also a cache server, looks in its local cache of previously downloaded Web pages. If it finds the page, it returns it to the user without needing to forward the request to the Internet. If the page is not in the cache, the proxy server, acting as a client on behalf of the user, uses one of its own IP addresses to request the page from the server out on the Internet. When the page is returned, the proxy server relates it to the original request and forwards it on to the user.
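The cache lookup described above can be sketched in a few lines. This is only a toy model under stated assumptions: the class CachingProxy is hypothetical, the "origin" function stands in for the real network request to the Internet server, and filtering, expiry and error handling are left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical toy model of a caching proxy, not a real proxy implementation.
class CachingProxy {
    private final Map<String, String> cache = new HashMap<>();
    // Stands in for the real request the proxy would send to the Internet server.
    private final Function<String, String> origin;
    // Counts how often the proxy had to go out to the origin server.
    int originRequests = 0;

    CachingProxy(Function<String, String> origin) {
        this.origin = origin;
    }

    String get(String url) {
        String page = cache.get(url);
        if (page == null) {
            // Not in the cache: fetch on behalf of the user and remember the page.
            originRequests++;
            page = origin.apply(url);
            cache.put(url, page);
        }
        return page;
    }

    public static void main(String[] args) {
        CachingProxy proxy = new CachingProxy(url -> "page for " + url);
        proxy.get("http://www.example.com/");
        proxy.get("http://www.example.com/"); // served from the cache
        System.out.println("requests sent to origin: " + proxy.originRequests);
    }
}
```

The second request for the same URL never reaches the origin function, which is the saving in network traffic and response time that a shared cache provides.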
To the user, the proxy server is invisible; all Internet requests and returned responses appear to be directly with the addressed Internet server. (The proxy is not quite invisible; its IP address has to be specified as a configuration option to the browser or other protocol program.)
An advantage of a proxy server is that its cache can serve all users. If one or more Internet sites are frequently requested, these are likely to be in the proxy's cache, which will improve user response time. In fact, there are special servers called cache servers. A proxy can also do logging.
The functions of proxy, firewall, and caching can be in separate server programs or combined in a single package. Different server programs can be in different computers. For example, a proxy server may be on the same machine as a firewall server, or it may be on a separate server and forward requests through the firewall.


Browser Requests and CGI Server Responses

A browser is an HTTP client because it sends requests to an HTTP server (Web server), which then sends responses back to the client. The standard (and default) port for HTTP servers to listen on is 80, though they can use any port.
HTTP is used to transmit resources, not just files. A resource is some chunk of information that can be identified by a URL (it's the R in URL). The most common kind of resource is a file, but a resource may also be a dynamically-generated query result, the output of a CGI script, a document that is available in several languages, or something else. All HTTP resources are currently either files or server-side script output.
Like most network protocols, HTTP uses the client-server model: An HTTP client opens a connection and sends a request message to an HTTP server; the server then returns a response message, usually containing the resource that was requested. After delivering the response, the server closes the connection (making HTTP a stateless protocol, i.e. not maintaining any connection information between transactions).
The formats of the request and response messages are similar, and English-oriented. Both kinds of messages consist of:
  • an initial line
  • zero or more header lines
  • a blank line (i.e. a CRLF by itself)
  • an optional message body (e.g. a file, or query data, or query output)
Put another way, the format of an HTTP message is:
<initial line, different for request vs. response>
Header1: value1
Header2: value2
Header3: value3

<optional message body goes here, like file contents or query data;
it can be many lines long, or even binary data $&*%@!^$@ >
Initial Request Line
The initial line is different for the request than for the response. A request line has three parts, separated by spaces: a method name, the local path of the requested resource, and the version of HTTP being used. A typical request line is:
GET /path/to/file/index.html HTTP/1.0
Important Points:
1).GET is the most common HTTP method; it says "give me this resource". Other methods include POST and HEAD-- more on those later. Method names are always uppercase.
2).The path is the part of the URL after the host name, also called the request URI (a URI is like a URL, but more general).
3).The HTTP version always takes the form "HTTP/x.x", uppercase.
Initial Response Line (Status Line)
The initial response line, called the status line, also has three parts separated by spaces: the HTTP version, a response status code that gives the result of the request, and an English reason phrase describing the status code. Typical status lines are:
HTTP/1.0 200 OK
or
HTTP/1.0 404 Not Found
Header Lines
Header lines provide information about the request or response, or about the object sent in the message body. The header lines are in the usual text header format, which is: one line per header, of the form "Header-Name: value", ending with CRLF. It's the same format used for email and news postings.
The Message Body
An HTTP message may have a body of data sent after the header lines. In a response, this is where the requested resource is returned to the client (the most common use of the message body), or perhaps explanatory text if there's an error. In a request, this is where user-entered data or uploaded files are sent to the server.
If an HTTP message includes a body, there are usually header lines in the message that describe the body. In particular:
1).The Content-Type: header gives the MIME-type of the data in the body, such as text/html or image/gif.
2).The Content-Length: header gives the number of bytes in the body.
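The message layout described above (initial line, header lines, blank line, optional body) can be taken apart with plain string handling. The sketch below is a simplified illustration, not a complete HTTP parser: the class HttpMessage is hypothetical, and it assumes the whole message is already in memory as text with CRLF line endings.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class; a simplified sketch, not a full HTTP parser.
class HttpMessage {
    final String initialLine;
    final Map<String, String> headers = new LinkedHashMap<>();
    final String body;

    HttpMessage(String raw) {
        // The first blank line (CRLF CRLF) separates the header section from the body.
        int split = raw.indexOf("\r\n\r\n");
        String head = split >= 0 ? raw.substring(0, split) : raw;
        body = split >= 0 ? raw.substring(split + 4) : "";
        String[] lines = head.split("\r\n");
        initialLine = lines[0];
        for (int i = 1; i < lines.length; i++) {
            // Each header line has the form "Header-Name: value".
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                headers.put(lines[i].substring(0, colon).trim(),
                            lines[i].substring(colon + 1).trim());
            }
        }
    }

    // Splits the initial line into its three space-separated parts.
    String[] initialParts() {
        return initialLine.split(" ", 3);
    }

    public static void main(String[] args) {
        HttpMessage response = new HttpMessage(
            "HTTP/1.0 200 OK\r\nContent-Type: text/html\r\nContent-Length: 14\r\n\r\n<h1>Hello</h1>");
        System.out.println(response.initialParts()[1]);           // the status code
        System.out.println(response.headers.get("Content-Type")); // the MIME type of the body
        System.out.println(response.body);
    }
}
```

Splitting the initial line with a limit of 3 keeps a reason phrase such as "Not Found" in one piece, and the same method also splits a request line into method, path and HTTP version.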

HTTP Request Methods
HTTP/1.0 allows an open-ended set of methods to be used to indicate the purpose of a request. The three most often used methods are GET, HEAD, and POST.
The GET Method
Information from a form using the GET method is appended onto the end of the action URI being requested. Your CGI program will receive the encoded form input in the environment variable QUERY_STRING.
The GET method is used to ask for a specific document - when you click on a hyperlink, GET is being used. GET should probably be used when a URL access will not change the state of a database (by, for example, adding or deleting information) and POST should be used when an access will cause a change. Many database searches have no visible side-effects and make ideal applications of query forms using GET. The semantics of the GET method changes to a "conditional GET" if the request message includes an If-Modified-Since header field. A conditional GET method requests that the identified resource be transferred only if it has been modified since the date given by the If-Modified-Since header.
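To make the encoding concrete, the following sketch decodes a query string of the kind a CGI program receives in QUERY_STRING. The class QueryString and the sample field names are assumptions for this illustration; URLDecoder is the standard JDK class for undoing the form encoding (+ for spaces, %XX for special characters).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, not part of any library.
class QueryString {
    // Splits "name=value&name=value" pairs and decodes each name and value.
    static Map<String, String> parse(String query) throws UnsupportedEncodingException {
        Map<String, String> fields = new LinkedHashMap<>();
        if (query == null || query.isEmpty()) {
            return fields;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq >= 0 ? pair.substring(0, eq) : pair;
            String value = eq >= 0 ? pair.substring(eq + 1) : "";
            fields.put(URLDecoder.decode(name, "UTF-8"), URLDecoder.decode(value, "UTF-8"));
        }
        return fields;
    }

    public static void main(String[] args) throws Exception {
        // A CGI program would read the environment variable; fall back to sample data here.
        String query = System.getenv("QUERY_STRING");
        if (query == null) {
            query = "name=Girdhar+Gopal&city=Nissing%2C+Haryana";
        }
        System.out.println(parse(query));
    }
}
```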
The HEAD method
The HEAD method is used to ask only for information about a document, not for the document itself. HEAD is much faster than GET, as a much smaller amount of data is transferred. It's often used by clients who use caching, to see if the document has changed since it was last accessed. If it has not, the local copy can be reused; otherwise the updated version must be retrieved with a GET.
The POST Method
This method transmits all form input information in the message body of the request, after the request line and the header lines. Your CGI program will receive the encoded form input on stdin.

CGI Server Responses !

Like client requests, server responses always contain HTTP headers and an optional body. The structure of the headers for the response is the same as for requests. The first header line has a special meaning, and is referred to as the status line. The remaining lines are name-value header field lines.
The Status Line
The first line of the header is the status line, which includes the protocol and version just as in HTTP requests, except that this information comes at the beginning instead of at the end. This string is followed by a space and the three-digit status code, as well as a text version of the status.
Status codes are grouped into five different classes according to their first digit:
1xx
These status codes were introduced for HTTP 1.1 and are used at a low level during HTTP transactions. You won't use 100-series status codes in CGI scripts.
2xx
200-series status codes indicate that all is well with the request.
3xx
300-series status codes generally indicate some form of redirection. The request was valid, but the browser should find the content of its response elsewhere.
4xx
400-series status codes indicate that there was an error and the server is blaming the browser for doing something wrong.
5xx
500-series status codes also indicate there was an error, but in this case the server is admitting that it or a CGI script running on the server is the culprit.
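Because the class of a status code is decided by its first digit alone, a program can classify any three-digit code with an integer division. A small sketch follows; the class name StatusClass and the label strings are chosen for this example only.

```java
// Hypothetical helper class, not part of any library.
class StatusClass {
    // Maps a three-digit status code to its class using the first digit.
    static String describe(int code) {
        switch (code / 100) {
            case 1: return "informational";
            case 2: return "success";
            case 3: return "redirection";
            case 4: return "client error";
            case 5: return "server error";
            default: return "unknown";
        }
    }

    public static void main(String[] args) {
        int[] codes = {200, 302, 404, 500};
        for (int code : codes) {
            System.out.println(code + " " + describe(code));
        }
    }
}
```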
Server Headers
After the status line, the server sends its HTTP headers. Some of these server headers are the same headers that browsers send with their requests.
The common server headers are:
Content-Base: Specifies the base URL for resolving all relative URLs within the document
Content-Length: Specifies the length (in bytes) of the body
Content-Type: Specifies the media type of the body
Date: Specifies the date and time when the response was sent
ETag: Specifies an entity tag for the requested resource
Last-Modified: Specifies the date and time when the requested resource was last modified
Location: Specifies the new location for the resource
Server: Specifies the name and version of the web server
Set-Cookie: Specifies a name-value pair that the browser should provide with future requests
WWW-Authenticate: Specifies the authorization scheme and realm


Uniform Resource Locator

URL stands for Uniform Resource Locator, the global address of documents and other resources on the World Wide Web. The first part of the address is called a protocol identifier and it indicates what protocol to use, and the second part is called a resource name and it specifies the IP address or the domain name where the resource is located. The protocol identifier and the resource name are separated by a colon and two forward slashes.
For example
http://www.tallysolutions.com/website/html/PartnerDetails/622894.php
The URL above specifies a Web page that should be fetched using the HTTP protocol.

Elements of a URL
Every URL is made up of some combination of the following: the scheme name (commonly called protocol), followed by a colon, then, depending on scheme, a hostname (alternatively, IP address), a port number, the pathname of the file to be fetched or the program to be run, then (for programs such as CGI scripts) a query string, and with HTML files, an anchor (optional) for where the page should start to be displayed.
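These parts can be pulled out of a URL with the java.net.URI class from the JDK. The sketch below is only an illustration: the class UrlParts is hypothetical, and the second URL (www.example.com with port, query string and anchor) is invented to show the optional parts.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper class, not part of any library.
class UrlParts {
    // Lists the components of a URL; getPort() is -1 and the strings are null when a part is absent.
    static String describe(String url) throws URISyntaxException {
        URI uri = new URI(url);
        return "scheme=" + uri.getScheme()
             + " host=" + uri.getHost()
             + " port=" + uri.getPort()
             + " path=" + uri.getPath()
             + " query=" + uri.getQuery()
             + " fragment=" + uri.getFragment();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(describe("http://www.tallysolutions.com/website/html/PartnerDetails/622894.php"));
        System.out.println(describe("http://www.example.com:8080/cgi-bin/search?q=xml#results"));
    }
}
```

In java.net.URI the anchor is called the fragment, and a port of -1 means that no port was given in the URL, so the scheme's default port applies.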
Scheme
The scheme represents the protocol, and for our purposes will either be http or https. https represents a connection to a secure web server.
<scheme>:<scheme-specific-part>
A URL contains the name of the scheme being used (<scheme>) followed by a colon and then a string (the <scheme-specific-part>) whose interpretation depends on the scheme. Scheme names consist of a sequence of characters. The lower case letters "a"--"z", digits, and the characters plus ("+"), period ("."), and hyphen ("-") are allowed. For resiliency, programs interpreting URLs should treat upper case letters as equivalent to lower case in scheme names (e.g., allow "HTTP" as well as "http").
Host
The hostname part of the URL should be a valid Internet hostname such as www.tallysolutions.com. It can also be an IP address such as 204.29.207.217
Port Number
The port number is optional. It is not necessary if the service is running on the default port: 80 for http servers and 443 for https servers.
Path Information
The path points to a particular directory on the specified server. The path is relative to the document root of the server, not necessarily to the root of the file system on the server. In general a server does not show its entire file system to clients. Indeed it may not really expose a file system at all. (Amazon's URLs, for example, mostly point into a database.) Rather it shows only the contents of a specified directory. This directory is called the server root, and all paths and filenames are relative to it. Thus on a Unix workstation all files that are available to the public might be in /var/public/html, but to somebody connecting from a remote machine this directory looks like the root of the file system.
The filename points to a particular file in the directory specified by the path. It is often omitted in which case it is left to the server's discretion what file, if any, to send. Many servers will send an index file for that directory, often called index.html. Others will send a list of the files in the directory. Others may send an error message.
Fragment identifier
The fragment identifier is used to reference a named anchor or ID in an HTML document. A named anchor is created in HTML document with an A element with a NAME attribute like this one:
<a name="anchor" >Here is the content you're after...</a>
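
The elements described in this section can be pulled apart with the standard java.net.URI class. The sketch below is only an illustration: the class name UrlParts and the example.com URL are made up for the demonstration.

```java
import java.net.URI;

// Hypothetical helper that prints the elements of a URL.
final class UrlParts {

    // Returns a one-line summary of the URL's elements.
    // getPort() returns -1 when the URL does not name a port,
    // and getQuery()/getFragment() return null when those parts are absent.
    static String describe(String url) {
        URI uri = URI.create(url);
        return "scheme=" + uri.getScheme()
                + " host=" + uri.getHost()
                + " port=" + uri.getPort()
                + " path=" + uri.getPath()
                + " query=" + uri.getQuery()
                + " fragment=" + uri.getFragment();
    }

    public static void main(String[] args) {
        // An illustrative URL that uses every element.
        System.out.println(describe("http://www.example.com:8080/cgi/search.cgi?q=cookies#results"));
        // The URL used earlier in this section has no port, query string or fragment.
        System.out.println(describe("http://www.tallysolutions.com/website/html/PartnerDetails/622894.php"));
    }
}
```

Note that the fragment is used only by the browser; it is not sent to the server as part of the request.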

Absolute and Relative URLs
Absolute URL
URLs that include the hostname are called absolute URLs. An example of an absolute URL is:
http://localhost/cgi/script.cgi
Relative URL
URLs without a scheme, host, or port are called relative URLs. These can be further broken down into full and relative paths:
Full paths
Relative URLs with an absolute path are sometimes referred to as full paths (even though they can also include a query string and fragment identifier). Full paths can be distinguished from URLs with relative paths because they always start with a forward slash. Note that in all these cases, the paths are virtual paths, and do not necessarily correspond to a path on the web server's filesystem. An example of an absolute path is /index.html.
Relative paths
Relative URLs that begin with a character other than a forward slash are relative paths. Examples of relative paths include script.cgi and ../images/photo.jpg.
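
To see how a browser turns a relative URL into an absolute one, the following sketch resolves the examples above against a base URL using java.net.URI.resolve. The class name RelativeUrls is hypothetical, and the base URL is simply the absolute URL example from this section.

```java
import java.net.URI;

// Hypothetical helper showing how relative URLs are resolved against a base URL.
final class RelativeUrls {

    // Resolves a reference (absolute URL, full path or relative path)
    // against the URL of the document that contains it.
    static String resolve(String base, String reference) {
        return URI.create(base).resolve(reference).toString();
    }

    public static void main(String[] args) {
        String base = "http://localhost/cgi/script.cgi";
        // A full path replaces the whole path of the base URL.
        System.out.println(resolve(base, "/index.html"));
        // A relative path is interpreted relative to the directory of the base URL.
        System.out.println(resolve(base, "other.cgi"));
        System.out.println(resolve(base, "../images/photo.jpg"));
    }
}
```

With this base, the three references resolve to http://localhost/index.html, http://localhost/cgi/other.cgi and http://localhost/images/photo.jpg respectively.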

URL Character Encoding Issues
URLs are sequences of characters, i.e., letters, digits, and special characters. A URL may be represented in a variety of ways: e.g., ink on paper, or a sequence of octets in a coded character set. The interpretation of a URL depends only on the identity of the characters used.
In most URL schemes, the sequences of characters in different parts of a URL are used to represent sequences of octets used in Internet protocols. For example, in the ftp scheme, the host name, directory name and file names are such sequences of octets, represented by parts of the URL. Within those parts, an octet may be represented by the character which has that octet as its code within the US-ASCII [20] coded character set.
In addition, octets may be encoded by a character triplet consisting of the character "%" followed by the two hexadecimal digits (from "0123456789ABCDEF") which form the hexadecimal value of the octet. (The characters "abcdef" may also be used in hexadecimal encodings.)
Octets must be encoded if they have no corresponding graphic character within the US-ASCII coded character set, if the use of the corresponding character is unsafe, or if the corresponding character is reserved for some other interpretation within the particular URL scheme.
No corresponding graphic US-ASCII
URLs are written only with the graphic printable characters of the US-ASCII coded character set. The octets 80-FF hexadecimal are not used in US-ASCII, and the octets 00-1F and 7F hexadecimal represent control characters; these must be encoded.
Unsafe
Characters can be unsafe for a number of reasons. The space character is unsafe because significant spaces may disappear and insignificant spaces may be introduced when URLs are transcribed or typeset or subjected to the treatment of word-processing programs. The characters < and > are unsafe because they are used as the delimiters around URLs in free text; the quote mark (""") is used to delimit URLs in some systems. The character "#" is unsafe and should always be encoded because it is used in World Wide Web and in other systems to delimit a URL from a fragment/anchor identifier that might follow it. The character "%" is unsafe because it is used for encodings of other characters. Other characters are unsafe because gateways and other transport agents are known to sometimes modify such characters. These characters are "{", "}", "|", "\", "^", "~", "[", "]", and "`".
All unsafe characters must always be encoded within a URL. For example, the character "#" must be encoded within URLs even in systems that do not normally deal with fragment or anchor identifiers, so that if the URL is copied into another system that does use them, it will not be necessary to change the URL encoding.
Reserved
Many URL schemes reserve certain characters for a special meaning: their appearance in the scheme-specific part of the URL has a designated semantics. If the character corresponding to an octet is reserved in a scheme, the octet must be encoded. The characters ";", "/", "?", ":", "@", "=" and "&" are the characters which may be reserved for special meaning within a scheme. No other characters may be reserved within a scheme.
Usually a URL has the same interpretation when an octet is represented by a character and when it is encoded. However, this is not true for reserved characters: encoding a character reserved for a particular scheme may change the semantics of a URL.
Thus, only alphanumerics, the special characters "$-_.+!*'(),", and reserved characters used for their reserved purposes may be used unencoded within a URL. On the other hand, characters that are not required to be encoded (including alphanumerics) may be encoded within the scheme-specific part of a URL, as long as they are not being used for a reserved purpose. 
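
The encoding rule above can be sketched in a few lines of Java. This is a minimal illustration under two assumptions that the text does not make: the class name PercentEncoder is invented, and the text is first converted to octets using UTF-8. The sketch also encodes the reserved characters, so it is meant for a single value (such as one form field), not for a complete URL whose "/" and "?" must keep their reserved meaning.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper that percent-encodes one URL component.
final class PercentEncoder {

    // The special characters that may appear unencoded, as listed above.
    private static final String SAFE_SPECIALS = "$-_.+!*'(),";

    static String encode(String text) {
        StringBuilder out = new StringBuilder();
        // Assumption: characters are turned into octets with UTF-8.
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            int octet = b & 0xFF;
            char c = (char) octet;
            boolean alphanumeric = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            if (alphanumeric || SAFE_SPECIALS.indexOf(c) >= 0) {
                out.append(c);
            } else {
                // "%" followed by the two hexadecimal digits of the octet.
                out.append('%').append(String.format("%02X", octet));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("a b#c"));  // prints a%20b%23c
        System.out.println(encode("50%"));    // prints 50%25
    }
}
```

The standard java.net.URLEncoder class follows the slightly different rules used for HTML form data (for example, it writes a space as "+"), which is why a hand-written loop is used here.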

The Hypertext Transfer Protocol

The Hypertext Transfer Protocol (HTTP) is an application-level protocol with the lightness and speed necessary for distributed, collaborative, hypermedia information systems. It is a generic, stateless, object-oriented protocol which can be used for many tasks, such as name servers and distributed object management systems, through extension of its request methods (commands). A feature of HTTP is the typing and negotiation of data representation, allowing systems to be built independently of the data being transferred.
In general, HTTP is the protocol that clients and servers use to communicate on the Web. HTTP is the underlying mechanism on which CGI operates, and it directly determines what you can and cannot send or receive via CGI.

HTTP Properties
A comprehensive addressing scheme
The HTTP protocol uses the concept of reference provided by the Uniform Resource Identifier (URI) as a location (URL) or name (URN), for indicating the resource on which a method is to be applied. When an HTML hyperlink is composed, the URL (Uniform Resource Locator) is of the general form http://host:port-number/path/file.html. More generally, a URL reference is of the type service://host/file.file-extension and in this way, the HTTP protocol can subsume the more basic Internet services.
Client-Server Architecture
The HTTP protocol is based on a request/response paradigm. The communication generally takes place over a TCP/IP connection on the Internet. The default port is 80, but other ports can be used. This does not preclude the HTTP/1.0 protocol from being implemented on top of any other protocol on the Internet, so long as reliability can be guaranteed.
The HTTP protocol is connectionless and stateless
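After the server has responded to the client's request, the connection between client and server is dropped and forgotten. (This is the default behavior in HTTP/1.0; HTTP/1.1 may keep the connection open for further requests, but each request is still handled independently.) There is no "memory" between client connections. The pure HTTP server implementation treats every request as if it were brand-new, i.e. without context.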
An extensible and open representation for data types
HTTP uses Internet Media Types (formerly referred to as MIME Content-Types) to provide open and extensible data typing and type negotiation. When the HTTP server transmits information back to the client, it includes a MIME-like (Multipurpose Internet Mail Extensions) header to inform the client what kind of data follows the header. Translation then depends on the client possessing the appropriate utility (image viewer, movie player, etc.) corresponding to that data type.

HTTP Header Fields
An HTTP transaction consists of a header followed optionally by an empty line and some data. The header will specify such things as the action required of the server, or the type of data being returned, or a status code.
The header lines received from the client, if any, are placed by the server into the CGI environment variables with the prefix HTTP_ followed by the header name. Any - characters in the header name are changed to _ characters. The server may exclude any headers which it has already processed, such as Authorization, Content-type, and Content-length.
HTTP_ACCEPT
The MIME (Multipurpose Internet Mail Extension) types which the client will accept, as given by HTTP headers. Other protocols may need to get this information from elsewhere. Each item in this list should be separated by commas as per the HTTP spec.
Format: type/subtype, type/subtype
HTTP_USER_AGENT
The browser the client is using to send the request. General format: software/version library/version.
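
The naming rule for these variables, and the comma-separated format of HTTP_ACCEPT, can be sketched as follows. The class and method names are invented for the illustration. The upper-casing of the header name is taken from the variable names shown above (HTTP_ACCEPT, HTTP_USER_AGENT) rather than stated in the rule itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for working with CGI-style header variables.
final class CgiHeaders {

    // "User-Agent" becomes "HTTP_USER_AGENT": prefix HTTP_, upper-case, "-" changed to "_".
    static String toEnvName(String headerName) {
        return "HTTP_" + headerName.toUpperCase(Locale.ROOT).replace('-', '_');
    }

    // Splits the value of HTTP_ACCEPT into its individual type/subtype items.
    static List<String> acceptedTypes(String httpAccept) {
        List<String> types = new ArrayList<>();
        for (String item : httpAccept.split(",")) {
            String type = item.trim();
            if (!type.isEmpty()) {
                types.add(type);
            }
        }
        return types;
    }

    public static void main(String[] args) {
        System.out.println(toEnvName("User-Agent"));               // prints HTTP_USER_AGENT
        System.out.println(acceptedTypes("text/html, image/gif")); // prints [text/html, image/gif]
    }
}
```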
The server sends back to the client:
1) A status code that indicates whether the request was successful or not. Typical error codes indicate that the requested file was not found, that the request was malformed, or that authentication is required to access the file.
2) The data itself. Since HTTP is liberal about sending documents of any format, it is ideal for transmitting multimedia such as graphics, audio, and video files.
3) Information about the object being returned.
Fields are:
Content-Type
Indicates the media type of the data sent to the recipient or, in the case of the HEAD method, the media type that would have been sent had the request been a GET.
Content-Type: text/html
Date
The date and time at which the message was originated.
Date: Tue, 15 Nov 1994 08:12:31 GMT
Expires
The date after which the information in the document ceases to be valid. Caching clients, including proxies, must not cache this copy of the resource beyond the date given, unless its status has been updated by a later check of the origin server.
Expires: Thu, 01 Dec 1994 16:00:00 GMT
From
An Internet e-mail address for the human user who controls the requesting user agent. The request is being performed on behalf of the person given, who accepts responsibility for the method performed. Robot agents should include this header so that the person responsible for running the robot can be contacted if problems occur on the receiving end.
From: Stars@WDVL.com
If-Modified-Since
Used with the GET method to make it conditional: if the requested resource has not been modified since the time specified in this field, a copy of the resource will not be returned from the server; instead, a 304 (not modified) response will be returned without any data.
If-Modified-Since: Sat, 29 Oct 1994 19:43:31 GMT
Last-Modified
Indicates the date and time at which the sender believes the resource was last modified. Useful for clients that eliminate unnecessary transfers by using caching.
Last-Modified: Tue, 15 Nov 1994 12:45:26 GMT
Location
The Location response header field defines the exact location of the resource that was identified by the request URI. If the value is a full URL, the server returns a "redirect" to the client to retrieve the specified object directly.
Location: http://WWW.Stars.com/Tutorial/HTTP/index.html
If you want to reference another file on your own server, you should output a partial URL, such as the following:
Location: /Tutorial/HTTP/index.html
Referer
Allows the client to specify, for the server's benefit, the address (URI) of the resource from which the request URI was obtained. This allows a server to generate lists of back-links to resources for interest, logging, optimized caching, etc. It also allows obsolete or mistyped links to be traced for maintenance.
Referer: http://WWW.Stars.com/index.html
Server
The Server response header field contains information about the software used by the origin server to handle the request.
Server: CERN/3.0 libwww/2.17
User-Agent
Information about the user agent originating the request. This is for statistical purposes, the tracing of protocol violations, and automated recognition of user agents for the sake of tailoring responses to avoid particular user agent limitations - such as inability to support HTML tables.
User-Agent: CERN-LineMode/2.15 libwww/2.17b3

HTTP Request Methods
HTTP/1.0 allows an open-ended set of methods to be used to indicate the purpose of a request. The three most often used methods are GET, HEAD, and POST.
The GET Method
Information from a form using the GET method is appended onto the end of the action URI being requested. Your CGI program will receive the encoded form input in the environment variable QUERY_STRING.
The GET method is used to ask for a specific document - when you click on a hyperlink, GET is being used. GET should probably be used when a URL access will not change the state of a database (by, for example, adding or deleting information) and POST should be used when an access will cause a change. Many database searches have no visible side-effects and make ideal applications of query forms using GET. The semantics of the GET method changes to a "conditional GET" if the request message includes an If-Modified-Since header field. A conditional GET method requests that the identified resource be transferred only if it has been modified since the date given by the If-Modified-Since header.
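
The decision a server makes for a conditional GET can be sketched as below. This is an illustration only: the class name ConditionalGet and the method statusFor are hypothetical, and a real server would take the modification time from the file or database rather than from a string. Dates are parsed with the standard DateTimeFormatter.RFC_1123_DATE_TIME formatter, which accepts the date format used in the header examples of this article.

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical sketch of the server-side check behind a conditional GET.
final class ConditionalGet {

    // Returns 200 (send the resource) if it was modified after the date in
    // If-Modified-Since, otherwise 304 (not modified, send no data).
    static int statusFor(String ifModifiedSince, String lastModified) {
        ZonedDateTime since =
                ZonedDateTime.parse(ifModifiedSince, DateTimeFormatter.RFC_1123_DATE_TIME);
        ZonedDateTime modified =
                ZonedDateTime.parse(lastModified, DateTimeFormatter.RFC_1123_DATE_TIME);
        return modified.isAfter(since) ? 200 : 304;
    }

    public static void main(String[] args) {
        // The client's copy dates from 29 Oct 1994, the resource changed on 15 Nov 1994.
        System.out.println(statusFor("Sat, 29 Oct 1994 19:43:31 GMT",
                                     "Tue, 15 Nov 1994 12:45:26 GMT")); // prints 200
        // The resource has not changed since the client's copy was fetched.
        System.out.println(statusFor("Tue, 15 Nov 1994 12:45:26 GMT",
                                     "Tue, 15 Nov 1994 12:45:26 GMT")); // prints 304
    }
}
```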
The HEAD method
The HEAD method is used to ask only for information about a document, not for the document itself. HEAD is much faster than GET, as a much smaller amount of data is transferred. It is often used by clients that use caching, to see if the document has changed since it was last accessed. If it has not, the local copy can be reused; otherwise the updated version must be retrieved with a GET.
The POST Method
This method transmits all form input in the body of the request, after the request headers, rather than as part of the requested URI. Your CGI program will receive the encoded form input on stdin.
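
Whether the encoded form input arrives in QUERY_STRING (GET) or on stdin (POST), it normally has the same shape: name=value pairs joined by "&", with spaces written as "+" and other special characters written as %XX. The sketch below decodes such a string into a map. The class name FormInput and the sample field names are invented for the example, UTF-8 is assumed as the character encoding, and when a name is repeated only the last value is kept.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that decodes form input sent with GET or POST.
final class FormInput {

    // Splits "name=value&name=value" and decodes each name and value.
    static Map<String, String> parse(String encoded) {
        Map<String, String> fields = new LinkedHashMap<>();
        if (encoded == null || encoded.isEmpty()) {
            return fields;
        }
        for (String pair : encoded.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            fields.put(decode(name), decode(value));
        }
        return fields;
    }

    // URLDecoder turns "+" into a space and %XX triplets back into characters.
    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8"); // assumption: the form was sent as UTF-8
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("name=John+Smith&city=New%20York"));
    }
}
```

In a servlet or JSP page this decoding is already done by the container, and the values are available through request.getParameter(name).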