Too Cool for Internet Explorer

Monday, October 6, 2008

How does this wide “Web World” work?

The first and obvious step: the user connects to the internet. OK, everybody knows that, but it means he is now able to connect his computer to others around the globe (or outside it, I guess).

Each computer (like any other device) connected to the internet has an address that identifies it on the network. This is called an IP address, and it looks like this: “172.16.254.1”.
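As a small illustration, Python's standard `ipaddress` module can parse the example address above. This is only a sketch and the variable name is mine. It also shows that this particular address sits in a range reserved for private networks, so a machine using it would normally reach the wider internet through a router that has a public address.

```python
import ipaddress

# Parse the dotted-decimal example from the text into an address object.
addr = ipaddress.ip_address("172.16.254.1")

print(addr.version)     # 4, meaning this is an IPv4 address
print(int(addr))        # the same address written as one 32-bit number
print(addr.is_private)  # True, because 172.16.0.0/12 is reserved for private networks
```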

Since each connected computer is identified, we just need them to talk to each other.

To send and receive packets of information between them, they use a common suite of protocols named after its two best-known members, TCP and IP: TCP/IP for short.
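To make this a little more concrete, here is a minimal sketch of two programs exchanging bytes over TCP/IP, using Python's standard `socket` module. Both ends run on the same machine (the loopback address 127.0.0.1) so it needs no real network. The function names `run_echo_server` and `exchange` are hypothetical, and the "echo" behaviour is just something simple to demonstrate with.

```python
import socket
import threading


def recv_all(sock: socket.socket) -> bytes:
    """Read from the socket until the other side stops sending."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


def run_echo_server(server_sock: socket.socket) -> None:
    """Accept one connection, read the message, and send it back with a prefix."""
    conn, _addr = server_sock.accept()
    with conn:
        data = recv_all(conn)
        conn.sendall(b"echo: " + data)


def exchange(message: bytes) -> bytes:
    """Start a one-shot server on a free local port and talk to it as a client."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_sock:
        server_sock.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        server_sock.listen(1)
        port = server_sock.getsockname()[1]
        worker = threading.Thread(target=run_echo_server, args=(server_sock,))
        worker.start()
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(message)
            client.shutdown(socket.SHUT_WR)  # tell the server we are done sending
            reply = recv_all(client)
        worker.join()
    return reply


print(exchange(b"hello"))
```

The IP part is the address and port the client connects to; the TCP part is what makes the bytes arrive complete and in order.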

There are two basic kinds of computers connected to the internet: Clients, like ourselves, and Servers, where the data and applications we need are stored.

Once the computers are connected, they need a higher-level protocol and a program on each one to establish a “conversation”.

This higher-level protocol is HTTP, and the programs are a Web browser on the client side and a Web server on the server side.

Most of the time we are sending and receiving files through the internet. So we will use the internet to send a file like this: “file:///C:/MySfuff/Pictures/Christmas/PIC002_20071225.jpg” to another computer, or to ask for a file like this: “http://204.2.249.122/business/world/prexy.html?hp” from another one.

Each of the strings above completely identifies a resource. A resource here is anything that can be identified, named, addressed or handled in any kind of networked information system. These strings are called URIs (Uniform Resource Identifiers), and the popular URL (Uniform Resource Locator) is just a subset of them.
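To see what such a string is made of, here is a short sketch that splits the two examples above into their parts with `urlsplit` from Python's standard library. The list name `uris` is mine.

```python
from urllib.parse import urlsplit

# The two example identifiers quoted in the text.
uris = [
    "file:///C:/MySfuff/Pictures/Christmas/PIC002_20071225.jpg",
    "http://204.2.249.122/business/world/prexy.html?hp",
]

for uri in uris:
    parts = urlsplit(uri)
    # scheme: how to reach the resource; netloc: which machine;
    # path: where on that machine; query: extra information after "?"
    print(parts.scheme, repr(parts.netloc), parts.path, repr(parts.query))
```

The `file` example has an empty machine part because it points at the local disk, while the `http` one names the server by its IP address.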

The language that lets the client and the server really establish a “conversation” is, most of the time, the HyperText Markup Language (HTML for short).

That is the language web browsers understand, so that is what web servers deliver to their clients: HTML files in the first instance, and then all the associated files, like images, sounds, animations, videos, etc.

Besides serving HTML, there are many other operations that can be done, and each kind of operation must be supplied by a different kind of server. For instance, if you want to send or receive emails you will need an e-mail server; if you want to download or upload files, you may need an FTP server (FTP is short for File Transfer Protocol).

Now we can establish a conversation between the Client and the Server through the browser. Here is an example:

A user types a URL into his browser and presses ENTER, or simply clicks on a link that points to it:

http://www.nypublished.com/pages/business/index.html

The browser will translate this URL into an HTTP request, like this:

GET /pages/business/index.html HTTP/1.1
Host: www.nypublished.com
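That translation can be sketched in a few lines of Python. This is a simplification under my own assumptions: the function name `build_get_request` is hypothetical, and a real browser sends many more headers (User-Agent, Accept and so on). Note that the path in the request line starts with a single slash.

```python
from urllib.parse import urlsplit


def build_get_request(url: str) -> str:
    """Turn a URL into the text of a minimal HTTP/1.1 GET request."""
    parts = urlsplit(url)
    path = parts.path or "/"  # an empty path means the site's root
    if parts.query:
        path += "?" + parts.query
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {parts.netloc}",  # tells the server which site is wanted
        "",  # a blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines)  # HTTP lines end with carriage return + line feed


request = build_get_request("http://www.nypublished.com/pages/business/index.html")
print(request)
```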

The browser wraps this request in the TCP/IP protocol and sends it.

When the web server at www.nypublished.com receives it, it will append the given path to the path of its root directory, “/nyp/www/” for instance. The result is the path of the resource on the local file system:

/nyp/www/pages/business/index.html
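Here is a minimal sketch of that mapping, assuming a Unix-style file system. The function name `resolve_path` is hypothetical. One detail a real server has to care about is a request path containing “..”, which must not be allowed to climb out of the root directory, so the sketch normalises the path first.

```python
import posixpath


def resolve_path(document_root: str, request_path: str) -> str:
    """Map the path from an HTTP request onto a path under the server's root."""
    # Normalise as an absolute path so that ".." cannot climb above "/".
    safe = posixpath.normpath("/" + request_path)
    # Drop the leading slashes and attach what is left to the root directory.
    return posixpath.join(document_root, safe.lstrip("/"))


print(resolve_path("/nyp/www/", "/pages/business/index.html"))
print(resolve_path("/nyp/www/", "/../../etc/passwd"))  # stays inside /nyp/www/
```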

At this point the web server has a few alternatives, but the common action is to try to find the file locally and send a response back to the client’s web browser.
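The "find the file and answer" step could look roughly like the sketch below. It is an assumption-laden simplification: `serve_file` is a hypothetical name, it returns only a status line and a body, and the temporary directory and page content exist purely to make the example runnable.

```python
import os
import tempfile
from typing import Tuple


def serve_file(local_path: str) -> Tuple[str, bytes]:
    """Return a (status line, body) pair for a path on the local file system."""
    if os.path.isfile(local_path):
        with open(local_path, "rb") as f:
            return "HTTP/1.1 200 OK", f.read()
    # Nothing there: the server answers with the well-known 404 status.
    return "HTTP/1.1 404 Not Found", b"Not Found"


# Try it against a throwaway directory standing in for the server's root.
with tempfile.TemporaryDirectory() as root:
    page = os.path.join(root, "index.html")
    with open(page, "wb") as f:
        f.write(b"<html>Business news</html>")
    found = serve_file(page)
    missing = serve_file(os.path.join(root, "nope.html"))

print(found[0])
print(missing[0])
```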

It sounds a bit too simple to be true. OK, it is a simplification; it may have been true in the early days of the web, with just a few static pages, but now, with all that interactivity, no way. We need to add more elements to the equation.

And although browsers have evolved a lot over the years, the big difference is on the server side.

Let’s take a look at another example, accessing an interactive (application) page, like this:

http://www.campinghq.com/mastercamp/forums/1

As you can see, there is no file name at the end. What happens in this case?

The browser will translate this URL into an HTTP request, like this:

POST /mastercamp/forums.rb HTTP/1.0
From: newbie@webworld.com
User-Agent: HTTPTool/1.0
Content-Type: application/x-www-form-urlencoded
Content-Length: 12

home=forum=1
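On the receiving end, the server has to take a request like the one above apart again. The sketch below shows one simple way to do that; `parse_request` is a hypothetical name, and real servers handle many more edge cases. The sample request is rebuilt here with the Content-Length computed from the body, since that header must match the number of bytes that follow the blank line.

```python
def parse_request(raw: str):
    """Split raw HTTP request text into method, target, version, headers and body."""
    head, _, body = raw.partition("\r\n\r\n")  # a blank line separates head and body
    request_line, *header_lines = head.split("\r\n")
    method, target, version = request_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names ignore case
    return method, target, version, headers, body


body = "home=forum=1"
raw = (
    "POST /mastercamp/forums.rb HTTP/1.0\r\n"
    "From: newbie@webworld.com\r\n"
    "User-Agent: HTTPTool/1.0\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(body)}\r\n"
    "\r\n"
    + body
)

method, target, version, headers, parsed_body = parse_request(raw)
print(method, target, version)
print(headers["content-length"], parsed_body)
```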

The browser wraps this request in the TCP/IP protocol and sends it through the internet (1).

Now we need to go deeper into the server side. In this case, there is no file with the content you need on the web server itself. That is a situation where the web server needs help from other servers.

This is what we call a dynamic page. In fact, there is no HTML file on the web server that could simply be returned to you in response. The file containing your response will be created “on the fly” in this case.

There is a three-tier structure (not necessarily three different machines) that will be used to give you the answer. There is a Web Server, which receives the request, interprets it, and tries to find an answer to it; this is the first tier (2).

If the Web Server cannot find an answer to the request, it sends the request to an Application Server, the second tier, which in turn tries to find an application and a functionality to handle the request (3).

In this example, “mastercamp” and “forums” are, respectively, the application and the functionality to be used.

There is an application called “mastercamp”, and it has a “forums” functionality that will be called.

The last parameter, “1”, will be used to pick information (data) from the Database Server, the third tier (4).
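Splitting the path into those three pieces can be sketched like this. The names `Route` and `parse_route` are hypothetical, and a real application server would use a richer routing table rather than a fixed three-segment rule.

```python
from typing import NamedTuple


class Route(NamedTuple):
    application: str
    functionality: str
    parameter: str


def parse_route(path: str) -> Route:
    """Split a path like /mastercamp/forums/1 into its three parts."""
    segments = [s for s in path.split("/") if s]  # ignore empty pieces around slashes
    if len(segments) != 3:
        raise ValueError(f"expected /application/functionality/parameter, got {path!r}")
    return Route(*segments)


route = parse_route("/mastercamp/forums/1")
print(route.application, route.functionality, route.parameter)
```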

So, the “forums” functionality gathers information from the database (5) and prepares it; the “mastercamp” application transforms it into an HTML file and sends it back to the Web Server (6).
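Here is a toy version of steps (4) to (6), with an in-memory SQLite database standing in for the Database Server. Everything in it is made up for illustration: the table layout, the forum title, and the function names `make_database`, `forums` and `render_html` are all assumptions, not how any particular application is built.

```python
import sqlite3
from html import escape


def make_database() -> sqlite3.Connection:
    """Third tier: a tiny in-memory database with one invented forum."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE forums (id INTEGER PRIMARY KEY, title TEXT)")
    conn.execute("INSERT INTO forums (id, title) VALUES (1, 'Tents & Tarps')")
    return conn


def forums(conn: sqlite3.Connection, forum_id: int):
    """The 'forums' functionality: gather the data for one forum, or None."""
    row = conn.execute(
        "SELECT id, title FROM forums WHERE id = ?", (forum_id,)
    ).fetchone()
    return None if row is None else {"id": row[0], "title": row[1]}


def render_html(forum) -> str:
    """The application turns the prepared data into an HTML page."""
    if forum is None:
        return "<html><body><h1>Forum not found</h1></body></html>"
    # escape() keeps characters like "&" from being read as markup.
    return f"<html><body><h1>{escape(forum['title'])}</h1></body></html>"


conn = make_database()
page = render_html(forums(conn, 1))
print(page)
```

The page never existed as a file; it was assembled from the database row at the moment of the request.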

The Web Server then generates an HTTP response, like this:

HTTP/1.1 200 OK
Server: Apache/2.2.9
Date: Sat, 14 Jul 2007 09:48:04 GMT
Content-type: text/html
X-Powered-By: Ruby/1.8.6
Set-Cookie: rbrd2mysql_data=a%3A0%3A%7B%7D; expires=Sat, 14-Jul-2007 09:48:04 GMT; path=/
Set-Cookie: rbrd2mysql_sid=b389d63f8226cc6c8ad349b3aadf41f3; path=/
Refresh: 0; URL=http://SERVER/rbrd2foobar
Content-Length: 0

HTTP/1.0 200 OK
Content-Length: 7

The Web Server wraps this response in the TCP/IP protocol and sends it back through the internet (7) so that it reaches the client (8).
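As a rough sketch of how such a response is put together before it goes on the wire, the function below builds a status line, two headers and the body. The name `build_response` is hypothetical and only a couple of headers are included; the point is that Content-Length is computed from the actual bytes of the body.

```python
def build_response(body: str, status: str = "200 OK") -> bytes:
    """Assemble a minimal HTTP/1.1 response carrying an HTML body."""
    payload = body.encode("utf-8")
    head = [
        f"HTTP/1.1 {status}",
        "Content-Type: text/html; charset=utf-8",
        f"Content-Length: {len(payload)}",  # counted in bytes, not characters
        "",  # a blank line separates the headers from the body
        "",
    ]
    return "\r\n".join(head).encode("ascii") + payload


response = build_response("<h1>Hi</h1>")
print(response.decode("utf-8"))
```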

Here is “The Big Picture”:




That is a short description of “How the Internet works”.

I did not go deep into any details, but all the basic points are here, with links to more detailed explanations.
