In essence, HTTP (Hypertext Transfer Protocol) is what lets the World Wide Web work as a global repository of interconnected information. It lets users navigate between hypertext documents and transfer files, and it relies on TCP connections to carry client requests and server replies.
Let's explore its features to gain insights into how HTTP operates regarding communication, data transmission, and the fundamental principles guiding interactions between clients and servers on the World Wide Web.
HTTP Features:
1. Stateless Protocol:
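- HTTP is a stateless protocol, meaning that each request is independent and the server does not retain session information between requests. In HTTP/1.0 the connection between the browser and server is also closed after each response; HTTP/1.1 can keep the connection open for further requests, but the protocol itself remains stateless.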
- Stateless nature enhances simplicity and scalability, as each request is self-contained.
2. Plain-Text Communication:
- HTTP operates as a plain-text protocol, implying that communication between the client and the server is not encrypted. Data is transmitted across the network in clear text.
- This lack of encryption makes it easier to inspect the content but may pose security concerns for sensitive data.
3. Request-Response Cycle:
- HTTP follows a request-response cycle. When a client wants to access a resource on a remote server, it sends an HTTP request in the format defined by the HTTP/1.1 standard (RFC 2616, since superseded by RFC 9110 and RFC 9112).
- The server interprets the request, processes it, and responds with the requested resource, which is often an HTML document but can be any media type. The response also includes a status code and HTTP response headers.
4. TCP Connection:
- HTTP uses TCP (Transmission Control Protocol) connections for sending and receiving data on the web. Before sending a request to the server, a 3-way TCP handshake is required to establish a connection.
- The TCP connection ensures reliable and ordered delivery of data between the client and server.
5. Default Port 80:
- The default port for HTTP communication is port 80. Ports are logical paths or communication channels used to identify processes or services on remote or local machines.
- Port 80 serves as the default channel for exchanging information with different servers and machines over the internet.
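The features above can be observed in a small, self-contained experiment. The following is a minimal sketch, not a production setup: it starts Python's built-in `http.server` on a free local port (rather than port 80, which usually needs elevated privileges), opens a TCP connection to it, and sends a hand-written plain-text request. The names `HelloHandler`, `fetch_raw`, and `demo` are made up for this illustration.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a short text body."""

    def do_GET(self):
        body = b"hello\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_raw(host, port, path="/"):
    """Open a TCP connection, send a plain-text request, return the raw reply."""
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    # create_connection performs the TCP handshake before any HTTP is sent.
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(request.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection after responding
                break
            chunks.append(data)
    return b"".join(chunks)


def demo():
    """Run one request-response cycle against a throwaway local server."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        host, port = server.server_address
        return fetch_raw(host, port)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo().decode("iso-8859-1"))
```

Because nothing is encrypted, both the request and the reply are readable as ordinary text, and the server keeps no memory of the exchange once the response has been sent.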
HTTP Transaction
Steps Involved
HTTP transactions provide insight into how clients and servers communicate, exchange information, and deliver web content in the interconnected landscape of the World Wide Web.
The interaction between a client and a server in HTTP involves the initiation and exchange of messages, forming a transaction.
Here's a breakdown of the key components and steps in an HTTP transaction:
1. Initiation:
The client kickstarts the transaction by sending a request message to the server. The server responds by sending a corresponding response message.
2. Message Types:
There are two main types of HTTP messages: request messages and response messages.
- A Request Message comprises a request line, headers, and an optional body.
- A Response Message includes a status line, headers, and sometimes a body.
3. Request–Response Protocol:
HTTP operates as a request–response protocol within the client–server computing model. A client, such as a web browser, submits an HTTP request message to a server, typically hosting a web application.
The server processes the request, provides resources or performs functions on behalf of the client, and returns a response message.
4. User Agents:
User agents, like web browsers, act as clients in the HTTP model. Other user agents include web crawlers, mobile apps, voice browsers, and other software that consumes web content.
5. Intermediate Network Elements:
HTTP is designed to accommodate intermediate network elements that enhance communication between clients and servers. Web cache servers, for example, optimize response times by delivering content on behalf of upstream servers.
HTTP proxy servers facilitate communication for clients without globally routable addresses.
6. Protocol Adaptability:
While TCP is commonly used as the reliable transport layer protocol, HTTP can adapt to use other protocols like UDP (User Datagram Protocol), as seen in HTTPU and SSDP (Simple Service Discovery Protocol).
7. Resource Identification:
HTTP resources are identified and located on the network using Uniform Resource Locators (URLs). Uniform Resource Identifiers (URIs) with schemes such as http and https facilitate the interlinking of hypertext documents in HTML.
Figure: Uniform Resource Locators (URLs)
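To make resource identification concrete, here is a small sketch that splits a URL into its parts with Python's standard `urllib.parse` module. The helper name `describe_url` and the example URLs are assumptions made for illustration; the default ports are the ones HTTP and HTTPS conventionally use.

```python
from urllib.parse import urlsplit

# Conventional default ports for the two schemes discussed in this article.
DEFAULT_PORTS = {"http": 80, "https": 443}


def describe_url(url):
    """Break a URL into scheme, host, port, path, and query string."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        # Fall back to the scheme's default port when none is written out.
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path or "/",
        "query": parts.query,
    }


if __name__ == "__main__":
    print(describe_url("http://www.example.com/docs/index.html?lang=en"))
```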
HTTP Specifications
HTTP is standardized by the Internet Engineering Task Force (IETF); the early specifications were developed in coordination with the World Wide Web Consortium (W3C), which has listed them at http://www.w3.org/standards/techs/http. The original version, HTTP/0.9 (1991), authored by Tim Berners-Lee, served as a basic protocol for transferring raw data over the Internet. HTTP/1.0 (1996), defined in RFC 1945, enhanced the protocol by introducing MIME-like messages. However, HTTP/1.0 lacked provisions for proxies, caching, persistent connections, virtual hosts, and range downloads. These features were incorporated into HTTP/1.1, specified in RFC 2616 (1999) and since revised, most recently by RFC 9110 and RFC 9112 (2022). HTTP/2 (2015) and HTTP/3 (2022) are newer versions that change how messages are carried on the wire while keeping the same methods, status codes, and headers; this article focuses on HTTP/1.x.
HTTP Server
To delve into the intricacies of the HTTP protocol, one requires an HTTP server, with notable examples being the Apache HTTP Server or Apache Tomcat Server.
The Apache HTTP server stands out as a widely used, robust production server developed by the Apache Software Foundation (ASF) and can be explored further at www.apache.org. ASF operates as an open-source software foundation, indicating that the Apache HTTP server is freely available, complete with its source code.
The first HTTP server was written by Tim Berners-Lee at CERN (the European Organization for Nuclear Research) in Geneva, Switzerland, which is also the birthplace of HTML. The Apache HTTP server has a different lineage: it was built on the NCSA (National Center for Supercomputing Applications, USA) "httpd 1.3" server in early 1995. The name "Apache" is often explained as a pun on "a patchy server", since the project began as original NCSA code plus patches; it may also refer to the name of a Native American tribe.
For guidance on installing and configuring the Apache HTTP server, one can refer to the "Apache How-to." Similarly, the "Tomcat How-to" provides insights into installing and starting the Apache Tomcat Server.
HTTP Request & Response Format
When a user requests a resource from the server, specific information is sent to the web server, including the HTTP Verb, HTTP Headers, and HTTP Message Body. The web server analyzes and processes the request based on the HTTP Standard Request Format, which consists of the following components:
- HTTP Verb
- HTTP Headers
- HTTP Message Body
Key Points to Note:
- \r represents a carriage return (CR) and \n represents a line feed (LF). Together they form CRLF, the sequence that terminates the request line and every header line.
- The request line comes first, followed by the headers, one per line.
- An empty line (a bare CRLF) separates the headers from the message body.
- Format of the request line: HTTP verb, resource path, and HTTP version, separated by single spaces, e.g., `GET /login.php HTTP/1.1`.
- The HTTP Message Body contains data or information about the request, and it can be empty or blank.
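The request layout (a request line, CRLF-terminated headers, an empty line, then an optional body) can be illustrated with a short parser. This is a simplified sketch for well-formed HTTP/1.x requests only; the function name `parse_request` and the sample request are invented for this example, and a real server would need far more validation.

```python
def parse_request(raw: bytes):
    """Split a raw HTTP/1.x request into its components."""
    # The first empty line (CRLF CRLF) separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Request line: verb, resource path, and version separated by spaces.
    method, target, version = lines[0].split(" ", 2)

    # Remaining lines are "Name: value" headers; names are case-insensitive.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    return method, target, version, headers, body


if __name__ == "__main__":
    sample = (
        b"POST /login.php HTTP/1.1\r\n"
        b"Host: example.com\r\n"
        b"Content-Length: 9\r\n"
        b"\r\n"
        b"user=demo"
    )
    print(parse_request(sample))
```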
HTTP Methods/Verb
HTTP methods, also called verbs, indicate the type of operation the client asks the web server to perform on the requested resource.
Here are the main HTTP methods used on the Internet:
| Method | Purpose | Usage | Caution |
| --- | --- | --- | --- |
| GET | Retrieve information or request a resource from the server | Parameters, such as user input, are included in the URL of the GET request and can be processed by a server-side script (for example, PHP). | N/A |
| POST | Upload files and data to the remote server | Employed to submit forms, files, and data to the server in the message body. | N/A |
| OPTIONS | List the methods available for the requested resource | N/A | N/A |
| HEAD | Similar to GET, but the response contains only the status line and headers, not the message body | N/A | N/A |
| TRACE | Network diagnostics and troubleshooting; the server echoes back the request it received | N/A | N/A |
| DELETE | Delete a resource or file on the server | N/A | Potentially destructive; requires careful handling as it can delete files or data on the server. |
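One way to reason about the caution attached to each method is to record which methods only read data. The sketch below uses the terms "safe" (read-only) and "idempotent" (repeating the request has the same effect as sending it once) from the HTTP specification; those terms, and the helper name `method_properties`, are not part of the table above and are introduced here only for illustration. The sets are limited to the methods this article lists.

```python
# Safe methods are read-only from the client's point of view.
SAFE_METHODS = frozenset({"GET", "HEAD", "OPTIONS", "TRACE"})

# Idempotent methods can be repeated without changing the outcome further.
# Every safe method is idempotent; DELETE is idempotent but not safe.
IDEMPOTENT_METHODS = SAFE_METHODS | {"DELETE"}


def method_properties(method):
    """Return (is_safe, is_idempotent) for one of the methods listed above."""
    name = method.upper()
    return name in SAFE_METHODS, name in IDEMPOTENT_METHODS


if __name__ == "__main__":
    for verb in ("GET", "POST", "OPTIONS", "HEAD", "TRACE", "DELETE"):
        safe, idempotent = method_properties(verb)
        print(f"{verb:8} safe={safe} idempotent={idempotent}")
```

A client or proxy can use this kind of lookup to decide, for instance, whether a failed request may be retried automatically: retrying a GET is harmless, while retrying a POST may submit a form twice.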
HTTP Headers
HTTP header fields appear in the header section of both request and response messages. They define parameters that let clients and servers pass additional information alongside their requests or responses, such as details about the resource to be fetched or about the client. Each HTTP header is a name-value pair. Custom headers have traditionally been marked with an "X-" prefix, although that convention was deprecated by RFC 6648.
In RFC 2616, headers fall into four categories: general header, request header, response header, and entity header. A request message can include general, request, and entity headers, while a response message can include general, response, and entity headers.
Here are some commonly used HTTP headers:
General header: The general-header gives general information about the message and can be present in both a request and a response.
| Header | Description |
| --- | --- |
| Cache-Control | Specifies information about caching |
| Connection | Shows whether the connection should be closed or not |
| Date | Shows the date and time at which the message was generated |
| Upgrade | Specifies the preferred communication protocol |
Request header: The request header can be present only in a request message. It specifies the client's configuration and the client's preferred document format.
| Header | Description |
| --- | --- |
| Accept | Shows the media formats the client can accept |
| From | Shows the e-mail address of the user |
Response header: The response header can be present only in a response message. It specifies the server's configuration and special information about the request.
| Header | Description |
| --- | --- |
| Accept-Ranges | Shows whether the server supports range requests from the client |
| Server | Shows the server name and version number |
Entity header: The entity-header gives information about the body of the document.
| Header | Description |
| --- | --- |
| Allow | Lists valid methods that can be used with a URL |
| Content-Encoding | Specifies the encoding scheme |
| Content-Language | Specifies the language |
| Content-Length | Shows the length of the document in bytes |
| Content-Range | Specifies the range of the document |
| Content-Type | Specifies the media type |
| ETag | Gives an entity tag |
| Expires | Gives the date and time after which the contents are considered stale |
| Last-Modified | Gives the date and time of the last change |
| Location | Specifies the location of the created or moved document |
Body: The body can be present in a request or a response message.
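As an illustration of headers as name-value pairs, the following sketch parses a header block with `http.client.parse_headers` from Python's standard library. The header values and the wrapper name `read_headers` are made up for this example.

```python
import io
from http.client import parse_headers


def read_headers(raw_header_block: bytes):
    """Parse CRLF-terminated header lines (ending with an empty line)."""
    return parse_headers(io.BytesIO(raw_header_block))


if __name__ == "__main__":
    headers = read_headers(
        b"Content-Type: text/html; charset=utf-8\r\n"
        b"Content-Length: 1024\r\n"
        b"Cache-Control: no-cache\r\n"
        b"\r\n"
    )
    # Header names are matched case-insensitively.
    print(headers["content-type"])
    print(headers.get_content_type())
    print(int(headers["Content-Length"]))
```

The returned object behaves like a case-insensitive mapping, which mirrors the rule that `Content-Type` and `content-type` name the same header.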
HTTP Response
After receiving and interpreting a request message, a server responds with an HTTP response message.
Figure: HTTP response
Message Status-Line:
A status line consists of the protocol version followed by a numeric status code and its associated textual phrase. The elements are separated by space (SP) characters.
Status-Line Format:
`HTTP-Version SP Status-Code SP Reason-Phrase CRLF`
HTTP Version:
A server supporting HTTP version 1.1 will return the following version information: `HTTP/1.1`
Status code:
The Status-Code element is a 3-digit integer where the first digit defines the class of the response, and the last two digits do not have any categorization role. There are 5 values for the first digit:
Code and Description:
- 1xx Informational: The request was received, and the process is continuing.
- 2xx Success: The action was successfully received, understood, and accepted.
- 3xx Redirection: Further action must be taken to complete the request.
- 4xx Client Error: The request contains incorrect syntax or cannot be fulfilled.
- 5xx Server Error: The server failed to fulfill an apparently valid request.
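The status-line format and the role of the first digit can be sketched in a few lines. The helper names `parse_status_line` and `status_class` are hypothetical, and the parser assumes a well-formed status line.

```python
# First digit of the status code -> response class, as listed above.
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def parse_status_line(line):
    """Split 'HTTP-Version SP Status-Code SP Reason-Phrase' into parts."""
    parts = line.rstrip("\r\n").split(" ", 2)
    version, code = parts[0], int(parts[1])
    # The reason phrase may contain spaces, or be absent altogether.
    reason = parts[2] if len(parts) > 2 else ""
    return version, code, reason


def status_class(code):
    """Classify a 3-digit status code by its first digit."""
    return STATUS_CLASSES.get(code // 100, "Unknown")


if __name__ == "__main__":
    version, code, reason = parse_status_line("HTTP/1.1 404 Not Found\r\n")
    print(version, code, reason, status_class(code))
```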
| Code | Phrase | Description |
| --- | --- | --- |
| Informational | | |
| 100 | Continue | The initial part of the request has been received, and the client may continue. |
| 101 | Switching Protocols | The server is complying with a client request to switch protocols. |
| Success | | |
| 200 | OK | The request is successful. |
| 201 | Created | A new resource has been created. |
| 202 | Accepted | The request is accepted, but not immediately acted upon. |
| 204 | No Content | There is no content in the body. |
| Redirection | | |
| 301 | Moved Permanently | The requested URL is no longer used by the server. |
| 302 | Found (called "Moved Temporarily" in HTTP/1.0) | The requested URL has been moved temporarily. |
| 304 | Not Modified | The document has not been modified. |
| Client Error | | |
| 400 | Bad Request | There is a syntax error in the request. |
| 401 | Unauthorized | The request lacks valid authentication credentials. |
| 403 | Forbidden | The server understood the request but refuses to fulfill it. |
| 404 | Not Found | The requested URL or document is not found. |
| 405 | Method Not Allowed | The method is not supported for this URL. |
| 406 | Not Acceptable | The format requested is not acceptable. |
| Server Error | | |
| 500 | Internal Server Error | There is an error, such as a crash, at the server site. |
| 501 | Not Implemented | The server does not support the functionality needed to fulfill the request. |
| 503 | Service Unavailable | The service is temporarily unavailable but may be requested in the future. |
HTTP status codes are extensible, and HTTP applications are not required to understand the meaning of all registered status codes.
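A common way to cope with this extensibility is to treat an unrecognized code like the generic `x00` code of its class. The sketch below does this with Python's standard `http.HTTPStatus` enumeration; the function name `interpret_status` is an assumption, and the fallback only covers the five classes 1xx to 5xx.

```python
from http import HTTPStatus


def interpret_status(code):
    """Return a reason phrase, falling back to the class's x00 code."""
    try:
        return HTTPStatus(code).phrase
    except ValueError:
        # Unknown code: fall back to 100, 200, 300, 400, or 500.
        return HTTPStatus((code // 100) * 100).phrase


if __name__ == "__main__":
    for code in (200, 404, 299, 499):
        print(code, interpret_status(code))
```

Here 299 and 499 stand in for codes the application does not know; they are handled as a generic success and a generic client error respectively.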
HTTPS
HTTPS (Hypertext Transfer Protocol Secure) is an extension of HTTP designed for secure communication over computer networks. It enhances security by running HTTP over TLS (Transport Layer Security), the successor to the now-deprecated SSL (Secure Sockets Layer). The key feature of HTTPS is encrypting communication between users and web servers, so that intercepted messages cannot be read.
Here are some essential points about HTTPS:
1. Encryption Mechanism: HTTPS encrypts packet content using SSL or TLS, ensuring the confidentiality and integrity of the transmitted data.
2. Certificate Authority (CA): HTTPS works with a Certificate Authority, which validates the identity of the web server over the internet. This validation is crucial for establishing trust in the communication.
3. Digital Certificate: An HTTPS connection establishes an encrypted link using a digitally signed document known as a digital certificate. The certificate is issued and signed by a Certificate Authority, and the client checks that signature before trusting the server.
4. X.509 Standard: Digital certificates adhere to the X.509 standard, providing a structured format for establishing the validity of certificates.
5. Certificate Validation: After verification by a Registration Authority (RA), the CA provides the digital certificate to the web server, allowing users to view credentials, inspect the public key, and verify server identity.
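6. TLS Handshake: HTTPS communication begins with a TLS handshake between the client and server, described here as a 4-way handshake. During the handshake the two sides agree on a protocol version and cipher suite, the server presents its certificate, and both derive shared session keys. All HTTP data sent afterwards is encrypted.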
Figure: HTTPS 4-way handshake
In summary, HTTPS offers a secure layer over HTTP by implementing encryption, digital certificates, and a rigorous validation process through Certificate Authorities and Registration Authorities. This ensures the confidentiality and integrity of data transmitted between users and web servers.
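The certificate checks summarized above are what a TLS client library performs before any HTTP data is exchanged. The following sketch uses Python's standard `ssl` module; the function names `make_client_context` and `tls_summary` are invented for this example, and `tls_summary` needs network access to a real HTTPS server, so treat it as an outline rather than a tested recipe.

```python
import socket
import ssl


def make_client_context():
    """Create a TLS context that verifies certificates against trusted CAs."""
    # The default context requires a valid certificate chain and checks
    # that the certificate matches the host name being contacted.
    return ssl.create_default_context()


def tls_summary(host, port=443):
    """Connect, complete the TLS handshake, and report what was negotiated."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        # wrap_socket runs the handshake; it raises ssl.SSLError
        # (for example SSLCertVerificationError) if verification fails.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
            return {
                "protocol": tls.version(),
                "cipher": tls.cipher()[0],
                "subject": cert.get("subject"),
                "issuer": cert.get("issuer"),
            }


if __name__ == "__main__":
    context = make_client_context()
    print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

Calling `tls_summary` with the host name of any HTTPS site would show the negotiated protocol version and the Certificate Authority that issued the server's certificate.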
Key differences between HTTP & HTTPS
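| Feature | HTTP | HTTPS |
| --- | --- | --- |
| Data Transmission | Unencrypted or plain text | Encrypted or in cipher-text |
| Security | No encryption, less secure | Uses TLS (formerly SSL), more secure |
| Protocol Nature | Stateless | Stateless at the HTTP level; the underlying TLS session keeps its own state, and cookies work the same way over both |
| Default Port | 80 | 443 |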