What happens when you type https://www.holbertonschool.com in your browser and press Enter

Rodrigo Sierra Vargas
12 min read · Aug 26, 2019
The flow of the request created when you type https://www.holbertonschool.com in your browser and press Enter

DNS PROCESS:

First, right after you type the domain name into a web browser, your computer starts resolving the hostname, www.holbertonschool.com. It looks for the IP address associated with the domain name in its local DNS cache, which stores recently retrieved DNS information. If your computer does not have the data cached, it performs a DNS query to retrieve the information.

Then the computer queries the recursive DNS servers of your Internet Service Provider (ISP). Recursive DNS servers have their own local DNS cache. Given that many of the ISP’s customers use the same recursive DNS servers, there is a chance that common domain names already exist in that cache. If the domain is cached, the DNS query ends here.

If a recursive DNS server does not have the information stored in its cache, the DNS query continues: the recursive server asks a root name server, then the name server for the top-level domain (here, .com), and is finally referred to the authoritative DNS server that has the data for the specific domain. These authoritative name servers are responsible for storing DNS records for their respective domain names.

To find out the IP address for www.holbertonschool.com, the recursive DNS server queries the authoritative name server for the address record (A record). It receives the A record and stores it in its local DNS cache. If other DNS queries request the A record for www.holbertonschool.com, the recursive server will have the answer and will not have to repeat the DNS lookup process. All DNS records have a time-to-live (TTL) value, which indicates how long a record may be cached before it expires. Once the TTL has passed, the recursive DNS server will ask for an updated copy of the DNS record.

The recursive DNS server now has the information and returns the A record to your computer. Your computer stores the DNS record in its local DNS cache, reads the IP address from the record, and passes it to your browser. The web browser then connects to the web server at the IP address given in the A record.
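The cache-then-query behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `resolve` function, the `Cache` type, and the `upstream` callback (standing in for "the next DNS server to ask") are hypothetical names, not part of any real resolver.

```python
import time
from typing import Callable, Dict, Optional, Tuple

# Hypothetical cache layout for this sketch: hostname -> (ip, expiry time).
Cache = Dict[str, Tuple[str, float]]


def resolve(hostname: str,
            cache: Cache,
            upstream: Callable[[str], Tuple[str, int]],
            now: Optional[float] = None) -> str:
    """Return the IP for hostname, using the cache while the record is fresh."""
    now = time.time() if now is None else now
    entry = cache.get(hostname)
    if entry is not None and entry[1] > now:
        return entry[0]                 # cache hit: the lookup ends here
    ip, ttl = upstream(hostname)        # miss or expired: ask the next server
    cache[hostname] = (ip, now + ttl)   # keep the answer until the TTL runs out
    return ip
```

The same function models both your computer (whose upstream is the recursive server) and the recursive server (whose upstream is the authoritative server); only the `upstream` callback changes.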

Taken from: https://www.liquidweb.com/kb/understanding-the-dns-process/

DNS can be used to:

Convert hostnames to IP addresses
Convert IP addresses to hostnames (this is called a reverse lookup, or pointer query)
Transfer information between DNS servers
Look up other record types, such as MX (mail exchange) records

There are lots of types of DNS queries. Here are the main ones:
Type “A” for IPv4 addresses
Type “AAAA” for IPv6 addresses
Type “CNAME” (Canonical Names) — specifies a domain name that has to be queried in order to resolve the original DNS query
Type “PTR” (Pointer), which specifies a reverse query: it requests the Fully Qualified Domain Name (FQDN) corresponding to the IP address you provided
Type “NS” (Name Server) to get information about the authoritative name server
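To make the A, AAAA, and PTR types a little more concrete, here is a small sketch using Python's standard `ipaddress` module. The helper names `address_record_type` and `ptr_query_name` are made up for this example, and the addresses come from ranges reserved for documentation.

```python
import ipaddress


def address_record_type(ip: str) -> str:
    """Return the record type that would hold this address."""
    return "A" if ipaddress.ip_address(ip).version == 4 else "AAAA"


def ptr_query_name(ip: str) -> str:
    """Build the special domain name that a PTR (reverse) query asks about."""
    return ipaddress.ip_address(ip).reverse_pointer


print(address_record_type("192.0.2.10"))   # A
print(address_record_type("2001:db8::1"))  # AAAA
print(ptr_query_name("192.0.2.10"))        # 10.2.0.192.in-addr.arpa
```

A reverse query is an ordinary DNS query for a PTR record; the IP address is simply written backwards under a reserved domain.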

Interestingly, not all DNS records are public. Today, in addition to allowing employees to use DNS to find things on the internet, organizations use DNS so their employees can find private, internal servers. When an organization wants to keep server names and IP addresses private, or not directly reachable from the internet, they don’t list them in public DNS servers. Instead, organizations list them in private, or internal DNS servers — internal DNS servers store names and IP addresses for internal file servers, mail servers, domain controllers, database servers, application servers, etc. — all the important stuff.

TCP/IP protocols:

Now that the computer has the IP address of the web server, it needs a way to communicate with it, and that communication follows a set of rules called protocols.

Three of the most common TCP/IP protocols
HTTP — Used between a web client and a web server, for non-secure data transmissions. A web client (i.e., the Internet browser on a computer) sends a request to a webserver to view a web page. The web server receives that request and sends the web page information back to the web client.
HTTPS — Used between a web client and a web server, for secure data transmissions (our case https://www.holbertonschool.com). Often used for sending credit card transaction data or private data from a web client to a web server.
FTP — Used between two or more computers. One computer sends data to or receives data from another computer directly.
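As a rough illustration of what the browser works out from the address you typed, the sketch below splits a URL into protocol, host, and port, and builds the text of a basic HTTP request. The `DEFAULT_PORTS` table and both function names are assumptions made for this example; a real browser sends many more headers.

```python
from typing import Tuple
from urllib.parse import urlsplit

# Well-known default ports for the three protocols listed above.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def describe_url(url: str) -> Tuple[str, str, int, str]:
    """Return (protocol, host, port, path) for a URL."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return parts.scheme, parts.hostname, port, parts.path or "/"


def build_get_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request as it would appear on the wire."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}", "Connection: close", "", ""]
    return "\r\n".join(lines).encode("ascii")


print(describe_url("https://www.holbertonschool.com"))
# ('https', 'www.holbertonschool.com', 443, '/')
```

With HTTPS, the same request text is sent, but only after the connection has been encrypted.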

What are the different layers of TCP/IP?

Layer 1: Host-to-network Layer
The lowest layer is used to connect to the host so that the packets can be sent over it. Varies from host to host and network to network.

Layer 2: Internet layer
The internet layer provides a connectionless service on top of a packet-switching network.
It is the layer that holds the whole architecture together.
It lets each packet travel independently to the destination.
The order in which packets are received may differ from the order in which they were sent.
IP (Internet Protocol) is used in this layer.
The various functions performed by the internet layer are:
Delivering IP packets, performing routing, and avoiding congestion

Layer 3: Transport Layer
It decides if data transmission should be on a parallel path or single path.
Functions such as multiplexing and segmenting (splitting) the data are performed by the transport layer.
The applications can read and write to the transport layer.
Transport layer adds header information to the data.
Transport layer breaks the message (data) into small units so that they are handled more efficiently by the internet layer.
The transport layer also arranges the packets to be sent, in sequence.

Layer 4: Application Layer
The TCP/IP specifications described a lot of applications that were at the top of the protocol stack.

Taken from: https://www.stemjar.com/osi-vs-tcp-ip-model/
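The transport layer's job of splitting a message into small units and putting them back in sequence can be sketched as follows. This is a simplified model, not real TCP: the `Segment` format (byte offset plus payload) and the function names are assumptions for illustration.

```python
from typing import List, Tuple

# Hypothetical segment format: (sequence number, payload).
Segment = Tuple[int, bytes]


def segment(message: bytes, size: int) -> List[Segment]:
    """Split a message into numbered units of at most `size` bytes."""
    return [(offset, message[offset:offset + size])
            for offset in range(0, len(message), size)]


def reassemble(segments: List[Segment]) -> bytes:
    """Put segments back in order by sequence number, whatever order they arrived in."""
    ordered = sorted(segments, key=lambda seg: seg[0])
    return b"".join(payload for _, payload in ordered)


parts = segment(b"GET / HTTP/1.1", 4)
arrived = list(reversed(parts))          # pretend the network reordered them
print(reassemble(arrived))               # b'GET / HTTP/1.1'
```

Because each unit carries its own sequence number, the receiver does not depend on the internet layer delivering packets in order.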

ABOUT FIREWALLS:

A firewall is a system designed to prevent unauthorized access to or from a private network. You can implement a firewall in either hardware or software form, or a combination of both. Firewalls prevent unauthorized Internet users from accessing private networks connected to the internet, especially intranets. All messages entering or leaving the intranet (the local network to which you are connected) must pass through the firewall, which examines each message and blocks those that do not meet the specified security criteria.

Several types of firewalls exist:

Packet filtering: The system examines each packet entering or leaving the network and accepts or rejects it based on user-defined rules. Packet filtering is fairly effective and transparent to users, but it is difficult to configure. In addition, it is susceptible to IP spoofing.
Circuit-level gateway implementation: This process applies security mechanisms when a TCP or UDP connection is established. Once the connection has been made, packets can flow between the hosts without further checking.
Acting as a proxy server: A proxy server is a type of gateway that hides the true network address of the computer(s) connecting through it. A proxy server connects to the internet, makes the requests for pages, connections to servers, etc., and receives the data on behalf of the computer(s) behind it. The firewall capabilities lie in the fact that a proxy can be configured to allow only certain types of traffic to pass (for example, HTTP files, or web pages). A proxy server has the potential drawback of slowing network performance since it has to actively analyze and manipulate traffic passing through it.
Web application firewall: A web application firewall is a hardware appliance, server plug-in, or some other software filter that applies a set of rules to an HTTP conversation. Such rules are generally customized to the application so that many attacks can be identified and blocked.
In practice, many firewalls use two or more of these techniques in concert.
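Packet filtering, the first technique in the list, boils down to checking each packet against an ordered list of user-defined rules. The sketch below assumes a very small rule format (`Rule`, with a source network and an optional destination port) and a "first match wins, otherwise deny" policy; real firewalls support many more fields.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    action: str                  # "allow" or "deny"
    source: str                  # CIDR block, e.g. "203.0.113.0/24"
    port: Optional[int] = None   # None matches any destination port


def decide(rules: List[Rule], src_ip: str, dst_port: int, default: str = "deny") -> str:
    """Return the action of the first rule that matches the packet."""
    addr = ipaddress.ip_address(src_ip)
    for rule in rules:
        in_network = addr in ipaddress.ip_network(rule.source)
        port_matches = rule.port is None or rule.port == dst_port
        if in_network and port_matches:
            return rule.action
    return default               # nothing matched: fall back to the default policy


rules = [
    Rule("deny", "203.0.113.0/24"),     # block one network entirely
    Rule("allow", "0.0.0.0/0", 443),    # let everyone else reach HTTPS
]
print(decide(rules, "198.51.100.5", 443))  # allow
print(decide(rules, "203.0.113.7", 443))   # deny
print(decide(rules, "198.51.100.5", 22))   # deny
```

Note that the filter trusts the source address written in the packet, which is why this technique is susceptible to IP spoofing.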

What is an SSL Certificate?

SSL stands for Secure Sockets Layer and, in short, it’s the standard technology for keeping an internet connection secure and safeguarding any sensitive data that is being sent between two systems, preventing criminals from reading and modifying any information transferred, including potential personal details. The two systems can be a server and a client (for example, a shopping website and browser) or server to server (for example, an application with personally identifiable information or with payroll information).

It does this by making sure that any data transferred between users and sites, or between two systems, remains impossible to read. It uses encryption algorithms to scramble data in transit, preventing hackers from reading it as it is sent over the connection. This information could be anything sensitive or personal, which can include credit card numbers and other financial information, names and addresses.

TLS (Transport Layer Security) is just an updated, more secure, version of SSL. We still refer to our security certificates as SSL because it is a more commonly used term, but when you are buying SSL from Symantec you are actually buying the most up to date TLS certificates with the option of ECC, RSA or DSA encryption.

When HTTPS is used, which element of the communication is encrypted?
Once the HTTPS handshake is complete all communications between the client and the server are encrypted. This includes the full URL, data (plain text or binary), cookies and other headers.

The only part of the communication that is not encrypted is the domain or host the client asked to connect to. This is because, when the connection is initiated, the client names the target host in the TLS handshake (the Server Name Indication field) before encryption is in place; the server’s IP address and port are also visible on the network. Once HTTPS is established, the full URL is sent inside the encrypted connection.

This handshake only needs to occur once for each connection. This is why HTTP/2 has a distinct advantage over HTTP/1.1: it multiplexes many requests over a single connection instead of opening multiple connections.
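For readers who want to see the client side of this, here is a minimal sketch using Python's standard `ssl` module. The function names are hypothetical; `negotiated_version` needs network access and is shown only to illustrate where the hostname is handed to the handshake.

```python
import socket
import ssl
from typing import Optional


def make_client_context() -> ssl.SSLContext:
    """Create a client context that verifies certificates and hostnames."""
    ctx = ssl.create_default_context()             # secure defaults from the standard library
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2   # refuse old SSL/TLS versions
    return ctx


def negotiated_version(host: str, port: int = 443) -> Optional[str]:
    """Connect to host and return the TLS version agreed in the handshake."""
    ctx = make_client_context()
    with socket.create_connection((host, port), timeout=5) as raw:
        # server_hostname is sent during the handshake and checked against the certificate.
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version()
```

Calling `negotiated_version("www.holbertonschool.com")` on a machine with internet access would perform the handshake once; everything sent on the wrapped socket afterwards is encrypted.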

How do load balancers work?

When one application server becomes unavailable, the load balancer directs all new application requests to other available servers in the pool.

To handle more advanced application delivery requirements, an application delivery controller (ADC) is used to improve the performance, security, and resiliency of applications delivered to the web. Modern ADCs are not only load balancers, but software-centric application delivery solutions designed to provide a high-quality user experience for web, traditional, and cloud-native applications regardless of where they are hosted.

Application Delivery Controller (ADC) architecture diagram

Load balancing algorithms and methods
Load balancing uses various algorithms, called load balancing methods, to define the criteria that the ADC appliance uses to select the service to which to redirect each client request. Different load balancing algorithms use different criteria.

The Least Connection Method
This is the default method. When a virtual server is configured to use the least connection method, it selects the service with the fewest active connections.

The Round Robin Method
This method continuously rotates a list of services that are attached to it. When the virtual server receives a request, it assigns the connection to the first service in the list and then moves that service to the bottom of the list.

The Least Response Time Method
This method selects the service with the fewest active connections and the lowest average response time.

The Least Bandwidth Method
This method selects the service that is currently serving the least amount of traffic, measured in megabits per second (Mbps).

The Least Packets Method
This method selects the service that has received the fewest packets over a specified period of time.

The Custom Load Method
When using this method, the load balancing appliance chooses a service that is not handling any active transactions. If all of the services in the load balancing setup are handling active transactions, the appliance selects the service with the smallest load.

Taken from: https://docs.microsoft.com/en-us/azure/load-balancer/load-balancer-distribution-mode
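Two of the methods above, round robin and least connection, are simple enough to sketch in Python. The class and function names here are invented for illustration and do not correspond to any particular load balancer's API.

```python
from collections import deque
from typing import Dict, Iterable


class RoundRobin:
    """Rotate through a list of services, one request at a time."""

    def __init__(self, services: Iterable[str]) -> None:
        self._services = deque(services)

    def pick(self) -> str:
        service = self._services[0]
        self._services.rotate(-1)   # move the chosen service to the bottom of the list
        return service


def least_connection(active: Dict[str, int]) -> str:
    """Pick the service with the fewest active connections."""
    return min(active, key=lambda name: active[name])


balancer = RoundRobin(["web-1", "web-2", "web-3"])
print([balancer.pick() for _ in range(4)])   # ['web-1', 'web-2', 'web-3', 'web-1']
print(least_connection({"web-1": 5, "web-2": 2, "web-3": 7}))   # web-2
```

Round robin ignores how busy each service is, while least connection needs the balancer to track live connection counts; that trade-off is why several methods exist.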

What does Web Server mean?

“Web server” can refer to hardware or software, or both of them working together. It is a system that delivers content or services to end-users over the internet.

On the hardware side, a web server is a computer that stores web server software and a website’s component files (e.g. HTML documents, images, CSS stylesheets, and JavaScript files). It is connected to the Internet and supports physical data interchange with other devices connected to the web.
On the software side, a web server includes several parts that control how web users access hosted files, at the minimum an HTTP server.
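The "at the minimum an HTTP server" part can be illustrated with Python's built-in `http.server` module. This is a toy sketch, not how production servers such as Nginx or Apache are built: the in-memory `FILES` table, the `lookup` helper, and the `serve` function are assumptions made for the example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Dict, Tuple

# Hypothetical "hosted files", kept in memory instead of on disk.
FILES: Dict[str, bytes] = {
    "/": b"<h1>Hello</h1>",
    "/style.css": b"h1 { color: navy; }",
}


def lookup(path: str) -> Tuple[int, bytes]:
    """Map a requested path to an HTTP status code and a response body."""
    if path in FILES:
        return 200, FILES[path]
    return 404, b"Not Found"


class Handler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        status, body = lookup(self.path)
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Answer requests on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Calling `serve()` and opening http://127.0.0.1:8000/ in a browser would return the small HTML page stored under "/".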

What is an application server?

A server is a program or computer which connects applications across different devices and distributes hardware resources to achieve maximum efficiency and convenience. Furthermore, an application server acts as the middleman in inter-computer communication. Alternatively said:

An application server is a server program in a computer in a distributed network that provides the business logic for an application program. An application server is a server which provides software applications with services such as security, data services, transaction support, load balancing, and management of large distributed systems.

Many web applications are built on a three-tier (three-level) structure, the tiers being 1) the user interface or client, 2) the business logic application server, and 3) the database and transaction server. The interface, most commonly a web browser or some other application for viewing web pages, translates computer outputs into human-readable content and user inputs into computer commands. These commands are then processed and transmitted by the application server, which provides the necessary structure and security to ensure the information reaches its intended recipient, and applies any data and logical transformations the application needs to make decisions. The third-tier server, in turn, performs data operations based on input from the application server, stores data, and sends the results back out through the application server, completing the loop. Different approaches to forwarding requests and web pages include the Common Gateway Interface (CGI), Active Server Pages (ASP) by Microsoft, and JavaServer Pages (JSP).

Why should companies use application servers?

Data and Update Sharing: Most corporations frequently distribute certain files, like the coming-year budget, and ideas in need of feedback, and update the company applications. In order to ensure everybody has the correct version, application servers are a one-stop place where all such changes can be made.
Easy Layout Change: When renovations have to be made, or the hardware improved, it is much easier to update and configure only a single device.
Improved Security: It is much easier to protect a centralized server than to secure every end-point (terminal), because users pay less attention to security than IT specialists do and their hardware has less “firepower”.
Performance: Corporate environments frequently use extremely demanding applications. Through an application server, which has superior computing capabilities, employees can perform their tasks much more quickly than on a normal PC terminal.
Cost Savings: All these improvements in the digital infrastructure can lead to significant savings. By distributing the software from one place, companies spend less on licenses and on replacing underperforming parts, losses from hacker attacks are reduced, and the corporation gains a competitive advantage through improved efficiency.

Taken from: https://www.javatpoint.com/server-web-vs-application

Database

Alternatively referred to as a databank or a datastore, and sometimes abbreviated as a DB, a database is a large quantity of indexed digital information. It can be searched, referenced, compared, changed or otherwise manipulated with optimal speed and minimal processing expense.

A database is built and maintained by using a database programming language. The most common database language is SQL, but there are multiple “flavors” of SQL, depending on the type of database being used. Each flavor of SQL has its own syntax differences and is designed to be used with a specific type of database. For example, an Oracle database uses PL/SQL and Oracle SQL (Oracle’s version of SQL). A Microsoft SQL Server database uses T-SQL (Transact-SQL).

Database components

Schema: A database contains one or more schemas, each of which is basically a collection of one or more tables of data.
Table: Each table contains one or more columns, which are similar to columns in a spreadsheet. A table can have just a couple of columns or a hundred or more, depending on the type of data being stored in the table.
Column: Each column contains one type of data or values, such as dates, numeric or integer values, or alphanumeric values (often stored as varchar).
Row: Data in a table is listed in rows, which are like rows of data in a spreadsheet. Often there are hundreds or thousands of rows of data in a table.
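The table, column, and row vocabulary can be tried out with SQLite, which ships with Python. SQLite is used here only because it needs no setup; it is another SQL flavor, not one of the databases named above, and the `students` table and its contents are invented for the example.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# A table with three columns, each holding one type of data.
conn.execute(
    "CREATE TABLE students (id INTEGER PRIMARY KEY, name VARCHAR(50), enrolled DATE)"
)

# Each inserted tuple becomes one row.
conn.executemany(
    "INSERT INTO students (name, enrolled) VALUES (?, ?)",
    [("Ada", "2019-01-14"), ("Linus", "2019-06-03")],
)

rows = conn.execute("SELECT id, name FROM students ORDER BY id").fetchall()
columns = [info[1] for info in conn.execute("PRAGMA table_info(students)")]

print(columns)  # ['id', 'name', 'enrolled']
print(rows)     # [(1, 'Ada'), (2, 'Linus')]
```

The `PRAGMA table_info` statement is specific to SQLite; other flavors expose column information in their own ways.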
