S 4.359 Overview of the web server components
Initiation responsibility: Head of IT, IT Security Officer
Implementation responsibility: Administrator
In order to be able to provide a website, both hardware and software components are required. Depending on the functionality of the web application, different server types are required. The basic components include the web server and the web application server. Normally, the services for the web server and the web application server are installed on different IT systems. Database servers are mostly used for persistent data storage of the content. Additionally, directory servers are also frequently used for simple operations, with the clients only being able to gain read access to the data on those servers. Such directories are used for storing user data, for example.
Web server
A web server is a software component that can be used to provide websites using HTTP and HTTPS. It provides a framework whose functions may be used by the web application. Frequently, the hardware on which the web server software is installed is also referred to as a web server.
The web server is the core component of every website. It receives the user queries and, if possible, returns the corresponding response itself. This applies to static web applications, for example: queried content is returned immediately without the web server invoking any dynamic functions. With dynamic web applications, the web server usually forwards the query to the web application server. Dynamic functions are performed on this server, e.g. building a web page from database content, and the result is returned to the web server. In some web servers, the web application server for some programming languages is already integrated (e.g. the Apache web server supports the PHP script language (acronym for "PHP: Hypertext Preprocessor")). In this case, the application is executed locally on the web server and the query does not have to be forwarded to a web application server.
Since the web server is accessed directly by the users, it also constitutes the most exposed point of attack for an attacker. The web server must ensure that only legitimate queries are forwarded to background systems, that users are only granted access to the content they are authorised for, and that the web server cannot be compromised by exploiting software vulnerabilities.
A website provides the users with information and functions. The users use a browser to access the website. The following architectures are suitable for operating websites that provide dynamic functions:
- Implementation of the dynamic function using external programs invoked via interfaces. Examples include CGI (Common Gateway Interface) and SSI (Server Side Includes). When the website is used, the programs are called directly on the web server. The results of the program call are embedded in the response that is returned to the browser of the user.
- Implementation of the dynamic function in the form of functions or modules integrated into the web server. One example is PHP as a module in Apache. The essential difference from the first approach is that no external programs are called. The dynamic function is embedded directly into the web pages to be displayed, for example in the form of script code. Prior to delivering the web page to the browser, the script code is interpreted and the result is included in the server's response.
- Implementation of the dynamic function on a separate web application server such as JBoss, Weblogic, or WebSphere. When using this form of architecture, queries are initially received and processed by the web server. Those parts of the query requiring the use of dynamic functions are forwarded to a web application server. This server performs the required functions and the possible communication with background systems and returns the result to the web server. The web server then embeds the result of the web application server into the response sent to the browser of the client.
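The division of labour between the web server and a web application server described above can be pictured with a small sketch. It is only an illustration under stated assumptions: the paths, the `STATIC_FILES` table, and the functions `application_server` and `handle_request` are hypothetical names, and no real network communication takes place.

```python
import html

# Hypothetical static content that the web server can return by itself.
STATIC_FILES = {
    "/index.html": "<h1>Welcome</h1>",
    "/contact.html": "<p>Contact us</p>",
}


def application_server(path, params):
    """Stand-in for a separate web application server (assumption).

    It performs the dynamic function and returns the result to the web server.
    """
    name = params.get("name", "guest")
    # Escape user input before embedding it in the generated page.
    return f"<p>Hello, {html.escape(name)}</p>"


def handle_request(path, params=None):
    """Return (status, body) the way a web server front end might."""
    params = params or {}
    if path in STATIC_FILES:
        # Static content: answered directly, no dynamic function involved.
        return 200, STATIC_FILES[path]
    if path.startswith("/app/"):
        # Dynamic content: forwarded, result embedded in the response.
        return 200, application_server(path, params)
    return 404, "Not found"
```

For example, `handle_request("/index.html")` is answered from the static table, whereas `handle_request("/app/greet", {"name": "Ann"})` is passed on to the stand-in application server.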
Web application server
Numerous websites need further systems in addition to the web server. For example, separate web application servers are required if websites are provided with the help of Java or .NET. Programming languages such as PHP, ASP, or Perl work without an additional web application server for the most part, since the required functions are mostly integrated directly into the web server.
A web application server is used in order to be able to provide dynamic websites. Here, queries from the web server are forwarded to the web application server together with the corresponding parameters. This server then activates the scripts, methods, or functions required for processing the query. Depending on the type of activation, the application server also performs queries to background systems, for example to database or directory servers. The result of the web application server is then returned to the web server.
A web application server combines several important functions and frameworks frequently required for operating websites. By encapsulating and abstracting interfaces to other systems (e.g. background systems), a clean separation between application and data storage is possible. Frameworks that may be provided on web application servers comprise numerous functions for communicating with background systems. For example, abstract functions for reading and manipulating database content are offered which require little knowledge of the database actually used. Furthermore, various security functions have already been implemented in frameworks (e.g. prepared statements for SQL queries, which prevent the exploitation of SQL Injection vulnerabilities). Scaling is possible by adding further web application servers to a cluster, without having to modify the application for this.
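The effect of prepared (parameterised) statements mentioned above can be illustrated with Python's standard `sqlite3` module. This is a minimal sketch and not a specific framework's interface: the `users` table, its columns, and the function `find_user` are assumptions made for the example.

```python
import sqlite3


def find_user(conn, username):
    """Look up a user with a parameterised query.

    The "?" placeholder passes the value separately from the SQL text,
    so the input is treated as data and never as part of the statement.
    """
    cur = conn.execute(
        "SELECT id, username FROM users WHERE username = ?",
        (username,),
    )
    return cur.fetchall()


# Hypothetical in-memory database used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany(
    "INSERT INTO users (username) VALUES (?)",
    [("alice",), ("bob",)],
)
```

A classic injection string such as `' OR '1'='1` is then simply compared with the stored user names and matches nothing, instead of altering the logic of the query.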
Database server
For permanent data storage, databases are usually used in the context of websites. These databases are usually operated in combination with the related database management systems (DBMSs) on dedicated database servers. In principle, the following types of databases are distinguished:
- Hierarchical databases represent data objects in a parent-child relation.
- Network databases link data objects to one another in a network structure, so that a data object may have more than one parent.
- Relational databases represent data objects in tables. These tables may be related to each other.
- Object-oriented databases store data objects as objects within the meaning of object-oriented programming. This means that objects in an object-oriented database have the same characteristics as objects in programming.
Database servers require special protection (see also S 5.7 Databases). For example, the DBMSs contained on them must only be accessed by authorised resources. Furthermore, access must be clearly defined and maintained at data object level according to the principle of least privilege. Just like any other software, DBMSs may also have vulnerabilities that attackers can use to access confidential data. In addition to restricting access, safeguards intended to eliminate known vulnerabilities must be taken. Specifically for database systems, safeguards protecting against SQL Injection must be implemented in the website, since attackers may otherwise be able to read or change database content without authorisation.
Particularly confidential content of a database (e.g. passwords) should be encrypted in order to protect it against unauthorised access. This requires appropriate key management and prevents the data from being read out in clear text.
Directory service
With the help of a directory service, user data and rights can be administered centrally. This data is usually stored in a hierarchical database. Normally, LDAP (Lightweight Directory Access Protocol) is used to access a directory service. This protocol builds upon TCP/IP and allows information on the directory service server to be queried and modified.
Since directory services often contain sensitive data, security-relevant factors must be taken into consideration as well (see S 5.15 General directory service). On the one hand, sensitive data (e.g. passwords) should be protected against unauthorised access by using appropriate cryptographic procedures. On the other hand, LDAP queries can be manipulated by attackers in the same way as SQL queries. Therefore, suitable safeguards must be taken in order to prevent so-called LDAP Injections.
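One common safeguard against LDAP Injection is to escape the special characters of a search filter before user input is inserted into it. The following sketch applies the escaping defined for LDAP search filter strings (RFC 4515). The function names and the `uid` attribute are assumptions chosen for illustration; where a directory library offers its own escaping routine, that routine is usually preferable.

```python
# Characters with a special meaning in LDAP search filters and their
# escaped form (backslash followed by the two-digit hexadecimal code).
_LDAP_FILTER_ESCAPES = {
    "\\": r"\5c",
    "*": r"\2a",
    "(": r"\28",
    ")": r"\29",
    "\x00": r"\00",
}


def escape_ldap_filter_value(value):
    """Escape a user-supplied value for use inside an LDAP search filter."""
    return "".join(_LDAP_FILTER_ESCAPES.get(ch, ch) for ch in value)


def build_user_filter(username):
    """Build a filter for a hypothetical 'uid' attribute (assumption)."""
    return f"(uid={escape_ldap_filter_value(username)})"
```

With this escaping, an input such as `*)(uid=*` can no longer close the filter expression early or add a wildcard; it is searched for as a literal string.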
Reverse proxy
Proxies are generally used on the client side in order to access websites. However, they can also be used on the server side in order to optimise access (caching) and/or to perform filtering. If the proxy is located on the server side, it is referred to as a "reverse proxy" (see also S 4.223 Integration of proxy servers into the security gateway). All queries directed to the related web server are initially received by the proxy. The proxy uses a configurable set of rules in order to decide whether it is able to respond to the query itself (caching), whether it forwards the query to the web server or to one of the web servers in the cluster, or whether it rejects the query for security-related reasons. The most important functions of a reverse proxy are explained briefly below:
- Caching
Static content such as images or static HTML text can be buffered in a reverse proxy. If this content is queried, the corresponding queries can be answered directly by the reverse proxy. This caching function can shorten the response times and reduce the utilisation of the web servers.
However, caching may also cause security-critical problems. For example, if content for which authorisation is normally required is buffered on the reverse proxy, it must be ensured that this content is only delivered to authorised users.
- Load balancing
If a reverse proxy is able to respond to the query itself, because it has the required data available in the cache, it is not necessary to forward the query to the downstream web server. This helps to reduce the load on the web servers. Since all queries are initially directed to the reverse proxy, the proxy is also able to distribute queries to several servers. This way, a load balancer function can be provided.
- Authentication
A reverse proxy allows authentication to be moved away from the web server. This allows for a single sign-on if the reverse proxy is responsible for authenticating users for several web servers. Once the user has logged in to the reverse proxy, he/she will be able to use several servers without having to log in again. Additionally, the forwarding of queries may be made conditional on an authentication at the reverse proxy.
- Encryption
An encrypted connection (e.g. when using HTTPS via TLS or SSL) can be terminated at the reverse proxy. The proxy can only be used for filtering queries if the decryption is already performed on the reverse proxy. Furthermore, terminating the encryption at the reverse proxy relieves the downstream web server, since the web server does not have to use additional resources for decryption. Another advantage of this variant is that the encrypted channel is independent of the web server actually used. This way, consecutive queries may also be processed by different servers without having to change the encrypted channel. However, if encryption is terminated and data is filtered on a proxy, the corresponding data protection aspects must be taken into consideration. For example, the IP address of each client, login times, and the websites opened could be logged.
- Restriction of communication connections
All queries coming from the untrustworthy network can be routed through the reverse proxy. Undesired connection requests can be rejected here, the administration of the security gateway is facilitated, and the likelihood of misconfigurations is reduced. The IP stack of the web server is separated from the untrustworthy network.
- Obfuscation
It is not necessary to publish the IP addresses of the actual web servers, since these only communicate with the reverse proxy and not with the clients. Establishing direct, and often undesirable, connections to the web server is made more difficult, since the information necessary for this must first be determined. Reverse proxies ensure that information about the internal network structure is not transmitted to the clients. Error messages that reveal the web server application used and provide information useful for compromising it can also be intercepted centrally. It is recommended that the actual web servers do not disclose this information in the first place, but reverse proxies can be used as a "second line of defence".
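The decision a reverse proxy takes for each query (reject, answer from the cache, or forward to one of several web servers) can be summarised in a short sketch. It is a simplified model under stated assumptions, not the behaviour of a particular product: the class `ReverseProxySketch`, the prefix-based rules, and the round-robin distribution are all chosen for illustration only.

```python
from itertools import cycle


class ReverseProxySketch:
    """Simplified model of the rule-based decision of a reverse proxy."""

    def __init__(self, backends, blocked_prefixes, cacheable_prefixes):
        # Round-robin over the downstream web servers (load balancing).
        # Assumes at least one backend is configured.
        self._backends = cycle(backends)
        self.blocked_prefixes = tuple(blocked_prefixes)
        self.cacheable_prefixes = tuple(cacheable_prefixes)
        self.cache = {}

    def decide(self, path):
        """Return ("reject", None), ("cache", body) or ("forward", backend)."""
        if path.startswith(self.blocked_prefixes):
            return ("reject", None)           # filtered for security reasons
        if path in self.cache:
            return ("cache", self.cache[path])  # answered by the proxy itself
        return ("forward", next(self._backends))

    def store(self, path, body):
        """Buffer a response, but only for content marked as cacheable.

        Content that requires authorisation is deliberately not buffered.
        """
        if path.startswith(self.cacheable_prefixes):
            self.cache[path] = body
```

A possible use: `proxy = ReverseProxySketch(["web1", "web2"], ["/admin"], ["/static/"])`. A query for `/admin/login` is then rejected, the first query for `/static/logo.png` is forwarded, and after `proxy.store("/static/logo.png", b"PNG")` the same query is answered from the cache. Restricting the cache to explicitly listed prefixes is one simple way to keep protected content out of it.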