Web and Application Server Failures

an article added by: Ben Smeider at 11272007



The bugs that can strike a database can also affect a web server. Many web servers are part of client/server applications that query back-end database servers to service client requests, so anything that affects the database server will have an adverse effect on the web server as well. However, there are many other places within the web server environment where bugs can crop up, including the Common Gateway Interface (CGI), Perl, Java, JavaScript, or Active Server Page (ASP) code that manages the web page. If some set of circumstances causes a CGI program to get stuck in a loop, the web page it manages will never display, most likely causing the user to try another site.

New technology turns up all over the world of web servers. Much of this technology has been tested in a relatively short time by lots of users, and so it is often quite reliable. However, the term web server really refers to a collection of applications: the httpd or web server that handles requests for items on a web page (hits) and returns the HTML or image files; the CGI scripts that get executed to generate web pages or take action on forms or other postings from web clients; and whatever back-end database or file servers are used to manage the content and state information on the web site. A failure in any of these components appears to be a web site failure.

Other systems sit around the web server as well. In front of it, load-balancing hardware and software may be used to distribute requests over multiple, identical web servers. On the client side, a proxy cache server sits between a client and server and caches commonly accessed pages so that requests don’t have to go outside of the client’s network. These systems can fail too, making it appear that the web server has gone away.

As with database servers, disks, filesystems, and logs can fill; memory or CPU can be exhausted; and other system resources can run out, causing hangs, crashes, unresponsiveness, and other nasty behavior.
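One way to keep a CGI program that is stuck in a loop from leaving the user with a page that never displays is to run it under a time limit. The following is a minimal sketch in Python, not a description of how any particular web server does it. The function name run_with_deadline and the choice of status codes (504 for a timeout, 500 for a failed program) are assumptions made for illustration.

```python
import subprocess
import sys


def run_with_deadline(argv, timeout_seconds):
    """Run a page-generating program and return (status, body).

    Hypothetical helper: if the program does not finish within
    timeout_seconds, subprocess.run kills it and we return an error
    page instead of waiting forever.
    """
    try:
        completed = subprocess.run(
            argv,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # The child was stuck (or just too slow); it has been killed.
        return 504, "The page took too long to generate. Please try again."
    if completed.returncode != 0:
        # The child exited on its own but reported a failure.
        return 500, "The page could not be generated."
    return 200, completed.stdout


if __name__ == "__main__":
    # A stand-in for a CGI program caught in an endless loop.
    looping = [sys.executable, "-c", "while True: pass"]
    print(run_with_deadline(looping, 0.5))

    # A stand-in for a healthy CGI program.
    healthy = [sys.executable, "-c", "print('<p>hello</p>')"]
    print(run_with_deadline(healthy, 5))
```

The user still gets an error in the stuck case, but gets it quickly, and the looping process no longer holds CPU on the server.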

How do your CGI programs react if they cannot write to a required file? Or if they don’t get a necessary response from some upstream application? Make sure they continue to operate rather than hanging or crashing.

Usually a web server is a front end for an application server, that nexus of business logic, data manipulation routines, and interfaces to existing back-end systems that does the “heavy lifting” of a web site. In the Java world, application servers are the frameworks that run the J2EE environment. The overall reliability of the application server is a function of all of the components that are layered on top of it. If you’re using J2EE, for example, but call a non-Java native method written in C or C++, that code poses a risk to the application server. Step on memory with some poorly written native code and it’s possible to corrupt the memory footprint of the application server.

Since web and application servers represent the top of the software stack, they’re affected by the reliability and correct operation of all of the network pieces underneath them. Peter Deutsch, networking, security, and risk evaluation expert, points out that you can’t control the security, reliability, or latency of a network. Furthermore, there’s rarely a single point of control or single point of accountability for a network of systems.
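The questions above about a CGI program that cannot write to a required file, or that gets no response from an upstream application, can be made concrete with a small sketch. The Python below is an illustration under assumptions: record_hit, handle_request, and the fallback message are hypothetical names and text, and the upstream call is passed in as a plain function so that the example stays self-contained.

```python
def record_hit(log_path, line, failures):
    """Append a line to a log file; return True on success.

    A failed write (full disk, missing directory, bad permissions)
    is noted in the failures list instead of aborting the request.
    """
    try:
        with open(log_path, "a", encoding="utf-8") as log_file:
            log_file.write(line + "\n")
        return True
    except OSError as exc:
        failures.append(f"log write failed: {exc}")
        return False


def handle_request(log_path, fetch_upstream, failures):
    """Serve a page even if logging or the upstream call fails.

    fetch_upstream is any callable that returns the page body, or
    raises TimeoutError or ConnectionError when the upstream
    application does not answer.
    """
    record_hit(log_path, "hit", failures)
    try:
        body = fetch_upstream()
    except (TimeoutError, ConnectionError) as exc:
        failures.append(f"upstream failed: {exc}")
        # Degrade: return a reduced page rather than no page at all.
        body = "Live data is temporarily unavailable."
    return 200, body
```

Whether a reduced page is acceptable depends on the application. For some requests, such as a payment, failing loudly is the safer choice; the point is that the behavior is decided in advance rather than discovered during an outage.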

Denial-of-Service Attacks

Not all problems involve the failure of a component. Making a resource unavailable to requestors is just as harmful to system availability as having that resource crash or fail. For example, a web server that is being bombarded by an exceptionally high request rate, absorbing all of the network connection capacity of the servers, appears to have failed to any web users attempting to access it. It hasn’t failed in the downtime sense, but it has been made unavailable through a denial-of-service (DoS) attack. DoS approaches have been well known in security circles for several years, and in February 2000, the first wide-scale distributed DoS (DDoS) attack, known as trinoo or trin00, affected several popular web sites. DoS attacks consume all available capacity of a given resource, ranging from network bandwidth and network connections to web server request handling and database requests.

The following are some characteristic DoS attacks:

Network bandwidth flooding. A torrent of packets, valid or garbage, consumes all of your incoming network bandwidth. This type of attack typically causes routers or switches to begin dropping legitimate traffic.

Network connection flooding. An attacker sends a series of ill-formed connection requests, filling up the incoming connection queue on web servers, database servers, or other back-end machines. The half-completed connection requests remain enqueued until they time out, and of course more requests arrive during that interval, keeping the server’s network stack overloaded.

CGI or web interface abuse. A script or utility accessed by your web server is called repeatedly, or called with arguments that cause it to consume CPU, memory, or network resources. In many cases, these attacks may be signs of a security intrusion, with an attacker looking for a mail relay or other unsecured entry point. The SQL Slammer attack in early 2003 fit this model loosely, with incoming requests attempting to access back-end database servers through well-known interfaces on public-facing web sites.

DoS attacks are likely to become more common. Vendors frequently offer patches or workarounds for the more common DoS approaches; for example, one way to combat the network connection flooding attack is to put in-progress connections in a separate queue and then prune the badly behaved connections out of the queue at shorter intervals, freeing up connection slots for real work.
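The separate-queue idea described above can be sketched as a small simulation. This is an illustration of the pruning logic only, written in Python; real implementations live in the operating system’s network stack and differ in detail. The class name HandshakeQueue, its parameters, and the use of caller-supplied timestamps are all assumptions. It also assumes that connection identifiers are unique and that time never runs backward.

```python
from collections import OrderedDict


class HandshakeQueue:
    """Holds in-progress connections apart from established ones."""

    def __init__(self, capacity, max_age):
        self.capacity = capacity  # slots for half-completed connections
        self.max_age = max_age    # seconds a handshake may stay pending
        self._pending = OrderedDict()  # conn_id -> arrival time, oldest first
        self.established = []

    def prune(self, now):
        """Drop handshakes pending for max_age or longer; return the count."""
        dropped = 0
        while self._pending:
            conn_id, arrived = next(iter(self._pending.items()))
            if now - arrived < self.max_age:
                break  # everything after this entry is newer
            del self._pending[conn_id]
            dropped += 1
        return dropped

    def open(self, conn_id, now):
        """Register a new handshake; return False if no slot is free."""
        self.prune(now)
        if len(self._pending) >= self.capacity:
            return False
        self._pending[conn_id] = now
        return True

    def complete(self, conn_id):
        """Move a finished handshake to the established list."""
        if self._pending.pop(conn_id, None) is None:
            return False  # unknown, or already pruned
        self.established.append(conn_id)
        return True

    def pending_count(self):
        return len(self._pending)
```

A short max_age frees slots quickly during a flood, at the cost of dropping legitimate clients on slow links, so the value is a trade-off rather than a fixed constant.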

Confidence in Your Measurements

Measuring the availability of a system, or making investment decisions aimed at improving that availability, depends on having confidence in the data you collect. The relative value of your measurements is colored by three factors: is the metric the right one, is it valid over time, and can you use it to drive a process of improvement?

