Friday 25 December, 2009

Case Study : Java Web Application Production Issue (Part-I)

This is a case study of a Java-based web application that we developed for one of our clients. It is a business application that they use to maintain their day-to-day transactions and other reference data.

Some days back they reported a weird issue: the system was failing to respond on the search screen. After observing what they were doing, we concluded that the system stopped responding once a user made a search request that fetched a large number of records (more than 4000). Although the application displays the records in a paginated grid, the records are fetched upfront and kept in the session until the user closes the search screen or makes a new request.
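To make the session-based paging concrete, here is a minimal sketch of how one page of rows can be sliced out of a result list that has already been fetched in full. The class and method names (SearchResultPager, page, pageCount) are hypothetical and are not taken from the actual application. The point is only that the whole list stays in memory while the grid shows one page at a time.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: the real application's classes are not shown in this post.
final class SearchResultPager {

    private SearchResultPager() {}

    // Returns the rows for a 1-based page number out of the full, already-fetched list.
    static <T> List<T> page(List<T> allRows, int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be >= 1");
        }
        long from = (long) (pageNumber - 1) * pageSize;
        if (from >= allRows.size()) {
            return Collections.emptyList(); // page lies beyond the last row
        }
        int to = (int) Math.min(allRows.size(), from + pageSize);
        // Copy the slice so the caller does not hold a view onto the big list.
        return new ArrayList<T>(allRows.subList((int) from, to));
    }

    // Number of pages needed to show totalRows rows (rounded up).
    static int pageCount(int totalRows, int pageSize) {
        return (totalRows + pageSize - 1) / pageSize;
    }
}
```

With a design like this, a search returning more than 4000 rows keeps every one of those rows on the heap for as long as the session holds the list, even though each request only renders a few dozen.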

Strangely, the large search itself worked fine, but the subsequent search requests failed miserably. To debug this problem we progressed step by step. Many of our steps and assumptions were proven wrong, but it was a good learning experience.

Before going ahead, here is a summary of the production environment.

Production Server Environment

Components      Description
Server Type     Shared Private Hosting (VPS)
Server Config   256 MB RAM, burstable to 333 MB
Server OS       CentOS 32-bit
JRE             Java SE 6
Tomcat          6.1.18
MySQL           5.1

Blame it on DBMS

The application uses MySQL as its backend database and iBatis 2.3.3 as the data access component. This combination served us well, giving us an easy and flexible data access design.

To start debugging, the first thing I did was turn on detailed logging to track down the exact failure point. After setting the logging level to DEBUG, I re-ran the scenario and carefully observed the logs.

The search requests I made were:

  • Blank search (with no criteria)
  • LIKE search on the code field (where code like '%A%')

Since the number of records was quite high and the logging level was so granular, the system took much longer (it was logging each and every record to the log file). After careful observation, I could make out that while serving the second request, the system stopped fetching records through java.sql.ResultSet. In other words, the last record logged to the file was around record number 2000! On the front end, the system just stopped loading records into the grid (it was an Ajax request). For information, we used the fabulous jqGrid to render the data grid on the pages.

I assumed the problem was with MySQL, which simply choked on a search like where code like '%A%'.

The first thing I did was post a question on StackOverflow. Within a few moments people replied, and the common tone of all the responses was: don't use '%A%', use 'A%' or '%A' as the criteria instead. Those answers strengthened my suspicion of MySQL.
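The reasoning behind that advice can be illustrated with a small analogy. In the sketch below a TreeSet stands in for a sorted index on the code column. This is only an illustration under that assumption and not how MySQL is implemented, and the class and method names are made up. A prefix pattern such as 'A%' can jump straight to the matching range of a sorted structure, while a pattern with a leading wildcard (which, as far as I know, includes '%A' as well as '%A%') gets no help from the sort order and has to look at every entry.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Illustrative analogy only: a TreeSet plays the role of a sorted index on "code".
final class LikeSearchSketch {

    private LikeSearchSketch() {}

    // Rough equivalent of: code LIKE 'prefix%'.
    // Jumps to the first entry >= prefix and stops at the first non-match.
    static List<String> prefixSearch(TreeSet<String> index, String prefix) {
        List<String> hits = new ArrayList<String>();
        for (String code : index.tailSet(prefix, true)) {
            if (!code.startsWith(prefix)) {
                break; // sorted order guarantees there is no later match
            }
            hits.add(code);
        }
        return hits;
    }

    // Rough equivalent of: code LIKE '%fragment%'.
    // Sort order does not help, so every entry is checked.
    static List<String> containsSearch(TreeSet<String> index, String fragment) {
        List<String> hits = new ArrayList<String>();
        for (String code : index) {
            if (code.contains(fragment)) {
                hits.add(code);
            }
        }
        return hits;
    }
}
```

The two searches are not interchangeable, of course: the prefix search returns fewer rows, so switching patterns changes what the user sees, not just how fast it arrives.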

So I finally decided to test these SQL statements using phpMyAdmin, which I had installed on the server for remote access to the database. To my amazement, phpMyAdmin took only around 100 to 300 milliseconds to return the results! This really shook my suspicion of MySQL. Although phpMyAdmin was adding a LIMIT clause to the query, the same query executed through the MySQL prompt did not take much longer either!
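The caveat about LIMIT matters because a paged query and a full fetch are not the same workload. As a rough sketch of the kind of rewriting a browsing tool does, the helper below appends a MySQL LIMIT clause to a statement. The class name, the table name item, and the page size of 30 are assumptions for illustration only and are not taken from phpMyAdmin's source or from our application.

```java
// Hypothetical helper showing how a paging LIMIT clause changes the query that is run.
final class LimitClauseSketch {

    private LimitClauseSketch() {}

    // Appends "LIMIT offset, rowCount" (MySQL syntax) to a SELECT statement.
    static String withLimit(String sql, int offset, int rowCount) {
        if (offset < 0 || rowCount < 1) {
            throw new IllegalArgumentException("offset must be >= 0 and rowCount >= 1");
        }
        String trimmed = sql.trim();
        // Drop a trailing semicolon so the clause lands inside the statement.
        if (trimmed.endsWith(";")) {
            trimmed = trimmed.substring(0, trimmed.length() - 1).trim();
        }
        return trimmed + " LIMIT " + offset + ", " + rowCount;
    }
}
```

For example, withLimit("SELECT * FROM item WHERE code LIKE '%A%';", 0, 30) yields a statement that asks the server for only the first 30 matches, which is why a timing taken that way says little about fetching every matching row.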

I had been completely wrong, and stupid to underestimate the strength of MySQL, so my next targets became the MySQL JDBC driver and iBatis! To settle the doubt, I took the most recent versions of both and put them into the application, but that did not solve the problem. Thus they too were struck off the “criminal list”.

Now I was sure that there was something wrong with our code or some problem with the environment.
