Jatan's Corner

Tuesday, 29 December 2009

Case Study : Java Web Application Production Issue (Part-III)

As we saw in the 2nd part of this series, the real problem was fetching large no. of records, which in turn forcing JVM GC to kick in to free up the heap memory.

In order to fix this problem, we decided to implement pagination in search methods of the system. Till that point of time, the pagination was there only on UI side but from database all the applicable records were being fetched !

Pagination using iBatis

We have been lucky to use iBatis as our Data Access component. It provides very transparent and easy pagination functionality via its queryForList overloaded methods of SqlMapClient.

So we specifically used,

public java.util.List queryForList(java.lang.String id,
                                   int skip,
                                   int max)
                            throws java.sql.SQLException

to pass the pagination related information to database while selecting the records for specific search criteria.

Two important parameters for pagination are calculated as,

skip : (pageNo – 1) * records_per_page

max : records_per_page

This change drastically reduced the no. of records being fetched from around 4000 to 16 (records per page). However this solution came with a minor overhead.

The overhead was to fire one query with same search criteria to compute the total no. of records that could be returned by this criteria. This count is required to calculate the total no. of pages, which will later be shown in data grid. However it is not a great overhead and database work quite fast to return the count.

This optimization reduced the need of memory by great amount and system continues to have ample heap memory for new objects.

Learning

This debugging exercise came as a great learning to me. It took around 3 days for finding the problem, understanding the reason, and fixing it.

Technically, I learnt quite a few new things,

Remote debugging of web application, using JPDA and Eclipse
jmap and jhat profiling tools.
pagination with iBatis

This case study will always encourage me to do large data set testing of whatever application we would develop in future. Now I believe the fact that large no. of data records may reveal some hidden bugs and such testing would ensure the removal of such bugs and ultimately lowering the maintenance hassles and in turn the cost.

Monday, 28 December 2009

Case Study : Java Web Application Production Issue (Part-II)

As I deduced in previous post, the root problem was lying somewhere in our code. The typicality of this problem was that it was happening only on our production server but not on our local development / test machines ! This made my life difficult and finally I decided to do remote debugging of the application using JPDA and Eclipse.

Remote Debugging On Tomcat + Eclipse

After searching through Google / Bing, I came across a few pages about how to setup Tomcat for remote debugging. I spent quite a few hours following the various instructions and with many failed attempts. Finally I could find out how to enable remote debugging by reading catalina.sh (Tomcat invocation / termination shellscript).

Essentially, the Java has standardized the debugging infrastructure in name of JPDA. So most of the Java based application servers and IDEs support this infrastructure to make poor developer’s life easy. In order to enable Tomcat for remote debugging, you just need to start Tomcat via following command,

$TOMCAT_HOME/bin/catalina.sh jpda start

Generally we use startup.sh to start and shutdown.sh to stop the tomcat server.

Here observe the first command line arg to catalina.sh (i.e. jpda), this will instruct catalina.sh to setup the remote debugging and bind the JPDA listener at 8000 TCP port. You can always change this 8000 port but then you might have to setup JPDA_ADDRESS environment variable.

Before starting Tomcat server, please make sure that 8000 is open in firewall for inbound connections. Now we will see what to do in Eclipse (it’s fairly simple),

Open the code workspace
Go to Run → Debug Configurations
Create new Remote Java Application configuration, as shown below,
Give valid project name.
In Connection Properties, enter your host server’s name / IP and port is 8000 (or whatever you set)
Your Debug button should be enabled (mine is disabled because I have given wrong Project Name :) )
Click on Debug to start debugging.

Replicating the issue in debugging

After having remote debugging on, I could simulate the issue by opening the application simultaneously in two different browsers (i.e. Firefox & IE). I initiated blank search request first on Firefox and then on IE. The breakpoints were kept in DAO layer to see exactly where it is getting stuck. The request thread initiated from Firefox stopped at breakpoint and after doing Step Over debugging, the request successfully fetched 4000 records and returned to web layer. Whereas the request thread initiated via IE came to DAO layer and went for fetching the records and after some time the thread just died without any trace. The debugging pointer just went away and no trace within Stack Trace / Console window. This was unique, I had never anticipated that the request thread would die like this.

After this first faceoff, I studied the logs and the record fetching stopped at roughly same point where it stopped in earlier attempts. I was going nowhere with this issue and suddenly it struck to my mind that what about using profiling tools. So after a few rounds of searching, I settled on two nifty profiling tools (bundled with JSDK), i.e. jmap (Java Memory Map) and jhat (Java Heap Analysis Tool).

Using Profiling Tools (First breakthrough…)

I will give a very short introduction of these tools. These profiling tools are forming a pair like data generator and report presentation tools. You guessed it right, jmap is a data generator tool whereas jhat is report presentation tool.

jmap dumps the entire heap memory objects into binary file with all the reference information. In other words it dumps entire memory allocated and used by JVM to store the run time data structures of objects.

jhat on other hand, reads the binary file created by jmap and works as a parsing and presentation front-end. It has inbuilt HTTP server and it generally listens to 7000 TCP port. jhat provides HTML based interface to access the information about the complex data structures of objects (Object graphs) present in dump file. It also supports Object Query Language (however I don’t know how to use that :) ).

So I restarted Tomcat and replayed the same scenario with remote debugging on. When the second request stuck, I ran jmap against the Tomcat Process and it generated the dump file. I executed following command on server,

jmap –dump:file=webappheap.dmp 42961

Here 42961 is the process id of Tomcat process. You may execute following Unix command to get the process id of Tomcat,

ps –xf | grep java

I download the file on my local machine and supplied it to jhat using following command,

jhat webappheap.dmp

Browser View:

After going through the object graphs presented, I found the objects allocated during search. After going through the entire object graph of any particular object, I roughly calculated the size of object was around 4Kb and there were around 8000 objects were being created which would roughly equal to 32 Mb of memory. This figure jolted me. It was too much of memory for just two search requests. Now the clouds of confusion were quite clear. I deduced the reason of the second thread’s termination. It was Garbage Collection which would kick-in because of very limited memory available to server and the request required atleast 16 Mb. So in order to make the future provision for free memory, JVM was invoking GC to free up the memory and this was causing the current thread to go to halt state and ultimately timeout !!

I checked the –Xmx parameter for Tomcat and it was just 64 Mb !! and my two requests were eating up 32 Mb memory (50% of entire Tomcat server).

Quick Fix

This is definitely because of our short sightedness and bit of bad programming. In order to make quick fix to this problem, the easiest way was to increase the –Xmx parameter’s value from 64Mb to at least 192 Mb. This will accommodate at least around 9 to 10 simultaneous full search requests without GC invocation.

Setting –Xmx parameter for Tomcat

In order to set the custom –Xmx parameter for Tomcat, you may set JAVA_OPTS environment variable or you can modify catalina.sh and set JAVA_OPTS like,

JAVA_OPTS=”-Xmx192m”

However this will not work if you are starting your tomcat through services in CentOS / Ubuntu. For example, if you run following command to start Tomcat server,

/sbin/service tomcat6 start

then it will neither take JAVA_OPTS environment variable into consideration nor will use catalina.sh to start the server. It uses its own application wrapper to start / stop the server. This application wrapper script would be lying at /etc/init.d/tomcat6

So for such cases you would have to change the tomcat configuration file at /etc/tomcat6/tomcat.conf. The following property needs to be changed,

# Maximum Java Heap Size (in MB)
wrapper.java.maxmemory=192

Restart the server and it will start allocating heap memory up to 192 Mb. You may execute

ps –xf –U tomcat | grep java

to check the java process executed under tomcat user. The output would look like,

3557 ? Sl 1:24 \_ /usr/java/j2sdk/bin/java -server -Dorg.apache.jasper.compiler.Parser.STRICT_QUOTE_ESCAPING=false -Dcatalina.base=/opt/tomcat6 -Dcatalina.home=/opt/tomcat6 -Xms64m -Xmx192m -Djava.library.path=/opt/tomcat6/lib

Current Status

Currently the application is working fine even with larger search requests but we need to fix the code so that it would not exploit the server memory. The memory is a precious resource (at least in our server environment), it should not be wasted like that.

In next (and last) part, I will show what optimizations we made to code to reduce the memory load.

Friday, 25 December 2009

Case Study : Java Web Application Production Issue (Part-I)

This is the case study about the Java based Web application, which we have had developed for one of our clients. This is a business application and used by them to maintain their day to day transactions and other reference data.

Some days back they reported a weird issue wherein system was failing to respond in case of search screen. After observing what they were doing, we came to conclusion that system used to stop responding once the user made such search request, which fetched large no. of records ( > 4000 records). Although the application displays the records in paginated grid, the records are fetched upfront and kept in a Session until user closes the search screen or makes a new request.

Strangely, the large search was working fine but the subsequent search requests were failing miserably. In order to debug this problem we progressed step-by-step. Although many of our steps and assumptions were proven wrong ! it was a good learning.

Before going ahead a summary of the production environment,

Production Server Environment

Components	Description
Server Type	Shared Private Hosting (VPS)
Server Config	256 MB RAM burstable to 333 MB
Server OS	Cent OS 32-Bit
JRE	Java SE 6
Tomcat	6.1.18
MySQL	5.1

Blame it on DBMS

The application is using MySQL as backend database and iBatis 2.3.3 as a Data access component. This combination served us a very good purpose of easy and flexible data access design.

In order to start debugging, first thing I did was to turn on the detailed logging to track down the exact failure point. After setting up the DEBUG level logging, I re-ran the scenario and carefully observed the logs.

The search requests I made were,

Blank search (with no criteria)
Like search on code field (where code like ‘%A%’)

Since the no. of records were quite high, and logging level was more granular system started taking more time (because it was logging each and every record into log file). After careful observation, I could make out that while serving second request, the system stopped fetched the records through java.sql.ResultSet. In other words, I could see that the last record, which was logged into the file was around 2000 ! On front-end the system just stopped loading the records in grid (it was an Ajax request). For information, we have used fabulous jqGrid to render data-grid on the pages.

I assumed that it should be a problem with MySQL which just gets mad for search like where code like ‘%A%’.

First thing I did, was putting up a question on StackOverflow. In a few moments people replied and the common tone of all the responses was don’t use ‘%A%’ but use ‘A%’ or ‘%A’ as criteria. Those answers made my doubt more strong on MySQL.

So I finally thought to test these SQLs using phpMyAdmin, which I have installed on the server for remote access to db. In my amazement, phpMyAdmin took around 100 - 300 millisecons to get the results ! This really shook my doubt on MySQL. Although phpMyAdmin was putting LIMIT / TOP keywords, the same query executed through MySQL prompt did not take much longer time !

I was completely wrong and stupid in underestimating the strength of MySQL therefore my next target became MySQL jdbc driver and iBatis ! So to resolve the doubt, I took the most recent versions of both of them and put them into application, but they did not solve the problem. Thus they were also out of the “criminal list”.

Now I was sure that there was something wrong with our code or some problem with the environment.

Friday, 11 December 2009

Welcome Intel Core i7 and Windows 7 !!!

If I take DATEDIFF of the current post and the last post on this blog, it would come around 488 days. I know, it has been long silence partly due to my carelessness attitude towards maintaining the blog and some work reasons too.

So I am dedicating this blog post to lovely thing called Windows 7. I just bought Dell XPS Studio 16 (with Intel Core i7 processor and ATI Radion 4670). The installed OS was Windows 7. After using it for around a day, I must say this is the best thing came out from MS camp after .NET.

Do you see some rhythm here !! Core i7 and Windows 7 :). I just don’t see, I can feel it too… Intel made fabulous advances in processor design. Just imagine this tiny thing has 4 cores and each core is capable to run two threads (Hyper-Threading tech.). See the below snap of the task manager,

If you are hardware junkie like me, you will feel the heart pounding at seeing 8 CPUs phew !!!

It took me around total 20 to 25 mins for all the tasks listed below,

Booting PC with Windows 7 CD
Starting fresh installation of Windows 7
Dropping the current partition and creating 2 different partitions
Installing OS
Selecting localization information.
Installing necessary drivers
Formatting the unused space, which was 100 GB. You won’t believe, it just took 30-40 seconds to format 100 GB with NTFS !!!

One diversion here,

If you have Dell XPS Studio 1645 and you have reinstalled the OS then you might get warning for unrecognized “base system device” in device manager of Windows 7 control panel. The reason is Ricoh Card Reader. You need to install device driver from Dell Utilities CD for Ricoh Card Reader and that warning gone.

See the beauty of Windows 7,

Somebody said, beauty lies in person’s eye. I would say for Windows 7 it is not correct, it is really a beauty.

Saturday, 9 August 2008

Revived Again :)

Yesss !! after around 22 days I am back ... I know it's very bad to remain silent for these much days in the world of blogosphere. But you know what, I am bit lazy person and it needs some "kick" to get me writing the blog every day or two. Unfortunately for such "kick" I need to depend on myself and there's what the catch is.

Anyway, in this entry I will sum up the important activities till 9th August 2008 in very macro form .. obviously :)

Started reading Alchemist but after completing 86th page, I came to know that the book was wrongly bound by publisher ... the surprising thing is the book restarts (right with preface) and the last page of the book is 86. So I have to go to book vendor and try to get the corrected edition (if I am lucky !)
Went to Maharashtra Stamp Duty collection office at MMRDA, BKC to check the status of my pending refund of Stamp Duty. I really got impressed that without any "chai-pani" they informed me that application is sent to head office for further approval. Let's hope it really works out.
My relatives (in-laws, those are : twity (mira) and her parents) had visited us and it was real fund. Playing with innocent and lovely mira (who's just 1.5 year old), was the best thing I had in recent days. We visited Tikuji Ni Vadi and it's value for money.
Did quite a bit shopping for my proposed US trip.
After some days, my US trip got canceled !!! (who will pay my CC bills now ....)
Repaid some amount towards my HDFC loan so that interest rate hike would affect less. RBI should at least think that what about existing customers first you banks allowed us to get loans at lower rates and once we are in trap ... you radically increased rates !!!

Ok so these are the personal memoirs for past days...

Technorati Tags: personal

Tuesday, 15 July 2008

Tough Day

Okey !! so I am again starting the blogging activity. Hope this should continue and I do get time as well as inspiration from within to keep it continue.

Anyway today, 15 Jul 2008, the day started with light mood and cool breeze of rainy air however it started turning difficult as the rain started pouring heavily when I was riding the bike to office. I was supposed to reach office by 9:45 A.M. but reached bit late (around 9:55 A.M.), hence I parked the bike and packed my rain coat hastily and just reached in office with bit wet cloths. All onshore guys were waiting for me for telecon, and I was substantiating IST (Indian Standard Time) idiom.

Office day went fine. I did some experiments with PHP script, checked out everything from changed repositories, setup build environments on test server, etc... (sort of admin. work rather than development). Oh yes ... I did big time formatting of Word documents. My colleagues had prepared some technical documentation in Word, which was just a bundle of paragraphs and images. There was no formatting and consistency. Hence I spent 2-3 hours formatting them by creating and applying various styles, creating Table Of Contents, etc... By the way, this was the first time I created TOC :) and after pretty much finding, I found how the TOC automatically takes the titles (hint : use Level property in Paragraph style).

So I was ready to leave by 6:15 pm and even confirmed with Sweety that I will pick her up from Seepz. So happily I reached at elevator, commanded elevator to take me to parking lot, reached to bike, donned rain coat (as it was raining outside), and .... suspense ... I was finding no keys !!! My bike keys were lost. I checked my bag twice and myself thrice (including rain coat pockets) but in vain.

Then suddenly I realized that I might have forgotten them on my desk, so I called up my friend there and he said negative... I was bit tense now because Sweety would have been waiting for me and here I was looking for my keys and running here and there, bending myself to find the keys below parked cars / bikes. But my efforts did not pay up and lastly I had to call her so that she can leave on her own and here I could find out some way out. I thought that I might have dropped them in elevator. So I checked them (the one I went up in and the one I came down). Still no clue. I asked some janitors and other guys but NO NO NO...

At last I saw one guy who seemed to be kind of looking after the parking lot. I asked him and first time a guy who replied positively. He said that somebody of them had found two keys ("HURRAY") so he asked me to wait and he wen to get the keys from his closet. He did come with two keys and I could see the ignition key even if he was around 15 feet away. But (the BIG BUT) the petrol key was still missing. The second key was of someone else. So it gave me mix feelings (pleasure of getting ignition key and pain of loosing petrol lock key).

Now the question was what should I do to get bike running. So I bet on the remaining fuel in the carburetor. I applied the choke, started the bike, and gave full accelerator so that it can climb four slopes to come out of the building. And like Chetak, it did come out of "Castle". With that remaining fuel, I went around 100 meters ahead and then I had to drag the bike... till I get some key maker (rather key hacker).

Fortunately a kid working at some garage offered me a help and he magically used his "master key" to open bike's petrol lock. He solved the problem in just 30 seconds... I thanked God and that kid. To settle my Karmic account, I offered him Rs. 10 that he was not eager to take but he took (I felt satisfied). Since bike was running by then I started riding it towards home and along the way I found my colleague waiting for bus, so I offered him a lift (crediting Karmic account ;) ).

So end of the day, I reached home and Sweety had already prepared tasty dinner so I relished it and watched an interesting movie "Fasted Indian".

Ok now ... it's time to go to bed.