storage selection for a database

It is amazing how friendships help business relations. I have a friend who works at NetApp, and we are both calling on the same account. We started out giving the customer conflicting recommendations, each based on knowledge of our own world. Once we talked, I understood his perspective, and I think he understands mine. We are now talking about how we can present jointly to this customer and even coordinate our messages with each other and through our recommended consulting organization.

Here’s the situation. We have a customer that is running an old version of Oracle (9i) on an old operating system (Solaris 8) on old hardware (an older Fujitsu model) attached to a NetApp filer (three generations old). The customer is experiencing significant performance problems. They run two applications against the database. One is an ERP system that has a relatively low transaction count. The other is a Business Objects application. Normally sharing one database would not be a problem, but the Business Objects app is consuming all of the resources, causing the ERP system to lock up and interrupt business.

Simple solution, right? Find the bottleneck and fix it. Well, it isn’t that simple. NetApp did a detailed analysis and found that the disk is 90% busy and the network is 60% busy. If they double the disk speed and cut the latency in half, the network will become the problem. Fortunately, the Business Objects app is configured for read-only analysis of the database. If we can split that read-only workload off from the primary database, we can significantly improve the ERP performance.
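To make the "fix one bottleneck, hit the next" reasoning concrete, here is a rough back-of-the-envelope sketch. It assumes that utilization scales linearly with load, which is a simplification and not something the NetApp analysis stated. The function names are mine, purely for illustration.

```python
def headroom(utilization: float, speedup: float = 1.0) -> float:
    """Factor by which load could grow before the resource is 100% busy.

    Assumes utilization scales linearly with load (a simplification).
    """
    return speedup / utilization


def first_bottleneck(resources: dict) -> str:
    """Return the resource that saturates first as load grows."""
    return min(resources, key=lambda name: headroom(*resources[name]))


# (measured utilization, speedup factor from an upgrade)
current = {"disk": (0.90, 1.0), "network": (0.60, 1.0)}
upgraded = {"disk": (0.90, 2.0), "network": (0.60, 1.0)}  # disk twice as fast

for label, resources in (("current", current), ("upgraded", upgraded)):
    for name, (util, speedup) in resources.items():
        print(f"{label:8s} {name:8s} headroom x{headroom(util, speedup):.2f}")
    print(f"{label:8s} saturates first: {first_bottleneck(resources)}")
```

Under that assumption, the disk today has only about 11% of headroom, while doubling its speed moves the limit to the network, which runs out at roughly 1.67 times the current load.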

Given a computer science/electrical engineering background, my first impression was that NFS mounting the data was the biggest problem. If they could move to direct-attached disks, it would improve performance. This would get rid of the network load and allow them to purchase faster disks at the same time. What I didn’t realize was that they would also need to purchase a Fibre Channel switch and learn how to manage it as well. I did know that NFS management is easy; I’ve been doing it for years. I didn’t realize that SAN management is an art unto itself and is significantly more complex and expensive. Looking into the subject a little more, I was amazed to find that NFS does not substantially affect performance if tuned properly. An NFS system is only 12% slower than a Fibre Channel attached system and 2% slower than iSCSI.
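To put those two percentages on one scale, here is a small sketch. It assumes "X% slower" means throughput is (1 - X) times the reference, and it uses an arbitrary baseline of 100 for Fibre Channel. Neither of those comes from a measurement; they are just my reading of the figures.

```python
def slower_by(reference: float, fraction: float) -> float:
    """Throughput of a system that is `fraction` slower than `reference`."""
    return reference * (1.0 - fraction)


FIBRE_CHANNEL = 100.0  # arbitrary baseline, not a measured value

nfs = slower_by(FIBRE_CHANNEL, 0.12)  # NFS is 12% slower than Fibre Channel
iscsi = nfs / (1.0 - 0.02)            # NFS is 2% slower than iSCSI

print(f"Fibre Channel: {FIBRE_CHANNEL:.1f}")
print(f"iSCSI:         {iscsi:.1f}")
print(f"NFS:           {nfs:.1f}")
```

Read this way, iSCSI and NFS land within a couple of points of each other, and both sit roughly a tenth below Fibre Channel.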

The recommendation that we are moving forward with is a second database system that replicates the primary data using Data Guard with a physical standby. Once this is done, we redirect the Business Objects app to hit the standby database and purchase a second NetApp filer to offload the original system. The added benefit is that this configuration is the foundation for disaster recovery, since both apps currently run on one machine. Even though the standby system will be in the same datacenter, it will provide redundancy for the primary system.

The performance results and the complexity of SAN storage amazed me. Both were contrary to my thinking. Once I looked at it from this perspective, it made sense. Live and learn…