Do I really need Veritas Cluster software or Sun Cluster software?

A common question these days comes up when we start talking about replacing older multi-processor systems with newer systems. Many people want to get rid of their cluster software, either Veritas or Sun Cluster, because the maintenance prices are relatively high. The question that follows is: do I need to purchase RAC to use Oracle Clusterware (http://www.oracle.com/technology/products/database/clusterware/index.html)? In short, the answer is no. You do not need to purchase RAC software to use Oracle Clusterware. It is a free software package that you can download for most operating systems.

The next question that comes up is how it compares to Veritas and the Sun suite. I typically refer everyone asking this question to:
- http://www.oracle.com/technology/products/database/clusterware/index.html
- a whitepaper
- an example of how to protect non-Oracle apps
- an article on how to protect an application server
- sample code to keep an http server up and running
- a support forum

The net of all these references is that you can create a farm-like instance of an application and run it across multiple computers. These computers run an instance of the application, and the application is restarted if it halts or fails on one computer. It is important to note that the cluster software does not maintain the state of the application; it only makes sure that the application runs on at least one of the servers in the farm.

If, for example, you want to make sure that you have a web server running that presents static information, you can attach multiple computers to an NFS-mounted file system or a shared network disk and run httpd on one of the computers in the http farm. If httpd fails, it is restarted on the same or a different computer. If this web server hosts dynamic information like a shopping cart, the application needs to make sure that state is stored in the database repository and not in the memory of the web server, or the shopping cart will be lost. If you have an application that keeps state in memory, you will need something like Coherence to share memory between instances of an application server.

Some prime examples of services that you typically want to keep up and running are an LDAP server, a database listener, and a DNS server. Any service that connects to a repository to look up data and reports the results back to the client is a good candidate.
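The restart behavior described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Oracle Clusterware is implemented: the `Node` and `ManagedApp` classes, the `check_and_recover` function, and the "try the same node first" policy are all made up for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A hypothetical farm member; `up` is False when the machine is down."""
    name: str
    up: bool = True


@dataclass
class ManagedApp:
    """A hypothetical application that must run on one node of the farm."""
    name: str
    node: Optional[Node] = None
    alive: bool = False
    memory_state: dict = field(default_factory=dict)

    def start(self, node: Node) -> None:
        self.node = node
        self.alive = True
        # A freshly started process has no in-memory state, which is why
        # things like a shopping cart belong in the database instead.
        self.memory_state = {}


def check_and_recover(app: ManagedApp, nodes: List[Node]) -> Optional[Node]:
    """One monitoring pass: leave a healthy app alone, otherwise restart it."""
    if app.alive and app.node is not None and app.node.up:
        return app.node
    # Assumed policy: try the node the app was on first, then the others.
    candidates = [n for n in nodes if n is app.node]
    candidates += [n for n in nodes if n is not app.node]
    for node in candidates:
        if node.up:
            app.start(node)
            return node
    app.alive = False
    return None  # no node in the farm is available


if __name__ == "__main__":
    farm = [Node("node1"), Node("node2")]
    httpd = ManagedApp("httpd")
    check_and_recover(httpd, farm)            # initial start on node1
    httpd.memory_state["cart"] = ["book"]
    httpd.alive = False                       # the process crashes
    check_and_recover(httpd, farm)            # restarted on node1, cart is gone
    farm[0].up = False                        # node1 itself fails
    print(check_and_recover(httpd, farm).name, httpd.memory_state)
```

The service stays reachable after both failures, but anything that lived only in `memory_state` is lost on each restart.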

What is the key difference between a server farm and a cluster? A server farm is a group of servers that respond to requests in a round-robin or load-balanced configuration. This typically requires an expensive router to manage the service. Clusterware is a software management system that performs the same function using the operating system and the resources available to it to provide high availability for a service.
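To make the two request-handling styles concrete, here is a small Python sketch. It is only a toy comparison, and the function names `round_robin` and `first_available` are invented for this post; neither reflects how a real router or cluster product is built.

```python
from itertools import cycle
from typing import Callable, Iterator, List, Optional


def round_robin(servers: List[str]) -> Iterator[str]:
    """Farm style: hand each new request to the next server in turn."""
    return cycle(servers)


def first_available(servers: List[str],
                    is_up: Callable[[str], bool]) -> Optional[str]:
    """High-availability style: send requests to the first server that is up."""
    for server in servers:
        if is_up(server):
            return server
    return None


if __name__ == "__main__":
    picker = round_robin(["web1", "web2"])
    print([next(picker) for _ in range(3)])
    print(first_available(["web1", "web2"], lambda s: s != "web1"))
```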

The key sub-components that make this software work are virtual IP addresses, cluster ready services, and cluster synchronization services. A virtual IP address (VIP) is needed so that applications connect to a single IP address and not to a physical machine in the cluster. The cluster software manages which physical machine answers the virtual address. The cluster ready services launch and monitor applications on the various nodes to make sure that requests are being processed and answered. The cluster synchronization services manage shared resources like a disk or a network resource and act as an arbitrator and broker if any node gets confused.
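The virtual IP idea is easy to picture with a short Python sketch. The `VirtualIP` class, the node names, and the addresses (taken from the 192.0.2.0/24 documentation range) are all assumptions made for illustration; the real cluster software does this at the network level, not with a dictionary lookup.

```python
from typing import Dict, List, Optional


class VirtualIP:
    """Maps one client-facing address to whichever node currently owns it."""

    def __init__(self, address: str, node_addresses: Dict[str, str]) -> None:
        self.address = address
        self.node_addresses = node_addresses  # node name -> physical IP
        self.owner: Optional[str] = None

    def assign(self, healthy_nodes: List[str]) -> Optional[str]:
        """Keep the current owner if healthy, else move to the first healthy node."""
        if self.owner not in healthy_nodes:
            self.owner = healthy_nodes[0] if healthy_nodes else None
        return self.owner

    def resolve(self) -> Optional[str]:
        """Physical address answering the VIP right now, or None if unowned."""
        if self.owner is None:
            return None
        return self.node_addresses.get(self.owner)


if __name__ == "__main__":
    vip = VirtualIP("192.0.2.10", {"node1": "192.0.2.11", "node2": "192.0.2.12"})
    vip.assign(["node1", "node2"])
    print(vip.address, "->", vip.resolve())
    vip.assign(["node2"])                      # node1 dropped out of the cluster
    print(vip.address, "->", vip.resolve())
```

Clients only ever see `vip.address`; which physical machine answers can change underneath them without any client reconfiguration.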

In general, the cluster software allows loosely coupled systems to act as a tightly coupled system for a given application. The three components run on all nodes and communicate with each other to manage the application running on one or multiple nodes. This configuration can respond to load balancing requests or just make sure that at least one node is available to respond to requests. Given that it is a free software package, it is worth looking at as a general-purpose tool instead of paying for services from Sun or Veritas.