Managing servers and instances in the cloud

Managing servers and instances has been an ongoing issue since the introduction of the first computer. With the advent of virtualization, the idea of a management console that controls which processors are running which services, and what storage is allocated to which operating system, has gained popularity. Many people are familiar with VMware vSphere, where you get a view of processors and servers and can see the virtual images deployed on each server. We can see how well the resources (memory, CPU, and disk) are being utilized. Because allocation is dynamic, we can allocate more or fewer resources and make sure that we are not over-allocating resources and wasting them, or under-allocating them and causing applications to run slower.

In this example we can see that we have two processors, 2 GB of memory, and just under 300 GB of disk on this computer. We have five virtual machines running on this computer and can dive into each one to see what operating system is installed and how the limited resources are allocated and utilized. What we can’t see is what applications are installed and how those applications are running. For example, is the Windows Home Server 2011 instance running an Apache web server, and how many hits did the web server get in the past four days? Monitoring tools raise the question of what you are monitoring. If you are managing limited resources and making sure that you have not over- or under-allocated them, tools like vSphere are excellent. Unfortunately, you will need other tools to dive into other layers. EMC, for example, has a storage manager that lets you look not only at the logical unit level but also at the controller and disk level. It understands VMware and lets you look at how disks are related to virtualization engines and how they are consuming resources.

Again, this is a very good tool for looking at how well a disk is performing, how well data is laid out across spindles, and how well data is moving over the network between disk and server. We can see hot spots. We can see disks that are over- and under-utilized. We can manage a scarce resource and make sure that it is properly utilized.

When we talk about monitoring we need to shift our thought process. Yes, it is important to manage compute, memory, and storage resources, but it is also important to realize that these resources are commodities. If we run low, we get more. If we allocate too much, we are wasting resources. We should be able to automate allocation of resources and size them up or down without manual monitoring.

What we are really interested in is how well our company is running. If we are a university we might be interested in the latency of delivering online video classes, or in how many classes are being added to student schedules during registration. If we are a ticket retailer we might be interested in how many tickets were requested and paid for on a minute-by-minute basis. Note that we are not talking about how well a disk drive is allocated or whether we have enough processors allocated to a virtual machine. We are talking in business terms. We are looking at tying revenue-generating services back to computer resources and trying to figure out what is causing a problem.

In the online video classroom example, we might have our processors allocated properly, storage tuned to the last IOP, and memory allocated to buffer data and reduce disk reads. But if we are on the same network as the athletic department, our basketball team makes it to the Elite Eight during March Madness, and the athletic department live streams the game on the same network as our classroom servers, our classes will be offline due to the demand to watch the basketball game. Tools from EMC and VMware will show that everything is working fine and life is good. Meanwhile the help desk is getting calls from off-campus students who can’t access their assignments during midterms and whose Thursday class is not available.

What we need is a monitoring tool that looks at systems and services, not just resources like processor and disk. We would like to look at the video distribution system and be able to dive into the disk, network, or processor, see what the bottleneck is, and fix it quickly.
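To make the idea of a service-level view concrete, here is a minimal sketch in Python. It groups the resources behind one service and reports which one is saturated. The class, the function, the 0.9 threshold, and the utilization numbers are all hypothetical and chosen to mirror the basketball game scenario. They are not taken from any real monitoring product.

```python
from dataclasses import dataclass


@dataclass
class ResourceSample:
    """One utilization reading (0.0 to 1.0) for a resource backing a service."""
    name: str
    utilization: float


def find_bottleneck(samples, threshold=0.9):
    """Return the most saturated resource at or above threshold, or None."""
    saturated = [s for s in samples if s.utilization >= threshold]
    if not saturated:
        return None
    return max(saturated, key=lambda s: s.utilization)


# Hypothetical readings for the video classroom service during the game.
video_service = [
    ResourceSample("cpu", 0.35),
    ResourceSample("memory", 0.50),
    ResourceSample("disk", 0.40),
    ResourceSample("shared_network", 0.98),
]

bottleneck = find_bottleneck(video_service)
print(bottleneck.name if bottleneck else "no resource is saturated")
```

A tool that only watched the first three readings would report a healthy system. Including the shared network in the service definition is what exposes the problem.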

Oracle released a tool years ago called Enterprise Manager. The tool started out as a database monitoring tool that allowed you to dive into SQL calls and figure out why they were taking longer than necessary. With acquisitions of companies like BEA and Sun Microsystems, the tool expanded to look at how Java was performing inside a WebLogic server and how the disk drives serving requests for the database and WebLogic server were performing. Acquisitions of companies like JD Edwards and PeopleSoft drove the monitoring tools in the opposite direction, and screens showing how many purchase orders were being processed on an hourly basis were suddenly available. You could look at what the bottleneck was in closing your books for the end-of-month reconciliation. Was it a manual process waiting on a report to drop into a directory, or was it a SQL statement that was taking minutes rather than seconds to complete?

You could start by looking at a process like purchase orders and dive into the database to see if a table was reaching its storage limits. You might also find that someone recently patched the database and created a new column that no index covers. Searches are now going against this column, so select statements are doing a full table scan rather than using an index to return answers quickly. Adding more storage in this case would be a waste of time. Yes, we are running out of storage on a table, but the real issue is that we need to re-index the database or generate a new SQL execution plan. Below is a screen shot of how well a database is performing with links to look at all the sub-components of the database.
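The full table scan problem described above can be reproduced on a small scale. The sketch below uses SQLite from the Python standard library as a stand-in for an Oracle database, so the plan output will not match what Enterprise Manager shows. The table, column, and index names are made up for illustration.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable detail strings from EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchase_orders (id INTEGER PRIMARY KEY, vendor TEXT)")
conn.execute("CREATE INDEX idx_po_vendor ON purchase_orders (vendor)")

# A later change adds a column that no existing index covers.
conn.execute("ALTER TABLE purchase_orders ADD COLUMN region TEXT")
conn.executemany(
    "INSERT INTO purchase_orders (vendor, region) VALUES (?, ?)",
    [(f"vendor{i % 50}", f"region{i % 10}") for i in range(1000)],
)

search = "SELECT * FROM purchase_orders WHERE region = ?"
plan_before = query_plan(conn, search, ("region3",))
print("before:", plan_before)  # the plan should report a full table SCAN

# Indexing the new column lets the same query use an index lookup.
conn.execute("CREATE INDEX idx_po_region ON purchase_orders (region)")
plan_after = query_plan(conn, search, ("region3",))
print("after:", plan_after)  # the plan should now mention idx_po_region
```

The exact wording of the plan text varies between SQLite versions, but the change from a scan to an index search is the point. No amount of extra storage would have produced it.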

Books have been written on Enterprise Manager, and this blog is not going to cover enough to make you an expert on the subject.

There are also a number of blogs related to Enterprise Manager.

This is a partial list of blogs returned by a Google search. I am sure I missed a few. Note that the list of books and blogs is not a short list. There are classes offered by Oracle University that you can take virtually or in a classroom (both cost money).

The way that Oracle Enterprise Manager is paid for is simple. The base system is free and you pay for the options that you want to use. Unfortunately, the Technology Price Guide is not very clear as to what is and is not Enterprise Manager and what is an option on the database. For example, on page 7, most of the management packs are listed. If you want diagnostics for the database, the pack is licensed against your database, at $7,500 per processor, and not against Enterprise Manager. You can license at $150 per named user, but the licensing metric for your database needs to match the licensing metric for your management pack. You could have a two-processor license for production and a 25 named user license for development and testing, so you will need to blend these licenses into Enterprise Manager with the management packs.

Diagnostics is particularly confusing because you enable or disable this feature in Enterprise Manager and not in the database. The telemetry data is being collected for the database, but the reporting on the results of the analysis is not being done in the database. You could turn on the reporting in Enterprise Manager without involving the DBA, thus incurring a license fee that you had not paid for. There is no license key, and no email is sent to Oracle saying that you enabled the feature. It is a simple checkbox in Enterprise Manager that says turn on diagnostic reporting. In OEM 10g the feature was turned on by default and you had to turn it off. This has changed in recent releases, and a warning screen now pops up telling you that this is not a free feature. If you try to turn this feature on when connecting to an Enterprise Edition database in the Oracle Public Cloud you will get a feature not available message. You need to go with the High Performance or Extreme Performance edition of the database to get diagnostics enabled.
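The blended production and dev/test example above works out as follows. This is a rough sketch that assumes the $7,500 per processor and $150 per named user figures quoted above are the list prices for the diagnostics pack. It ignores license minimums, support fees, and discounts, and the function name is made up for illustration.

```python
PER_PROCESSOR = 7_500   # list price per processor quoted in the text above
PER_NAMED_USER = 150    # list price per named user quoted in the text above


def pack_cost(processors=0, named_users=0):
    """List-price cost of one management pack for one environment."""
    return processors * PER_PROCESSOR + named_users * PER_NAMED_USER


# Each environment uses the same metric as its database license.
environments = {
    "production": pack_cost(processors=2),
    "dev_test": pack_cost(named_users=25),
}

for name, cost in environments.items():
    print(f"{name}: ${cost:,}")
print(f"total: ${sum(environments.values()):,}")
# prints:
# production: $15,000
# dev_test: $3,750
# total: $18,750
```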

There are also management packs for Oracle Applications, and the pricing for these products can be found in the Oracle Applications Price List. You need to search for the word “packs” to find the price of the management packs in this list. You can get a list of all the management packs from the Oracle Tech Network page for Enterprise Manager.

It is important to note that the Enterprise Manager that runs in your data center monitoring your servers and Oracle hardware and software products is the same tool that you can use to monitor and manage PaaS and IaaS resources in the Oracle Public Cloud. You can connect to the instance in the cloud using ssh and read the telemetry from the cloud instance as if it were installed on one of your servers. You can use extensions to the latest version of Enterprise Manager, 13c, to clone a pluggable database instance from your on site installation to a cloud instance.

You can also set up reporting and self-service requests so that end users can ask for a new service to be provisioned either on site or in the cloud. Below is a screen shot of how to do this for a database. We could do something similar for a WebLogic server, an Apache web server, a PeopleSoft instance for dev/test, or any layer of the Oracle stack.

In summary, selecting a management tool is important, and tools are good to understand and use properly. At some point you need to step back and ask what questions you need answers to. Am I diving too deep trying to optimize something that is not worth that much analysis? Could I automate this and not have to monitor it at all? If I run out of processing power does it make sense to automatically scale up the number of processors? Should I scale out by spinning up more web servers? Do I need to re-architect my network topology to isolate disk traffic from client traffic? If I generate a report who will consume the results? Is the report for someone in IT? Purchasing? The process owner? Is it a technology or financial report? Products like Enterprise Manager allow you to generate all of these reports using different management extensions. My suggestion is to look at some of the introductory videos on the Oracle Tech Network to get an introduction to the problem that you are trying to solve, then figure out how much it will cost to measure what is important to you.