Database as a Service

Today we are going to dive into Database as a Service offered by Oracle. This product is the same product offered by Oracle as a perpetual processor license or perpetual named user license for running database software in your data center. The key difference is that the database is provisioned onto a Linux server in the cloud and, rather than paying $47,500 for a processor license and 22% annually after that, you pay for the database services on an hourly or monthly basis. If you have a problem that needs only a few weeks, you pay for the service for a few weeks. If you have a problem that takes a very large number of processors but only for a very short period of time, you can effectively lease the large number of processors in the cloud and purchase a much smaller number of processors in your data center.

Think of a student registration system. If you have 20K-30K students that need to log into a class registration system, you need to size this server for the peak number of students going through the system. In our example, we might need an 8 core system to handle the load during class registration. Outside the two or three weeks for registration, this system sits at less than 10% utilization because it is only used to record and report grades during the semester. Rather than paying $47.5K times 8 cores times 0.5 for an x86 or Sparc server ($190K), we only have to pay $47.5K times 2 cores times 0.5 for x86 or Sparc cores ($47.5K) and lease the additional processors in the cloud for a month at $3K/core/month ($24K). We effectively reduce the cost from $190K to $71.5K by using the cloud for the peak period. Even if we do this three times during the year, the price is $119.5K, which is a cost savings of $70.5K. In the second year we would be required to pay $41.8K in support cost for the larger server. By using the smaller server we drop the support cost to $10.5K. This effectively pays for about a third of the cloud resources we lease by using a smaller server and bursting to the cloud for peak utilization.
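
To make the arithmetic above easy to reproduce, here is a quick back-of-the-envelope sketch using the list prices quoted in this post (the numbers are illustrative; your actual discounts and cloud rates will differ):

# On-premise sizing for the peak: 8 cores x 0.5 core factor x $47,500/processor
echo $(( 8 * 47500 / 2 ))          # 190000 -> $190K license cost
# Hybrid sizing: license 2 cores on premise, burst 8 cores in the cloud for 1 month
echo $(( 2 * 47500 / 2 ))          # 47500  -> $47.5K license cost
echo $(( 8 * 3000 * 1 ))           # 24000  -> $24K cloud burst at $3K/core/month
# Annual support (22%) on the two license footprints
echo $(( 190000 * 22 / 100 ))      # 41800  -> $41.8K/year
echo $(( 47500 * 22 / 100 ))       # 10450  -> ~$10.5K/year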

Now that we have looked at one of our use cases and the cost savings associated with using the cloud for peak utilization and reducing the cost of servers and software in our data center, let’s dive into the pricing and configuration of Database as a Service (DBaaS) offered by Oracle in the public cloud. If we click on the Platform -> Database menu we see the following page.

If we scroll down to the bottom we see that there are effectively three services that we can use in the public cloud. The first is Database Schema as a Service. This allows you to access a database through a web interface and write programs to read and present data to the users. This is the traditional Application Express (APEX) interface that was introduced back in Oracle 9. This is a shared service where you are given a database instance that is shared with other users. The second service is Database as a Service. This is the 11g or 12c database installed on a Linux server in the cloud. This is a full installation of the database with ssh access to the operating system and sqlplus access to the database from a client system. The third service is Exadata as a Service. This is the Oracle database running on dedicated hardware that is optimized for the database.

The Schema as a Service is also known as Application Express. If you have never played with apex.oracle.com, click on the link and register for a free account. You can create an instance, a database schema, and store up to 10 MB or 25 MB of data for free. If you want to purchase a larger storage amount, it is sold in 5 GB, 20 GB, or 50 GB increments.





The 10 or 25 MB instance is free. The 5 GB instance is $175/month. The 20 GB is $900/month, and the 50 GB is $2,000/month.

Tomorrow we will dive a little deeper into Schema as a Service. In summary, this is a database instance that can contain multiple tables and has an application development and web front end allowing you to access the database. You cannot attach with sqlplus. You cannot attach over port 1521. You cannot put a Java or PHP front end in front of your database and use it as a back end repository. You can expose database data through applications and REST api interfaces. This instance is shared on a single computer with other instances. You can have multiple instances on the same computer and your login gives you access to your applications and your data in your instance.
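
As a rough illustration of the REST side, the sketch below assumes you have already defined a RESTful service in your APEX workspace; the workspace name, module path, and table are made up for the example and the exact URL pattern should be checked against your own workspace definitions:

# Hypothetical RESTful service exposing an employees table from a workspace named myworkspace
curl https://apex.oracle.com/pls/apex/myworkspace/hr/employees/
# Returns the rows as JSON that a mobile app or web page can consume without sqlplus access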

The Database as a Service (DBaaS) is slightly different. With this you are getting a Linux instance that has been provisioned with a database. It is a fully deployed, fully provisioned database based on your selection criteria. There are many options when you provision DBaaS. Some of the options are virtual vs full instance, 11g vs 12c, and Standard Edition vs Enterprise Edition vs Enterprise Edition High Performance vs Enterprise Edition Extreme Performance. You need to provide an expected data size, whether you plan on backing up the data, and a cloud object repository if you do. You need to provide ssh keys to log in as oracle or opc/root to manage the database and operating system. You also need to pick a password for the sys/system user inside the database. Finally, you need to pick the processor and memory shape that will run the database. All of these options have a pricing impact. All of these options affect functionality. It is important to know what each of these options means.

Let’s dive into some of these options. First, virtual vs full instance. If you pick a full instance you will get an Oracle Enterprise Linux installation that has the version of the database that you requested fully installed and operational. For standard installations the storage is managed by the logical volume manager and provisioned across four file systems. The /u01 file system is the ORACLE_HOME. This is where the database binaries are installed. The /u02 file system is the +DATA area. This is where table extents and table data are located. The /u03 file system is the +FRA area. This is where backups are dropped by RMAN, which runs automatically every night for incremental backups and at 2 am on Sunday morning for a full backup. You can change the times and backup configurations with command line options. The /u04 area is the +RECO area. This is where change logs and other log files are dropped. If you are using Data Guard to replicate data to another database or from another database, this is where the change logs are found.
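
The scheduling and retention are owned by the cloud tooling, but conceptually the automated jobs boil down to RMAN commands along these lines (a sketch only; the actual scripts and schedule are maintained by the service):

# Weekly full (level 0) backup into the +FRA area, run as the oracle user
rman target / <<EOF
BACKUP INCREMENTAL LEVEL 0 DATABASE PLUS ARCHIVELOG;
EOF

# Nightly incremental (level 1) backup
rman target / <<EOF
BACKUP INCREMENTAL LEVEL 1 DATABASE PLUS ARCHIVELOG;
EOF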

If you pick a virtual instance you basically get a root file system running Oracle Enterprise Linux with a tar ball that contains the Oracle database software. You can mount file systems as desired and install the database as you have it installed in your data center. This configuration is intended to mirror what you have on-premise to test patches and new features. If you put everything into /u01, then install everything that way. If you put everything in the root file system, you have the freedom to do so even though this is not the recommended best practice.

The question that you are not asked when you try to create a DBaaS instance is whether the service is metered or non-metered. That question is asked when you create your identity domain. If you request a metered service, you have the flexibility to select the shapes that you want and whether you are billed hourly or monthly. The rates are determined by the processor shape, the amount of memory, and which database option you select (standard, enterprise, high performance, or extreme performance). More on that later. With the metered option you are free to stop the database (but not delete it) and retain your data. You suspend the consumption of the database license but not the compute and storage. This is a good way of saving a configuration for later testing without getting charged for using it. Think of it as having an Uber driver sit outside the store but not charge you to sit there. When you get back in the car the charge starts. A better analogy would be Cars2Go. You can reserve a car for a few hours and drive it from Houston to Austin. You park the car in the Cars2Go parking slot next to the convention center and don’t pay for parking. You come out at the end of your conference, swipe your credit card, and drive the car back to Houston. You only get charged for the car when it is between parking lots. You don’t get charged for it while it is parked in the reserved slot. You pay a monthly charge for the service (think of compute and storage) at a much lower rate.

If you think of a non-metered service as renting a car from a car rental place, you pay for the car that they give you and it is yours until you return it to the car rental place. You can’t stop paying for the car while you are at your convention as you can with Cars2Go. You have to pay for parking at the hotel or convention center. You can’t decide half way into your trip that you really need a truck instead of a car, or a mini-van to hold more people, and change out cars. The rental company will end your current agreement and start a new one with the new vehicle. Non-metered services are similar. If you select an OC3M shape then you can’t upgrade it to an OC5 to get more cores. You can’t decide that you need diagnostics and tuning and upgrade from Enterprise Edition to Enterprise Edition High Performance. You get what you started with and have 12 months to consume the services reserved for you.

The choice of 11g or 12c is a relatively simple one. You get 11.2.0.4 running on Oracle Enterprise Linux 6.6 or you get 12.1.0.2 running on Oracle Enterprise Linux 6.6. This is one of those binary questions. You get 11g or 12c. It really does not affect any other question. It does affect features because 12c has more features available to it, but the choice itself is simple. Unfortunately, you can’t select 11.2.0.3 or 10.whatever or 9.whatever. You get the latest running version of the database and have an option to upgrade to the next release when it is available or not upgrade. Upgrades and patches are applied after you approve them.

The next choice is the type of database. We will dive into this deeper in a couple of days. The basic choice is that you pick Standard Edition or Enterprise Edition. You have the option of picking the base Enterprise Edition with encryption only, the High Performance option with most of the options, or the Extreme Performance option with all of the options. The difference between High Performance and Extreme Performance is that Extreme Performance includes Active Data Guard, the In-Memory option, and Real Application Clusters. Again, we will dive into this deeper in a later blog entry.

The final option is the configuration of the database. I wanted to include a screen shot here, but the main options that we look at are the CPU and memory shape, which dictates the database consumption cost, as well as the amount of storage for table space (/u02) and backup space (/u03 and /u04). There are additional charges above 128 GB for table storage and for backups.
We will not go into the other options on this screen in this blog entry.

In summary, DBaaS is charged on a metered or un-metered basis. The un-metered option is lower cost but less flexible. If you know exactly what you need and when you need it, this is the better option. Costs are fixed. Expenses are predictable. If you don’t know what you need, the metered service might be better. It gives you the option of starting and stopping different processor counts, shutting off the database to save money, and selecting different options to test out different features. We will do a blog in a few days analyzing the cost details. Basically, the database can be mentally budgeted at $3K/OCPU/month for Enterprise Edition, $4K/OCPU/month for High Performance, and $5K/OCPU/month for Extreme Performance. Metered options typically cross over at 21 days. If you use the metered service for more than 21 days your charges will exceed this amount. If you use it for less, it will cost less.
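
A quick sanity check on the 21 day crossover, using the $3K/OCPU/month Enterprise Edition figure above (the exact hourly rates are on the published price list; the numbers below are just the implied arithmetic):

# If 21 days of continuous hourly billing roughly equals one month of non-metered cost,
# the implied hourly rate for Enterprise Edition is about:
echo "scale=2; 3000 / (21 * 24)" | bc    # ~5.95 dollars per OCPU per hour
# Running that rate around the clock for a 30 day month costs more than the monthly figure:
echo "scale=0; 3000 * 30 / 21" | bc      # ~4285 dollars, versus $3000 budgeted non-metered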

The Exadata as a Service is a special use case of Database as a Service. In this service you are getting a quarter, half, or full rack of hardware that is running the database. You get dedicated hardware that is tuned and optimized to run the Oracle database. Storage is dedicated to your compute nodes and no one else can use these components. You get 16, 56, or 112 processors dedicated to your database. You can add additional processors to get more database power. This service is available in a metered or non-metered option. All of the database options are available with this product. All of the processors are clustered into one database and you can run one or many instances of a database on this hardware. With the 12c option you get multi-tenant features so that you can run multiple instances and manage them with the same management tools while giving users full access to their own instance but not to other instances running on the same hardware.

Exadata cost for metered services

Exadata cost for non-metered services

In summary, there are two options for database as a service. You can get a web based front end to a database and access all of your data through http and https calls. You can get a full database running on a Linux server or Linux cluster that is dedicated to you. You can consume these services on an hourly, monthly, or yearly basis. You can decide on less expensive or more expensive options as well as how much processor, memory, and storage you want to allocate to these services. Tomorrow, we will dive a little deeper into APEX or Schema as a Service and look at how it compares to services offered by Amazon and Azure.

Intro to PaaS

Today we are going to move up the stack. We will first focus on the Oracle solutions talking about the different platform as a service offerings. It is important to spend a little time reviewing this layer because what one company calls PaaS, another calls SaaS. The best way to get started is to go to cloud.oracle.com and look at the pull downs at the top of the screen. We see Infrastructure, Platform, and Applications.

When we pull down the Platform menu we see that there are different areas that we can dive into.

Data management is the first area that we will review. This is basically a way to aggregate and look at data. We can store data in a database, store on-premise databases in the cloud, store data in NoSQL repositories, and do analytics on a variety of data with the Big Data Preparation and Big Data services. All of these involve pulling data into a repository of some type and performing queries against the repository. The key difference is the way that the data is stored, how we can ask questions, and the results that we get back. At this point we will not dive into any of these deeply, but at a later point we will dive deep into the database and database backup.

The Application Development area moves farther away from the technology of storing data and closer to how we present data to users. The Java platform, for example, allows us to do things like create a shopping cart or host more complex applications in a Java repository or container. The Mobile Cloud Service allows us to dive into existing applications and present a user interface to iPhones, Android phones, and tablets. The idea is to customize existing web and fat clients into a mobile format that can be consumed on mobile devices. The Messaging Cloud Service is a messaging protocol that allows for transactions in the cloud. If you are looking at connecting different cloud services together, it allows you to serialize the communication between vendors for a true transactional experience. The Application Container Cloud is a lightweight Java container allowing you to upload and run Java applications but without access to the operating system. This is a shared multi-tenant version of a WebLogic server. The Developer Cloud Service is a DevOps integration for the Java and Database services. This service is an aggregation of public domain components used to develop microservices at the database or Java layer. The Application Builder Cloud Service is a cloud based REST api development interface allowing you to integrate with application software in the Oracle Cloud as well as other clouds. The API Catalog is a way of publishing the REST apis that you have and exposing them to your customers.

The Content and Process Cloud Services are an aggregation of services that address group communications as well as business process flow. The Documents Cloud Service is a way of sharing files on the web. The Process Cloud Service is an extension that allows you to launch business processes (think Business Process Manager or BPM) in the cloud. The Sites Cloud Service is a web portal interface that takes documents and processes and aggregates them into a single cloud site, allowing you to take a wiki-like presentation and put business processes into the presentation. The Social Network Cloud Service allows you to integrate social network services like Facebook and Twitter into your web presence. It allows you to integrate these services as well as search these repositories for information relating to your company.

The Business Analytics part of Platform services provides data visualization and analytic tools as well as data aggregation utilities. The Business Intelligence component is the traditional BI package that allows users to create custom queries into your database. The Big Data Preparation allows you to aggregate data from a variety of sources into a Big Data repository. The Big Data Discovery allows you to look at your data in a variety of ways and generate reports based on your data and views of data. The Data Visualization Cloud Service allows you to view and analyze your data from different perspectives. This is similar to the BI and Big Data but looks at data slightly differently. The Internet of Things Cloud Service allows you to aggregate monitoring and measuring devices into a repository.

The Cloud Integration part of Platform services covers the traditional data aggregation tools for other repositories. The Integration Cloud Service allows you to aggregate traditional SaaS vendors to unify fields like how a customer is defined or what data elements are incorporated into a purchase order. The SOA Cloud Service is an implementation of the Oracle SOA Suite in the cloud. The GoldenGate Cloud Service is an implementation of the Oracle GoldenGate software that allows you to take data from different databases and synchronize the different repositories independent of the database vendor. The Internet of Things Cloud Service is the same service listed in the Business Analytics section mentioned before.

The Cloud Management part of Platform services allows you to take the log files that you have inside your data center and analyze them for a variety of things. You can aggregate your log files into the Log Analytics Cloud Service to look for patterns, intrusion attempts, and problems or issues with services. The IT Analytics Cloud Service looks at log files for trends like disks filling up or processors being over- or under-utilized. The Application Performance Cloud Service looks at log files to show how applications are performing rather than how individual systems or components are working.

In summary, we looked at an overview of the Platform as a Service offerings from Oracle. Unfortunately, the variety of topics is too great for one blog entry. We did a high level overview of these services. In upcoming blogs we will dive deeper into each of these services and look at not only what they are but how they work and how to provision them. We will also compare and contrast how these services compare to services offered by Amazon and Azure as we dive into each service.

storage cloud appliance in the cloud

Last week we focused on getting infrastructure as a service up and running. I wanted to move up the stack and talk about platform as a service but, unfortunately, I got distracted with yet another infrastructure problem. We were able to install the storage cloud appliance software in a virtual machine, but how do you install this in a compute cloud instance? This brings up two issues. First, how do you run a Linux 7 – 3.10 kernel in the Oracle Compute Cloud Service? Second, how do you connect to and manage this service, both from an admin perspective and as a client from another compute instance in the cloud service?

Let’s tackle the first problem. How do you spin up a Linux 7 – 3.10 kernel in the Oracle Compute Cloud Service? If we look at the compute instance creation we can see which images we can boot from.

There is no Linux 7 – 3.10 kernel, so we need to download and import an image that we can boot from. Fortunately, Oracle has published a good tutorial on importing a bootable image. If we follow these steps, we need to first download a CentOS 7 bootable image from cloud.centos.org. The cloud image that we use is CentOS-7-x86_64-OracleCloud.raw.tar.gz. We first download this to a local directory then upload it to the compute cloud image area. This is done by going to the compute console and clicking on the “Images” tab at the top of the screen.

We then upload the tar.gz file, which is a bootable image. This allows us to create a new storage volume that we can boot from. The upload takes a few minutes and once it is complete we need to associate it with a bootable image name. This is done by clicking on the “Associate Image” button where we basically enter a name to use for the operating system as well as a description.
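
If you want to script the download rather than use a browser, something like the following works (the exact path on cloud.centos.org may change over time, so verify it against the image index before running it):

# Pull the Oracle Cloud flavored CentOS 7 image to a local directory
wget https://cloud.centos.org/centos/7/images/CentOS-7-x86_64-OracleCloud.raw.tar.gz
# The tar.gz is then uploaded as-is through the Images tab in the compute console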


Note that the OS size is 9 GB, which is really small. We don’t have a compute instance at this point. We either need to create a bootable storage volume or a compute instance based on this image. We will go through the storage create first since this is the easiest way of getting started. We first have to change from the Images tab to the Storage tab. We click on Create Storage Volume and go through selecting the image, storage name, and size. We went with the default size from the image rather than resizing the storage volume we are creating.



At this point we should be able to create a compute instance based on this boot disk. We can clone the disk, boot from it, or mount it on another instance. We will go through and boot from this volume once it is created. We do this by going to the Instance tab and clicking on Create Instance. It does take 5-10 minutes to create the storage volume and we need to wait until it is complete before creating a compute instance. An example of the creation looks like this:



We select the default network, the CentOS7 storage that we previously created, the 2016 ssh keys that we uploaded, and review and launch the instance.




After about 15 minutes, we have a compute instance based on our CentOS 7 image. Up to this point, all we have done is create a bootable Linux 7 – 3.10 kernel. Once we have the kernel available we can focus on connecting and installing the cloud storage appliance software. This follows the making backup better blog post. There are a couple of things that are different. First, we connect as the user centos rather than oracle or opc. This is a function of the image that we downloaded and not a function of the compute cloud. Second, we need to create a second user that allows us to log in. When we use the centos user and install the oscsa_install.sh script, we can’t log in with our ssh keys for some reason. If we create a new user, then whatever stops us from logging in as the centos user does not stop us from logging in as oracle, for example. The third thing that we need to focus on is creating a tunnel from our local desktop to the cloud instance. This is done with ssh or putty. What we are looking for is routing the management port for the storage appliance. It is easier to create a tunnel than to change the management port and open up the port through the cloud firewall.





From here we execute the commands we described in the making backup better blog. We won’t go through the screen shots on this since we have done this already. One thing is missing from the screenshots: you need to disable SELinux by editing /etc/sysconfig/selinux and rebooting. Make sure that you add a second user before rebooting, otherwise you will get locked out and the ssh keys won’t work once this change is made.
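
A minimal sketch of the SELinux change (on CentOS 7, /etc/sysconfig/selinux is a symlink to /etc/selinux/config, hence the --follow-symlinks flag):

# Switch SELinux from enforcing to disabled, then reboot for it to take effect
sudo sed -i --follow-symlinks 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/sysconfig/selinux
# Add the second user (next step) BEFORE rebooting or you may lock yourself out
sudo reboot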

The additional steps that we need to do are to create a user, copy the authorized_keys from an existing user into the new .ssh directory, change the ownership, assign a password to the new user, and add the user to /etc/sudoers.

# create the new user (creates /home/oracle)
useradd oracle
# copy the ssh key that already works for the centos user
mkdir ~oracle/.ssh
cp ~centos/.ssh/authorized_keys ~oracle/.ssh
chmod 700 ~oracle/.ssh
chmod 600 ~oracle/.ssh/authorized_keys
# give the new user ownership of its home directory and key
chown -R oracle:oracle ~oracle
# set a password and grant sudo rights
passwd oracle
vi /etc/sudoers

The second major step is to create an ssh tunnel that allows you to connect from your local desktop into the cloud compute service. When you create the oscsa instance it starts up a management console on port 32769. To tunnel this port we use putty to connect.
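
If you prefer plain ssh to putty, the equivalent tunnel looks like this (the key file name and public IP are placeholders for your own values):

# Forward local port 32769 to the appliance management console on the cloud instance
ssh -i ~/.ssh/2016_rsa -L 32769:localhost:32769 oracle@<public-ip-of-instance>
# Then point a browser on your desktop at the local end of the tunnel, e.g. localhost:32769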





At this point we should be able to spin up other compute instances and mount this file system internally using the command

mount -t nfs -o vers=4,port=32770 e53479.compute-metcsgse00028.oraclecloud.internal:/ /local_mount_point

We might want to use the internal IP address rather than the external DNS name. In our example this would be the private IP address of 10.196.89.62. We should be able to mount this file system and clone other instances to leverage the object storage in the cloud.
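
For example, using the private IP address from above instead of the internal DNS name:

# Mount the appliance share from another compute instance on the same cloud network
sudo mkdir -p /local_mount_point
sudo mount -t nfs -o vers=4,port=32770 10.196.89.62:/ /local_mount_point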

In summary, we did two things in this blog. First, we uploaded a new operating system that was not part of the list of operating systems presented by default. We selected a CentOS instance that conforms to the requirements of the cloud storage appliance. Second, we configured the cloud storage appliance software on a newly created Linux 7 – 3.10 kernel and created a putty tunnel so that we can manage the directories that we create to share. This gives us the ability to share the object storage as an nfs mount internal to all of our compute servers. It allows for things like spinning up web servers or other static servers all sharing the same home directory or static pages. We can use these same processes and procedures to pull data from the Marketplace and configure more complex installations like JD Edwards, PeopleSoft, or E-Business Suite. We can import a pre-defined image, spin up a compute instance based on that image, and provision higher level functionality onto infrastructure as a service. Up next, platform as a service explained.

private cloud vs public cloud

Today is our last day to talk about infrastructure as a service. We are moving up the stack into platform as a service after this. The higher up the stack we get, the more value it has to the business and end users. It is interesting to talk about storage and compute in the cloud, but does this help us treat patients in a medical practice any better, find oil and gas faster, or deliver our manufactured product any cheaper? Not really, but not having them will negatively affect all of them. We need to make sure that these services are there so that we can perform higher functions without having to worry about triple redundant mirroring of a disk or load balancing compute servers to handle higher loads.

One of the biggest complaints with cloud services is that there are perception problems around security, latency, and governance. Why would I put my data on a computer that I don’t control? There is a noisy neighbor issue where I am renting part of a computer, part of a disk, part of a network. If someone wants to play heavy metal at the highest volume (remember our apartment example) while I am trying to file a monthly report or do some analytics, my resources will suffer and it will take me longer to finish my job due to something out of my control.

Many people have decided to go with private hosted cloud solutions like VCE, VBlock, Cisco UCS clusters, and other products that provide raw compute, raw storage, and “hyper-converged” infrastructure to solve the cloud problem. I can create a golden master on my VMWare server and provision a database to my configuration as many times as I want. Yes, I run into a licensing issue. Yes, it is a really big licensing issue running Oracle on VMWare but Microsoft is not far behind with SQL Server licensing on a per core basis either. Let’s look at the economics of putting together one of these private cloud servers.

It is important to dive into the licensing issue. The Oracle Database and WebLogic servers are licensed either on a processor or named user basis. Database licensing is detailed in a pdf on the Oracle site. The net of the license says that the database is licensed based on the core count of the processor running on the server. There is a multiplication factor (0.5 for x86) based on the chip type that factors into the license cost. A few years ago it was easy to do this calculation. If I have a dual core, dual socket system, this is a four core computer. The license price of the computer would be 4 cores x 0.5 (Intel x86 chip) x $47,500. The total price would be $95K. Suddenly the core count of computers went to 8, 16, or 32 cores per chip. A single system could easily have 64 cores on a single board. If you aggregate multiple boards as is done in a Cisco UCS system, you can have 8 boards or 256 cores that you can use. There are very few applications that can take advantage of 256 cores, so a virtualization engine was placed on top of these cores so that you could sub-divide the system into smaller chunks. If you have a 4 core database problem, you can allocate 4 cores to it. If you need 8 cores, allocate 8 cores. Products like VMWare and HyperV took advantage of this and grew rapidly. These virtualization packages added features like live migration, dynamic sizing, and bursting utilization. If you allocate 4 cores and the processor goes to 90%, two more cores will be made available for a short burst. The big question becomes how you now license on a per core basis. If you can flex to more processors without rebooting or live migrate from a 2 core to a 24 core system, which do you license for?
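
The per core math a license review will run looks like this (list prices from above, before any discounts):

# cores x 0.5 core factor x $47,500 list price per processor license
echo $(( 4 * 47500 / 2 ))      # 95000   -> $95K for the dual core, dual socket box
echo $(( 256 * 47500 / 2 ))    # 6080000 -> $6.08M if all 256 cores in the cluster must be licensed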

Unfortunately, Oracle took a different position from the rest of the industry. None of the Oracle products contain a license key. None of the products require that you go to a web site and get a token to allow you to run the software. The code is wide open and freely available to load and run on any system that you want. Unfortunately, companies don’t do a good job of tracking utilization. If someone from the sales or purchasing department rolls out a golden master onto a new virtual machine, no one is really tracking that. People outside of IT can easily spin up more licenses. They can provision a database in a cloud service and assume that the company has enough licenses to cover their project. After a while, licensing gets out of control and a license review is done to see what is actually being used and how it is being used. Named user licenses are great but you have to have a ratio of users to cores to meet minimums. You can’t, for example, buy a 5 user license and deploy it on a 64 core system. You have to maintain a typical ratio of 25 users to a core or 40 users to a core based on the product that you are using. You also need to make sure that you understand soft partitioning vs hard partitioning. Soft partitioning is the ability to flex or change the core count without having to reconfigure or reboot the system. A hard partition puts hard limits on the core count and does not allow you to exceed it. Products like OracleVM, Solaris, and AIX contain hard partition virtualization. Products like HyperV and VMWare contain soft partitions. With soft partitions, you need to pay for all of the cores in the cluster since in theory you can consume all of the cores. To be honest, most people don’t understand this and get in trouble with license reviews.

When we talk about cloud services, licensing is also important to understand. Oracle published cloud license rules to detail the limits and restrictions. The database is still licensed on a per core basis. The Linux operating system is licensed per server instance and is limited to 8 virtual cores. If you deploy the Oracle database or WebLogic server in AWS or Azure or any other cloud vendor, you have to own a perpetual license for the database using the formulas above. The license must correlate to the high water mark for the core count that you provision. If you provision a 4 core system, you need a 2 processor license. If you run the database for six months and shut it off, you still need to own the perpetual license. The only way to work around this is to purchase database as a service in the Oracle cloud. You can pay for the database license on an hourly or monthly basis with metered services or on an annual basis with non-metered services. This provides a great cost savings because if we only need a database for 6 months we only need to pay for 6 months x the number of cores x the database edition rate. If, for example, we want just the Database Enterprise Edition, it is $3K/core/month. If we want 4 cores, that is $12K per month. If we want 6 months, then we get it for $72K. We can walk away from the license and not have to pay the 22% annual maintenance on the $95K. We save $23K the first year and about $20K annually by only using the database in the cloud for six months. If we wanted to use the database for 9 months, it is cheaper to own the license and lease processor and storage. If we go to the next higher level of database, Database High Performance Edition at $4K/core/month, it becomes cheaper to use the cloud service because it contains so many options that cost $15K/processor. Features like partitioning, compression, diagnostics, tuning, real application testing, and encryption are part of this cloud service. Suddenly the economics favor hosting a database in the cloud rather than in a data center.
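
Putting the cloud-versus-perpetual numbers above side by side (list prices and the $3K/OCPU/month Enterprise Edition rate quoted in this post):

# Perpetual license for a 4 core (2 processor) x86 system, plus 22% annual support
echo $(( 4 * 47500 / 2 ))             # 95000  -> $95K up front
echo $(( 95000 * 22 / 100 ))          # 20900  -> ~$21K support per year

# Leasing the same 4 cores as Enterprise Edition DBaaS at $3K/core/month
echo $(( 4 * 3000 * 6 ))              # 72000  -> $72K for 6 months, then walk away
echo $(( 4 * 3000 * 9 ))              # 108000 -> $108K for 9 months, where owning becomes cheaper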

Let’s go back to the Cisco UCS and network attached storage discussion. If we purchase a UCS 8 blade server with 32 cores per blade, we are looking at $150K or higher for the server. We then want to attach a 100 TB disk array to it at about $300K (remember the $3K/TB). We will then have to pay $300K for VMWare. If we add 10% support for hardware and 20% for software, we are at just over $1M for a base solution. With a three year amortization we are looking at about $330K per year just to have a compute and storage infrastructure. We have to have a VMWare admin who doles out processors and storage, loads operating systems, creates golden masters, and acts as a traffic cop to manage and allocate resources. We still need to pay for the Oracle database license, which is licensed on a per core basis. Unfortunately, with VMWare we must license all of the cores in the cluster, so we either have to sub-divide the cluster into one blade and license all 32 cores or end up paying for all 256 cores. At roughly $25K/core that gets expensive quickly. Yes, you can run OracleVM or Solaris on one of the blades and subdivide the database into two cores and only pay for two cores since they both support hard partitioning, but you would be amazed at how many people fight this solution. You now have two virtualization engines that you need to support with two different file formats. No one en masse wants two solutions just to solve a licensing issue.

Oracle has taken a radically different approach to this. Rather than purchasing hardware, storage, and a virtualization platform, run everything in the cloud and pay for it on a monthly basis. The biggest objection is that the cloud is in another city and security, latency, … you get the picture. The new solution is to run this hardware in your data center with the Oracle Public Cloud Machine. The cost of this solution is roughly $260K/year with a three year commit. You get 200-plus cores and roughly 100 TB of storage to use as you want. You don’t manage it with vSphere but with the same web page that you use to manage the public cloud services. If you want to script everything, you can manage it with REST apis or perl/java/insert-your-language scripts. The key benefit to this is that you no longer need to worry about which virtualization engine you are using. You manage at the higher level and lease the database or WebLogic or SOA license on an hourly or monthly basis.

Next week we will move up the stack and look at database hosting. Today we talked about infrastructure choices and how they impact database license cost. Going with AWS or Azure still requires that you purchase the database license. Going with the Oracle public cloud or public cloud machine allows you to not own the database license but effectively lease it on an hourly or monthly basis. It might not be the right solution for 7x24x365 operation, but it might be. It really is the right solution for bursty needs like holiday peak periods, student registration systems, and development and testing, where you only need a large footprint for a few weeks and don’t need to buy for your high-water mark and run at 20% the rest of the year.

Archive Storage Services vs Archive Cloud Services

Yesterday we started talking about the cost comparison for storage in the cloud. We briefly touched on the cost of long term archive in the cloud. How much does it cost to back up data for long term archive and what is the best way to do this? Years ago the default way of doing this was to copy your data on disk to a tape unit and put the tape in a box. The box was then put in an environmentally controlled room to extend the lifetime of the tape and a person was put on staff to pull the data off the shelf when the data was needed. The data might be a backup of data on disk or a secondary copy just in case the disk failed. Tape was typically used to provide the separation of duties required by Sarbanes-Oxley to keep people who report on financial data separate from the financial data. It also allowed companies to take large volumes of data, like seismic data, and not keep it on spinning disks. The traces were reloaded when geophysicists wanted to look at the data.

The first innovation in this technology was to provide a robot to load and unload tapes as a tape unit gets full or needs to be reloaded. Magazines were created that could hold eight tapes and the robots had bar code readers so that they could seek to the right tape in the magazine, pull it out of the series of tapes, and insert it into the tape unit for reading or writing. Management software got more advanced and understood the bar code values and could sequence the whopping 800 GB of data that could be written to an LTO4 tape. Again, technology got updated and the industry moved to LTO5 and LTO6 tapes with significantly higher densities. A single LTO6 tape could hold 2.5 TB. Technology marches on and compression allows us to store 6 TB on these tapes. If we go back to our 120 TB case that we talked about yesterday, this means that we will need 20 tapes (at $30-$45 for each tape) and $25K for a single tape drive unit. Most tape drive systems support 8 tapes per magazine, so we are talking about something that will support three magazines. To support three magazines, we need a second shelf in our tape storage so the price goes up by about $20K. We are sitting at about $55K to back up our 120 TB and $5.5K in annual support for the hardware. We also need about $1K in tapes for each set of full and incremental backups that we want, which works out to about $20K for four months of retention before we recycle the tapes. These tapes are good for a dozen re-writes, so every three years we will need to repurchase tapes. If we spread the cost of the tape unit, tape drives, and tapes across three years we are looking at about $2K/month to back up our 120 TB. We also need to factor in $60/week for tape pickup and storage fees at a service like Iron Mountain and a couple of $250 charges to retrieve tapes in the event of a catastrophic failure to drive tapes back to our data center from cold storage. This bumps the cost to $2.2K/month, which is significantly cheaper than the $10K/month for network storage in our data center or $3.6K/month for cloud storage services. Unfortunately, a tape unit requires someone to care for and feed it, and you will pay that person more than $600/month, though not the $7.8K/month difference between tape and the disk or cloud solutions.

If you had a ton of data to archive you could purchase a tape silo that supported hundreds or thousands of magazines. Unfortunately, this expandability comes at a cost. The tape backup unit grows from an eighth of a rack to twenty full racks. There isn’t much in between. You can get an eighth of a rack solution, a full rack solution, or a twenty full rack solution. The larger solution comes in at hundreds of thousands of dollars rather than tens of thousands.

Enter cloud solutions. Amazon and Oracle offer tape solutions in the cloud. Both companies run the twenty full rack solution but only charge consumers on a per terabyte basis. Amazon Glacier charges $7/TB/month to store data. Oracle charges $1/TB/month for the same service. Both companies charge for data restoration and outbound transfer of data. The Amazon Glacier cost of writing 120 TB and reading back 10% of it comes in at $2218/month. This is about the same cost as having the tape unit on site. The key difference is that we can recover the data by requesting it from Amazon and get it back in less than four hours. There are no emergency recovery charges. There are no weekly pickup charges. We can expand the amount that we back up, and the bulk of this cost is reading back the data ($1300). Storage is relatively cheap for our backups; we just need to plan on the cost of recovery and try to limit it since it is the bulk of the cost.

We can drop this cost even more using the Oracle Archive Cloud Service. The price from Oracle is $1/TB/month, but the recovery and transmission charges are about the same. The same archive with Oracle is $1560/month, with roughly $1300 of that being the charges for restoring and outbound transfer of the data. Unfortunately, Oracle does not offer an un-metered archive service, so we have to guesstimate how much we are going to restore on a monthly basis.

Both services use REST apis to write, restore, and read data. When a container (Oracle Archive) or vault (Amazon Glacier) is created, a PUT call is made to the endpoint of the service. The first step required by both is authentication to provide credentials to the service. Below we show the Oracle authentication and creation process through the REST api.


The important part of this is the archive header extension. This differentiates whether the container is spinning disk or tape in the cloud.

Amazon recommends using a Windows-based tool like S3 Browser or CloudBerry, or using a language like Java, .NET, or Ruby with their published SDKs. CloudBerry works for the Oracle Archive as well. When you create a container you have the option of selecting storage or archive as the container type.

Both services allow you to encrypt and compress the data as it is written, with HTTP headers changing the characteristics and parameters of the container. Both services require you to issue a PUT request to write the data to tape. Below we show the Oracle REST api.
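
Roughly, the Oracle side of this looks like the sketch below. The identity domain, user, container, and file names are placeholders, and the archive class header is shown as I recall it from the documentation of that era, so double-check the exact header names against the current service documentation:

# 1. Authenticate and capture the X-Auth-Token and X-Storage-Url response headers
curl -sI https://myidentitydomain.storage.oraclecloud.com/auth/v1.0 \
     -H "X-Storage-User: Storage-myidentitydomain:cloud.admin" \
     -H "X-Storage-Pass: mypassword"

# 2. Create a container with the archive class header so it lands on "tape in the cloud"
curl -X PUT "$STORAGE_URL/myarchive" \
     -H "X-Auth-Token: $AUTH_TOKEN" \
     -H "X-Storage-Class: Archive"

# 3. PUT an object (a backup piece, a tar file, etc.) into the archive container
curl -X PUT -T backup_piece_001.tar.gz "$STORAGE_URL/myarchive/backup_piece_001.tar.gz" \
     -H "X-Auth-Token: $AUTH_TOKEN"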

For CloudBerry and the other gui based tools, uploading is just a drag and drop from your local file system to the tape storage in the cloud.

Amazon details the readback procedure and the job system that shows the status of the restore request. Oracle has a similarly defined retrieval policy as well as an archive tutorial. Both services quote roughly a four hour window for restoration. Below is an example of a restore request and a check on the status of the job spawned to load the tape and transfer the data for reading. The file is ready to read when the completedPercentage is 100.


We can do the same thing with the S3 browser and Amazon Glacier. We need to request the restore, check the job status, then download the restored files. The files change color when they are ready to read.



In summary, we have looked at how to reduce the cost of archives and backups. We looked at using a secondary disk at our data center or another data center. We looked at using on site tape units. We looked at disk in the cloud. Today we looked at tape in the cloud. It is important to remember that no one of these solutions is the answer; a combination of any or all of them is needed. Daily and weekly backups should happen to a secondary disk locally. This data is the most likely to be restored on a regular basis. Once you get a full backup or two under your belt, move the data to another site. It might be spinning disk, it might be tape, but something needs to be offsite in the event of a true catastrophic failure like a communication link going out (think Dell PowerVault and a thunderstorm) where you lose your primary LUN and the secondary LUN that contains your backups. The whole idea of offsite backups is not restore but primarily insurance and regulatory compliance. If someone needs to see the old data, it is there. You are betting that you won’t need to read it back and the cloud vendors are counting on that. If you do read it back on a regular basis, you might want to significantly increase your budget, pass the charges on to the people who want to read the data back, or look for another solution. Tape storage in the cloud is a great way of archiving data for a long time at a low cost.

metered vs un-metered vs dedicated services

One of the newest concepts that has been introduced for cloud services is the concept of un-metered or dedicated services. Before we dive into this subject, let’s review what a cloud service really is. When you boil it all down, you are basically leasing computer resources on a computer that you don’t own. You are taking a slice of a compute engine, slice of a disk drive, part of a network connection. You are renting space. Think of it as living in an apartment. Yes, this is a silly analogy but if you think about it, it makes sense. You can rent an efficiency, one, two, or three bedroom apartment. You can get parking with or without a roof over your car. You can get a storage closet or a garage to store stuff that you are not using but want to keep around. There are benefits to apartments. You don’t have to cut the grass. You typically have access to a pool but don’t need to maintain it. If the toilet backs up or the gas stops working you call the super and they come fix it. You still have to replace your own light bulbs that burn out. You still need to clean your own bathroom and kitchen and take out your trash on a regular basis. On the grander scale, you don’t need to drop 10% down and get a mortgage to live there. Monthly rents are typically cheaper than paying down a mortgage. Your taxes are bundled into your rent cost. You basically show up, use the apartment and go on with your life. On the flip side, there are drawbacks to apartments. If your upstairs neighbor likes to play heavy metal at 2am or throw wild parties on the weekends it does make it hard to sleep. Someone might park in your parking spot so you need to park farther away from your front door. You can’t pull into your garage to unload your groceries and have to potentially carry them in the rain across the parking lot and up the stairs to your third floor apartment. The super might decide that Tuesday they are going to repaint all bathrooms another color and you need to be out of the way for a day and put up with the smell even though you planned a dinner party the next night. It is difficult to grill on your balcony and you can’t really sit out without sharing the space with all of your neighbors. The true downfall is that twenty years from now, you will still be renting your apartment (and the rent probably went up every other year) while your college buddies are celebrating a mortgage burning party and the only thing that they owe on a monthly basis is the taxes that the government takes annually.

Yes, our analogy is silly. Yes, our analogy is relevant. It is easy to decide that you want another job in another city, so you hire a mover, pack up all your stuff, and move to another apartment. This is where our analogy breaks down. Cloud vendors charge you for every piece of furniture that you take out of the building. They charge you to use the stairs or elevator. They charge you every time a moving van exits the building full of furniture and boxes of clothes. It is free to bring stuff in because it locks you into the apartment. Just don’t try to take anything out. Remember that storage closet or garage that you got with your apartment? You can open the door and put stuff in for free, but if you carry anything out (even if you just relocate it to your apartment) you get charged per item that you carry across the threshold.

If you look at storage from any cloud vendor, they offer a metered storage service. The same is true for compute services. You can lease a virtual processor and memory and grind on data all that you want. The catch is that when you want to transfer your files or the results of your analysis to your desktop computer, you get charged per gigabyte transferred across the internet. Cost calculators help you estimate these costs, but outbound data charges are a little hard to predict. Amazon, for example, has a calculator that you can use. The AWS pricing calculator allows you to look at the cost of all cloud services.

Let’s walk through the cost of Amazon Glacier. The price list says that you should pay $0.007/GB/month or $7/TB/month to keep things in cold storage. We will use 120 TB as our basis for analysis. We put this as the amount to store and see the cost of storing the data is $860/month.

If we plan on reading back 10% of this data during the month, the price goes up to $2217. The bulk of these charges are the outbound charges. The cost goes to $921 if we read the data to an EC2 instance and not all the way back to our data center or desktop computer. To use our apartment analogy, you are paying $860 to get a storage garage. You pay $61 every time you take something out and move it to your apartment. You can put all you want into the storage area (as long as you don’t exceed the space of the storage unit), but taking something out will cost you. If you put your recently retrieved item in your car or a truck and drive it out the gate, you get a surcharge of $1300. It is important to remember that pulling more stuff out of your storage will cost you more. Putting this in terms of computer archives, you can store all of your emails, contracts, customer transactions, and patient records in long term storage. If your on-site storage fails for some reason or if you get a legal request to review five years of data, you can pull the data back from cold storage. It will cost you to pull the data back, but it is still cheaper than keeping the seven years or longer of data on spinning disk in your data center (estimate $3K/TB plus 10% per year for spinning disk in your data center).

We can do the same calculation for cloud storage using S3. We can store 120 TB for roughly $3950/month. If we want to read back 10%, or 12 TB, of that data, it will cost us $5150 or $1200 additional.


We can reduce the cost by using lower speed storage in the cloud. We put the S3 data into the infrequent access category to save money. This drops the cost to just over $3K, which does save us about $2100/month. We agree to pay a lower cost in exchange for higher latency and longer retrieval times. It is better than using tape in the cloud and we can save some money with this option.

We can opt for reduced redundancy storage (non-mirrored, non-replicated data), but we risk data loss since we will have fewer copies in the cloud. This drops the cost to $4300 with the data retrieval, but we have to weigh the cost savings against the data loss risk.

Let’s not pick on Amazon. How does this compare to Azure? Unfortunately, we can’t start with Microsoft tape in the cloud; they don’t offer the service. We must start with blob storage in the Azure cloud. Microsoft has an Azure pricing calculator that you can use to perform the same calculations. The calculator and pricing are a little difficult to use when you first get into them. You basically need to put together the calculation a piece at a time. You need to factor in the cost of the storage and the cost of transferring the data from the cloud to your data center. This is done in two different pieces. An example of what we are looking for can be seen below.

We need to piece together the calculator. First we add the storage component then the bandwidth component. There is a transaction component but this amount is trivial and we are going to ignore it for simplicity.



If we look at the options for Azure storage, we can basically select blob storage in different zones. In the grand scheme of things, the cost is not significantly higher one way or another. The basic cost is about the same.

The third class of storage that we are going to look at is the Oracle Storage Cloud Service. We can look at the Oracle Storage Cloud Service as well as the Oracle Archive Cloud Service. The Archive service compares directly to the Amazon Glacier service except that it is $1/TB/month and carries the same kind of transfer charges for outbound data. The Oracle Storage Cloud Service is similar to the Amazon S3 and Azure Blob Storage services, but it is offered either as a metered service (as are S3 and Blob storage) or as an un-metered service. Unfortunately, Oracle does not provide a cost calculator for general use. The Value Added Distributors are given a copy of the calculator, but it is not generally available. The key difference with the Oracle storage services is that they come in two significant flavors: metered and non-metered. The metered services are charged just like the Microsoft and Amazon services. You pay for what you use on a per GB basis and pay for outbound data transfer. An example of the pricing calculator is shown below. Note that we do need a good guesstimate of how much data we will transfer outbound across the internet. These charges are not incurred if you are reading the data into a compute instance in the Oracle cloud, unlike S3, which still charges just for reading the data off the disk.

The most significant differentiator in the storage offerings is the non-metered storage. Oracle offers storage in blocks that you reserve and allocate for 12 months. This is different from the metered storage in that metered storage can start at 10 TB and grow to 120 TB over the year. With the non-metered storage, you start with 120 TB and end with 120 TB. You can extend your contract and grow storage, but you basically sign a new contract for the additional storage. You cannot shrink your storage and pay for less. The benefit of this is that you don’t have to pay for outbound data transfer. You can read and write as much as you want and not get charged for transferring the data across the internet. A pricing calculator for this is simple: how much do you need and how long do you need it?

If we piece all of this together and look at a price comparison between the three service providers, the answer to which is cheapest comes down to "it depends." Oracle non-metered storage has a significant advantage if you are planning on reading back your data at high or unpredictable rates. Amazon S3 infrequent access is the cheapest if you don’t plan on reading back your data and want it as an insurance policy only. I honestly would go with Glacier or Oracle Archive if that is the case, since they are an order of magnitude cheaper. The chart below compares 120 TB of storage and the variable charge for reading back this data on a monthly basis. If you have 120 TB of storage and plan on reading back 120 TB on a monthly basis, the Oracle non-metered storage is significantly cheaper. If you are only planning on reading back 12 to 24 TB per month, the cost is about the same for all of the services.

In summary, one option is not clearly better than the others (except for high read rates) and this blog is intended to help you decide what fits your needs best. Pricing calculators can help with the cost based on transfer rates. It is important to remember that outbound transfer is a significant part of the calculation. It is also important to look at your usage model. We assumed that you started with 120 TB and ended with 120 TB for our analysis. If you start with 12 TB and grow to 120 TB, the pricing calculation will be a little different. Neither the Amazon nor Azure calculators will help you run this simulation and you will have to calculate everything on a month by month basis. It is also interesting to take 120 TB of on-premise storage and assume that each TB can be purchased at $3K/TB. If we assume 10% annual hardware maintenance and a three year amortization, the charge for on-premise storage works out to roughly $13K/month, which in this example is more than the cloud based storage options. Your results might vary.

making backup better

Yesterday we looked at backing up our production databases to cloud storage. One of the main motivations behind doing this was cost. We were able to reduce the cost of storage from $3K/TB capex plus $300/TB/year opex to $400/TB/year opex. This is a great solution but some customers complain that it is not generic enough and latency to cloud storage is not that great. Today we are going to address both of these issues with the cloud storage appliance. First, let’s address both of the typical customer complaints.

The database backup cloud service is just that. It backs up a database. It does it really well and it does it efficiently. You replace one of the backup library modules so that writes of backup data go to the cloud REST API rather than to a tape driver. The software works well with commercial products like Symantec or Legato and integrates well into those solutions. Unfortunately, the critics are right. The database backup cloud service does that and only that. It backs up Oracle databases. It does not back up MySQL, SQL Server, DB2, or other databases. It is a single use tool. A very useful single use tool, but a single use tool. We need to figure out how to make it more generic and back up more than just databases. It would be nice if we could have it back up home directories, email servers, virtual machines, and other stuff that is used in the data center.

The second complaint is latency. If we are writing to an SSD or spinning disk attached to a server via high speed SCSI, iSCSI, or SAS, we should expect 10ms access time or less. If we are writing to a server halfway across the country we might experience 80ms latency. This means that a typical read or write takes eight times longer when we read and write cloud storage. For some applications this is not an issue. For others this latency makes the system unusable. We need to figure out how to read and write at 10ms latency but leverage the expandability and lower cost of cloud storage.

Enter stage left the Oracle Cloud Storage Appliance. The appliance is a software component that listens on the data center network using the NFS protocol and talks to the cloud services using the storage REST API. Local disks are used as a cache front end to store data that is written to and read from the network shares exposed by the appliance. These directories map directly to containers in the Oracle Storage Cloud Service and can be stored as clear text or encrypted. Data written from network servers is accepted and acknowledged quickly because it is written to local disk, then slowly trickled out to the cloud storage. As the cache fills up, data is aged and migrated from the cache storage into cloud storage. The metadata representing the directory structure and storage location is updated to show that the data is no longer stored locally but stored in the cloud. If a read occurs from the file system, the metadata helps the appliance locate where the data is stored; it is presented to the network client from the cache, or pulled from the cloud storage and temporarily stored in the local cache as long as there is space. A block diagram of this architecture is shown below.

The concept of how to use this device is simple. We create a container in our cloud storage and we attach to it with the cloud storage appliance. This attachment is exposed via an NFS mount to clients on our corporate network, and any client with access can read or write files in the cloud storage. Operations happen at local disk speed using the network security of the local network and group/owner rights in the directory structure. It looks, smells, and feels just like NFS storage that we would spend thousands of dollars per TB to own and operate.
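As a rough sketch, mounting the share from a Linux client looks something like the following; the appliance hostname and export path are made up for illustration and will match whatever you configure on the appliance.

    # Mount the appliance share like any other NFS export; the hostname and
    # export path below are placeholders for your own configuration.
    sudo mkdir -p /mnt/cloudshare
    sudo mount -t nfs oscsa.example.com:/oscsa/backups /mnt/cloudshare

    # Writes land on the local cache disk first and trickle out to the
    # cloud container in the background.
    cp /var/backups/nightly.dmp /mnt/cloudshare/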

For the rest of this blog we are going to go through the installation steps on how to configure the appliance. The minimum requirements for the appliance are

  • Linux 7 (3.10 kernel or later)
  • Docker 1.6.1 or later
  • two dual core x86 CPUs
  • 4 GB of RAM

We will be installing our configuration on a Windows desktop running VirtualBox. We will not go through the installation of Oracle Enterprise Linux 7 because we covered this a long time ago. We do need to configure the OS to have 4 GB of RAM and at least 2 virtual cores as shown in the images below.




We also need to configure networking. We configure two networks: one for the local desktop console and one for the public internet. We could configure a third interface to represent our storage network, but for simplicity we only configure two.


We can boot our Linux 7 system and will need to select the 3.10 kernel. By default it will boot the 3.8 kernel, which will cause problems with how the appliance works.

What we would like to do is remove the 3.8 kernel from our installation. This is done by removing the packages with the rpm -e command. We then update the grub.cfg file to list only the 3.10 kernels.
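The commands look roughly like this; the exact package names and version strings will differ on your installation, so list what is installed before removing anything.

    # List the installed kernels, then remove the 3.8 (UEK) packages.
    # The package names and versions shown here are examples only.
    rpm -qa | grep ^kernel
    sudo rpm -e kernel-uek-3.8.13-68.3.4.el7uek kernel-uek-firmware-3.8.13-68.3.4.el7uek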


Once we have removed the old kernel packages, we update the grub loader and enable the additional yum repositories needed for the updates that follow.
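Assuming Oracle Enterprise Linux 7, that step looks something like the sketch below; the ol7_addons repository name is my assumption about where the docker packages live, so adjust it for your own yum configuration.

    # Rebuild the boot menu so only the remaining 3.10 kernels are listed
    # and make the first entry the default.
    sudo grub2-mkconfig -o /boot/grub2/grub.cfg
    sudo grub2-set-default 0

    # Enable the add-ons channel (assumed to carry docker) and update.
    sudo yum install -y yum-utils
    sudo yum-config-manager --enable ol7_addons
    sudo yum update -y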



The next step that we need to take is to install docker. This is done with the yum install command.
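Something like the following should work; the package is called docker-engine in the Oracle Linux add-ons channel and plain docker in other repositories, so use whichever your repository provides.

    # Install docker, enable it at boot, and confirm the version.
    sudo yum install -y docker-engine
    sudo systemctl enable docker
    sudo systemctl start docker
    docker --version    # confirm we are at 1.6.1 or later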


Once we have the docker package installed, we need to make sure that we have the nfs-client and nfs-server installed and started.
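On Linux 7 the nfs-utils package provides both the client and server pieces, so a minimal sketch looks like this:

    # Install the NFS utilities and start the server side services.
    sudo yum install -y nfs-utils
    sudo systemctl enable rpcbind nfs-server
    sudo systemctl start rpcbind nfs-server
    sudo systemctl status nfs-server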



It is important to note that the tar bundle is not generally available. It does require product manager approval to get a copy of the software for installation. The file that I got was labeled oscsa-1.0.5.tar.gz. I had to unzip and untar this file after copying it to my Linux VirtualBox instance. I did not do a screen capture of the download but did go through the installation process.
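The unpack and install steps were along these lines; the installer script name is an assumption on my part, so check the README that ships inside the bundle.

    # Unpack the bundle and run the installer it contains.
    tar -xzvf oscsa-1.0.5.tar.gz
    cd oscsa-1.0.5
    sudo ./install.sh    # script name may differ; see the bundled README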





We start the service with the oscsa command. When we start it, it brings up a management web page so that we can make the connection to the cloud storage service. To see this page we need to start Firefox and connect to it.
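The sequence I used was roughly the following; the oscsa subcommands and the management port come from the 1.0.x bundle I tested and may differ in your release, so treat this as a sketch.

    # Start the appliance, ask it where its management UI lives, and open it.
    oscsa up
    oscsa info      # reports the management URL, e.g. https://localhost:<port>
    firefox https://localhost:<port> &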


One of the things that we need to know is the endpoint of our storage. We can find this by looking at the management console for our cloud services. If we click on the storage cloud service details link we can find it.


Once we have the endpoint, we enter it into the management console of the appliance along with our cloud credentials.


We can add encryption and a container name for our network share and start reading and writing.


We can verify that everything is working from our desktop by mounting the NFS share or by using CloudBerry to examine our cloud storage containers. In this example we use CloudBerry just like we did when we looked at the generic Oracle Storage Cloud Services.



We can examine the properties of the container and network share from the management console. We can look at activity and resources available for the caching.

In summary, we looked at a solution to two problems posed by our database backup solution. The first was that it is single purpose and the second was latency. By providing a network share to the data center we can back up not only our Oracle database but all of our databases by having the backup software write to the network share. We can also back up other files like virtual machines, email boxes, and home directories. Disk latency operates at the speed of the local disk rather than the speed of the cloud storage. This software does not cost anything additional and can be installed on any virtual platform that supports Linux 7 with kernel 3.10 or greater. When we compare this to the Amazon Storage Gateway, which requires twice the processing power and $125/month to operate, it looks significantly better. We did not compare it to the Azure solution because that is an iSCSI hardware solution and not easy to get a copy of for testing.

backing up a database to the cloud

Up to this point we have talked about the building blocks of the cloud. Today we are going to look into the real economics of using some of the cloud services that we have been examining. We have looked at moving compute and storage to the cloud. Let’s look at some of the reasons why someone would look at storage in the cloud.

Storage is one of those funny things that everyone asks for. Think of uses for storage. You save emails that come in every day. If you host your email system in your corporation, you have to consider how many emails someone can keep. You have to consider how long you keep files associated with email. At Oracle we have just over 100,000 employees and limit everyone to 2 GB for email. This means that we need 200 TB to store email. If we increase this to 20 GB per person it grows to 2 PB. At $3K/TB we are looking at $600K capex to handle email messages; if we grow this to 2 PB we are looking at $6M for storage. This is getting into real money. Associated with this storage is a 10% support cost ($60K opex annually) as well as part of a full time employee to replace defective disks, tune and feed the storage system, and allocate disks and partitions not only for our storage but for other projects, at a cost of $80K payroll annually. If we use a 4 year depreciation, our email boxes will cost us ($150K capex + $60K opex + $80K opex) $290K per year, or about $2.90/user annually just to store the email. If we expand the email limits to 20 GB, almost everything grows by a factor of 10 as well, so the email boxes cost us about $22/user annually (we don’t need 10x the storage admins). Pile on top of this home directories that people want to save attachments into and this number explodes. We typically do want to give everyone 20 GB for a home directory since this stores documents associated with operation of the company. We typically want people storing these documents on a network share and not on a disk on their laptop. Storing data on their laptop opens up security and data protection discussions as well as access to data if the laptop fails. Putting it on a shared home directory allows the company to backup the files as well as define protection mechanisms for sensitive data. We have basically justified roughly $25/user annually for email and home directories by allocating 22 GB to each employee.

The biggest problem is not user data, it is corporate data. Databases typically consume anywhere from 400 GB to 40 TB. There is a database for human resources, payroll, customer service, purchase orders, general ledger, inventory, transportation management, manufacturing… the list goes on. Backing up this data as it changes becomes an issue. Fortunately, programs like E-Business Suite, PeopleSoft, and JD Edwards aggregate all of these business functions into a small number of database instances so we don’t have tens or hundreds of databases to back up. Some companies do roll out multiple databases for each project to collect and store data, but these are typically done with low cost, low function databases like MySQL, Postgres, MongoDB, and SQL Server. Corporate data that large numbers of people use is typically in an Oracle database, DB2, or SQL Server. Backing up this data is critical to a corporation. Database backups are typically done nightly to make sure that you can recover from disk or server failures. Fortunately, you don’t need to back up all 400 GB every night; you can do incremental backups and copy only the data blocks that have changed since the previous night. Companies typically reserve late nights on the weekends for a full backup because users are typically not working and few if any people are hitting the database at 2am on Sunday morning. The database can be taken offline for a couple of hours to back up 400 GB, or a live backup can be taken with little risk since few if any people are on the system at this time. If you have a typical computer with SCSI or SAS disks you can reasonably get 2 GB/second throughput, so reading 400 GB will take 200 seconds. Unfortunately, writing is typically about half that speed, so the backup should reasonably take 400 seconds, which is about 7 minutes. If your database is 4 TB then you increase this by a factor of 10, so it takes just over an hour to back up everything. Typically you also want to copy this data to another data center or to tape, which roughly doubles the write time again. The 7 minutes becomes 15 minutes. The hour becomes two hours.

When we talk about backing up database data, there are two schools of thought. The database data is contained in table extent files. You can back up your data by replicating your file system or by using database tools to back up your data. Years ago files were kept on raw disk partitions. Few people do this anymore and table extent files are kept on file systems. Replicating data from a raw partition is difficult, so most people used tools like RMAN to back up database files on raw partitions. Database vendors have figured out how to optimize reads and writes to disk despite the file system structures that operating system vendors created. File system vendors have figured out how to optimize backup and recovery to survive disk failures. Terms like mirroring, triple mirroring, RAID, and logical volume management come up when you talk about protecting data in a file system. Other terms like snap mirror and off-site cloning sneak into the conversation as well. Earlier when we talked about $3K/TB we were really talking about $1K/TB, but we triple mirror the disks and thus triple the cost of usable storage. This makes sense when we go down to Best Buy or Fry’s and look at a 1 TB USB disk for $100. We could purchase this, but the 2 GB/second transfer rate suddenly drops to 200 MB/second. We need to pay more for a higher speed communication bridge to the disk drive. We could drop the cost of storage to $100/TB, but the 7 minute backup and recovery time suddenly grows to 70 minutes. This becomes a cost vs recovery time discussion which is important to have. At home, recovering your family photos from a dead desktop computer can take hours. For a medical practice, waiting hours to recover patient records impacts how the doctors engage with patients. Waiting hours on a ticket sales or stock trading web site becomes millions of dollars lost as people go to your competitors to transact business.

Vendors like EMC and NetApp talk about cloning or snap mirroring disks to another data center. This technology works for things like email and home directories but does not work well for databases. A database writes to multiple files at a time. If you partition your data, the database might be moving data from one file to another as data ages. We might have high speed SSD disks for current data and lower cost, higher latency disks for data greater than 30 days old. If we start a clone of our SSD disks during a data move, the recent data will get copied to our mirror at another site. The database might then finish re-partitioning the data while the disk management software starts backing up the lower speed disks. We suddenly get into a data consistency problem. The disk management software and the database software don’t talk to each other and tell each other that they are moving data between file systems. Data that was on the high speed SSD disks is now out of sequence with the low speed disks at our backup site. If we have a disk failure on our primary site, restoring data from our secondary site will cause database failure. The only way to solve this problem is to schedule disk clones while the database is shut down. Unfortunately, many IT departments select a disk cloning solution since it is the best solution for mirroring home directories, email servers, and virtualization servers. Database servers have a slightly different backup requirement and require a different way of doing things.

The recommended way of backing up a database is to use archive tools like RMAN or commercially available products like Commvault or Legato. The commercial products provide a common backup process that knows how virtualization servers and databases like to be backed up. They allow you to back up SQL Server and an Oracle database with the same user interface and process. Behind the scenes these tools talk to RMAN and the SQL Server backup utilities but present a uniform user interface to schedule and manage backups and restores.

Enough rambling about disks and backups. Let’s start talking about how to use cloud storage for our disk replication. Today we are going to talk about database backup to the cloud. The idea behind our use of the cloud is pure economics. We would like to reduce our cost of storage from $3K/TB to $400/TB/year and get rid of the capex cost. The true problem with purchasing storage for our data center is that we don’t want to purchase 10 TB a month because that is what we are consuming for backups. What we are forced to do is look 36 months ahead and purchase 400 TB of disk to handle the monthly data consumption or start deleting data after a period. For things like Census data and medical records the retention period is decades and not months. For some applications, we can delete data after 12 months. If we are copying incremental database backups, we can delete the incrementals once we do a full backup. In our 10 TB a month example we will have to purchase $1.2M in storage today knowing that we will only consume 10 TB this month and 10 TB next month. Using the cloud storage we can pay $300 this month, $600 next month, and grow this amount at $300/month until we get to the 400 TB that we will consume in 36 months. If we guestimated low we will need to purchase more storage again in two years. If we guestimated high we will have overpurchased storage and spend more than $3K/TB for what we are using. Using cloud storage allows us to consume storage at $400/TB/year. If we guess wrong and have metered storage there is no penalty. If we are using non-metered storage, we might purchase a little too much but only have to look forward 12 months rather than 36. It typically is easier to guess a year ahead rather than three years.

Just to clarify, we are not talking about moving all of our backup data to the cloud all at once. What we are talking about is doing daily incremental backups to the high speed disk attached to your database. After a few days we do a full backup to lower cost network storage. After a few weeks we copy these backups to the cloud. In the diagram below we keep backups on high speed disk for five days, on low speed disk for 21 days, and in cloud storage beyond that. We show moving data to tape in the cloud beyond 180 days. The cost benefit is to move data that we probably won’t read to lower cost storage. Using $400/TB/year gives us a $2600/TB cost savings in capex. Using $12/TB/year tape gives us an additional $388/TB cost savings in opex.

The way that we get this storage tiering is to modify the RMAN backup libraries and move the tape interface from an on-site tape unit to disk or tape in the cloud. The library module can be downloaded from the Oracle Technology Network. More information on this service can be found in the Oracle documentation or the backup whitepaper. You can also watch videos that describe this service.

The economics behind this can be seen in a TCO analysis that we did. In this example we look at moving 30 TB of backup from on-premise disk to cloud backup. The resulting 4 year savings is $120K. This does not take into account tangential savings but only looks at physical cost savings of not purchasing 30 TB of disk.

Let’s walk through what is needed to make this work. First we have to download the library module that takes RMAN read and write commands and translates them into REST API commands. This library exists for the Oracle Storage Cloud Service as well as Amazon S3. The key benefit of the Oracle Storage Cloud is that you get encryption and parallelism for free as part of the service. With Amazon S3 you need to pay for additional parallel channels at $1500/channel as well as encryption in the database at $10K/processor license. The Oracle Storage Cloud provides this as part of the $33/TB/month database backup bundle.

Once we download the module, we need to install it with a java command. Note that this is where we tie the Oracle home and SID to the cloud credentials. The credentials are stored in an Oracle wallet along with the encryption keys used to encrypt the backups.
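The install command looks roughly like the sketch below; the flag names follow the backup module bundle I downloaded, but the identity domain, user name, password, and directories are placeholders you replace with your own values.

    # Install the cloud backup library and create the wallet that holds the
    # cloud credentials; all values shown are examples.
    java -jar opc_install.jar \
      -serviceName Storage \
      -identityDomain mydomain \
      -opcId 'backup.admin@example.com' \
      -opcPass 'MyPassword' \
      -walletDir $ORACLE_HOME/dbs/opc_wallet \
      -libDir $ORACLE_HOME/lib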

Now that we have replaced the tape interface with cloud storage, we need to define a tape interface for RMAN and link the library into the process. When we read and write to tape we are actually reading and writing to cloud storage.
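In RMAN this amounts to pointing the SBT channel at the library that the installer dropped into the Oracle home; the library and wallet paths below are examples for a database with SID ORCL.

    # Start RMAN (rman target /) and run the following commands.
    CONFIGURE CHANNEL DEVICE TYPE 'SBT_TAPE' PARMS 'SBT_LIBRARY=/u01/app/oracle/product/12.1.0/dbhome_1/lib/libopc.so, ENV=(OPC_PFILE=/u01/app/oracle/product/12.1.0/dbhome_1/dbs/opcORCL.ora)';
    CONFIGURE DEFAULT DEVICE TYPE TO 'SBT_TAPE';
    CONFIGURE ENCRYPTION FOR DATABASE ON;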

Once we have everything configured, we use RMAN or Commvault or Legato as we have for years. Accessing the tape unit is really accessing the cloud storage.
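A backup then looks exactly like it always has, just directed at the SBT device; the encryption password below is a placeholder.

    # At the RMAN prompt (rman target /), run the backup as before.
    SET ENCRYPTION ON IDENTIFIED BY "MyBackupPassword" ONLY;
    BACKUP DEVICE TYPE SBT INCREMENTAL LEVEL 1 DATABASE PLUS ARCHIVELOG;
    LIST BACKUP SUMMARY;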

In summary, this is our first use case of the cloud. We are offsetting the cost of on-premise storage and reducing the cost of our database backups. A good rule of thumb is that we can drop the cost of backups from $3K/TB plus $300/TB/year to $400/TB/year. Once we have everything downloaded and installed, nothing looks or feels different from what we have been doing for years. When you start looking at purchasing more disks because you are running out of space, look at moving your backups from local disk to the cloud.

So how about other cloud providers

If you are looking for a cloud hosting provider, the number one question that comes up is which one to use. There are a ton of cloud providers. How do you decide which one is best for you? To be honest, the answer is it depends. It depends on what your problem is and what problem you are trying to solve. Are you trying to solve how you communicate with customers? If so, you might purchase something like Salesforce or Oracle Sales Cloud and get a cloud based sales automation tool. Doing a search on the web yields a ton of references. Unfortunately, you need to know what you are searching for. Are you trying to automate your project management (Oracle Primavera or Microsoft Project)? Every PC magazine and trade publication has opinions on this. Companies like Gartner and Forrester write reviews. Oracle typically does not rate well with any of these analysts for a variety of reasons.

My recommendation is to look at the problem that you are trying to solve. Are you trying to lower your cost of on-site storage? Look at generic cloud storage. Are you trying to reduce your data center costs and go with a disaster recovery site in the cloud? Look at infrastructure in the cloud and compute in the cloud. I had a chance to play with VMware vCloud this week and it has interesting features. Unfortunately, it is a really bad fit for generic cloud storage. You can’t allocate 100 TB of storage and access it remotely without going through a compute engine and paying for a processor, operating system, and OS administrator. It is really good if I have VMware and want to replicate instances into the cloud or use vMotion to move things to the cloud. Unfortunately, this solution does not work well if I have a Solaris or AIX server running in my data center and want to replicate it into the cloud.

The discussion on replication opens a bigger can of worms. How do you do replication? Do you take database and Java files and snap mirror them to the cloud, or replicate them as is done inside a data center today? Do you Data Guard the database to a cloud provider and pay on a monthly basis for the database license rather than owning the database? Do you set up a listener to switch between your on-site database and cloud database as a high availability failover? Do you set up a load balancer in front of a web server or Java app server to do the same thing? Do you replicate the virtualization files from your VMware/Hyper-V/Oracle VM/Xen engine to a cloud provider that supports that format? Do you use a GoldenGate or SOA server to physically replicate objects between your on-site and cloud implementations? Do you use something like the Oracle Integration server to synchronize data between cloud providers and your on-premise ERP system?

Once you decide on what level to do replication/failover/high availability, you need to begin the evaluation of which cloud provider is best for you. Does your cloud provider have a wide range of services that fits the majority of your needs, or do you need to get some solutions from one vendor and some from another? Are you ok standardizing on a foundation of a virtualization engine and letting everyone pick and choose their operating system and application of choice? Do you want to standardize at the operating system layer and not care about the way things are virtualized? When you purchase something like Salesforce CRM, do you even know what database or operating system they use or what virtualization engine supports it? Do or should you care? Where do you create your standards and what is most important to you? If you are a health care provider, do you really care what operating system your medical records system uses, or are you more interested in how to import/export ultrasound images into your patients’ records? Does it really matter which VM or OS is used?

The final test that you should look at is options. Does your cloud vendor have ways of easily getting data to them and easily getting data out? Both Oracle and Amazon offer tape storage services. Both offer disks that you can ship from your data center to their cloud data centers to load data. Which one offers to ship tapes back to you when you want your data back? Can you only back up from a database in the cloud to storage in the cloud? What does it cost to get your data back once you give it to a cloud provider? What is the outbound charge rate, and did you budget enough to terminate the service without walking away from your data? Do they provide an unlimited read and write service so that you don’t get charged for outbound data transfer?

Picking and choosing a cloud vendor is not easy. It is almost as difficult as buying a house, a car, or a phone carrier. You will never know if you made the right choice until you get penalized for making the wrong choice. Tread carefully and ask the right questions as you start your research.

Storage on Azure

Yesterday we were out on a limb. Today we are going to be skating on thin ice. Not only do I know less about Azure than AWS, but Microsoft has significantly different thoughts and solutions on storage than the other two cloud vendors. First, let’s look at the available literature on Azure storage.

There are four types of storage available with the Azure storage services: blob storage, table storage, queue storage, and file storage. Blob storage is similar to the Oracle Storage Cloud Service or Amazon S3. It provides blocks of storage that can be used for documents, large log files, backups, databases, videos, and so on. Blobs are objects placed inside of containers that have characteristics and access controls. Table storage offers the ability to store key/attribute entries in a semi-structured dataset similar to a NoSQL database. Queue storage provides a messaging system so that you can buffer and sequence events between applications. The fourth and final type is file based storage similar to Dropbox or Google Drive. You can read and write files and file shares and access them through SMB file mounts on Windows systems.

Azure storage does give you the option of deciding upon your reliability model by selecting the replication model. The options are locally triple redundant storage, replication between two data centers, replication between different geographical locations, or read access geo-redundant storage.

Since blob storage is probably more relevant for what we are looking for, let’s dive a little deeper into this type of storage. Blobs can be allocated either as block blobs or page blobs. Block blobs are aggregations of blocks that can be allocated in different sizes. Page blobs are made up of fixed size 512 byte pages; they are the foundation of virtual machines and are used by default to back the operating systems running in a virtual machine. Blobs are allocated into containers and inherit the characteristics of the container. Blobs are accessed via REST APIs. The address of a blob is formatted as http://(account-name).blob.core.windows.net/(container-name)/(blob-name). Note that the account name is defined by the user and must be globally unique across Azure. It is something that you create, and Microsoft adds it to their DNS so that your endpoint can be found on the internet. You can’t choose simple names like test, testing, or my, because they have already been claimed by someone else.
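As a quick sketch of how this address scheme is used, listing the blobs in a container is a single REST call; the account name, container name, and SAS token below are placeholders, and without a token the call only works against a container that allows public read access.

    # List Blobs operation against a hypothetical account and container.
    curl "https://myaccount.blob.core.windows.net/mycontainer?restype=container&comp=list&<sas-token>"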

To begin the process we need to log into the Azure portal and browse to the Storage create options.

Once we find the storage management page we have to click the plus button to add a new storage resource.

It is important to create a unique name. This name will be used as part of the REST API address and goes in front of the server name. The name must be globally unique, so picking something like the word “test” will fail since someone else has already selected it.

In our example we select wwpf, which is an abbreviation for a non-profit that I work with, Who We Play For. We next need to select the replication policy to make sure that the data is highly available.

Once we are happy with the name, replication policy, resource group, and payment method, we can click Create. It takes a while so we see a deploying message at the top of the screen.


When we are finished we should see a list of storage containers that we have created. We can dive into the containers and see what services each contains.


Note that we have the option of blob, table, queue, and files at this point. We will dive into the blob part of this to create raw blocks that can be used for backups, holding images, and generic file storage. Clicking on the blob services allows us to create a blob container.

Note that the format of the container name is critical. You can’t use special characters or capital letters. Make sure that you follow the naming convention for container names.

We are going to select a blob type container so that we have access to raw blocks.

When the container is created we can see the REST api point for the newly created storage.

We can examine the container properties by clicking on the properties button and looking at when it was created, lease information, file count, and other things related to container access rights.


The easiest way to access this newly created storage is to do the same thing that we did with Oracle Storage. We are going to use CloudBerry Explorer. In this GUI tool we will need to create an attachment to the account. Note that the tool used for Azure is different from the Oracle and Amazon tools. Each costs a little money and, unfortunately, they are not the same tool. They also only work on a Windows desktop, which is challenging if you use a Mac or Linux desktop.

To figure out your access rights, go to the storage management interface and click on the key at the top right. This should open up a properties screen showing you the account and shared access key.


From here we can access the Azure blob storage and drag and drop files. We first add the account information then navigate to the blob container and can read and write objects.


In this example, we are looking at virtual images located on our desktop “E:\” drive and can drag and drop them into a blob container for use by an Azure compute engine.

In summary, Azure storage is very similar to Amazon S3 and the Oracle Storage Cloud Service. The cost is similar. The way we access it is similar. The way we protect and restrict access to it is similar. We can address it through a REST API (which we did not detail) and can access it from our desktop or from a compute server running in Azure. Overall, storage in the cloud is storage in the cloud. You need to examine your use cases and see which storage type works best for you. Microsoft does have an on-premise gateway product called StorSimple, which is similar to the Amazon Storage Gateway or the Oracle Cloud Storage Appliance; it is more of a hardware solution that attaches via iSCSI to existing servers.