One of the newest concepts that has been introduced for cloud services is the concept of un-metered or dedicated services. Before we dive into this subject, let’s review what a cloud service really is. When you boil it all down, you are basically leasing computer resources on a computer that you don’t own. You are taking a slice of a compute engine, slice of a disk drive, part of a network connection. You are renting space. Think of it as living in an apartment. Yes, this is a silly analogy but if you think about it, it makes sense. You can rent an efficiency, one, two, or three bedroom apartment. You can get parking with or without a roof over your car. You can get a storage closet or a garage to store stuff that you are not using but want to keep around. There are benefits to apartments. You don’t have to cut the grass. You typically have access to a pool but don’t need to maintain it. If the toilet backs up or the gas stops working you call the super and they come fix it. You still have to replace your own light bulbs that burn out. You still need to clean your own bathroom and kitchen and take out your trash on a regular basis. On the grander scale, you don’t need to drop 10% down and get a mortgage to live there. Monthly rents are typically cheaper than paying down a mortgage. Your taxes are bundled into your rent cost. You basically show up, use the apartment and go on with your life. On the flip side, there are drawbacks to apartments. If your upstairs neighbor likes to play heavy metal at 2am or throw wild parties on the weekends it does make it hard to sleep. Someone might park in your parking spot so you need to park farther away from your front door. You can’t pull into your garage to unload your groceries and have to potentially carry them in the rain across the parking lot and up the stairs to your third floor apartment. The super might decide that Tuesday they are going to repaint all bathrooms another color and you need to be out of the way for a day and put up with the smell even though you planned a dinner party the next night. It is difficult to grill on your balcony and you can’t really sit out without sharing the space with all of your neighbors. The true downfall is that twenty years from now, you will still be renting your apartment (and the rent probably went up every other year) while your college buddies are celebrating a mortgage burning party and the only thing that they owe on a monthly basis is the taxes that the government takes annually.
Yes, our analogy is silly. Yes, our analogy is relevant. It is easy to decide that you want another job in another city so you hire a mover, pack up all your stuff, and move to another apartment. This is where our analogy breaks down. Cloud vendors charge you for every piece of furniture that you take out of the building. They charge you to use the stairs or elevator. They charge you every time a moving van exits the building full of furniture and boxes of clothes. It is free to bring stuff in because it locks you into the apartment. Just don’t try to take anything out. Remember that storage closet or garage that you got with your apartment, you can open the door and put stuff in for free but if you carry anything out (even if you just relocate it to your apartment) you get charged per item that you carry across the threshold.
If you look at storage from any cloud vendor they offer a metered storage service. The same is true for compute services. You can lease a virtual processor and memory and grind on data all that you want. The catch is when you want to transfer your files or report results of you analysis to your desktop computer, you get charged on a per gigabyte transferred across the internet. Cost calculators help you calculate these costs but they are a little hard to estimate and use to calculate outbound data charges. Amazon, for example, has a calculators that you can use. The AWS pricing calculator allows you to look at the cost of all cloud services.
Let’s walk through the cost of Amazon Glacier. The price list says that you should pay $0.007/GB/month or $7/TB/month to keep things in cold storage. We will use 120 TB as our basis for analysis. We put this as the amount to store and see the cost of storing the data is $860/month.
If we plan on reading back 10% of this data during the month the price goes up to $2217. The bulk of these charges are the outbound charges. The cost goes to $921 if we read the data to an EC2 instance and not all the way back to our data center or desktop computer. To use our apartment analogy, you are paying $860 to get a storage garage. You pay $61 every time you take something out and move it to your apartment. You can put all you want into the storage area (as long as your don’t exceed the space of the storage unit) but taking something out will cost you. If you put your recently retrieved item in your car or a truck and drive it out the gate you get a surcharge of $1300. It is important to remember that pulling more stuff out of your storage will cost you more. Putting this in terms of computer archive, you can store all of your emails, contracts, customer transactions, patient records in long term storage. If your on-site storage fails for some reason or if you get a legal request to review five years of data, you can pull the data back from cold storage. It will cost you to pull the data back but it is still cheaper than keeping the seven years of longer data on spinning disk in your data center (estimate $3K/TB plus 10% per year for spinning disk in your data center).
We can do the same calculation for cloud storage using S3. We can store 120 TB for roughly $3950/month. If we want to read back 10%, or 12 TB, of that data, it will cost us $5150 or $1200 additional.
We can reduce the cost by using lower speed storage in the cloud. We put the S3 data into the infrequent category to save money. This drops the cost to just over $3K which does save us about $2100/month. We agree to pay a lower cost to get higher latency and longer retrieval times. It is better than using tape in the cloud and we can save some money with this option.
We can opt for reduced redundancy storage (aka non-mirrored and non-replicated data) but we risk data loss since we will only have a single copy in the cloud. This drops the cost to $4300 with the data retrieval but we have to weigh the cost vs data loss risks.
Let’s not pick on Amazon. How does this compare to Azure? Unfortunately, we can’t start with Microsoft tape in the cloud, they don’t offer the service. We must start with block storage in the Azure cloud. Microsoft has an Azure pricing calculator that you can use to perform the same calculations. The calculator and pricing is a little difficult to use when you first get into it. You basically need to put together the calculation a piece at a time. You need to factor in the cost of the storage and the cost of transferring the data from the cloud to your data center. This is done in two different pieces. An example of what we are looking for can be seen below.
We need to piece together the calculator. First we add the storage component then the bandwidth component. There is a transaction component but this amount is trivial and we are going to ignore it for simplicity.
If we look at the options for Azure storage, we can basically select blob storage in different zones. In the grand scheme of things, the cost is not siginificantly higher one way or another. The basic cost is about the same.
The third class of storage that are going to look at is the Oracle Storage Cloud Service. We can look at Oracle Storage Cloud Service as well as Oracle Archive Cloud Service. The Archive service compares directly to the Amazon Glacier service except that it is $1/TB/month and suffers the same transfer charges for outbound data. The Oracle Storage Cloud Service is similar to the Amazon S3 and Azure Blob Storage Service but it is offered either as a metered service (as is S3 and Blobs) and un-metered services. Unfortunately, Oracle does not provide a cost calculator for general use. The Value Added Distributors are given a copy of the calculator but it is not generally available. The key difference with the Oracle storage services is that there are two significant flavor differences; metered and non-metered. The metered services are charged just like the Microsoft and Amazon services. You pay for what you use on a per GB basis and pay for outbound data transfer. An example of the pricing calculator is shown below. Note that we do need to have a good guestimate on how much data we will transfer outbound across the internet. These charges are not consumed if you are reading the data to a compute engine in the cloud unlike S3 which still consumes cost just for reading the data off the disk.
The most significant differential in storage offerings is the non-metered storage. Oracle offers storage in blocks that you reserve and allocate for 12 months. This is different from the metered storage in that metered can start with 10 TB and grow to 120 TB over the year. With the non-metered storage, you start with 120 TB and end with 120 TB. You can extend your contract and grow storage but you basically extend a new contract for more storage. You can not shrink your storage and pay for less. The benefit of this is that you don’t have to pay for outbound data transfer. You can read and write as much as you want and not get charged for transferring the data across the internet. A pricing calculator for this is simple. How much do you need and how long do you need it?
If we piece all of this together and look a price comparison between the three service providers, the answer of which is cheapest comes down to it depends. Oracle non-metered storage has a significant advantage if you are planning on reading back your data at high or unpredictable rates. Amazon S3 infrequent is the cheapest if you don’t plan on reading back your data and want it as an insurance policy only. I honestly would go with Glacier or Oracle Archive if this is the case since it is an order of magnitude cheaper. The chart below compares 120 TB of storage and the variable charge for reading back this data on a monthly basis. If you have 120 TB of storage and plan on reading back 120 TB on a monthly basis, the Oracle non-metered storage is significantly cheaper. If you are only planning on reading back 12 to 24 TB per month the cost is about the same for all of the services.
In summary, one option is not clearly better than the other (except for high read rates) and this blog is intended to help you decide on what fits your needs best. Pricing calculators can help with the cost based on transfer rates. It is important to remember that storage transfer is a significant part of the calculation. It is also important to look at your usage model. We assumed that you started with 120 TB and ended with 120 TB for our analysis. If you start with 12 TB and grow to 120 TB, the pricing calculation will be a little different. Neither the Amazon nor Azure calculators will help you run this simulation and you will have to calculate everything on a month by month basis. It is also interesting to take 120 TB of on-premise storage and assume that each TB can be purchased at $3K/TB. If we assume 10% annual hardware maintenance and a three year amortization, the charge for on-premise storage is $1030/month which might be more or less than cloud based storage. Your results might vary.