Database in Amazon EC2

Today we are going to look at what it takes to get a 12c database instance up and running in Amazon EC2. Note that this is different from our previous posts on getting Standard Edition running on Amazon and getting Enterprise Edition running on Amazon RDS. We are going to take the traditional approach, as if we were installing the database on a virtual image like VMware, HyperV, or OracleVM: take IaaS and layer the database on top of it.

There are a few options on how to create the database instance. We can load everything from scratch, we can load a pre-defined AMI, we can create a golden image and clone it, we can do a physical to virtual conversion and import the instance into the cloud, or we can create a Chef recipe and automate everything. In this blog we are going to skip the load-everything option because it is cumbersome and time consuming. You basically have to load the operating system, patch the operating system, create users and groups, download the binaries, unpack the binaries, manage the firewall, and manage the cloud port access rights. Each of these steps takes 5-30 minutes, so the total time to get the install done is 2-3 hours. Note that this is still much better than purchasing hardware, putting it in a data center, loading the operating system, and following all the same steps. We are also going to skip the golden image and cloning option since it is basically loading everything from scratch and then cloning the instance. We will look at cloning a physical machine and importing it into the cloud in a later blog. In this blog we are going to look at selecting a pre-defined AMI and loading it.

One of the benefits of the Marketplace model is that you get a pre-defined and pre-configured installation of a software package. Oracle provides the bundle for Amazon in the form of an AMI. For these instances you need to own your own perpetual license. It is important to understand the licensing implications and how Oracle defines licensing for AWS. Authorized Cloud Environment instances with 4 or fewer virtual cores are counted as 1 socket, which is considered equivalent to a processor license. For Authorized Cloud Environment instances with more than 4 virtual cores, every 4 virtual cores used (rounded up to the closest multiple of 4) equate to a licensing requirement of 1 socket. This is true for the Standard Edition license. For the Enterprise Edition license the assumption is that the cloud processor is an x86 chip set, so a processor license is required for every 2 virtual cores. All of the other software like partitioning, diagnostics, tuning, compression, and advanced security also needs to be licensed with the same metric.

To look at the available AMI options we go to the console, click on EC2, and click on Launch Instance.



When we search for Oracle we get a wide variety of products like Linux, SOA, and database. If we search for Oracle database we refine the search a little more, but we still get supplementary products that relate to the database rather than the database itself. If we search for Oracle database 12c we get six results.



We find two AMIs that look the same; the key difference is that one limits you to 16 cores and the other does not. We can select either one for our tests. If we search the Community AMIs we get back a variety of 11g and 10g installation options but no 12c options. (Note that the first screen shot shows the Standard Edition description; it should be the Enterprise Edition since two are listed.)

We are going to use the Commercial Marketplace and select the first 12c database instance. This takes us to a screen that lets us select the processing shape. Note that the smaller instances are not allowed because you need a reasonable amount of memory, and a single core does not run the database very well. This is one of the advantages over selecting an operating system ourselves and finding out later that we selected too few cores or too little memory. Our selections are broken down into general purpose, compute optimized, and storage optimized. The key differences are how many cores, how much memory, and dedicated vs generic IOPS to the disk.

We could select an m3.xlarge or c3.xlarge; the main difference is the amount of memory allocated, though the c3.xlarge also appears to have less network throughput. We are going to select the m3.xlarge. Looking at pricing, we should be charged $0.351/hour for the EC2 instance, $0.125 per GB-month provisioned ($5/month for our 40 GB of disk), and $0.065 per provisioned IOP-month ($32.50/month). Our total cost of running this m3.xlarge instance will be $395.52/month or $13.18/day. We can compare this to a similarly configured Amazon RDS at $274.29/month. We also need to take into account that we will need to purchase two processor licenses of the Enterprise Edition at $47,500 per processor license. The cost of this license over four years will be $95,000 for the initial license plus 22%, or $20,900 per year, for support. Our four year cost of ownership will be $178,600. Amortizing this over four years brings the cost to $3,720/month, so our all-in cost for basic Enterprise Edition will be $4,116.35/month. If we want to compare this to the DBaaS cost that we covered earlier we also need to add the cost of Transparent Data Encryption so that we can encrypt data in the cloud. This feature is included in the Advanced Security option, which is priced at $15,000 per processor license. The four year cost of ownership for this package is $56,400, bringing the additional cost to $1,175/month. We will be spending $5,291.35 per month for this service with Amazon.
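
To make the license amortization explicit, here is the arithmetic as a quick shell sketch (list prices from above, spread over 48 months):

    # two EE processor licenses at $47,500 each, plus 22% annual support for four years
    echo $(( 2 * 47500 ))             # 95000 initial license
    echo $(( 95000 * 22 / 100 ))      # 20900 per year of support
    echo $(( 95000 + 4 * 20900 ))     # 178600 four year cost of ownership
    echo $(( 178600 / 48 ))           # ~3720 per month amortized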

If we want to compare this with PaaS we have the option of purchasing the same instance at $1,500/OCPU/month, or $3,000/month, or $2.52/OCPU/hour for the Enterprise Edition on a Virtual Image. We only need two OCPUs because an OCPU provides two threads per virtual core, where Amazon provides one thread per core. We are really looking at thread count and not virtual core count. Four virtual processors in Amazon are equivalent to two OCPUs, so our cost for a virtual image will be $1.5K/OCPU times 2 OCPUs. If we go with Database as a Service we are looking at $3,000/OCPU/month, or $6,000/month, or $5.04/OCPU/hour for the Enterprise Edition as a service. What we need to rationalize is the extra $708/month for the PaaS service. Do we get enough benefit from having this as a service, or do we spend more time and energy up front to pay less each month?

If we are going to compare the High Performance edition against the Amazon EC2 edition we have to add in the options that we get with High Performance. There are 13 features that need to be licensed to make the comparison equivalent, each costing anywhere from $11,500 to $23,000 per processor. We saw earlier that each option adds $1,175/month, so adding the three most popular options (partitioning, diagnostics, and tuning) costs $3,525/month more. The High Performance edition will cost us $2,000/OCPU/month or $4K/month for the virtual image, and $4,000/OCPU/month or $8K/month as a service. Again, we get ten more options bundled with the High Performance option at $8K/month, compared to $8,816.35 with the AWS EC2 option. We also get all of the benefits of PaaS vs IaaS for this feature set.

Once we select our AMI and instance type, we have to configure the options. We can request a spot instance, but this is highly discouraged for a database: if your instance is terminated because the capacity is needed elsewhere, you could easily lose data unless you have DataGuard configured for synchronous data commit. We can provision this instance into a virtual private network, which is different from the way it is done in the Oracle cloud. In the Oracle cloud you provision the service and then configure the virtual instance; in Amazon EC2 both are done at the same time. You do have the option of provisioning the instance into one of five instance zones, but all are located in US East. You can define the administration access roles with the IAM role option; you have to define these prior to provisioning the database. You can also define the operation of this instance from the console. You can stop or terminate the instance when it is shut down, as well as prohibit anyone from terminating the instance unless they have rights to do so. You can enable CloudWatch (at an additional charge of $7.50/month) to monitor this service and restart it if it fails. We can also add an elastic block attachment, at additional cost, so that our data can migrate from one instance to another.
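
A rough command line equivalent of these console selections might look like the following sketch. The AMI id, key name, security group, and IAM role name are placeholders; detailed CloudWatch monitoring is the --monitoring flag and termination protection is --disable-api-termination:

    # hypothetical launch of the 12c marketplace AMI with termination protection
    aws ec2 run-instances \
      --image-id ami-12345678 \
      --instance-type m3.xlarge \
      --key-name my-oracle-key \
      --security-group-ids sg-0123456789abcdef0 \
      --iam-instance-profile Name=my-dba-role \
      --disable-api-termination \
      --monitoring Enabled=true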


We now have to consider the reserved IOPS for our instance when we look at the storage. By default we get 8 GB for the operating system, 50 GB for the data area with 500 provisioned IOPS, and 8 GB for log space. The reserved IOPS add $38.75/month. If we were watching every penny we would also have to look at outbound traffic from the database. If we read all of our 50 GB back it would increase the price of the service by a little over $3/month. Given that this is relatively insignificant we can ignore it, but it was worth checking with the simple monthly calculator.


Our next screen is the tags, which we will not use but which could be used for searching if we had a large number of instances. The screen after that defines the open ports for this service. We want to add other ports like 1521 for the database listener and 443 and 80 for Application Express. Ports 1158 and 22 were predefined for us to allow enterprise manager and ssh access.
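
If you prefer to script the security group instead of clicking through the console, the extra rules could be added with something like the following (the group id and source range are placeholders):

    # open the listener and Application Express ports to a trusted address range
    aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 \
      --protocol tcp --port 1521 --cidr 203.0.113.0/24
    aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 \
      --protocol tcp --port 443 --cidr 203.0.113.0/24
    aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 \
      --protocol tcp --port 80 --cidr 203.0.113.0/24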


At this point we are ready to launch our instance. We will have 50 GB of table space available and the database will be provisioned and ready for us upon completion.

Some things to note in the provisioning of this instance: we were never asked for a SID for the database, we were never asked for a password for the sys, system, or sysdba accounts, and we were never asked for a password to access the operating system instance. When we click on Launch we are asked for an ssh key to access the instance once it is created.

When you launch the instance you see a splash screen then a detail screen as the instance is created. You also get an email confirming that you are provisioning an instance from the marketplace. At this point I notice that I provisioned Standard Edition and not Enterprise Edition. The experience is the same and nothing should change up to this point so we can continue with the SE AMI.



Once the instance is created we can look at the instance information and attach to the service via putty or ssh. The ip address that we were assigned was 54.242.14.146. We load the private key and ip address into putty and connect. Connecting as oracle failed, and connecting as root returned an error message. Once we connect as ec2-user we are asked if we want to create a database and prompted for the SID and the sys, system, and dbsnmp passwords.
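
From a Linux or Mac client the same connection is a one-liner; the key file name here is a placeholder for whatever private key you selected at launch:

    # connect as ec2-user with the private key selected at launch
    ssh -i ~/.ssh/my-oracle-key.pem ec2-user@54.242.14.146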





The database creation takes a while (15-30 minutes according to the create script) and you get a percent complete notification as it progresses. At this point we have a database provisioned, the network configured, and security through ssh keys to access the instance, and we should be ready to connect to our database with SQL Developer. In our example it took over an hour to create the database after taking only five minutes to provision the operating system instance; the process stalled at 50% complete and sat there for a very long time. I also had to copy /home/ec2-user/.ssh/authorized_keys into the /home/oracle/.ssh directory (after I created it) to allow the oracle user to log in. The ec2-user account has rights to execute as root, so you can create this directory, copy the file, and change ownership of the .ssh directory and contents to oracle. After you do this you can log in as oracle, the user that owns the processes and directories in /u01, and manage the database.
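
The key copy described above boils down to a few commands run as ec2-user; the oracle group name is an assumption here and may be dba or oinstall on this AMI:

    # give the oracle user the same ssh key that ec2-user uses
    sudo mkdir -p /home/oracle/.ssh
    sudo cp /home/ec2-user/.ssh/authorized_keys /home/oracle/.ssh/
    sudo chown -R oracle:oinstall /home/oracle/.ssh   # group name is an assumption
    sudo chmod 700 /home/oracle/.ssh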

It is important to note that the database in EC2 provides more features and functions than the Amazon RDS version of the database. Yes, you get automated backup with RDS, but it is basically a snapshot to another storage cloud instance. With the EC2 instance you get features like Spatial, multi-tenant, and sys access to the database. You also get the option of using RMAN for backups to directories that you can read offsite. You can set up DataGuard and Enterprise Manager. The EC2 feature set is significantly more robust but requires more work to set up and operate.

In summary, we looked at what it takes to provision a database onto Amazon EC2 using a pre-defined AMI. We also looked at the cost of doing this and found that we can minimally do it at roughly $5.3K/month; when we add features that are typically desired, the price grows to $8.8K/month. We compared this to running DBaaS in a virtual image in the Oracle Public Cloud at $6K/month (with a $3K/month smaller footprint available) and DBaaS as a service at $8K/month (with a $4K/month smaller footprint available). We talked about the optional packs and packages that come with the High Performance option and about the benefits of PaaS vs IaaS. We did not get into the patching, backup, and restart features provided with PaaS but did touch on them briefly when we went through our instance launch. We also compared this to the Amazon RDS instance in features and functions, at about a hundred dollars per month cheaper. The bulk of the cost is the database license, not the compute or storage configuration. It is important to note that the cost of the database perpetual license is being paid whether you are running the service or not. With PaaS you get the option of keeping the data active in cloud storage attached to a running compute engine while turning off the database license on an hourly or monthly basis to save money, if this fits your usage model of a database service.

Amazon RDS

Today we are going to look at Amazon RDS as a solution for running the Oracle Database in the cloud. According to the Amazon RDS website, RDS makes it easy to set up, operate, and scale a relational database in the cloud. It provides cost-efficient and resizable capacity while managing time-consuming database administration tasks, freeing you up to focus on your applications and business. Amazon RDS provides six familiar database engines to choose from: Amazon Aurora, Oracle, Microsoft SQL Server, PostgreSQL, MySQL, and MariaDB.

We are going to focus on Application Express using Amazon RDS, so we won’t dive deep into the different shapes available and will skim over pricing in this discussion. With Amazon RDS, you can deploy multiple editions of Oracle Database. You can run Amazon RDS for Oracle under two different licensing models: “License Included” and “Bring-Your-Own-License (BYOL)”. In the “License Included” service model you do not need separately purchased Oracle licenses; the Oracle Database software has been licensed by AWS and is bundled into the pricing. Oracle has a policy paper on cloud licensing, and it is important to understand database licensing and how it applies to hard partitions, soft partitions, and cloud environments.

Automated backups are turned on by default and snapshots are enabled as they are for EC2 instances. You can scale up and down processors as well as scale up and down IOPs allocated for the Oracle instance. You can use Amazon VPC to connect this instance to your data center as well as Amazon HSM to encrypt your data.

The two biggest issues that you need to consider with any cloud strategy are security and lock-in. Backups are done from Amazon zone to Amazon zone. Oracle RMAN is not available as a backup mechanism, and neither is Oracle Advanced Security. Encryption is done at the disk layer and not inside the database. Amazon strips away the ability to replicate your data back to your data center or to use your own keys to encrypt your data. You have to use their backup tools and their encryption technology, using their keys, to protect your data. Key management and key rotation become an issue for security sensitive applications and data sets.

Amazon RDS pricing is available on the Amazon web page. Pricing starts at $0.035/hour for a quarter virtual processor and 1 GB of RAM and goes up to $3.64/hour for a standard high memory instance. This pricing is for Standard Edition One of the database and includes the database license. For Standard Edition Two and Enterprise Edition, you must bring your own license; pricing for this model starts at $0.017/hour and grows to $7.56/hour. You can also pay for a reserved account that dedicates an instance to you, from as low as $99/year up to $33.8K/year. Data ingestion to load the database is free, but there is a cost for copying data from the RDS instance to your client, ranging from $0.09/GB/month down to $0.05/GB/month at higher transfer volumes. We recommend that you use the Amazon AWS Pricing Calculator to figure out your monthly charges.


To create an RDS instance, we go to the AWS Console and select the RDS service. From there we select the “Get Started Now” button in the middle of the screen, then the Oracle tab and the Oracle EE Select button. To save money we are going to select the Dev/Test version, but both effectively do the same thing and the differences between the dev and production selections are minor. The production option preloads the Instance Class with an m3.xlarge, turns Multi-AZ on, and sets the Storage Type to SSD. There is also one item displayed in the production create screen that is not in the dev option: Provisioned IOPS. We can make the dev option look just like the production option by selecting a large enough instance and turning on Multi-AZ in the next screen.

Production instance

Development instance

We are going to select the latest 12.1.0.2 version, an instance with a little memory, general purpose storage to get 3 IOPS/GB since this is a test instance, and define our ORACLE_SID and account to access the instance.

The next screen is what defines this as Platform as a Service rather than an EC2 instance with an AMI. Operating system and network access are configured automatically by opening port 1521 for the listener; we confirm the SID, select the character set, turn encryption of the storage on or off, define the backup window and retention period, and set the patching policies. We are going to accept the defaults and not change anything. The one thing that Amazon does that Oracle does not is define the VPC connection when you define the database; Oracle requires a separate step to create a VPN connection. Having selected Multi-AZ, I would have expected to see a selection of zones that you can replicate across. For all of the options that I selected, the Availability Zone was greyed out and I could not select the failover zone. I assume that you have to pre-define a zone relationship to make this work, but it was never an option in my tests.
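
The same provisioning step can be scripted; a minimal sketch with the AWS CLI might look like this (the identifier, instance class, credentials, and retention values are illustrative, not recommendations):

    # hypothetical RDS Oracle EE instance with storage, backups, and BYOL licensing
    aws rds create-db-instance \
      --db-instance-identifier orcl \
      --engine oracle-ee \
      --license-model bring-your-own-license \
      --db-instance-class db.m3.xlarge \
      --allocated-storage 50 \
      --master-username oracle \
      --master-user-password 'MySecret123' \
      --backup-retention-period 7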

Once you click on Create Instance you see a splash screen and can go to the RDS monitor to look at instances that you have launched.


Once the provisioning is finished we can look at the connection details and use SQL Developer to connect to the instance. It is important to note here that we do not have SSH or command line access to this database instance. We only have access through port 1521 and do not have sys, system, or sysdba access to this database instance.

We can connect with SQL Developer by using the details from the previous screen to get the endpoint, port, and instance identifier.


The first thing to note with the database connection is that the RDS instance does not have a container. You are connecting to the base instance, and pluggable databases are not available. If you have purchased the multi-tenant option for the database, RDS is not an option. If we dive into the database configuration we note that auditing is turned off for RDS. The primary reason for this is that you would not have access to the audit logs, since you don’t have file system access to read them. If we look at the control management pack access parameter, diagnostics and tuning are turned on and enabled. This means that you have to license these additional packs to run in RDS; there is no way to turn the option off, and these licenses are not included as part of the RDS usage license. You also do not have direct access to this data; you have to go through the monitor screens or SQL Developer to view it. The database compatibility parameter is set to 12.0.0. Given that we do not have sys access, we can not run in a lower compatibility mode to help migrate 11g databases into a container. Some important parameters are set that we can not change: enable_ddl_logging is false, disabling DataGuard; enable_goldengate_replication is false, disabling GoldenGate; and enable_pluggable_database is false, disabling Multi-Tenant. Default_tbs_type is set to bigfile and there is no mechanism to change this.
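
You can check these settings yourself from a client. A sketch of the session using sqlplus against the RDS endpoint (the endpoint and credentials are placeholders, and this assumes the master user is allowed to read these parameters):

    $ sqlplus oracle/MySecret123@//orcl.abc123xyz.us-east-1.rds.amazonaws.com:1521/ORCL
    SQL> show parameter control_management_pack_access
    SQL> show parameter compatible
    SQL> show parameter enable_pluggable_database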

It is important to figure out what the default user that we created can and can’t do when managing the database. The roles assigned to this user are rather limited. We can compare the roles of the oracle user (the one we created) to the sys user; note that the oracle list is short.
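
The comparison comes straight from the data dictionary; assuming the master user can read dba_role_privs, something like:

    SQL> select granted_role from dba_role_privs where grantee = 'ORACLE';
    SQL> select granted_role from dba_role_privs where grantee = 'SYS';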

The RDS Option Documentation talks about connecting to enterprise manager and application express. We were not able to connect to ports 1158 or 5500 as suggested. My gut says that this has to do with the port routing rules that were created by default.

If we run out of table space we can modify the existing instance and grow the storage. This is done by going to the RDS instance page, selecting modify instance, typing in the new storage size, and clicking apply immediately.
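
The CLI equivalent is a single call (size in GB; the identifier is the one used at create time):

    # grow allocated storage without waiting for the maintenance window
    aws rds modify-db-instance \
      --db-instance-identifier orcl \
      --allocated-storage 100 \
      --apply-immediately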





Once the modification finishes we can see the new storage in the details page in the RDS console.


We should note that we do not see the addition in the tablespaces, because the new space is added to the file system and the tablespaces are all configured to auto extend into whatever space is available. Unfortunately, this makes it look like the tablespaces are always nearly full, and our used percentage will always be relatively high for the files where our tables are stored. We need to monitor disk usage separately with a different part of the RDS console.

In summary, you can run the Oracle database in Amazon RDS. There are limitations and issues that you need to be aware of when doing this. Some of the features are limited and not available to you. Some of the features are required which you might not be using today. If you are running an application server in EC2, running the database in RDS makes sense. The intention of this blog is not to tear down one implementation and elevate another but to elevate discussion on what to look out for when you have decided to run in a specific cloud instance. Up next, how does Amazon RDS differ from Amazon EC2 with a pre-configured AMI.

Database as a Virtual Image

The question that we are going to dive into this week is what does it really mean to be platform as a service vs infrastructure as a service. Why not go to Amazon and spin up an EC2 instance or search for an Oracle provided AMI on Amazon or Virtual Image on Azure? What benefit do I get from PaaS? To answer that we need to look at the key differences. Let’s look at the two options when you provision a database in the Oracle DBaaS. When you provision a database you have the option of service levels; Database Cloud Service and Database Cloud Service – Virtual Image. We looked at the provisioning of the cloud service. It provisions a database, creates the network rules, and spins up an instance for us. What happens when we select Virtual Image?


The release and version screens are the same. We selected 12c for the release and High Performance for the version. Note that the questions are much simpler. We are not asked about how much storage. We are not asked for an SID or sys password. We are not asked about backup options. We are not given the option of DataGuard, RAC, or GoldenGate. We are only asked to name the instance, pick a compute shape, and provide an ssh public key.

This seems much simpler and better. Unfortunately, this isn’t true. What happens from here is that a Linux 6.6 instance is created and a tarball is dropped into a staging area. The database is not provisioned. The file system is not prepared. The network ports are not configured and enabled. True, the virtual instance creation only takes a few minutes but all we are doing is provisioning a Linux instance and copying a tarball into a directory. Details on the installation process can be found at Database Cloud Installation – Virtual Image Documentation.

If you look at the detailed information about a system that is being created with a virtual image and a system that is being created as a service there are vast differences.


The first key difference is the amount of information displayed. Both instances have the same edition, Enterprise Edition – High Performance. Both will display this in the database as well as in the banner if asked what version the database is. The Service Level is different, with the virtual image displayed as part of the service level. This affects the billing: the virtual image is lower cost because less is done for you.

Database as a Service (per OCPU)

    Product               General Purpose          High-Memory
                          Per Month   Per Hour     Per Month   Per Hour
    Standard Edition      $600        $1.008       $700        $1.176
    Enterprise Edition    $3,000      $5.040       $3,100      $5.208
    High Performance      $4,000      $6.720       $4,100      $6.888
    Extreme Performance   $5,000      $8.401       $5,100      $8.569

Virtual Image (per OCPU)

    Product               General Purpose          High-Memory
                          Per Month   Per Hour     Per Month   Per Hour
    Standard Edition      $400        $0.672       $500        $0.840
    Enterprise Edition    $1,500      $2.520       $1,600      $2.688
    High Performance      $2,000      $3.360       $2,100      $3.528
    Extreme Performance   $3,000      $5.040       $3,100      $5.208

The only other information that we get from the management screen is that the instance consumes 30 GB rather than the 100 GB that the database service instance consumes. Note that the database service instance also has the container name and a connection string for connecting to the database. Both will eventually show an ip address, and we should look into the operating system to see the differences. The menu to the right of the instance is also different. If we look at the virtual machine instance we only see ssh access, access rules, and deletion of the instance as options.


The ssh access option allows us to upload a public key or look at the existing public key used to access the instance. The access rules option takes us to a new screen that shows the security rules defined for this instance, which amount to ssh and nothing else.

If we look at a database as a service instance, the menu is different and allows us to look at things like the DBaaS Monitor, APEX, Enterprise Manager monitor, as well as the ssh and access rules.

Note that the database as a service instance has a lot more security rules defined with most of them being disabled. We can open up ports 80, 443, 4848, 1158, 5500, and 1521. We don’t have to define these rules, just enable them if we are accessing them from a whitelist, ip address range, or public internet.

Once we connect to both instances we can see that both are running

Linux hostname 3.8.13-68.2.2.2.el6uek.x86_64 #2 SMP Fri Jun 19 16:29:40 PDT 2015  x86_64 x86_64 x86_64 GNU/Linux

We can see that the file system is different with the /u01, /u02, /u03, and /u04 partitions not mounted in the screen shots below.


If we look at the installation instructions we see that we have to create the /u01, /u02, /u03, and /u04 disks by hand. These are not created for us. We also need to create a logical volume as well as the storage services. Step one is to scale up the service by adding a disk. We need to grow the existing file system by first attaching a logical volume and then laying out/expanding the logical volume that we have. Note that we can exactly mirror our on-premise system at this point. If we want to put everything into a 1 TB /u01 partition and blend the log files and data files onto one disk (not really recommended), we can do that.

To add the /u01 disk we need to scale up the service and add storage. Note that we can only add a raw disk; we can not grow the data volume as we can with the database service.



Note that this scale up does require a reboot of the service. We have the option of adding one logical unit as a full 1 TB disk and then partitioning it, or adding the different volumes as different disks. The drawback of using multiple disks is the way attached storage is charged: at $50/TB/month, adding four disks that consume 20 GB each will cost $200/month, because we are allocated (and billed for) the full 1 TB per disk even though we only use 20 GB of each. The disk is not subdivided when it is attached; we are charged on a per TB basis, not a per GB basis. To save money it is recommended to allocate a full TB rather than a smaller amount. To improve performance and reliability it is recommended to allocate multiple disks and stripe data across multiple spindles and logical units. This can be done at the logical volume management part of disk management, detailed in the documentation on provisioning the virtual image instance.

We can look at the logical volume configuration with lvm pvdisplay, lvm vgdisplay, and lvm lvdisplay. This allows us to look at the physical volume mapping to map physical volumes to logical unit numbers, look at logical volumes for mirroring and striping options, and look at volume group options, which get mapped to the data, reco, and fra areas.





Once our instance has rebooted we note that we added /dev/xvdc, which is 21.5 GB in size. After we format this disk it partitions down to the 20 GB we asked for. If we add a second disk we will get /dev/xvdd and can map these two new disks into a logical volume that we can mount as /u01 and /u02. A nicer command for looking at this is lsblk, which does not require elevated root privileges to run.
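
Turning a raw device into a mounted /u02 follows the standard LVM flow. A minimal sketch for a single, unmirrored volume (the volume group, logical volume, and mount point names are our own choices, not anything the image dictates):

    # create a physical volume, volume group, and logical volume on the new disk
    sudo pvcreate /dev/xvdc
    sudo vgcreate datavg /dev/xvdc
    sudo lvcreate -l 100%FREE -n u02 datavg
    sudo mkfs -t ext4 /dev/datavg/u02
    sudo mkdir -p /u02
    sudo mount /dev/datavg/u02 /u02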

Once we go through the mapping of the /u01, /u02, /u03, and /u04 disks (the documentation only goes into single disks with no mirroring to mount /u01 and /u02), we can expand the binary bits located in /scratch/db. There are two files in this directory, db12102_bits.tar.gz and db12102_se2bits.tar.gz. These are the Enterprise Edition and Standard Edition versions of the database.
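
Unpacking the Enterprise Edition bits is a single tar command; the target directory below is an assumption, so substitute whatever ORACLE_HOME layout you created on /u01:

    # unpack the EE installation bits into the oracle owned /u01 tree
    sudo -u oracle mkdir -p /u01/app/oracle
    sudo -u oracle tar -xzf /scratch/db/db12102_bits.tar.gz -C /u01/app/oracle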

We are not going to go through the full installation, but will look at some of the key differences between IaaS with a tarball (or EC2 with an AMI) and a DBaaS installation. The primary delta is that the database is fully configured and ready to run in about an hour with DBaaS. With IaaS we need to create and mount a file system, untar and install the database, configure network ports, define security rules, and write scripts to automatically start the database when the operating system restarts. We lose the menu items in the management page for the DBaaS Monitor, Enterprise Manager monitor, and Application Express interface. We also lose the patching options that appear in the DBaaS management screen, and we lose the automated backup and the database instance and PDB creation that are done for us with DBaaS.

In summary, the PaaS/DBaaS provisioning is not only a shortcut but removes manual steps in configuring the service as well as in daily operations. We could just as easily have provisioned a compute service, attached storage, and downloaded the tarball that we want to use from edelivery.oracle.com. The key reasons that we don’t want to do this are, first, pricing and, second, patching. If we provision a virtual image of database as a service, the operating system is ready to accept the tarball and we don’t need to install the odbc drivers and other kernel modules. We also get to lease the database on an hourly or monthly basis rather than purchasing a perpetual license to run on our compute instance.

Up next, selecting a pre-configured AMI on Amazon and running it in AWS compared to a virtual image on the Oracle Public Cloud.

DBaaS for real this time

We have danced around creating a database in the Oracle Public Cloud for almost a week now. We have talked about Schema as a Service, Exadata as a Service, licensing, and the different versions of DBaaS. Today, let’s tackle what it takes to actually create a database. It is important to note that the accounts that we are using are metered services accounts. We don’t have the option to run as a non-metered service and have to provision the services on an hourly or monthly basis. Unfortunately, we are not going to go through the step by step process of creating a database. There are plenty of other sites that do this well.


I personally like the Oracle by Example links, although most of the screen shots are out of date and look slightly different if you go through the steps now. For example, the Configure Backup and Recovery screen shots from the first link above show local backup as an option; this option has been removed from the menu. My guess is that a few months from now all of this will be gone and you will be asked for a container that is automatically created for you, rather than having to enter a previously created container as is done now. The critical steps needed to follow these examples are

  1. Get a trial cloud account – instructions on how to do this
  2. Log into your cloud account – Account documentation
  3. Navigate to the Database Cloud Service console
  4. Click the Create Instance button
  5. Define the Subscription type, billing type, software release, software edition
  6. Configure your instance with name, description, ssh public key, compute shape, backup mechanism and location, storage size, sys password, SID and PDB name, and optional configurations (like DataGuard, RAC, and GoldenGate).
  7. Wait for instance to be provisioned
  8. Connect to the database via ssh using your ssh private key and putty/ssh (see the sketch after this list)
  9. Optionally open up ports (port 1521 for client connect, port 80 for apex)
  10. Do something productive
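
Steps 8 and 10 together look something like this from a Linux client; the key path and ip address are placeholders for your own values:

    # step 8: connect with the private key that matches the public key from step 6
    ssh -i ~/.ssh/my-cloud-key oracle@<instance_ip>
    # step 10: start exploring the database
    sqlplus / as sysdba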

The tutorials go through screen shots for all of these services. You can also watch this on youtube

Things to watch out for when you create a database instance in the Oracle Public Cloud:

  1. If you configure a backup service on a demo system and increase the database size to anything substantial, you will overflow the 500 GB of storage in about three weeks, and things will stop working when you try to create a service
  2. All ports are locked down with the exception of ssh. You can use an ssh tunnel to securely connect to localhost:1521 if you tunnel this port (see the sketch after this list). If you are using a demo account you can only open port 1521 to the world; white listing and ip address lists are not supported in the demo accounts
  3. Play with SQL Developer connections across the internet. It works just like it does on-premise. The DBA tool has good management interfaces that allow you to do simple administration tasks from the tool
  4. Play with Enterprise Manager 13c. It is easy to connect to your database via ssh and add your cloud instance to the OEM console. You can manage it just like an on-premise database. Cloning a PDB to the cloud is trivial. Database backup to the cloud is trivial
  5. Play with unplugging and replugging a PDB in 12c. You can clone and unplug from your on-premise system, copy the xml files to the cloud, and plug in the PDB to create a clone in the cloud.
  6. The longer you let a database run, the smaller your credit will get. If you are playing with a sandbox you can stop the database. This stops the database charge (at $3-$5/hour) and you are only charged for the compute and storage (at $0.10/hour). If you leave a database running for 24 hours you burn through $72-$120 based on your edition selection; if you turn off the database and restart it when you want to jump back into your sandbox, you burn through about $3 in 24 hours. Your data will still be there. That is what you are paying $3 a day for.
  7. If you are using a demo system, you can extend your evaluation once or twice. There is a button at the top right allowing you to extend your evaluation period. Make sure you do this before time runs out. Once time runs out you need to request another account from another email address.
  8. If you are playing with an application, make sure that you spin up WebLogic or Tomcat in a Java or Compute instance in the same account. Running an application server on-premise against a database in the cloud will suffer from latency. You are shipping MB/GB across with select statement returns; you are shipping KB/MB to paint part of a screen. It is better to put the latency between the browser and the app server than between the app server and the database server
  9. Request an account on Amazon and Azure. The more you play with DBaaS in the Oracle environment the more you will appreciate it. Things like creating a RAC cluster are simple. Linking a Java Service to a Database Service is simple. Running a load balancer in front of a Java Service is easy. Play with the differences between IaaS with a database and PaaS DBaaS. There is a world of difference.
  10. If you run your demo long enough, look at the patch administration. It is worth looking at since this is a major differential between Oracle, Amazon, and Azure.
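
The tunnel mentioned in item 2 above is a standard ssh port forward; run it from your desktop and point SQL Developer at localhost (the key path and ip are placeholders):

    # forward local port 1521 to the database listener over ssh
    ssh -i ~/.ssh/my-cloud-key -N -L 1521:localhost:1521 oracle@<instance_ip>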

In summary, we didn’t go through a tutorial on how to create a database as a service. At this point all of you should have looked at one or two tutorials, one or two videos, and one or two documentation pages. You should have a sample database to move forward with. It does not matter if it is Standard Edition, or Enterprise Edition, High Performance, or Extreme Performance. You should have a simple database that we can start to play with. The whole exercise should have taken you about an hour to learn and play and an hour to wait for the service to run to completion. Connect via ssh and run sqlplus as the oracle user. Open up port 1521 and download SQL Developer and connect to your cloud instance. Explore, play, and have fun experimenting. That is the primary reason why we give you a full database account and not a quarter of an account that you can’t really do much with.

Technology behind DBaaS

Before we can analyze different use cases we need to first look at a couple of things that enable these use cases. The foundation for most of these use cases is data replication. We need to be able to replicate data from our on-premise database into a cloud database. The first issue is replicating data and the second is access rights to the data and database allowing you to pull the data into your cloud database.

Let’s first look at how data is stored in a database. On a Linux operating system, storage is typically split into four categories: ORACLE_HOME, +DATA, +FRA, and +RECO. The binaries that represent the database and all of the database processes go into the ORACLE_HOME or ORACLE_BASE; in the cloud this is dropped into /u01. If you are using non-RAC, the file system is a logical volume manager (LVM) where you stripe multiple disks to mirror or triple mirror data to keep a single disk failure from bringing down your database or data. If you are using a RAC database this goes into ASM, a disk technology that manages replication and performance. There are a variety of books and websites written on these technologies.


The reason why we go into storage technologies is that we need to know how to manage how and where data is stored in our DBaaS. If we access everything with IaaS and roll out raw compute and storage, we need to know how to scale up storage if we run out of space. With DBaaS this is done with the scale up menu item. We can grow the file system by adding logical units to our instance and grow the space allocated for data storage or data logging.

The second file system that we should focus on is the +DATA area. This is where data is stored and where all of our file extents and tables are located. For our Linux cloud database this is auto-provisioned into /u02. In our test system we created a 25 GB data area and got a 20 GB file system in the +DATA area.



If we look at the /u02 file system we notice that there is one major directory, /u02/app/oracle/oradata. In oradata there is one directory associated with the ORACLE_SID; in our example we called it ORCL. In this directory we have control01.ctl, sysaux01.dbf, system01.dbf, temp01.dbf, undotbs01.dbf, and users01.dbf. These files are where data is stored for the ORCL SID. There is also a PDB1 directory in this file structure, which correlates to the pluggable database that we called PDB1. The files in this directory correspond to the tables, system, and user information relating to this pluggable database; if we create a second pluggable database, a new directory is created with its own copies of these files. The users01.dbf file (PDB1_users01.dbf in the PDB1 directory) defines all of the users and their access rights. The system01.dbf file defines the tables and system level structures; in a pluggable database the system01 file defines the structures for PDB1 and not the entire database. The temp01.dbf file holds temp data tables and scratch areas. The sysaux01.dbf file contains auxiliary system information like the control area structures and management information. The undotbs01.dbf file is the undo area that supports flashback, so that we can look at information that was stored three days ago in a table. Note that there is no undotbs01.dbf file in the pluggable because undo is handled at the global level and not at the pluggable layer. Backups are done for the SID and not each PDB, and tuning of memory and system parameters is done at the SID layer as well.
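
On our test system the directory listing looks roughly like this (file and directory names from our example above):

    $ ls /u02/app/oracle/oradata/ORCL
    control01.ctl  PDB1  sysaux01.dbf  system01.dbf  temp01.dbf
    undotbs01.dbf  users01.dbf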

Now that we have looked at the files corresponding to tables and table extents, we can talk about data replication. If you follow the methodology of EMC and NetApp, you should be able to replicate the dbf files between two file systems. Products like SnapMirror let you block copy any changes that happen to a file to another file system in another data center. This is difficult to do between an on-premise server and a cloud instance. EMC and NetApp do this in the controller layer: they log write changes to the disk, track which blocks get changed, and communicate the changes to the controller on the target system. The target system takes these block changes, figures out which blocks they correspond to on its own disk layout, and updates the blocks as needed. This does not work in a cloud storage instance, where we deal at the file layer and not at a track and sector or block layer. The fundamental problem with this replication mechanism is that you must restart the database or re-ingest the new file into it. The database server does not do well when files change underneath it, because it caches information in memory, and indexes into data break if data is moved to another location. This type of replication is fine if you have a recovery point objective of an hour or more. If you are looking at minutes of replication lag you will need to go with something like DataGuard, GoldenGate, or Active DataGuard.

DataGuard works similarly to the block change recording but does so at the database layer, not the file system/block layer. When an update or insert command is executed in the database, these changes are written to the /u04 directory. In our example the +REDO area is allocated 9.8 GB of disk. If we look at our /u04 structure we see that /u04/app/oracle/redo contains the redoXX.log files. With DataGuard we take these redo files, compress them, and transfer them to our target system. The target system takes the redo file, uncompresses it, and applies the changes to the database. You can structure the changes either as physical logging or logical logging. Physical logging records the block level changes for everything in the database. Logical logging takes the actual SQL statement and replicates it to the target system. The target system either applies the physical changes to its files or executes the statement on the target database. Physical replication is used more than logical replication because logical has limitations on some statements; for example, blob and file operations can not be translated to the target system because you can’t guarantee that the file structure is the same between the two systems. There are a variety of books available on DataGuard. It is also important to note that in the cloud DataGuard is not available for Standard Edition and Enterprise Edition, but only for High Performance Edition and Extreme Performance Edition.

  • Oracle Data Guard 11g Handbook
  • Oracle Dataguard: Standby Database Failover Handbook
  • Creating a Physical Standby Documentation
  • Creating a Logical Standby Documentation

GoldenGate is a similar process, but there is an intermediary agent that takes the redo log, analyzes it, and translates it for the target system. This allows us to take data from an Oracle database and replicate it to SQL Server, and it also allows us to go in the other direction. SQL Server, for example, is typically used for SCADA or process control systems, while the Oracle database is typically used for analytics and heavy duty number crunching on a much larger scale. If we want to look at how our process control systems are operating in relation to our budget, we will want to pull in the data from the process systems and look at how much we spend on each system. We can do this by either selecting data from the SQL Server or replicating the data into a table on the Oracle system. If we are doing complex join statements and pulling data in from multiple tables, we typically want to do this on one system rather than pulling the data across the network multiple times. GoldenGate allows us to pull the data into a local table and perform the complex select statements without suffering network latency beyond the initial copy. GoldenGate is a separate product that you must pay for, either on-premise or in the cloud. If you are replicating between two Oracle databases you could use Active DataGuard instead, which is available as part of the Extreme Performance Edition of the database.

The /u03 area in our file system is where backups are placed. The file system for our sample system shows /u03/app/oracle/fast_recovery_area/ORCL, where ORCL is the ORACLE_SID of our installation. Note that there is no PDB1 area because all of the backup data is handled at the system layer and not at the pluggable layer. The tool used to back up the database is RMAN. There are a variety of books available to help with RMAN, as well as an RMAN online tutorial.
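
A minimal RMAN backup, run as the oracle user on the instance, looks something like this sketch (a production script would add retention and channel configuration, and PLUS ARCHIVELOG assumes the database runs in archivelog mode):

    $ rman target /
    RMAN> BACKUP DATABASE PLUS ARCHIVELOG;
    RMAN> LIST BACKUP SUMMARY;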

It is important to note that RMAN requires system level access to the database. Amazon RDS does not allow you to replicate your data using RMAN; it uses a volume snapshot and copies it to another zone. The impact of this is that, first, you can not get your data out of Amazon with a backup and can not copy your changes and data from Amazon RDS to your on-premise system. The second impact is that you can’t use Amazon RDS for DataGuard: you don’t have the sys access into the database that is required to set up DataGuard, and you don’t have access to a file system to drop the redo logs into. To make this available with Amazon you need to deploy the Oracle database into EC2 with S3 storage as the back end. The same is true with Azure: everything is deployed into raw compute and you have to install the Oracle database on top of the operating system. This is more of an IaaS play than a PaaS play. You lose patching of the OS and database, automated backups, and automatic restart of the database if something fails. You also need to lay out the file system on your own and select LVM or some other clustering file system to prevent data loss from a single disk corruption. All of this is done for you with PaaS and DBaaS. Oracle does offer a manual process to perform backups without having to dive deep into RMAN technology: if you are making a change to your instance and want a backup copy before you make it, you can back up your instance manually rather than waiting for the automated backup. You can also change the timing if 2am does not work for your backup window and move it to 4am instead.

We started this conversation talking about growing a table because we ran out of space. With the Amazon and Azure solutions, this must be done manually: you have to attach a new logical unit, map it into the file system, grow the file system, and potentially reboot the operating system. With the Oracle DBaaS we have the option of growing the file system as a new logical unit, growing the /u02 file system to handle more table space, or growing the /u03 file system to handle more backup space.


Once we finish our scale up, the /u03 file system is no longer 20 GB but 1020 GB in size. The PaaS management console allocates the storage, attaches it to the instance, grows the logical volume to fill the additional space, and grows the file system to handle the additional storage. It is important to note that we did not require root privileges for any of these operations. The DBA or cloud admin can scale up the database and expand table resources. We did not need to involve an operating system administrator, request an additional logical unit from the storage admin, or get a senior DBA to reconfigure the system. All of this can be done either by a junior DBA or by an automated script that grows the file system if we run out of space. The only thing missing for the automated script is a monitoring tool to recognize that we are running into a limit. Oracle Enterprise Manager (OEM) 12c and 13c can do this monitoring and kick off processes if thresholds are crossed. It is important to note that you can not use OEM with Amazon RDS because you don’t have the root, file system, or system access to the installation that is required to install the OEM agent.

In summary, we looked at the file system structure that is required to replicate data between two instances. We talked about how many people use third party disk replication technologies to “snap mirror” between two disk installations, and how this does not work when replicating from an on-premise to a cloud instance. We talked about DataGuard and GoldenGate replication that allow us to replicate data to the cloud and back to our data center. We looked at some of the advantages of using DBaaS, rather than a database on IaaS, to grow the file system and back up the database. Operations like backup, growing the file system, and temporarily adding or removing processors can be done by a cloud admin or junior DBA; these tasks required multiple people to make happen in the past. All of these technologies are needed when we start talking about use cases. Most of the use cases assume that the data and data structures that exist in your on-premise database also exist in the cloud, and that you can replicate data to the cloud as well as back from the cloud. If you are going to run a disaster recovery instance in the cloud, you need to be able to copy your changes to the cloud, make the cloud the primary instance, and replicate the changes back to your data center once you bring your database back online. The same is true for development and testing: it is important to be able to attach to both your on-premise database and the database provisioned in the cloud and look at the differences between the two configurations.

DBaaS in Oracle Public Cloud

Before we dive deep into database as a service with Oracle we need to define some terms. We have thrown around concepts like Standard Edition, Enterprise Edition, High Performance Edition, and Extreme Performance Edition. We have talked about concepts like DataGuard, Real Application Clustering, Partitioning, and Compression. Today we will dive a little deeper into this so that we can focus on comparing them running in the Oracle Public Cloud as well as other cloud providers.

First, let’s tackle Standard Edition (SE) vs Enterprise Edition (EE). Not only is there SE, there are also SE One and SE2. SE2 is new with the 12c release of the database and is the same as SE and SE1 but with different processor and socket restrictions. The Oracle 12c documentation details the differences between the versions; we will highlight them here. Note that you can still store data the same way: the data types do not change between the versions of the database, and a select statement that works in SE will work in SE2 and will work in EE.

The first big difference between SE and EE is that SE is licensed on a per socket basis and EE is licensed on a per core basis. The base cost of an SE system is $600 per OCPU per month in the Oracle Public Cloud, and Standard Edition is limited to 8 cores in the cloud. If you are purchasing a perpetual license the cost is $17,500, and it can run across two sockets, or on single sockets in two systems. SE2 comes with a Real Application Clusters (RAC) license so that you can have a single instance running on two computers. The SE2 instance will also limit the database to 16 threads, so running on more cores has no advantage. To learn more about the differences and limitations, I recommend reading Mike Dietrich’s Blog on SE2.

The second big difference is that many of the optional features are not available with SE. For example, you can’t use diagnostics and tuning to figure out if your SQL command is running at top efficiency. You can’t use multi-tenant, but you can provision a single pluggable database; this means that you can unplug the database and move it to another database (even another version like EE). The multi-tenant option allows you to have multiple pluggable databases and control them with a master SGA. This allows admins to back up and patch a group of databases all at once rather than patching each one individually. You can separate security and have different logins to the different databases, but use a global system or sys account to manage and control all of them. Storage optimization features like compression and partitioning are not available in SE either. Data recovery features like DataGuard and FlashBack are not supported in SE. DataGuard is a feature that copies changes from one system through the change logs and applies them to the second system. FlashBack does something similar within one database: it allows you to query the database as of a previous time, using the change log to reconstruct the database as it was at the time requested. Tools like RMAN backup and Streams don’t work in SE; taking a copy of a database and copying it to another system is not allowed. The single exception to this is that RMAN works in the cloud instance but not in the perpetual on-premise version. Security features like Transparent Data Encryption, Label Security, Data Vault, and Audit Vault are not supported in SE, with the single exception that transparent data encryption, which allows for encryption in the public cloud, is supported for SE. All of these features are described here.

Enterprise Edition in the Oracle Public Cloud comes in at $3K/OCPU/month or $5.04/OCPU/hour, and the only option that we get with it is Transparent Data Encryption (TDE), bundled with the database. This allows us to encrypt all or part of a table. TDE encrypts data on the disk when it is written with a SQL insert or update command. Keys are used to encrypt this data, and it can only be read by presenting the keys through the Oracle Wallet interface. More information on TDE can be found here; the Security Inside Out blog is also a good place to look for updates and references relating to TDE. This version of the database allows us to scale up to 16 processors and 4.6 TB of storage. If we want backups for this database, the largest size that we can have for storage is 2.3 TB. If our table requirements are greater than 2.3 TB (or 4.6 TB without backup) you need to go to Exadata as a Service or purchase a perpetual license and run it on-premise. If we are looking to run this database in our data center we will need to purchase a perpetual license at $47.5K per processor license. If you are running on an IBM Power server you license each core as a processor. If you are running on x86 or Sparc servers you multiply the number of cores by 0.5 and can run two cores per processor license. TDE is part of the Advanced Security Option, which lists at $15K per processor license. When calculating whether it is cheaper to run on-premise or in the public cloud you need to factor in both license requirements. The same is true if you decide to run EE in AWS EC2 or Azure Compute; make sure to read the Cloud Licensing Requirements to understand the cost implications of running on EC2 or Azure Compute. Since all cloud providers use x86 processors, the multiplication factor is 0.5 times the number of cores on the service.

The High Performance Edition contains the EE features and TDE, as well as multi-tenant, partitioning, advanced compression, advanced security, real application testing, OLAP, DataGuard, and all of the database management packs. This is basically everything with the exception of Real Application Clusters (RAC), Active DataGuard, and the In-Memory option. High Performance comes in at $4K/OCPU/month or $6.72/OCPU/hour. If we wanted to bundle all of this together and run it in our data center we would need to compare the database at $47.5K/processor license plus roughly $15K/processor/option (there are 12 of them). We can then calculate which is cheaper based on our accounting rules and amortization schedule. The key differential is that we can use this version on an hourly or monthly basis for less than a full year. For example, if we do patch testing once a quarter and allocate three weeks a quarter to test if the patch is good or bad, we only need 12 weeks a year to run the database. This basically costs us $12K/processor/year (three months at $4K) to test on a single processor and $24K on a dual processor. If we purchased the system it would cost us $47.5K in capital expenditure plus 22% annually for support. Paying this amount just to do patch testing does not make sense. With the three year cost of ownership, running this on premise will cost us $78,850 ($47.5K up front plus three years of support at $10,450 per year). If we use the metered services in the public cloud this will cost us $72K (three years of dual processor testing at $24K per year). The $6,850 difference does not seem like a lot, but with the public cloud service we won’t need to pay for the hardware, storage, or operating system. We can provision the cloud service in an hour and replicate our on-site data to the cloud for the testing. If we did this to a computer or virtual image on site it would take hours or days to provision a new computer, storage, and database, and replicate the data.

It is important to note here that you need to be careful with virtualization. You need to use software that allows for hard partitioning. Products like VMWare and HyperV are soft partitioning virtualization software. This means that the number of processors can grow dynamically, so you are required to license the Oracle software for the potential high water mark, or all of the cores in the cluster. If you are running on something like a Cisco UCS blade server that has two 16 core sockets, you must license all 32 cores to run the database even though you might just create a 2 core virtual instance in this VMWare installation. It gets even worse if you cluster 8 blades into one cluster; then you must license all 256 cores. This gets a little expensive at $47.5K times 128 processor licenses. Products like OracleVM, Solaris Containers, and AIX LPARs solve this cost problem with hard partitions.

The third Enterprise Edition option is the Extreme Performance Edition of the database. This edition is $5K/OCPU/month or $8.40/OCPU/hour. This option comes with RAC, Active DataGuard, and In-Memory. RAC allows you to run across multiple compute instances and restart queries that might fail if one node fails. Active DataGuard allows you to have two databases replicating to each other and for both to be open and active at the same time. Regular or passive DataGuard allows you to replicate the data but not keep the target open and active. In-Memory allows you to store data not only in row format but also in column format. When data is entered into the table it is stored on disk in row format. A copy is also placed in memory but stored in column format. This allows you to search faster, given that the data is already organized by column in memory and you can skip data that does not apply to your search. This is typically done with an index, but we can’t always predict what questions the users are going to ask, and adding too many indexes slows down all insert and update operations.
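
As a minimal sketch, assuming the In-Memory option is licensed and the inmemory_size parameter has been set, populating a table into the column store is a single statement (the census table here is hypothetical):

-- keep a columnar copy of the table in the in-memory column store
alter table census inmemory priority high;
-- analytic scans like this can then read the column format instead of full rows
select count(*) from census where state = 'Texas';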

It is important to reiterate that we can take our perpetual license and run it in IaaS or generic compute. We can also effectively lease these licenses on a monthly or hourly rate. If you are running the database, you are consuming licenses. If you stop the database, you stop consuming the database license but continue to consume the storage and processor services. If you terminate the database you stop consuming the database, processor, and storage services because they are all deleted upon termination.

In summary, there are four flavors of DBaaS: Standard Edition, Enterprise Edition, High Performance Edition, and Extreme Performance Edition. Standard Edition and Enterprise Edition are available from other cloud providers; some require perpetual licenses and some do not. If you decide to run this service as PaaS or DBaaS in the Oracle Public Cloud you can pay hourly or monthly and, if the services are metered, start and stop them to help save money. All of these services come with partial management features offloaded and done by Oracle. Backups, patches, and restarts of services are done automatically for you. This allows you to focus more on how to apply the database service to provide business benefits rather than on the care and feeding needed to keep the database operational.

Up next, we will dive into use cases for database as a service and look at different configurations and pricing models to solve a real business problem.

Exadata as a Service

For the last four days we have been focusing on Database as a Service in the cloud. We focused on Application Express, or Schema as a Service, in the last three days and looked at pricing and how to get APEX working in the Oracle Public Cloud, Amazon AWS, and Microsoft Azure. With the Oracle Public Cloud we have three options for database in the cloud at the platform as a service layer; Schema as a Service, Database as a Service, and Exadata as a Service. We could run this in compute as a service but have already discussed the benefits of offloading some of the database administration work with platform as a service (backup, patching, restarting services, etc).

The question that we have not adequately addressed is how you choose between the three services offered by Oracle. We touched on one of the key questions, database size, when we talked about Schema as a Service. You can have a free database in the cloud if your database is smaller than 25 MB. It will cost you a little money, $175/month, if you have a database smaller than 5 GB. You can grow this to 50 GB and stay with the Schema as a Service. If your database is larger than 50 GB you need to look at Database as a Service or Exadata as a Service. You also need to look at these alternatives if you are running an application in a Java container and need to attach to the database through the standard port 1521, since Schema as a Service only supports http(s) connections to the database. If you can query the database with a REST api call, Schema as a Service is an option, but it is not necessarily tuned for performance. Products like WebLogic or Tomcat or other Java containers can cache select results and not have to ask the database the same question over and over again. For example, if we have census data and are interested in the number of people who live in Texas, we get back roughly 27 million rows of data from the query. If we want to drill down and look at how many people live in San Antonio, we get back 1.5 million rows. If our Java code were smart enough and our application server had enough buffer space, we would not need to read the 27 million rows back when we just want to look at the 1.5 million rows relating to San Antonio. The database can keep the data in memory as well and does not need to read the data back from disk to run the select statement that finds the state or city rows matching the query.

Let’s take a step back and talk about how a database works. We create a table and put information in columns like first name, last name, street address, city, state, zip code, email address, and phone number. This allows us to contact each person either through snail mail, email, or phone. If we allocate 32 bytes for each field, we have 8 fields and each row takes up 256 bytes to identify each person. If we store data for each person who lives in Texas we consume 27 million rows. Each row takes up 256 bytes, so the whole table fits into 6.9 GB of storage. This data is stored in a table extent or file that we save into the /u02/data directory. If we expand our database to store information about everyone who lives in the United States we need 319 million rows, which expands our database to 81.7 GB. Note that we have crossed the boundary for Schema as a Service. We can’t store this much information in a single table, so we have to look at Database as a Service or Exadata as a Service. Yes, we can optimize our database by using less than 32 bytes per column. We can store zip codes in 16 bytes. We can store phone numbers in 16 bytes. We can store state information in two bytes. We can also use compression in the database and not store the characters “San Antonio” in a 32 byte field, but store the string once in an alternate table and correlate it to the hexadecimal number 9c. We then store 9c in the city field, which tells us that the city name is stored in another table. This saves us 1.5 million times 31 bytes (one byte is still used to store the 9c), or 46 MB of storage. If we can do this for everyone in Texas we shrink the storage by 840 MB, roughly 13% of what we had allocated for all of the information related to people who live in Texas. If we can do this for the city, state, and zip code fields we can reduce the storage required by 39%, shrinking the 81.7 GB to 49.8 GB. This is basically what is done with a technology called Hybrid Columnar Compression (HCC). You create a secondary table that correlates the 9c value to the character string “San Antonio”; the character string is stored only once, and the city information shrinks from 32 bytes to 1 byte. When you read back the city name, the database or storage that does the compression returns the string to the application server or application.
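
A minimal sketch of creating an HCC-compressed copy of the hypothetical census table used in this example (HCC itself is only available on Exadata or similar Oracle storage):

-- store a copy of the table in hybrid columnar compressed format
create table census_hcc compress for query high
  as select * from census;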

When you do a select statement the database looks at the table for the columns that you are asking for and returns all of the rows that match the where clause. In our example we might use

select * from census where state = 'Texas';
select * from census where city = 'San Antonio';

We can restrict what we get back by not using the “*” value. We can get just the first_name, last_name, and phone number if that is all we are interested in. The select statement for San Antonio will return 1.5 million rows times 8 columns times 32 bytes, or 384 MB of data. A good application server will cache this 384 MB of data, and if we issue the same select statement again in a few seconds or minutes we do not need to ask the database again. We issue a simple request to the database asking it if anything has changed since the last query. If we are running on a slow internet connection, as we find in our homes, we are typically running at 3 MB/second download speeds. To transfer all of this data will take us 128 seconds, or about two minutes. Not reading the data a second time saves us two minutes.

The way that the database finds which 384 MB to return to the application is done similarly. It looks at all of the 81.7 GB that stores the census data and compares the state name to ‘Texas’ or the hex value corresponding to the state name. If the compare matches, that row is put into a response buffer and transmitted to the application server. If someone comes back a few seconds later and requests the information correlating to the city name ‘San Antonio’, the 81.7 GB is read from disk again and the 384 MB is pulled out to return to the application server. A smart database will cache the Texas data, recognize that San Antonio is a subset of Texas, and not read the 81.7 GB a second time, pulling the data from memory rather than disk. This can easily be done by partitioning the data in the database, storing the Texas data in one file or disk location and the data correlating to California in another file or disk location. Rather than reading back 81.7 GB to find Texas data we only need to read back 6.9 GB since it has been split out in storage. For a typical SCSI disk attached to a computer, we read data back at 2.5 GB/second. To read back all of the US data takes us 33 seconds. To read back all of the Texas data takes us 2.76 seconds. We basically save 30 seconds by partitioning our data. If we read the Texas data first and the San Antonio data second with our select statements, we can cache the 6.9 GB in memory and not have to perform a second read from disk, saving us yet another 33 seconds (or 3 seconds with partitioned data). If we know that we will be asking for San Antonio data on a regular basis we set up an index or materialized view in the database so that we don’t have to sort through the 6.9 GB of data but read just the relevant 384 MB the first time, reducing our disk access time to 0.15 seconds.

It is important to note that we have done two simple things that reduced our access time from 33 seconds to 0.15 seconds. First, we partitioned the data by state in the file system, changing the way that we store it. Second, we created an index that helps us access the San Antonio data in the file associated with Texas without having to sort through all of the data; we effectively pre-sort the data and provide the database with an index. The cost of this is that every insert command that adds a new person to San Antonio requires not only updating the Texas table but also updating the index associated with San Antonio. In fact, any insert must check whether the data goes into the Texas table and update the index at the same time, whether the information correlates to San Antonio or not, because the index might change if data is inserted or updated in the middle of the file associated with the Texas information.
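
A minimal sketch of those two steps for the hypothetical census table (list partitioning is an extra-cost EE option):

-- partition the census data by state so a Texas query reads only
-- the Texas partition
create table census (
  first_name     varchar2(32),
  last_name      varchar2(32),
  street_address varchar2(32),
  city           varchar2(32),
  state          varchar2(32),
  zip_code       varchar2(32),
  email          varchar2(32),
  phone          varchar2(32)
)
partition by list (state) (
  partition p_texas      values ('Texas'),
  partition p_california values ('California'),
  partition p_other      values (default)
);

-- index the city column within each partition so the San Antonio rows
-- can be found without scanning all of the Texas data
create index census_city_idx on census (city) local;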

Our original question was how to choose between Schema as a Service, Database as a Service, and Exadata as a Service. The first metric that we used was table size. If our data is greater than 25 MB, we can’t use the free APEX service. If our data is greater than 50 GB, we can’t use the paid APEX or Schema as a Service. If we want to use features like compression or partitioning, we can’t use the Schema as a Service either unless we have sys access to the database. We can create indexes for our data to speed requests, but we might or might not be able to set up compression or partitioning since these are typically features associated with the Enterprise Edition of the database. If we look at the storage limitations of the Database as a Service we can currently store 4.8 TB worth of data in the database. If we have more data than that we need to go to Exadata as a Service. The Exadata service comes in different flavors as well and allows you to store up to 42 TB with a quarter rack, 84 TB with a half rack, and 168 TB with a full rack. If you have a database larger than 168 TB, there are no solutions in the cloud that can store your data attached to an active database. You can back up your data to cloud storage, but you cannot have an active database attached to it.

If we look a little deeper into Exadata there are multiple advantages to going with Exadata as a Service. The first and most obvious is that you are suddenly working on dedicated hardware. In most cloud environments you share processors with other users as well as storage. You do not get dedicated bandwidth from processor to disk but must time share it with other users. If you provision a 16 core system, it will typically consume half of a 32 core system that has two sockets. This means that you get a full socket but have to share the memory and disk bandwidth with the next person running in the same server. The data read from the disk is cached in the disk controller’s cache, and your reads are optimized until someone else reads data from the same controller and your cached data gets flushed to make room. Most cloud vendors go with commodity hardware for compute and storage, so they are optimized not for database but for general purpose compute. With Exadata as a Service you get hardware optimized for database and you get all of the processors in the quarter, half, or full rack. There is no competing for memory bandwidth or storage bandwidth. You are electrically isolated from someone in the other quarter or half rack through the Infiniband switch. Your data is isolated on spindles of your own. You get the full 40 GB/second to and from the disk. Reading the 81.7 GB takes 2.05 seconds, compared to 32.68 seconds through a standard SCSI disk controller. The data is partitioned and stored automatically, so that when we ask for the San Antonio data, we only read back the 384 MB and don’t need to read back all of the data or deal with the index update delays when we write the data. The storage scans all 81.7 GB and returns the results in 0.01 seconds. We effectively reduce the 33 seconds it took us previously to 10 ms.

If you want to learn more about Exadata and how and why it makes queries run faster, there are a number of good books, YouTube video channels, and web sites that cover it in depth.

The Exadata as a Service is a unique offering in the cloud. Amazon and Microsoft have nothing that compares to it. Neither company offers dedicated compute that is specifically designed to run a database in the cloud with dedicated disk and dedicated I/O channels. Oracle offers this service to users of the Enterprise Edition of the database, allowing them to replicate their on-premise data to the cloud, ingest the data into an Exadata in the cloud, and operate on the data and processes unchanged and unmodified in the cloud. You could take your financial data that runs on an 8 or 16 core system in your data center and replicate it to an Exadata in the cloud. Once you have the data there you can crunch on it with long running queries that would take hours on your in-house system. We worked with a telecommunications company years ago that was using an on-premise transportation management system to generate an inventory load list to put parts on their service trucks, work orders for the maintenance repair staff, and a driving list to route the drivers on the optimum path to cover the largest number of customers in a day. The on-premise system took 15-16 hours to generate all of this workload and was prone to errors and outages, requiring the drivers to delay their routes or parts to be shipped overnight for loading onto the trucks in the morning. Running this load on an Exadata dropped the analytics to less than an hour. This allowed trucks to be rerouted mid-day to higher profit customers to handle high priority outages, as well as next day delivery of inventory between warehouses rather than rush orders. Reducing the analytics from 15 hours to less than an hour allowed an expansion of services as well as a higher quality of service to their customer base.

Not all companies have daily issues like this; some look for higher level processing once a quarter or once or twice a year. Opening new retail outlets, calculating taxes due, and provisioning new services that were purchased as Christmas presents are three examples of predictable, periodic workloads where it makes sense to consume a larger footprint in the cloud rather than invest in resources that sit idle most of the year in your data center. Having the ability to lease these services on a monthly or annual basis allows for better utilization of resources in your data center, reduces the overall spend of the IT department, and expands the capabilities of business units to do things that they normally could not afford.

Exadata as a Service is offered in a non-metered configuration at $40K per month for a quarter rack (16 cores and 144 TB of disk), $140K per month for a half rack (56 cores and 288 TB of disk), or $280K per month for a full rack (112 cores and 576 TB of disk). The same service is offered on a metered basis for $80K for a quarter rack, $280K for a half rack, and $560K for a full rack (in the same configuration as the non-metered service). One of the things that we recommend is that you analyze the cost of this service. Is it cheaper to effectively lease a quarter rack at $80K for a month and get the results that you want, effectively lease a quarter rack at $480K for a year, or purchase the hardware, database license, RAC licenses, storage cell licenses, and other optional components to run this in your data center. We will not dive into this analysis because it truly varies based on use cases, value to your company for the use case, and cost of running one of these services in your data center. It is important to do this analysis to figure out which consumption model works for you.

In summary, Exadata as a Service is a unique service that no other cloud vendor offers. Having dedicated hardware to run your database is unique for cloud services. Having hardware that is optimized for long, complex queries is unique as well. Exadata is one of the most popular hardware solutions offered by Oracle, and having it available on a monthly or annual basis allows customers to use the service at a much lower cost than purchasing a box, or a larger box, for their data center. Having Oracle manage and run the service frees up your company to focus on the business impact of the hardware and accelerated database rather than spending months administering the server and database. Tomorrow we will dive into Database as a Service and see how a generic database in the cloud has a variety of use cases and different cost entry points as well as features and functions.

APEX in Azure

Today we are going to look at what it takes to get Schema as a Service running in the Microsoft Azure cloud. Our last two entries looked at Schema as a Service, or Application Express (APEX), running in the Oracle Public Cloud for free, $175, $900, or $2000/month, and running in the Amazon RDS Service for $50 – $2700/month. The idea behind Schema as a Service is that you are leasing a database instance on a compute server in the cloud. You get automated backup, http and https interfaces into the database, tools to load and unload data, interfaces to run ad-hoc queries or scripted queries, and REST apis to access your data. The pricing on the Oracle Public Cloud is based on how much storage you consume for your database. The cost on Amazon RDS is based on the compute power allocated to the database. Storage is charged separately, and the more you consume, the more you get charged.

Azure uses the Amazon model for pricing in that it charges on a processor shape. Automated backup and the APEX interface/libraries are not included and require separate steps to install and configure the software on top of the Oracle database. The default operating system installation is Windows Server, and you are billed monthly for this shape as well. Azure does not offer backup or any other part of a managed service, but only provides a virtual image with the Oracle database pre-installed on a Windows Server in the cloud. The pricing for this software is

Shape | Cores | RAM | Disk | Standard Edition | Enterprise Edition | Shape price
A0 Basic or Standard | 0.25 | 0.75 GB | 19 GB | $1.11/hr or $826/mo | $3.16/hr or $2,351/mo | $0.02/hr or $15/mo
A1 Basic or Standard | 1 | 1.75 GB | 224 GB | $1.11/hr or $826/mo | $3.16/hr or $2,351/mo | $0.09/hr or $67/mo
A2 Basic or Standard | 2 | 3.5 GB | 489 GB | $1.11/hr or $826/mo | $3.16/hr or $2,351/mo | $0.18/hr or $134/mo
A5 Standard | 2 | 14 GB | 489 GB | $1.11/hr or $826/mo | $3.16/hr or $2,351/mo | $0.33/hr or $246/mo
A3 Basic or Standard | 4 | 7 GB | 999 GB | $1.28/hr or $952/mo | $6.32/hr or $4,702/mo | $0.36/hr or $268/mo
A6 Standard | 4 | 28 GB | 999 GB | $1.28/hr or $952/mo | $6.32/hr or $4,702/mo | (not listed)
A4 Basic or Standard | 8 | 14 GB | 2,039 GB | $2.55/hr or $1,897/mo | $12.63/hr or $9,397/mo | $0.72/hr or $536/mo
A7 Standard | 8 | 56 GB | 2,039 GB | $2.55/hr or $1,897/mo | $12.63/hr or $9,397/mo | $1.32/hr or $982/mo
A8 Standard | 8 | 56 GB | 382 GB | $2.55/hr or $1,897/mo | $12.63/hr or $9,397/mo | (not listed)
A10 Standard | 8 | 56 GB | 382 GB | $2.55/hr or $1,897/mo | $12.63/hr or $9,397/mo | (not listed)
A9 Standard | 16 | 112 GB | 382 GB | $5.10/hr or $3,794/mo | $25.27/hr or $18,801/mo | (not listed)
A11 Standard | 16 | 112 GB | 382 GB | $5.10/hr or $3,794/mo | $25.27/hr or $18,801/mo | (not listed)
D1 Standard | 1 | 3.5 GB | 50 GB | $1.11/hr or $826/mo | $3.16/hr or $2,351/mo | $0.14/hr or $104/mo
D2 Standard | 2 | 7 GB | 100 GB | $1.11/hr or $826/mo | $3.16/hr or $2,351/mo | $0.28/hr or $208/mo
D11 Standard | 2 | 14 GB | 100 GB | $1.11/hr or $826/mo | $3.16/hr or $2,351/mo | $0.33/hr or $246/mo
D3 Standard | 4 | 14 GB | 200 GB | $1.28/hr or $952/mo | $6.32/hr or $4,702/mo | $0.56/hr or $417/mo
D12 Standard | 4 | 28 GB | 200 GB | $1.28/hr or $952/mo | $6.32/hr or $4,702/mo | $0.652/hr or $485/mo
D4 Standard | 8 | 28 GB | 400 GB | $2.55/hr or $1,897/mo | $12.63/hr or $9,397/mo | $1.12/hr or $833/mo
D14 Standard | 16 | 112 GB | 800 GB | $5.10/hr or $3,794/mo | $25.27/hr or $18,801/mo | $2.11/hr or $1,571/mo

Note that this is generic database pricing plus compute shape pricing. It is not Schema as a Service pricing; we still need to layer APEX on top of the generic database installation. The cheapest option for deploying Schema as a Service therefore comes to $841/month or $1.13/hr for Standard Edition and $2,366/month or $3.18/hr for Enterprise Edition (the A0 shape plus the database). We can basically stop here, though: running an Oracle database on a quarter of a core with less than 1 GB of RAM is unusable. Going to the A1 Basic or Standard shape increases the memory to 1.75 GB, which might work for a single schema, and the 70 GB of disk will store enough tables for most use cases.

Let’s walk through creation of an Oracle database on Azure. First we go to the Azure portal and search for the Oracle virtual machine by clicking on New.



We are looking for either the Enterprise Edition or Standard Edition of the database.

We select the Standard Edition and ask that it be provisioned.

Once we select the product, we select the shape that we will run this instance on. The recommended shape is relatively large, so we look at all shapes and select the A1 shape, primarily for cost reasons. In practice we should look at what we are trying to do and pick the right core count and memory footprint.


We select the network and storage models for this instance. We go with the defaults in this example rather than adding another storage instance since we are just looking for a small footprint.

Some things to note here. First, we have the option of a username and password or ssh to connect to the operating system. Second, we are never presented with any information about the operating system. Based on the documentation that we read earlier, I assumed that this would be Windows Server; it turns out that it is actually Oracle Enterprise Linux 6.7. Third, we are never asked any information about the database. We are not asked about the SID, the password for sys, or the ports to open up or use to connect to the database. It turns out that the database software is installed in the /u01 directory but no database is created. You still need to run dbca to create a database instance and start a listener. There are other OS options available to install the database on. We theoretically could have selected Windows 2012, but these options did not come up in our search.

It took a few minutes to start up the database virtual machine. We can look up the details on the instance to find the ip address to connect to and use putty or ssh to connect to the instance.





When we log in we can look at the installation directory and notice that everything is installed in /u01. When we look at /etc/oratab we notice that nothing is configured or installed. We will need to run dbca to create a database instance. We will then need to download and install APEX to configure Schema as a Service.


In summary, we can install and run an Oracle database on Azure. The installation is lacking a bit. This is more Infrastructure as a Service with a binary pre-installed but not configured. This is not Database as a Service, and it is lacking when we compare it to Schema as a Service. To get Schema as a Service we need to download software, install it, change the network configuration, and update the virtual machine network (we did not show this). Microsoft has a good tutorial on the steps needed to create the database once the virtual machine is installed. You do get root access to the operating system. You do get sys access to the database. You do get file system access through the operating system. The documentation says that the service is on Windows, but it got provisioned on Linux. Looking back, we might have done something wrong, or the documentation is wrong. The service is priced as a generic database starting at $800/month or more based on the shape you select. This installation is not DBaaS or PaaS. Backups are not automatically done for you. Patches are not configured or installed. You basically get an operating system with a binary installed. The database is not created, and ports are not configured to allow you to connect across the internet.

Apex in Amazon AWS

Today we are going to look at running Schema as a Service using Amazon AWS as your IaaS foundation. Yesterday we looked at Schema as a Service using Oracle DBaaS. To quickly review, you can run a schema in the cloud using apex.oracle.com for tables up to 25 MB for free, or cloud.oracle.com for tables of 5 GB, 20 GB, or 50 GB for a monthly fee. You do not get a login to the operating system, database, or file system, but do everything through an html interface. You can query the database through an application that you write in the APEX interface or through REST api interfaces, both of which are accessible through http or https. Today we are looking at what it takes and how much it costs to do the same thing using Amazon AWS.

Amazon offers a variety of database options as a managed service. This is available through the Amazon RDS Console.

If you go to the RDS Getting Started page you will see that you can provision

  • MySQL
  • Oracle
  • SQL Server
  • Maria DB
  • or an Amazon custom database, Aurora

We won’t compare and contrast the different databases in this blog entry, but will instead go into the Oracle RDS offerings and look at what you get that compares to the Schema as a Service offered by Oracle.

The Oracle database offerings that you get from Amazon RDS are

  • Oracle Standard Edition One
  • Oracle Standard Edition Two
  • Oracle Standard Edition
  • Oracle Enterprise Edition

Note that we can launch any version in EC2, but we are looking for a platform as a service where the database is pre-configured and some management is done for us, like patching, backups, operating system maintenance, failure detection, and service restarting.

You can launch the 11g or 12c version of the Oracle database, but it is important to note that through RDS you do not get access to the operating system, file system, or sys/system user. There is an elevated user that lets you perform a limited list of functions, but not all options are available to this elevated user. Some features are also not enabled in the 12c instance

  • In-Memory
  • Spatial
  • multi-tenant or pluggable databases
  • Real Application Clusters (RAC)
  • Data Guard / Active Data Guard
  • Connection to Enterprise Manager
  • Automated Storage Management
  • Database Vault
  • Java libraries
  • Locator services
  • Label Security

In the 11g instance the above list plus the following features are not supported

  • Real Application Testing
  • Streams
  • XML DB

The following roles and privileges are not provided to users in Amazon RDS

  • Alter database
  • Alter system
  • Create any directory
  • Drop any directory
  • Grant any privilege
  • Grant any role

Before we dive into the usage of Amazon RDS, let’s talk pricing and licensing. The only option that you have for a license included with RDS is the Standard Edition One license type. To figure out the cost, we must look at the sizes that we can provision as well as the RDS cost calculator. To start this journey, we start at the AWS console, go to the RDS console, and select Oracle SE1 as the instance type.



If we select the license-included License Model we get to look at the shapes that we can deploy as well as the versions of the database.

We can use the cost calculator in conjunction to figure out the monthly cost of deploying this service. For our example we selected 11.2.0.4 v7, db.t2.micro (1 vCPU, 1 GB RAM), and 20 GB of storage. For this shape we find that the monthly cost will be $25.62. We selected the 11.2.0.4 version because this is the only 11g option available to us for the SE1 license-included selection. We could have selected 12.1.0.1 as an option. If we select any other version we must bring our own license to run on AWS. It is important to look at the outbound transfer rate because this cost is sometimes significant. If we add 20 GB of outbound traffic the price increases to $26.07, which is not significant. This says that we can back up our entire database offsite once a month and not have to pay a significant amount to get our data off RDS.

It is important to look at the shape options that we have for the different database versions and the cost associated with each. For 11g we have

  • db.t2.micro (1 vCPU, 1 GB) – $25.62/month
  • db.t2.small (1 vCPU, 2 GB) – $51.24/month
  • db.t2.medium (2 vCPU, 4 GB) – $102.48/month
  • db.t2.large (2 vCPU, 8 GB)- $205.70/month
  • db.m4.large (2 vCPU, 8 GB)- $300.86/month
  • db.m4.xlarge (4 vCPU, 16 GB)- $602.44/month
  • db.m4.2xlarge (8 vCPU, 32 GB)- $1324.56/month
  • db.m4.4xlarge (16 vCPU, 64 GB) – $2649.11/month
  • db.m3.medium (1 vCPU, 3.75 GB) – $153.72/month
  • db.m3.large (2 vCPU, 7.5 GB) – $307.44/month
  • db.m3.xlarge (4 vCPU, 15 GB) – $614.88/month
  • db.m3.2xlarge (8 vCPU, 30 GB) – $1352.74/month
  • db.r3.large (2 vCPU, 15 GB) – $333.06/month
  • db.r3.xlarge (4 vCPU, 30 GB) – $666.12/month
  • db.r3.2xlarge (8 vCPU, 61 GB) – $1465.47/month
  • db.r3.4xlarge (16 vCPU, 122 GB) – $2930.93/month
  • db.m2.xlarge (2 vCPU, 17 GB) – $409.92/month
  • db.m2.2xlarge (4 vCPU, 34 GB) – $819.84/month
  • db.m2.4xlarge (8 vCPU, 68 GB) – $1803.65/month
  • db.m1.small (1 vCPU, 3.75 GB) – $84.18/month
  • db.m1.medium (2 vCPU, 7.5 GB) – $168.36/month
  • db.m1.large (4 vCPU, 15 GB) – $336.72/month
  • db.m1.xlarge (8 vCPU, 30 GB) – $673.44/month

For the 12c version we have

  • db.m1.small (1 vCPU, 3.75 GB) – $84.18/month
  • db.m3.medium (1 vCPU, 3.75 GB) – $153.72/month
  • db.m3.large (2 vCPU, 7.5 GB) – $307.44/month
  • db.m3.xlarge (4 vCPU, 15 GB) – $614.88/month
  • db.m3.2xlarge (8 vCPU, 30 GB) – $1352.74/month
  • db.m2.xlarge (2 vCPU, 17 GB) – $409.92/month
  • db.m2.2xlarge (4 vCPU, 34 GB) – $819.84/month
  • db.m2.4xlarge (8 vCPU, 68 GB) – $1803.65/month
  • db.m1.medium (2 vCPU, 7.5 GB) – $168.36/month
  • db.m1.large (4 vCPU, 15 GB) – $336.72/month
  • db.m1.xlarge (8 vCPU, 30 GB) – $673.44/month

If we want to create the database, we can select the database version (11g), the processor size (smallest, just to be cheap for demo purposes), and storage. We define the SID, username, and password for the elevated user, and click next.

We then confirm the selections and backup schedule (scroll down to see), and click on Launch.

When we launch this, the system shows that the inputs were accepted and the database will be created. We can check on the status by going to the RDS console.

It takes a few minutes to provision the database instance, 15 minutes in our test. When the creation is finished we see “available” rather than “creating” for the status.


Once the instance is created we can connect to the database using the Oracle Connection Instructions: connecting with sqlplus installed on a local machine to the remote database (the one we just created), using the aws connection tools to get status (aws rds describe-db-instances --headers), or connecting with SQL Developer to the ip address, port 1521, and user oracle with the password we specified. We chose to open up port 1521 to the internet during the install, which is not necessarily best practice.

Note that we have fallen short of Schema as a Service. We have Database as a Service at this point. We will need to layer Application Express on top of this to get Schema as a Service. We can install APEX 4.1.1 on the 11g instance that we just created by following the installation instructions. Note that this is a four step process followed by spinning up an EC2 instance and installing additional software to run a listener, because the APEX listener is not supported on the RDS instance. We basically add $15-$20/month to spin up a minimal EC2 instance, install the listener software, and follow the nine step installation process to link the listener to the RDS instance.

The installation and configuration steps are similar for 12c. We can provision a 12c instance of the database in RDS, spin up an EC2 instance for the listener, and configure the EC2 instance to point to the RDS instance. At this point we have the same experience as Schema as a Service with the Oracle DBaaS option.

In summary, we can provision a database into Amazon RDS. If we want anything other than Standard Edition One, we need to bring our own license at $47.5K/two cores for Enterprise Edition or $17.5K for Standard Edition and maintain our annual support cost at 22%. If we want to use a license provided by Amazon we can, but we are limited to the Standard Edition One version and APEX 4.1.1 with 11.2.0.4 or APEX 4.2.6 with 12.1.0.1. Both of these APEX versions are a major version behind the 5.0 version offered by Oracle through DBaaS. On the positive side, we can get an SE One instance with APEX installed for about $50/month for 11g or $100/month for 12c, which is slightly cheaper than the $175/month through Oracle. The Oracle product is Enterprise Edition vs Standard Edition One on AWS RDS, and the Oracle version comes with APEX 5.0 as well as being configured for you upon provisioning, as opposed to having to perform 18 steps and spin up an EC2 instance to act as a listener. It is difficult to compare the two as the same product, but if you are truly only interested in a simple query engine schema as a service in the cloud, RDS might be an option. If you read the Amazon literature, switching to Aurora is a better option, but that is a discussion for another day.

Application Express or Schema as a Service

Today we are going to dive headlong into Schema as a Service. This is an interesting option offered by Oracle. It is a unique service that is either free or can cost as much as $2K/month. Just as a quick review, you get the Oracle database through an http interface for:

    • 10 MB storage – free
    • 25 MB storage – free
    • 5 GB – $175/month
    • 20 GB – $900/month
    • 50 GB – $2000/month

When you consume this service you don’t get access to an operating system. You don’t get access to a file system. You don’t get access to the database from the command line. All access to the database is done through an http or https interface. You can access the management console to load, back up, and query data in the database. You can load, back up, and update applications that talk to the data in the database. You can create a REST api that allows you to read and write data in your database as well as run queries against the database. You get access to a single processor with 7.5 GB of RAM running the 12c version of the Oracle database inside a pluggable container, isolated from other users sharing this processor with you. Microsoft offers an Express Edition of SQL Server and a table storage service that allows you to do something similar. Amazon offers a lightweight database that does simple table lookups. The two key differences between these products and Schema as a Service are that all access to the Oracle service is done through http or https, and that applications run inside the database rather than on a separate server as is done with the other similar cloud options. Some say this is an advantage, some say it is a disadvantage.

We covered this topic from a different angle a month ago in converting excel to apex and printing from apex. Both of these blog entries talk about how to use Schema as a Service to solve a problem. There are also some good book references on Schema as a Service.

I typically use a Safari Books subscription to get these books on demand and on my iPad for reading on an airplane.

When we log in to access the database we are asked to present the schema that we created, a username, and a password. Note that in this example we are either using the external free service apex.oracle.com or the Oracle corporate service for employees, apex.oraclecorp.com. The two services are exactly the same. As an Oracle employee I do not have access to the public free service and am encouraged to use the internal service for employees. The user interface is the same, but screen shots will bounce between the two as we document how to do things.


Once we log in we see a main menu system that allows us to manage applications, manage tables in the database, do team development, and download and install custom applications.

The Object Browser allows us to look at the data in the database. SQL Commands allows us to run queries against the database. SQL Scripts allows us to load and save SQL commands to run against the database. Utilities allows us to load and unload data. The RESTful Services tab allows us to define html interfaces into the database. If we look at the Object Browser, we can look at table definitions and data stored in tables.



We can use the SQL Commands tab to execute select statements against a table. For example, if we want to look at part number B77077 we can select it from the pricelist table by matching the column part_number. We should get back one entry since there is only one part with this part number.

If we search for part number B77473 we get back multiple entries with the same part number. This search returns six rows of data, with more data in other columns than the previous select statement.
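
The two queries look something like this, using the pricelist table and part_number column described above:

select * from pricelist where part_number = 'B77077';
select * from pricelist where part_number = 'B77473';

The first returns a single row; the second returns the six rows that share the part number.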

SQL Scripts allows you to upload scripts from your laptop or desktop and execute them. You can take queries that have run against other servers and run them against this server.

Up to this point we have looked at how to write and run queries. We need to look at how to load data so that we have something to query against. This is done in the Utilities section of the Schema as a Service. Typically we start with a data source, either an XML source or an Excel spreadsheet. We will look first at taking an Excel spreadsheet like the one below and importing it into a table.

Note that the spreadsheet is well defined and headers exist for the data. We do have some comments at the top that we need to delete so that the first row becomes the column names in our new table. Step one is to save the sheet as a comma separated value file. Step two is to edit this file and delete the comments and blank lines. Step three is to upload the file into the Schema as a Service web site.
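
After cleanup, the first lines of the CSV might look something like the following; the column names and values here are hypothetical, the only requirement being that the first row holds the column headers:

PART_NUMBER,DESCRIPTION,LIST_PRICE
B77077,Storage bundle,5000
B77473,Compute bundle,3000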




At this point we have a data source loaded and ready to drop into a table. The user interface allows us to define the column types and tries to figure out if everything is character strings, numbers, or dates. The tool is good at the import but typically fails on character lengths and throws exceptions on specific rows. If this happens you can either manually enter the data or re-import the data into an existing table. Doing this can potentially duplicate data, so deleting the new table and re-importing into a fresh table might be a good idea. You have to experiment at this point to get the right column definitions and import all of your data.



Alternatively, we can import XML data into an existing table. This mirrors the process we use to back up our data by exporting it as XML.


At this point we have loaded data using a spreadsheet or CSV file and an XML file. We can query the database by entering SQL commands or loading SQL scripts. We could load data from a SQL script if we wanted, but larger amounts of data need to be imported with a file. Unfortunately, we cannot take a table from an on-premise database or an RMAN backup and restore it into this database. We can’t unplug a PDB and plug it into this instance. This service does have limitations, but for free or less than $200 per month, the service provides a lot of functionality.
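
For small amounts of data, a SQL script is as simple as something like this (the pricelist column names here are hypothetical, matching the CSV sketch above):

insert into pricelist (part_number, description, list_price)
  values ('B77077', 'Storage bundle', 5000);
insert into pricelist (part_number, description, list_price)
  values ('B77473', 'Compute bundle', 3000);
commit;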

To read and display our data, we must create an application. To do this we go into the application development interface of Schema as a Service.

We select the Application Builder tab at the top and click on the Create icon. This takes us to a selection of what type of application to build. We are going to build a desktop application since it has the most functional options. We could have easily selected the mobile option, which formats displays for a smaller screen.

We have to select a name for our application. In this example we are calling it Sample_DB since we are just going to query our database and display the contents of a table.

We are going to select one page. Note that we can create multiple pages to display different information or views into a table. In our previous blog entry on translating excel to apex we created pages to display the cost of archive and the cost of storage. In this example we are going to create one page and one page only.

If we have shared components from other apps or want to create libraries that can be called from other apps, we have the option at this point to define that. We are not going to do this, but just create a basic application to query a database.

We can choose from a variety of authentication sources to protect our data. In this example we are going to allow anyone to read and write our table, so we select no authentication. We could use an external source or look at a user table in the database to authenticate users. For this example, we will leave everything open for our application.

We get a final confirmation screen (not shown) and create the application. When the create is finished we see a screen that lets us either run the application or edit it.

If we click on the Home icon we can edit the page. This screen is a little daunting. There are too many choices and things that you can do. There are different areas, breadcrumbs for page to page navigation, and items that you can drop into a screen. In this example we are going to add a region by hovering the mouse over the Content Body and right clicking. This allows us to create a new region in the body of our page.

Note that a new region is created and is highlighted in the editor. We are going to edit the content type, and we have a wide variety of options. We could type in static text, and this basically becomes a static web page. We could create a graph or chart. We could create a classic report. We could create a form to submit data and query a table. We will use the interactive report because it allows us to enter SQL for a query against our table.


In this example we will enter the select statement in the query box. We could pop this box into another window for full screen editing. For our example we are doing a simple query against our table with select * from pricelist.

When we click the run button at the top right we execute this code and display the results in a new window. We can sort this data, and we can change the query and click Go. This is an interactive form to read data from our table. If we wanted to restrict the user from reading all of the data we would have selected a standard report rather than an interactive report.

The final part of our tutorial is creation of a REST api for our data. We would like to be able to go to a web page and display the data in the table. For example, if we want to look at the description of part number B77077 it would be nice to do it from a web page or a get command at the command line. To do this we go to the SQL Workshop tab and click the RESTful Services icon at the top right.

Again, this screen is a little daunting. We get a blank screen with a create button. Clicking the create button takes us to a screen where we need to enter information that might not be very familiar.

The screen we see asks us to enter a name for our service, a template, and a resource handler. Looking at this for the first time, I am clueless as to what this means. Fortunately, there is an example of how to enter this information if you scroll down and click on the Example button.

If we look at the example we see that the service name is the header that we will hit from our web page.

In our example we are going to create a cloud REST api where we expose the pricelist. We call the service cloud. We call the resource template pricelist and allow the user to pass in a part_number to query. In the resource handler we define a GET function that does a select from the table. We could use the part number passed in, but for simplicity we ignore the part number and return all rows in the table. Once we click save, we have exposed our table to a web query with no authentication.
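
A minimal sketch of the resource handler query; the :part_number bind comes from the {part_number} element of the resource template, and as noted above our simple version ignores it:

-- simple handler: return every row in the table
select * from pricelist;

-- a more selective handler would use the bind variable from the template
select * from pricelist where part_number = :part_number;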

Once we have created our REST service we can query the database from a web browser using a url of the form (apex server)/pls/apex/(schema name)/pricelist/(part number). In this example we go to apex.oraclecorp.com/pls/apex/parterncloud/pricelist/B77077. It executes the select statement and returns all rows in the table in JSON format.

In summary, we are able to upload, query, and display database data using http and https protocols. We can upload data in XML or CSV format. We can query the database using web based tools or REST interfaces. We can display data either by developing a web based program to display data or by pulling the data from a REST interface in JSON format. This service is free if your database is small. If we have a larger database we can pay for the service as well as host the application to read the data. We have the option to read the data from a REST interface and pull it into an application server at a different location. We did not look at uploading data with a PUT interface through the REST service, but we could have done this as well. Up next: how do we implement this same service in AWS or Azure?