Technology behind DBaaS

Before we can analyze different use cases, we first need to look at a couple of technologies that enable them. The foundation for most of these use cases is data replication. We need to be able to replicate data from our on-premise database into a cloud database. The first issue is the mechanics of replicating the data; the second is having the access rights to the data and the database that allow you to pull the data into your cloud database.

Let’s first look at how data is stored in a database. If you use a Linux operating system, storage is typically split into four categories: ORACLE_HOME, +DATA, +FRA, and +REDO. The binaries that represent the database and all of the database processes go into the ORACLE_HOME or ORACLE_BASE. In the cloud this is dropped into /u01. If you are not using RAC, the file system sits on a logical volume manager (LVM) where you stripe multiple disks and mirror or triple mirror the data to keep a single disk failure from bringing down your database or data. If you are using a RAC database, this storage goes into ASM, Oracle's disk management technology that handles mirroring, striping, and performance. There are a variety of books and websites written on these technologies.

LVM links

ASM links

The reason we go into storage technologies is that we need to understand how and where data is stored in our DBaaS. If we access everything with IaaS and roll out raw compute and storage, we need to know how to scale up storage when we run out of space. With DBaaS this is done with the scale up menu item. We can grow the file system by adding logical units to our instance and grow the space allocated for data storage or data logging.

The second file system that we should focus on is the +DATA area. This is where data is stored and all of our file extents and tables are located. For our Linux cloud database this is auto-provisioned into /u02. In our test system we create a 25 GB data area and get a 20 GB file system in the +DATA area.



If we look at the /u02 file system we notice that there is one major directory, /u02/app/oracle/oradata. Under oradata there is one directory associated with the ORACLE_SID; in our example we called it ORCL. In this directory we have control01.dbf, sysaux01.dbf, system01.dbf, temp01.dbf, undotbs01.dbf, and users01.dbf. These files are where data is stored for the ORCL SID. There is also a PDB1 directory in this file structure, which corresponds to the pluggable database that we called PDB1. The files in this directory hold the tables, system, and user information relating to that pluggable database. If we create a second pluggable database, a new directory is created and the same set of files is created in that directory. The users01.dbf file (PDB1_users01.dbf in the PDB1 directory) defines all of the users and their access rights. The system01.dbf file defines the tables and system level structures; in a pluggable database the system01 file defines the structures for PDB1 and not the entire database. The temp01.dbf file holds temporary tables and scratch areas. The sysaux01.dbf file contains system information such as the control area structures and management data. The undotbs01.dbf file is the undo tablespace, which makes flashback possible so that we can look at information that was stored in a table three days ago. Note that there is no undotbs01.dbf file in the pluggable directory because undo is handled at the container level and not at the pluggable layer. Backups are done for the SID and not for each PDB. Tuning of memory and system parameters is done at the SID layer as well.
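To make the layout concrete, here is a small Python sketch (the paths are the ones from this example and may differ on your instance) that walks /u02/app/oracle/oradata and prints the .dbf files for the container database and each pluggable database directory.

```python
import os

# Root of the data area on the example DBaaS instance (adjust for your system)
ORADATA = "/u02/app/oracle/oradata"

# Walk the oradata tree and group the .dbf files by directory so the
# container database (ORCL) and each pluggable database (PDB1, ...) are
# listed separately.
for dirpath, dirnames, filenames in os.walk(ORADATA):
    dbf_files = sorted(f for f in filenames if f.endswith(".dbf"))
    if dbf_files:
        print(dirpath)
        for name in dbf_files:
            size_mb = os.path.getsize(os.path.join(dirpath, name)) / (1024 * 1024)
            print(f"  {name:25s} {size_mb:10.1f} MB")
```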

Now that we have looked at the files corresponding to tables and table extents, we can talk about data replication. If you follow the methodology of EMC and NetApp you should be able to replicate the dbf files between two file systems. Products like SnapMirror allow you to block copy any changes that happen to a file to another file system in another data center. This is difficult to do between an on-premise server and a cloud instance. The way that EMC and NetApp do this is in the controller layer. They log write changes to the disk, track which blocks get changed, and communicate the changes to the controller on the target system. The target system takes these block changes, figures out which actual blocks they correspond to in its own disk layout, and updates the blocks as needed. This does not work in a cloud storage instance, where we deal at a file layer and not at a track and sector or block layer. The fundamental problem with this data replication mechanism is that you must restart the database or re-ingest the new file into it. The database server does not do well when files change underneath it, because it tends to cache information in memory and indexes into the data break if data is moved to another location. This type of replication is good if you have a recovery point objective of an hour or more. If you are looking at a recovery point of minutes, you will need to go with something like DataGuard, GoldenGate, or Active DataGuard.
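To illustrate the block-change idea at a toy scale (this is only a conceptual sketch, not how SnapMirror or any array controller is actually implemented), the snippet below compares two copies of a file in fixed-size blocks and reports which blocks would need to be shipped to the target; the file names are hypothetical.

```python
import hashlib

BLOCK_SIZE = 8192  # toy block size; real arrays track changes at their own granularity

def changed_blocks(source_path, target_path, block_size=BLOCK_SIZE):
    """Return the block numbers where the source file differs from the target copy."""
    changed = []
    with open(source_path, "rb") as src, open(target_path, "rb") as dst:
        block_no = 0
        while True:
            s = src.read(block_size)
            d = dst.read(block_size)
            if not s and not d:
                break
            if hashlib.sha256(s).digest() != hashlib.sha256(d).digest():
                changed.append(block_no)
            block_no += 1
    return changed

# Hypothetical paths: a local datafile and the last copy shipped to the target.
print(changed_blocks("users01.dbf", "users01.dbf.remote_copy"))
```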

DataGuard works similarly to the block change recording but does so at the database layer rather than the file system/block layer. When an update or insert command is executed in the database, the changes are written to the /u04 directory. In our example the +REDO area is allocated 9.8 GB of disk. If we look at our /u04 structure we see that /u04/app/oracle/redo contains the redoXX.log files. With DataGuard we take these redo files, compress them, and transfer them to our target system. The target system takes the redo file, uncompresses it, and applies the changes to the database. You can structure the changes as either physical logging or logical logging. Physical logging replicates everything in the database and records the block level changes. Logical logging takes the actual SQL statement and replicates it to the target system. The target system either applies the physical changes to its files or executes the SQL statement on the target database. Physical replication is used more than logical replication because logical has limitations on some statement types; for example, blob or file operations cannot be translated to the target system because you can't guarantee that the file structure is the same between the two systems. A conceptual sketch of the compress-and-ship flow follows the reading list below. There are a variety of books available on DataGuard. It is also important to note that DataGuard is not available for Standard Edition and Enterprise Edition but for High Performance Edition and Extreme Performance Edition only.

  • Oracle Data Guard 11g Handbook
  • Oracle Dataguard: Standby Database Failover Handbook
  • Creating a Physical Standby Documentation
  • Creating a Logical Standby Documentation
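
As a purely conceptual illustration of the compress-and-ship step (this is not how DataGuard itself is implemented; the staging directory is hypothetical and the redo path is the one from this example), the sketch below gzips new redo logs and places them in a staging directory that a standby site could pick up and apply.

```python
import gzip
import shutil
from pathlib import Path

# Redo area from this example and a hypothetical staging directory that is
# synchronized to the standby site by other means.
REDO_DIR = Path("/u04/app/oracle/redo")
STAGING_DIR = Path("/backup/standby_staging")

def ship_redo_logs():
    """Compress each redo log and place it in the staging area if not already shipped."""
    STAGING_DIR.mkdir(parents=True, exist_ok=True)
    for log in sorted(REDO_DIR.glob("redo*.log")):
        target = STAGING_DIR / (log.name + ".gz")
        if target.exists():
            continue  # already compressed and staged
        with open(log, "rb") as src, gzip.open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        print(f"staged {log.name} -> {target}")

if __name__ == "__main__":
    ship_redo_logs()
```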

GoldenGate is a similar process, but there is an intermediary agent that takes the redo log, analyzes it, and translates it for the target system. This allows us to take data from an Oracle database and replicate it to SQL Server, and it also allows us to go in the other direction. SQL Server, for example, is typically used for SCADA or process control systems. The Oracle database is typically used for analytics and heavy duty number crunching on a much larger scale. If we want to look at how our process control systems are operating in relation to our budget, we will want to pull in the data from the process systems and look at how much we spend on each system. We can do this either by selecting data from the SQL Server or by replicating the data into a table on the Oracle system. If we are doing complex join statements and pulling data in from multiple tables, we would typically want to do this on one system rather than pulling the data across the network multiple times. GoldenGate allows us to pull the data into a local table and perform the complex select statements without suffering network latency more than once for the initial copy. GoldenGate is a separate product that you must pay for, either on-premise or in the cloud. If you are replicating between two Oracle databases you could instead use Active DataGuard, which is available as part of the Extreme Performance Edition of the database.
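If you only need a one-time or periodic copy of a SQL Server table into a local Oracle staging table (rather than continuous GoldenGate replication), a hedged sketch of that pull looks like the following; the connection strings, table names, and columns are hypothetical, and it assumes the pyodbc and python-oracledb drivers are installed.

```python
import pyodbc       # SQL Server driver (assumed installed)
import oracledb     # python-oracledb driver for Oracle (assumed installed)

# Hypothetical connection details for the process-control (SQL Server) and
# analytics (Oracle) databases.
mssql = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=scada-host;"
    "DATABASE=plant;UID=report_user;PWD=secret"
)
oracle = oracledb.connect(user="analytics", password="secret",
                          dsn="dbaas-host/PDB1")

# Pull the rows we want to analyze from SQL Server.
rows = mssql.cursor().execute(
    "SELECT system_id, reading_time, cost FROM plant_costs"
).fetchall()

# Load them into a local Oracle staging table so complex joins against
# budget data run locally instead of across the network.
cur = oracle.cursor()
cur.executemany(
    "INSERT INTO plant_costs_stage (system_id, reading_time, cost) "
    "VALUES (:1, :2, :3)",
    [tuple(r) for r in rows],
)
oracle.commit()
```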

The /u03 area in our file system is where backups are placed. The file system for our sample system shows /u03/app/oracle/fast_recovery_area/ORCL, where ORCL is the ORACLE_SID of our installation. Note that there is no PDB1 area because all of the backup data is kept at the system layer and not at the pluggable layer. The tool used to back up the database is RMAN. There are a variety of books available to help with RMAN as well as an RMAN online tutorial.
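For readers who have not used RMAN before, the sketch below shows one way to drive a simple full backup from a script by piping commands into the rman client. It assumes you are on the database host with the oracle environment set; the exact backup commands you want will depend on your backup policy.

```python
import subprocess

# Minimal RMAN session: connect to the local instance as the target and
# take a full backup of the database plus the archived redo logs.
rman_commands = """
BACKUP DATABASE PLUS ARCHIVELOG;
EXIT;
"""

# Run the rman client and feed it the commands on stdin.
result = subprocess.run(
    ["rman", "target", "/"],
    input=rman_commands,
    capture_output=True,
    text=True,
)
print(result.stdout)
```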

It is important to note that RMAN requires system level access to the database. Amazon RDS does not allow you to replicate your data using RMAN; it instead uses a volume snapshot and copies it to another zone. The first impact of this is that you cannot get your data out of Amazon with a backup and you cannot copy your changes and data from Amazon RDS to your on-premise system. The second impact is that you can't use Amazon RDS for DataGuard: you don't have sys access into the database, which is required to set up DataGuard, and you don't have access to a file system to copy the redo logs into. To make this work with Amazon you need to deploy the Oracle database into EC2 with S3 storage as the back end. The same is true with Azure. Everything is deployed into raw compute and you have to install the Oracle database on top of the operating system. This is more of an IaaS play than a PaaS play. You lose patching of the OS and database, automated backups, and automatic restart of the database if something fails. You also need to lay out the file system on your own and select LVM or some other clustered file system to prevent data loss from a single disk corruption. All of this is done for you with PaaS and DBaaS. Oracle does offer a manual process to perform backups without having to dive deep into RMAN technology. If you are making a change to your instance and want a backup copy before you make the change, you can back up your instance manually and not have to wait for the automated backup. You can also change the timing if 2am does not work for your backup window and you need to move it to 4am instead.

We started this conversation talking about growing a table because we ran out of space. With the Amazon and Azure solutions, this must be done manually. You have to attach a new logical unit, map it into the file system, grow the file system, and potentially reboot the operating system. With the Oracle DBaaS we have the option of attaching the new storage as a new logical unit, growing the /u02 file system to handle more table spaces, or growing the /u03 file system to handle more backup space.
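For reference, the manual path on a plain IaaS instance looks roughly like the following sequence of LVM commands, shown here wrapped in a small script. The device name and the volume group/logical volume names are hypothetical and will differ on your system, so treat this as an outline of the steps rather than something to run as-is.

```python
import subprocess

# Hypothetical names: the newly attached block device and the existing
# volume group / logical volume that back the /u02 data area.
NEW_DEVICE = "/dev/xvdf"
VOLUME_GROUP = "dataVG"
LOGICAL_VOLUME = "/dev/dataVG/dataLV"

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. Register the new disk with LVM.
run(["pvcreate", NEW_DEVICE])
# 2. Add it to the volume group that holds the data area.
run(["vgextend", VOLUME_GROUP, NEW_DEVICE])
# 3. Grow the logical volume into all of the new free space.
run(["lvextend", "-l", "+100%FREE", LOGICAL_VOLUME])
# 4. Grow the file system to match (resize2fs for ext4; xfs_growfs for XFS).
run(["resize2fs", LOGICAL_VOLUME])
```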


Once we finish our scale up, the /u03 file system is no longer 20 GB but 1020 GB in size. The PaaS management console allocates the storage, attaches it to the instance, grows the logical volume to fill the additional space, and grows the file system to handle the additional storage. It is important to note that we did not require root privileges to do any of these operations. The DBA or cloud admin can scale up the database and expand table resources. We did not need to involve an operating system administrator. We did not need to request an additional logical unit from the storage admin. We did not need to get a senior DBA to reconfigure the system. All of this can be done either by a junior DBA or by an automated script that grows the file system when we run out of space. The only thing missing for the automated script is a monitoring tool that recognizes we are running into a limit. Oracle Enterprise Manager (OEM) 12c and 13c can do this monitoring and kick off processes if thresholds are crossed. It is important to note that you cannot use OEM with Amazon RDS because you don't have the root, file system, or system access to the installation that is required to install the OEM agent.
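The monitoring half of that automated script can be as simple as the following sketch, which checks free space on the data and backup file systems and flags any that cross a threshold. The mount points match this example's layout, and whatever action you trigger (an OEM job, a call to the cloud console, an alert email) is left as a placeholder.

```python
import shutil

# Mount points from this example's layout and a hypothetical usage threshold.
FILE_SYSTEMS = ["/u02", "/u03"]
THRESHOLD = 0.85  # flag anything more than 85% full

def check_space():
    for mount in FILE_SYSTEMS:
        usage = shutil.disk_usage(mount)
        used_fraction = (usage.total - usage.free) / usage.total
        if used_fraction >= THRESHOLD:
            # Placeholder: kick off the scale-up process (OEM job, console API, alert email).
            print(f"WARNING: {mount} is {used_fraction:.0%} full - trigger scale up")
        else:
            print(f"OK: {mount} is {used_fraction:.0%} full")

if __name__ == "__main__":
    check_space()
```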

In summary, we looked at the file system structure that is required to replicate data between two instances. We talked about how many people use third party disk replication technologies to “snap mirror” between two disk installations and how this does not work when replicating from an on-premise system to a cloud instance. We talked about DataGuard and GoldenGate replication, which allow us to replicate data to the cloud and back to our data center. We looked at some of the advantages of using DBaaS rather than a database on IaaS for growing the file system and backing up the database. Operations like backup, growing the file system, and temporarily adding or removing processors can be done by a cloud admin or junior DBA; in the past these operations required multiple people. All of these technologies are needed when we start talking about use cases. Most of the use cases assume that the data and data structures that exist in your on-premise database also exist in the cloud and that you can replicate data to the cloud as well as back from the cloud. If you are going to run a disaster recovery instance in the cloud, you need to be able to copy your changes to the cloud, make the cloud a primary instance, and replicate the changes back to your data center once you bring your database back online. The same is true for development and testing. It is important to be able to attach to both your on-premise database and the database provisioned in the cloud and look at the differences between the two configurations.