The cost of business intelligence (or not)

I have sat in on a couple of meetings with financial departments and listened to their need for better reporting. This seems to be the topic of the year. The struggle is how to do it and how to fix what has been done in the past.

First, let’s look at what has been done in the past. The key reporting tool historically has been something like Discoverer. Everyone I talk to has a love/hate relationship with this tool. They love the information that comes out of it. They hate it because it makes the concurrent manager more complex to run and drives up hardware and license costs by increasing the horsepower needed to run the general ledger. The key problem with this solution is that it does not scale. The more reports you run, the more overloaded your system becomes. The more you use it, the more difficult it becomes to keep up, and closing your general ledger takes longer and longer. One company that tried to reduce their close time quickly realized that they had written custom interfaces that pulled data out of E-Biz, put the information into a spreadsheet, performed a calculation, and manually entered the results of the calculation back into E-Biz. When they started the project they found that it wasn’t the process that was broken, it was the data that was invalid. Over the years, errors accumulated, got bigger and bigger, and eventually hit the bottom line. One division that always appeared to be profitable turned out to be losing money on a regular basis. An error in a calculation that added two cells instead of subtracting them showed a profit and took money from another department.

If you take this methodology to the extreme, you get people who do nothing but come into work, look at the nightly runs of data extraction, verify that the results look good, and manually key the information into E-Biz to reduce the close time. You get hundreds of people monitoring thousands of calculations and an IT department that does nothing but handle failed jobs and bad data pulled from a variety of sources.

Let’s look at the polar opposite of this: someone who created their own interfaces to capture data as it is entered and then push the data into E-Biz. The reason to capture the data is to reduce the Discoverer runs and concurrent jobs. The data is pushed into a spreadsheet as it is entered into E-Biz, and updates are run nightly to verify that all of the data was properly collected. This solves the processing power issue in E-Biz but does not address the overall picture.

Both of these solutions hamstring organizations because they tie custom code to either the forms entry or the spreadsheet calculations. When it comes time to upgrade the E-Biz package or install a patch, testing becomes a nightmare. If a new feature is needed, like handling a new currency or language, multiple lines of code need to be changed in different places. If not all of the places are changed, errors can creep into the system with no way to trace them.

The real solution to this problem is twofold. The first is to have a common API to pull and push data through E-Biz. This can be done through SOA or web services. If a financial element is pushed into E-Biz, the currency type is pushed as well; the conversion is done in the web service and not in the spreadsheet. The second is to have a data warehouse that is populated on a regular basis, separate from the E-Biz services. Rather than run Discoverer against E-Biz, you run OBIEE/Analytics against the data warehouse and don’t need as large a database under E-Biz. This solution also allows you to pull data from alternate repositories that are not Oracle solutions. If you patch or upgrade E-Biz, you change the API interface and tell it how to populate the data appropriately. If you patch or upgrade your third-party application, you change the API interface to understand the data change. This lets you centralize your changes rather than change things in multiple places.
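As a minimal sketch of what the push side could look like (the endpoint, credentials, account string, and payload here are all hypothetical, not an actual E-Biz interface), the spreadsheet or source system hands the journal line and its currency code to a central service and lets the service own the conversion:

# Hypothetical example: post a journal line, including its currency code,
# to a shared web service; the service does the CAD-to-ledger conversion,
# not the spreadsheet.
curl -u gl_loader:secret \
     -H "Content-Type: text/xml" \
     -d '<journalLine><account>01-520-7740</account><amount>1250.00</amount><currency>CAD</currency></journalLine>' \
     https://ebiz.example.com/services/GLJournalImport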

Politically, this solution is often not a viable one. Given that knowledge is power, giving up the calculation in your spreadsheet or giving up the headcount that verifies a calculation for your department means a loss of budget and resources. A solution like this typically involves additional expense, significant consulting and manpower to implement, and a loss of power for individuals. What it gives you is a single source of truth and control over who sees what and how much detail they can get. It typically requires direction from executive management and an investment of resources to document processes and procedures. This can be a scary thing at times and can expose things that some people would rather keep hidden.

I realize that this is a little off topic from what I typically write about but I have heard the same message multiple times from different customers in different industries.

X11 with Amazon Cloud

Having an ssh terminal into an instance isn’t the easiest thing to work with. I was hoping to get an X11 window into the session and display it back to my desktop. My desktop is Windows XP, so the only real option is to use TightVNC to get an X11 interface.

Fortunately, the Oracle Unbreakable Linux image that I am running as an instance has vncserver installed by default. I tried running vncserver on the instance and attaching to it from my desktop. Unfortunately, it did not work.
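For reference, starting the server on the instance looks something like this (display :1 listens on TCP port 5901; the geometry and depth are just guesses at something usable over a WAN link):

# Start a VNC server on display :1 (TCP port 5901) as the oracle user.
vncserver :1 -geometry 1024x768 -depth 16
# The first run prompts for a VNC password and writes ~/.vnc/xstartup,
# which controls which window manager gets launched in the session.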

To connect to the vncserver, I first tried to use putty to tunnel the ports. VNC runs on ports 5900 and 5901. I tried to configure the proxy settings to tunnel through ssh and connect on the other end. This did not work, which was probably for the best; I have done this in the past and it is very slow and difficult to use.
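For anyone curious, the command-line equivalent of what I was trying through putty is an ssh port forward along these lines (the key file name and hostname are placeholders):

# Forward local port 5901 to the VNC server on the instance over ssh,
# then point TightVNC on the desktop at localhost:5901.
ssh -i my-ec2-key.pem -L 5901:localhost:5901 root@ec2-xx-xx-xx-xx.compute-1.amazonaws.com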

My second attempt was to open new ports in the Elasticfox tool and connect directly with TightVNC. I opened ports 5900 and 5901 in the security group used to start the instance. This worked and allowed me to connect to the VNC server running on the test instance.
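The Elasticfox change should be roughly equivalent to the following with the EC2 API command-line tools (the group name and source range are assumptions; tightening the source CIDR to your own address would be smarter):

# Open the VNC ports in the security group used to launch the instance.
ec2-authorize default -P tcp -p 5900-5901 -s 0.0.0.0/0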

With this working, we can now launch Firefox inside the VNC session and view web sites. The ones that we want to look at are https://localhost:1158/em and http://localhost:8080/apex. With this configuration we can start installing, testing, and experimenting with instances. We can also use the Amazon storage to keep our tests persistent so that we don’t have to recreate the environment every time.

more later…..

R12 Oracle EBS Essentials for Implementers – day 2

Continuation of the Oracle Education class on EBS Foundations……

Today we continued the discussion on shared entities and integration. I think I understand this concept. Basically, it is similar to Java objects and inheritance. If I define something at a global level, it is shared across all entities. If I define something inside an object, it is valid for that object only. The example given was a supplier. If I create a supplier, the supplier name is global. If I create a site for invoices or delivery, the site is specific to the role that I was in when I created it. I can have this object shared across sites, but the secondary role must define this and allow it to be the same or different. Another example was a company that does business in the US and Canada. If I work for a company that has divisions (and different roles) in both locations, the invoice group in Canada has a site that they invoice to that might be the same as or different from the site in the US.

Chapter 5 of the class focuses on the security of EBS. The basic model is what you would expect. End users have access to self service and approvals. A smaller class of users has access to registered applications. An even smaller class of system administrators has access to delegate application access and define access control. There is also a class of higher-level system administrators that define the data security model and functional security of EBS. By functional security, I mean individual menus of functions, forms, and HTML pages. What was new to me was that this is much the same model as Oracle Access Manager. Given that EBS already has this, there really isn’t a need for OAM to front-end an EBS installation. The only reason you would use OAM is to restrict a user’s access to the EBS installation by time or by IP address. This does not come up very often, so the need for OAM is reduced.

It is interesting that EBS allows you to define effective dates but not times. It seems that an effective date range should also allow you to restrict on a time basis; EBS might be able to do this, but out of the box it does not. OAM gives you this option along with some others, and we have recommended this to some customers. They were more excited about the password reset features in OAM and less interested in the time-based access. EBS does not have the self-service password reset that OAM has.

The reporting mechanism for looking at security inside of EBS is also something that is potentially very powerful but also dangerous. Many of the customers that I have talked to have a difficult time managing concurrent processing. I think I understand why now. Users can be given the right to define and launch reports at any time. If these reports are search intensive, they can logjam a system. Using something like partitioning can reduce the search by shrinking the result set to a month of data instead of six or seven years. Incorporating the EBS Management Pack into OEM gives you finer-grained administration of concurrent processes. OEM gives you a view into the processes and allows you to raise or lower the priority of a process. The EBS administration tools do not give you this level of control. OEM integrates better with the operating system and thus has better controls to raise or lower process priorities through EBS or OS commands.
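As a hedged sketch of the partitioning idea (the table and column names are made up, not EBS objects), monthly range partitions let a report with a date filter touch one month of data instead of the whole history:

sqlplus / as sysdba <<EOF
-- Hypothetical example: range partition a transaction table by month so a
-- report that filters on a date range only scans the relevant partition.
CREATE TABLE gl_transactions (
  txn_id   NUMBER,
  txn_date DATE,
  amount   NUMBER
)
PARTITION BY RANGE (txn_date) (
  PARTITION p_2008_09 VALUES LESS THAN (TO_DATE('2008-10-01','YYYY-MM-DD')),
  PARTITION p_2008_10 VALUES LESS THAN (TO_DATE('2008-11-01','YYYY-MM-DD')),
  PARTITION p_max     VALUES LESS THAN (MAXVALUE)
);
EOF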

Chapter 6 is a discussion of flexfields. A flexfield is a configurable field that opens in a window from a regular EBS window. It allows you to structure the identifiers required by EBS according to your own business definition. It also allows you to collect and display additional information that EBS does not itself require. You can create key flexfields to generate a shortcut name for an item. For example, if you encode components with descriptive information in a series of values, those values can be concatenated together. The 11g version of the database allows you to partition based on these entries. The two fit very nicely together and allow you to separate data based on a partial field without having to waste storage splitting the information into different columns.
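A minimal sketch of the 11g side of this (the names are invented, not EBS tables): a virtual column pulls the interesting piece out of the concatenated value and the table is partitioned on that virtual column, so nothing extra is stored:

sqlplus / as sysdba <<EOF
-- Hypothetical example: partition on a piece of a concatenated key value
-- using an 11g virtual column, without storing that piece in its own column.
CREATE TABLE inventory_items (
  item_code   VARCHAR2(30),
  description VARCHAR2(240),
  plant_code  VARCHAR2(3) GENERATED ALWAYS AS (SUBSTR(item_code, 1, 3)) VIRTUAL
)
PARTITION BY LIST (plant_code) (
  PARTITION p_hou   VALUES ('HOU'),
  PARTITION p_chi   VALUES ('CHI'),
  PARTITION p_other VALUES (DEFAULT)
);
EOF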

more tomorrow…..

R12 Oracle EBS Essentials for Implementers – day 1

I am taking an Oracle Edu class this week – D49493GC10 (or D52216), which is the intro to the R12 version of EBS. The class material looks very interesting. The assumption is that you have EBS and are trying to use the features and functions covered in the class.

The course starts out relatively basic with tips and tricks for navigating the forms interfaces. This is very helpful for someone who is doing this for the first time. It shows you different ways of doing things like searching, personalizing, and customizing the forms interface. Once you get through the basics of the user interface, the class dives into the different components of EBS and how it is installed and configured. The next chapter discusses shared data and integration. This is a good introduction chapter. The chapter after that covers the fundamentals of system administration. It is an abbreviated chapter that summarizes the five-day class; it is more of an introduction than a detailed chapter on administration.

Overall this has been a good class. It is an introductory class, but it does give you a good feel for what it takes to navigate and manage some of the features in EBS. I can see where someone would want to take this class if they were getting into consulting or deploying EBS for the first time, although it seems to me that the admin class or the installation class is more in line with someone just starting out. The workflow class is probably what someone wants to take if they have an existing installation, understand basic administration, and want to enhance an existing deployment.

more tomorrow on the admin issues…..

more on HP-Oracle Exadata Storage

It appears that there is more than hype to the HP-Oracle Exadata Storage solution. I like the architecture and the way that it splits queries between boxes. A good source of more information is Kevin Closson’s blog. Kevin is an Oracle employee and a performance architect, and he goes into some good detail about how it operates.

What impressed me the most about this device is that it is currently focused on data warehouse solutions. It looks for select statements and farms these requests out to the storage nodes. Since the storage node understands the table structure and the query, it returns only the relevant data. Instead of returning a 1G table and applying the where clause on the database machine, the select…where is processed on the storage node and only the relevant columns and rows are returned. This means that something like returning the current billing statement for a specific cell phone will return the current bill and not all of the historical data. Partitioning does this too, but it does not reduce the number of columns returned. If you have a lot of data associated with a user in different columns, you can reduce the amount transferred from storage by returning only the relevant information from the tables and not the entire row.
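As a hedged illustration (the billing schema and credentials below are made up), this is the kind of query that benefits: the phone-number and month predicates plus the short column list mean the storage tier only has to hand back a few rows and columns rather than the whole table:

sqlplus billing_rpt/secret <<EOF
-- Hypothetical example: the filtering and column projection can be done close
-- to the storage, so only the current statement comes back over the wire.
SELECT call_date, called_number, minutes, charge
  FROM call_detail
 WHERE phone_number = '555-867-5309'
   AND bill_month   = TO_DATE('2008-10-01', 'YYYY-MM-DD');
EOF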

Interesting stuff. If we only had a few telco companies where I work…..

The huge benefit that I see from this device is that the statements that are shipped across to the storage node are warehouse centric. I can foresee this working with spatial data and being applied to seismic reservoir systems. I can see this working with materialized views for manufacturing. The only thing that needs to change is the engine on the database that farms out the request to the storage component. This has huge long-term potential wherever large data sets are used. Many of the complex models that are currently designed to reduce this complexity can be expanded and simplified. The storage node can be customized to understand the data format and pre-process the requests. If we follow this to the extreme, you might be able to embed some of the stored procedures for things like velocity correction of seismic data and have the storage node process the data and present the corrected trace instead of the raw trace. You could also do things like image recognition or enhancement in the storage component and not at the higher levels. More complex work that currently requires parallel processing on the database tier could be pushed down into the storage instead. Agreed, the product is not there yet, but I can see it going in that direction. Since Oracle typically provides open APIs to most of its products, I can see new ventures springing up based on this technology and specific solutions being created for specific industries.

S3 storage and Cloud Computing with Oracle 11g

I spent some time playing with the Amazon Cloud and Oracle today. I wanted to see if I could attach some persistent storage to an instance so that I didn’t have to create a new repository every time I started an instance. I wanted to create an instance, store data there, and stop the instance. I didn’t want to have to reload the data or the instance every time I started it back up.

I was reading LewisC’s blog and he suggested that you could not use the S3 storage for a file system. I didn’t think that was true so I thought I would play. It turns out that you can create a file system and even create a database in that area. Here are the steps.

1) Create an S3 persistent storage area. I did this using the Elasticfox extension to Firefox, in the Volumes and Snapshots tab. If you click on the green + button, it allows you to define a new volume. I created a 400G area just to see if I could create a file system of interest. When I did, it assigned a volume ID but did not populate anything else.
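For reference, the command-line version of this should be roughly the following with the EC2 API tools (the Volumes and Snapshots tab is managing Elastic Block Store volumes under the covers; the zone below is just an example):

# Create a 400 GB volume; it has to be in the same availability zone as the
# instance it will be attached to.
ec2-create-volume -s 400 -z us-east-1a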

2) Start an AMI. I did a simple search for an Oracle instance and started an 11g 32-bit instance. Once I did this, I was able to use putty to log in to the system as root and oracle. Setting this up did require downloading putty and puttygen. Puttygen was needed to convert the keys that AWS generates into the right format for putty. Once I had done that, I could connect to the public DNS name of the instance with the keys that AWS generated.

3) Once I had the instance started, I went back to the Volumes and Snapshots tab and attached the storage to the running instance. I had to associate the storage with a device, so I attached it as /dev/sdb. I chose this because /dev/sda is the default storage from the AMI.
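Again, the rough command-line equivalent (the volume and instance IDs are placeholders):

# Attach the volume to the running instance as /dev/sdb.
ec2-attach-volume vol-xxxxxxxx -i i-xxxxxxxx -d /dev/sdb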

4) After attaching the storage to the instance, I was able to treat it as a local disk. I partitioned it with fdisk, created a Linux partition, and wrote the partition table to the disk. Since this was a simple test, I created one partition and only one partition, so I see /dev/sdb1.
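fdisk is interactive, so just for the record, the keystrokes I would expect for that single partition are:

fdisk /dev/sdb
# Inside fdisk: n (new partition), p (primary), 1 (partition number),
# accept the default first and last cylinders, then w (write the table and exit).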

5) Once I had a partition defined, I formatted the partition with a filesystem using the mkfs /dev/sdb1 command, which builds the default filesystem type. I did not test any other filesystems, but I assume you could use something like ReiserFS or a newer filesystem.
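One thing worth noting: on a lot of distributions a bare mkfs builds an ext2 filesystem, so to be sure of getting ext3 (which is what the fstab entry in the next step expects) I would spell it out:

mkfs -t ext3 /dev/sdb1    # or equivalently: mkfs.ext3 /dev/sdb1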

6) Once I had a formatted partition, I was able to mount it. I first had to create /u03 with the mkdir /u03 command. I then mounted the file system with the mount /dev/sdb1 /u03 command. To make this persistent across reboots, I added the following line to the /etc/fstab file:
/dev/sdb1 /u03 ext3 defaults 1 3

7) I can verify that the file system exists with the df -k command. I should see
/dev/sdb1 412843916 3434932 388437736 1% /u03
as one of the lines.

8) Now that I have the file system created and mounted, I can run run_dbca.sh as root and create a database on the file system.
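The one extra step I would expect to need here (an assumption on my part about how the AMI is laid out) is to hand the new mount point over to the database owner before putting datafiles on it:

# Assumption: the database software on this AMI runs as the oracle user in the dba group.
chown oracle:dba /u03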