August 2006 – Data Protection Down Pat

different ways to skin a cat

So I think that I am overdosed on education and training classes. I’ve
now been to database administration, database new features, database
clustering, data guard, grid management of databases, and last but not
least, using a database to back end a portal server. Some key things
that I got out of the classes bring me back to key things that I got
out of college. No matter how exciting the subject sounds, a bad
instructor can really screw up a class. Conversely, no matter how
mundane a topic is, a really good instructor can make a class
interesting.

It is interesting that there are different things that you can plan on
failing and different solutions to keep everything working properly.
Tuning and building an operating system was something that I did a bit
of research on, specifically the storage systems. Job scheduling has
also been a long-standing research project for operating systems,
starting with multiprocessors and expanding to networked systems. It is
refreshing to see the same problems with database systems and different
ways of solving these problems. One of the slides that I saw multiple
times was how to protect data and data access controlled by a database.

You can protect the data using RAID/mirrored disks, either with
hardware or operating system RAID or without it (doing it without is a
new feature in 10g, ASM). You can protect against data loss from system
failure with Data Guard, which works either by copying physical blocks
across as they change or logically by executing parallel updates to
databases. You can also protect against data loss by running a
distributed system that partitions the database and takes over in the
event of a node failure. This is a more complex solution because it
requires you to use network storage so that data can be accessed from
multiple nodes. It seems like more and more IT departments are using
network storage. It isn’t quite to the point where many small
businesses or homes can afford such a nice feature.

The one that shocked me the most was that you can even protect your
data against the weak link in your IT staff. If someone deletes
something or wipes something out, the database has the ability to roll
back the mistake and restore the transactions up to a point. This is a
lot more interesting than backup and restore.
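
To make the "restore the transactions up to a point" idea concrete, here is a minimal sketch of replaying a change log and stopping just before the mistake. This is not how the database implements it internally; the `Change` record, the logical `time` stamps, and the `rebuild_as_of` function are all names I made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Change:
    """One committed change with a logical timestamp (hypothetical format)."""
    time: int
    key: str
    value: Optional[str]  # None means the row was deleted


def rebuild_as_of(log, as_of):
    """Replay the log in time order and stop once we pass `as_of`."""
    state = {}
    for change in sorted(log, key=lambda c: c.time):
        if change.time > as_of:
            break
        if change.value is None:
            state.pop(change.key, None)
        else:
            state[change.key] = change.value
    return state


log = [
    Change(1, "order-1", "widget"),
    Change(2, "order-2", "gadget"),
    Change(3, "order-1", None),  # the mistake: someone deletes order-1
    Change(4, "order-3", "gizmo"),
]

before_mistake = rebuild_as_of(log, as_of=2)  # order-1 is still there
latest = rebuild_as_of(log, as_of=4)          # order-1 is gone
```

The point of the sketch is that nothing is restored from a backup tape; the state is simply recomputed from the recorded changes up to the moment you pick.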

The other interesting thing that surprised me was that you can split a
database and run it on different machines. There are multiple ways to
do this. If you have multiple writers, you need to cluster your
database so that record locking can happen on a row level. If you have
one writer and multiple readers you just need to replicate your data
either through block update synchronization or SQL command
synchronization between multiple boxes. These updates can be
synchronous, asynchronous, or delayed by minutes to hours. One example that
was given was the way that Apple does iTunes. One writer exists to
create the music index/repository. They replicate the database across
multiple systems and let large numbers of clients come in and search
for music titles. The searches go against the replicas and not against
the primary master. If the primary master fails for some strange
reason, one of the replicas becomes the primary and continues to feed
the other replicas.
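
The one-writer, many-readers layout described above can be sketched as a toy model. This is only my illustration of the idea, not Apple's or Oracle's implementation; the `Node` and `Cluster` classes and the sample titles are hypothetical, and the copy to the replicas is done synchronously to keep it short.

```python
class Node:
    """A toy database node holding a title index (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.titles = set()


class Cluster:
    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)
        self._next = 0  # round-robin counter for reads

    def write(self, title):
        """All writes go to the primary, then are copied to every replica."""
        self.primary.titles.add(title)
        for replica in self.replicas:
            replica.titles.add(title)  # synchronous copy, for simplicity

    def search(self, word):
        """Reads are spread across the replicas, never the primary."""
        replica = self.replicas[self._next % len(self.replicas)]
        self._next += 1
        return sorted(t for t in replica.titles if word.lower() in t.lower())

    def fail_primary(self):
        """Promote the first replica; it now feeds the remaining replicas."""
        self.primary = self.replicas.pop(0)


cluster = Cluster(Node("master"), [Node("r1"), Node("r2"), Node("r3")])
cluster.write("Blue Train")
cluster.write("Kind of Blue")
hits = cluster.search("blue")

cluster.fail_primary()            # r1 takes over as the writer
cluster.write("Giant Steps")
hits_after_failover = cluster.search("steps")
```

A real setup would ship the changes asynchronously or with a delay, as noted above, which means a replica can briefly return stale search results.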

taking it up one level

So one of the jobs and duties that I have on the side is keeping statistics for the Houston MLS team, the Dynamos. Really great organization. I get all the peanuts and Diet Cokes that I want every game. I get free parking and admission to the press box so that I can keep stats during the game. The key job of a stats person is to record items that the TV/print/web journalists are interested in. Last week for the New England game we had to count the number of times that two players touched the ball. Someone had a theory that the balance of the game changed with the number of touches of the key midfielders. Amazingly, it was true. There were minutes where nothing happened and the game got a little boring. The key players were walking, loping, or even jogging to get to the play. The defenders were hanging tight with them, then started to drop back a little to try to intercept or break up a pass. That was their key mistake. Every time that happened, the player who was apparently sleeping woke up and created a play, shot on goal, or assisted someone else in a great scoring opportunity.

Last night I got to go to a double header, the Dynamos vs LA Galaxy followed by FC Barcelona vs Club America. The Galaxy and Dynamos have a grudge match going and the star midfielder for the Dynamos was out of the game with a red card. A typical game sees these two battling and fighting for possession and control of the offense. Typically fifteen to twenty thousand people show up for a game. A nice crowd that gets into the game. Last night there were seventy thousand that showed up for the game. The key reasons? The European champion was playing the American champion. Two of the best teams in the world were playing each other and every soccer fan wanted to see them play.

The key elements that I saw in the games last night were control, movement, and seizing an opportunity. The speed of play for the MLS team was a little measured and controlled. Plays developed but they developed slowly. When a fast break happened it was two or three players. With Club America and Barcelona, a fast break was four or five players and usually didn’t happen. It was more of a surgical strike to get a pass in or flip a high ball ten yards to someone who was making a break. It was truly exciting to watch the game and the crowd reacting to the players.

Moments like this make me wax philosophical. How can I take my work or exercise or playing with the kids up a notch? Do I need to read one more book that makes me that much smarter? Do I need to put in ten more minutes on the elliptical trainer? Do I need to make sure that I play a game for fifteen more minutes to get more fun in before bedtime? No. Getting better at something that you do isn’t the true answer. Getting better at something you do with others is the answer. It is one thing to know all the details of a product; it is another thing to know the technical details and have someone else know the business justifications and have someone else know how other people have succeeded or failed in production roll outs. It is one thing to spend time working on leg strength; it is another thing to work on cardio or upper arm strength. It is one thing to spend time with your kids; it is another thing to find out what they want to do and do it with them. For example, I really don’t like playing tennis. It ruins my racketball game. My youngest son loves the game and wants to get into it. I realized that I haven’t played racketball in almost two years. I need to stop worrying about a game that I really don’t have the opportunity to play and play the game that my son wants to play.

Isn’t it amazing how a little thing like watching a soccer double header can put your life into perspective and make you think about what is important and how to make it better?

some things never change

This week I went to yet another training class. This week I was fortunate???? to get relegated to the sales methodology training. Ok, I’ve been in sales for what, 15 years now and hardcore research/engineering for 10? I thought that I had seen it all. I remember one class I took at Sun where I was told to become my customer’s best friend, have their kids play with mine, have their older kids babysit my kids. What garbage. I understand the concept but I never liked the touchy feely/Bridges of Madison County approach to selling.

I did like the methodology that I heard in this class: understand your customer’s problems and look at them from their perspective. I get that. I have been doing that for a while. Imagine myself as an engineer trying to solve their problem. How would I attack the problem? Layered on top of that is the question of whether the product that I am motivated to sell is the right solution for the problem. If it is, how do I recommend it? If it isn’t, could it be if you looked at it with your head tilted north and your tongue sticking out the right side of your mouth? I know that sounds like an unreasonable request but some sales reps get desperate when it comes to the end of Q4 and they are at 80% of goal.

The key items that I got from the sales methodology class are that you must get to know your products, your competitors’ products, your customers’ goals and ambitions, and your customers’ problems. Look at it from their perspective and try to solve the problem. Correlate the technical issues to the technical people and the business issues to the management level. Engineers don’t care about making a process 8% more profitable; they care about making their part of the process work efficiently so that no one will bother them with questions later. Managers are concerned with cost, quality, and all of the other Dilbert buzzwords that the pointy haired manager throws about.

One fact that I didn’t know was the difference between the professional golfers’ stroke averages and their paychecks. There is a one stroke difference between the number one golfer and the number 20 golfer. It is interesting to me that Carl Pettersson is the 30th best golfer when you look at averages but the 12th best when you look at money. Nick O’Hern has played in half the tournaments and makes less than half the money. Nick O’Hern is 7th in shot average with almost a whole shot better for this year. Who is the better golfer? The one who has the better average or the one who makes more money? Interesting perspective.

We also did an exercise that was interesting. We were broken up into teams of four. Two of us were given puzzle solutions, one was given the puzzle pieces, and one was assigned to ask questions about assembling the puzzles. The ones with the solutions could not show the picture that had the solution on it to the one asking questions. The one with the puzzle pieces could not talk to the ones with the solutions. The exercise was very interesting because it tested your ability to share information, ask the right questions, and get to the solution. The simplest answer was to show the puzzle pieces to the two that had the solutions and let them put it together for you. Doing this risked having the person with the solution clam up and not share any information because that doesn’t seem fair. Asking too many detailed questions will frustrate the solution holders because they want to share more information but you are asking the wrong questions. I ended up messing up and asking one person too many questions and not knowing enough about their puzzle. I was able to get a solution for the second puzzle based on the information I gathered from the first. It was a very interesting exercise. It gave me a new perspective on asking questions to see what a customer’s problem might be.

I’m mostly glad that the class was only two days long. The nuggets that I came out with were interesting but there is so much more that I want to learn and need to learn to be effective. Right now I am working on learning our products to the best of my ability.

What the heck is RAC

this information is taken from the RAC for Beginners webcast series…..

first some personal notes….

When I first learned what RAC, real application clusters, is, I thought I understood it. It is basically clustering of an application on multiple systems. The application is the Oracle database. To some extent this is true. It is a great oversimplification of what it is because it also brings in load balancing, scheduling, resource allocation, affinity, and other attributes typically associated with an operating system or a distributed processing manager.

Terminology

database : set of files that comprise information, this includes metadata, files, and information related to the data
instance : memory and background processes used to access a database. For RAC you have multiple instances for a database. For Oracle, you can have multiple instances per database but never multiple databases per instance.
clusterware : component for RAC that takes care of cluster membership including heartbeats, split brain situations, and instance management
SAN : storage area network. typically collections of disks controlled by a storage area network controller. This is a requirement for RAC
local and shared storage : shared is disks accessed by multiple hosts. Local implies that only one system accesses the storage
raw device and cluster file system : methods for accessing data on shared storage
ASM : automatic storage management, an option for managing raw and cluster file systems available with 10g

History of RAC

early 90’s – Oracle Parallel Server (OPS) with v7
2000 – enhancements to OPS with 8i
2001 – upgrade of OPS with Cache Fusion technology in 9i
2004 – Oracle Clusterware and RAC update in 10g

RAC is not

set and forget, it does require monitoring and tuning
transparent to some applications

single instance vs RAC

single instance: local storage contains instance si1 on storage node A
RAC: two instances of database (rac1 and rac2) located on two storage nodes. The nodes must exist on shared storage and both servers have a cluster interconnect for heartbeat
shared database components – control files, temp tablespace, application tablespace, server parameter file (spfile)
unshared database components – redo logs, undo tablespace, rollback segments. Note that these components reside on shared storage, but each node that is part of the RAC has its own copy.

licensing issues for Standard Edition

max 4 CPUs per cluster
must use ASM for all database storage
must use only Oracle Clusterware

note that the CPU count is a limit for Standard Edition, for Enterprise Edition the limit does not apply
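
The three Standard Edition rules above are easy to turn into a quick sanity check for a planned cluster. This is just a sketch of my notes, not a licensing tool; the `ClusterPlan` fields and the `se_rac_violations` function are names I invented, and the real license terms should be checked with Oracle.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterPlan:
    """Hypothetical description of a planned RAC cluster."""
    edition: str              # "SE" or "EE"
    cpus_per_node: List[int]  # CPU count for each node in the cluster
    storage: str              # e.g. "ASM", "raw", "cluster file system"
    clusterware: str          # e.g. "Oracle Clusterware"


def se_rac_violations(plan):
    """Return which of the Standard Edition rules from the notes are broken."""
    if plan.edition != "SE":
        return []  # the rules in these notes are listed for Standard Edition only
    problems = []
    if sum(plan.cpus_per_node) > 4:
        problems.append("more than 4 CPUs in the cluster")
    if plan.storage != "ASM":
        problems.append("database storage is not on ASM")
    if plan.clusterware != "Oracle Clusterware":
        problems.append("clusterware is not Oracle Clusterware")
    return problems


ok = se_rac_violations(ClusterPlan("SE", [2, 2], "ASM", "Oracle Clusterware"))
bad = se_rac_violations(ClusterPlan("SE", [2, 2, 2], "raw", "Oracle Clusterware"))
```

Note that the CPU limit is counted across the whole cluster, so three two-CPU nodes already break it.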

Installation process

prepare the hardware. it does require multiple network connections and SAN storage. The platforms must be the same and the OS must run the same version. It is recommended that patch levels are the same. With 10g this requirement is a little more flexible and only requires that the platform be the same.
clusterware – for UNIX, installing SSH key pairs is best; on Windows the username/passwords must be the same
ASM – this should reside in a separate ORACLE_HOME from the database, mainly for patching and downtime requirements. You will need at least two disk groups, data and flash recovery area. If using Standard Edition, all database data must be controlled by ASM
RDBMS – install without database creation. recommendation is to have multiple oracle_home locations across the cluster. OPatch is cluster-aware and will patch all systems in the cluster. Once patches are applied, use DBCA to create the database

tuning rac

same as single instance tuning, everything still works
network bottlenecks are the most common issue
statspack, ADDM, and AWR are cluster aware
10g Enterprise Manager has good info

backup

recovery is more complex because there are multiple sets of redo logs, one for each node.
there is just one database, not one per node
ASM and RMAN are cluster aware, ASMCMD does not currently offer backup commands
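
A rough way to picture why the extra redo logs complicate recovery: each instance writes its own stream of changes, but there is only one database, so the streams have to be interleaved back into a single order before they can be replayed. The sketch below is only my toy model of that idea, not Oracle's recovery code; the record format, the change numbers, and the `merged_redo` function are made up.

```python
import heapq

# Hypothetical redo records as (change_number, description) tuples.
# Each instance writes its own stream, already in increasing order.
redo_rac1 = [(101, "rac1: insert order 7"), (104, "rac1: update order 7")]
redo_rac2 = [(102, "rac2: insert order 8"), (103, "rac2: delete order 3")]


def merged_redo(*threads):
    """Yield records from all per-instance streams in change-number order."""
    return heapq.merge(*threads, key=lambda record: record[0])


replay_order = [description for _, description in merged_redo(redo_rac1, redo_rac2)]
```

With a single instance there is only one stream and nothing to merge, which is the "more complex" part in a nutshell.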

look at OTN and metalink. There are many “how to” papers.
Note that many vendors certify RAC differently from the database

alternatives for HA
