Metric Extensions in EM12c #1: Monitor a Useful Target such as a GoldenGate Instance

Back in August 2013, I wrote a post on “Alternative Method to monitor GoldenGate from EM12c outside the GoldenGate 12.1.0.1.0 Plugin”, and in December 2013 I wrote another on a Metric Extension to Monitor Unsupported Database Versions. As it turns out, the first post has been quite useful at many customer sites, but what it lacks is the process to actually build the Metric Extension (ME).

Note: If you are interested in more ways to monitor GoldenGate, be sure to check out my older posts, Bobby Curtis’ posts (1 & 2), and his upcoming presentation at Collaborate 14. Coincidentally, he is sitting with me on the plane ride over to #C14LV at the moment :-)

It’s important for me to share my experience and my reasons for not using the metrics provided with the EM12c GoldenGate plugin: I have found it to be a little inconsistent, for several reasons. These range from Berkeley DB Datastore corruptions, to JAgent hangs, to inaccurate results on the GoldenGate homepage in EM12c, and lastly to unreliable alerting. The JAgent architecture was inherited from the GoldenGate Monitor days and can be roughly described by the illustration below (if it is inaccurate, I’d be more than happy to adjust the diagram). The parts in green describe the components involved in collecting the data from the GoldenGate instance, as well as the EM12c side. In my experience, the process has broken at certain times and on certain platforms (Windows). I worked with Oracle Support for a while until fixes were released with subsequent patches (11.2.1.0.X), but I still found the incident management and subsequent notifications to work unreliably.

The data flow, as illustrated below, works like this: the JAgent periodically connects to the GG objects and stores information from them in its Datastore (dirbdb directory). When the EMAgent polls for updates via the JMX port, it does so by checking the datastore. Once the raw metric is collected within the repository, it is the EM12c incident management framework which triggers notifications.

[Figure: JAgent data flow between the GoldenGate instance and EM12c]

With that being said, I’d like to pick up where I left off way back in August of last year.

I already have the output from the monitor_gg.pl script which I will invoke from my new Metric Extension. Let’s begin with a refresher on the lifecycle of an ME:

[Figure: Metric Extension lifecycle]

This post assumes that:

  • You have already downloaded the monitor_gg.pl script and tested it on the hosts where your GoldenGate instances run, i.e. you ran $ perl monitor_gg.pl and received the output mentioned in my previous post.

Steps

1. Make your way to the Metric Extensions home page.

2. Click on “Create”, and enter the relevant details such as “Name”, and “Display Name”. Make sure you select the Adapter as “OS Command – Multiple Columns”. The rest you can leave at default values, or change as per your desired check frequency.

3. The next few steps involve creating the script so that it is stored within the Metric Extension, unlike in my previous post, where it was located on the actual agent host.

3.1 On the next page, enter the full path of the script in the “Command” section. Alternatively, you could leave the “Command” section set to %perlBin%/perl and enter the absolute path of the script in the “Script” section. Remember that you can upload your own custom script with the Metric Extension, which is stored (on the agent host) in the %scriptsDir%.

3.2. At this point, we need to create the new file “monitor_gg.pl” in the Metric Extension; this is done by either “Adding” or “Uploading” a new one.
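To make the shape of the script's output concrete, here is a minimal Python sketch of what a script behind the “OS Command – Multiple Columns” adapter has to do: print one delimited row per GoldenGate process. It is not the monitor_gg.pl script itself. The sample text follows the layout of GGSCI's info all output, while the group names, the column order, the "MGR" placeholder for the Manager row, and the "|" delimiter are all assumptions made for illustration.

```python
# Sample text in the layout of GGSCI's "info all"; the group names are made up.
SAMPLE_INFO_ALL = """\
Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     RUNNING     EXT1        00:00:02      00:00:05
REPLICAT    ABENDED     REP1        00:10:00      01:02:03
"""

PROGRAMS = {"MANAGER", "EXTRACT", "REPLICAT"}


def to_rows(info_all_text, delimiter="|"):
    """Return one delimited row per process: group, program, status, lag, time since checkpoint."""
    rows = []
    for line in info_all_text.splitlines():
        fields = line.split()
        # Skip the header, blank lines and anything that is not a process line.
        if len(fields) < 2 or fields[0] not in PROGRAMS:
            continue
        program, status = fields[0], fields[1]
        # The Manager line carries no group or lag columns, so fill in
        # placeholder values (an assumption of this sketch).
        group = fields[2] if len(fields) > 2 else "MGR"
        lag = fields[3] if len(fields) > 3 else "00:00:00"
        since = fields[4] if len(fields) > 4 else "00:00:00"
        rows.append(delimiter.join([group, program, status, lag, since]))
    return rows


if __name__ == "__main__":
    for row in to_rows(SAMPLE_INFO_ALL):
        print(row)
```

Whatever delimiter the script prints has to match the delimiter configured for the adapter, and the columns defined in the next step have to follow the same order as the fields in each row.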

4. On the next page, you need to specify the columns returned by the status check. The process is similar to what I mentioned in my previous post Metric Extension to Monitor Unsupported Database Versions, so I will quickly skim through the important bits.

It is important to note that I specified this and the following column as Key Columns. This is because the result set in the ME framework requires unique identifiers.
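The uniqueness requirement is easy to check before deploying. The sketch below is my own illustration, not part of the ME framework: it assumes pipe-delimited rows whose first two fields are the key columns, and it reports any key combination that appears more than once. The function name and the sample rows are hypothetical.

```python
from collections import Counter


def duplicate_keys(rows, key_columns=(0, 1), delimiter="|"):
    """Return the key tuples that occur in more than one row, sorted."""
    keys = [
        tuple(row.split(delimiter)[i] for i in key_columns)
        for row in rows
    ]
    return sorted(key for key, count in Counter(keys).items() if count > 1)


if __name__ == "__main__":
    sample = [
        "EXT1|EXTRACT|RUNNING|00:00:02|00:00:05",
        "REP1|REPLICAT|RUNNING|00:00:00|00:00:03",
        "REP1|REPLICAT|ABENDED|00:10:00|01:02:03",  # same key as the row above
    ]
    print(duplicate_keys(sample))
```

An empty list means every row can be identified by its key columns alone; anything else points to rows that the result set could not tell apart.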

5. The next column represents the actual program name, i.e. Extract, Replicat, Manager etc.

6. Status is an important column because we can use it to trigger state alerts. Note that I have specified the Warning and Critical thresholds, as well as the alert and clear messages. It’s quite cool how customizable the framework can be.
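As a rough illustration of what a state alert on the Status column amounts to, here is a small Python sketch. The mapping is an assumption for the example: which states count as Warning and which as Critical is entirely up to the thresholds you enter in this step, and the message wording here is hypothetical rather than EM12c's own.

```python
# Assumed mapping for illustration; the real values are whatever you
# enter as Warning and Critical thresholds in the column definition.
WARNING_STATES = {"STOPPED"}
CRITICAL_STATES = {"ABENDED"}


def status_severity(status):
    """Classify a process status as CRITICAL, WARNING or CLEAR."""
    normalized = status.strip().upper()
    if normalized in CRITICAL_STATES:
        return "CRITICAL"
    if normalized in WARNING_STATES:
        return "WARNING"
    return "CLEAR"


def status_message(group, program, status):
    """Build a hypothetical alert or clear message for one process."""
    severity = status_severity(status)
    if severity == "CLEAR":
        return f"{program} {group} is {status}"
    return f"{severity}: {program} {group} is {status}"


if __name__ == "__main__":
    print(status_message("REP1", "REPLICAT", "ABENDED"))
    print(status_message("EXT1", "EXTRACT", "RUNNING"))
```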

7. Next, we have the Lag at Checkpoint, a column which we will use for alerting. Note that I have specified the Warning and Critical thresholds, as well as the alert and clear messages.
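Lag values are easier to compare against thresholds once they are plain numbers. The sketch below assumes the lag arrives as an HH:MM:SS string, which is how GGSCI displays it; whether your script converts it before printing is a design choice. The threshold values of 300 and 900 seconds and the "greater than or equal" comparison are placeholders, not recommendations.

```python
def lag_to_seconds(lag):
    """Convert an HH:MM:SS lag string into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in lag.strip().split(":"))
    return hours * 3600 + minutes * 60 + seconds


def lag_severity(lag, warning_s=300, critical_s=900):
    """Classify a lag value against assumed warning and critical thresholds."""
    lag_s = lag_to_seconds(lag)
    if lag_s >= critical_s:
        return "CRITICAL"
    if lag_s >= warning_s:
        return "WARNING"
    return "CLEAR"


if __name__ == "__main__":
    for sample in ("00:00:02", "00:10:00", "01:02:03"):
        print(sample, lag_to_seconds(sample), lag_severity(sample))
```

The same conversion applies to the Time Since Last Checkpoint column, since it uses the same display format.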

8. Time Since Last Checkpoint is set up in the same manner as the previous column.

9. With that, we are done with the column configuration.

10. I leave the default monitoring credentials in place; however, if you are running GoldenGate as a user other than the “oracle” user, you will have to either a) create a new monitoring credential set or b) grant the oracle user execute on the monitoring script.

11. We’re coming to the end now. On the next screen, we can actually see this metric in action by running it against a target.

12. Next, we review our settings and save the Metric Extension.

13. Now, back on the ME home page, the metric is in Editable Stage.

14. We simply need to save it as a “Deployable Draft” or a “Published” extension. The former state allows for deployments to individual targets, whereas the latter is required for deployments to Monitoring Templates.

15. Follow the steps listed under section 10 of my post on creation of metric extensions to deploy the ME.

Once deployed, the metric is collected at the intervals specified in step 2. Depending on how your incident rule sets are configured, you will most likely start receiving alerts once the thresholds we defined above are crossed.

I do have some lessons learned to add to the above posts from an Incident Management perspective, but that will have to be a completely different post :-)

Hope this helps.

Cheers!

4 comments

  1. Maaz:

    This is a Tour de force blog post on creating and deploying useful metric extensions for Oracle EM12c. Thank you for this contribution to the EM12c community!

  2. Cheers! I feel the same way about Blue Medora's contributions to the EM12c community as well!

  3. […] 9, 2014: Check out my follow-up post that describes how the metric extension was actually created in […]

  4. […] wrote about the GoldenGate monitoring using a Metric Extension, and then later on expanded upon the creation of a Metric Extension, I had installed and configured it at several customer sites where GoldenGate was running on a […]
