Metric Extensions in EM12c #1: Monitor a Useful Target such as a GoldenGate Instance

Back in August of 2013, I wrote a post on “Alternative Method to monitor GoldenGate from EM12c outside the GoldenGate 12.1.0.1.0 Plugin”, and in December of 2013 I wrote another on a Metric Extension to Monitor Unsupported Database Versions. As it turns out, the first post has been quite useful at many customer sites, but what it lacks is the process to actually build the Metric Extension (ME).

Note: If you are interested in more ways to monitor GoldenGate, be sure to check out my older posts, Bobby Curtis’ posts (1 & 2), and his upcoming presentation at Collaborate 14. Coincidentally, he is sitting with me on the plane ride over to #C14LV at the moment 🙂

It’s important for me to share my experience and my reasons for not using the metrics provided with the EM12c GoldenGate plugin: I have found it to be a little inconsistent, for several reasons. These range from Berkeley DB datastore corruptions, to JAgent hangs, to inaccurate results on the GoldenGate homepage in EM12c, and lastly unreliable alerting. The JAgent architecture was inherited from the GoldenGate Monitor days and can be roughly described by the illustration below (if it is inaccurate, I’d be more than happy to adjust the diagram). The parts in green describe the components involved in collecting the data from the GoldenGate instance, as well as the EM12c side. In my experience, the process has broken at certain times and on certain platforms (Windows). I worked with Oracle Support for a while until fixes were released in subsequent patches (11.2.1.0.X), but I still found the incident management and subsequent notifications to work unreliably.

The data flow, as illustrated below, works as follows: the JAgent periodically connects to the GG objects and stores their information in its datastore (the dirbdb directory). When the EMAgent polls for updates via the JMX port, it does so by checking the datastore. Once the raw metric is collected within the repository, it is the EM12c incident management framework that triggers notifications.

[Figure: JAgent data flow between the GoldenGate instance and EM12c]

With that being said, I’d like to pick up where I left off way back in August of last year.

I already have the output from the monitor_gg.pl script which I will invoke from my new Metric Extension. Let’s begin with a refresher on the lifecycle of an ME:

[Figure: lifecycle of a Metric Extension]

This post assumes that:

  • You have already downloaded the monitor_gg.pl script and tested it on the hosts where your GoldenGate instances currently run, i.e. running $ perl monitor_gg.pl returns the output mentioned in my previous post.
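
For orientation, a reader's sample output in the comments further down shows one pipe-delimited row per GoldenGate process, with six fields per row. A minimal Python sketch for sanity-checking that kind of output before wiring it into an ME might look like the following. The helper names `parse_rows` and `check_field_count` are hypothetical and are not part of monitor_gg.pl; the expected field count of six is taken from that reader's sample and may differ for your version of the script.

```python
# Three rows copied from the sample output posted in the comments.
SAMPLE = """\
MANAGER|MANAGER|RUNNING|0|0|0
E103XDA3|EXTRACT|RUNNING|5|3|0
R311XDA3|REPLICAT|RUNNING|554|7|0
"""


def parse_rows(text, delimiter="|"):
    """Split each non-empty line into its fields, keeping the raw strings."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        rows.append(line.split(delimiter))
    return rows


def check_field_count(rows, expected):
    """Return the rows whose field count differs from the expected column count."""
    return [row for row in rows if len(row) != expected]


rows = parse_rows(SAMPLE)
bad = check_field_count(rows, expected=6)  # assumption: six fields per row
print(len(rows), "rows,", len(bad), "malformed")
```

A row with a missing or extra field is worth catching here, since the ME maps fields to columns by position.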

Steps

1. Make your way to the Metric Extensions home page.


2. Click on “Create” and enter the relevant details such as “Name” and “Display Name”. Make sure you select “OS Command – Multiple Columns” as the Adapter. You can leave the rest at the default values, or change them to suit your desired check frequency.


3. The next few steps involve creating the script so that it is stored within the Metric Extension, rather than on the actual agent host as in my previous post.

3.1 On the next page, enter the full path of the script in the “Command” section. Alternatively, you can leave the “Command” section set to %perlBin%/perl and enter the absolute path of the script in the “Script” section. Remember that you can upload your own custom script with the Metric Extension, which is then stored (on the agent host) in the %scriptsDir%.


3.2. At this point, we need to create the new file “monitor_gg.pl” in the metric extension; this is done by either “Adding” or “Uploading” a new one.


4. On the next page, you need to specify the columns returned by the status check. The process is similar to what I mentioned in my previous post Metric Extension to Monitor Unsupported Database Versions, so I will quickly skim through the important bits.


It is important to note that I specified this and the following column as Key Columns. This is because the result set in the ME framework requires unique identifiers.
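
To see what that uniqueness requirement means in practice, here is a small Python sketch that looks for repeated key values in pipe-delimited output. A non-unique key is what leads to a “Result has repeating key value” error like the one a reader reports in the comments. The sketch assumes the first two fields are the key columns, as configured here; `find_duplicate_keys` is a hypothetical helper and the sample rows are made up for illustration.

```python
from collections import Counter


def find_duplicate_keys(lines, key_fields=2, delimiter="|"):
    """Return the key tuples that appear on more than one row."""
    keys = Counter(
        tuple(line.strip().split(delimiter)[:key_fields])
        for line in lines
        if line.strip()
    )
    return [key for key, count in keys.items() if count > 1]


# Made-up rows: the second EXT1 row repeats the (name, type) key.
sample = [
    "MANAGER|MANAGER|RUNNING|0|0",
    "EXT1|EXTRACT|RUNNING|5|3",
    "EXT1|EXTRACT|ABENDED|0|0",
]
duplicates = find_duplicate_keys(sample)
print(duplicates)
```

Running a check like this against the raw script output on the agent host is a quick way to rule the script in or out before looking at the ME definition itself.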


5. The next column represents the actual program name, i.e. Extract, Replicat, Manager etc.


6. Status is an important column because we can use it to trigger state alerts. Note that I have specified the Warning and Critical thresholds, as well as the alert and clear messages. It’s quite cool how customizable the framework can be.

7. Next, we have the Lag at Checkpoint, a column which we will use for alerting. Again, I have specified the Warning and Critical thresholds, as well as the alert and clear messages.


7.1 Time Since Last Checkpoint is set up in the same manner as the previous column.


8. With that, we are done with the column configuration.
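
To make the threshold logic concrete, here is a rough Python sketch of how a Warning/Critical comparison on a numeric column such as Lag at Checkpoint behaves. The threshold values (300 and 600), the process names, and the `severity` helper are purely illustrative assumptions, not the values from this ME, and the “greater than” comparison is just one of the operators you can pick. EM12c evaluates the real thresholds itself once the metric is collected.

```python
def severity(value, warning, critical):
    """Classify a numeric metric value using a 'greater than' comparison."""
    if value > critical:
        return "CRITICAL"
    if value > warning:
        return "WARNING"
    return "CLEAR"


# Hypothetical thresholds for a lag column, in whatever unit the script reports.
WARNING_LAG = 300
CRITICAL_LAG = 600

for name, lag in [("EXT1", 5), ("REP1", 554), ("REP2", 1058)]:
    print(name, lag, severity(lag, WARNING_LAG, CRITICAL_LAG))
```

The Critical check comes first so that a value above both thresholds is reported once, at the higher severity.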


9. I leave the default monitoring credentials in place. However, if you are running GoldenGate as a user other than the “oracle” user, you will have to either a) create a new monitoring credential set or b) grant the oracle user execute on the monitoring script.


10. We’re coming to the end now. On the next screen, we can actually see this metric in action by running it against a target.


11. Next, we review our settings and save the Metric Extension.


12. Now, back on the ME home page, the metric is in Editable Stage.


13. We simply need to save it as a “Deployable Draft” or a “Published” extension. The former state allows for deployments to individual targets, whereas the latter is required for deployments to Monitoring Templates.


14. Follow the steps listed under section 10 of my post on the creation of metric extensions to deploy the ME.

Once deployed, the metric is collected at the intervals specified in step 2. Depending on how your incident rule sets are configured, you will most likely start receiving alerts once the thresholds we defined above are crossed.

I do have some lessons learned to add to the above posts from an Incident Management perspective, but that will have to be a completely different post 🙂

Hope this helps.

Cheers!


17 comments

  1. Maaz:

    This is a Tour de force blog post on creating and deploying useful metric extensions for Oracle EM12c. Thank you for this contribution to the EM12c community!

  2. Cheers! I feel the same way about Blue Medora's contributions to the EM12c community as well!

  3. […] 9, 2014: Check out my follow-up post that describes how the metric extension was actually created in […]

  4. […] wrote about the GoldenGate monitoring using a Metric Extension, and then later on expanded upon the creation of a Metric Extension, I had installed and configured it at several customer sites where GoldenGate was running on a […]

  5. […] I wrote about GoldenGate monitoring using a Metric Extension, and then later on expanded upon the creation of a Metric Extension, which I’ve installed and configured at several customer sites where GoldenGate was running on a […]

  6. If I upload an additional script in the uploads section (2 in total), which needs to be invoked from the first script, how should my code look? I am using a shell script. If you can provide a small snippet of how it would look, that would definitely help. I am able to achieve this on the Linux host, but the same in OEM doesn’t succeed. I am stuck in this case. Please advise me with a small shell snippet rather than perl. Thank you.

  7. If I upload an additional script which I would call in the main script, how should I invoke it in the main script. I am trying this but didn’t succeed. I am using the shell script. It runs fine on the Linux host but in Metric Extension I seem to be invoking it in a wrong manner. My syntax seems to be not correct for the ME to find the script and execute it and bring its results to me. The error it returns is: [no such file or directory]
    I tried invoking it in the following ways in script1.ksh:
    . script2.ksh
    ksh script2.ksh
    $(ksh script2.ksh)
    ksh %scriptsDir%/script2.ksh
    $(ksh %scriptsDir%/script2.ksh)
    I tried these because the scripts exist in the %scriptsDir% directory. Kindly help me.

    1. Sridhar,

      %scriptsDir% is located within the $AGENT_HOME directory. You don’t need to physically place the actual script there; that’s the Metric Extension logic’s job. If I understand your question, you’re asking how to invoke a subscript from within another script, correct?

  8. Yes Maaz,
    I am trying to invoke a subscript within another script, but I am unable to understand where I am getting the syntax wrong. Can you correct me if I am invoking it wrongly?

    1. I don’t have access to a working version of em12c therefore can’t actually try it out, but from what I recall, you should be able to upload both of your scripts in the ME, and reference it just the way you’re doing it now. But, by its pure nature, why do you want to create two separate scripts within a single ME? Why not just combine them?

  9. Can you please let me know which table stores the metric extension version deployed on most of the targets?

    1. Kiran,

      I don’t have access to an EM environment to check, but the list of documented views should help guide you in the right direction. If I recall correctly, they’re still stored in the same views as regular metrics; try searching in mgmt$metric_details or mgmt$metric_current.

      https://docs.oracle.com/cd/E73210_01/EMVWS/GUID-FEFC509F-FAD8-4374-AB96-67D83601F8EE.htm#EMVWS32314

      Good luck.

      Cheers,
      Maaz

  10. Poornima Risbud:

    Hi Maaz,

    I tried to implement this metric extension and am getting the error below. If you see this message and have some time, it would be great to get help with it.
    Failed to get test Metric Extension metric result.: Result has repeating key value : checkpoint,Error

    1. Hi Poornima!

      This just means the key you defined for the metric extension is not unique. Take a look at the result coming back from the script you’re running, against the “key” you’ve defined. Make sure the rows are unique.

      Good luck!

      Cheers,
      Maaz

      1. Risbud, Poornima K:

        Hi Maaz,

        This is my result set, I see everything unique..

        MANAGER|MANAGER|RUNNING|0|0|0
        E103XDA3|EXTRACT|RUNNING|5|3|0
        E203XDA3|EXTRACT|RUNNING|60|3|0
        E303XDA3|EXTRACT|RUNNING|5|6|0
        E403XDA3|EXTRACT|RUNNING|227|16|0
        E503XDA3|EXTRACT|RUNNING|2|0|0
        E603XDA3|EXTRACT|RUNNING|24|1|0
        EL13XDA3|EXTRACT|RUNNING|3|9|0
        R112XDA3|REPLICAT|RUNNING|0|1|0
        R211XDA3|REPLICAT|RUNNING|0|3|0
        R311XDA3|REPLICAT|RUNNING|554|7|0
        R412XDA3|REPLICAT|RUNNING|0|1|0
        R41AXDA3|REPLICAT|RUNNING|1058|0|0
        R511XDA3|REPLICAT|RUNNING|3|1|0
        R612XDA3|REPLICAT|RUNNING|0|1|0
        R61AXDA3|REPLICAT|RUNNING|0|1|0
        RL12XDA3|REPLICAT|RUNNING|0|0|0
        RL22XDA3|REPLICAT|RUNNING|0|0|0

        Thanks,

        Poornima Risbud

      2. How did you define your key value for the metric extension?

      3. Risbud, Poornima K:

        Sent mail to another email..if you could respond that will be great. The key columns are defined exactly the same as in your blog.

        Thanks,

        Poornima Risbud
