...

Admins Instructions

To find the group owner of the system page:

1. Go to Xymon.
2. Find the production status indicator (pictured below). It is red if there is an alarm.
Production Status Indicator.PNG
3. Click the red status indicator for the affected system.
4. Continue moving through the sub-menus until you find the page.
If the page has special instructions at the top, follow them when deciding whether an admin should be contacted and which admin to contact.
citrix.PNG

5. If the system name is clickable, it has special instructions. Follow those instructions.

System name.png

These instructions often describe when NOT to call.
instructions.PNG

6. Once step 4 is complete, find the system in the Footprints Change and Release Management CMDB.

Search for the name of the system.
Click on the related CI(s) link.
Set the CI’s bubble (radio) button to “Named server”.
Click on the Managed by Relationship, and click “Go To”
The on call group information should be listed here.
7. If there is no information from the previous steps:

Search the communication log for the alarming system.
Review past correspondence and determine who to call.
If the server name starts with a “W”, it is most likely handled by the Windows on-call.
If the server name starts with an “L”, it is most likely handled by the Unix (Linux) on-call.
Consult with your coworkers.
Consult your supervisor or the on-call supervisor.
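
The first-letter rule in step 7 can be sketched as a small helper. This is only an illustration of the rule of thumb, not an existing tool: the function name and the group labels are hypothetical, and treating the first letter as case-insensitive is an assumption (the host names shown elsewhere on this page are lowercase). The result is a hint at most; the communication log, coworkers, and supervisor remain the deciding sources.

```python
from typing import Optional

# Hypothetical labels; the page only refers to the Windows and Unix on-call.
WINDOWS_ON_CALL = "Windows on-call"
UNIX_ON_CALL = "Unix (Linux) on-call"


def guess_on_call_group(server_name: str) -> Optional[str]:
    """Guess the likely on-call group from the first letter of a server name.

    Returns None when the first-letter rule gives no hint, in which case
    the remaining steps (coworkers, supervisor) still apply.
    """
    name = server_name.strip()
    if not name:
        return None
    first = name[0].upper()  # assumption: the rule ignores letter case
    if first == "W":
        return WINDOWS_ON_CALL
    if first == "L":
        return UNIX_ON_CALL
    return None
```

Under these assumptions, `guess_on_call_group("lppbakbm01.itap.purdue.edu")` returns the Unix (Linux) label, while `guess_on_call_group("tsm01.itap.purdue.edu")` returns `None`, so the rule gives no hint for that host.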


Xymon Top Level View

...




From the top level a user may drill down to determine a number of factors. To investigate a given category, click the face or symbol next to each title. Users who look at the Production category are presented with a large list of Services. Below is the current list.

...

It may be a mouthful of a title, but it's entirely accurate. When a user can only have one Xymon window open, it should be this one. Again, SquaredUp will display critical alerts for production machines. With those two programs running, a user should be able to determine which alerts warrant further attention without much wasted effort. The non-green systems category provides a real-time listing of the most recent changes in machine status (going back up to four hours) for every monitored device. It can also display the last four hours of event acknowledgments by system administrators. Finally, all alerts are displayed in a dynamic, expanding grid format.


Current Status

Any machines with a current error condition will be displayed at the top of the page. Selecting any of the status icons to the right of the machine listing will bring up service information for that system. If there are no current error conditions, this portion of the page will display “All Monitored Systems OK”.
Users who click on the underlined name of a machine will see information deemed appropriate by the administrators. The displayed information may be the other systems in a cluster, or specific information about a given server. For example, lppbakbm01.itap.purdue.edu is a Backup Production server. When it alarms in the ‘Current non-green Systems’ window, clicking its title brings the user to the ‘Backup Production’ category. If tsm01.itap.purdue.edu were in alarm instead, the user would bypass the ‘Backup Production’ category entirely and see specific instructions for tsm01 alerts. This is similar to the ‘Production Services’ example earlier in this document. From the ‘Current non-green Systems’ view, it is impossible to tell which machines have specific on-call instructions and which do not. The only consistent way to tell whether a machine has further alarm instructions is to drill down to its lowest directory. Any names underlined within this category will contain specific instructions.

...