Quick and unrefined notes on Update Roll-up 4 for System Center 2012 R2 - Operations Manager.

Preparation

The usual routine applies. Check the KB for instructions and take note of known issues. Check if Kevin Holman has written something about it. As this is an update roll-up, I pre-emptively expect that the gotchas in UR3 may apply. Remember to open the update catalog in IE, as the downloader is not working in other browsers. Download, unpack, and toss the languages that do not apply to your organization. It is advisable to disable any mail-generating alert subscriptions during the upgrade process to avoid unnecessary spammage.

Issues - So Far

[updated: 2014-11-03] Got a few problems with cross-platform monitoring templates not working after the update. This was due to missing files in the update package. Make sure you download the updated version! I have not updated a customer using gateways yet, so unless they have fixed the issues in UR3, expect an update to this section soon.

Planning
Story-time

I saw SquaredUp a year or two ago while googling about on behalf of a customer looking for a dashboard kind of thingy. It looked good and fairly simple, but for some reason it never clicked with the customer and we ended up going for some custom-made dashboards with a little scripting and some DB queries. I kind of liked the look of their product though, and have kept an eye on them now and then.

Fast-forward to May 21st this year and the release of version 1.8 with a whole slew of nifty little features. What specifically piqued my interest was the linked dashboards, the SharePoint integration and the included SLA and Map plugins. This basically ticked a lot of boxes many of my customers have been looking for, and something we’ve normally been looking into… err… other products for. That, coupled with some new videos on their YouTube channel, a few well-placed tweets and a little mail correspondence, had me setting it up in my portable little lab.

One of the interesting points is how easy it supposedly is to set the portal up. So I reset my lab (PDT is just wonderful) and decided to go for the hail-dummy approach. No manual, no preparations, no check-lists… Next-next-next, then hopefully a working portal.

The Installation

First, download the installation file (yes, singular) through the link you’ve got in your email and save it somewhere proper.

Double-clicked the installation package and it now tells me it will install a few pre-requisites and configure the website for me. Next!

The EULA, which I am sure each one of you is reading. Next!

Installing, or rather configuring, IIS and pre-requisites for me. Very nice. Next!
Quick and unrefined notes on Update Roll-up 3 for System Center 2012 R2 - Operations Manager.

Preparation

Usual routine: check the KB for instructions and known issues. Double-check with Holman’s blog and take note of any irregularities.

Download, curse your favourite deity and re-try in IE. Unpack the CAB-files (why do you keep putting them in CAB-files?) and throw away all the unnecessary language-specific console patches.

Notify affected parts of the organisation. I’ve had the benefit of working with good release managers at all clients so far, so no biggie. Then try to close as many consoles as possible to avoid any blocks in the database. Make sure you have the credentials for the Data Access service account at hand.

Issues - So Far

(updated 2014-09-02)

The SQL script for the OpsDW was deadlocked at all sites and customers without exception. It’s easily fixed by stopping the management server services.

A few dashboards (the 2012 versions) went blank. They self-healed after some aggregation job overnight. From what I’ve noticed, only the SLA Dashboards are affected, but never all of them.

Agents behind daisy-chained Gateways are not identified as in need of an update. “Repairing” them from the Console is one work-around that seems to work consistently.

A few agents that were updated using Windows Update did not report as updated. A repair fixed that nuisance.

Had to flush the cache on a few Gateways to avoid heartbeat failures from their agents. Only daisy-chained ones, if I recall correctly.

Planning
By request, I uploaded a short clip demonstrating how you would add a Windows performance counter to a performance collection rule using the Authoring Console. It is a fairly simple task to complete but does require the Authoring Console, obviously, and a better target class than the one I use in the demo. The demo also assumes that this counter exists on all the targeted servers in your environment. It would be wise, when making your management pack, to check that it’s there on all targeted operating systems, and that’s what I use Performance Monitor for. (Just search for perfmon in your start menu or run perfmon.exe.) Enjoy.
Now we’re gonna make things even faster! In the previous post on the subject of Agent Fail-over in Operations Manager 2012 we created a script that goes through a selection of agents and makes sure that they all have up-to-date fail-over settings. We are doing the same thing in this one, but making it go faster. In my lab it is about five times faster, in fact, and I only have about 20 agents to play with. Not really a big deal, but scale it up a bit and add a few thousand agents and the pay-off will be very significant.

As usual, the script will work as is, but it really is more to show the concept. You would have to add filtering to make sure you don’t mix agents behind gateway servers and agents behind management servers. Giving an agent behind a gateway a management server as its fail-over server will likely not help you in any way.

We will pretty quickly go “advanced” this time, so buckle up. ;) As this is a slight modification of the script in the last post, I am not going to go through those details again. Use that post if you need references to the Inputs, the OpsMgr 2012 Modules, the Management Group connection and gathering your agents and management servers.

Why Is It Faster?

We are doing the same thing, on the same agents and with the same servers. And we already did some optimization by loading them all into memory and working from there. How do you make it faster? Basically, I’m cutting the over-head of the cmdlets and how they work. You may have noticed that in the “Do Stuff” section, we are actually calling the Set-SCOMParentManagementServer cmdlet twice! Once for the primary Management Server and once for the fail-over Management Servers. In effect, we connect, fire a command, wait for the result, and disconnect two times for each agent. And pretty much only because the cmdlet does not support setting primary and fail-over management servers at the same time. Any attempt to do so will return an ambiguous parameter error. I don’t like that.
A brief look at the agent object class, Microsoft.EnterpriseManagement.Administration.AgentManagedComputer, revealed a method called SetManagementServers. This method takes, or actually “requires”, two parameters: one for the primary and one for the fail-over management servers. Yay! Using this method saves us a bunch of over-head and a couple of round-trips to the SDK-service.

The Challenge
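To make the SetManagementServers idea a bit more concrete, here is a minimal sketch of how such a call could look. It is not the script from the post: the function names `Select-FailoverServer` and `Set-AgentManagementServers` are hypothetical, the comparison on the `Name` property is an assumption, and it presumes the OperationsManager module is loaded with an open management group connection. The second argument is built as a typed generic list on the assumption that the method expects a list of management server objects.

```powershell
# Sketch only. Function names are hypothetical and the SCOM types are only
# resolved when Set-AgentManagementServers is actually called.

function Select-FailoverServer {
    # Return every candidate whose Name differs from the primary's Name.
    param(
        [Parameter(Mandatory = $true)] $Primary,
        [Parameter(Mandatory = $true)] [object[]] $Candidates
    )
    $Candidates | Where-Object { $_.Name -ne $Primary.Name }
}

function Set-AgentManagementServers {
    param(
        [Parameter(Mandatory = $true)] $Agent,       # agent object, e.g. from Get-SCOMAgent
        [Parameter(Mandatory = $true)] $Primary,     # management server to use as primary
        [Parameter(Mandatory = $true)] [object[]] $AllServers
    )

    # Assumption: the fail-over argument must be a typed list of ManagementServer.
    $listType = 'System.Collections.Generic.List[Microsoft.EnterpriseManagement.Administration.ManagementServer]'
    $failover = New-Object $listType

    foreach ($ms in (Select-FailoverServer -Primary $Primary -Candidates $AllServers)) {
        $failover.Add($ms)
    }

    # One call sets both primary and fail-over servers.
    $Agent.SetManagementServers($Primary, $failover)
}
```

The filtering is kept in its own small function so it can be tried out with plain objects before pointing the second function at real agents.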
In the last post, OpsMgr 2012 Agent & Gateway Failover – The Basics, we looked at the basics of the Agent and Gateway fail-over configuration cmdlets and how to use them in a direct and interactive setting. This is absolutely useful when you have a specific agent that you need to configure with a specific fail-over management server. To spice it up a little, we are going to add a little intelligence to it and enable wild-card selections while at it.

The scenario we are building this script for is that now and then you want to make sure that certain agents have fail-over management servers configured. You also want to make sure that all management servers that are not the primary management server of a selected agent are in that agent’s list of fail-over servers. This would include any new management servers as well as exclude any removed ones. In short, make sure your agent fail-over settings are up-to-date with the current environment.

Inputs

To use this script you need to know which management server you should connect your PowerShell session to and which agent, or agents, you want to check and configure.

[powershell]
# Input SCOM Management Server to connect to in this session
[string]$inputScomMS = "scomms01.domain.local"

# Input an existing agent you want to modify
[string]$inputTargetAgent = "*.domain.local"
[/powershell]
I have previously posted a few scripts on managing and configuring fail-over management servers on gateways and agents in System Center Operations Manager 2007 R2. Now that System Center 2012 Operations Manager is RTM and users are starting to explore the differences between the versions, I see more and more questions on how you do, in OpsMgr 2012, what you did in OpsMgr 2007. In the next few posts I will go through Agent and Gateway server fail-over configuration and management. In this first post I’ll look at the very basics of fail-over configuration, the cmdlets to use and some one-liners.

The cmdlet

First of all, the cmdlets of OpsMgr PowerShell have all got new names looking like Verb-SCOMNoun, and to list them all in the console you can execute the following command:

[powershell]Get-Command *SCOM*[/powershell]

The cmdlet we are looking for to set and manage primary and fail-over management servers is

[powershell]Set-SCOMParentManagementServer[/powershell]

As usual, you can pass the cmdlet as a parameter to Get-Help for information about its parameters and a few use-cases.

SYNOPSIS
Changes the primary and failover management servers for an agent or gateway management server.

SYNTAX
Set-SCOMParentManagementServer -Agent -PrimaryServer [-PassThru] [-Confirm] [-WhatIf]
Set-SCOMParentManagementServer -Agent -FailoverServer [-PassThru] [-Confirm] [-WhatIf]
Set-SCOMParentManagementServer -GatewayServer -FailoverServer [-PassThru] [-Confirm] [-WhatIf]
Set-SCOMParentManagementServer -GatewayServer -PrimaryServer [-PassThru] [-Confirm] [-WhatIf]

But that’s so boring to read, and the manual is a bit sketchy on how the cmdlet behaves and on its limitations.
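As a rough illustration of the syntax above, a primary and a fail-over management server could be assigned to a single agent along these lines. This is a minimal sketch rather than anything taken from the post: the function name `Set-AgentParentServer` and the server names in the usage comment are hypothetical, and it assumes the OperationsManager module is imported and a management group connection is already open.

```powershell
# Hypothetical helper; assumes the OperationsManager module is imported
# and a management group connection is already open.
function Set-AgentParentServer {
    param(
        [Parameter(Mandatory = $true)] [string]   $AgentName,
        [Parameter(Mandatory = $true)] [string]   $PrimaryName,
        [Parameter(Mandatory = $true)] [string[]] $FailoverName
    )

    # Look up the objects the cmdlet expects.
    $agent    = Get-SCOMAgent -DNSHostName $AgentName
    $primary  = Get-SCOMManagementServer -Name $PrimaryName
    $failover = Get-SCOMManagementServer -Name $FailoverName

    # -PrimaryServer and -FailoverServer sit in different parameter sets,
    # so they are set in two separate calls.
    Set-SCOMParentManagementServer -Agent $agent -PrimaryServer $primary
    Set-SCOMParentManagementServer -Agent $agent -FailoverServer $failover
}

# Example call (all names are placeholders):
# Set-AgentParentServer -AgentName 'server01.domain.local' `
#     -PrimaryName 'scomms01.domain.local' -FailoverName 'scomms02.domain.local'
```

Adding -WhatIf to the two Set-SCOMParentManagementServer calls is a cheap way to see what would change before committing to it.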
Prelude

System Center Operations Manager no longer has that pesky Root Management Server role, a server role that in larger environments quickly became the choking point and made creating a fully Highly Available SCOM environment both complex and frustrating to support, and with little gain at that. With that gone and the SDK Service, or Data Access Service, thriving on all the Management Servers, HA suddenly became pretty simple. All you have to do in SCOM 2012 to make sure your management groups keep on kicking is to have at least two Management Servers and your databases clustered.

This new distributed architecture does not only give easy HA, it also makes it possible to connect to the SDK-service (be it using the Operations Console or PowerShell, to name two options) on any Management Server. This, in turn, provides for a completely new level of scalability. Choked on sessions? Deploy a new Management Server!

Anyway… given all this scalability and HA, would it not be nice if you could load-balance all these SDK-sessions you will be running from System Center Virtual Machine Manager, System Center Service Manager, System Center Orchestrator, regular scheduled PowerShell scripts and what-not? Of course it would! And you can! The simple solution is to use the built-in Network Load Balancer (NLB for short) feature in Windows Server, and that’s what we’re going to discuss in this post.

Before we go, I’d like to point to a great article written by Justin Cook that covers most bases, but in a less for-dummies way. So, yeah… I suppose this is the for-dummies version then. ;) Enjoy!

Prerequisites

We need to have the Network Load Balancing feature installed on all our targeted Management Servers. The quick way to do this is using the command line (Windows Server 2008 R2 or later?).

dism /online /enable-feature /featurename:NetworkLoadBalancingFullServer
…and why you should not use it

A Disclaimer

I have had serious doubts about actually writing this article for almost a year now, for reasons that I will explain further on. But as others have discovered this “feature” as well (maybe “hack” would be a better name for it), I feel the need to explain how it works and also why you should not use it. Knowledge is power, and even if I advise against using this technique, it is also a good way to understand how SCOM uses display-strings in management packs.

The Good News

Yes, you can use parameter replacement in your AlertName. By “parameter replacement” I mean using some kind of substitute text, or mnemonic if you like, that at run-time gets translated into something useful. If you have written any kind of alert-generating rules or monitors, you most likely included something like $Data/Context/Property[@Name='SomeDataFromAPropertyBag']$ in your alert description. In this dialog, you also have the possibility to set the Alert Name. And if you are lazy, like I am, you probably also noticed that it is impossible to insert any kind of dynamic data into that field as well. This is especially annoying when you are writing a management pack that needs to look different in the Alert Views in the console, and you want to monitor 50 different Events or Performance counters or Log entries that are pretty much the same apart from a Name or ID.

Of course I could not refrain from copy-pasting a $Data/Context…$ into the alert name, only to realize that it simply is not being parsed and translated into the value of the specified parameter. Over time I have settled for the stand-point that it’s probably a performance issue, and I have also used that as an argument for this apparent lack of simplicity that some of my customers have been questioning.

Two, maybe three, years later, Microsoft released an update to the core agent monitoring packs.
Much to my surprise, one performance monitor suddenly generated alerts with a dynamic performance value in the Alert Name. You know, that field that is not getting parsed, as I mentioned in the earlier paragraph. It actually looked pretty bad and made it very much impossible to practice any kind of alert suppression, but still. It actually had a parsed value in the Alert Name.

As the lack of this feature had me irked before, I exported the core MP and started reading through the XML to find out how they did it. To my surprise, it was actually pretty simple if you ditched the Authoring Console and used your trusty text-editor instead.

How To Do It

In simple terms, if you know your SCOM XML inside out, you add the parameters to your “Alert” and modify your DisplayString, the one under LanguagePacks, to call that parameter by its relative ID.
Here’s a little something-something for the wicked. My apprentice and I are currently decommissioning an entire Management Group with a thousand (-ish) agents. Long story short, we got a new Management Group, migrated all the agents, added a couple of hundred more, deployed a bunch of gateways and now we are shutting down the old one.

Now, uninstalling the old Management Group from all the agents is a breeze using SCCM, and handling the 20-ish servers that are left is not a biggie either. Shutting down ACS, however, is a different matter. Although you do configure your forwarders using Operations Manager, removing the management group you were running ACS in does not mean the agents will shut down and disable the AdtAgent service or stop trying to forward audit events to your collector.

Now, selecting 10 agents at a time and running the “Disable Audit Collection” task (in case you did not know, there’s a limitation on how many agents you can run a task on in the Operations Console) is not my idea of a jolly good day, and since PowerShell is a bucket of joy in comparison, here’s a script for you all!

DisableACSForwarders

It is zipped to avoid security alerts, but as with any script found on the internet I implore you to read the code before actually running it.

Anyway, you can use it in a couple of ways. To run it interactively, just go to the directory where you unpacked it and run it. You will be requested to enter the FQDN of your Root Management Server and a wildcard search for ACS Forwarders.

For example:

C:\..\Scripts> .\DisableACSForwarders.ps1
Root Management Server: rms.teknoglot.local
ACS Forwarder name (wildcard): *.teknoglot.local