Upgrading #OpsMgr 1801 to 1807 - Fieldnotes

My Fieldnotes are quick, unrefined notations and reflections from the field. Content may be obvious and unnecessary to some, useful to others. The main purpose is for them to be used as a searchable notebook. These notes are not to be seen as a manual, a how-to or a set of instructions, but rather a collection of thoughts, reflections and experiences from my field-work.

Abstract

Upgrading from OpsMgr (SCOM) 1801 to 1807 is generally a safe operation. It can be done using Windows Update if you want to, but I would suggest only letting Management Servers, Reporting Servers and Web Console servers pre-load the update, so you can run a controlled update at a suitable time. In general, the updates go smoothly, but you still have to run the SQL script(s) and import the updated Management Packs provided with the update. This should be done directly after updating the last Management Server.

Links

Official Documentation
Download Location
Announcement Blog
What’s New! - Webinar Recording
List of fixes

Notes

Updating the Management Servers

Fairly quick update, may need a restart afterwards.
A Collection of OpsMgr Upgrade Fails

I’ll be frank on this one; Microsoft really dropped the ball on the 1801 setup program. No upgrade or update has been this ridden with faults and obscure errors, not even the infamous SCOM 2007 SP1 setup. And these are not only errors that will abort the installation; they will actively remove your existing SCOM components and force a restore, either from snapshot/checkpoint or from backup.

MAKE SURE YOUR BACKUPS ARE WORKING!!! Rollbacks don’t work in 1801, at all. You have been warned.

Here’s my list of the issues I’ve seen and wrestled with so far.

.NET 3.5 Pre-requisites

Although neither SCOM 2016 nor 1801 actually has a .NET 3.5 requirement, the setup thinks it does. The Prerequisite checker isn’t aware of this though, meaning it will happily try to upgrade and then fail. And, boy, does it fail spectacularly! When the upgrade fails, it’s supposed to perform a rollback, and reading the setup log it actually does try. Unfortunately, the “rollback” will remove any existing SCOM Roles on the server. O_o Yes, that’s right. Your Management Server is no longer a Management Server!

Workaround

Make sure all SCOM Management Servers have .NET Framework 3.5 installed before attempting an upgrade.
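A minimal sketch of the workaround, assuming Windows Server 2012 R2 or later with the ServerManager module available and an elevated PowerShell session. The feature name NET-Framework-Core is the standard Windows feature for .NET Framework 3.5; whether you need to supply installation media depends on your environment, so treat this as a starting point rather than a tested procedure.

```powershell
# Run in an elevated PowerShell session on each Management Server.
Import-Module ServerManager

# Check whether the .NET Framework 3.5 feature is already present.
$feature = Get-WindowsFeature -Name NET-Framework-Core

if (-not $feature.Installed) {
    # Assumption: the feature payload is reachable (locally or via Windows Update).
    # If it is not, add -Source pointing at the sources\sxs folder of the OS media.
    Install-WindowsFeature -Name NET-Framework-Core
}
```

Running the check first keeps the script safe to repeat across all Management Servers before the upgrade window.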
Untimely, perhaps

While most of us are waiting for SCOM 2016 RTM to be generally available, I’ve completely forgotten to blog about the little Set-SCOMMaintenanceModeDeluxe.ps1 script I wrote a while ago.
TLDR

Customer wanted to see all RDS Host servers in a view with their current total session count. Decided to use a PowerShell grid dashboard, and share the script. Here’s the gist of it: SCOM_RDSH_TotalSession_PoSHWidget.ps1

How To

Keeping it fairly short this time. Pre-requisites are:

System Center 2012 R2 with UR2 or later
Microsoft RDS Management Pack
Microsoft Windows Core OS Management Pack

Create a dashboard

Right-click and create a new view somewhere, make it a Dashboard View
Enter Name and Description
Select Grid type, and layout
Was troubleshooting this little error message for a customer after deploying the SQL Server Management Pack version 220.127.116.11. The event is the generic “Health Service Script” with id 4001. Management G
Quick and unrefined notes on Update Roll-up 4 for System Center 2012 R2 - Operations Manager

Preparation

The usual routine applies. Check the KB for instructions and take note of known issues. Check if Kevin Holman has written something about it. As this is an update roll-up, I pre-emptively expect that the gotchas in UR3 may apply. Remember to open the update catalog in IE, as the downloader does not work in other browsers. Download, unpack, and toss the languages that do not apply to your organization. It is advised to disable any mail-generating alert subscriptions during the upgrade process to avoid unnecessary spammage.

Issues - So Far [updated: 2014-11-03]

Got a few problems with cross-platform monitoring templates not working after the update. This was due to missing files in the update package. Make sure you download the updated version!

Have not updated a customer using gateways yet, so unless they have fixed the issues in UR3, expect an update to this section soon.

Planning
Story-time

I saw SquaredUp a year or two ago while googling about on behalf of a customer looking for a dashboard kind of thingy. It looked good and fairly simple, but for some reason it never clicked with the customer and we ended up going for some custom-made dashboards with a little scripting and some DB-queries. I kind of liked the look of their product though, and have kept an eye on them now and then.

Fast-forward to May 21st this year and the release of version 1.8 with a whole slew of nifty little features. What specifically piqued my interest was the linked dashboards, the SharePoint integration and the included SLA and Map plugins. This basically ticked a lot of boxes many of my customers have looked for, and something we’ve normally been looking into… err… other products for. That, coupled with some new videos on their YouTube channel, a few well-placed tweets and a little mail correspondence, had me setting it up in my portable little lab.

One of the interesting points is how supposedly easy it is to set the portal up. So I reset my lab (PDT is just wonderful) and decided to go for the hail-dummy approach. No manual, no preparations, no check-lists… Next-next-next, then hopefully a working portal.

The Installation

First, download the installation file (yes, singular) through the link you’ve got in your email and save it somewhere proper.

Double-clicked the installation package and it now tells me it will install a few pre-requisites and configure the website for me. Next!

The EULA, which I am sure each one of you is reading. Next!

Installing, or rather configuring, IIS and pre-requisites for me. Very nice. Next!
Quick and unrefined notes on Update Roll-up 3 for System Center 2012 R2 - Operations Manager.

Preparation

Usual routine: check the KB for instructions and known issues. Double-check with Holman’s blog and take note of any irregularities. Download, curse your favourite deity and re-try in IE. Unpack the CAB-files (why do you keep putting them in CAB-files?) and throw away all the unnecessary language-specific console patches. Notify affected parts of the organisation. I’ve had the benefit of working with good release managers at all clients so far, so no biggie. Then try to close as many consoles as possible to avoid any blocks in the database. Make sure you have the credentials for the Data Access service account at hand.

Issues - So Far (updated 2014-09-02)

The SQL script for the OpsDW was deadlocked at all sites and customers, without exception. It’s easily fixed by stopping the management server services.

A few dashboards (the 2012 versions) went blank. Self-healed after some aggregation job overnight. From what I’ve noticed, only the SLA Dashboards are affected, but never all of them.

Agents behind daisy-chained Gateways are not identified as in need of an update. “Repairing” them from the Console is one work-around that seems to work consistently.

A few agents that were updated using Windows Update did not report as updated. Repair fixed that nuisance.

Had to flush the cache on a few Gateways to avoid heartbeat failures from their agents. Only daisy-chained ones, if I recall correctly.

Planning
By request, I uploaded a short clip demonstrating how you would add a windows performance counter to a performance collection rule using the Authoring Console. It is a fairly simple task to complete b
Now we’re gonna make things even faster!

In the previous post on the subject of Agent Fail-over in Operations Manager 2012 we created a script that goes through a selection of agents and makes sure that they all have up-to-date fail-over settings. We are doing the same thing in this one, but making it go faster. In my lab, it’s about five times faster in fact, and I only have about 20 agents to play with. Not really a big deal, but scale it up a bit and add a few thousand agents and the pay-off will be very significant.

As usual, the script will work as is, but it really is more to show the concept. You would have to add filtering to make sure you don’t mix agents behind gateway servers and agents behind management servers. Giving an agent behind a gateway a management server as its fail-over server will likely not help you in any way.

We will pretty quickly go “advanced” this time, so buckle up. 😉

Being a slight modification of the script in the last post, I am not going to go through those details. Use that post if you need references to the Inputs, the OpsMgr 2012 Modules, the Management Group connection and gathering your agents and management servers.

Why Is It Faster?

We are doing the same thing, on the same agents and with the same servers. And we already did some optimization by loading them all into memory and working from there. How do you make it faster?

Basically, I’m cutting the over-head of the cmdlets and how they work. You may have noticed that in the “Do Stuff” section, we are actually calling the Set-SCOMParentManagementServer cmdlet twice! Once for the primary Management Server and once for the fail-over Management Servers. In effect, we connect, fire a command, wait for the result, and disconnect two times for each agent. And pretty much only because the cmdlet does not offer support for setting primary and fail-over management servers at the same time. Any attempt to do so will return an ambiguous parameter error. I don’t like that.
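For reference, the two-call pattern described above looks roughly like this. It is a sketch, not the script from the previous post: it assumes the OperationsManager module is available and a management group connection already exists, and the server names are hypothetical placeholders.

```powershell
# Assumes an existing management group connection.
Import-Module OperationsManager

# Hypothetical server names, replace with your own.
$primary  = Get-SCOMManagementServer -Name 'ms01.contoso.local'
$failover = Get-SCOMManagementServer -Name 'ms02.contoso.local'

foreach ($agent in Get-SCOMAgent) {
    # First round-trip: set the primary management server.
    Set-SCOMParentManagementServer -Agent $agent -PrimaryServer $primary

    # Second round-trip: set the fail-over management server(s).
    Set-SCOMParentManagementServer -Agent $agent -FailoverServer $failover
}
```

Each agent costs two separate cmdlet invocations here, which is the over-head the rest of this post sets out to remove.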
A brief look at the agent object class, Microsoft.EnterpriseManagement.Administration.AgentManagedComputer, revealed a method called SetManagementServers. This method takes, or actually “requires”, two parameters. One for primary and one for fail-over management servers. Yay! Using this method saves us a bunch of over-head and a couple of round-trips to the SDK-service.

The Challenge