Tag: Errors

SNMP GET Errors in OpsMgr EventLog

I’ve spent the past few days building a small SNMP Management Pack to discover and monitor a bunch of PowerWare UPSs, which took quite a lot more energy and time than expected. That is mostly because I am really bad with SNMP and how it works, because I had never really looked into the inner workings of building an SNMP management pack, and because we ran into a couple of errors that kept the discovery process from working.

To make it clear right away, this is not going to be a “Building an SNMP Management Pack” tutorial, since there are plenty of good ones out there already. To be extra helpful, I’m going to include a few links right away:

- SNMP Setup and Simple Custom SNMP Discovery - Pretty much the basics
- SNMP Management Pack Example: NetApp Management Pack - Part 4 actually, but it has links to the other parts
- Creating SNMP Probe Based Monitors - No custom discovery, but a good and simple guide to SNMP probes

It’s the second one, the NetApp guide, that I used when building the UPS management pack, since it goes through the process of building your own filtered discovery, using SystemOID to identify your hardware classes and then building the monitors on top of those.

Let’s get to it

When building the discovery of my hardware classes I ran into problems. The discovery simply did not work. At first I got some strange errors about “invalid queries”, which turned out to be caused by me reading two guides and mixing up the XPathQuery variables. (Seriously though, pick the one guide that is closest to what you want to achieve and stick to it.) Silly me. I got those errors to go away and was able to get a few objects into my base class, but none of the hardware classes that are populated from the return value of an SNMP OID got discovered.

The only error I got this time was the following:

Log Name: Operations Manager
Source: Health Service Modules
Date: 2010-09-02 11:19:12
Event ID: 11001
Task Category: None
Level: Error
Keywords: Classic
User: N/A
Computer: CENSORED
Description:
Error sending an SNMP GET message to IP Address XX.XX.XX.XX, Community String:=CENSORED, Status 0x6c.
One or more workflows were affected by this.
Workflow name: CENSORED.MP.CLASS.DISCOVERY
Instance name: CENSORED_DEVICENAME
Instance ID: {5C7EFB30-D885-8843-0DD7-EA86B4FD2311}
Management group: CENSORED

I went through all the other logical steps of troubleshooting an error like that, which include double-checking firewall settings, OIDs, IP addresses, allowed hosts and so forth. It wasn’t until I loaded the PowerMIB into a MIB browser installed on the proxy machine (in this case a Management Server) that I realized there was no problem sending an SNMP GET to the UPS from that server.

I launched Wireshark and had it listen to the SNMP traffic between the UPS and the Management Server. The thing that struck me right away was that I could see a bunch of “SNMP Get-Request” packets but no “SNMP Get-Response”, which means that Operations Manager did send an SNMP GET but got no response. After a bit of intense staring I noticed what you see in the screenshot.
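If no MIB browser is at hand, a similar reachability test can be scripted. The following is only a minimal sketch, not what Operations Manager or the MIB browser does internally: it hand-encodes a single SNMP GET request and reports whether any datagram comes back. The function names, the default port 161 and the choice of OID are assumptions made for the illustration, and the reply is not decoded or validated.

```python
import socket


def _ber_length(n: int) -> bytes:
    """Encode a BER length (short form below 128, long form otherwise)."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def _tlv(tag: int, value: bytes) -> bytes:
    """Wrap a value in a BER tag-length-value triple."""
    return bytes([tag]) + _ber_length(len(value)) + value


def _ber_int(value: int) -> bytes:
    """Encode a non-negative INTEGER in minimal two's-complement form."""
    length = value.bit_length() // 8 + 1  # leaves room for the sign bit
    return _tlv(0x02, value.to_bytes(length, "big"))


def encode_oid(oid: str) -> bytes:
    """Encode a dotted OID string as a BER OBJECT IDENTIFIER."""
    arcs = [int(part) for part in oid.strip(".").split(".")]
    body = bytearray([40 * arcs[0] + arcs[1]])
    for arc in arcs[2:]:
        chunk = [arc & 0x7F]
        arc >>= 7
        while arc:
            chunk.append(0x80 | (arc & 0x7F))
            arc >>= 7
        body.extend(reversed(chunk))
    return _tlv(0x06, bytes(body))


def build_get_request(community: str, oid: str,
                      request_id: int = 1, version: int = 1) -> bytes:
    """Build a GetRequest message; version 0 = SNMPv1, 1 = SNMPv2c."""
    varbind = _tlv(0x30, encode_oid(oid) + _tlv(0x05, b""))  # OID + NULL
    pdu = _tlv(0xA0, _ber_int(request_id) + _ber_int(0) + _ber_int(0)
               + _tlv(0x30, varbind))
    return _tlv(0x30, _ber_int(version)
                + _tlv(0x04, community.encode("ascii")) + pdu)


def snmp_get_responds(host: str, community: str, oid: str,
                      timeout: float = 2.0, port: int = 161,
                      version: int = 1) -> bool:
    """Return True if any datagram comes back for the GET (reply not parsed)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_get_request(community, oid, version=version),
                    (host, port))
        try:
            data, _ = sock.recvfrom(65535)
        except OSError:  # includes the timeout case
            return False
    return len(data) > 0
```

Calling `snmp_get_responds("XX.XX.XX.XX", "CENSORED", "1.3.6.1.2.1.1.1.0")` with the real address and community string would ask for sysDescr. A `False` result with the same values that work in the MIB browser points at the request itself rather than at the network path.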

ESENT Error When Modifying OpsMgr Agent

Getting “ESENT Keys are required to install this application” when you try to modify or change an agent installation? This seems to be most common on Windows 2008, and I guess it’s because of UAC and the fact that the Control Panel isn’t opened in administrative mode. To work around this, you need to run the msiexec command against the correct installation GUID from an administrative command prompt. Besides digging through the registry to find the GUID, one of the easier ways is this:

1. Open an administrative command prompt.
2. Run wmic product
3. Locate your product by its name. The GUID directly after it (it looks a bit like this: {25097770-2B1F-49F6-AB9D-1C708B96262A}) is the one you want. Copy it.
4. Run msiexec /i <PASTEYOURGUIDHERE>
5. Modify the agent as you please.

That’s pretty much it. Good luck.
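Scanning the wmic output by eye gets tedious on servers with many installed products. As a rough sketch, assuming the output of `wmic product get Name,IdentifyingNumber` has been captured as text, the lookup could be scripted like this. The function names and the product name in the usage note are made up for the illustration.

```python
import re
from typing import List

# Matches a brace-wrapped GUID such as {25097770-2B1F-49F6-AB9D-1C708B96262A}.
GUID_RE = re.compile(r"\{[0-9A-Fa-f]{8}(?:-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}\}")


def find_product_codes(wmic_output: str, name_fragment: str) -> List[str]:
    """Return the GUIDs found on lines that mention the given product name."""
    matches: List[str] = []
    for line in wmic_output.splitlines():
        if name_fragment.lower() in line.lower():
            matches.extend(GUID_RE.findall(line))
    return matches


def modify_command(product_code: str) -> str:
    """Build the msiexec call to run from an administrative command prompt."""
    return f"msiexec /i {product_code}"
```

For example, `find_product_codes(output, "Operations Manager")` returns the candidate GUIDs, and `modify_command(guid)` gives the line to paste into the elevated prompt. The sketch only builds the command string and does not run the installer itself.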

Cannot Delete Files with Long Paths?

What do you do when you cannot delete a file or folder on a Windows server? Check the file permissions! And if that doesn’t help? Check the share permissions! Yes, if it is a shared folder. And if that doesn’t help? Check the file ownership! Great! But then what?

Well, the file could be in use, and then you would have to shut the locking process down and perhaps kick a user out. In a really bad scenario it could also be a symptom of a broken filesystem, a reserved filename (like “lpt1” or “PRN”) or even an invalid name (silly things like a space at the beginning or the end of a filename).

Another possible reason is that the path to the file or folder is too long. You won’t actually get an error telling you that the file path exceeds the roughly 255 characters Windows can handle (the MAX_PATH limit is 260, including the drive letter and the terminating null), just a simple “Access Denied”.

There are some more or less tedious work-arounds for the problem, like renaming all the directories to shorter names, starting from the root, or using the old DOS 8.3 names (like “dokume~1.doc”) that Windows can auto-generate for you. Personally, I have two favourite ways of handling this:

1. Map the parent directory of the file or folder you are trying to access or delete as a network drive and access your files that way. This is particularly useful if the folder you are trying to access is on a DFS share, or perhaps a share on the central fileserver, with file paths like
\\servername01\Central Projects\Central Services\IT Department\Develop Methods for Automatically Deploying New Central Servers\2.2.1 Auto-Deploying SQL-Server 2005 Cluster\Documents\Preparations\Whitepapers\SQL Server 2005 Failover Clustering White Paper.doc
2. Create a new share to a folder further down the hierarchy. This works locally too: if you are logged on to, say, SRV01, you create a new share on D:\Central Projects\Central Services\IT Department\Develop Methods for Automatically Deploying New Central Servers\ called Autodeploymethods and access it from \\SRV01\Autodeploymethods. That way the file path doesn’t exceed 255 characters.

Now. When designing fileservers, you really should think about how deep the file paths may get. This is especially true for DFS shares, since you might have to deal with the full FQDN too and not only the actual folder structure. Many big corporations I know use “codes” for departments, assign a project ID (quite simply a number or maybe an abbreviation) to each project, and use these for the file shares too.

Another scenario that could lead to similar problems is intranet sites where users can create and manage their own subsites and where filenames and folders are not stored in a database.

I have only seen this phenomenon on Windows systems so far, and I’ve actually used a Linux Live-CD on occasion when admin access is denied.

Read More: http://support.microsoft.com/kb/320081
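To catch paths that are creeping towards the limit before they turn into “Access Denied” tickets, a share can be scanned periodically. This is a minimal sketch, assuming the 255-character figure used above as the default threshold; the function name is made up for the illustration.

```python
import os
from typing import Iterator, Tuple


def find_long_paths(root: str, limit: int = 255) -> Iterator[Tuple[int, str]]:
    """Yield (length, path) for every file or folder whose full path exceeds limit."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            if len(full) > limit:
                yield len(full), full
```

Running it with a lower limit, say 200, gives an early warning list. Note that `os.walk` silently skips directories it cannot list, so folders that are already far beyond the limit may not show up when the scan is run on Windows.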

Health Rollup not working in Exchange Management Pack

I’ve wrestled a bit with a critical status on one of the Organization States at a client’s site that won’t go back to green even though all the underlying monitors have gone back to green. And apparently I am not alone on this one. Others, like me, have read and re-read the MP guide in search of a monitor, rule or discovery with a forgotten override, and I don’t know how many times I’ve made a small change and tried resetting the health once again.

Anyhow. Marius Sutara posted an answer on the TechNet forums last week with a “fix” (-ish), or rather the acknowledgement that the problem is not a “40 cm error” (that is, user error). The problem might be related to other MPs as well, but I’ve only seen it on the new Exchange MP so far. In that same post, Pete Zerger provided links to two nifty little tools that will help you reset the health of the monitor.

In case you wonder why on earth I post when there’s already a “solution” out there: PageRank, baby! Not for me, but for the forum post, making it show up earlier on Google.

The TCP Port Check: Use with caution!

Just wanted to raise a word of caution about the TCP Port Check in Operations Manager 2007. Some customers have noticed that the system logs on some Unix machines are completely swamped with “connection error”, “TCP Connect failed”, “TCP Session Lost” and similar messages, and after a bit of research the problematic servers were narrowed down to those monitored by Operations Manager. Specifically, those that are targeted by a TCP Port Check.

It would seem that the TCP connection never fully initializes on the target server. Kind of like knocking on your neighbour’s door and then hiding. When the door opens, no one is there, causing your friendly neighbour to hang around waiting for something to happen.

Maybe there’s a setting somewhere to modify how “deep” a Port Check should go before closing. Perhaps fully initializing and then sending a proper “Close” instead of just cutting the connection.

In a few extreme cases we have noticed that the target server even goes so far as to start a session but never ends it, since there’s no closure, and finally has no sessions to spare for the real users. But on most servers it’s just an annoyance, since the “real” errors are very hard to find among all the connection-related log entries.

Anyway. Just a good thing to keep in mind when running TCP Port Checks from Operations Manager 2007. Keep an eye on the logs when implementing the port checks.
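For comparison, here is a minimal sketch of what a “polite” port check could look like: complete the handshake, announce the end of the stream, give the server a moment to react, then close. This is not how Operations Manager implements its check, just an illustration of the idea; the function name and timeout are assumptions.

```python
import socket


def check_tcp_port(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if the port accepts a connection, closing it in an orderly way."""
    try:
        # create_connection completes the full three-way handshake.
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return False
    try:
        # Half-close: sends FIN so the server sees a normal end of stream.
        sock.shutdown(socket.SHUT_WR)
        try:
            # Give the server up to `timeout` seconds to answer or close its side.
            sock.recv(4096)
        except OSError:
            pass
    except OSError:
        pass
    finally:
        sock.close()
    return True
```

Whether this quiets the logs depends on the service being probed. Some daemons log every connection that ends without application data, no matter how cleanly it is closed.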

MSMQ Management Pack: Subscript Out of Range

UPDATE: This problem seems to be fixed in the latest update!

The MSMQ Management Pack seems to have a few problems with its discovery script that can lead to the following error showing up in the logs:

The process started at 13:34:40 failed to create System.Discovery.Data. Errors found in output:
C:\Program Files\System Center Operations Manager 2007\Health Service State\Monitoring Host Temporary Files 499788\DiscoverQueues.vbs(107, 4) Microsoft VBScript runtime error: Subscript out of range: '[number: 0]'
Command executed: "C:\WINDOWS\system32\cscript.exe" /nologo "DiscoverQueues.vbs" {615D37C9-477D-62E2-0833-6ECBF0E89A87} {A176AC83-CC31-01C3-5DE9-E2DFF64E7CC7} "MASKED.server.fqdn" "MSMQ" "true" "true" "False" "false"
Working Directory: C:\Program Files\System Center Operations Manager 2007\Health Service State\Monitoring Host Temporary Files 499788
One or more workflows were affected by this.
Workflow name: Microsoft.MSMQ.2003.DiscoverQueues
Instance name: MASKED.server.fqdn
Instance ID: {A176AC83-CC31-01C3-5DE9-E2DFF64E7CC7}
Management group: MASKED

This seems to be related to the discovery of public queues on servers that have none. One quick fix, or rather work-around, is to override the discovery on these servers and set DiscoverPublic to False.

NetworkAdapterCheck.vbs fails on Windows 2000

Problem

Here’s my summary of the problems with the NetworkAdapterCheck.vbs script in the Windows Server 2000 Operating System Management Pack for Operations Manager 2007, which is causing the “failed to create System.PropertyBagData” error I wrote about earlier. This information is also available at https://connect.microsoft.com/feedback/ViewFeedback.aspx?FeedbackID=432627&SiteID=446

Symptoms

This “research” comes from getting an obscene number of “Script or Executable Failed to run” alerts in the Operations Console. Each time it was the NetworkAdapterCheck.vbs script that could not create PropertyBagData. The error message copied from one of the alerts looks like this:

The process started at 14:29:26 failed to create System.PropertyBagData, no errors detected in the output. The process exited with 0
Command executed: "C:\WINNT\system32\cscript.exe" /nologo "NetworkAdapterCheck.vbs" MASKEDCOMPUTERNAME 0 false true false
Working Directory: C:\Program Files\System Center Operations Manager 2007\Health Service State\Monitoring Host Temporary Files 2882781
One or more workflows were affected by this.
Workflow name: Microsoft.Windows.Server.2000.NetworkAdapter.NetworkAdapterConnectionHealth
Instance name: 0
Instance ID: {F4C478D3-38E5-8C29-3957-E3B7F486216E}
Management group: MASKED

This error repeats almost as often as the script is scheduled to run and appears on almost every Windows 2000 server.

Probable Cause

I am not really sure, but after quite a bit of troubleshooting I am fairly confident it all boils down to a malformed WMI query. What I basically did was extract the script from the MP and dry-run it to see if I could find anything obvious, which I didn’t. Since I didn’t have a good debugging tool available, like the one in PrimalScript, I added the VBS equivalent of old-school printf debugging. I basically added wscript.echo "Line XX:" & Err.Number