Friday, December 02, 2011

PAL – Performance Analysis of Logs

Wow…been a long time for an update eh?! Well most of this year I’ve been working on our Windows 7/Office 2010 rollout.  Also been working on a massive shift from one archive solution to another…but that’s another whole story…

After we upgraded one of our sites from Outlook 2003 to Outlook 2010 we coincidentally had huge issues with Exchange 2003.  We started getting 623 errors in the event logs followed by 1022 errors, which ended up pillaging and plundering our Exchange information stores. After weeks of investigation, the Microsoft Advanced Diagnostics team, which seems to be based somewhere in India, said our issues were caused by too many mailboxes having folders with message counts greater than 5000. No shit..really?!  Still working on this issue so I’ll let you know how it turns out.

So back to PAL. During my research into Exchange performance I found the PAL tool at codeplex.com. PAL is a tool that will look at your perfmon counter logs and tell you what’s up with your server.  Unlike the Baseline Analyzer tools, it reacts to running conditions on your servers and lets you know what seems to be running “out of bounds”.  Turns out a bunch of subject matter experts at Microsoft got together and set some rules in the PAL tool to alarm/alert just like a traditional expert system would. Very cool stuff.
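To give a feel for what “out of bounds” means, here’s a rough sketch of the idea in Python. This is NOT how PAL itself is implemented; it just reads a perfmon log that has been saved as CSV and flags samples over a limit. The threshold value, the server name and the sample data are all made up for illustration, and the CSV header layout is my assumption about what a perfmon CSV export looks like.

```python
import csv
import io

# Hypothetical limit for illustration only; PAL's real rules live in its threshold files.
THRESHOLDS = {
    r"\Processor(_Total)\% Processor Time": 80.0,
}

def counter_name(header):
    """Drop the leading \\\\MACHINE part of a perfmon column header."""
    if header.startswith("\\\\"):
        rest = header[2:]
        idx = rest.find("\\")
        return rest[idx:] if idx != -1 else header
    return header

def find_breaches(csv_text, thresholds):
    """Return (timestamp, counter, value) for every sample above its limit."""
    reader = csv.reader(io.StringIO(csv_text))
    names = [counter_name(h) for h in next(reader)]
    breaches = []
    for row in reader:
        timestamp = row[0]
        for name, cell in zip(names[1:], row[1:]):
            limit = thresholds.get(name)
            if limit is None:
                continue  # no rule for this counter
            try:
                value = float(cell)
            except ValueError:
                continue  # skip blank or unreadable samples
            if value > limit:
                breaches.append((timestamp, name, value))
    return breaches

# Made-up sample data in the assumed CSV layout (SERVER1 is a placeholder).
SAMPLE = (
    '"(PDH-CSV 4.0) (Eastern Standard Time)(300)",'
    '"\\\\SERVER1\\Processor(_Total)\\% Processor Time"\n'
    '"12/02/2011 10:00:00.000","42.5"\n'
    '"12/02/2011 10:00:15.000","93.1"\n'
)

for hit in find_breaches(SAMPLE, THRESHOLDS):
    print(hit)
```

PAL does a lot more than this (it knows which counters matter for which product and writes a full report), but the core “compare the counter to a rule” idea is the same.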

So here’s the quick and dirty on getting started with PAL:
  1. Download PAL from codeplex.com
  2. Install PAL
  3. Run PAL and go to the Threshold File Tab
  4. Under Threshold File Title, pick the type of analysis you want to run…start with System Overview, it will get you going
  5. Now click on “Export to Perfmon Template File”
  6. Now if it’s a 2003/XP system save the filetype as .htm, if it’s 2008/Win7 save the filetype as .xml.
  7. Copy the template to the system to be monitored
  8. Run perfmon
  9. (Note: The rest of these steps are for 2008/Win7..if you are on 2003/XP figure it out for yourself. ;P)
  10. Go to Data Collector Sets
  11. Right click on “User Defined”
  12. New Collector Set
  13. Pick a name and select “from template”
  14. Browse for the template
  15. Hit Finish
  16. Right click on the collector and start it
  17. Run it for a while and stop it
  18. Take the results file back to your PAL workstation and start from the first tab.
  19. On the counter tab, select the resultant file
  20. On the Threshold File Tab, reselect the one you started with
  21. On Questions Tab, answer the 4 questions about the system you were monitoring
  22. Output Options Tab, leave it at auto for now
  23. File Output Tab, leave defaults
  24. Queue Tab, leave defaults
  25. Execute Tab, select Execute and hit finish
  26. Wait…it takes a while to process
  27. Enjoy your html output file and analysis of what was up with your server
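If you’d rather script steps 10 through 17 than click through perfmon, something like the Python sketch below should work on 2008/Win7 using the built-in logman command. Treat it as a starting point, not gospel: the collector name and template file name are placeholders I made up, and you should check `logman /?` on your own box to confirm the switches before trusting it.

```python
import subprocess
import time

def collector_commands(name, template_path):
    """Build the logman command lines for importing, starting and stopping a collector set."""
    return {
        "import": ["logman", "import", "-n", name, "-xml", template_path],
        "start": ["logman", "start", name],
        "stop": ["logman", "stop", name],
    }

def run_collection(name, template_path, seconds):
    """Import the PAL template, collect for `seconds`, then stop (Windows only)."""
    commands = collector_commands(name, template_path)
    subprocess.run(commands["import"], check=True)
    subprocess.run(commands["start"], check=True)
    try:
        time.sleep(seconds)
    finally:
        # Always try to stop the collector, even if the wait is interrupted.
        subprocess.run(commands["stop"], check=True)

# Example call (placeholder names, run from an elevated prompt on the monitored server):
# run_collection("PAL_SystemOverview", "SystemOverview.xml", 3600)
```

After it stops, grab the results file and carry on from step 18.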
One last thing to note…scroll through your Threshold File Titles…there’s a lot to choose from.  You can run some very specific tests.  These all relate to specific counters to “watch” in perfmon so it’s a great learning tool just looking at what’s important to watch.


Anonymous said...

We're having the same situation as you are. Upgrading people from Outlook 2003 to 2010 and one of our Exchange 2003 servers doesn't like it at all. We're getting periodic 623 errors followed by a slew of 1022 errors and users complaining about all sorts of Outlook issues. In my case Microsoft support is focusing on the amount of white space in 3 of our stores, which is rather high. I'll be interested to hear what you find.


Rolfsa said...

Well I guess misery loves company... The latest thing they are telling us is that the hidden search folders have too many objects in them. I'm posting tomorrow on a tool we found to show you each folder, including the hidden ones, and how many objects are in each folder.

Anonymous said...

You can get the folder sizes for those search folders in PFDAVAdmin using Export Properties under the Tools menu. Just make sure you include the PR_CONTENT_COUNT property.

We've been running offline defrags and isintegs and are getting close to trying to get our server back.