Tuesday, August 25, 2009

SharePoint Troubleshooting

On the Front End

  • Troubleshooting Web Parts - These errors often don't get logged, or even noticed on a slow-loading page, but you can see what's going on in the Web Part Maintenance Page and delete the offending parts. There's a Ted Pattison Feature called SharePoint Debugger for this (turn it on right from Site Actions).
  • Verbose page errors - turn on debug mode in the web.config (typically debug="true" on the compilation element, CallStack="true" on the SafeMode element, and customErrors mode="Off"). There is also a trace log provider, with MSDN samples, so you can write your own log entries to it.
  • IIS Logs - everything from server error numbers and client error numbers to usernames, session info, bytes in, and bytes sent. A bit arcane if you don't know what you're looking for. Best free tool - Microsoft Log Parser: runs like a SQL query tool against the logs.
  • SharePoint ULS (Unified Logging Service) logs - These can be ugly, but they are often the only place that contains some of these errors. If you're having problems with features, activation, web parts, any timer jobs like profile import or user info sync, and really any of that SharePoint goo, then this is definitely a place you just have to be familiar with. Most of the work is searching for "error," "fail," "failed." You can crank up the logging in Central Admin: Central Administration > Operations > Diagnostic Logging. Find these logs at C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\LOGS. Scot Hillier put up a CodePlex Feature called SharePoint "Log Viewer" that allows you to view these logs in Central Admin. Looks like he also has some stuff on diagnostics up there. Also the SharePoint Ninja (new to me) has written a SharePoint ULS log parser.
  • IIS Worker Process Logs - I don't know many people who spend a lot of time here, but I've found most of my memory issues are manifested here. You can see a lot of what's loading up and what happens before the worker process cycles. They aren't that hard to read, but there is a lot of noise. All of these logs have lots of noise. Use the Trace stuff in IIS 7... it's awesome for getting to the bottom of issues that manifest themselves in the worker processes.
  • Application (Event) Log - My favorite place to spend time when funny things are happening on a server. Most of the work is filtering to find the time when the failure occurred, then walking backward to see what led up to the failure. There are some huge docs on Verbose WSS errors and MOSS events.
  • System (Event) Log - Memory and system events, mostly around hardware failures, OS failures, and on and on. Don't forget this one. IIS often has errors here. Event Viewer is the most common way to view it.
  • Security (Event) Log - You'll find a lot of the authentication stuff seems to end up here. If you're doing Kerberos, you can't live without this log.
  • All those user-specific logs in %temp% - SharePoint installation and upgrade logs (version to version, build to build, hotfix, and config wizard), plus anything STSADM spits out in a log, like an stsadm solution deployment. If you're having problems with an upgrade and you ignore these logs... troubleshooting will take 10X as long, maybe longer. These logs have tons and tons of noise. You basically have to find where the failure occurred and again walk backward. Searching for "fail" or "error" in a text editor is where a lot of us have lived. The SPS 2003 to MOSS logs are at C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\Logs\Upgrade.log
  • Performance Logs (generated by perfmon) - performance counters aren't logged unless you turn them on, but man, I bet I use them 100% of the time when I'm troubleshooting memory, CPU, disk, and network issues.
  • Netmon Logs (generated by NetMon) - again not turned on by default, but if you're trying to trace between the client and the server, a good Fiddler2 session, the Visual Studio test tools, or classic Network Monitor from the SMS bits is going to help you understand what's coming in from your DCs, other member servers, SMTP, and on and on.
  • Gatherer/Crawl Logs - If you're troubleshooting indexing you'll be going to a few places. The gatherer logs are stored in the database, but there's an interface for them in the SSP admin UI. There's also the Change Log, but it isn't something you'll use much.
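
If Log Parser isn't handy, the same "query the IIS logs" idea can be roughed out in a few lines of Python. This is only a sketch, assuming the logs are in W3C extended format, where a "#Fields:" line names the space-separated columns, and that sc-status and cs-uri-stem are among the logged fields. The function names and sample lines are made up for illustration.

```python
from collections import Counter


def parse_w3c_log(lines):
    """Yield one dict per request from W3C extended log lines."""
    fields = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#Fields:"):
            # The #Fields directive names the columns for the rows that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if line.startswith("#"):
            continue  # other directives such as #Software, #Version, #Date
        values = line.split()
        if fields and len(values) == len(fields):
            yield dict(zip(fields, values))


def count_server_errors(lines):
    """Count 5xx responses per URL, roughly a GROUP BY on cs-uri-stem."""
    counts = Counter()
    for row in parse_w3c_log(lines):
        if row.get("sc-status", "").startswith("5"):
            counts[row.get("cs-uri-stem", "?")] += 1
    return counts


# Made-up sample lines; a real log usually has many more fields.
sample = [
    "#Software: Microsoft Internet Information Services 6.0",
    "#Fields: date time cs-uri-stem sc-status",
    "2009-08-25 10:00:01 /default.aspx 200",
    "2009-08-25 10:00:02 /_layouts/settings.aspx 500",
    "2009-08-25 10:00:03 /_layouts/settings.aspx 500",
    "2009-08-25 10:00:04 /pages/home.aspx 503",
]

for url, hits in count_server_errors(sample).most_common():
    print(url, hits)
```

To run it against a real log, pass an open file object instead of the sample list. Rows are skipped when their column count doesn't match the current #Fields line, which keeps a truncated last line from blowing up the run.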

Backend Logs

  • SQL Logs - these are truly underrated. There is so much *good* information in the SQL logs. I've found problems with memory, settings and configuration, and things that expose needed hotfixes. The whole -g startup switch on 32-bit systems, for reserving memory outside the buffer pool, was only exposed by drilling into the logs and seeing the "out of memory" errors. The SQL team does a good job of helping you track error numbers to problems and then to solutions.
  • SQL Profiler Logs - You have to know SQL profiler to truly troubleshoot backend issues.
  • System, App, Security (watch for Kerberos issues, which usually mean a missing or wrong SPN), Netmon, Perfmon - see above
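
The "search for error or fail, then walk backward" routine that keeps coming up in both lists can be sketched in Python too. This is a minimal sketch, not a real ULS or upgrade log parser: it treats the log as plain text lines, and the function name, the context size, and the sample lines are all made up for illustration.

```python
import re
from collections import deque

# "fail" also matches "failed" and "failure"; matching is case-insensitive.
PATTERN = re.compile(r"error|fail", re.IGNORECASE)


def find_failures(lines, context=3):
    """Return (line_number, matching_line, preceding_lines) for each hit."""
    recent = deque(maxlen=context)  # rolling window of the lines just seen
    hits = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if PATTERN.search(text):
            hits.append((number, text, list(recent)))
        recent.append(text)
    return hits


# Made-up sample lines, not the real ULS or Upgrade.log layout.
sample = [
    "10:00:01 Starting upgrade step 4",
    "10:00:02 Opening content database",
    "10:00:03 Applying schema change",
    "10:00:04 Upgrade step FAILED: timeout expired",
    "10:00:05 Rolling back",
]

for number, text, before in find_failures(sample, context=2):
    print(f"line {number}: {text}")
    for earlier in before:
        print(f"    before: {earlier}")
```

Keeping only the last few lines in a deque means a multi-hundred-megabyte log never has to fit in memory. Bump the context value when the lead-up to a failure is buried in a lot of noise.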
