Channel: Mike Lagase

Troubleshooting Exchange 2007 Store Log/Database growth issues

One of the common issues we see in support is excessive database and/or transaction log growth. If you have ever run into one of these issues, you know they are not always easy to troubleshoot, because several tools are needed to understand where the problem is coming from. Customers have asked why the server allows these types of operations in the first place, and why Exchange Server is not resilient to them. That is not an easy question to answer, as there are so many variables involved: faulty Outlook add-ins, custom or third-party applications, corrupted rules, corrupted messages, online maintenance not running long enough to properly maintain your database, and the list goes on.

Once an Outlook client has created a profile against the Exchange server, it pretty much has free rein to perform whatever actions it wants within that MAPI profile. This is controlled mostly by your organization's mailbox and message size limits and by some of the client throttling or backoff features that are new to Exchange 2007.

Since I have dealt with these types of problems in great detail, I thought it would be helpful to share some troubleshooting steps that may help you collect data on, detect, and mitigate these problems if and when you see them.

General Troubleshooting

Exchange 2007 SP2 RU2 and Later

  • Exchange 2007 SP2 RU2 adds a new feature that makes these log growth issues much easier to track. You set warning and error thresholds in the registry, and once log growth starts, you can view the application log for events showing which users crossed those thresholds. Note: These thresholds are not set by default after installing SP2 RU2, so if you are in the middle of a log growth issue, adding the appropriate registry keys to the server will provide additional insight into the problem. See http://support.microsoft.com/kb/972705 for more information on this feature and on how to determine what values to set these registry keys to.

Outlook 2007

  • A new Outlook 2007 fix makes any email with an attachment sent via MAPISendMail honor message size limits. Below is a brief description of how MAPISendMail can affect log growth on an Exchange server.
    • When you use the Send To Mail Recipient facility in Windows with an online mode Outlook client to send a message/attachment over the maximum message size limit, Outlook streams the data to the store prior to performing any message size limit checking, thus creating log files equal to the size of the attachment. By the time the Outlook message window comes up, the damage is already done on the Exchange server. If you add a recipient to the message and try to send the email, you will receive the error "The messaging interface has returned an unknown error. If the problem persists, restart Outlook". If you then save the message in the mailbox, the save succeeds. If you then open the saved message and send it, you will get the error "The message being sent exceeds the message size established for this user".
    • If you attempt the same process using a cached mode client, Outlook will open a new message with the attachment without any limit checks. If you add a recipient and then send the message, it will sit in the user's Outbox. Performing a send/receive on the client will then generate the error "Task 'Microsoft Exchange - Sending' reported error (0x80040610): 'The message being sent exceeds the message size established for this user.'" This is expected behavior. If the user now deletes the message, it ends up in the user's Deleted Items folder, which is then synched to the server. Messages over the size limit that are either imported or saved into a user's mailbox do not honor overall message size limits during the sync process.
    • If you use the Send to Mail option in any Office program, you will get the same results: in online mode, the data is streamed to the server prior to checking size limits, and cached mode behaves as described above.

      To resolve this issue for your Outlook 2007 users, install the hotfix from KB 978401 on every client machine.

Builds earlier than Exchange 2007 SP2 RU2

  1. Use Exchange User Monitor (Exmon) server side to determine if a specific user is causing the log growth problems.

    • Sort on CPU (%) and look at the top 5 users that are consuming the most CPU inside the Store process. Check the Log Bytes column to verify whether any of them is also generating the log growth.
    • If that does not show a possible user, sort on the Log Bytes column to look for any users that could be contributing to the log growth.
    • If the user shown in Exmon is a ?, this indicates a HUB/Transport related problem generating the logs. Query the message tracking logs using the Message Tracking Log tool in the Exchange Management Console's Toolbox to check for any large messages that might be running through the system. See step 5.9 for a PowerShell script to accomplish the same task.
  2. If a suspected user is found via Exmon, then do one of the following:

    1. Disable MAPI access to the user's mailbox using the following steps (Recommended):

      • Run Set-CASMailbox -Identity <Username> -MAPIEnabled $False

      • Move the mailbox to another mailbox store. Note: This is necessary to disconnect the user from the store because of the Store Mailbox and DSAccess caches. Otherwise you could be waiting more than 2 hours and 15 minutes for this setting to take effect. Moving the mailbox effectively kills the user's MAPI session to the server, and after the move the user's access to the store via a MAPI-enabled client will be disabled.

    2. Disable the user's AD account temporarily

    3. Kill their TCP connection with TCPView

    4. Call the user and have them close Outlook while the problem is occurring, for immediate relief.
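
The recommended option above (disable MAPI, then move the mailbox so the live session is dropped) could be scripted roughly as follows. This is a minimal sketch for the Exchange Management Shell: the function name, the user alias, and the target database name are placeholders for illustration and do not come from this article.

```powershell
# Hypothetical helper that wraps the two recommended steps.
function Disable-SuspectMapiAccess {
    param(
        [string]$Identity,        # mailbox of the suspected user
        [string]$TargetDatabase   # any other mailbox database
    )

    # Step 1: block MAPI clients for this mailbox.
    Set-CASMailbox -Identity $Identity -MAPIEnabled $false

    # Step 2: move the mailbox so the existing MAPI session is dropped
    # and the new setting takes effect without waiting on the caches.
    Move-Mailbox -Identity $Identity -TargetDatabase $TargetDatabase -Confirm:$false
}

# Example call (placeholder values):
# Disable-SuspectMapiAccess -Identity 'jdoe' -TargetDatabase 'MBX02\SG2\DB2'
```

Remember to set MAPIEnabled back to $true once the investigation is complete.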

  3. If closing the client down or killing the user's sessions seems to stop the log growth issue, then do the following to see if this is OST or Outlook profile related:

    1. Have the user launch Outlook while holding down the Control key, which will prompt to run Outlook in safe mode. If launching Outlook in safe mode resolves the log growth issue, then concentrate on which add-ins could be contributing to this problem.

    2. If you can gain access to the user's machine, then do one of the following:

      1. Launch Outlook to confirm the log file growth issue on the server.

      2. If log growth is confirmed, do one of the following:

        1. Check the user's Outbox for any messages.

          1. If the user is running in Cached mode, set the Outlook client to Work Offline. Doing this helps stop the message in the Outbox from being sent and sometimes causes the message to NDR.

          2. If the user is running in Online mode, try moving the message to another folder to prevent Outlook or the HUB server from processing the message.

          3. After each of the steps above, check the Exchange server to see if log growth has ceased.

        2. Call Microsoft Product Support to enable debug logging of the Outlook client to determine possible root cause.

      3. Follow the Running Process Explorer instructions in the article below to dump out the DLLs that are running within the Outlook process. Name the file username.txt. This helps check for any third-party Outlook add-ins that may be causing the excessive log growth.

        970920  Using Process Explorer to List dlls Running Under the Outlook.exe Process
        http://support.microsoft.com/kb/970920

      4. Check the Sync Issues folder for any errors that might be occurring

    3. Let’s attempt to narrow this down further to see if the problem is truly in the OST or is Outlook profile related:

      1. Run ScanPST against the user's OST file to check for possible corruption.

      2. With the Outlook client shut down, rename the user's OST file to something else and then launch Outlook to create a new OST file. If the problem does not occur, we know the problem is within the OST itself.

      3. If the problem recurs after renaming the OST, then recreate the user's profile to see if this might be profile related.
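
The OST rename in step 2 above can be done by hand, but a small script avoids renaming the file while Outlook still has it open. The sketch below is an illustration only: the function name is hypothetical, and the folder in the example call is the default location on Windows Vista and later, which is an assumption about the client machine.

```powershell
# Hypothetical helper: renames every .ost file in a folder so Outlook
# builds a fresh one at next launch. Refuses to run while Outlook is open.
function Rename-OstFile {
    param([string]$Folder)

    if (Get-Process -Name outlook -ErrorAction SilentlyContinue) {
        throw 'Close Outlook before renaming the OST file.'
    }

    $stamp = Get-Date -Format 'yyyyMMdd-HHmmss'
    Get-ChildItem -Path $Folder -Filter *.ost | ForEach-Object {
        # Keep the original as a fallback instead of deleting it.
        Rename-Item -Path $_.FullName -NewName ($_.Name + ".$stamp.old")
    }
}

# Assumed default location on Windows Vista and later:
# Rename-OstFile -Folder "$env:LOCALAPPDATA\Microsoft\Outlook"
```

Keeping the renamed file means the old OST can be put back if the new one does not change the behavior.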

  4. Ask Questions:

    1. Is the user using any type of mobile device?

    2. Question the end user if at all possible to understand what they might have been doing at the time the problem started occurring. It’s possible that a user imported a lot of data from a PST file which could cause log growth server side or there was some other erratic behavior that they were seeing based on a user action.

  5. If Exmon does not provide the data necessary to determine root cause, then do the following:

    1. Check current queues against all HUB Transport Servers for stuck or queued messages

      get-exchangeserver | where {$_.IsHubTransportServer -eq "true"} | Get-Queue | where {$_.DeliveryType -eq "MapiDelivery"} | Select-Object Identity, NextHopDomain, Status, MessageCount | export-csv HubQueues.csv

      Review queues for any that are in retry or have a lot of messages queued.

      Export out message sizes in MB in all Hub Transport queues to see if any large messages are being sent through the queues.

      get-exchangeserver | where {$_.ishubtransportserver -eq "true"} | get-message -resultsize unlimited | sort-object -property size -descending | Select-Object Identity,Subject,status,LastError,RetryCount,queue,@{Name="Message Size MB";expression={$_.size.toMB()}} | export-csv HubMessages.csv

      Export out message sizes in bytes in all Hub Transport queues. (A separate file name is used here so that the MB export above is not overwritten.)

      get-exchangeserver | where {$_.ishubtransportserver -eq "true"} | get-message -resultsize unlimited | Select-Object Identity,Subject,status,LastError,RetryCount,queue,size | sort-object -property size -descending | export-csv HubMessagesBytes.csv
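
Once HubQueues.csv exists, the review for queues in retry or with a lot of queued messages can be narrowed with a small filter. This is a sketch only: the function name and the default threshold of 100 messages are arbitrary choices for illustration, not values from this article.

```powershell
# Hypothetical filter: passes through queues that are in Retry or hold
# more than $Threshold messages. Rows read back with Import-Csv carry
# MessageCount as text, so it is cast to a number before comparing.
function Get-SuspectQueue {
    param([int]$Threshold = 100)
    process {
        if ($_.Status -eq 'Retry' -or [int]$_.MessageCount -gt $Threshold) {
            $_
        }
    }
}

# Example:
# Import-Csv HubQueues.csv | Get-SuspectQueue -Threshold 100 | Format-Table -AutoSize
```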

    2. Check users' Outbox folders for any large, looping, or stranded messages that might be affecting overall log growth.

      get-mailbox -ResultSize Unlimited | Get-MailboxFolderStatistics -folderscope Outbox | Sort-Object Foldersize -Descending | select-object identity,name,foldertype,itemsinfolder,@{Name="FolderSize MB";expression={$_.folderSize.toMB()}} | export-csv OutboxItems.csv

      Note: This does not return information for users that are running in cached mode.

    3. Use the MSExchangeIS Client\Jet Log Record Bytes/sec and MSExchangeIS Client\RPC Operations/sec Perfmon counters to see whether a particular client protocol is generating excessive logs. If one protocol is found to be higher than the others for a sustained period of time, consider shutting down the service hosting that protocol. For example, if Outlook Web Access is the protocol generating the log growth, stop the World Wide Web Publishing Service (W3SVC) to confirm that log growth stops. If it does, collecting IIS logs from the CAS/MBX Exchange servers involved will provide insight into what action the user was performing that caused this to occur.
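
To compare the protocols without opening Perfmon, the two counters named above can be sampled from PowerShell and averaged per instance. This is a rough sketch: it assumes PowerShell 2.0 or later on the Mailbox server (Get-Counter does not exist in PowerShell 1.0), the (*) instance wildcard in the example is an assumption about how the counter instances are exposed, and the helper name is hypothetical.

```powershell
# Hypothetical helper: averages counter samples per counter path and
# lists the busiest first. Expects objects with Path and CookedValue
# properties, which is what Get-Counter returns in CounterSamples.
function Get-CounterAverage {
    $input | Group-Object Path | ForEach-Object {
        $avg = ($_.Group | Measure-Object CookedValue -Average).Average
        New-Object PSObject -Property @{ Counter = $_.Name; Average = $avg }
    } | Sort-Object Average -Descending
}

# Example: twelve samples, five seconds apart (one minute of data).
# $paths = '\MSExchangeIS Client(*)\Jet Log Record Bytes/sec',
#          '\MSExchangeIS Client(*)\RPC Operations/sec'
# Get-Counter -Counter $paths -SampleInterval 5 -MaxSamples 12 |
#     ForEach-Object { $_.CounterSamples } |
#     Get-CounterAverage | Format-Table Counter, Average -AutoSize
```

One minute of samples is only enough for a quick look; a sustained difference between protocols should be confirmed over a longer window.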

    4. Run the following command from the Management shell to export out current user operation rates:

      To export to CSV File:

      get-logonstatistics |select-object username,Windows2000account,identity,messagingoperationcount,otheroperationcount,progressoperationcount,streamoperationcount,tableoperationcount,totaloperationcount | where {$_.totaloperationcount -gt 1000} | sort-object totaloperationcount -descending| export-csv LogonStats.csv

      To view realtime data:

      get-logonstatistics |select-object username,Windows2000account,identity,messagingoperationcount,otheroperationcount,progressoperationcount,streamoperationcount,tableoperationcount,totaloperationcount | where {$_.totaloperationcount -gt 1000} | sort-object totaloperationcount -descending| ft

      Key things to look for:
      In the example below, the Administrator account was storming the testuser account with email.
      Two logons are active here. One is the Administrator submitting all of the messages. The other has a Windows2000Account that references a HUB server and an Identity of testuser. The HUB server logon also has *no* UserName, which is a giveaway. This gives you a better understanding of which parties are involved in these high rates of operations.

      UserName : Administrator
      Windows2000Account : DOMAIN\Administrator
      Identity : /o=First Organization/ou=First Administrative Group/cn=Recipients/cn=Administrator
      MessagingOperationCount : 1724
      OtherOperationCount : 384
      ProgressOperationCount : 0
      StreamOperationCount : 0
      TableOperationCount : 576
      TotalOperationCount : 2684

      UserName :
      Windows2000Account : DOMAIN\E12-HUB$
      Identity : /o= First Organization/ou=Exchange Administrative Group (FYDIBOHF23SPDLT)/cn=Recipients/cn=testuser
      MessagingOperationCount : 630
      OtherOperationCount : 361
      ProgressOperationCount : 0
      StreamOperationCount : 0
      TableOperationCount : 0
      TotalOperationCount : 1091
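
The empty UserName shown in the second entry above can be picked out of LogonStats.csv automatically. A minimal sketch, with a hypothetical function name:

```powershell
# Hypothetical filter: keeps logon rows that have no UserName, which the
# example above associates with a HUB server delivering to a mailbox.
function Get-ServerLogon {
    process {
        if ([string]::IsNullOrEmpty($_.UserName)) { $_ }
    }
}

# Example:
# Import-Csv LogonStats.csv | Get-ServerLogon |
#     Format-List Windows2000Account, Identity, TotalOperationCount
```

The Identity on each returned row is the mailbox receiving the traffic, which is usually the next place to look.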

    5. Enable Perfmon/Perfwiz logging on the server. Collect data through the problem times and then review for any irregular activities. You can grab some pre-canned Perfmon import files at http://blogs.technet.com/mikelag/archive/2008/05/02/perfwiz-replacement-for-exchange-2007.aspx to make collecting this data easier.

    6. Run ExTRA (Exchange Troubleshooting Assistant) via the Toolbox in the Exchange Management Console to look for any functions (via FCL Logging) that may be consuming excessive time within the store process. This needs to be launched during the problem period. http://blogs.technet.com/mikelag/archive/2008/08/21/using-extra-to-find-long-running-transactions-inside-store.aspx shows how to use FCL logging only, but it would be best to include Perfmon, Exmon, and FCL logging via this tool to capture the most data.

    7. Dump the store process during the time of the log growth. (Use this as a last measure once all prior activities have been exhausted and prior to calling Microsoft for assistance. These issues are sometimes intermittent, and the quicker you can obtain any data from the server, the better as this will help provide Microsoft with information on what the underlying cause might be.)

      1. Download Procdump 3.0 or greater from http://technet.microsoft.com/en-us/sysinternals/dd996900.aspx and extract it to a directory on the Exchange server

      2. Open the command prompt and change to the directory where Procdump was extracted in step 1.

      3. Type procdump -mp -s 120 -n 2 store.exe d:\DebugData. This will dump the data to D:\DebugData. Change this to whatever directory has enough space to dump the entire store.exe process twice. Check Task Manager for the store.exe process and how much memory it is currently consuming for a rough estimate of the amount of space needed to dump the entire store process.

        Important: If procdump is being run against a store that is on a clustered server, then you need to make sure that you set the Exchange Information Store resource to not affect the group. If the entire store dump cannot be written out in 300 seconds, the cluster service will kill the store service ruining any chances of collecting the appropriate data on the server.

      4. Open a case with Microsoft Product Support Services to get this data looked at.
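
The space estimate in step 3 can be checked before running Procdump. The sketch below is an assumption-laden illustration: the function name is hypothetical, and using the private memory size of store.exe as the dump size is only a rough stand-in for the Task Manager figure mentioned above.

```powershell
# Hypothetical helper: compares the free space on the dump drive with
# the space needed to write the process out $DumpCount times.
function Test-DumpSpace {
    param(
        [long]$ProcessBytes,   # memory used by store.exe
        [long]$FreeBytes,      # free space on the target drive
        [int]$DumpCount = 2    # matches "-n 2" in the procdump command
    )
    $needed = $ProcessBytes * $DumpCount
    New-Object PSObject -Property @{
        NeededGB = [math]::Round($needed / 1GB, 1)
        FreeGB   = [math]::Round($FreeBytes / 1GB, 1)
        Enough   = ($FreeBytes -gt $needed)
    }
}

# Example (run on the Exchange server; D: is the assumed dump drive):
# $store = Get-Process -Name store
# $disk  = Get-WmiObject Win32_LogicalDisk -Filter "DeviceID='D:'"
# Test-DumpSpace -ProcessBytes $store.PrivateMemorySize64 -FreeBytes $disk.FreeSpace
```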

    8. Collect a portion of the Store transaction log files (100 would be good) during the problem period and parse them following the directions in http://blogs.msdn.com/scottos/archive/2007/11/07/remix-using-powershell-to-parse-ese-transaction-logs.aspx to look for possible patterns, such as high pattern counts for IPM.Appointment. This will give you a high-level overview of whether something is looping or a high rate of messages is being sent. Note: This tool may or may not provide any benefit depending on the data that is stored in the log files, but it sometimes shows MIME-encoded data that will help with your investigation.

    9. Export out Message tracking log data from affected MBX server

      Method 1
      Download the ExLogGrowthCollector.zip file attached to this post and extract it to the MBX server that experienced the issue. Run ExLogGrowthCollector.ps1 from the Exchange Management Shell. Enter the MBX server name that you would like to trace and the Start and End times, then click the Collect Logs button.

      Note: This script exports all mail traffic to/from the specified mailbox server across all HUB servers between the times specified. This helps provide insight into any large or looping messages that might have been sent and could have caused the log growth issue.

      Method 2
      Copy/paste the following script into Notepad, save it as msgtrackexport.ps1, and then run it on the affected Mailbox server. Open the resulting MsgTrack.csv file in Excel for review. This does the same job as the GUI version, but prompts for the server name and times at the console.

      #Export Tracking Log data from affected server specifying Start/End Times

      Write-host "Script to export out Mailbox Tracking Log Information"
      Write-Host "#####################################################"
      Write-Host
      $server = Read-Host "Enter Mailbox server Name"
      $start = Read-host "Enter start date and time in the format of MM/DD/YYYY hh:mmAM"
      $end = Read-host "Enter end date and time in the format of MM/DD/YYYY hh:mmPM"
      $fqdn = $(get-exchangeserver $server).fqdn
      Write-Host "Writing data out to csv file..... "
      Get-ExchangeServer | where {$_.IsHubTransportServer -eq "True" -or $_.name -eq "$server"} | Get-MessageTrackingLog -ResultSize Unlimited -Start $start -End $end  | where {$_.ServerHostname -eq $server -or $_.clienthostname -eq $server -or $_.clienthostname -eq $fqdn} | sort-object totalbytes -Descending | export-csv MsgTrack.csv -NoType
      Write-Host "Completed!! You can now open the MsgTrack.csv file in Excel for review"


      Method 3
      You can also use the Process Tracking Log Tool at http://msexchangeteam.com/archive/2008/02/07/448082.aspx to provide some very useful reports.
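
Whichever method produces the export, looping messages tend to show up as the same MessageId repeated many times in MsgTrack.csv. The sketch below groups the rows on that column. The function name and the cutoff of 10 events are assumptions for illustration, and a high count can also come from a legitimate message sent to many recipients, so treat the output as leads rather than proof.

```powershell
# Hypothetical helper: groups tracking log rows by MessageId and returns
# the ids that appear at least $MinCount times, most frequent first.
function Get-RepeatedMessage {
    param([int]$MinCount = 10)
    $input | Group-Object MessageId |
        Where-Object { $_.Count -ge $MinCount } |
        Sort-Object Count -Descending |
        ForEach-Object {
            $first = $_.Group[0]
            New-Object PSObject -Property @{
                MessageId = $_.Name
                Events    = $_.Count
                Sender    = $first.Sender
                Subject   = $first.MessageSubject
            }
        }
}

# Example:
# Import-Csv MsgTrack.csv | Get-RepeatedMessage -MinCount 10 |
#     Format-Table Events, Sender, Subject -AutoSize
```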

    10. Save off a copy of the application/system logs from the affected server and review them for any events that could contribute to this problem.

    11. Enable IIS extended logging for the CAS and MBX server roles to add the sc-bytes and cs-bytes fields, so you can track large messages being sent via IIS protocols and also track usage patterns.

Proactive monitoring and mitigation efforts

  1. In backup-less environments, if the “Do not permanently delete mailboxes and items until the store has been backed up” setting is checked on an Exchange 2003 database, or the RetainDeletedItemsUntilBackup parameter is set to $true on an Exchange 2007 database, then over time this setting can lead to steady store growth, because deleted items are never purged and all whitespace in the database is consumed. Even with online maintenance running on the server, these pages in the database are never reclaimed while this flag is set.
  2. Check whether online maintenance for the database in question has been running nightly in the application log.
  3. Check whether any move mailbox operations are occurring that might be moving users to this database exhibiting the log growth issue.
  4. Increase Diagnostics Logging for the following objects depending on what stores are being affected:

    • MSExchangeIS\Mailbox\Rules
    • MSExchangeIS\PublicFolders\Rules
  5. Enable Client Side monitoring per http://technet.microsoft.com/en-us/library/cc540465.aspx
  6. Create a monitoring plan using MOM/SCOM to alert the messaging team for further action when the amount of log bytes being written hits a specific threshold. The Exchange 2007 Management Pack includes thresholds that can help alert on these types of situations before the problem gets to the point of taking a database offline. Here are 2 examples of this.

    ESE Log Byte Write/sec MOM threshold
    Warning Event
    http://technet.microsoft.com/en-us/library/bb218522.aspx

    Error Event
    http://technet.microsoft.com/en-us/library/bb218733.aspx

    If an alert is raised, then perform an operation to start collecting data.
  7. Ensure http://support.microsoft.com/kb/958701 is installed at a minimum on each Outlook 2003 client to address known log/database growth issues where users stream data to the information store that exceeds message size limits. This fix also addresses a problem where a client could copy a message from a PST to the Inbox that exceeds mailbox limits during the sync process, thus causing excessive log growth on the server.

    These hotfixes make use of the PR_PROHIBIT_SEND_QUOTA and PR_MAX_SUBMIT_MESSAGE_SIZE properties, which are referenced in http://support.microsoft.com/kb/894795

    Additional Outlook Log Growth fixes:
    http://support.microsoft.com/kb/957142
    http://support.microsoft.com/kb/936184

  8. Implement minimum Outlook Client versions that can connect to the Exchange server via the Disable MAPI clients registry key server side. See http://technet.microsoft.com/en-us/library/bb266970.aspx for more information.

    To disable clients less than Outlook 2003 SP2, use the following entries on an Exchange 2007 server
    "-5.9.9;7.0.0-11.6568.6567"

    Setting this to exclude Outlook client versions earlier than Outlook 2003 SP2 will help protect against stream issues to the store. The reason is that Outlook 2003 SP2 and later understand the new quota properties that were introduced into the store in http://support.microsoft.com/kb/894795. Older clients have no idea what these properties are, so if a user sent a message with a 600MB attachment, the client would stream the entire message to the store, generating excessive log files, and the message would only be NDR’ed once the message size limits were checked. With SP2 installed, the Outlook client first checks whether the attachment size is over the quota set for the organization, then immediately stops the send with a warning message on the client and prevents the stream from being sent to the server.

    Allowing any clients older than SP2 to connect to the store leaves the Exchange servers open to a growth issue.

  9. If Entourage clients are in use, then implement the MaxRequestEntityAllowed property described in http://support.microsoft.com/kb/935848 to address a known issue where sending a message over the size limit could create log growth for a database.
  10. Check to ensure File Level Antivirus exclusions are set correctly for both files and processes per http://technet.microsoft.com/en-us/library/bb332342.aspx
  11. Enable Content Conversion tracing on all HUB servers per http://technet.microsoft.com/en-us/library/bb397226.aspx . This will help log any failed conversion attempts that may be causing the log growth problem to occur.
  12. If POP3 or IMAP4 clients are connecting to specific servers, then enable Protocol Logging for each protocol on those servers. This records the protocol activity in a log file, which helps when these protocols are causing excessive log growth spurts. See http://technet.microsoft.com/en-us/library/aa997690.aspx on how to enable this logging.
  13. Ensure online maintenance has completed a pass for each database within the past week or two. Query the Application event log for the ESE event series 700 through 704 to confirm. If log growth occurs during online maintenance periods, this could be normal, as Exchange shuffles data around in the database. Keep this in mind when investigating log growth problems.
  14. Check for any excessive ExCDO warning events related to appointments in the application log on the server (for example, 8230 or 8264 events). http://support.microsoft.com/kb/947014 is just one example of this issue. If recurring meeting events are found, then try to regenerate calendar data server side via a process called POOF.  See http://blogs.msdn.com/stephen_griffin/archive/2007/02/21/poof-your-calender-really.aspx for more information on what this is.

    Event Type: Warning
    Event Source: EXCDO
    Event Category: General
    Event ID: 8230
    Description: An inconsistency was detected in username@domain.com: /Calendar/<calendar item> .EML. The calendar is being repaired. If other errors occur with this calendar, please view the calendar using Microsoft Outlook Web Access. If a problem persists, please recreate the calendar or the containing mailbox.

    Event Type: Warning
    Event ID : 8264
    Category : General
    Source : EXCDO
    Type : Warning
    Message : The recurring appointment expansion in mailbox <someone's address> has taken too long. The free/busy information for this calendar may be inaccurate. This may be the result of many very old recurring appointments. To correct this, please remove them or change their start date to a more recent date.

    Important: If 8230 events are consistently seen on an Exchange server, have the user delete and recreate that appointment to remove any corruption.

  15. Add additional store logging per http://support.microsoft.com/kb/254606 so that more performance counter data can be collected with Perfmon. This exposes counters such as ImportDeleteOpRate and SaveChangesMessageOpRates, which show the rates of operations that commonly drive log growth.
  16. Consider forcing end dates on recurring meetings.  This can be done with the registry value DisableRecurNoEnd (DWORD).

    For Outlook 2003:
    http://support.microsoft.com/kb/952144
    HKEY_CURRENT_USER\Software\Microsoft\Office\11.0\Outlook\Preferences

    For Outlook 2007:
    http://support.microsoft.com/kb/955449
    HKEY_CURRENT_USER\Software\Microsoft\Office\12.0\Outlook\Preferences
    Value: 1 to Enable, 0 to Disable
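
The registry value above can be set per user from PowerShell. This is a minimal sketch for Outlook 2007 using the path listed above; it assumes the corresponding hotfix from the linked KB article is already installed, and it only affects the user who runs it.

```powershell
# Outlook 2007 path from the list above; use 11.0 in place of 12.0 for Outlook 2003.
$key = 'HKCU:\Software\Microsoft\Office\12.0\Outlook\Preferences'

# Create the key if this profile does not have it yet.
if (-not (Test-Path $key)) {
    New-Item -Path $key -Force | Out-Null
}

# 1 enables the restriction, 0 disables it. -Force overwrites an existing value.
New-ItemProperty -Path $key -Name DisableRecurNoEnd -PropertyType DWord -Value 1 -Force | Out-Null
```

In a larger environment the same value would more likely be pushed through Group Policy or a logon script than set by hand.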
  17. Implement LimitEmbeddingDepth on the Exchange servers as outlined in KB 833607 to prevent log growth due to recursion looping. Note: The article states that this is for Exchange 2000-2003, but the key is still valid in Exchange 2007 per the source code.
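
For item 13 above, the online maintenance events can be pulled from the Application log with a small filter. This is a sketch under assumptions: the function name is hypothetical, and the filter relies on the events being logged under the ESE source with IDs 700 through 704, as described in that item.

```powershell
# Hypothetical filter: keeps ESE events 700 through 704 from the last
# $Days days. Expects event log entries (Source, EventID, TimeGenerated).
function Select-OnlineMaintenanceEvent {
    param([int]$Days = 14)
    $cutoff = (Get-Date).AddDays(-$Days)
    $input | Where-Object {
        $_.Source -eq 'ESE' -and
        $_.EventID -ge 700 -and $_.EventID -le 704 -and
        $_.TimeGenerated -ge $cutoff
    }
}

# Example (run on the Mailbox server):
# Get-EventLog -LogName Application | Select-OnlineMaintenanceEvent -Days 14 |
#     Format-Table TimeGenerated, EventID, Message -AutoSize
```

If a database does not appear in the output at all, online maintenance may not have completed a pass for it in that window.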

Known Issues

Exchange Server

SP1 Release Update 9 fixes

  • 959559 - Transaction log files grow unexpectedly in an Exchange Server 2007 Service Pack 1 mailbox server on a computer that is running Windows Server 2008
  • 925252 - The Store.exe process uses almost 100 percent of CPU resources, and the size of the public folder store increases quickly in Exchange Server 2007
  • 961124 - Some messages are stuck in the Outbox folder or the Drafts folder on a computer that is running Exchange Server 2007 Service Pack 1
  • 970725 - Public folder replication messages stay in the local delivery queue and cause an Exchange Server 2007 Service Pack 1 database to grow quickly

SP1 Release Update 8 fixes

  • 960775 - You receive a "Message too large for this recipient" NDR that has the original message attached after you restrict the Maximum Message Send Size value in Exchange Server 2007

SP1 Release Update 7 fixes

  • 957124 - You do not receive an NDR message even though your meeting request cannot be sent successfully to a recipient
  • 960775 - You receive a "Message too large for this recipient" NDR that has the original message attached after you restrict the Maximum Message Send Size value in Exchange Server 2007

SP1 Release Update 1 fixes

  • 947014 - An Exchange Server 2007 mailbox server randomly generates many transaction logs in an Exchange Server 2007 Service Pack 1 environment
  • 943371 - Event IDs 8206, 8213, and 8199 are logged in an Exchange Server 2007 environment

Outlook 2007

  • 970944 - Installing this hotfix package addresses an issue where log files are generated unexpectedly when a user is running Outlook 2007 in Cached Exchange mode and sends an e-mail message to recipients who have a corrupted e-mail address and/or e-mail address type
  • 970777 - Additional log files are generated on the Exchange server unexpectedly when you send an e-mail message to recipients who have a corrupted e-mail address or a corrupted e-mail address type by using Cached Exchange mode in Outlook 2007 
  • 978401 - Description of the Office Outlook 2007 hotfix package (Outlook-x-none.msp): February 23, 2010 (Includes a MAPISendMAIL fix)

Outlook 2003

  • 958701 - Description of the Outlook 2003 Post-Service Pack 3 hotfix package (Engmui.msp, Olkintl.msp, Outlook.msp): October 28, 2008
  • 936184 - Description of the Outlook 2003 post-Service Pack 3 hotfix package: December 14, 2007
  • 897247 - Description of the Microsoft Office Outlook 2003 post-Service Pack 1 hotfix package: May 2, 2005

Entourage

  • 935848 - Various performance issues occur when you use Entourage for Mac to send large e-mail messages to an Exchange 2007 server

Windows 2008

  • 955612 - The "LCMapString" function may return incorrect mapping results for some languages in Windows Server 2008 and in Windows Vista
