We were working on a project for a customer with six Exchange servers (three CAS and three MBX). The customer has an Exchange Server 2013 organization with about 200 users.
It all sounded like a quick and easy project, but their Exchange Server had other ideas! After we configured the coexistence scenario on their Exchange Server, their local anti-spam application started blocking incoming email messages from Google.
This is normal behavior, of course, since those emails can look like spoofed emails. So we added the IP address ranges that Google uses for outbound SMTP to the allow list. Once we did that, along with some other settings on their Exchange Servers, we were able to get email traffic flowing between G Suite and the on-premises Exchange Servers.
That was the easy part… However, the Exchange Server was cooking up something else for us.
After a few days of normal operation, we got a complaint from the customer’s IT admin telling us they could no longer receive email messages on their Exchange Server. It is worth mentioning that we had already changed the MX records, so email for their whole domain was being received on G Suite.
So I started troubleshooting the problem, and the errors shown in the email log search were:
- The email message could not be delivered because the mailbox on the local server did not exist or was suspended.
- Some email messages were getting dropped due to spam issues.
- Many emails showed no errors in the G Suite email log search, yet the customer complained that they never arrived on the Exchange Server.
- A lot of email messages showed a delivery error generated by the customer’s Exchange Server (452 4.3.1 Insufficient system resources).
Only the last point is relevant to this issue, and it told me there was a resource problem on the server. So I started checking what was going on on that server, and I found a lot!
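To make sense of that reply: the leading 452 is an SMTP reply code in the 4xx range, which signals a temporary failure that the sender is expected to retry, and 4.3.1 is the enhanced status code that follows it. Here is a minimal Python sketch that splits such a line apart. The function and class names are my own, chosen for illustration, and are not part of Exchange or G Suite.

```python
import re
from typing import NamedTuple, Optional


class SmtpReply(NamedTuple):
    code: int                # three-digit SMTP reply code, e.g. 452
    enhanced: Optional[str]  # enhanced status code, e.g. "4.3.1", if present
    text: str                # human-readable part of the reply
    transient: bool          # True for 4xx replies (sender should retry)


# Reply code, a separator, an optional enhanced status code, then free text.
_REPLY = re.compile(r"^(\d{3})[ -](?:([245]\.\d{1,3}\.\d{1,3})\s+)?(.*)$")


def parse_smtp_reply(line: str) -> SmtpReply:
    """Split a single SMTP reply line into its parts."""
    match = _REPLY.match(line.strip())
    if match is None:
        raise ValueError(f"not an SMTP reply: {line!r}")
    code = int(match.group(1))
    return SmtpReply(code, match.group(2), match.group(3), 400 <= code < 500)


reply = parse_smtp_reply("452 4.3.1 Insufficient system resources")
print(reply.code, reply.enhanced, reply.transient)
```

Because the failure is transient, the sending side keeps retrying for a while, which fits with messages being delayed rather than rejected outright.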
There was a disk space issue on all six servers. There also seemed to be a performance issue on the servers, despite them having plenty of physical resources. In Event Viewer I found only a few interesting entries, such as:
The performance counter '\\SERVER_NAME\LogicalDisk(HarddiskVolume1)\Free Megabytes' sustained a value of 'xxx.xx', for the '15' minute(s) interval starting at 'DATE TIME'. Additional information: None. Trigger Name:DatabaseDriveSpaceTrigger. Instance:harddiskvolume1
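If you have many of these entries to go through, the message text is regular enough to pull apart with a regular expression. The sketch below is only an illustration built around the single sample message above; the pattern and field names are my own assumptions, and other events may be worded differently.

```python
import re

# The sample event text from above, with its placeholders left as they are.
EVENT_TEXT = (
    r"The performance counter '\\SERVER_NAME\LogicalDisk(HarddiskVolume1)\Free Megabytes' "
    r"sustained a value of 'xxx.xx', for the '15' minute(s) interval starting at 'DATE TIME'. "
    r"Additional information: None. Trigger Name:DatabaseDriveSpaceTrigger. "
    r"Instance:harddiskvolume1"
)

# Hypothetical pattern, written to match the wording of this one message.
PATTERN = re.compile(
    r"The performance counter '(?P<counter>[^']+)' sustained a value of "
    r"'(?P<value>[^']+)', for the '(?P<minutes>\d+)' minute\(s\) interval "
    r"starting at '(?P<start>[^']+)'\..*?Trigger Name:(?P<trigger>\w+)\."
    r"\s*Instance:(?P<instance>\S+)"
)


def parse_trigger_event(text: str) -> dict:
    """Return the named fields of a trigger event, or an empty dict if no match."""
    match = PATTERN.search(text)
    return match.groupdict() if match else {}


fields = parse_trigger_event(EVENT_TEXT)
print(fields["trigger"], fields["instance"], fields["minutes"])
```

The trigger name is the useful part here, since it is the string to search for later in the diagnostics configuration.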
Naturally, on seeing the error “452 4.3.1 Insufficient system resources”, I looked at back pressure on the Exchange Server. However, this turned out to be a lot more than I thought!
What is the problem then?
It took me a long time to get to the problem (much of it could have been saved had I known where to look). It turned out to be a sort of bug in Exchange Server.
Because I was looking at the symptoms and causes of back pressure, I did the following:
- Looked in Event Viewer for entries related to back pressure, but did not find anything other than the one mentioned above.
- Checked the memory, CPU, and disk utilization of the EdgeTransport.exe process. It was normal.
- Moved the queue database from its default location (C drive) to another place, following these steps:
  - Stopped the transport service.
  - Went to the new location where I wanted to move the database and created a new directory for it.
  - Opened the file ‘EdgeTransport.exe.config’ using Notepad (path: C:\Program Files\Microsoft\Exchange Server\V15\Bin\).
  - Edited the following key: <add key="QueueDatabasePath" value="C:\Program Files\Microsoft\Exchange Server\V15\TransportRoles\data\Queue" />. The value above points to the C drive, so I simply changed it to the new location where I had created the directory for the queue database.
  - Started the transport service again.
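The config edit in the steps above can also be scripted. Below is a minimal Python sketch that changes the value of one appSettings key in an XML string. The sample config is a trimmed, hypothetical stand-in (the real file has many more settings), and both paths in it are placeholders. Note that ElementTree drops XML comments when it writes the document back, so for the real file you may prefer editing by hand, and in either case keep a backup and stop the transport service first.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical stand-in for EdgeTransport.exe.config.
SAMPLE_CONFIG = r"""<configuration>
  <appSettings>
    <add key="QueueDatabasePath" value="C:\ExampleOldPath\Queue" />
  </appSettings>
</configuration>
"""


def set_app_setting(config_xml: str, key: str, value: str) -> str:
    """Return config_xml with the given appSettings key set to value."""
    root = ET.fromstring(config_xml)
    for node in root.findall("./appSettings/add"):
        if node.get("key") == key:
            node.set("value", value)
            return ET.tostring(root, encoding="unicode")
    # Fail loudly instead of silently adding a key that was not there.
    raise KeyError(key)


# "D:\ExchangeQueue" is a placeholder for the directory you created.
updated = set_app_setting(SAMPLE_CONFIG, "QueueDatabasePath", r"D:\ExchangeQueue")
print(updated)
```

Raising an error when the key is missing is deliberate: a typo in the key name should not leave you thinking the queue database was moved.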
After this there was no change at all in performance, and the error was still being generated for all incoming messages.
Then I turned to the only error I had that was related to system resources: the one shown above!
I knew of a performance setting in Exchange Server that acts as a kind of alert when the drive hosting the database starts to run low on space. For some reason (a bug), the alert was sort of ‘stuck’: it kept getting triggered and logged in Event Viewer. That caused the server to stop accepting incoming emails, thinking it was low on resources.
As it turned out, this was a bug that should have been fixed by updates to Exchange Server. Except that the Exchange Server we had was not exactly at the latest update level. That’s why I hit an issue that should not have happened and that led me to waste a lot of time.
In the end I was able to fix it by simply disabling the trigger that was causing this. Note that the trigger belongs to the diagnostics service, NOT the edge transport service!
To fix this error, follow these steps:
- Stop the diagnostics service (MSExchangeDiagnostics).
- Open the file “C:\Program Files\Microsoft\Exchange Server\V15\Bin\Microsoft.Exchange.Diagnostics.Service.exe.config” in a text editor.
- Find the line that contains “Triggers.DatabaseDriveSpaceTrigger” and change its status from True to False.
- That’s it. Start the diagnostics service again.
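If you need to repeat this on several servers, the text change in the steps above can be sketched in Python as a line-by-line edit. The sample below is hypothetical: the element and attribute names are my assumptions and may not match the real file, so the code relies only on what is described above, namely a line containing “Triggers.DatabaseDriveSpaceTrigger” with a True value on it.

```python
import re

# Hypothetical excerpt; the real file's element and attribute names may differ.
SAMPLE_CONFIG = '''<jobs>
  <add Name="Example.Triggers.DatabaseDriveSpaceTrigger" Enabled="True" />
  <add Name="Example.Triggers.SomeOtherTrigger" Enabled="True" />
</jobs>
'''


def disable_trigger(config_text: str,
                    marker: str = "Triggers.DatabaseDriveSpaceTrigger") -> str:
    """Flip the first quoted True to False on every line containing marker."""
    lines = []
    for line in config_text.splitlines(keepends=True):
        if marker in line:
            # Only this line is touched; all other triggers stay enabled.
            line = re.sub(r'"True"', '"False"', line, count=1, flags=re.IGNORECASE)
        lines.append(line)
    return "".join(lines)


print(disable_trigger(SAMPLE_CONFIG))
```

Working on plain text rather than parsing the XML keeps the rest of the file byte-for-byte identical, which makes the change easy to review and to revert once the server is patched.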
The error stopped showing up in Event Viewer, and the Exchange Server started handling emails normally again!
The proper fix?
Make sure your Exchange Server is up-to-date with the latest Microsoft updates!
Disclaimer: The Microsoft Exchange Server logo is the property of Microsoft, with which I have no affiliation. I used it on this site for demonstration purposes only.