
Magnus Rahl
Nov 8, 2017

Two bugs in ASP.NET that break Cache memory management

TL;DR: A bug in ASP.NET memory monitoring prevents the cache monitoring / scavenging thread from running if the memory usage in the host at the time of application startup happens to fall in a specific range relative to the configured cache thresholds. Another bug causes the same behavior if no cache threshold is explicitly configured (falling back to the default). This can cause further problems in Azure Web Apps, because it triggers the Proactive Auto Heal feature, repeatedly recycling the application for no reason. The workaround is to set the privateBytesPollTime option to less than 30 seconds and also provide a value for percentagePhysicalMemoryUsedLimit (90 % or less, so as not to conflict with Proactive Auto Heal).

Intro and previous investigations

In my previous blog post I started to investigate a problem with the ASP.NET HTTP runtime cache and Azure Proactive Auto Heal, and the buggy behavior of the configured memory usage threshold. I found an apparent correlation between certain settings for percentagePhysicalMemoryUsedLimit and the application failing to keep the cache memory usage under control, causing Proactive Auto Heal to recycle the application. But I also acknowledged that the behavior was intermittent, and I now know why.

A quick overview of ASP.NET memory management internals

ASP.NET (this refers to .NET Framework 4.7.1) uses the AspNetMemoryMonitor class to monitor the application's memory usage. Internally it uses the LowPhysicalMemoryMonitor class to periodically compare the current memory pressure to the configured (or default) thresholds, and trigger cache scavenging/trimming if needed. Or, at least, that is what it is supposed to do...

Bug 1: Periodic checks fail to start under certain conditions

For the periodic memory checks, LowPhysicalMemoryMonitor uses a System.Threading.Timer instance that is initialized like this in its constructor:

this._timer = new Timer(new TimerCallback(this.MonitorThread), (object) null, -1, this._currentPollInterval);

Note the third argument, -1, which means that the timer is created with a due time (the delay until it next triggers the callback) that is "infinitely" far off in the future. So the timer is essentially disabled from the start.

The framework then calls the LowPhysicalMemoryMonitor.Start method, which calls AdjustTimer(false). AdjustTimer is also called from within the timer callback, to adjust the time of the next callback according to the current memory pressure: the callback runs more often when memory pressure is high. So AdjustTimer is designed to do more than just start the timer for the Start method. It looks like this (reflected code):

internal void AdjustTimer(bool disable = false)
{
    lock (this._timerLock)
    {
        if (this._timer == null)
            return;
        if (disable)
            this._timer.Change(-1, -1);
        else if (this.PressureLast >= this.PressureHigh)
        {
            // High pressure: poll every 5 seconds
            if (this._currentPollInterval <= 5000)
                return;
            this._currentPollInterval = 5000;
            this._timer.Change(this._currentPollInterval, this._currentPollInterval);
        }
        else if (this.PressureLast > this.PressureLow / 2)
        {
            // Medium pressure: poll every s_pollInterval, capped at 30 seconds
            int num = Math.Min(LowPhysicalMemoryMonitor.s_pollInterval, 30000);
            if (this._currentPollInterval == num)
                return;
            this._currentPollInterval = num;
            this._timer.Change(this._currentPollInterval, this._currentPollInterval);
        }
        else
        {
            // Low pressure: fall back to the CacheSizeMonitor poll interval
            if (this._currentPollInterval == CacheSizeMonitor.PollInterval)
                return;
            this._currentPollInterval = CacheSizeMonitor.PollInterval;
            this._timer.Change(this._currentPollInterval, this._currentPollInterval);
        }
    }
}

PressureLast is the last observed memory pressure (% physical memory used).
PressureHigh is the configured (or default) percentagePhysicalMemoryUsedLimit.
PressureLow is a value calculated from PressureHigh, lower than PressureHigh (generally it is PressureHigh - 9).
s_pollInterval is a default timer interval that is initialized to the value of the privateBytesPollTime setting, converted to milliseconds. The default for privateBytesPollTime is 2 minutes.
_currentPollInterval is the timer interval to use, which is also set to the Timer using the Change method. This field is initialized to 30 seconds.

Now, assume AdjustTimer is called by Start to start the timer. The fields will have the values described above and, most importantly, PressureLast is a value that has actually been measured, so it can be anything. It can satisfy PressureHigh > PressureLast > PressureLow / 2, which takes us into the second else if. That branch takes the minimum of 30000 and s_pollInterval, which by default is 120000, so 30000 is used. It then compares 30000 to _currentPollInterval and exits early if they are equal, which they are, because 30000 is exactly what _currentPollInterval was initialized to! This means the timer is never activated, AdjustTimer is never called again, and LowPhysicalMemoryMonitor and the cache scavenging it handles remain inactive for the lifetime of the application.

How does this explain the apparent correlation with percentagePhysicalMemoryUsedLimit that I described in the previous blog post? Values in the range identified in that post are likely to arrange PressureLow and PressureHigh in such a way that a typical PressureLast value at application startup matches the scenario described above. At low values of percentagePhysicalMemoryUsedLimit it is much more likely to end up in the PressureLast >= PressureHigh scenario, and at high values it is more likely to end up in the PressureLast <= PressureLow / 2 scenario.
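To make the vulnerable range concrete, here is a minimal sketch (not framework code) that derives it from the logic above, assuming percentagePhysicalMemoryUsedLimit is configured and privateBytesPollTime is left at its 2-minute default; the VulnerableStartupRange helper is purely illustrative:

using System;

static class Bug1Illustration
{
    // Illustrative helper: for a configured percentagePhysicalMemoryUsedLimit, return the range of
    // physical memory usage (in percent) at startup for which AdjustTimer exits early and the
    // monitoring timer is never started.
    static (int From, int To) VulnerableStartupRange(int percentagePhysicalMemoryUsedLimit)
    {
        // Same derivation of the thresholds as LowPhysicalMemoryMonitor.ReadConfig (shown below)
        int pressureHigh = Math.Max(3, percentagePhysicalMemoryUsedLimit);
        int pressureLow = Math.Max(1, pressureHigh - 9);

        // Bug 1 is hit when PressureLow / 2 < PressureLast < PressureHigh, because
        // Min(s_pollInterval, 30000) then equals the initial _currentPollInterval of 30000
        return (pressureLow / 2 + 1, pressureHigh - 1);
    }

    static void Main()
    {
        // With percentagePhysicalMemoryUsedLimit="90" this prints 41-89, i.e. any startup
        // memory usage from 41 % to 89 % leaves the monitoring timer disabled.
        var (low, high) = VulnerableStartupRange(90);
        Console.WriteLine($"{low}-{high}");
    }
}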

Bug 2: Periodic checks are stopped if default memory threshold is used

Let's take a look at LowPhysicalMemoryMonitor.ReadConfig, which is called from the constructor:

private void ReadConfig(CacheSection cacheSection)
{
    int val2 = cacheSection != null ? cacheSection.PercentagePhysicalMemoryUsedLimit : 0;
    if (val2 == 0)
        return;
    this._pressureHigh = Math.Max(3, val2);
    this._pressureLow = Math.Max(1, this._pressureHigh - 9);
    LowPhysicalMemoryMonitor.s_pollInterval = (int) Math.Min(cacheSection.PrivateBytesPollTime.TotalMilliseconds, (double) int.MaxValue);
}

Note the first statement: the default for percentagePhysicalMemoryUsedLimit when it is omitted from config is actually 0. So unless you explicitly set that setting, you will hit the early return where val2 is 0. The most important implication is that the last line never runs, so s_pollInterval retains its default value of 0.

Now say that Bug 1 was not triggered and the timer is running, periodically calling into AdjustTimer. Then take another look at the PressureLast > PressureLow / 2 case. Because s_pollInterval is now 0, 0 will be used as the new interval set on the timer. The call chain from Timer.Change ends up in System.Threading.TimerQueue.UpdateTimer, where the interval (period) is used like this:

timer.m_period = (period == 0) ? Timeout.UnsignedInfinite : period;

So passing 0 as the period effectively disables the timer, and again we are in a situation where memory monitoring and cache scavenging will be off for the lifetime of the application.

Workaround

Bug 1 can be avoided by setting privateBytesPollTime to less than 30 seconds, which forces the first call to AdjustTimer to change the timer interval and thereby start the timer, even in the PressureLast > PressureLow / 2 scenario.
Bug 2 can be avoided simply by giving percentagePhysicalMemoryUsedLimit a value other than its default of 0. Both settings go on the cache element in web.config, as in the sketch below.
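A minimal web.config sketch applying both workarounds (the 20-second poll time and the 90 % limit are example values, chosen to stay under the 30-second threshold and under Proactive Auto Heal's limit respectively):

<configuration>
  <system.web>
    <caching>
      <!-- privateBytesPollTime below 30 seconds makes the first AdjustTimer call reprogram and start the timer (Bug 1) -->
      <!-- an explicit percentagePhysicalMemoryUsedLimit makes ReadConfig initialize s_pollInterval (Bug 2) -->
      <cache privateBytesPollTime="00:00:20" percentagePhysicalMemoryUsedLimit="90" />
    </caching>
  </system.web>
</configuration>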
