Yesterday we enabled auto login, a much-requested and long-anticipated feature. The feedback was great and it seemed to work well…really, really well.
This morning we had an incident on our production instance. Around 9am the load on the DB servers increased significantly, with all CPU cores running at 100%. At first we were slightly baffled and busy keeping the situation under control. Unfortunately, we had downtime, and you experienced login and page-loading errors.
When we checked the DB logs for long-running queries, we saw some very obvious candidates related to checks for user and group entitlements. It became clear that we had to disable the only significant change we had rolled out in the last couple of days. The auto login feature has been disabled since 11am GMT this morning, and the old, miserable login process is back in place.
There is some good news: auto login worked really well! Since yesterday, the number of logins to SCN has been 10 to 15 times higher than before.
The unfortunate side is that, at least as far as we know at this point, the underlying platform can’t handle this number of authenticated sessions and all the related permission checks.
Permissions on SCN are pretty straightforward. There are anonymous and registered users. Then there are a dozen medium-sized groups for employees, customers, moderators, University Alliance, etc. In addition, there are up to three groups per space for moderators. To keep it simple and manageable, we don’t do any user-specific permissions at all.
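To make the model above a bit more concrete, here is a minimal sketch of a purely group-based check. It is not the platform's actual code: the names User, Space, can_view and can_moderate are made up for illustration, and treating a space without any restricted groups as open to everyone is an assumption.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class User:
    # Hypothetical model: an anonymous visitor is simply a user with no groups.
    name: str
    groups: frozenset = frozenset()


@dataclass
class Space:
    # Hypothetical model: permissions hang on groups, never on single users.
    name: str
    allowed_groups: set = field(default_factory=set)    # groups that may view
    moderator_groups: set = field(default_factory=set)  # up to three per space


def can_view(user: User, space: Space) -> bool:
    # Assumption: a space with no restricted groups is open to everyone.
    if not space.allowed_groups:
        return True
    return bool(user.groups & space.allowed_groups)


def can_moderate(user: User, space: Space) -> bool:
    # Moderation rights come only from membership in a moderator group.
    return bool(user.groups & space.moderator_groups)
```

Even a check this simple has to be evaluated for every authenticated session and every piece of content that session touches, which is why the number of logged-in users matters so much.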
We do run load tests, several per week. But our three thousand load test users are simply not enough to simulate the load of authenticated sessions we currently see during peak times.
I hate to take something away after it has been rolled out, but at this point there is no other option. I’m really sorry about that!
We have started investigating the issue. Until we find a proper solution, the auto login feature will stay disabled. I don’t expect a solution to turn up quickly. Usually this takes a couple of days.
I wish I had better news.