
Yesterday we enabled auto login, a much-requested and long-anticipated feature. The feedback was great and it seemed to work well… really, really well.

This morning we had an incident on our production instance. Around 9am the load on the DB servers increased significantly, with all CPU cores at 100%. At first we were slightly baffled, but busy keeping the situation under control. Unfortunately we had downtime, and you experienced login and page loading errors.

When we checked the DB logs for long-running queries, we saw some very obvious candidates related to checks for user and group entitlements. It became clear that we had to disable the only significant change we had rolled out in the last couple of days. Since 11am GMT this morning the auto login feature has been disabled and the old, miserable login process is in place again.

There is some good news: auto login worked really well! Since yesterday we have seen 10 to 15 times as many logins to SCN.

[Figure: SCN Logins.png]

The unfortunate side is that, at least as far as we know right now, the underlying platform can’t handle this number of authenticated sessions and all the related permission checks.

Permissions on SCN are pretty straightforward. There are anonymous and registered users. Then there are a dozen medium-sized groups for employees, customers, moderators, University Alliance etc. In addition there are up to three groups per space for moderators. To keep it simple and manageable, we don’t do any user-specific permissions at all.
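To make the model above concrete, here is a minimal sketch in Python. It is an illustration only, not the platform's actual code: the names `User`, `Space`, `ANONYMOUS`, `is_registered` and `can_moderate` are hypothetical, and so are the group names in the usage note.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class User:
    """A visitor; anonymous users simply belong to no groups."""
    name: str
    groups: frozenset = frozenset()


@dataclass
class Space:
    """A space with its moderator groups (the post mentions up to three)."""
    name: str
    moderator_groups: set = field(default_factory=set)


# Hypothetical sentinel for visitors who are not logged in.
ANONYMOUS = User(name="anonymous")


def is_registered(user: User) -> bool:
    """Anything other than the anonymous sentinel counts as registered."""
    return user is not ANONYMOUS


def can_moderate(user: User, space: Space) -> bool:
    """Group-only check: no user-specific permissions are consulted."""
    return not user.groups.isdisjoint(space.moderator_groups)
```

For example, a user in a hypothetical `abap-moderators` group can moderate a space that lists that group, while `ANONYMOUS` cannot. Even a check this simple is evaluated for every authenticated session, so the total cost grows with the number of logged-in users.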

We run load tests, several per week. But our three thousand load test users are simply not enough to simulate the load of authenticated sessions we currently see during peak times.
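As a rough illustration of how a load test could be sized against observed peaks, here is a minimal sketch. It assumes a simple rule of thumb (observed peak plus a safety margin); the function name, the default margin and the example numbers are hypothetical and not taken from our tooling or our real traffic.

```python
def required_virtual_users(observed_peak_sessions: int, margin_percent: int = 20) -> int:
    """Return the number of simulated users needed to cover the observed peak
    of authenticated sessions plus a safety margin, rounded up.

    Integer arithmetic is used to avoid floating-point rounding surprises.
    """
    if observed_peak_sessions < 0 or margin_percent < 0:
        raise ValueError("peak and margin must be non-negative")
    scaled = observed_peak_sessions * (100 + margin_percent)
    return -(-scaled // 100)  # ceiling division


def is_pool_sufficient(pool_size: int, observed_peak_sessions: int,
                       margin_percent: int = 20) -> bool:
    """Check whether an existing pool of load test users covers the target."""
    return pool_size >= required_virtual_users(observed_peak_sessions, margin_percent)
```

With a purely illustrative peak of 10,000 authenticated sessions and a 20% margin, `required_virtual_users(10_000)` asks for 12,000 simulated users, so a pool of 3,000 would fall short.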

I hate to take something away after it has been rolled out, but at this point there is no other option. I’m really sorry about that!

We have started investigating the issue. Until we find a proper solution, the auto login feature will be disabled. I don’t expect a solution to show quickly. Usually this takes a couple of days.

I wish I had better news.

Oliver


19 Comments


  1. Andy Silvey

    Hi Oliver,

    thank you for the update.

    As an SAP Portal person, I am very curious about the platform running SCN.

    Would it be possible to have a series of blogs on the SCN platform landscape, sizing, etc.?

    Thank you,

    Andy.

  2. Former Member

    Thanks for the update Oliver and for keeping the community in the loop as well.  Agree with Matthias that I’m confident you guys will have it working soon, and keep up the great work!

  3. DJ Adams

    Echoing Matthias, thanks for your openness, Oliver. I am one of the many who don’t use SCN much at all because of this issue. While it’s temporarily bad news, it shows me that you folks are working on it, and I’m reminded of the scale issues.

  4. Tammy Powlas

    I agree with Matthias – this is great open communication, and we appreciate your efforts on this, Oliver.

    I was very spoiled by auto-login yesterday – I look forward to having it back. 

  5. Frank Koehntopp

    Thanks for the update. As DJ said, that’s probably one of the most crucial features to increase participation, so it’s good to see people are working on it.

  6. Former Member

    Oliver, thanks for the recent help from you and Jason Lax! I’m summarizing my problem here, in case someone else runs into similar issues.

    When I visited scn.com recently, I clicked the “Login” link, expecting SSO to pick it up from there. Instead, I got a popup asking me to update my password, and asking me to accept the “Terms” by clicking a checkbox.

    The checkbox starts out disabled, and you have to scroll down to the bottom of the terms and conditions textbox before it is enabled. But for me, it failed to enable even after I scrolled to the bottom.

    Apparently the problem was that I had magnification set to 125% on my browser (IE 8)! After some back and forth with Oliver and Jason, I tried with mag set to 100%, and all was ok.

    1. Oliver Kohl Post author

      Hi Atul,

      thanks, but this is not the appropriate place for this info. Nobody will find it here. Why not move it into the SCN Support space, maybe as a blog post describing the issue?

      I’ve already forwarded the issue to the SAP IDS team to look into. Hopefully this will be fixed soon.

  7. Former Member

    I find it interesting and at the same time saddening that SCN is not learning from previous errors regarding login and more importantly: usage numbers.

    Load testing with an inferior number of simulated users is not load testing the system under realistic assumptions. My guess is that SCN knows this but somehow (money? resources?) cannot stress test the system as needed.

    The root cause was found pretty fast, but “I don’t expect a solution to show quickly. Usually this takes a couple of days” sounds to me like auto logon will return after a few weeks (between “found solution” and “it’s in production” lies some hard work that needs time).

    Nevertheless, I hope that SCN will give Oliver an environment to (stress) test SCN under realistic assumptions*.

    *which means: worst case + 20% 🙂

    1. Samuli Kaski

      Yesterday, when the automatic login was in place, I did notice some weird behaviour. I have always used an SAP Service Marketplace generated certificate to make login to SCN more convenient. While the automatic login was in place, I saw an increasing number of certificate prompts. Usually I see two: when I first access any SCN URL and then when I click the Login link in the header. Yesterday I usually got at least 3 certificate prompts, sometimes more. I do open many discussion threads in new tabs or even new windows; I may have up to 10 SCN URLs open at one time.

      Maybe that is one of the reasons for the increased “logins” you were noticing?

      Have you ruled out DoS attempts?

      1. Oliver Kohl Post author

        Hi Samuli,

        the additional browser certificate popups are due to a known issue on our load balancer, which requests them when content is accessed via HTTPS. A fix is in the works. But the central LB is a critical component, so unfortunately this takes a bit longer than usual. The fix is planned for early March, if I recall correctly.

    2. Oliver Kohl Post author

      Hi Tobias,

      If you have a look at the graph above, the number of logins per hour actually increased up to 1500%! This is not a resource issue. We have an excellent environment and the best tools to do our work here (something I should also blog about at some point).

      At our scale we sometimes hit the limits of the platform. Most of the time you won’t notice it, because the load tests identify the issue beforehand (e.g. ever wondered why push notifications for mobile came so late? 🙂). This time, unfortunately, it hit you, the community.

      Another lesson learned…

      1. Former Member

        Hi Oliver,

        you received 1500% more logins because auto logon worked, not because there were 1500% more users. That’s what I mean: you have to take the maximum number of users (logged on or anonymous) that you have ever had to handle, add X% to be on the safe side, and run a test to see whether SCN logon goes down or not. It looks like the 3,000 users* are not sufficient to simulate the actual load.

        The “Why” and the answer to it is something really worth a blog.

        *aehm, how many (unique) users actually access SCN per hour? Or do we also have a usage behavior problem here, from users like me who open 10 browser tabs at the same time?

        1. Oliver Kohl Post author

          Hi Tobias,

          We should have checked the actual login sessions on the IDS side or compared them with the old platform. That is what I meant by lesson learned.

          And these figures on the graph, I left them out on purpose.

  8. Tom Cenens

    Hi Oliver

    Thanks for the work & support.

    Great to have my multiple accounts migrated together as well, now I’ll at least be informed when someone mentions me 🙂.

    Best regards

    Tom

  9. Jeff McDonald

    Hi Oliver – Just adding my thanks as well.  As was said before, the open communication is much appreciated.  I’m sure you and your team will get the issues resolved in no time 😎.

    Cheers – Jeff

