HttpIOException: Hard-to-explain timeouts can be difficult to locate (Upgrading, SSL and Proxy issues)
After a recent and eventful upgrade from SPS15 to SPS19, I feel that I have to blog this story so that others might learn from my experience. The upgrade itself is simple enough to understand: we needed SPS19 so that Internet Explorer 7 would work with our Web Dynpro and the special apps that we have on the portal. The variables are in the environment.
We are lucky to have two testing environments. The first is a flat-out portal with no extras. The other is a portal and proxy on the same box with SSL. The upgrade was uneventful in both cases, and testing proved to be uneventful as well. Production had a twist: portal and proxy with SSL as before, but the proxy is in the DMZ.
The upgrade of the production portal was uneventful too. Afterwards, I cleaned up a few loose ends and started testing. Everything was going great until I started looking at repositories and custom apps, where I was getting 502 Bad Gateway errors. The proxy logs gave no clue as to what was throwing them. The server side had only this:
Message: User paycosupply, IP address XXX.XXX.XXX.XXX Cannot read request body. com.sap.engine.services.httpserver.exceptions.HttpIOException: Read timeout. The client has disconnected or a synchronization error has occurred. Read  bytes. Expected .
I found notes on lazy and slow HTTP connections, which I had used for some adjusting in the past. Back then, changing ServletInputStreamTimeout and ServletsLongDataTransferTimeout solved the timeout issues, but this time it didn't.
After two days, I was at a dead end and made the decision to restore the portal from backup tapes to the prior version. (Practice makes perfect when doing a restore. If you have the means, you should practice recovery from backups just in case disaster strikes. I had the portal back in just over an hour!) I opened an OSS message, but we couldn't pin it down. Everything worked fine in the development environment.
The only difference was that in development the reverse proxy was on the portal box and not in the DMZ. At a loss for a solution, I built another proxy and placed it in the DMZ for the development portals. And there it was: I was able to reproduce the problem. Now, what was it? The logs gave me no specific error, nothing beyond a timeout/synchronization problem with HTTP.
After a few days of going through logs, we came up with the idea of removing things one at a time. SSL was first. As soon as we removed it and used plain HTTP, the HttpIOException errors vanished. The timeouts were HTTP related all right, but only with SSL. That brought us to note 1000264, mainly RUNTIME_SO_TIMEOUT. For some reason not yet discovered, when you are using a reverse proxy in a DMZ, the timeout for SSL must be increased when upgrading to SPS19.
After two weeks of tweaking, things are running quite well. It just goes to show you that slight differences in your landscape can make for some unusual problems.
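
When removing things one at a time like this, it helps to count the read-timeout entries before and after each change rather than eyeballing the logs. Here is a minimal Python sketch of that idea. It assumes each log entry sits on a single line and looks like the message quoted above; the real trace file layout may differ. The function name, the sample text and the user "demo_user" are made up for illustration.

```python
import re
from collections import Counter

# Pattern built from the message quoted in this post. The layout of a real
# trace file may differ, so treat this regex as an assumption.
ENTRY = re.compile(
    r"User (?P<user>\S+), IP address (?P<ip>\S+) Cannot read request body\."
    r".*?HttpIOException: Read timeout"
)


def count_read_timeouts(log_text):
    """Return a Counter of read-timeout entries keyed by user name."""
    hits = Counter()
    for match in ENTRY.finditer(log_text):
        hits[match.group("user")] += 1
    return hits


# Hypothetical sample: three timeout entries and one unrelated line.
TIMEOUT = (
    "Message: User {user}, IP address XXX.XXX.XXX.XXX Cannot read request body. "
    "com.sap.engine.services.httpserver.exceptions.HttpIOException: Read timeout."
)
sample_log = "\n".join([
    TIMEOUT.format(user="paycosupply"),
    "Message: some unrelated entry",
    TIMEOUT.format(user="demo_user"),
    TIMEOUT.format(user="paycosupply"),
])

if __name__ == "__main__":
    for user, count in count_read_timeouts(sample_log).most_common():
        print(user, count)
```

Run it against the log once with SSL on and once with SSL off (or before and after a timeout change) and compare the totals. A count that drops to zero is a much clearer signal than a feeling that the errors went away.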