After writing and publishing the BSP port of the http:BL mechanism, which is meant to prevent harvesters from gathering e-mail addresses on your website, I was left with a vague sense of dissatisfaction. As the comments say, http:BL is a great initiative, and the BSP port isn’t rocket science, so there isn’t anything specifically wrong with it. Except for one thing. The original Apache version of http:BL works at server level, whereas the BSP port, as demonstrated in the web log(s), works at application and even page level. That means you need to insert the method code into every application and trigger it in every page that you want to protect. That’s a bit of overkill. I thought there must be a way to make the BSP port act in the same manner as the Apache module. Indeed, all one needs to do is extend the BSP rendering engine so that it checks the trustworthiness of the IP address before the page is displayed. One would have thought that this could be achieved by writing one’s own BSP extensions, but that still leaves us with the same problems as mentioned above.
Searching the SDN web logs and the more than excellent Advanced BSP Programming book by Brian and Thomas, I discovered that I needed to write an HTTP handler. The technology isn’t that difficult to understand and implement, so I wanted to go one step further and implement something that is not yet available in the Apache module. In some cases one needs to skip the IP check. There are several reasons for that: the page doesn’t contain any data that is valuable to harvesters, or the application is only callable from an intranet (and hopefully you don’t have any harvesters over there). The main reason, from my point of view, is that checking IP addresses will prevent the honeypot project from working properly. The http:BL mechanism would keep harvesters away from the page that the honeypot is installed on, and letting the harvesters visit that page is precisely what we want.
Step by step
What follows is a step-by-step guide on how to implement the code. I won’t reinvent the wheel and explain everything about HTTP handlers. I gladly refer to the above book and the web logs by the two BSP gurus.
Step 1: Create a table (ZEU_HTTPBL in the code below) with the fields APPLICATION and PAGENAME.
Make it maintainable via transaction SM30 and enter the application name and page name (both in capitals) of every page that you don’t want to protect. This is a sort of white list.
Step 2: Create a class.
Since HTTP handlers are in fact nothing more than classes, we need to make one. I called it ZCL_HTTPBL. Implement the interface IF_HTTP_EXTENSION and put the following code in the HANDLE_REQUEST method:
* … (the start of the method and the CONCATENATE chain that builds the terms-of-use HTML in html are illegible in this copy; the individual steps are explained under "The code" below) …
REPLACE ALL OCCURRENCES OF '\n' IN html WITH cl_abap_char_utilities=>newline.
server->response->set_status( code = 200 reason = 'OK' ).
server->response->set_header_field( name = if_http_header_fields=>content_type value = 'text/html' ).
server->response->set_cdata( html ).
* server->response->set_status( code = 404 reason = 'Not Found' ).
if_http_extension~flow_rc = if_http_extension=>co_flow_ok.
ELSE.
if_http_extension~flow_rc = if_http_extension=>co_flow_ok_others_mand.
ENDIF.
ENDIF.
ENDMETHOD.
Step 3: After activating the class, add it to the handler list of the node /default_host/sap/bc/bsp (transaction SICF). Make sure that the handler precedes the CL_HTTP_EXT_BSP handler, so that it runs before the BSP runtime gets the request.
The code
I won’t explain every detail of the code, since most of it has already been explained in my earlier web log. These are the extra things you need to know.
After the data definitions, we retrieve the URL that the user called
url = server->request->get_header_field( if_http_header_fields_sap=>request_uri ).
And the IP address of that user
remote = server->request->get_header_field( if_http_header_fields_sap=>remote_addr ).
We’re only interested in the part of the URL before the question mark. The rest are parameters
SPLIT url AT '?' INTO TABLE itab.
READ TABLE itab INDEX 1 INTO tmp_string.
We split the URL at each slash
SPLIT tmp_string AT '/' INTO TABLE itab.
The page name should be specified in the last record
idx = LINES( itab ).
READ TABLE itab INDEX idx INTO pagekey.
Is it really a page name, that is, does it contain a dot? We need this test to catch URLs that don’t specify a page name and rely on the default page
IF pagekey NS '.'.
If not, the page name is, in reality, the application name
applname = pagekey.
And we assign a dummy page name
pagekey = 'index.htm'.
ELSE.
If it was a page name, the application name is the preceding record
idx = idx - 1.
READ TABLE itab INDEX idx INTO applname.
ENDIF.
We convert to uppercase.
TRANSLATE pagekey TO UPPER CASE.
TRANSLATE applname TO UPPER CASE.
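To try the parsing steps above in isolation, they can be collected in a small stand-alone report. This is only a sketch: the report name and the sample URL are made up for illustration, and in the handler the URL comes from the request header instead.

```abap
REPORT zhttpbl_url_demo.

* Hypothetical sample URL; the handler reads it from the request header.
DATA: url        TYPE string VALUE '/sap/bc/bsp/sap/zmyapp/default.htm?foo=bar',
      tmp_string TYPE string,
      itab       TYPE TABLE OF string,
      pagekey    TYPE string,
      applname   TYPE string,
      idx        TYPE i.

* Keep only the part before the question mark.
SPLIT url AT '?' INTO TABLE itab.
READ TABLE itab INDEX 1 INTO tmp_string.

* Break the path into its segments; the last one should be the page.
SPLIT tmp_string AT '/' INTO TABLE itab.
idx = lines( itab ).
READ TABLE itab INDEX idx INTO pagekey.

IF pagekey NS '.'.
* No dot: the last segment is the application, so use a dummy page name.
  applname = pagekey.
  pagekey  = 'index.htm'.
ELSE.
* A real page name: the application is the segment before it.
  idx = idx - 1.
  READ TABLE itab INDEX idx INTO applname.
ENDIF.

TRANSLATE pagekey TO UPPER CASE.
TRANSLATE applname TO UPPER CASE.

WRITE: / 'Application:', applname,
       / 'Page:', pagekey.
```

For the sample URL this should yield the application ZMYAPP and the page DEFAULT.HTM, which are the two values used as the key for the white list.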
We check whether the page is in the white list
SELECT SINGLE * FROM zeu_httpbl INTO wa WHERE application = applname AND pagename = pagekey.
IF sy-subrc EQ 0.
It is, so we pass control to the next handler and the page will be ‘executed’
if_http_extension~flow_rc = if_http_extension=>co_flow_ok_others_mand.
RETURN.
ELSE.
It is not in the white list, so reverse the IP and do an NSLOOKUP. Check my earlier web log for an explanation on how to do that.
If the lookup returns a positive (rc greater than 0), we create the HTML to show to the harvester
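As a reminder of what "reversing the IP" means, here is a minimal sketch that only builds the host name for the lookup. It assumes the usual http:BL query format of access key, reversed octets and the zone dnsbl.httpbl.org. The report name, the access key and the sample address are placeholders, and the lookup itself is left out because it is covered in the earlier web log.

```abap
REPORT zhttpbl_reverse_ip_demo.

* Placeholder: replace with your own http:BL access key.
CONSTANTS accesskey TYPE c LENGTH 12 VALUE 'abcdefghijkl'.

* Hypothetical sample address; the handler uses the remote address header.
DATA: remote   TYPE string VALUE '127.1.1.5',
      octets   TYPE TABLE OF string,
      octet    TYPE string,
      reversed TYPE string,
      tmp      TYPE string,
      query    TYPE string.

SPLIT remote AT '.' INTO TABLE octets.

* Prepend each octet so that 127.1.1.5 becomes 5.1.1.127.
LOOP AT octets INTO octet.
  IF reversed IS INITIAL.
    reversed = octet.
  ELSE.
    CONCATENATE octet '.' reversed INTO tmp.
    reversed = tmp.
  ENDIF.
ENDLOOP.

* Assumed query format: <access key>.<reversed IP>.dnsbl.httpbl.org
CONCATENATE accesskey '.' reversed '.dnsbl.httpbl.org' INTO query.

WRITE / query.
```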
IF rc GT 0. html = '\n\n\n\n'. …
And give it back to the client
server->response->set_status( code = 200 reason = 'OK' ).
server->response->set_header_field( name = if_http_header_fields=>content_type value = 'text/html' ).
server->response->set_cdata( html ).
The output will look something like this
Alternatively, we can return an HTTP 404 result
* server->response->set_status( code = 404 reason = 'Not Found' ).
Whichever response is chosen, we need to stop the handler flow here, so that the called page is never executed
if_http_extension~flow_rc = if_http_extension=>co_flow_ok.
ELSE.
It’s not a positive, so we pass control to the next handler and the page will be ‘executed’
if_http_extension~flow_rc = if_http_extension=>co_flow_ok_others_mand.
ENDIF.
ENDIF.
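Seen from a distance, the handler is just two checks followed by a decision about the handler flow. The skeleton below is a sketch of that control flow in source-based class form, not the original listing: the helper methods is_whitelisted and is_listed are hypothetical and only stubbed, and it answers with the 404 variant instead of the HTML page.

```abap
CLASS zcl_httpbl DEFINITION PUBLIC FINAL CREATE PUBLIC.
  PUBLIC SECTION.
    INTERFACES if_http_extension.
  PRIVATE SECTION.
*   Hypothetical helpers, not part of the original listing.
*   (Older releases may need TYPE-POOLS abap for abap_bool.)
    METHODS is_whitelisted
      IMPORTING url           TYPE string
      RETURNING VALUE(result) TYPE abap_bool.
    METHODS is_listed
      IMPORTING remote        TYPE string
      RETURNING VALUE(result) TYPE abap_bool.
ENDCLASS.

CLASS zcl_httpbl IMPLEMENTATION.
  METHOD if_http_extension~handle_request.
    DATA: url    TYPE string,
          remote TYPE string.

    url    = server->request->get_header_field( if_http_header_fields_sap=>request_uri ).
    remote = server->request->get_header_field( if_http_header_fields_sap=>remote_addr ).

*   White-listed page or clean address: let the next handler run.
    IF is_whitelisted( url ) = abap_true OR is_listed( remote ) = abap_false.
      if_http_extension~flow_rc = if_http_extension=>co_flow_ok_others_mand.
      RETURN.
    ENDIF.

*   Positive: answer the request here and stop the handler flow.
    server->response->set_status( code = 404 reason = 'Not Found' ).
    if_http_extension~flow_rc = if_http_extension=>co_flow_ok.
  ENDMETHOD.

  METHOD is_whitelisted.
*   Stub: the URL parsing and the SELECT on the white list go here.
    result = abap_false.
  ENDMETHOD.

  METHOD is_listed.
*   Stub: the reversed-IP lookup goes here.
    result = abap_false.
  ENDMETHOD.
ENDCLASS.
```

With the stubs as written, every request is passed on to the next handler, so the skeleton is safe to activate while the two checks are being filled in.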
Conclusion
Again, no rocket science this time. Once you know the HTTP handler basics, it’s really easy to understand. I’m sure that this method will be much easier to implement and maintain than calling the check from every page.