
Honey, I couldn’t get no satisfaction when shrinking the spam

After writing and publishing the BSP port of the http:BL mechanism, which is meant to prevent harvesters from gathering e-mail addresses on your website, I was left with a vague sense of dissatisfaction. As the comments say, http:BL is a great initiative. The BSP port isn’t rocket science, so there isn’t anything specifically wrong with it. Except for one thing. The original Apache version of http:BL works at server level, whereas the BSP port, as demonstrated in the web log(s), works at application and even page level. That means you need to insert the method code into every application and trigger it in every page that you want to protect. That’s a bit of overkill. I thought there must be a way to let the BSP port act in the same manner as the Apache module. Indeed, all one needs to do is extend the BSP rendering engine so that it checks the trustworthiness of the IP address before the page is displayed. One would have thought that this could be achieved by writing one’s own BSP extensions, but that still leaves us with the same problems as mentioned above.
Searching the SDN web logs and the more than excellent Advanced BSP Programming book by Brian and Thomas, I discovered that I needed to write an HTTP handler. The technology isn’t that difficult to understand and implement, so I wanted to go one step further and implement something that is not yet available in the Apache module. In some cases one needs to prevent IP checking. There are several reasons for that: the page doesn’t contain any data that is valuable to harvesters, or the application is only callable from an intranet (and hopefully you don’t have any harvesters over there). The main reason, from my point of view, is that checking IP addresses will prevent the honeypot project from working properly. The http:BL mechanism would keep harvesters away from the page that the honeypot is installed on, and letting the harvesters visit that page is precisely what we want.

Step by step
What follows is a step-by-step guide on how to implement the code. I won’t reinvent the wheel and explain everything about HTTP handlers. I gladly refer you to the above book and the web logs by the two BSP gurus.

Step 1: Create a table

The handler code reads this table as ZEU_HTTPBL and selects on its fields APPLICATION and PAGENAME.

Make it maintainable via transaction SM30 and enter the application and page name (both in capitals) of every page that you don’t want to protect. This is a sort of white list.

Step 2: Create a class.
Since HTTP handlers are in fact nothing more than classes, we need to make one. I called it ZCL_HTTPBL. Implement the interface IF_HTTP_EXTENSION and put the following code in the HANDLE_REQUEST method:

  …
    IF rc GT 0.
      html = '\n\n\n\n'.
*     (a series of CONCATENATE html '…' INTO html. statements appends the
*     terms-of-use page that is shown to the harvester)
      …
      REPLACE ALL OCCURRENCES OF '\n' IN html WITH cl_abap_char_utilities=>newline.
      server->response->set_status( code = 200 reason = 'OK' ).
      server->response->set_header_field( name = if_http_header_fields=>content_type value = 'text/html' ).
      server->response->set_cdata( html ).
*     server->response->set_status( code = 404 reason = 'Not Found' ).
      if_http_extension~flow_rc = if_http_extension=>co_flow_ok.
    ELSE.
      if_http_extension~flow_rc = if_http_extension=>co_flow_ok_others_mand.
    ENDIF.
  ENDIF.
ENDMETHOD.
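For orientation, the class that hosts this method can be sketched in source form as follows. This is only a minimal skeleton under the assumption that the class is created as a plain public class; the comments mark where the handler logic belongs, and the single statement in the method body is a placeholder that simply hands every request on to the next handler.

```abap
CLASS zcl_httpbl DEFINITION
  PUBLIC
  FINAL
  CREATE PUBLIC.

  PUBLIC SECTION.
    " Implementing this interface is what makes the class usable as an HTTP handler
    INTERFACES if_http_extension.
ENDCLASS.

CLASS zcl_httpbl IMPLEMENTATION.
  METHOD if_http_extension~handle_request.
    " The white list check and the http:BL lookup go here.
    " Placeholder behaviour: let the next handler in the list process the request.
    if_http_extension~flow_rc = if_http_extension=>co_flow_ok_others_mand.
  ENDMETHOD.
ENDCLASS.
```

With only the placeholder in place, the handler is transparent: every request falls through to the handler that follows it in the list.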


Step 3: After activating the class, add it to the handler list of the node /default_host/sap/bc/bsp (transaction SICF). Make sure that the handler precedes the CL_HTTP_EXT_BSP handler.


The code

I won’t explain each detail of the code since most of the code has already been explained in this earlier web log. These are the extra things you need to know.

After the data definitions, we retrieve the URL that the user called

  url = server->request->get_header_field( if_http_header_fields_sap=>request_uri ).

And the IP address of that user

  remote = server->request->get_header_field( if_http_header_fields_sap=>remote_addr ).

We’re only interested in the part of the URL before the question mark. The rest are parameters

  SPLIT url AT '?' INTO TABLE itab.
  READ TABLE itab INDEX 1 INTO tmp_string.

We split the URL at each slash

  SPLIT tmp_string AT '/' INTO TABLE itab.

The page name should be specified in the last record

  idx = LINES( itab ).
  READ TABLE itab INDEX idx INTO pagekey.

Is it really a page name, that is, does it contain a dot? We need to test this to catch the case where no page name was specified (relying on the default page)

  IF pagekey NS '.'.


If not, the page name is, in reality, the application name

    applname = pagekey.


And we assign a dummy page name

    pagekey = 'index.htm'.
  ELSE.

If it was a page name, the application name is the preceding record

    idx = idx - 1.
    READ TABLE itab INDEX idx INTO applname.
  ENDIF.


We convert both names to uppercase, because the white list entries are maintained in capitals.
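The parsing steps up to this point can be tried out in a small stand-alone report. This is a minimal sketch: the report name and the sample URL are made up for illustration, and the two TRANSLATE statements are an assumption about how the uppercase conversion is done, since the walk-through does not show them.

```abap
REPORT zhttpbl_url_demo.

* Hypothetical sample URL; in the handler this comes from the request header
DATA: url        TYPE string VALUE '/sap/bc/bsp/sap/zmyapp/default.htm?id=42',
      tmp_string TYPE string,
      itab       TYPE TABLE OF string,
      applname   TYPE string,
      pagekey    TYPE string,
      idx        TYPE i.

* Keep only the part before the query string
SPLIT url AT '?' INTO TABLE itab.
READ TABLE itab INDEX 1 INTO tmp_string.

* Break the path into its segments; the last one should be the page
SPLIT tmp_string AT '/' INTO TABLE itab.
idx = lines( itab ).
READ TABLE itab INDEX idx INTO pagekey.

IF pagekey NS '.'.
* No dot: the last segment is the application, fall back to a dummy page
  applname = pagekey.
  pagekey  = 'index.htm'.
ELSE.
* The application name is the segment in front of the page name
  idx = idx - 1.
  READ TABLE itab INDEX idx INTO applname.
ENDIF.

* Assumed implementation of the uppercase step
TRANSLATE applname TO UPPER CASE.
TRANSLATE pagekey  TO UPPER CASE.

WRITE: / applname, pagekey.
```

For the sample URL the report should list the application ZMYAPP and the page DEFAULT.HTM, which is the form in which the entries are expected in the white list table.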



We check whether the page is in the white list

  SELECT SINGLE * FROM zeu_httpbl INTO wa WHERE application = applname AND pagename = pagekey.
  IF sy-subrc EQ 0.


It is, so we pass control to the next handler and the page will be ‘executed’

    if_http_extension~flow_rc = if_http_extension=>co_flow_ok_others_mand.
    RETURN.
  ELSE.


It is not in the white list, so we reverse the IP address and do an NSLOOKUP. Check my earlier web log for an explanation of how to do that.
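As a reminder of what “reversing the IP” means here, the following sketch builds the host name that is looked up, following the public http:BL query format (access key, then the IP octets in reverse order, then dnsbl.httpbl.org). It is not taken from the earlier web log: the report name, the access key and the sample address are placeholders, and the lookup itself is left out.

```abap
REPORT zhttpbl_query_demo.

DATA: accesskey TYPE string VALUE 'abcdefghijkl', " placeholder, use your own key
      remote    TYPE string VALUE '127.1.1.2',    " sample visitor address
      o1        TYPE string,
      o2        TYPE string,
      o3        TYPE string,
      o4        TYPE string,
      query     TYPE string.

* Split the dotted address into its four octets
SPLIT remote AT '.' INTO o1 o2 o3 o4.

* Access key first, then the octets in reverse order, then the http:BL zone
CONCATENATE accesskey o4 o3 o2 o1 'dnsbl.httpbl.org'
       INTO query SEPARATED BY '.'.

WRITE / query.
```

For the sample values this produces abcdefghijkl.2.1.1.127.dnsbl.httpbl.org. If the address is listed, the DNS answer is itself a dotted quad starting with 127, whose remaining three octets encode the days since last activity, a threat score and the visitor type.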

If the lookup result is positive, we create the HTML to show to the harvester

    IF rc GT 0.
      html = '\n\n\n\n'. …


And give it back to the client

      server->response->set_status( code = 200 reason = 'OK' ).
      server->response->set_header_field( name = if_http_header_fields=>content_type value = 'text/html' ).
      server->response->set_cdata( html ).


The output will look something like this



Alternatively, we can return an HTTP 404 response

*        server->response->set_status( code = 404 reason = 'Not Found' ).

Whichever method is chosen, we need to stop the handler flow so that the called page is not executed

      if_http_extension~flow_rc = if_http_extension=>co_flow_ok.
    ELSE.

It’s not a positive, so we pass control to the next handler and the page will be ‘executed’

      if_http_extension~flow_rc = if_http_extension=>co_flow_ok_others_mand.
    ENDIF.
  ENDIF.


Again, no rocket science this time. Once you know the HTTP handler basics it’s really easy to understand. I’m sure that this method will be much easier to implement and maintain.

P.S. Which type of SDN Ubergeek/BPX suit are you?
