
How to parse HTML content from websites using the Java API delivered by SAP Portal


In this weblog I present Java code that uses the API delivered by SAP Portal to parse HTML content from other websites.

The code described in this weblog extracts the content shown in the following picture.


Create a Dynpage component using the wizard inside NWDS or Eclipse.


Cut and paste the following code into your Dynpage component.

"); response.write(""); response.write(parser.htmltext); } } }

Write a class that extends com.sapportals.portal.prt.util.html.parser.ParserVisitor. You don't have to write any code yourself; just cut and paste the following code.
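To illustrate the visitor idea behind this step, here is a minimal sketch that runs outside the portal. It does not use the SAP ParserVisitor class, whose methods are not shown in this weblog. Instead it swaps in the JDK's own callback-style HTML parser (javax.swing.text.html) as a stand-in. The class name TableTextExtractor, the choice to collect only text inside table tags, and the sample page are all assumptions made for illustration; the htmlText field only mirrors the parser.htmltext name seen in the Dynpage code.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical stand-in for a ParserVisitor subclass: the parser walks the
// document and calls back into this class for each tag and text node.
public class TableTextExtractor extends HTMLEditorKit.ParserCallback {

    // Collected text; the name echoes parser.htmltext from the Dynpage code.
    private final StringBuilder htmlText = new StringBuilder();

    // Greater than zero while the parser is inside at least one <table>.
    private int tableDepth = 0;

    @Override
    public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
        if (HTML.Tag.TABLE.equals(tag)) {
            tableDepth++;
        }
    }

    @Override
    public void handleEndTag(HTML.Tag tag, int pos) {
        if (HTML.Tag.TABLE.equals(tag) && tableDepth > 0) {
            tableDepth--;
        }
    }

    @Override
    public void handleText(char[] data, int pos) {
        // Keep only text that appears inside a table; skip everything else.
        if (tableDepth > 0) {
            if (htmlText.length() > 0) {
                htmlText.append(' ');
            }
            htmlText.append(data);
        }
    }

    // Parses the given HTML string and returns the text found inside tables.
    public static String extract(String html) throws IOException {
        TableTextExtractor visitor = new TableTextExtractor();
        new ParserDelegator().parse(new StringReader(html), visitor, true);
        return visitor.htmlText.toString();
    }

    public static void main(String[] args) throws IOException {
        // Assumed sample page; a real component would fetch the remote HTML first.
        String page = "<html><body><p>intro</p>"
                + "<table><tr><td>SAP</td><td>Portal</td></tr></table>"
                + "</body></html>";
        System.out.println(extract(page));
    }
}
```

The portal version should follow the same shape: the visitor accumulates the pieces of the page you care about in a field, and the Dynpage component writes that field to the response once parsing is finished.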


Deploy your component inside the portal and run it. You should get the following content.


  • Hi Rich,
        Thank you for the feedback. I was trying to put some humour in the heading. I will fix the heading.

    Prakash Singh

  • Hi Prakash:
    Nice weblog.
    You have just given a bare bones explanation about
    the functionality of the parser provided by SAP.
    We need more meat on it!
    I feel the following areas need covering.
    a.) In either an intranet or an internet situation, why
    would anyone want to do this kind of parsing?
    b.) What are the other capabilities of this parser?
    c.) What are the JAR files we need to import into
    our NWDS?
    d.) From which version of EP are these APIs available?
    e.) Are there any related APIs provided by SAP?

    With all this surrounding but useful material, I
    can remember it, and probably use it, whenever
    any similar scenario arises. I know this information
    is not needed when everybody is as smart a techie as
    you are.

    Prasad Nutalapati

    • Thank you for your feedback. The code is pretty much self-explanatory, so why don't you try it out? You learn more from debugging the code than from somebody explaining every line of it to you. No external JARs are being used; NWDS should automatically find the JAR it needs (prtapi.jar).


  • Hi Prakash,

    Sorry to bother you about this but …

    The code in your article does not display correctly because the