This document explains one way to read the content of a password-protected PDF file using a custom SAP PI adapter module.

Several APIs are available for reading PDF data; iText, Aspire and PDFBox are some that I came across.

I chose the iText API because it is open source and, to my knowledge, easy to use and understand.

Of the iText versions available I used itext-1.4.8.jar and, as stated in the document below, bcprov-jdk14-138.jar is needed as well.

http://scn.sap.com/docs/DOC-34713
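Additional jars that a library expects can be declared in the Class-Path attribute of its MANIFEST.MF. As a rough sketch, assuming you want to check what a jar declares before deploying it, the standard java.util.jar classes can list those entries. The class name ManifestClassPath and its method are my own illustration, not part of iText or SAP PI.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical helper: lists the jars named in a manifest's Class-Path attribute.
final class ManifestClassPath {

    // Returns the Class-Path entries of the main section, or an empty list when absent.
    static List<String> classPathEntries(Manifest manifest) {
        List<String> entries = new ArrayList<String>();
        String classPath = manifest.getMainAttributes().getValue(Attributes.Name.CLASS_PATH);
        if (classPath != null) {
            // Entries are separated by whitespace.
            for (String entry : classPath.trim().split("\\s+")) {
                if (!entry.isEmpty()) {
                    entries.add(entry);
                }
            }
        }
        return entries;
    }

    // Usage: java ManifestClassPath <path-to-jar>
    public static void main(String[] args) throws Exception {
        try (JarFile jar = new JarFile(args[0])) {
            Manifest manifest = jar.getManifest();
            if (manifest == null) {
                System.out.println("Jar has no manifest");
            } else {
                System.out.println(classPathEntries(manifest));
            }
        }
    }
}
```

An empty result only means the manifest declares nothing; the library may still need other jars at runtime.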

As input I used a sample PDF file that was password protected with the same iText jar, created with the code from the document above.

To keep this document simple, I read the data with the PdfReader class and convert it to a String; this can be enhanced with XML parsers to produce XML data, depending on the requirement.

Module code for reading a password-protected PDF file:
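As a minimal sketch of the XML enhancement mentioned above, the extracted text could be wrapped line by line into a small XML document using only the JDK's DOM and Transformer classes. The class name PdfTextToXml and the element names PdfContent and Line are assumptions chosen for illustration; a real interface would follow the target message structure.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: turns newline-separated text into <PdfContent><Line>..</Line></PdfContent>.
final class PdfTextToXml {

    static String toXml(String extractedText) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("PdfContent");
            doc.appendChild(root);
            for (String line : extractedText.split("\n")) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines
                }
                Element element = doc.createElement("Line");
                element.setTextContent(line); // special characters are escaped on output
                root.appendChild(element);
            }
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | TransformerException e) {
            throw new IllegalStateException("Could not build XML from extracted text", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXml("first line\nAmount < 100\n"));
    }
}
```

In the module, the returned string would replace the plain text before it is converted to bytes and set as the payload content.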

import javax.ejb.SessionBean;

import javax.ejb.SessionContext;

import java.io.InputStream;

import java.io.ByteArrayOutputStream;  

import com.lowagie.text.pdf.*;

import com.sap.engine.interfaces.messaging.api.*;

import com.sap.aii.af.lib.mp.module.*;

import com.sap.engine.interfaces.messaging.api.Message;

import com.sap.engine.interfaces.messaging.api.MessageKey;

import com.sap.engine.interfaces.messaging.api.PublicAPIAccessFactory;

import com.sap.engine.interfaces.messaging.api.auditlog.AuditAccess;

import com.sap.engine.interfaces.messaging.api.auditlog.AuditLogStatus;

  /**

* @author Tilak Borra

* @version 1.0

* Comments: This code works only with PDF files created without additional formats.

*           The decryption works only if BCPROV-jdk14-146.jar is included as well.

*/

public class Read_PwdProtectedPDFBean implements SessionBean, Module {

    public static final String VERSION_ID = "$Id://tc/aii/30_REL/src/_adapters/_sample/java/user/module/Read_PwdProtectedPDFBean.java#1 $";  

    static final long serialVersionUID = 8555850550539048633L;  

    public void ejbRemove() {  

    }  

    public void ejbActivate() {  

    }  

    public void ejbPassivate() {  

    }  

    public void setSessionContext(SessionContext context) {  

    }  

    public void ejbCreate() throws javax.ejb.CreateException {  

    }  

    public ModuleData process(ModuleContext  moduleContext,  

               ModuleData inputModuleData) throws ModuleException {  

      AuditAccess audit = null;  

      Object obj = null;  

      Message msg = null;  

      MessageKey key = null;  

      Payload pload = null;

      String ErrorText = "";

      String Ownerpassword; // Password required to open PDF  

          try {  

              obj = inputModuleData.getPrincipalData();  

              msg = (Message) obj;          

              key = new MessageKey(msg.getMessageId(), msg.getMessageDirection());  

              audit = PublicAPIAccessFactory.getPublicAPIAccess().getAuditAccess();  

              audit.addAuditLogEntry(key, AuditLogStatus.SUCCESS,"Payload content passed to bean to read data");  

              //To get Message Payload  

              pload = msg.getMainPayload();

              if (pload == null) {

                  throw new NullPointerException("Main payload of the message is null");

              }

              //Get the payload data as an InputStream to pass to PdfReader

              InputStream is = pload.getInputStream();

              if (is != null) {

                  audit.addAuditLogEntry(key, AuditLogStatus.SUCCESS, "Payload obtained as InputStream to pass to PdfReader");

              } else {

                  //A missing stream is a failure, so it is logged with ERROR status

                  audit.addAuditLogEntry(key, AuditLogStatus.ERROR, "Payload content could not be obtained as an InputStream");

              }

              //Read the runtime parameter passed from the communication channel as an adapter module parameter

              Ownerpassword = (String) moduleContext.getContextData("pwd_Owner");

              if (Ownerpassword != null) {

                  audit.addAuditLogEntry(key, AuditLogStatus.SUCCESS, "Fetched password from module parameters");

              } else {

                  audit.addAuditLogEntry(key, AuditLogStatus.ERROR, "Could not read password from module parameters");

                  //Fail here with a clear message instead of an unexplained NullPointerException at getBytes() below

                  throw new NullPointerException("Module parameter pwd_Owner is not set");

              }

              //PDFReader to read protected file

              PdfReader readerx = new PdfReader(is,Ownerpassword.getBytes("UTF-8"));

              audit.addAuditLogEntry(key, AuditLogStatus.SUCCESS," Inputstream data passed to PDFReader");

              ByteArrayOutputStream outputs = new ByteArrayOutputStream();

              //PdfEncryptor uses the PdfWriter permission flags to write a copy of the PDF

              //with the given permissions to 'outputs' (not used further in this module)

              PdfEncryptor.encrypt(readerx, outputs, null,

                         null, PdfWriter.AllowCopy

                         | PdfWriter.AllowDegradedPrinting | PdfWriter.AllowPrinting | PdfWriter.AllowScreenReaders, false);

              audit.addAuditLogEntry(key, AuditLogStatus.SUCCESS,"Data read from PDF file successfully");

              //Read the number of pages in the PDF File

              int n = readerx.getNumberOfPages();

              //String Buffer to store the PDF Page content

              StringBuffer bufferx = new StringBuffer();

              //This loop reads each PDF page content starting from top to bottom

              for(int i=1;i<=n;i++){

                  PdfDictionary dictionaryx = readerx.getPageN(i);

                  PRIndirectReference referencex = (PRIndirectReference) dictionaryx.get(PdfName.CONTENTS);

                  PRStream streamx = (PRStream) PdfReader.getPdfObject(referencex);

                  //The page content is converted into Byte format and then data is read into String Buffer

                  byte[] bytesx = PdfReader.getStreamBytes(streamx);

                  PRTokeniser tokenizerx = new PRTokeniser(bytesx);                       

                  while (tokenizerx.nextToken()) {

                      if (tokenizerx.getTokenType() == PRTokeniser.TK_STRING) {

                          bufferx.append((tokenizerx.getStringValue()+ "\n"));

                        }

                  }

              }

              audit.addAuditLogEntry(key, AuditLogStatus.SUCCESS,"Data converted to string successfully");

              //converting the String Buffer Data into Byte Array  

              byte byt[] = String.valueOf(bufferx).getBytes("UTF-8");

              //If the PDF file contains a lot of data, consider removing the audit log entry below

              audit.addAuditLogEntry(key, AuditLogStatus.SUCCESS,"The actual content of the file is "   

                                               +bufferx.toString());  

              //set the payload in bytes  

              pload.setContent(byt);

              //set the main payload data with the payload content

              msg.setMainPayload(pload);

              //Passing the main payload to inputmoduledata  

              inputModuleData.setPrincipalData(msg); 

              audit.addAuditLogEntry(key, AuditLogStatus.SUCCESS,"Payload content set and passed for further processing");

           }  catch (NullPointerException e) {

               //Keep the original message and cause so the failure can be traced

               throw new ModuleException("Null pointer error occurred while reading the PDF payload: " + e.getMessage(), e);

           }

          //Exception Handling  

          catch (Exception e) {  

              if (inputModuleData != null) {

                  throw new ModuleException("Exception occurred while processing the PDF payload, input module data is NOT null", e);

              } else {

                  throw new ModuleException("Exception occurred while processing the PDF payload, input module data is null", e);

              }

          }           

          return inputModuleData;  

        }  

    } 

Configuring the Sender Communication Channel with the Module:

/wp-content/uploads/2013/02/clip_image002_183729.jpg

The only value passed through the module keys is the password, which the module reads from the parameter pwd_Owner. Because the parameter name starts with ‘pwd_’, it is treated as a password text field.

For my sample I used “sap@123”.

The content of the file is pretty simple sample data:
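The password lookup can be sketched in isolation as below. This is only an illustration under assumptions: a plain Map stands in for the module context, and the class name OwnerPassword is hypothetical. The point is to reject a missing or empty pwd_Owner value before it is converted to bytes for the PDF reader.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: validates the pwd_Owner module parameter and returns it as bytes.
final class OwnerPassword {

    static final String PARAM_NAME = "pwd_Owner"; // key read by the module code

    // The Map is a stand-in for the module context (an assumption of this sketch).
    static byte[] resolve(Map<String, String> moduleParameters) {
        String value = moduleParameters.get(PARAM_NAME);
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException(
                    "Module parameter " + PARAM_NAME + " is missing or empty");
        }
        return value.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Map<String, String> parameters = new HashMap<String, String>();
        parameters.put(PARAM_NAME, "sap@123");
        System.out.println(resolve(parameters).length + " password bytes");
    }
}
```

Failing early gives a clearer error in the audit log than a NullPointerException raised later during decryption.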

/wp-content/uploads/2013/02/clip_image002_183729.jpg

Once the file adapter picks up the file from the location, we can see the status as below:

/wp-content/uploads/2013/02/clip_image002_183729.jpg

All the log entries we added in the code are shown as well, along with the data:

/wp-content/uploads/2013/02/clip_image002_183729.jpg

The message content looks as below:

/wp-content/uploads/2013/02/clip_image002_183729.jpg

A couple of errors I faced during testing:

1. If the MANIFEST.MF of the iText jar file lists any additional jar requirements, we need to add those as well; otherwise exceptions are raised during deployment.

While creating this bean I first used itext-2.1.3.jar; however, it also needs iText-toolbox.jar, which I could not find, so I used itext-1.4.8.jar.

2. If the PDF file is not in a standard format, we will have issues reading the data and the module will throw the error “PDF header signature not found”.
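To catch the second error earlier, the payload could be checked for the PDF header before it is handed to the reader. The sketch below assumes the payload is available as a byte array and tests for the "%PDF-" signature at the very start. That strict position is my simplification and may not match how iText itself searches for the header; the class name PdfHeaderCheck is hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical pre-check: does the content begin with the "%PDF-" signature?
final class PdfHeaderCheck {

    private static final byte[] MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    static boolean startsWithPdfHeader(byte[] content) {
        if (content == null || content.length < MAGIC.length) {
            return false; // too short to contain the signature
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (content[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] sample = "%PDF-1.4\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(startsWithPdfHeader(sample)); // prints "true"
    }
}
```

A failed check can be written to the audit log with a message that names the file type problem directly.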

1 Comment

  1. Former Member

    Hi Tilak,

    It seems to be quite innovative way of work around.

    Hope it will not trigger an alert “Forgot your password? …” again. 🙂

    Keep updated.!

    Regards,

    Sunil
