Java Program to Format your SDN Weblog Content
h2. Introduction
Every SDN blogger faces the same problem: formatting the weblog content before posting it. Formatting mostly means removing the unnecessary tags and keeping only the tags that SDN Weblogs allow, which is a painful task. To avoid it, I came up with a small solution that I would like to share with my fellow bloggers through this blog.
You can also find an equivalent ABAP program by Brian McKellar (https://www.sdn.sap.com/irj/servlet/prt/portal/prtroot/com.sap.sdn.businesscard.sdnbusinesscard?u=s1ett4b5nss%3d) in his weblog, The 1-2-3 Steps To Producing a Weblog.
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Vector;
import org.cyberneko.html.parsers.DOMParser;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;
/**
 * @author Felix Jeyareuben, Cognizant Technology Solutions
 */
public class FormatSDNWeblog {
    static Vector vTags = null;
    static Vector noEndTags = null;
    // Set when the most recently visited element is an allowed tag;
    // text nodes are written only while it is true
    static boolean _flag = false;
    static FileOutputStream fos = null;
    static String outputFile;

    public static void main(String[] args) {
        try {
            if (args.length < 1) {
                System.out.println("Usage: java FormatSDNWeblog <html-file>");
                System.exit(-1);
            }
            String inputFile = args[0];
            outputFile = "sdn_" + inputFile;
            /*
             * Ref: SDN: Weblogs and Formatting! By Craig Cmehil
             */
            String validTags[] = { "p", "b", "i", "em", "strong", "code", "tt",
                    "br", "a", "sub", "sup", "ul", "ol", "li", "pre", "img",
                    "blockquote", "small", "div", "hr", "h2", "h3", "h4", "h5",
                    "table", "tr", "td", "th", "center", "textarea" };
            // Tags that are written without an end tag
            noEndTags = new Vector();
            noEndTags.add("img");
            noEndTags.add("br");
            noEndTags.add("hr");
            vTags = new Vector();
            for (int i = 0; i < validTags.length; i++)
                vTags.add(validTags[i]);
            DOMParser parser = new DOMParser();
            fos = new FileOutputStream(new File(outputFile));
            parser.parse(inputFile);
            // A recursive function which does the stripping of unnecessary tags
            SDNParser(parser.getDocument(), "");
            fos.close();
            System.out.println("Filtered html successfully converted into SDN Weblog Content as "
                    + outputFile + "!");
        } catch (FileNotFoundException e) {
            System.out.println(e.getMessage());
        } catch (SAXException e) {
            System.out.println(e.getMessage());
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
    public static void SDNParser(Node node, String intend) throws IOException {
        String _node = "";
        Node ch = null;
        // To check if the current node is a TAG
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            _node = node.getNodeName();
            // To remove unnecessary paragraphs: skip a <p> whose first child
            // is text starting with a non-breaking space (character 160)
            if (_node.equalsIgnoreCase("P")) {
                ch = node.getFirstChild();
                if (ch != null && ch.getNodeType() == Node.TEXT_NODE
                        && ch.getNodeValue().length() > 0
                        && ch.getNodeValue().charAt(0) == 160)
                    return;
            }
            // To check if the current TAG is a valid one
            if (vTags.contains(_node.toLowerCase())) {
                _flag = true;
                fos.write(("\n" + intend + "<" + _node).getBytes());
                // Iterating through the attributes of the current node
                NamedNodeMap a = node.getAttributes();
                if (a != null) {
                    for (int i = 0; i < a.getLength(); i++) {
                        String attr = a.item(i).getNodeName().toLowerCase();
                        // Removing the 'class' and 'style' attributes which
                        // might be found in the valid allowed TAGS
                        if (attr.startsWith("class") || attr.startsWith("style"))
                            continue;
                        fos.write((" " + a.item(i).getNodeName() + "=\""
                                + a.item(i).getNodeValue() + "\"").getBytes());
                    }
                }
                // Close the opening tag even when there are no attributes
                fos.write(">".getBytes());
            } else
                _flag = false;
            for (Node child = node.getFirstChild(); child != null; child = child
                    .getNextSibling())
                // Recursive call to its child node
                SDNParser(child, intend + " ");
            // Ending the tag
            if (vTags.contains(_node.toLowerCase())
                    && !noEndTags.contains(_node.toLowerCase()))
                fos.write(("\n" + intend + "</" + _node + ">").getBytes());
        } else {
            // Else part is text and isn't any TAG
            // To check that it is not the root document
            if (node.getNodeType() != Node.DOCUMENT_NODE)
                if (_flag && node.getNodeValue() != null)
                    fos.write(node.getNodeValue().trim().getBytes());
            for (Node child = node.getFirstChild(); child != null; child = child
                    .getNextSibling())
                SDNParser(child, intend + " ");
        }
    }
}
The above code contains all the valid tags allowed in SDN Weblogs.
Ref: SDN: Weblogs and Formatting! by Craig Cmehil
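The whitelist idea behind the validTags list can be tried in isolation with a small sketch. It is not the program above: it assumes the JDK's built-in XML parser instead of NekoHTML, so it only accepts well-formed markup, it keeps the text inside stripped tags, and it does not escape attribute values. The names WhitelistSketch, filter, ALLOWED and NO_END_TAG, as well as the shortened tag list, are made up for illustration.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class WhitelistSketch {
    // Shortened, illustrative whitelist; the full list is validTags in the program
    static final Set<String> ALLOWED =
            new HashSet<>(Arrays.asList("p", "b", "i", "a", "br"));
    // Tags written without an end tag, as in noEndTags
    static final Set<String> NO_END_TAG =
            new HashSet<>(Arrays.asList("img", "br", "hr"));

    /** Parses well-formed markup and returns it without the non-whitelisted tags. */
    public static String filter(String xhtml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));
        StringBuilder out = new StringBuilder();
        walk(doc.getDocumentElement(), out);
        return out.toString();
    }

    private static void walk(Node node, StringBuilder out) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            // Text is always kept, even inside a stripped tag
            out.append(node.getNodeValue());
            return;
        }
        if (node.getNodeType() != Node.ELEMENT_NODE)
            return;
        String name = node.getNodeName().toLowerCase();
        boolean keep = ALLOWED.contains(name);
        if (keep) {
            out.append('<').append(name);
            NamedNodeMap attrs = node.getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                String attr = attrs.item(i).getNodeName().toLowerCase();
                // Drop the presentation attributes that Word adds
                if (attr.equals("class") || attr.equals("style"))
                    continue;
                out.append(' ').append(attr).append("=\"")
                        .append(attrs.item(i).getNodeValue()).append('"');
            }
            out.append('>');
        }
        // Children are visited whether or not the tag itself is kept
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling())
            walk(c, out);
        if (keep && !NO_END_TAG.contains(name))
            out.append("</").append(name).append('>');
    }

    public static void main(String[] args) throws Exception {
        String in = "<p class=\"MsoNormal\" style=\"margin:0\">Hello "
                + "<span><b>SDN</b></span><br/></p>";
        System.out.println(filter(in)); // prints <p>Hello <b>SDN</b><br></p>
    }
}
```

Using a Set for the lookups is only a convenience here; the Vector in the program behaves the same for a list this short.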
h2. Step-by-Step Demo
After completing the blog, click Save As in the Word document.
Select Web Page, Filtered as the file type.
Click Yes to save it as HTML.
When we open the saved HTML in Notepad, there are many unnecessary tags and attributes.
Execute the following commands:
javac -classpath nekohtml.jar;xercesImpl.jar;xmlParserAPIs.jar FormatSDNWeblog.java
java -classpath nekohtml.jar;xercesImpl.jar;xmlParserAPIs.jar;. FormatSDNWeblog weblog.htm
Make sure javac.exe and java.exe are in your PATH.
The generated output file, sdn_weblog.htm, contains only the tags allowed by SDN.
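The program builds the output name as "sdn_" + inputFile, which works when the input is a bare file name as in the command above. If a path such as drafts/weblog.htm is passed instead, the prefix ends up in front of the directory name. One possible way around this, sketched under the assumption that the output should sit next to the input, is to prefix only the file name. OutputNameSketch and outputFor are hypothetical names and are not part of the program.

```java
import java.io.File;

public class OutputNameSketch {
    /** Returns a file next to the input whose name carries the "sdn_" prefix. */
    public static File outputFor(String inputFile) {
        File in = new File(inputFile);
        // getParent() is null for a bare file name; File(String, String)
        // then behaves like the single-argument constructor
        return new File(in.getParent(), "sdn_" + in.getName());
    }

    public static void main(String[] args) {
        System.out.println(outputFor("weblog.htm").getPath());
        System.out.println(outputFor("drafts/weblog.htm").getPath());
    }
}
```

In the program this would replace the line that assigns outputFile, with the FileOutputStream opened on the returned File.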





Did you know that there is also an ABAP weblog formatting program available:
The 1-2-3 Steps To Producing a Weblog.
Regards
Gregor
I certainly know about Brian McKellar's ABAP program. Any blogger in SDN must have read his blog, The 1-2-3 Steps To Producing a Weblog! But many people like me, who don't know ABAP and/or don't have 'developer access', may not be able to create the ABAP program.
Most importantly, to execute an ABAP program you need R/3 access through SAP GUI, whereas to execute a Java program you simply need a JRE on your OS. Hence, it is far easier for anyone to execute a Java program than an ABAP program. 😉
I got the idea for this blog from SDN: Weblogs and Formatting! by Craig Cmehil, because he clearly listed the allowed tags. Previously, many tags were allowed, and I don't know whether the ABAP program has been updated to reflect the current changes in SDN Weblogs (since I am not an ABAPer). Hence I decided to write a Java program for it.
Best regards,
Felix
Don't get me wrong. I think it is great to have this Java program from you. But I think you should also provide links to the other weblogs.
Regards
Gregor
I didn't get you wrong 🙂
I have also updated my blog to include links to the other weblogs.
Best regards,
Felix