DomXmlParserWhitespace.java - Parse XML File without Whitespaces
How to parse an XML file with the DOM API without including whitespaces between XML elements?
✍: FYIcenter
In many cases, whitespaces are included in XML files before and after XML elements
to make the files more readable.
For example, the following XML file, User.xml, includes whitespaces:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!-- Copyright (c) 2017 FYIcenter.com -->
<User>
  <ID>101</ID>
  <BirthDate>1970-01-01+00:01</BirthDate>
  <Name>Frank Y. Ivy</Name>
  <Sex> Male</Sex>
</User>
If you want the DOM XML parser to ignore whitespaces, you need to do two things:
1. Add a DTD (Document Type Definition) that defines the element structure, as shown in UserDTD.xml:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!-- Copyright (c) 2017 FYIcenter.com -->
<!DOCTYPE User [
  <!ELEMENT User (ID, BirthDate, Name, Sex)>
  <!ELEMENT ID (#PCDATA)>
  <!ELEMENT BirthDate (#PCDATA)>
  <!ELEMENT Name (#PCDATA)>
  <!ELEMENT Sex (#PCDATA)>
]>
<User>
  <ID>101</ID>
  <BirthDate>1970-01-01+00:01</BirthDate>
  <Name>Frank Y. Ivy</Name>
  <Sex> Male</Sex>
</User>
2. Tell the parser to ignore whitespaces by calling setIgnoringElementContentWhitespace(true) on the DocumentBuilderFactory, as shown in DomXmlParserWhitespace.java:
// Copyright (c) 2017 FYIcenter.com
import java.io.*;
import javax.xml.parsers.*;
import org.w3c.dom.*;

public class DomXmlParserWhitespace {
   // Used to indent each output line by the node's depth
   static String dot = "............................................................";

   // Usage: java DomXmlParserWhitespace <xml-file> <true|false>
   public static void main(String[] args) throws Exception {
      DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
      // args[1] controls whether element content whitespace is dropped
      f.setIgnoringElementContentWhitespace(Boolean.parseBoolean(args[1]));
      DocumentBuilder b = f.newDocumentBuilder();
      Document d = b.parse(new File(args[0]));
      System.out.println("Implementation class:\n "+d.getClass().getName());

      System.out.println("DOM object elements and text contents:");
      Node n = d.getDocumentElement();
      printText(n, 1);
   }

   // Prints the node name (and the text of text nodes), then recurses into children
   public static void printText(Node n, int l) {
      String v = "";
      if (n.getNodeType()==Node.TEXT_NODE) v = n.getTextContent();
      System.out.println(dot.substring(0,l)+n.getNodeName()+":"+v);
      NodeList c = n.getChildNodes();
      for (int i=0; i<c.getLength(); i++) {
         printText(c.item(i),l+1);
      }
   }
}
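To see why step 1 matters, here is a minimal sketch that parses the content of User.xml (no DTD) from an in-memory string and counts the child nodes of the root element with the flag off and on. The class name NoDtdWhitespaceCheck and the helper countRootChildren are hypothetical names used only for this illustration. Without a DTD the parser has no content model, so it cannot tell which whitespace is element content whitespace, and both counts should come out the same.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original example.
public class NoDtdWhitespaceCheck {
   // Same elements as User.xml, but without any DTD.
   public static final String XML =
        "<User>\n"
      + "  <ID>101</ID>\n"
      + "  <BirthDate>1970-01-01+00:01</BirthDate>\n"
      + "  <Name>Frank Y. Ivy</Name>\n"
      + "  <Sex> Male</Sex>\n"
      + "</User>\n";

   // Parses the XML string and returns how many child nodes
   // (elements and text nodes) the root element has.
   public static int countRootChildren(String xml, boolean ignore) throws Exception {
      DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
      f.setIgnoringElementContentWhitespace(ignore);
      DocumentBuilder b = f.newDocumentBuilder();
      Document d = b.parse(new InputSource(new StringReader(xml)));
      return d.getDocumentElement().getChildNodes().getLength();
   }

   public static void main(String[] args) throws Exception {
      System.out.println("flag=false: " + countRootChildren(XML, false));
      System.out.println("flag=true:  " + countRootChildren(XML, true));
   }
}
```

If both lines print the same count, the whitespace text nodes between the child elements were kept in both runs, which is the behavior the DTD in step 1 is meant to change.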
Compile and run the example program, DomXmlParserWhitespace.java, with setIgnoringElementContentWhitespace(false):
>\fyicenter\jdk-1.8.0\bin\javac DomXmlParserWhitespace.java

>\fyicenter\jdk-1.8.0\bin\java DomXmlParserWhitespace UserDTD.xml false
Implementation class:
 com.sun.org.apache.xerces.internal.dom.DeferredDocumentImpl
DOM object elements and text contents:
.User:
..#text:
..ID:
...#text:101
..#text:
..BirthDate:
...#text:1970-01-01+00:01
..#text:
..Name:
...#text:Frank Y. Ivy
..#text:
..Sex:
...#text: Male
..#text:
Run it again with setIgnoringElementContentWhitespace(true):
>\fyicenter\jdk-1.8.0\bin\java DomXmlParserWhitespace UserDTD.xml true
Implementation class:
 com.sun.org.apache.xerces.internal.dom.DeferredDocumentImpl
DOM object elements and text contents:
.User:
..ID:
...#text:101
..BirthDate:
...#text:1970-01-01+00:01
..Name:
...#text:Frank Y. Ivy
..Sex:
...#text: Male
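The output shows that, with the flag set to true, the Apache Xerces parser bundled with the JDK uses the DTD definitions to drop the whitespace-only #text nodes between the child elements of User.
Note that the leading space in " Male" is kept, because it is character data inside the Sex element, not whitespace between elements.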
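The same comparison can be reproduced without any files. The sketch below, assuming the JDK's default parser behaves as in the runs above, parses the UserDTD.xml content from a string with the flag off and on, then reads the text of the Sex element. The class name DtdWhitespaceCheck and the helper parseRoot are hypothetical names introduced for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original example.
public class DtdWhitespaceCheck {
   // Same DTD and elements as UserDTD.xml, held in a string.
   public static final String XML =
        "<!DOCTYPE User [\n"
      + "  <!ELEMENT User (ID, BirthDate, Name, Sex)>\n"
      + "  <!ELEMENT ID (#PCDATA)>\n"
      + "  <!ELEMENT BirthDate (#PCDATA)>\n"
      + "  <!ELEMENT Name (#PCDATA)>\n"
      + "  <!ELEMENT Sex (#PCDATA)>\n"
      + "]>\n"
      + "<User>\n"
      + "  <ID>101</ID>\n"
      + "  <BirthDate>1970-01-01+00:01</BirthDate>\n"
      + "  <Name>Frank Y. Ivy</Name>\n"
      + "  <Sex> Male</Sex>\n"
      + "</User>\n";

   // Parses the XML string and returns the root element.
   public static Element parseRoot(String xml, boolean ignore) throws Exception {
      DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
      f.setIgnoringElementContentWhitespace(ignore);
      return f.newDocumentBuilder()
              .parse(new InputSource(new StringReader(xml)))
              .getDocumentElement();
   }

   public static void main(String[] args) throws Exception {
      Element keep = parseRoot(XML, false);
      Element drop = parseRoot(XML, true);
      System.out.println("children, flag=false: " + keep.getChildNodes().getLength());
      System.out.println("children, flag=true:  " + drop.getChildNodes().getLength());

      // Whitespace inside a #PCDATA element is data, so trim it yourself if needed.
      String sex = drop.getElementsByTagName("Sex").item(0).getTextContent();
      System.out.println("Sex raw: [" + sex + "], trimmed: [" + sex.trim() + "]");
   }
}
```

The two child counts should match the two runs above: nine nodes under User with the flag off, four with it on. If the leading space in the Sex value is unwanted, calling trim() on the text content is one simple way to remove it, since the parser flag does not touch it.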
2017-12-13