DomXmlParserWhitespace.java - Parse XML File without Whitespaces
How to parse an XML file with the DOM API without including whitespaces between XML elements?
✍: FYIcenter
In many cases, whitespaces are included in XML files before and after XML elements
to make the XML file more readable.
For example, the following XML file, User.xml, includes whitespaces:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!-- Copyright (c) 2017 FYIcenter.com -->
<User>
  <ID>101</ID>
  <BirthDate>1970-01-01+00:01</BirthDate>
  <Name>Frank Y. Ivy</Name>
  <Sex> Male</Sex>
</User>
If you want the DOM XML parser to ignore whitespaces, you need to do two things:
1. Add a DTD (Document Type Definition) that defines the element structure, as shown in UserDTD.xml:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!-- Copyright (c) 2017 FYIcenter.com -->
<!DOCTYPE User [
  <!ELEMENT User (ID, BirthDate, Name, Sex)>
  <!ELEMENT ID (#PCDATA)>
  <!ELEMENT BirthDate (#PCDATA)>
  <!ELEMENT Name (#PCDATA)>
  <!ELEMENT Sex (#PCDATA)>
]>
<User>
  <ID>101</ID>
  <BirthDate>1970-01-01+00:01</BirthDate>
  <Name>Frank Y. Ivy</Name>
  <Sex> Male</Sex>
</User>
2. Tell the parser to ignore whitespaces by calling setIgnoringElementContentWhitespace(true) on the DocumentBuilderFactory, as shown in DomXmlParserWhitespace.java:
// Copyright (c) 2017 FYIcenter.com
import java.io.*;
import javax.xml.parsers.*;
import org.w3c.dom.*;

public class DomXmlParserWhitespace {
   // Prefix source: one more dot is printed for each nesting level
   static String dot = "............................................................";

   // args[0]: XML file to parse; args[1]: "true" or "false" for the whitespace flag
   public static void main(String[] args) throws Exception {
      DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
      f.setIgnoringElementContentWhitespace(Boolean.parseBoolean(args[1]));
      DocumentBuilder b = f.newDocumentBuilder();
      Document d = b.parse(new File(args[0]));
      System.out.println("Implementation class:\n "+d.getClass().getName());
      System.out.println("DOM object elements and text contents:");
      Node n = d.getDocumentElement();
      printText(n, 1);
   }

   // Print the node name (and the text, for text nodes), then recurse into the children
   public static void printText(Node n, int l) {
      String v = "";
      if (n.getNodeType()==Node.TEXT_NODE) v = n.getTextContent();
      System.out.println(dot.substring(0,l)+n.getNodeName()+":"+v);
      NodeList c = n.getChildNodes();
      for (int i=0; i<c.getLength(); i++) {
         printText(c.item(i),l+1);
      }
   }
}
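Step 1 matters because the parser can only tell which whitespaces are ignorable when it knows the content model of each element. The following is a minimal sketch of that point, not part of the original example: the class name NoDtdWhitespaceCheck, the helper countRootChildren and the in-memory XML string are assumptions made for illustration. It parses the DTD-less User.xml content with the flag off and on, and counts the direct children of the User element.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical class name; an illustration, not part of the original example.
class NoDtdWhitespaceCheck {

    // Same content as User.xml: no DOCTYPE, so no content model is available.
    static final String XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n"
        + "<User>\n"
        + "  <ID>101</ID>\n"
        + "  <BirthDate>1970-01-01+00:01</BirthDate>\n"
        + "  <Name>Frank Y. Ivy</Name>\n"
        + "  <Sex> Male</Sex>\n"
        + "</User>\n";

    // Parses the XML text and returns the number of direct children of the root element.
    static int countRootChildren(String xml, boolean ignoreWhitespace) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setIgnoringElementContentWhitespace(ignoreWhitespace);
            Document d = f.newDocumentBuilder().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return d.getDocumentElement().getChildNodes().getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("flag=false: " + countRootChildren(XML, false));
        System.out.println("flag=true:  " + countRootChildren(XML, true));
    }
}
```

Both calls should report the same child count, because without a DTD the whitespace-only text nodes are ordinary text as far as the parser can tell.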
Compile and run the example program, DomXmlParserWhitespace.java, with setIgnoringElementContentWhitespace(false):
>\fyicenter\jdk-1.8.0\bin\javac DomXmlParserWhitespace.java

>\fyicenter\jdk-1.8.0\bin\java DomXmlParserWhitespace UserDTD.xml false
Implementation class:
 com.sun.org.apache.xerces.internal.dom.DeferredDocumentImpl
DOM object elements and text contents:
.User:
..#text:
..ID:
...#text:101
..#text:
..BirthDate:
...#text:1970-01-01+00:01
..#text:
..Name:
...#text:Frank Y. Ivy
..#text:
..Sex:
...#text: Male
..#text:
Run it again with setIgnoringElementContentWhitespace(true):
>\fyicenter\jdk-1.8.0\bin\java DomXmlParserWhitespace UserDTD.xml true
Implementation class:
 com.sun.org.apache.xerces.internal.dom.DeferredDocumentImpl
DOM object elements and text contents:
.User:
..ID:
...#text:101
..BirthDate:
...#text:1970-01-01+00:01
..Name:
...#text:Frank Y. Ivy
..Sex:
...#text: Male
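The output tells you that Apache Xerces, the parser built into the JDK, is able to ignore whitespaces based on the DTD definitions: the whitespace-only #text nodes between the child elements of User are gone. Whitespaces inside #PCDATA elements are kept, as the leading space in " Male" shows.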
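The comparison can also be reproduced without files on disk. The sketch below is an illustration under stated assumptions: the class name DtdWhitespaceCheck, the helper childNames and the in-memory copy of UserDTD.xml are hypothetical and not part of the original example. Like the article's program, it does not call setValidating(true). The JAXP documentation describes this setting as relying on validating mode, so a parser other than the JDK's built-in Xerces may behave differently.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical class name; an illustration, not part of the original example.
class DtdWhitespaceCheck {

    // Same content as UserDTD.xml, held in memory for convenience.
    static final String XML_WITH_DTD =
        "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n"
        + "<!DOCTYPE User [\n"
        + "  <!ELEMENT User (ID, BirthDate, Name, Sex)>\n"
        + "  <!ELEMENT ID (#PCDATA)>\n"
        + "  <!ELEMENT BirthDate (#PCDATA)>\n"
        + "  <!ELEMENT Name (#PCDATA)>\n"
        + "  <!ELEMENT Sex (#PCDATA)>\n"
        + "]>\n"
        + "<User>\n"
        + "  <ID>101</ID>\n"
        + "  <BirthDate>1970-01-01+00:01</BirthDate>\n"
        + "  <Name>Frank Y. Ivy</Name>\n"
        + "  <Sex> Male</Sex>\n"
        + "</User>\n";

    // Returns the node names of the direct children of the root element.
    static List<String> childNames(boolean ignoreWhitespace) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setIgnoringElementContentWhitespace(ignoreWhitespace);
            Document d = f.newDocumentBuilder().parse(
                new ByteArrayInputStream(XML_WITH_DTD.getBytes(StandardCharsets.UTF_8)));
            NodeList c = d.getDocumentElement().getChildNodes();
            List<String> names = new ArrayList<>();
            for (int i = 0; i < c.getLength(); i++) {
                names.add(c.item(i).getNodeName());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("flag=false: " + childNames(false));
        System.out.println("flag=true:  " + childNames(true));
    }
}
```

On the JDK's built-in parser this should mirror the two runs above: #text entries between the elements with the flag off, and only ID, BirthDate, Name and Sex with the flag on.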
2017-12-13