SaxXmlWhitespace.java - ignorableWhitespace() Event Handler

Q

How do I catch the whitespace text contents that the SAX parser removes?

✍: FYIcenter

A

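When a DTD is applied to the XML document, the SAX parser does not pass whitespace-only text in element content (for example, the line breaks and indentation between child elements) to the characters() event handler. It reports that whitespace to the ignorableWhitespace() event handler instead, and the default implementation in DefaultHandler does nothing with it.

If you want to catch those whitespace text contents during the parsing process, you need to override the ignorableWhitespace() event handler as shown in this example, SaxXmlWhitespace.java: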

// Copyright (c) 2017 FYIcenter.com
import javax.xml.parsers.SAXParserFactory;
import javax.xml.parsers.SAXParser;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.Attributes;
import java.io.File;

public class SaxXmlWhitespace extends DefaultHandler {
   // One dot is printed per nesting level of the current element
   static String dot = "............................................................";
   static int l = 0;
   // "show" prints ignorable whitespace; any other value hides it
   static String show = "noshow";

   public static void main(String[] args) throws Exception {
      if (args.length < 1) {
         System.err.println("Usage: java SaxXmlWhitespace file.xml [show|noshow]");
         return;
      }
      SAXParserFactory f = SAXParserFactory.newInstance();
      SAXParser p = f.newSAXParser();
      System.out.println("Parser class: "+p.getClass().getName());

      // The second argument is optional and defaults to "noshow"
      if (args.length > 1) show = args[1];
      p.parse(new File(args[0]), new SaxXmlWhitespace());
   }

   @Override
   public void startElement(String uri, String lName, String qName, Attributes atts) {
      l++;
      // The default factory is not namespace-aware, so lName is empty; print qName
      System.out.print("\n"+dot.substring(0,l)+qName);
   }

   @Override
   public void endElement(String uri, String lName, String qName) {
      l--;
   }

   // Regular text content is printed in parentheses
   @Override
   public void characters(char[] ch, int start, int length) {
      System.out.print("("+(new String(ch,start,length))+")");
   }

   // Whitespace in element content is printed in square brackets
   @Override
   public void ignorableWhitespace(char[] ch, int start, int length) {
      if (show.equals("show")) {
         System.out.print("["+(new String(ch,start,length))+"]");
      }
   }
}

Compile and run the example program, SaxXmlWhitespace.java:

>\fyicenter\jdk-1.8.0\bin\javac SaxXmlWhitespace.java

>\fyicenter\jdk-1.8.0\bin\java SaxXmlWhitespace UserDTD.xml show

Parser class: com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl

.User[
    ]
..ID(101)[
    ]
..BirthDate(1970-01-01+00:01)[
    ]
..Name(Frank Y. Ivy)[
    ]
..Sex(  Male)[
]

If you run it again with "noshow", you will not see those whitespace text contents in the output:

>\fyicenter\jdk-1.8.0\bin\java SaxXmlWhitespace UserDTD.xml noshow

Parser class: com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl

.User
..ID(101)
..BirthDate(1970-01-01+00:01)
..Name(Frank Y. Ivy)
..Sex(  Male)
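
To check that the DTD is what moves whitespace from characters() to ignorableWhitespace(), here is a minimal sketch that parses the same document body twice, once with an internal DTD and once without, and counts the whitespace characters received by each handler. The class name IgnorableWhitespaceDemo and the inline sample document are made up for this illustration; they are not the UserDTD.xml file used above.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class IgnorableWhitespaceDemo extends DefaultHandler {
   // Hypothetical sample document body, with indentation between child elements
   static final String BODY =
      "<User>\n  <ID>101</ID>\n  <Name>Frank</Name>\n</User>";

   // Hypothetical internal DTD declaring that User has element-only content
   static final String DTD =
      "<!DOCTYPE User [\n"
      + "<!ELEMENT User (ID, Name)>\n"
      + "<!ELEMENT ID (#PCDATA)>\n"
      + "<!ELEMENT Name (#PCDATA)>\n"
      + "]>\n";

   int whitespaceViaCharacters = 0;  // whitespace-only text seen by characters()
   int whitespaceViaIgnorable = 0;   // text seen by ignorableWhitespace()

   @Override
   public void characters(char[] ch, int start, int length) {
      // Count only the chunks that consist entirely of whitespace
      if (new String(ch, start, length).trim().isEmpty()) {
         whitespaceViaCharacters += length;
      }
   }

   @Override
   public void ignorableWhitespace(char[] ch, int start, int length) {
      whitespaceViaIgnorable += length;
   }

   // Parses the XML text with a default (non-validating) SAX parser
   static IgnorableWhitespaceDemo parse(String xml) throws Exception {
      SAXParser p = SAXParserFactory.newInstance().newSAXParser();
      IgnorableWhitespaceDemo h = new IgnorableWhitespaceDemo();
      p.parse(new InputSource(new StringReader(xml)), h);
      return h;
   }

   public static void main(String[] args) throws Exception {
      IgnorableWhitespaceDemo withDtd = parse(DTD + BODY);
      IgnorableWhitespaceDemo withoutDtd = parse(BODY);
      System.out.println("With DTD:    characters()=" + withDtd.whitespaceViaCharacters
         + ", ignorableWhitespace()=" + withDtd.whitespaceViaIgnorable);
      System.out.println("Without DTD: characters()=" + withoutDtd.whitespaceViaCharacters
         + ", ignorableWhitespace()=" + withoutDtd.whitespaceViaIgnorable);
   }
}
```

With the Xerces-based parser bundled in the JDK, the whitespace should show up under ignorableWhitespace() only when the DTD is present, and under characters() when it is not. Other SAX parsers may behave differently, since a non-validating parser is not required to report ignorable whitespace separately.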

 


2017-12-09