Java Convert String to XML Document and XML Document to String

Published on August 3, 2022
Author: Pankaj

Sometimes while programming in Java, we get a String that is actually XML, and to process it we need to convert it to an XML Document (org.w3c.dom.Document). Also, for debugging purposes or to pass it to some other function, we might need to convert a Document object to a String. Here I am providing two utility functions.

  1. Document convertStringToDocument(String xmlStr): This method takes a String as input, converts it to a DOM Document, and returns it. We will use InputSource and StringReader for this conversion.
  2. String convertDocumentToString(Document doc): This method takes a Document as input and converts it to a String. We will use Transformer, StringWriter and StreamResult for this purpose.
package com.journaldev.xml;

import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class StringToDocumentToString {

    public static void main(String[] args) {
        final String xmlStr = "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n"+
                                "<Emp id=\"1\"><name>Pankaj</name><age>25</age>\n"+
                                "<role>Developer</role><gen>Male</gen></Emp>";
        Document doc = convertStringToDocument(xmlStr);
        
        String str = convertDocumentToString(doc);
        System.out.println(str);
    }

    private static String convertDocumentToString(Document doc) {
        TransformerFactory tf = TransformerFactory.newInstance();
        try {
            Transformer transformer = tf.newTransformer();
            // uncomment the line below to omit the XML declaration from the output
            // transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            // serialize the DOM tree into the StringWriter
            transformer.transform(new DOMSource(doc), new StreamResult(writer));
            return writer.toString();
        } catch (TransformerException e) {
            // the stack trace shows why the serialization failed
            e.printStackTrace();
        }
        // reached only when an exception was thrown above
        return null;
    }

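    private static Document convertStringToDocument(String xmlStr) {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        try {
            DocumentBuilder builder = factory.newDocumentBuilder();
            // wrap the String in a StringReader so the parser can read it as a character stream
            Document doc = builder.parse(new InputSource(new StringReader(xmlStr)));
            return doc;
        } catch (Exception e) {
            // for example, malformed XML; the stack trace shows the cause
            e.printStackTrace();
        }
        // reached only when an exception was thrown above
        return null;
    }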

}

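When we run the above program, we get essentially the same XML String that we used to create the DOM Document. The transformer writes its own XML declaration, so in the output below the standalone attribute and the line break after the declaration are gone.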

<?xml version="1.0" encoding="UTF-8"?><Emp id="1"><name>Pankaj</name><age>25</age>
<role>Developer</role><gen>Male</gen></Emp>
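
If the XML declaration is not wanted in the output at all, the commented-out setOutputProperty line in convertDocumentToString can be enabled. The following is a minimal sketch of that variant, not part of the original program. The class name OmitDeclarationExample and the helper names parse and toStringWithoutDeclaration are made up for illustration, and the sample input is shortened.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical example class, used only to illustrate OMIT_XML_DECLARATION
class OmitDeclarationExample {

    // Parses an XML String into a DOM Document (same approach as convertStringToDocument)
    static Document parse(String xmlStr) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xmlStr)));
    }

    // Serializes the Document without the leading <?xml ...?> declaration
    static String toStringWithoutDeclaration(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        String xmlStr = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<Emp id=\"1\"><name>Pankaj</name></Emp>";
        // the printed XML starts with <Emp, not with <?xml
        System.out.println(toStringWithoutDeclaration(parse(xmlStr)));
    }
}
```

Unlike the program above, this sketch lets exceptions propagate instead of returning null, which makes a parsing or transformation failure visible to the caller.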

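You can call replaceAll("\n|\r", "") on the returned String to remove the newline characters and get the XML in compact format. Strings are immutable, so use the value that replaceAll returns; calling it without using the result changes nothing.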
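
As a quick illustration of where the replaceAll call goes, here is a small sketch. The class name CompactXmlExample and the helper removeLineBreaks are hypothetical names, not part of the program above.

```java
// Hypothetical example class showing how to apply replaceAll to the XML String
class CompactXmlExample {

    // Returns a copy of the text with every line feed and carriage return removed
    static String removeLineBreaks(String text) {
        // replaceAll does not modify 'text'; it returns a new String
        return text.replaceAll("\n|\r", "");
    }

    public static void main(String[] args) {
        String xml = "<Emp id=\"1\"><name>Pankaj</name>\n<role>Developer</role></Emp>";

        // Ignoring the return value leaves xml unchanged
        xml.replaceAll("\n|\r", "");
        System.out.println(xml.contains("\n")); // prints true

        // Using the return value gives the compact form
        String compact = removeLineBreaks(xml);
        System.out.println(compact);
        // prints <Emp id="1"><name>Pankaj</name><role>Developer</role></Emp>
    }
}
```

Keep in mind that this pattern removes every line break in the String, including any that appear inside element text, so it is best suited to XML where line breaks are only used for layout.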

JournalDev
DigitalOcean Employee
DigitalOcean Employee badge
November 8, 2013

I'm using the same code, but StringWriter truncates the output before the entire XML is written to the string. Can you help me understand why this is happening?

- Rishi Naik

November 11, 2013

Is your XML very long? Are you running it in Eclipse or from the command line? Try writing the output to a file and check whether the full content is written.

- Pankaj

    November 25, 2013

    I need to convert an XML String to an XML SAX document. How can that be done?

    - simran

      February 10, 2014

      Hi, this line: Document doc = builder.parse( new InputSource( new StringReader( xmlStr ) ) ); gives me an error when I run it: Fatal Error: XML document structures must start and end within the same entity. My xmlStr = " 1 2 3 "; Is there something I'm not doing right? Thank you for your help and the article!

      - German

      February 11, 2014

      Your string is not valid XML.

      - Pankaj

        February 25, 2014

        Thanks Pankaj, you are a lifesaver.

        - Deepu

          June 11, 2014

          Hi, I am using the above code example, but the method returns null for the document.

          - ragu

            June 17, 2014

            Getting the error below for DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

            Exception in thread "main" javax.xml.parsers.FactoryConfigurationError: Provider for class javax.xml.parsers.DocumentBuilderFactory cannot be created
                at javax.xml.parsers.FactoryFinder.findServiceProvider(Unknown Source)
                at javax.xml.parsers.FactoryFinder.find(Unknown Source)
                at javax.xml.parsers.DocumentBuilderFactory.newInstance(Unknown Source)

            - RInu

              June 24, 2014

              I am getting null from builder.parse( new InputSource( new StringReader( xmlStr ) ) ); I validated my XML and it is valid.

              - Anuj

                March 30, 2015

                It executed successfully, but I did not find it useful. I want to convert a doc file into XML.

                - ahmad

                December 22, 2015

                Hi, can you please tell me how to convert a doc file into XML using Java code?

                - sunil

                January 6, 2016

                package com.avankia.sunil;

                import java.io.ByteArrayInputStream;
                import java.io.File;
                import java.io.FileInputStream;
                import java.io.FileOutputStream;
                import java.util.logging.Level;
                import java.util.logging.Logger;

                import javax.xml.parsers.DocumentBuilder;
                import javax.xml.parsers.DocumentBuilderFactory;
                import javax.xml.transform.Transformer;
                import javax.xml.transform.TransformerFactory;
                import javax.xml.transform.dom.DOMResult;
                import javax.xml.transform.dom.DOMSource;

                import org.w3c.dom.Document;

                public class DocToXmlResumeConvertor {

                    // get path of xsl file
                    private static String styleSheetPath = SystemManager.getInstance()
                            .getConfigUrl().getPath() + "xhtml2fo.xsl";

                    // static String styleSheetPath = null;
                    static java.util.logging.Logger logger = Logger
                            .getLogger(DocToXmlResumeConvertor.class.getName());

                    private static Document xml2FO(Document xml, String styleSheetPath)
                            throws Exception {
                        DOMSource xmlDomSource = new DOMSource(xml);
                        DOMResult domResult = new DOMResult();
                        Transformer transformer = getTransformer(styleSheetPath);
                        if (transformer == null) {
                            throw new Exception("Error in creating trnasformer");
                        }
                        try {
                            transformer.transform(xmlDomSource, domResult);
                        } catch (javax.xml.transform.TransformerException e) {
                            logger.log(Level.INFO,
                                    "Error in transforming xml to xsl-fo: " + e.getMessage());
                            return null;
                        }
                        return (Document) domResult.getNode();
                    }

                    private static Transformer getTransformer(String styleSheetPath) {
                        try {
                            TransformerFactory tFactory = TransformerFactory.newInstance();
                            DocumentBuilderFactory dFactory = DocumentBuilderFactory
                                    .newInstance();
                            dFactory.setNamespaceAware(true);
                            DocumentBuilder dBuilder = dFactory.newDocumentBuilder();
                            Document xslDoc = dBuilder.parse(new File(styleSheetPath));
                            logger.log(Level.INFO, xslDoc.getTextContent());
                            DOMSource xslDomSource = new DOMSource(xslDoc);
                            return tFactory.newTransformer(xslDomSource);
                        } catch (javax.xml.transform.TransformerException e) {
                            logger.log(Level.SEVERE, "", e);
                            return null;
                        } catch (java.io.IOException e) {
                            logger.log(Level.SEVERE, "", e);
                            return null;
                        } catch (javax.xml.parsers.ParserConfigurationException e) {
                            logger.log(Level.SEVERE, "", e);
                            return null;
                        } catch (org.xml.sax.SAXException e) {
                            logger.log(Level.SEVERE, "", e);
                            return null;
                        }
                    }

                    /*
                    private static byte[] fo2PDF(Document foDocument) {
                        FopFactory fopFactory = FopFactory.newInstance();
                        try {
                            ByteArrayOutputStream out = new ByteArrayOutputStream();
                            Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, out);
                            TransformerFactory tFactory = TransformerFactory.newInstance();
                            Transformer transformer = tFactory.newTransformer();
                            Source src = new DOMSource(foDocument);
                            Result res = new SAXResult(fop.getDefaultHandler());
                            transformer.transform(src, res);
                            return out.toByteArray();
                        } catch (Exception ex) {
                            logger.log(Level.SEVERE, "", ex);
                            return null;
                        }
                    }
                    */

                    public static byte[] getXmlResumeBytes(byte[] bytes) throws Exception {
                        byte[] XmlBytes = null;
                        ByteArrayInputStream input = new ByteArrayInputStream(bytes);
                        //final HtmlCleaner cleaner = new HtmlCleaner();
                        CleanerProperties props = cleaner.getProperties();
                        DomSerializer doms = new DomSerializer(props, true);
                        Document xmlDoc = null;
                        try {
                            TagNode node = cleaner.clean(input, "UTF-8");
                            xmlDoc = doms.createDOM(node);
                            // System.out.println(xmlDoc.getFirstChild().getTextContent());
                        } catch (Exception e) {
                            throw e;
                        }
                        Document foDoc = null;
                        try {
                            foDoc = xml2FO(xmlDoc, styleSheetPath);
                            // System.out.println(foDoc.getFirstChild().getTextContent());
                        } catch (Exception e) {
                            logger.log(Level.INFO, "ERROR: " + e.getMessage());
                            throw e;
                        }
                        //XmlBytes = fo2PDF(foDoc);
                        input.close();
                        if (XmlBytes != null) {
                            logger.log(Level.INFO, "your doc has been converted into xml");
                        } else {
                            String errorString = "doc File is not converted into xml properly";
                            XmlBytes = errorString.getBytes();
                        }
                        return XmlBytes;
                    }

                    public static byte[] readBytes(String fileName) {
                        FileInputStream fileInputStream = null;
                        byte[] bytes = null;
                        try {
                            File file = new File(fileName);
                            System.out.println(fileName);
                            bytes = new byte[(int) file.length()];
                            fileInputStream = new FileInputStream(file);
                            fileInputStream.read(bytes);
                            fileInputStream.close();
                            return bytes;
                        } catch (Exception ie) {
                            bytes = null;
                            logger.log(Level.SEVERE, "", ie);
                            return bytes;
                        }
                    }

                    public static void main(String[] args) {
                        // TODO Auto-generated method stub
                        String htmlFileName = "C://Users//raktim//Downloads//ava.doc";
                        styleSheetPath = "D:/WORKAREA/AVANKIA/ResumeParser/src/www/WEB-INF/conf/xhtml2fo.xsl";
                        File htmlFile = new File(htmlFileName);
                        byte[] XmlBytes = new byte[(int) htmlFile.length()];
                        File XmlFile = new File(htmlFileName.replace(".doc", ".Xml"));
                        FileOutputStream fop = null;
                        try {
                            pdfBytes = readBytes(htmlFileName);
                            fop = new FileOutputStream(XmlFile);
                            byte[] newBytes = DocToXmlResumeConvertor
                                    .getXmlResumeBytes(pdfBytes);
                            fop.write(newBytes);
                            fop.flush();
                            fop.close();
                            System.out.println("Done");
                        } catch (Exception e) {
                            logger.log(Level.SEVERE, "", e);
                        }
                    }
                }

                - sunil

                  August 7, 2015

                  The variable doc always returns null.

                  - mmonikm

                  January 7, 2016

                  Even for me :(

                  - Sridhar Raj

                  July 23, 2017

                  Please check the method carefully; it returns null only when an exception is thrown.

                  - Pankaj

                    May 4, 2016

                    Where is the replaceAll() method supposed to be used? I was thinking it should be called on the string str before printing it out, like so: String str = convertDocumentToString(doc); str.replaceAll("\n|\r", ""); System.out.println(str); But the output doesn’t change…

                    - nekonutchi

                    August 29, 2016

                    Same problem, it doesn’t work!

                    - Shailesh

                    July 23, 2017

                    Use return output.replaceAll("\n|\r", ""); in the convertDocumentToString method. Come on guys, use some brains yourself too.

                    - Pankaj
