Tuesday, May 3, 2011

Simple XML Question

I am creating XML at runtime. Its schema looks like this:

<Item> <Content>Hi</Content> </Item>

The problem comes when I try to save HTML content in this tag:

<Item> <Content><strong>Hi</strong></Content> </Item>

How can I resolve this issue?

Thanks in advance

From stackoverflow
  • The HTML string needs to be properly escaped before you add it to the XML. If you are using .NET, there are several ways to do it.

  • The less-than sign (<) must be escaped as &lt; and the greater-than sign (>) as &gt;.

    Stijn Sanders : Don't forget the ampersand itself ('&'), which needs to be replaced with '&amp;' first.
  • You can embed the HTML content in a CDATA section:

    <Item><Content><![CDATA[<strong>Hi</strong>]]></Content></Item>
    
  • The correct answer is to not embed tags in XML. The XML should define only the data; whatever parses it should apply the markup, e.g. wrap every Item->Content value in <strong></strong>.

    The other solution is to escape the tags using XML escapes: &lt; and &gt;.

  • I assume that you have a schema that permits an Item element to contain a Content element and that the Content element can only contain text or CDATA or similar. You have two options in that case.

    Firstly, you could escape the HTML. You could use a CDATA section, as Fredrik suggested above, or you could escape the angle brackets as described above. Either way, you can continue to treat the contents of Content as text, which keeps the content model for your element simple.

    Alternatively, you could extend your schema to allow XHTML elements as part of the Content element. I suggested a way to do that here. Of course, if your content is HTML rather than XHTML, this won't work.

    Really, the choice comes down to whether or not you want to be able to parse the embedded HTML as part of your XML. If you want it to be text, escape it. If you want it to be parseable, extend your schema.

    John M Gant : Agreed. It all depends on what you want to do with it. There could be legitimate uses for either method.
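
For anyone not on .NET, the options discussed in the answers above can be sketched in Python using only the standard library. This is an illustration rather than anyone's actual code from the thread: the sample string `html` and the helper `cdata` are made up for the example, and the last part assumes the embedded markup is well-formed XHTML.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# Hypothetical sample content, including an ampersand.
html = "<strong>Hi</strong> & bye"

# Option 1: escape the text yourself. escape() replaces '&' first,
# then '<' and '>', so existing ampersands are not double-handled.
escaped = escape(html)
manual_xml = "<Item><Content>" + escaped + "</Content></Item>"

# Option 2: build the document with an XML library and let it
# escape the text when it serializes.
item = ET.Element("Item")
content = ET.SubElement(item, "Content")
content.text = html
library_xml = ET.tostring(item, encoding="unicode")

# Option 3: wrap the text in a CDATA section inside <Content>.
# A literal "]]>" in the text would end the section early, so it is
# split across two adjacent CDATA sections (hypothetical helper).
def cdata(text):
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"

cdata_xml = "<Item><Content>" + cdata(html) + "</Content></Item>"

# All three forms give the original string back as plain text.
for doc in (manual_xml, library_xml, cdata_xml):
    assert ET.fromstring(doc).find("Content").text == html

# For contrast: embedding well-formed XHTML unescaped makes it part of
# the XML tree, so <strong> is a child element rather than text.
embedded = ET.fromstring("<Item><Content><strong>Hi</strong></Content></Item>")
assert embedded.find("Content/strong").text == "Hi"
```

The first three options keep Content as simple text. The last form only validates if the schema allows those child elements inside Content.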
