
Re: [syndication] HTML Encoding BooHoo...



On Wed, May 23, 2001 at 12:57:37PM -0400, Morbus Iff wrote:
> ** I eventually tracked the culprit to nothing in my code, but rather the 
> XML::Simple perl module, which seems to magick &lt;XML&gt; into <XML> all 
> by itself. I'm still investigating, but seeing the file encoded, and then 
> loading it through XML::Simple and Data::Dump[ing] it shows that it's 
> autoconverted. Why, I'm not sure...

It does that because that's what it's supposed to do: an XML processor
must resolve entity references as it parses, so &lt;XML&gt; in the file
reaches your code as <XML>. If you want a literal "<XML>" to be
rendered by the final browser, it has to be escaped twice in the RSS
feed, as:

  &amp;lt;XML&amp;gt;

so that it will come out of the XML parser as:

  &lt;XML&gt;

which will be rendered by the HTML parser as:

  <XML>
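
The same round trip can be sketched in a few lines. This uses Python's
standard library rather than XML::Simple, purely for illustration; the
helper name encode_for_feed and the <description> wrapper are
assumptions, not anything from the original code.

```python
import html
import xml.etree.ElementTree as ET


def encode_for_feed(text):
    """Escape twice: once for the HTML renderer, once for the XML parser."""
    once = html.escape(text, quote=False)   # "<XML>" becomes "&lt;XML&gt;"
    return html.escape(once, quote=False)   # then "&amp;lt;XML&amp;gt;"


literal = "<XML>"
feed_text = encode_for_feed(literal)

# The XML parser resolves one level of escaping, as any XML processor must.
item = ET.fromstring("<description>" + feed_text + "</description>")
after_xml = item.text

# The HTML renderer resolves the remaining level.
after_html = html.unescape(after_xml)

print(feed_text)
print(after_xml)
print(after_html)
```

Each parser strips exactly one layer, so the number of escaping passes
has to match the number of parsers between the feed and the reader's
screen.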

Practically, the best thing to do is probably to scan the content,
let a predetermined subset of HTML through, and entity-encode
everything else (as Dan suggested).
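
A minimal sketch of that allow-list approach, again in Python for
illustration only. The tag set, the class name WhitelistFilter, and the
sanitize() helper are all hypothetical; the choice to drop attributes on
allowed tags and to discard comments is an assumption of this sketch,
not something the thread specifies.

```python
import html
from html.parser import HTMLParser

# Hypothetical allow-list; pick whatever subset suits your feed.
ALLOWED_TAGS = {"b", "i", "em", "strong", "p", "br"}


class WhitelistFilter(HTMLParser):
    """Pass allowed tags through; entity-encode all other markup and text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            # Re-emit a bare tag, dropping attributes (no onclick= etc.).
            self.out.append("<%s>" % tag)
        else:
            # Show the disallowed tag as literal text.
            self.out.append(html.escape(self.get_starttag_text(), quote=False))

    def handle_startendtag(self, tag, attrs):
        # Treat <br/> like <br> so no stray closing tag is produced.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        text = "</%s>" % tag
        if tag in ALLOWED_TAGS:
            self.out.append(text)
        else:
            self.out.append(html.escape(text, quote=False))

    def handle_data(self, data):
        # Character references arrive already resolved; re-encode them.
        self.out.append(html.escape(data, quote=False))


def sanitize(fragment):
    parser = WhitelistFilter()
    parser.feed(fragment)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.out)


print(sanitize("<b>hi</b> <XML>"))
```

The result is HTML that is safe to render; it still needs one more pass
of XML escaping before it goes into the feed itself.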

Cheers,



--
Mark Nottingham
http://www.mnot.net/