
Re: [syndication] RSS 0.94



> Come up with a standardised way of including the character "<" in text
> in the <item><description> element. Is it &amp;lt; ? And does that mean
> that "&" should be &amp;amp; ?

This is already defined in XML.  It's &lt; and &amp;.  There have been tools
that incorrectly double-encode things.  The only time these would ever appear as
&amp;lt; or &amp;amp; would be if you truly intended to display the literal text
"&lt;" or "&amp;" in an HTML interpreter.
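
To make the difference concrete, here is a small Python sketch using the
standard library's xml.sax.saxutils.  The sample string is made up for
illustration; the point is only that escaping once gives the correct XML form,
and escaping the result again gives the double-encoded form some tools emit by
mistake.

```python
from xml.sax.saxutils import escape, unescape

# Hypothetical sample text containing both special characters.
text = "AT&T says 1 < 2"

# Correct: escape exactly once when writing the text into XML.
once = escape(text)

# Incorrect for plain text: escaping already-escaped text double-encodes it.
twice = escape(once)

print(once)   # AT&amp;T says 1 &lt; 2
print(twice)  # AT&amp;amp;T says 1 &amp;lt; 2

# An XML parser undoes one layer only, so the double-encoded version would
# reach the reader still showing "&amp;" and "&lt;" as visible text.
print(unescape(once) == text)   # True
print(unescape(twice) == once)  # True
```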

If you're pushing in formatted text, using HTML, you'd use:
    <description>
        &lt;b>This is bold text&lt;/b> and this is &lt;i>italicized&lt;/i>
    </description>

If you, for some reason, needed to produce an HTML-formatted string that also
had to show markup as literal text, say to demonstrate formatting XML, you would
have to double-encode the characters used for that displayed markup.

    <description>
        A &lt;b>&amp;lt;title>&lt;/b> element might have to use double encoding.
    </description>
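
The two layers in the double-encoding case above can be sketched in Python as
follows.  This is only an illustration with made-up variable names: the literal
"<title>" is escaped once for HTML, and then the whole HTML fragment is escaped
once more for XML.  Note that saxutils.escape also encodes ">" as &gt;, which is
equally valid XML even though the hand-written example leaves ">" bare.

```python
from xml.sax.saxutils import escape, unescape

# Layer 1 (HTML): the tag name we want to *display* must be escaped so the
# HTML renderer shows it as text instead of treating it as markup.
html_fragment = "A <b>" + escape("<title>") + "</b> element"
print(html_fragment)  # A <b>&lt;title&gt;</b> element

# Layer 2 (XML): the whole HTML fragment is escaped once to sit in the feed.
feed_text = escape(html_fragment)
print(feed_text)  # A &lt;b&gt;&amp;lt;title&amp;gt;&lt;/b&gt; element

# The feed parser removes layer 2, handing the HTML renderer layer 1 intact.
print(unescape(feed_text) == html_fragment)  # True
```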

This follows XML guidelines.  There are ways, in RSS-1.0, to not use encoding.
It's possible to declare XHTML namespaces inside elements and use no encoding at
all.  This is problematic because you *need to be sure* you're passing along
completely valid XHTML.  Otherwise you'd break the XML parser expecting to read
the whole feed.  An alternative is to use a CDATA wrapper around the contents.
Take care, however: some RSS readers appear to have problems handling CDATA
blocks.
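
As a rough sketch of the CDATA approach, the Python below wraps an HTML
fragment in a CDATA section and checks that a standard XML parser hands back
the original text.  The cdata() helper and the sample fragment are hypothetical
names made up for this illustration.  The one thing to watch is that the
sequence "]]>" cannot appear inside a CDATA section, so the helper splits it
across two sections.

```python
import xml.etree.ElementTree as ET

def cdata(text):
    """Wrap text in a CDATA section (hypothetical helper for illustration)."""
    # "]]>" would end the section early, so break it across two sections.
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"

# Made-up HTML fragment; no entity encoding is needed inside CDATA.
html_fragment = "<b>bold</b> & <i>italic</i>"
doc = "<description>" + cdata(html_fragment) + "</description>"

# The parser returns the wrapped content as plain character data.
parsed = ET.fromstring(doc).text
print(parsed == html_fragment)  # True

# The awkward case: content that itself contains "]]>".
tricky = "a]]>b"
parsed_tricky = ET.fromstring("<description>" + cdata(tricky) + "</description>").text
print(parsed_tricky == tricky)  # True
```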

But more to the point, your feed content should be coming out of a managed
program.  Trying to fiddle with all of this by hand is too much trouble and
results in too many errors.  If your tool can't properly handle what you need,
encoding-wise, then it's time to shop around for a new one.
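
For illustration only, here is what letting a program do the encoding might
look like with Python's standard xml.etree.ElementTree.  The element names
follow the RSS item layout discussed in this thread; the title and description
strings are made up.  The library escapes text exactly once on output, so no
entity ever has to be typed by hand.

```python
import xml.etree.ElementTree as ET

# Build an <item> and assign raw, unescaped strings as element text.
item = ET.Element("item")
ET.SubElement(item, "title").text = "Q&A"
ET.SubElement(item, "description").text = (
    "<b>This is bold text</b> and this is <i>italicized</i>"
)

# Serialization applies the XML escaping once, and only once.
xml_text = ET.tostring(item, encoding="unicode")
print(xml_text)

# Parsing the output gives back the original strings unchanged.
round_trip = ET.fromstring(xml_text)
print(round_trip.find("title").text)        # Q&A
print(round_trip.find("description").text)
```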

-Bill Kearney
