
Re: [syndication] HTML Encoding BooHoo...



In article <5.1.0.14.2.20010523113347.00a82050@mail.totalnetnh.net>, 
>The same "reversion of entities" affects the &lt;XML&gt; as well, making it 
><XML> as the final code to be sent for browser display. This, as some can 
>guess, causes problems. Specifically, in IE 6b for Windows, it stops the 
>browser display cold - IE thinks an XML document is on its way.
>
>Now, the "reversion of entities" code in my RSS reader doesn't know about 
>HTML - it just blindly reverts &lt; to < and so forth. Is the only solution 
>to my problem to make the code understand all the possible HTML entities? 
>Or is there something else?

There's a fair bit of code around that removes all tags except a subset
of "allowable HTML". PHP even has this built into the scripting language
as strip_tags(), which takes an optional list of tags to keep.
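
For anyone not on PHP, here is a minimal sketch of the same idea in
Python, using the standard library's html.parser. The ALLOWED_TAGS and
ALLOWED_ATTRS sets and the strip_disallowed_tags() name are my own
assumptions for illustration, not part of any RSS tool, and this is not
a security-grade sanitizer (it does not inspect URLs, for instance).

```python
from html import escape
from html.parser import HTMLParser

# Assumed allowlist for illustration; pick whatever subset suits your display.
ALLOWED_TAGS = {"a", "b", "i", "em", "strong", "p", "br"}
ALLOWED_ATTRS = {"a": {"href"}}


class AllowlistFilter(HTMLParser):
    """Re-emit only allowlisted tags; text content is always kept."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag names, so <XML> arrives here as "xml".
        if tag not in ALLOWED_TAGS:
            return
        allowed = ALLOWED_ATTRS.get(tag, set())
        attr_text = "".join(
            f' {name}="{escape(value, quote=True)}"'
            for name, value in attrs
            if name in allowed and value is not None
        )
        self.out.append(f"<{tag}{attr_text}>")

    def handle_startendtag(self, tag, attrs):
        # Self-closing forms such as <br/> are emitted as a plain start tag.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text arrives with entities decoded, so escape it again for display.
        self.out.append(escape(data, quote=False))


def strip_disallowed_tags(markup):
    """Return markup with every tag outside ALLOWED_TAGS removed."""
    parser = AllowlistFilter()
    parser.feed(markup)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.out)


print(strip_disallowed_tags('<XML>Hello <b>world</b> <font size="7">big</font>'))
```

Like strip_tags(), this drops the disallowed tags but keeps the text
inside them, so a stray <XML> never reaches the browser as a tag.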

I think the correct way to deal with this is for feed producers to:
- Only include HTML in <description>, not <title>
- Escape all reserved characters in <description>
And for people who turn feeds into displayable code to:
- Unescape all escaped reserved characters
- Trim the tags to a subset that you feel comfortable with for your
  display purposes.

And that's it. There's some strange problem with "&" or is that "&&" or
"&amp;&amp;" which I'll ignore for the moment.
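
The escape and unescape steps above can be sketched in Python with
xml.sax.saxutils. This is only an illustration under my own
assumptions: the function names are hypothetical, and it works on the
raw text of <description> (a real XML parser would normally do the
unescaping for you when it hands back the element text).

```python
from xml.sax.saxutils import escape, unescape


def escape_for_description(html_fragment):
    """Producer side: escape &, < and > before writing <description>."""
    # escape() handles "&" first, so entities already present in the
    # HTML (such as &lt;) become &amp;lt; and survive the round trip.
    return escape(html_fragment)


def unescape_description(raw_text):
    """Consumer side: reverse exactly one level of escaping."""
    # unescape() handles "&amp;" last, so &amp;lt; becomes &lt; and
    # is not reverted all the way to "<".
    return unescape(raw_text)


original = "Use <b>bold</b> &amp; write &lt;XML&gt; literally"
in_feed = escape_for_description(original)
print(in_feed)
print(unescape_description(in_feed) == original)
```

If the producer escapes once and the reader unescapes once, a literal
&lt;XML&gt; in the HTML comes back as &lt;XML&gt; and not as a bare
<XML> tag. Unescaping twice, or unescaping text that was never escaped
as a whole, is one way the tag could end up live in the output.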

It's a SMOP (Simple Matter Of Programming).

-- 
Julian Bond eMail: julian@netmarketseurope.com
HomeURL: http://www.shockwav.demon.co.uk/ 
WorkURL: http://www.netmarketseurope.com/
WebLog: http://roguemoon.manilasites.com/
M: +44 (0)77 5907 2173  T: +44 (0)20 7420 4363  
ICQ:33679668 tag:So many words, so little time