[syndication] HTML Encoding BooHoo...
Morbus Iff writes:
> At this point in time, I *do not* want to get into a discussion of the
> morality of HTML in RSS feeds and how the world is going to end. My point
> of view is that I flipping hate HTML in RSS feeds, but I've got to cope,
> just like everyone else.
I'd treat it just like untrusted user-entered text: do the entity
encoding, then run through, find all unknown tags or tags with
unacceptable attributes, and convert them back to &lt;xml&gt;-like
escapes (HTML4 strict is a good list of things that won't let 'em
screw up your display too badly), then clean up any unclosed tags.
You should probably also be looking out for common HTML coding
mistakes and handling them.
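
Here is a minimal sketch of one reading of those steps, in Python
purely for illustration (it is not the Perl code offered in this
message). Everything is entity-encoded first, and only tags on a small
allow-list are turned back into real markup, so unknown tags stay as
visible escapes. The names sanitize, ALLOWED_TAGS and VOID_TAGS are
made up for the example, the allow-list is an arbitrary subset rather
than the full HTML4 strict list, and as a simplifying assumption any
tag carrying attributes is treated as unacceptable.

```python
import html
import re

# Hypothetical allow-list; a real one might follow HTML4 strict.
ALLOWED_TAGS = {"b", "i", "em", "strong", "p", "br", "ul", "ol", "li",
                "blockquote", "code", "pre"}
VOID_TAGS = {"br"}  # tags that never take a closing tag

# An already-escaped tag with no attributes: &lt;b&gt;, &lt;/b&gt;, &lt;br /&gt;
ESCAPED_TAG = re.compile(r"&lt;(/?)([A-Za-z][A-Za-z0-9]*)\s*/?&gt;")


def sanitize(text):
    """Escape everything, restore allow-listed tags, close what is left open."""
    escaped = html.escape(text, quote=True)
    open_tags = []  # stack of restored tags still waiting for a close

    def restore(match):
        closing = bool(match.group(1))
        name = match.group(2).lower()
        if name not in ALLOWED_TAGS:
            return match.group(0)  # unknown tag: leave it escaped
        if name in VOID_TAGS:
            return "" if closing else "<" + name + ">"
        if not closing:
            open_tags.append(name)
            return "<" + name + ">"
        if name not in open_tags:
            return match.group(0)  # stray close: leave it escaped
        # Close any tags opened after this one, then the tag itself.
        out = []
        while open_tags:
            top = open_tags.pop()
            out.append("</" + top + ">")
            if top == name:
                break
        return "".join(out)

    restored = ESCAPED_TAG.sub(restore, escaped)
    # Clean up whatever the feed author forgot to close.
    return restored + "".join("</" + t + ">" for t in reversed(open_tags))
```

Because tags with attributes never match the pattern, something like
an <a href="..."> link stays escaped; allowing selected attributes
would need an extra, carefully validated step.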
I've got Perl code to do this for my user comments if you want to
steal from it.
Dan