
Re: [syndication] HTML Encoding BooHoo...

To: syndication@yahoogroups.com, syndication@yahoogroups.com
Subject: Re: [syndication] HTML Encoding BooHoo...
From: Morbus Iff <morbus@disobey.com>
Date: Wed, 23 May 2001 12:57:37 -0400
In-reply-to: <GNRNxVAzc+C7EAUd@netmarketseurope.com>
References: <5.1.0.14.2.20010523113347.00a82050@mail.totalnetnh.net> <5.1.0.14.2.20010523113347.00a82050@mail.totalnetnh.net>

>>Now, the "reversion of entities" code in my RSS reader doesn't know about
>>HTML - it just blindly reverts &lt; to < and so forth. Is the only solution
>>to my problem to make the code understand all the possible HTML entities?
>>Or is there something else?
>
>There's a fair bit of code around that removes all tags except a subset
>of "Allowable html". PHP even has this as a function built into the
>scripting language.

Yes, but that wouldn't solve my above problem (** and see earlier message). In this case, <XML> wasn't a tag, it was part of the actual <title>. Removing all HTML tags wouldn't affect the <XML>, cos that's not a valid HTML tag anyways... Right now, my reader:


  - loads in an XML file.
  - converts any encoded &lt/&gt's to </> (to cover encoded HTML).
    this is a mass replacement, which causes the above problem.
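
To make the failure concrete, here is a minimal sketch of that kind of mass replacement. It is in Python purely for illustration, and `blind_revert` and the sample strings are hypothetical stand-ins, not the reader's actual code:

```python
def blind_revert(text: str) -> str:
    """Mass-replace encoded angle brackets, with no knowledge of HTML."""
    return text.replace("&lt;", "<").replace("&gt;", ">")


# Encoded HTML markup in a description: the replacement does what is wanted.
description = "This is &lt;b&gt;bold&lt;/b&gt; news."
print(blind_revert(description))

# A title that literally mentions <XML>: the same replacement yields something
# that looks like an unknown tag, so an HTML renderer would typically not
# display it as text.
title = "Why &lt;XML&gt; matters"
print(blind_revert(title))
```

The replacement has no way to tell markup that was meant to be rendered from angle brackets that were meant to be read.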

Ultimately, I don't want to remove tags (that's not a decision I'm willing to make for the users of my program, but it will be an option that they can choose from).

In this case, it's not even an issue of allowable tags or not - it's an issue of preparing for people correctly encoding HTML (&lt;b&gt;) and not encoding HTML (<b>).

** I eventually tracked the culprit to nothing in my code, but rather the XML::Simple perl module, which seems to magick &lt;XML&gt; into <XML> all by itself. I'm still investigating, but seeing the file encoded, and then loading it through XML::Simple and Data::Dump[ing] it shows that it's autoconverted. Why, I'm not sure...
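
For comparison, XML parsers in general resolve the predefined entities (&lt;, &gt;, &amp; and friends) in character data before handing text back to the caller, so an automatic conversion like this is probably the parser doing its job rather than a quirk of one module. A small sketch of the same effect, using Python's standard xml.etree.ElementTree as a stand-in for XML::Simple (the feed fragment and the `parsed_title` helper are made up for illustration):

```python
import xml.etree.ElementTree as ET


def parsed_title(feed: str) -> str:
    """Parse a tiny feed fragment and return the text of its <title>."""
    item = ET.fromstring(feed)
    return item.find("title").text


# The file on disk holds the encoded form...
feed = "<item><title>Why &lt;XML&gt; matters</title></item>"

# ...but the parser hands back the decoded characters.
title = parsed_title(feed)
print(title)  # Why <XML> matters
```

If that is what is happening, the text coming out of the parser has already been reverted once, and a second mass replacement on top of it is only needed for content that was encoded twice.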



Morbus Iff
.sig on other machine.
http://www.disobey.com/
http://www.gamegrene.com/

Follow-Ups:
- Re: [syndication] HTML Encoding BooHoo...
  - From: Mark Nottingham <mnot@mnot.net>

References:
- HTML Encoding BooHoo...
  - From: Morbus Iff <morbus@disobey.com>
- Re: [syndication] HTML Encoding BooHoo...
  - From: Julian Bond <julian@netmarketseurope.com>
