Re: [syndication] html parsing as a horror story

To: <syndication@yahoogroups.com>
Subject: Re: [syndication] html parsing as a horror story
From: "Bill Kearney" <wkearney99@hotmail.com>
Date: Fri, 19 Jul 2002 10:55:19 -0400
References: <OE368kfWWTAnXKpauvh0000c238@hotmail.com> <87d6tk1qja.fsf@openprivacy.org> <OE25lL3N2TXZDVV5zi00000f6d4@hotmail.com> <02a101c22f32$13a70b60$33a1dc40@murphy>
Reply-to: "Bill Kearney" <wkearney99@hotmail.com>

> 1. Your stats page had nothing to do with it. Like many developers with a
> big product and a small team, we have a queue of bugs and features. That it
> took a few weeks to get it out says it was a *high* priority Bill.

And the larger consuming RSS community thanks you for it.

> 2. I did the work, not Jake.

Indeed, the comments in the script seem to reflect that.  So, now that you've
owned up to it, why does it mangle the UTF-8 characters?  Using a blanket
string.replace on every ampersand produces double-encoded text whenever the
input already contains a character reference.  It's incorrect to express &#999;
as &amp;#999; yet that's precisely what xml.entityEncode does.  Please, really
and truly, please fix this.
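
To make the double-encoding complaint concrete, here is a minimal sketch in
Python.  It is not Radio's xml.entityEncode; the function names and the regular
expression are my own, and it assumes the input may already contain character
references that should pass through untouched.

```python
import re

# An ampersand that does NOT already start a numeric or named reference
# such as &#999; &#x3E7; or &amp;
_BARE_AMP = re.compile(
    r"&(?!(?:#[0-9]+|#[xX][0-9A-Fa-f]+|[A-Za-z][A-Za-z0-9]*);)"
)

def blanket_encode(text):
    """The problematic approach: replace every ampersand unconditionally."""
    return text.replace("&", "&amp;")

def entity_encode(text):
    """Escape only bare ampersands, then the angle brackets."""
    text = _BARE_AMP.sub("&amp;", text)
    return text.replace("<", "&lt;").replace(">", "&gt;")

sample = "AT&T &#999; <b>"
print(blanket_encode(sample))  # the existing reference gets double-encoded
print(entity_encode(sample))   # the existing reference is left alone
```

The lookahead is a heuristic.  If the input is known to be raw text rather than
already-escaped markup, escaping every ampersand is the right thing to do, so
the choice depends on what the caller hands in.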

The xml.entityDecode routine would also benefit from some strengthening.  Its
current handling could be improved, especially for HTML entities, which it does
not handle now.
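
For comparison, this is roughly what a decoder that understands HTML entities
looks like, sketched in Python with the standard library's html.unescape.  The
wrapper name entity_decode is hypothetical and this says nothing about how
xml.entityDecode is written; it only shows the behavior I'd like to see.

```python
import html

def entity_decode(text):
    """Decode named HTML entities plus decimal and hex character references."""
    return html.unescape(text)

# Named entity, hex reference, and the XML built-ins all come back as text
print(entity_decode("caf&eacute; &#xE9; &amp; &lt;"))
```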

There's a tremendous audience of non-English-speaking users out there.  There
are many tools available to them that understand how to properly encode
characters and express language tags.  It would be great to see Radio follow
their lead.

-Bill Kearney

Follow-Ups:
- Re: [syndication] html parsing as a horror story
  - From: "Dave Winer" <dave@userland.com>

References:
- html parsing as a horror story
  - From: "Bill Kearney" <wkearney99@hotmail.com>
- Re: [syndication] html parsing as a horror story
  - From: burton@openprivacy.org
- Re: [syndication] html parsing as a horror story
  - From: "Bill Kearney" <wkearney99@hotmail.com>
- Re: [syndication] html parsing as a horror story
  - From: "Dave Winer" <dave@userland.com>
