Re: [syndication] O'Reilly's "Content Syndication with XML and RSS"
- To: syndication@yahoogroups.com
- Subject: Re: [syndication] O'Reilly's "Content Syndication with XML and RSS"
- From: burton@openprivacy.org
- Date: 02 May 2002 11:58:16 -0700
- In-reply-to: <+9aDNLAHjO08EAck@voidstar.com>
- References: <001c01c1f045$2d09c830$34177ac1@benhammersley.com> <87wuundrn7.fsf@openprivacy.org> <+9aDNLAHjO08EAck@voidstar.com>
- User-agent: Gnus/5.0808 (Gnus v5.8.8) Emacs/21.1.50
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Julian Bond <julian_bond@voidstar.com> writes:
> In article <87wuundrn7.fsf@openprivacy.org>, burton@openprivacy.org
> writes
> >Specifically some of the RSS readers can't handle valid XML.
> >For example:
> >
> >http://relativity.yi.org/rss/index.rss
> >
> >This uses explicit CDATA sections (100% valid XML) for descriptions but a number
> >of RSS tools can't handle it. Specifically Radio Userland.
> >
> >I think that it needs to be strongly worded that if you want to use RSS, use an
> >XML parser. Do not try to parse RSS by hand.
>
> Well, yes. But IMHO, at least part of the success of RSS in the marketplace
> was that it was so easy to parse that all you needed was a few string
> manipulation functions and maybe some regex.
Well... RSS is XML. If you think being parseable with a few string functions is
a requirement, then you should recommend that we drop XML support from RSS.
I personally don't think that should happen.
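To make the CDATA point concrete, here is a minimal sketch in Python (standard
library only). The feed fragment and the function names are made up for
illustration; this is not the feed at the URL above, and it is not how any
particular RSS tool works. It only shows that an XML parser returns the
description text, while a naive regex returns the raw CDATA wrapper.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical feed fragment; the description uses an explicit CDATA section.
FEED = """<?xml version="1.0"?>
<rss version="0.91">
  <channel>
    <title>Example channel</title>
    <item>
      <title>First item</title>
      <description><![CDATA[Some <b>bold</b> text & more]]></description>
    </item>
  </channel>
</rss>"""


def descriptions_with_parser(feed):
    """Return every <description> value using a real XML parser."""
    root = ET.fromstring(feed)
    return [d.text or "" for d in root.iter("description")]


def descriptions_with_regex(feed):
    """Naive hand-rolled extraction: whatever sits between the two tags."""
    return re.findall(r"<description>(.*?)</description>", feed, re.S)


if __name__ == "__main__":
    # The parser unwraps the CDATA section and yields the character data.
    print(descriptions_with_parser(FEED))
    # The regex leaks the CDATA delimiters into the result.
    print(descriptions_with_regex(FEED))
```

A hand-rolled reader can special-case CDATA, but it then has to special-case
entities, comments, and encodings as well, which is the work a parser already
does.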
> All those PHP driven sites like Postnuke would never have managed RSS, as PHP
> was a little late to the party with a good XML Parser.
I don't have a problem with people dropping out if they can't handle XML
correctly.
Yes, it would mean less market penetration, but it would also mean a higher
quality of clients.
> >Also... don't encode HTML entities within your title elements.
>
> I think you mis-spelt "include" when you wrote "encode". And as the biggest
> cause of damaged RSS is unencoded "&"s, this is dangerous advice. Perhaps
> you'd like to clarify what you were getting at.
Some people have DB backends that allow people to enter article titles like
'This is pretty <b>amazing!</b>'
Then they just encode them in their title
<title>This is pretty &lt;b&gt;amazing!&lt;/b&gt;</title>
... and they expect us to handle that! :)
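A rough sketch of what the reader ends up holding, assuming the title carries
the markup escaped as &lt;b&gt; ... &lt;/b&gt;. The item fragment and helper
names are hypothetical. After XML parsing the title is a plain string that
happens to contain tags, and the reader has to guess whether it is text or
markup.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical item whose title holds entity-escaped HTML.
ITEM = "<item><title>This is pretty &lt;b&gt;amazing!&lt;/b&gt;</title></item>"


def title_text(item_xml):
    """Return the title as the XML parser delivers it (entities resolved)."""
    return ET.fromstring(item_xml).findtext("title")


def title_as_plain_text_html(title):
    """Treat the title as plain text: escape it again before emitting HTML."""
    return html.escape(title)


if __name__ == "__main__":
    raw = title_text(ITEM)
    print(raw)                            # the string now contains literal <b> tags
    print(title_as_plain_text_html(raw))  # tags shown as text, not rendered
```

Nothing in the feed says which interpretation the publisher intended, so a
reader that treats titles as plain text will display the tags literally.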
I posted to the list about updating mod_content to support this. I didn't get
any approvals but no disapprovals either.
> I have a sort of background beef that RSS is essentially an envelope in XML
> around a package that is just a collection of arbitrary bytes. It's mildly
> irritating that in order to transport that package, the package itself has to
> be transformed into valid XML. This has led to a whole bunch of stupidity
> about entities, doctypes, dtds, and practices like double encoding (&amp; becomes
> &amp;amp;).
Well... those are normal XML gripes. Get used to it as we will be using XML
for a while :)
There are a bunch of good things about XML including namespaces, API support,
parser implementations, correct handling of Unicode, etc.
I don't think we should ditch these :)
I am also curious how all those hand-rolled RSS parsers handle UTF-16 :)
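As an illustration of the UTF-16 question, a small Python sketch (standard
library only; the feed content and function names are invented for the
example). The same document is handed to an XML parser and to a byte-level
regex of the kind a hand-rolled reader might use.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical feed, serialized as UTF-16 with a byte order mark.
FEED_TEXT = (
    '<?xml version="1.0" encoding="UTF-16"?>'
    "<rss><channel><title>Caf\u00e9 news</title></channel></rss>"
)
FEED_BYTES = FEED_TEXT.encode("utf-16")


def title_with_parser(data):
    """The parser detects the encoding and returns decoded text."""
    return ET.fromstring(data).findtext("channel/title")


def title_with_byte_regex(data):
    """Naive byte-level match that assumes one byte per ASCII character."""
    match = re.search(rb"<title>(.*?)</title>", data, re.S)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(title_with_parser(FEED_BYTES))      # decoded channel title
    print(title_with_byte_regex(FEED_BYTES))  # no match: each character is two bytes
```

The regex never sees the byte sequence for "<title>" because UTF-16 stores
every character in 16-bit units, so the tag name is interleaved with zero bytes.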
<snip/>
Kevin
- --
Kevin A. Burton ( burton@apache.org, burton@openprivacy.org, burtonator@acm.org )
Location - San Francisco, CA, Cell - 415.595.9965
Jabber - burtonator@jabber.org, Web - http://relativity.yi.org/
In this business, the only real open industry standard in the computer industry
is Linux, which thankfully remains beyond the clutches of the moguls. Everything
else is hokum designed to lock developers (and by extension, customers) into
proprietary corners of the computing constellation.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: Get my public key at: http://relativity.yi.org/pgpkey.txt
iD8DBQE80YxIAwM6xb2dfE0RAqXwAKCaZu3ZQlILULp8+iMbS4V4Ksfc2QCfR4F5
OwZ0PLKRzHwGh0efYdLrKNo=
=cypp
-----END PGP SIGNATURE-----