Re: [syndication] XML Character encoding (again)



Nyah, nyah, told ya so.

For more headache-inducing gobbledygook, read:
http://www.opengroup.org/onlinepubs/7908799/xbd/locale.html

One potential way out of it is to use a preview function with a drop-down of
the encodings you know you support.  That way, if a user wants to paste in
something beyond the normal range, they can help the system figure it out.
Perhaps put that on the preview screen with a "Hey, the letters aren't being
shown right" sort of tooltip or help link.
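
A rough sketch of what could sit behind such a preview, written in Python just
to show the idea.  The function name, the SUPPORTED list and the sample text are
all made up for illustration; the only real calls are the standard
bytes.decode() and str.encode().  It decodes the same submitted bytes under each
encoding in the drop-down so the user can pick the one that looks right.

```python
def preview_candidates(raw, encodings):
    """Decode the same raw bytes under each candidate encoding.

    Returns a dict mapping encoding name to the decoded text, or to None
    when the bytes are not valid in that encoding.
    """
    previews = {}
    for name in encodings:
        try:
            previews[name] = raw.decode(name)
        except UnicodeDecodeError:
            previews[name] = None
    return previews


# Hypothetical drop-down contents: the encodings the site claims to support.
SUPPORTED = ["utf-8", "euc_kr", "iso-8859-1"]

# Pretend the browser submitted a Korean recipe name as EUC-KR bytes.
submitted = "김치".encode("euc_kr")

for name, text in preview_candidates(submitted, SUPPORTED).items():
    shown = text if text is not None else "(not valid in this encoding)"
    print(name, "->", shown)
```

Note that a single-byte encoding such as ISO-8859-1 will "successfully" decode
almost anything into garbage, which is exactly why a human looking at the
preview is useful here.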

The Drupal guys do a bunch of i18n work; perhaps you could ask them for
suggestions.

This link has a script for converting:
http://www.php.net/manual/en/function.convert-cyr-string.php

Then there's also iconv:
http://www.php.net/manual/en/ref.iconv.php
http://www.gnu.org/software/libiconv/

The hardest part is 'detecting' what you're being handed.  Then it's a matter of
transcoding the data into a charset you know you can support, which here means
UTF-8.
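
As a minimal sketch of that detect-then-transcode step (in Python rather than
the PHP linked above, and with a made-up function name and candidate list), one
crude approach is to try strict decoding against each charset in order and
re-encode the first one that fits as UTF-8.  This is trial and error, not real
detection, so treat the result as a guess.

```python
def to_utf8(raw, candidates=("utf-8", "euc_kr", "cp1252")):
    """Guess the encoding of raw bytes and transcode them to UTF-8.

    Tries each candidate with strict decoding, in order, and returns
    (utf8_bytes, encoding_name) for the first one that decodes cleanly.
    The candidate list is an assumption; order it from strictest to
    most permissive, because permissive single-byte charsets accept
    nearly any input.
    """
    for name in candidates:
        try:
            text = raw.decode(name)
        except UnicodeDecodeError:
            continue
        return text.encode("utf-8"), name
    raise ValueError("none of the candidate encodings fit the input")


# Example: a Korean recipe name that arrived as EUC-KR bytes.
utf8_bytes, guessed = to_utf8("김치".encode("euc_kr"))
print(guessed, utf8_bytes.decode("utf-8"))
```

UTF-8 goes first because random legacy-encoded bytes rarely happen to be valid
UTF-8, so a clean UTF-8 decode is a fairly strong signal.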

-Bill Kearney



----- Original Message -----
From: "Julian Bond" <julian_bond@voidstar.com>
To: <syndication@yahoogroups.com>
Sent: Wednesday, April 16, 2003 2:52 PM
Subject: Re: [syndication] XML Character encoding (again)


> Bill Kearney <ml_yahoo@ideaspace.net> wrote:
> >"aggressive parsers" wooo, there's a loaded statement.
>
> Heh!
>
> I just knew this would happen. A Korean ex-pat has started a Korean
> cooking club on the site and he's started putting Korean characters in
> the recipe names in his blog.
>
> That sort of thing can just ruin your XML day.
>
> I give up. My feed is broken.