<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Cleaning UTF-8 Streams for XML Parsing</title>
	<atom:link href="http://www.consil.co.uk/blog/2010/02/07/cleaning-utf-8-streams-for-xml-parsing/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.consil.co.uk/blog/2010/02/07/cleaning-utf-8-streams-for-xml-parsing/</link>
	<description></description>
	<lastBuildDate>Tue, 10 May 2011 09:14:14 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
<xhtml:meta xmlns:xhtml="http://www.w3.org/1999/xhtml" name="robots" content="noindex" />
	<item>
		<title>By: Jason</title>
		<link>http://www.consil.co.uk/blog/2010/02/07/cleaning-utf-8-streams-for-xml-parsing/#comment-39</link>
		<dc:creator>Jason</dc:creator>
		<pubDate>Wed, 05 Jan 2011 10:43:18 +0000</pubDate>
		<guid isPermaLink="false">http://www.consil.co.uk/blog/?p=611#comment-39</guid>
		<description>I realised I have missed out one important invalid character - the NULL byte (x00). I&#039;ll update the regular expressions when I get a chance.</description>
		<content:encoded><![CDATA[<p>I realised I have missed out one important invalid character &#8211; the NULL byte (x00). I&#8217;ll update the regular expressions when I get a chance.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jason</title>
		<link>http://www.consil.co.uk/blog/2010/02/07/cleaning-utf-8-streams-for-xml-parsing/#comment-38</link>
		<dc:creator>Jason</dc:creator>
		<pubDate>Wed, 05 Jan 2011 10:42:17 +0000</pubDate>
		<guid isPermaLink="false">http://www.consil.co.uk/blog/?p=611#comment-38</guid>
		<description>Sorry Visu - I have no experience of Java to help with any migration.</description>
		<content:encoded><![CDATA[<p>Sorry Visu &#8211; I have no experience of Java to help with any migration.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Visu</title>
		<link>http://www.consil.co.uk/blog/2010/02/07/cleaning-utf-8-streams-for-xml-parsing/#comment-37</link>
		<dc:creator>Visu</dc:creator>
		<pubDate>Wed, 05 Jan 2011 09:27:22 +0000</pubDate>
		<guid isPermaLink="false">http://www.consil.co.uk/blog/?p=611#comment-37</guid>
		<description>Hi Jason...
Thanks for the post.
I need a Java version of the regex to filter the invalid XML characters.
As you said, I also found that not all valid UTF-8 characters are in fact valid in well-formed XML.

Please help me...</description>
		<content:encoded><![CDATA[<p>Hi Jason&#8230;<br />
Thanks for the post.<br />
I need a Java version of the regex to filter the invalid XML characters.<br />
As you said, I also found that not all valid UTF-8 characters are in fact valid in well-formed XML.</p>
<p>Please help me&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: teo</title>
		<link>http://www.consil.co.uk/blog/2010/02/07/cleaning-utf-8-streams-for-xml-parsing/#comment-36</link>
		<dc:creator>teo</dc:creator>
		<pubDate>Sat, 14 Aug 2010 21:19:40 +0000</pubDate>
		<guid isPermaLink="false">http://www.consil.co.uk/blog/?p=611#comment-36</guid>
		<description>Thank you so much for this!
It really helped with preparing some strings for importing into Magento using its SOAP API.
I really feel like the poster above! :)</description>
		<content:encoded><![CDATA[<p>Thank you so much for this!<br />
It really helped with preparing some strings for importing into Magento using its SOAP API.<br />
I really feel like the poster above! <img src='http://www.consil.co.uk/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jason</title>
		<link>http://www.consil.co.uk/blog/2010/02/07/cleaning-utf-8-streams-for-xml-parsing/#comment-35</link>
		<dc:creator>Jason</dc:creator>
		<pubDate>Wed, 26 May 2010 11:25:27 +0000</pubDate>
		<guid isPermaLink="false">http://www.consil.co.uk/blog/?p=611#comment-35</guid>
		<description>Glad you found it useful, Leon. If you find any bugs, just post them here and I&#039;ll get them included.

Now that I&#039;ve come back, I realise the code is not formatting properly in the post, but the downloadable version should still be okay.

-- Jason</description>
		<content:encoded><![CDATA[<p>Glad you found it useful, Leon. If you find any bugs, just post them here and I&#8217;ll get them included.</p>
<p>Now that I&#8217;ve come back, I realise the code is not formatting properly in the post, but the downloadable version should still be okay.</p>
<p>&#8211; Jason</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: leon</title>
		<link>http://www.consil.co.uk/blog/2010/02/07/cleaning-utf-8-streams-for-xml-parsing/#comment-34</link>
		<dc:creator>leon</dc:creator>
		<pubDate>Tue, 25 May 2010 11:39:15 +0000</pubDate>
		<guid isPermaLink="false">http://www.consil.co.uk/blog/?p=611#comment-34</guid>
		<description>This is superb, thanks for sharing it; it&#039;s made my day.

I had been hit by this problem using simple_xml with some third-party XML, which it refused to parse due to the invalid characters. Just using iconv or utf8_encode did not do the job and, like you, I was surprised to learn about the well-formed UTF-8 problem. Other solutions I found removed too many characters, but yours worked a treat.

Keep up the blogging and thanks again.</description>
		<content:encoded><![CDATA[<p>This is superb, thanks for sharing it; it&#8217;s made my day.</p>
<p>I had been hit by this problem using simple_xml with some third-party XML, which it refused to parse due to the invalid characters. Just using iconv or utf8_encode did not do the job and, like you, I was surprised to learn about the well-formed UTF-8 problem. Other solutions I found removed too many characters, but yours worked a treat.</p>
<p>Keep up the blogging and thanks again.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

