<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Better competitive intelligence through scraping with Groovy</title>
	<atom:link href="http://www.keplarllp.com/blog/2010/01/better-competitive-intelligence-through-scraping-with-groovy/feed" rel="self" type="application/rss+xml" />
	<link>http://www.keplarllp.com/blog/2010/01/better-competitive-intelligence-through-scraping-with-groovy</link>
	<description>Blogging from the team at Keplar LLP</description>
	<lastBuildDate>Tue, 20 Sep 2011 14:56:03 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
	<item>
		<title>By: Alex</title>
		<link>http://www.keplarllp.com/blog/2010/01/better-competitive-intelligence-through-scraping-with-groovy/comment-page-1#comment-905</link>
		<dc:creator>Alex</dc:creator>
		<pubDate>Fri, 19 Mar 2010 01:29:02 +0000</pubDate>
		<guid isPermaLink="false">http://www.keplarllp.com/blog/?p=198#comment-905</guid>
		<description>Thanks for your comment! You&#039;re right, the code isn&#039;t coping with relative URLs. I&#039;ve updated the offending line to be:

[sourcecode language=&quot;groovy&quot; light=&quot;true&quot;]
def productURL = new URI(seedURL.toString()).resolve(it).toURL()
[/sourcecode]

Hopefully that fixes it. I hadn&#039;t heard of &lt;a href=&quot;http://groovy.codehaus.org/Grape&quot; rel=&quot;nofollow&quot;&gt;Grape&lt;/a&gt; - it looks cool, will check it out!</description>
		<content:encoded><![CDATA[<p>Thanks for your comment! You&#8217;re right, the code isn&#8217;t coping with relative URLs. I&#8217;ve updated the offending line to be:</p>
<pre class="brush: groovy; light: true; title: ; notranslate">
def productURL = new URI(seedURL.toString()).resolve(it).toURL()
</pre>
<p>Hopefully that fixes it. I hadn&#8217;t heard of <a href="http://groovy.codehaus.org/Grape" rel="nofollow">Grape</a> &#8211; it looks cool, will check it out!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Guy</title>
		<link>http://www.keplarllp.com/blog/2010/01/better-competitive-intelligence-through-scraping-with-groovy/comment-page-1#comment-903</link>
		<dc:creator>Guy</dc:creator>
		<pubDate>Tue, 16 Mar 2010 23:11:24 +0000</pubDate>
		<guid isPermaLink="false">http://www.keplarllp.com/blog/?p=198#comment-903</guid>
		<description>Thanks, great post!

On the first run I got:

Accessing product URL /shop/product1.html
Caught: java.net.MalformedURLException: no protocol: /shop/product1.html
	at x$_run_closure1_closure2_closure6.doCall(x.groovy:32)
	at x$_run_closure1_closure2.doCall(x.groovy:29)
	at x$_run_closure1.doCall(x.groovy:20)
	at x.run(x.groovy:15)

So I added &#039;http://labs.keplarllp.com/&#039; as the root of the URL, and it worked.

I guess support for relative paths is missing.

Also, I had to download CyberNeko manually. Does anyone know why
@Grapes(
    @Grab(group=&#039;nekohtml&#039;, module=&#039;nekohtml&#039;, version=&#039;0.7.6&#039;)
)
doesn&#039;t work?</description>
		<content:encoded><![CDATA[<p>Thanks, great post!</p>
<p>On the first run I got:</p>
<p>Accessing product URL /shop/product1.html<br />
Caught: java.net.MalformedURLException: no protocol: /shop/product1.html<br />
	at x$_run_closure1_closure2_closure6.doCall(x.groovy:32)<br />
	at x$_run_closure1_closure2.doCall(x.groovy:29)<br />
	at x$_run_closure1.doCall(x.groovy:20)<br />
	at x.run(x.groovy:15)</p>
<p>So I added &#8216;<a href="http://labs.keplarllp.com/" rel="nofollow">http://labs.keplarllp.com/</a>&#8217; as the root of the URL, and it worked.</p>
<p>I guess support for relative paths is missing.</p>
<p>Also, I had to download CyberNeko manually. Does anyone know why<br />
@Grapes(<br />
    @Grab(group='nekohtml', module='nekohtml', version='0.7.6')<br />
)<br />
doesn&#8217;t work?</p>
]]></content:encoded>
	</item>
</channel>
</rss>

