Freakonomics and Partial Feeds
The popular Freakonomics Blog has recently moved to the New York Times, and along the way it's dropped its full-text RSS feed. Stephen Dubner wrote a thoughtful explanation of the reasons for the switch (basically, advertisers aren't comfortable with RSS), but judging from the comment section, the blog's readers are still upset.
Well, let me offer a gentle reminder that our very own full-text RSS tool continues to work, and was designed for exactly this purpose. I've tested it with the new, truncated Freakonomics feed and it works great. Why not give it a try and help push advertisers just a little bit closer to grappling with the internet?
UPDATE: Alas! It looks like the Freakonomics authors have adopted the intermittent and irritating habit of writing descriptions in the RSS description field rather than including excerpts of the actual text. That confuses our general-purpose algorithm.
However, it won't stop a dedicated Freakonomics fan from creating a blog-specific script to provide full feeds. Here, I'll even get them started:
m/<div\s+class="post\-content">(.*?)\-+>/i
s/<\/?div[^>]*>//igx;
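For anyone who wants to see those two patterns in action, here's a rough Ruby sketch of how they might be applied. Treat it as a starting point under some assumptions: the sample markup and the helper name extract_post_body are made up for illustration, Ruby's /m flag is added so that "." can span newlines, and the last cleanup step (trimming a leftover comment opener) is my own addition rather than part of the patterns above. The real Freakonomics markup may well differ.

```ruby
# Sketch only: the sample markup and the helper name are assumptions,
# not the real Freakonomics page or the full-text tool's actual code.
POST_BODY = /<div\s+class="post-content">(.*?)-+>/mi # /m: "." also matches newlines
DIV_TAGS  = /<\/?div[^>]*>/i

def extract_post_body(html)
  match = POST_BODY.match(html)
  return nil unless match                 # no post-content div found
  body = match[1].gsub(DIV_TAGS, "")      # drop nested <div> wrappers
  body.sub(/<!-*[^>]*\z/, "").strip       # added step: trim a dangling comment opener
end

# Invented sample page; the closing HTML comment is what "-+>" stops at.
SAMPLE_HTML = <<~HTML
  <div class="post-content">
    <div class="inner"><p>Full text of the post.</p></div>
    <!-- end post -->
  </div>
HTML

puts extract_post_body(SAMPLE_HTML)
```

The first pattern stops lazily at the first run of dashes followed by ">", so if the post body itself contains an HTML comment, the capture will be cut short there.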

Comments
Could you post a link to a full feed that works, please? Thanks in advance.
7 February 2007
9 weeks 8 hours
You might want to consult the comments on the Freakonomics post -- looks like someone's gone ahead and supplied a link.
If you're ever so inclined, you should take a look at Hpricot, the phenomenal Ruby-based HTML scraping library that supports XPath and CSS-style searching. It beats the pants off RegExps.
http://code.whytheluckystiff.net/hpricot/
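I wrote my own Full Text Freakonomics feed in a few minutes. It looks something like this:
require 'open-uri'
require 'hpricot'

# `feed` is assumed to be an already-parsed RSS feed object
feed.items.each { |item|
  doc = Hpricot(open(item.link))                        # fetch and parse the full post page
  item.description = (doc/".post-content").inner_html   # swap the excerpt for the post body
}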