Thread started by Dave Winer on Saturday, October 13, 2012.

My linkblog has an archive

The RSS feed for my linkblog, which goes back to December 2010, has a feature that I think no other RSS feed has (or Atom feed, for that matter). Not only does it have an archive, but the feed itself describes the archive. It contains enough information that you could easily write a script to download all the content of my linkblog, in RSS.
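To give a rough idea of what "the feed describes the archive" means in practice, here is a small Python sketch that pulls the archive description out of a feed. It assumes the `<microblog:archive>` element sits inside `<channel>` and has `url`, `filename`, `startDay` and `endDay` sub-elements with dates written as year-month-day, which is what the UserTalk script further down reads. The sample feed, its namespace URI and all of its values are placeholders made up for illustration, not copied from the real feed, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical sample feed. The namespace URI and every value are placeholders.
SAMPLE_FEED = """<rss version="2.0" xmlns:microblog="http://example.com/placeholder-microblog-ns">
  <channel>
    <title>Sample linkblog</title>
    <microblog:archive>
      <microblog:url>http://example.com/linkblog/</microblog:url>
      <microblog:filename>linkblog.xml</microblog:filename>
      <microblog:startDay>2010-12-01</microblog:startDay>
      <microblog:endDay>2012-10-13</microblog:endDay>
    </microblog:archive>
  </channel>
</rss>"""


def local_name(tag):
    """Strip ElementTree's "{namespace}" prefix, leaving the bare element name."""
    return tag.rsplit("}", 1)[-1]


def find_child(parent, name):
    """Return the first child whose bare name matches, ignoring its namespace."""
    for child in parent:
        if local_name(child.tag) == name:
            return child
    return None


def parse_day(text):
    """Turn "2012-10-13" into a date, assuming year-month-day order."""
    year, month, day = (int(part) for part in text.strip().split("-"))
    return date(year, month, day)


def read_archive_info(feed_xml):
    """Collect the four values found under channel/archive."""
    root = ET.fromstring(feed_xml)
    channel = find_child(root, "channel")
    archive = find_child(channel, "archive")

    def value(name):
        return find_child(archive, name).text.strip()

    return {
        "url": value("url"),
        "filename": value("filename"),
        "start_day": parse_day(value("startDay")),
        "end_day": parse_day(value("endDay")),
    }


info = read_archive_info(SAMPLE_FEED)
print(info)
```

Matching on the bare element name sidesteps the question of which namespace URI the real feed declares, so the sketch doesn't have to guess at it.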

1. In the feed, look for the <microblog:archive> element. Here's a screen shot.

2. I wrote a script that downloads the linkblog archive in full. Here's a text file containing the script. Here's the script as an OPML file, and as a .ftsc file. It's written in UserTalk, but could easily be converted to other scripting languages.

2a. For fun, I pasted the script into the outline for this blog post. Let's see if it works! :-)

    on downloadLinkblogArchive (urlFeed, folder)
        local (xmltext = tcp.httpreadurl (urlfeed, 3, false))
        xml.compile (xmltext, @xstruct)
        local (adrrss = xml.getaddress (@xstruct, "rss"))
        local (adrchannel = xml.getaddress (adrrss, "channel"))
        local (adrarchive = xml.getaddress (adrchannel, "archive"))
        local (urlarchive = xml.getvalue (adrarchive, "url"))
        local (fname = xml.getvalue (adrarchive, "filename"))
        on getdate (name)
            local (s = xml.getvalue (adrarchive, name))
            return (date.set (string.nthfield (s, "-", 3), string.nthfield (s, "-", 2), string.nthfield (s, "-", 1), 0, 0, 0))
        local (theday = getdate ("startDay"), endday = getdate ("endDay"))
        loop
            local (urlday = urlarchive + file.getdatepath ("/", theday) + fname)
            local (fday = folder + file.getdatepath (file.getpathchar (), theday, false) + ".xml")
            try
                local (xmltext = tcp.httpreadurl (urlday, 3, false))
                file.surefilepath (fday) //make sure all folders in path exist
                file.writewholefile (fday, xmltext)
                bundle //set mod and creation dates
                    local (xstruct, adrrss, adrchannel)
                    xml.compile (xmltext, @xstruct)
                    adrrss = xml.getaddress (@xstruct, "rss")
                    adrchannel = xml.getaddress (adrrss, "channel")
                    file.setcreated (fday, date (xml.getvalue (adrchannel, "pubDate")))
                    file.setmodified (fday, date (xml.getvalue (adrchannel, "lastBuildDate")))
                msg (fday)
            theday = date.tomorrow (theday)
            if theday > endday
                break
    bundle //test code
        downloadLinkblogArchive ("http://links.scripting.com/rss.xml", "Ohio:mylinkblog:")

3. I ran the script, and it created a folder. Here's a screen shot of the folder, and the folder itself as a zip archive.

4. Here are the docs for the archive element.
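Since the script could be converted to other scripting languages, here is a rough Python sketch of its day-by-day loop. It is not a line-for-line port. It assumes each day's file lives at the archive URL plus a zero-padded year/month/day path plus the filename (my reading of what `file.getdatepath` produces), it reports failed downloads instead of silently skipping them, and it leaves out the step that sets each file's creation and modification dates. The function names, the example.com URL and the `timeout` value are all hypothetical.

```python
import os
import urllib.request
from datetime import date, timedelta


def day_targets(archive_url, filename, start_day, end_day, folder):
    """Yield (url, local_path) for each day from start_day through end_day."""
    day = start_day
    while day <= end_day:
        # Assumed layout: <archive url>/YYYY/MM/DD/<filename>
        url = archive_url.rstrip("/") + "/" + day.strftime("%Y/%m/%d") + "/" + filename
        # Local copy goes to <folder>/YYYY/MM/DD.xml, as in the UserTalk script.
        local_path = os.path.join(
            folder, day.strftime("%Y"), day.strftime("%m"), day.strftime("%d") + ".xml"
        )
        yield url, local_path
        day += timedelta(days=1)


def download_archive(archive_url, filename, start_day, end_day, folder, timeout=30):
    """Fetch every day's file, skipping days that fail to download."""
    for url, local_path in day_targets(archive_url, filename, start_day, end_day, folder):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                data = response.read()
        except OSError as err:  # urllib's URLError and HTTPError are OSError subclasses
            print("skipped", url, err)
            continue
        os.makedirs(os.path.dirname(local_path), exist_ok=True)  # make sure folders exist
        with open(local_path, "wb") as f:
            f.write(data)
        print(local_path)


# Dry run with placeholder values: list what would be fetched, without any network access.
targets = list(
    day_targets(
        "http://example.com/linkblog/",
        "linkblog.xml",
        date(2012, 2, 28),
        date(2012, 3, 1),
        "mylinkblog",
    )
)
for url, path in targets:
    print(url, "->", path)
```

Keeping the URL and path arithmetic in `day_targets`, separate from the downloading, means you can check the list of days before touching the network. To fetch for real you would call `download_archive` with the url, filename, startDay and endDay values read from the feed's archive element.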

Hopefully that covers all the bases.
