A quick note on blog sustainability

[edit: I’ve been told the word I’m looking for is actually preservation, not sustainability. Whoops.]

Sustainability’s a tricky word. I don’t mean whether the scottbot irregular is carbon neutral, or whether it’ll make me enough money to see me through retirement. This post is about whether scholarly blog posts will last beyond their author’s ability or willingness to sustain them technically and financially.

A colleague approached me at a conference last week, telling me she loved one of my blog posts, had assigned it to her students, and then had freaked out when my blog went down and she didn’t have a backup of the post. She framed it as being her fault, for not thinking to back up the material.



Of course, it wasn’t her fault that my site was down. As a grad student trying to save some money, I use the dirt-cheap bluehost for hosting my site. It goes down a lot. At this point, now that I’m blogging more seriously, I know I should probably migrate to a more serious hosting solution, but I just haven’t found the time, money, or inclination to do so.

This is not a new issue by any means, but my colleague’s comment brought it home to me for the first time. A lot has already been written on this subject by archivists, I know, but I’m not directly familiar with any of the literature. As someone who’s attempting to seriously engage with the scholarly community via my blog (excepting the occasional Yoda picture), I’m only now realizing how much of the responsibility of sustainability in these situations lies with the content creator, rather than with an institution or library or publishing house. If I finally decide to drop everything and run away with the circus (it sometimes seems like the more financially prudent option in this academic job market), *poof* the bulk of my public academic writings go the way of Keyser Söze.

So now I’m going to you for advice. If we’re aiming to make blogs good enough to cite, to make them countable units in the scholarly economy that can be traded in for things like hiring and tenure, to make them lasting contributions to the development of knowledge, what are the best practices for ensuring their sustainability? I feel like I haven’t been treating this bluehost-hosted blog with the proper respect it needs, if the goal of academic respectability is to be achieved. Do I self-archive every blogpost in my institution’s dspace? Does the academic community need to have a closer partnership with something like archive.org to ensure content persistence?


  1. ljegou, that looks like a good option, but I still would like to find more of a balance between hosting my own site and having it be archived somewhere. An elegant solution might be something that would just automatically archive an RSS feed.

  2. In the UK, there is the British Library-run UK Web Archive: http://www.webarchive.org.uk/ukwa/ to which one can nominate one’s own site. I don’t know how it works, but in the US there’s the Library of Congress: http://www.loc.gov/webarchiving/

    And of course, there’s archive.org, as you mention. Google also caches sites.

    But whether these archives alone constitute best practice is another matter. There are copyright questions, and technological matters: is the functionality and interactivity of the site preserved?

    I think also you are raising two different questions: long term preservation, and access during short-term downtime. For the latter, get a decent host and have a ‘back soon!’ holding page.

    • Good point in splitting the two up, John, thanks. Yes, I think the key – especially as these become more than simply static pages – is preserving the interactivity of the scholarly object. The difficulties in preserving something like ORBIS come to mind particularly, where we’re worried not just about the storage of the content, but also the ability of future computers to actually render the content as it was intended.

  3. Scott,

    I polled my office-mates in archives research and the quick not-enough-coffee-and-we’re-off-to-meetings 5-minute conversation concluded that sustainability is not quite the right term. Sustainability is generally about an institution’s relationship to *preservation* activities (usually in the form of a “sustainability plan”) that work towards object *persistence.* That last sentence is a bit awkward because I tried to clarify each term in relation to the others. Anyway, I think the terms you want, as you indicated on twitter, are “preservation” and “persistence.” Preservation being the activity and persistence being an attribute of the object (maybe?).

    The archives research community has kinda sorta begun looking into the scholarly blog persistence problem. The most comprehensive treatment has been Carolyn Hank’s dissertation at the UNC iSchool: “Scholars and their blogs.”

    She found the state of archiving scholarly blogs to be idiosyncratic at best. For more information I’d check out her slideshare, a few good presentations on the state of scholarly blogging from an archives perspective:
    “Blog preservation some considerations” – http://www.slideshare.net/carolynhank/blog-preservation-some-considerations
    a bunch more here: http://www.slideshare.net/carolynhank/presentations

    http://perma.cc, a project out of Harvard library, is an interesting solution, but is more focused on saving things from journal citations.

  4. Locally at Illinois, the Library has begun scraping and archiving faculty blogs. That’s a limited solution. And of course preserving html may not in itself guarantee legibility to future generations. But that’s probably where I would look for preservation — libraries.
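
The idea raised in the first comment, automatically archiving an RSS feed, could be sketched roughly as follows. This is only an illustration, not something anyone in the thread built: it assumes a plain RSS 2.0 feed with no XML namespaces, and it assumes that the Wayback Machine's "Save Page Now" address is formed by appending a page URL to `https://web.archive.org/save/`. The sample feed and the function names `post_links` and `save_urls` are made up for the example. The code only builds the addresses; actually requesting them (and doing so politely, on a schedule) is left out.

```python
import xml.etree.ElementTree as ET

# Assumption: appending a page URL to this prefix asks the Wayback Machine
# to capture that page. Check archive.org's own guidance before relying on it.
SAVE_PREFIX = "https://web.archive.org/save/"

# Hypothetical feed used only for illustration.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel>
<title>Example blog</title>
<link>http://example.org/</link>
<item><title>First post</title><link>http://example.org/first-post</link></item>
<item><title>Second post</title><link> http://example.org/second-post </link></item>
</channel></rss>"""


def post_links(rss_xml):
    """Return the <link> of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    links = []
    for item in root.iter("item"):
        link = item.findtext("link")
        # Skip items with a missing or empty link.
        if link and link.strip():
            links.append(link.strip())
    return links


def save_urls(rss_xml):
    """Build one capture-request address per post in the feed."""
    return [SAVE_PREFIX + link for link in post_links(rss_xml)]


if __name__ == "__main__":
    for url in save_urls(SAMPLE_FEED):
        print(url)
```

Run periodically against a blog's real feed, something like this would hand each new post to an outside archive without the author having to remember to do it. It would not address the interactivity and rendering concerns raised above.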