Archiving

A couple of casual online conversations:

First, journalist Jamie Bartlett banging on about blockchain on Twitter.

It became fashionable in 2015 to dismiss bitcoin but get excited about blockchain. I never really got it, because what makes the blockchain work is the fact that there are rewards for building it. I can download the blockchain without even knowing who I am downloading it from. Because (a) it takes enormous resources to create that data, and (b) that enormous effort is only rewarded if the recent blocks were added to the longest chain that other bitcoin users were seeing at the time, I can be very confident that the whole chain, at least up to the last few blocks, is the same one anyone else is seeing, even though I don't know who I got mine from and I don't know who they would get theirs from.

A blockchain without a cryptocurrency to reward the miners who create the blockchain is just a collection of documents chained by each containing the hash of its parent. In other words, it is just git.

What I hadn't realised is that the people so excited about blockchains actually didn't know about git, even though this aspect of bitcoin's design is the same hash-chaining that git uses, and even though git is about 100-1000X more widely used than bitcoin. They maybe knew that git was a source control system, and that you could store and share stuff on github.com, but they didn't know that you cannot publish a git project with a modified history without the change being obvious to anyone who loads it and previously had the true version of that history. Every commit ID is a hash that covers the commit's parents, so changing anything in the past changes every ID that comes after it. If you publish something via git, anyone can get a copy from you or from each other, and anyone can add material, but if anyone tampers with history, it will immediately show.
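To make the tamper-evidence point concrete, here is a minimal sketch in Python. It is not git's actual object format: the `commit_id` and `build_chain` helpers are made up for illustration, and SHA-256 over "parent id plus content" stands in for a real commit hash. The principle is the same, though. Each entry's ID depends on its parent's ID, so an edit anywhere changes every ID from that point on.

```python
import hashlib


def commit_id(parent_id, content):
    """Hash a document together with its parent's id.

    A simplified stand-in for a git commit hash, not git's real format.
    """
    return hashlib.sha256(f"{parent_id}\n{content}".encode("utf-8")).hexdigest()


def build_chain(documents):
    """Return one id per document, each chained to its predecessor."""
    ids = []
    parent = ""  # the first document has no parent
    for doc in documents:
        parent = commit_id(parent, doc)
        ids.append(parent)
    return ids


original = ["first post", "second post", "third post"]
tampered = ["first post", "second post (edited)", "third post"]

original_ids = build_chain(original)
tampered_ids = build_chain(tampered)

# Ids before the edit still match; the edited entry and every later one differ.
first_divergence = next(
    i for i, (a, b) in enumerate(zip(original_ids, tampered_ids)) if a != b
)
print(first_divergence)  # index of the first entry whose id changed
```

Anyone holding the original IDs only needs to compare the latest one to know whether the history behind it is the one they already had.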

So, when Bartlett said "Parliament should put its records on a blockchain", what I deduced he really meant was "Parliament should check its records into git". Which, if you happen to care for some reason about the wafflings of that bunch of traitors and retards, is a fairly sensible point.

So much for that. On to incidental conversation the second.

P D Sutherland has been in the news, speaking in his role as Special Representative of the Secretary-General of the United Nations. @Outsideness highlighted a tweet of his as "possibly the most idiotic remark I've ever seen"


The interesting thing is I distinctly remember a post on Sutherland, probably 2-3 years ago, on one of the then-young NRx blogs, and a bit of discussion in the comments. It's interesting because Sutherland is so stereotypical a Euro-politician (Irish bar -> Fine Gael -> Trilateral Commission -> European Commissioner -> United Nations) as to be worth attention. Further, it would be interesting to see what we made of him then, and to what extent we might have anticipated the present.

However, I couldn't find the post or discussion. Blogs come and go, writers change personas, and either it's gone or the search engines couldn't find it.


Putting these two together, we need to archive our valuable materials, and the proper tool for a distributed archive is git. Spidering a blog might work for a dead one like Moldbug's, but is a poor way of maintaining a reserve archive of numerous live ones.

I've written some Ruby scripts to convert blog export files and feed files into one file per post or comment, so they can be archived permanently. It's all a bit scrappy at the moment, but it seems to work.
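For a rough idea of what that conversion involves, here is a sketch in Python rather than the Ruby scripts themselves. It assumes a plain RSS 2.0 feed. The `split_feed` function, the sample feed, and the file layout (a few header lines, then the body) are all invented for illustration. File names are derived from each item's guid, so running it again on an updated feed rewrites the same files and only adds new ones, which is what you want before committing to git.

```python
import hashlib
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

# A made-up two-post feed, standing in for a real blog's RSS output.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <title>Example blog</title>
  <item>
    <guid>http://example.com/2015/01/first</guid>
    <title>First</title>
    <pubDate>Thu, 01 Jan 2015 10:00:00 GMT</pubDate>
    <description>Hello</description>
  </item>
  <item>
    <guid>http://example.com/2015/02/second</guid>
    <title>Second</title>
    <pubDate>Sun, 01 Feb 2015 10:00:00 GMT</pubDate>
    <description>Hello again</description>
  </item>
</channel></rss>"""


def split_feed(feed_xml, out_dir):
    """Write one text file per <item> in an RSS 2.0 feed; return the paths."""
    root = ET.fromstring(feed_xml)
    written = []
    for item in root.iter("item"):
        # Fall back to the link if the feed has no guid for this item.
        guid = item.findtext("guid") or item.findtext("link") or ""
        title = item.findtext("title", default="")
        date = item.findtext("pubDate", default="")
        body = item.findtext("description", default="")
        # A stable name per post keeps repeated runs from creating duplicates.
        name = hashlib.sha1(guid.encode("utf-8")).hexdigest()[:16] + ".txt"
        path = Path(out_dir) / name
        path.write_text(
            f"Title: {title}\nDate: {date}\nId: {guid}\n\n{body}\n",
            encoding="utf-8",
        )
        written.append(path)
    return written


with tempfile.TemporaryDirectory() as tmp:
    files = split_feed(SAMPLE_FEED, tmp)
    print(len(files), "posts written")
```

In real use the output directory would be a git working tree, with a `git add` and `git commit` after each run. Comments and Atom feeds would need their own handling, which this sketch leaves out.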

The idea (when it's a bit more developed) would be that a blog owner could offer the blog as a git archive alongside the actual web interface. Anyone could clone that, and keep it updated using the feed. If the blog ever vanishes, the git clones still exist and can be easily shared.

(I wouldn't advise posting the git archive to a public site like github. The issue is not privacy--the data is all public in the first place--but deniability. If you decide to delete your blog, then a recognised public archive is something people can point to in order to use the content against you, whereas a personal copy is less attributable. Of course, you can't prevent people from posting it, but you can't prevent archive.org or the like either.)
