Category: Scheme

Further progress on Ugarit archival mode

Further to my last post on the matter, I've been working on the basic user interface to accessing archive metadata.

As before, let's do an import to an archive tag in a vault. I've made a manifest file listing three MP3s. All of the metadata in it could be extracted from the ID3 tags, and I plan to write a tool to automate the generation of manifests by examining files in exactly that manner, but for now I had to write one by hand:

[alaric@ahusai ugarit]$ cat test.manifest
(object "/home/alaric/archive/sorted-music/UNKLE/Psyence Fiction/13 Be There.mp3"
        (title = "Be There")
        (track = 13)
        (artist = "UNKLE")
        (album = "Psyence Fiction"))

(object "/home/alaric/archive/sorted-music/UNKLE/Psyence Fiction/11 Rabbit in Your Headlights.mp3"
        (title = "Rabbit in Your Headlights")
        (track = 11)
        (artist = "UNKLE")
        (album = "Psyence Fiction"))

(object "/home/alaric/archive/sorted-music/Led Zeppelin/Remasters/1-09 Celebration Day.mp3"
        (title = "Celebration Day")
        (track = 9)
        (volume = 1)
        (artist = "Led Zeppelin")
        (album = "Remasters"))
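The manifest generator mentioned above could start out as something like the following. This is only a sketch in Python, not part of Ugarit: `format_value` and `manifest_entry` are hypothetical names, the metadata is passed in as a plain dict rather than read from ID3 tags, and the string escaping rules are an assumption, since the manifest syntax is only shown here by example.

```python
def format_value(value):
    """Render a property value: integers bare, anything else as a quoted string.

    The backslash/quote escaping is an assumption about the manifest syntax.
    """
    if isinstance(value, int):
        return str(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def manifest_entry(path, props):
    """Build one (object ...) form laid out like the hand-written manifest."""
    lines = [f"(object {format_value(path)}"]
    lines += [
        f"        ({name} = {format_value(value)})"
        for name, value in props.items()
    ]
    # The final property line also closes the enclosing (object ...) form.
    return "\n".join(lines) + ")"


entry = manifest_entry(
    "/home/alaric/archive/sorted-music/UNKLE/Psyence Fiction/13 Be There.mp3",
    {"title": "Be There", "track": 13, "artist": "UNKLE", "album": "Psyence Fiction"},
)
print(entry)
```

A real tool would fill the dict from each file's ID3 tags and write one entry per file, separated by blank lines.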

As before, I import it, loading the files into the content-addressable storage of the vault, automatically deduplicating, and possibly storing the data on a cluster of remote servers (although in this case, I'm just using a local vault). This was done with Ugarit revision [80b324f3af]:

[alaric@ahusai ugarit]$ ugarit import test.conf music test.manifest
Loading manifest file test.manifest...
Importing from test.manifest to tag music...
Importing /home/alaric/archive/sorted-music/Led Zeppelin/Remasters/1-09 Celebration Day.mp3...
...imported with key 4d64e4650333741cb56c3e6a785b6de4d23324cb1055e529
Importing /home/alaric/archive/sorted-music/UNKLE/Psyence Fiction/11 Rabbit in Your Headlights.mp3...
...imported with key 370bee7debb458357a2b879014d4abbeb409215ed269c1c6
Importing /home/alaric/archive/sorted-music/UNKLE/Psyence Fiction/13 Be There.mp3...
...imported with key 39df8bafd530a66614ad60ab323033b1385cdd842528dbd2
Committing import...
Imported successfully to tag music with import key ac26354ccfb0530109932c1aaddd414b59d4394d44ec43cd
Written 16MiB to the vault in 24 blocks, and reused 0B in 1 blocks (before compression)

But now it's in, we can query the metadata. Firstly, let's see what properties are available - a combination of the ones we wrote in the manifest, and automatically-generated ones such as a MIME type and the original import path:

[alaric@ahusai ugarit]$ ugarit search-props test.conf music
album
artist
filename
import-path
mime-type
title
track
volume

Let's see what values there are for the "artist" property:

[alaric@ahusai ugarit]$ ugarit search-values test.conf music artist
UNKLE
Led Zeppelin

(They're sorted by popularity; we have two UNKLE tracks, so UNKLE comes first.)
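To illustrate what "sorted by popularity" means, here is a small Python sketch that counts how many objects carry each value of a property and lists the most common first. It is not how Ugarit computes the result (that isn't shown here); `values_by_popularity` is a hypothetical helper and the objects are plain dicts holding the artist values from the manifest.

```python
from collections import Counter


def values_by_popularity(objects, prop):
    """Return the distinct values of `prop`, most frequently used first."""
    counts = Counter(obj[prop] for obj in objects if prop in obj)
    return [value for value, _count in counts.most_common()]


# The three imported tracks, reduced to the property we care about.
tracks = [
    {"artist": "UNKLE"},
    {"artist": "UNKLE"},
    {"artist": "Led Zeppelin"},
]

print(values_by_popularity(tracks, "artist"))
```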

Let's see what UNKLE albums we have, by filtering for objects with an artist property of "UNKLE" and asking what values of the "album" property are available:

[alaric@ahusai ugarit]$ ugarit search-values test.conf music '(= ($ artist) "UNKLE")' album
Psyence Fiction

Let's see what we know about music by UNKLE:

[alaric@ahusai ugarit]$ ugarit search test.conf music '(= ($ artist) "UNKLE")'
object 39df8bafd530a66614ad60ab323033b1385cdd842528dbd2
    (album = "Psyence Fiction")
    (artist = "UNKLE")
    (filename = "13 Be There.mp3")
    (import-path = "/home/alaric/archive/sorted-music/UNKLE/Psyence Fiction/13 Be There.mp3")
    (mime-type = "audio/mpeg")
    (title = "Be There")
    (track = 13)
object 370bee7debb458357a2b879014d4abbeb409215ed269c1c6
    (album = "Psyence Fiction")
    (artist = "UNKLE")
    (filename = "11 Rabbit in Your Headlights.mp3")
    (import-path = "/home/alaric/archive/sorted-music/UNKLE/Psyence Fiction/11 Rabbit in Your Headlights.mp3")
    (mime-type = "audio/mpeg")
    (title = "Rabbit in Your Headlights")
    (track = 11)
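The filter expression can be read as "look up the artist property and compare it with a string". Here is a rough Python sketch of that reading, with the expression already parsed into nested tuples. It's an illustration only: `evaluate` is a hypothetical function, only the `$` and `=` forms used in this post are handled, and Ugarit's real search language may well behave differently in the details.

```python
def evaluate(expr, obj):
    """Evaluate a parsed filter expression against one object's properties."""
    if isinstance(expr, tuple):
        op, *args = expr
        if op == "$":
            # ($ name): look up a property; None if the object lacks it.
            return obj.get(args[0])
        if op == "=":
            left, right = (evaluate(arg, obj) for arg in args)
            return left == right
        raise ValueError(f"unsupported operator: {op}")
    # Anything that isn't a form is a literal value.
    return expr


# (= ($ artist) "UNKLE") as nested tuples.
query = ("=", ("$", "artist"), "UNKLE")

objects = [
    {"title": "Be There", "artist": "UNKLE"},
    {"title": "Rabbit in Your Headlights", "artist": "UNKLE"},
    {"title": "Celebration Day", "artist": "Led Zeppelin"},
]

matches = [obj["title"] for obj in objects if evaluate(query, obj)]
print(matches)
```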

Ok, let's listen to all our music by UNKLE (the extra "keys" parameter to the search command says to just output the object keys, one per line, and the "archive-stream" command streams the contents of an archived file to standard output):

[alaric@ahusai ugarit]$ for i in $(ugarit search test.conf music '(= ($ artist) "UNKLE")' keys);
do ugarit archive-stream test.conf music "$i" | mpg123 -;
done

...music by UNKLE plays...

We're slowly moving towards having a usable and useful archival filesystem, backed by a modular content-addressable storage system! Isn't that neat? Of course, it's not amazingly useful as it stands - at first sight, it's like a very crude version of the browser found in any modern music collection management app; but this is the seed of something much more interesting. For a start, it can categorise files using any user-defined schema. The backend storage can be encrypted, and accessed remotely over a network (and, in future, replicated over a cluster, or mirrored between your laptop and a home fileserver, and automatically synchronised when they're connected). The same storage can be used to store backup snapshots as well as archives, and if a file exists in any combination of archives and snapshots, then only one copy of it will be stored (or need uploading, even); most files in an archive will have started off in a backed-up directory tree, or will be extracted into one.
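The "only one copy" property falls out of content addressing: a block's key is derived from its contents, so the same bytes always land under the same key. Here is a toy Python sketch of the idea. It is not Ugarit's implementation; `BlockStore` is a hypothetical in-memory class, and SHA-256 is an assumption chosen purely for illustration.

```python
import hashlib


class BlockStore:
    """Toy in-memory content-addressed store (illustrative only)."""

    def __init__(self):
        self.blocks = {}   # key -> block contents
        self.written = 0   # blocks actually stored
        self.reused = 0    # blocks that were already present

    def put(self, data: bytes) -> str:
        # The key depends only on the contents (SHA-256 is an assumption).
        key = hashlib.sha256(data).hexdigest()
        if key in self.blocks:
            self.reused += 1
        else:
            self.blocks[key] = data
            self.written += 1
        return key


store = BlockStore()
k1 = store.put(b"same bytes")  # e.g. a file arriving via a backup snapshot
k2 = store.put(b"same bytes")  # the same file arriving via an archive import
print(k1 == k2, store.written, store.reused)
```

The second `put` finds the key already present, so nothing new is stored; that is the same "written" versus "reused" distinction the import summary reports.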

There are many interesting use cases for Ugarit, but my personal one is to have a fault-tolerant vault of all the data that matters to me, neatly organised so I can find things quickly, and so I can access things from different locations (even when offline). Rather than having files scattered over different disks on different machines, and having to move things around to make space, and remember where they are, I can add more disks to the vault when I need more capacity, and have Ugarit manage everything for me. With the amount of data I manage, that'll be a great weight off my mind!

Ugarit archive mode progress

Ugarit's archive mode is getting along nicely. I now have importing from a manifest file that specifies properties for the import as a whole, and a list of files to import with their own properties, and basic browsing of the audit trail of an archive in the virtual file system. That includes access to the properties of an import via the virtual "properties.sexpr" file. Note also that lots of import and file properties are automatically added, such as the hostname we import from, the input path for each file, a MIME type deduced from the extension, and so on.

Below the fold is a transcript of it in use, which probably won't mean much to many people...


Recent Ugarit progress

I had some time to work on Ugarit yesterday, which I made good use of.

I really should have worked on raw byte-stream-level performance issues - I did a large extract recently, and it took a whole week - but, having a restricted time window, I caved in and did something fun instead; I started work on archival mode. As a pre-requisite for this, I added the facility to give tags a "type" so we can distinguish archive tags from snapshot tags - thereby preventing embarrassing accidents that end up with a tag pointing to a mixture of snapshot and archive-import objects...

(Not that I didn't think about the performance issues. I have a plan in mind to rearrange the basic bulk-block-shovelling logic to avoid any allocation whatsoever by using a small number of reusable buffers, which should also avoid the copying required when talking to compression/encryption engines written in C.)
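The reusable-buffer idea can be shown in miniature. The Python sketch below copies a stream through a single preallocated buffer instead of allocating a new chunk on every read. It is only an illustration of the general technique, not Ugarit's code or the actual plan; `copy_stream` and the 64 KiB buffer size are made up for the example.

```python
import io


def copy_stream(src, dst, buffer_size=64 * 1024):
    """Copy src to dst through one reusable buffer; return bytes copied."""
    buf = bytearray(buffer_size)  # allocated once, reused for every read
    view = memoryview(buf)        # lets us slice the buffer without copying data
    total = 0
    while True:
        n = src.readinto(buf)     # fills the existing buffer in place
        if not n:
            break                 # 0 (or None) means no more data
        dst.write(view[:n])       # write only the bytes that were filled
        total += n
    return total


source = io.BytesIO(b"x" * 200_000)
sink = io.BytesIO()
copied = copy_stream(source, sink)
print(copied)
```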


Insomnia

There's something about the combination of having spent many weeks in a row without more than the odd half-hour here and there to myself (time when I get to do whatever I like, rather than merely choosing which of the list of things I need to get done urgently I will do next, or just having no choice at all), and knowing I need to get up even earlier the next morning than usual (to dive straight into a long day of scheduled activities), that makes it very, very, hard for me to sleep.

So, although I got to bed in good time for somebody who has to wake up at six o'clock, I have given up lying there staring at the ceiling, and come down to eat some more food (I get the munchies past midnight), read my book without disturbing Sarah with my bedside light, and potter on my laptop. I need to be up in five hours, so hopefully emptying my brain of whirling thoughts will enable me to sleep.

There's lots of things I want to do. Even though it's something I need to get done by a deadline, I'm actually enthusiastic about continuing the project I was working on today; making an enclosure for our chickens. This is necessary for us to be able to go away from the house for more than one night, which is something we want to do over Christmas; thus the deadline.

Three of the edges of the enclosure will be built onto existing walls or woodwork, but one of them needs to cut across some ground, so I've dug a trench across said bit of ground, laid an old concrete lintel and some concrete blocks in the trench after levelling the base with ballast, and then mixed and rammed concrete around them. When I next get to work on it, I'll mix up a large batch of concrete and use it to level the surface neatly (and then ram any left-overs into remaining gaps) to just below the level of the soil, then lay a row of engineering bricks (frog down) on a mortar bed on top of that in order to make a foundation that I can screw a wooden batten to. With that done, and some battens screwed into the tops of existing walls that don't already have woodwork on, I'll be able to build the frame of the enclosure (including a door), then attach fox-proof mesh to it, and our chickens will have a new home they can run around in safely.

Thinking about how I'm going to lay the next batch of concrete in a nice level run, working around the fact that I only have a short spirit level by placing a long piece of wood in there and levelling it with wedges and then using it as a reference to level the concrete to, has been one of the things running around in my head this evening.

Another has been the next steps from last Friday, when I had a fascinating meeting with a bunch of interesting people in the information security world. You see, I've always been interested in the foundation technologies upon which we build software, such as storage management, distributed computing, parallel computing, programming languages, operating systems, standard libraries, fault tolerance, and security. I was lucky enough to find a way into the world of database development a few years ago, which (with a move to a company that produces software to run SQL queries across a cluster) has broadened to cover storage management, distribution, parallelism, AND programming languages. So imagine my delight when said company starts to develop the security features in the product, and I can get involved in that; and even more when (through old contacts) I'm invited to the inaugural meeting of a prestigious group of people interested in security. That landed me an invite to the second meeting (chaired by an actual Lord, and held in the House of Lords!), the highlight of which was of course getting to talk to the participants after the presentations. I found out about the Global Identity Foundation, who are working on standardising the kind of pseudonymous identity framework I have previously pined for; I'm going to see if I can find a way to get more involved in that. But I need to do a lot of reading-up on the organisations and people involved in this stuff, and figuring out how I can contribute to it with my time and money restrictions.

I'd really like to have some quiet time to work on my secret fiction project, too. And I want to investigate Ugarit bugs. Some bugs in the Chicken Scheme system have been found and fixed lately, so I need to re-test all these bugs to see if any of the more mysterious ones were artefacts of that. I'm in a bit of a vicious circle with that; the longer it is since I've been tinkering with the Ugarit internals, the longer it'll take me to get back into it, and the more nervous I feel about doing so. I think I might need to pick off some lighter bit of work with good rewards (adding a new feature, say) and handle that first, to get back into the swing of things. Either way, I'll need a good solid day to dig into it all again; trying to assemble that from sporadic hours just won't cut it.

I'm still mulling over issues in the design of ARGON. Right now I'm reading a book on handling updates to logical databases - adding new facts to them, and handling the conflicts when the new facts contradict older ones, in order to produce a new state of the database where the new fact is now true, but no contradictions remain. I need to work this out to settle on a final semantics for CARBON, which will be required to implement distributed storage of knowledge within TUNGSTEN. I need a semantics that can converge towards a consensus on the final state of the system, despite interruptions in internal network connectivity within the cluster causing updates to arrive in different orders in different places; doing that efficiently is, well, easier said than done.

I really want to finish rebuilding my furnace, which I hoped to get done this Summer, but I'm still assembling the structural supports for it. I've made a mould to cast shaped refractory bricks for the lining of the furnace, but I've yet to mix up the heatproof insulating material the bricks need to be made out of and start casting the bricks, as I still need to work out how I'll form the tuyere.

I want to get Ethernet cabled to my workshop, because currently I don't have a proper place for working on my laptop; I have to do it on the sofa in the lounge to be within range of the wifi, which isn't very ergonomic, doesn't give me access to my external screens, and is prone to interruption by children. I find it very motivating to be in "my space", too; the computer desk in the workshop is all set up the way I like it. And just for fun, I'd like to rig the workshop with computer-controlled sensors and gizmos (that kind of thing is a childhood dream of mine...).

This past year, I've tried booking two weekend days a month for my projects, in our shared calendar. This worked well at the start of the year, with projects such as the workshop ladder and eaves proceeding well, but it started to falter around the Summer when we got really busy with festivals and the like. I started having to fit half-days in around other things, which meant spending too much time getting started and clearing up compared to actually getting things done, so my morale faltered; and with so much other stuff on, I've been increasingly inclined to spend my free time just relaxing rather than getting anything done. On a couple of occasions I've tried taking a week off work to pursue my projects, but I then feel guilty about it and start allocating days to spending more time with the children or tidying the house, and before I know it, five days off becomes one day of actual project work. I need to stop feeling guilty about taking time to do the things I enjoy, because if I don't, I'll be too tired and miserable to do a good job of the things I should be doing! And rather than booking my monthly project days around other stuff that's going on, next year I'm going to mark out my two days each month in advance, and then move them elsewhere in the month if Sarah needs me to do something on that particular day, to decrease the chance of ending up having to scrape together half-days around the month (or to skip project days entirely, as I ended up doing last month). I feel awful about saying I'm going to spend days doing what I feel like doing rather than the things the rest of my family need me to drive them to, but if I don't, I think I'm going to fall apart!

Now... off and on I've spent forty minutes writing this blog post. So with my whirling thoughts dumped out, I'm going to go back to bed and see if I can sleep this time around. Wish me luck!

Felix Winkelmann interviewed in Atomic Spin

Here's an interview with my favourite Scheme implementer!


Creative Commons Attribution-NonCommercial-ShareAlike 2.0 UK: England & Wales