Category: Computing

Venti: Append-Only Storage Management (by )

Check out this PDF:

http://www.cs.bell-labs.com/sys/doc/venti/venti.pdf

It's a description of a new mass storage system designed for Plan 9. Venti offers permanent archival storage, with basically two operations:

  1. Given a block of up to 52KB of data (and some metadata, which is fairly irrelevant), the system returns the SHA1 hash of that data.
  2. Given a SHA1 hash, the system returns the previously stored block of data with that hash.
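Those two operations amount to a content-addressed store, which can be sketched in a few lines (a toy Python model; `VentiStore` and its method names are mine, not Venti's actual interface, and a real Venti persists blocks in an append-only log rather than a dictionary):

```python
import hashlib

class VentiStore:
    """Toy content-addressed store: write returns a SHA-1 score,
    read looks the block up by that score."""
    MAX_BLOCK = 52 * 1024  # block size limit quoted in the post

    def __init__(self):
        self._blocks = {}  # score -> data; a real Venti appends to a log

    def write(self, data: bytes) -> str:
        if len(data) > self.MAX_BLOCK:
            raise ValueError("block too large")
        score = hashlib.sha1(data).hexdigest()
        self._blocks[score] = data  # idempotent: same data, same score
        return score

    def read(self, score: str) -> bytes:
        return self._blocks[score]
```

A nice consequence of addressing by hash is that writing the same block twice costs nothing extra, so duplicate data is stored only once.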


A new computer architecture (by )

I was in a hardware mood on a train a few months ago, so I typed up some notes about a possible different architecture for CPUs that might make good use of internal parallelism, have asynchronous control, and achieve high code density. The result would be an efficient CPU, but it does make interrupt handling a headache.

I then go overboard, designing a device interconnection framework for expansion! It was a boring train journey...

= UPDATE =

It occurs to me that a neat way of easing the problems of interrupt handling in highly asynchronous and parallel CPUs would be, quite simply, to not bother with them: let a dedicated I/O processor, with an architecture designed for low-latency context switches, handle interrupts instead. Pre-emption of user code would be handled by allowing the I/O processor to instruct the main CPU to switch contexts. For example, an interrupt saying that a block is ready for reading from the disk controller would make the code that was blocked waiting for that data runnable, and that code would then preempt the current process since it has a higher priority.

So the main CPU would still need context save and restore logic, but it wouldn't need to be able to nest IRQs or anything; all it would need is the ability to save the current context to RAM and then load another context from elsewhere in RAM, as an atomic operation, and it wouldn't need to be as fast about it as if it was in the critical path of interrupt handling.

Within my architecture below, this can be handled by having an (on-chip) interrupt processor which is a tiny stack-based MISC with local SRAM for code and data (shared with the main CPU). When an interrupt occurs, push the program counter (the only register!) of the MISC onto the stack, and jump to a vector in the SRAM. If the interrupt handler feels it needs the main CPU to reschedule, then it tells the CPU to switch to a context, giving it the address of the new context. The address the current context was loaded from is kept around in a register, so the CPU suspends instruction fetching, waits for all execution units to finish, then saves the contents of registers and FIFOs to the context in RAM, loads the new context pointer into the context pointer register, and then loads the new context and resumes execution. The CPU should check to see if the new context being switched to is the same as the old context (by comparing the provided pointer with that in the context register), and if so, do nothing.

Most of the time, the interrupt handlers would just cause a switch to a special scheduler context, which would work by choosing a new process to run, manually invoking a context switch to it, then looping back to the top of the code. While the scheduler context is running, further requests from the MISC to reschedule will be ignored, since that context is already active; this means that an interrupt arriving between (or during) the scheduling algorithm and the context switch, one which makes a higher-priority process runnable, may not result in that process running until the next context switch. So perhaps the CPU should handle a request to switch to the context it's already in by just reloading that context, which would restart the scheduling algorithm.

Any context switch request coming in while a switch is in progress should be queued, perhaps by suspending the MISC until the CPU is ready to perform the switch.
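The same-context logic above can be sketched in a toy Python model (names like `switch_to` are mine, and the saving of registers and FIFOs is reduced to swapping a pointer; queuing of requests mid-switch is not modelled):

```python
class MainCPU:
    """Toy model of the context-switch protocol described above."""

    def __init__(self, boot_ctx):
        self.current_ctx = boot_ctx  # address the running context came from
        self.reloads = 0             # same-context requests: reload in place
        self.switches = 0            # genuine switches: save old, load new

    def switch_to(self, ctx_ptr):
        """Handle a switch request from the interrupt processor."""
        if ctx_ptr == self.current_ctx:
            # Already in that context: just reload it, which restarts
            # (for the scheduler context) the scheduling algorithm.
            self.reloads += 1
            return
        # Otherwise: suspend fetching, drain execution units, save the
        # old context to RAM, load the new one (all modelled trivially).
        self.switches += 1
        self.current_ctx = ctx_ptr
```

So a redundant reschedule request while the scheduler is already running costs one reload rather than a full save/restore cycle.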


SOCKS vs NAT (by )

The standard solution these days to the problem of a large internal network of client machines that need Internet access is to put them behind a NAT, with a single external IP from which connections can originate.

However, before NAT was popular, I remember setting up SOCKS proxies, which did more or less the same thing.

The downside was that not all applications supported SOCKS, and annoyingly there generally wasn't an easy way of telling the whole machine to use SOCKS; each application had to be configured manually, while NAT works by providing the illusion of real Internet access.

The upside, however, is that SOCKS doesn't work by fooling anyone. The application knows that the IP address it gets from the local stack is not necessarily global, and can ask the SOCKS server what the global address is. And the application can ask for a listening socket, making peer to peer file transfers and the like work properly.

Perhaps rather than using NAT for more and more, we ought to be putting support for SOCKS right into the IP stacks of operating systems, so applications using the standard TCP/IP APIs work with SOCKS right out of the box, and specifying a DHCP option so a DHCP server can nominate a SOCKS server to a client machine?

Then we wouldn't have all this pain with peer-to-peer file transfers...
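For reference, the client side of a SOCKS5 connection setup is a very small protocol. Here's a sketch of the two messages a stack would need to emit, per RFC 1928 (the helper names are mine):

```python
import struct

def socks5_greeting() -> bytes:
    """Opening handshake: version 5, offering one auth method,
    0x00 (no authentication required)."""
    return b"\x05\x01\x00"

def socks5_connect_request(host: str, port: int) -> bytes:
    """CONNECT request: VER=5, CMD=1 (connect), RSV=0,
    ATYP=3 (domain name), then length-prefixed name and port."""
    name = host.encode("ascii")
    return (b"\x05\x01\x00\x03"
            + bytes([len(name)]) + name
            + struct.pack(">H", port))
```

SOCKS5 also defines a BIND command for incoming connections, which is exactly what makes the peer-to-peer case workable in a way plain NAT isn't.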

AJAX (by )

Eeeeeaaarrrggghhhhh....

People seem to have found a new way to make Web pages that aren't very usable for the blind.

AJAX is its name, and it's causing a lot of excitement about creating "rich user experiences".

I automatically cringe when I hear that. The Web works because HTML doesn't really define the "user experience"; it defines the structure of a document with headings and text and links and stuff. You can layer on a "user experience" with some optional, ignorable CSS. This means that screen readers can ignore the "user experience" designed by people who assume everyone can see a two-dimensional display surface, and examine the actual structure of the page.

Garbage collection (by )

One of the interesting little problems in converting a CPU, some RAM, and a bunch of I/O devices into a useful system is memory management.

Your everyday PC with 512MB of RAM contains about 512 million little memory cells, each of which contains an eight-digit binary number. This is where all running applications, and all the information they are dealing with, must be stored; programs, text, images, and whatnot are all encoded as sequences of these eight-digit numbers (known as bytes) and stuffed into memory.

The problem is - where to put everything? After all, when you double click on a program, the computer looks at the size of the program on disk, and needs that many bytes of memory to load the program into. How does it know which bits are free and which are in use? Different computers have different amounts of memory, and may be running any combination of apps already, so there's no way to reserve a given bit of memory for a particular program; the system has to keep track of what's in use and what isn't in real time. And when you load your app, it will start asking for bits of memory to keep track of the windows it has open, to store your document in, and so on.

Now, the first problem to be solved is how to keep track of which bits of memory are in use and which aren't, in such a way that the computer can efficiently find a block of free memory of a specified size - that problem is harder than it may seem, especially when you consider that multiple threads of execution will be requesting memory at once. But that's not the problem I was pondering as I sat on the train today.
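That free-block bookkeeping can be sketched as a first-fit free list (a toy Python version; real allocators also coalesce adjacent free extents on free, and need locking for those concurrent threads):

```python
class FreeListAllocator:
    """First-fit allocation over a flat address space."""

    def __init__(self, size):
        self.free = [(0, size)]  # list of (offset, length) free extents

    def alloc(self, n):
        """Find the first free extent of at least n bytes and carve
        the allocation off its front."""
        for i, (off, length) in enumerate(self.free):
            if length >= n:
                if length == n:
                    del self.free[i]          # extent consumed exactly
                else:
                    self.free[i] = (off + n, length - n)
                return off
        raise MemoryError("no free block large enough")

    def free_block(self, off, n):
        """Return an extent to the free list (naive: no coalescing)."""
        self.free.append((off, n))
```

First-fit is only one policy; best-fit, buddy systems, and segregated size classes all trade fragmentation against search time.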

My problem is how to figure out when a block of memory isn't used any more, so that it can be handed back to the system that keeps track of free blocks and reused.
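One classic answer is reference counting: track, per block, how many pointers refer to it, and hand it back when the count hits zero. A toy sketch (which also shows the recursive cascade when an object's last reference dies; it famously fails to collect cyclic structures):

```python
class Obj:
    """Toy heap object: a refcount plus the objects it points at."""

    def __init__(self, name):
        self.name = name
        self.refcount = 0
        self.refs = []  # outgoing references held by this object

def retain(obj):
    """A new reference to obj now exists."""
    obj.refcount += 1

def release(obj, freed):
    """A reference to obj has gone away; free it if none remain,
    releasing everything it referenced in turn."""
    obj.refcount -= 1
    if obj.refcount == 0:
        freed.append(obj.name)  # block goes back to the free list
        for child in obj.refs:
            release(child, freed)
```

Tracing collectors (mark-and-sweep and friends) answer the same question from the other direction, by finding everything still reachable and reclaiming the rest, which is what handles the cycles refcounting misses.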

Creative Commons Attribution-NonCommercial-ShareAlike 2.0 UK: England & Wales