Category: Crypto / security

A draft specificion for IRIDIUM (by )

As discussed in my previous post, I think it's lame that we use TCP for everything and think we could do much better!. Here's my concrete proposal for IRIDIUM, a protocol that I think could be a great improvement:

Underlying Transport

IRIDIUM assumes an underlying transport that can send datagrams up to some fixed size (which may not be known) to a specified address:port pair (where address and port can be anything, no particular representation is assumed); sent datagrams can be assigned a delivery priority (0-15, with 0 being the most important to deliver first) and a drop priority (0-15, with 0 being the most important to not drop), but the transport is free to ignore those. They're just hints.

You can also ask it to allocate a persistent source port (the transport gets to choose, not you), or to free one up when it's no longer needed, and to use that when sending datagrams rather than some arbitrary source port. The actual source port is not revealed by the underlying transport, nor is the source address, as it may not be known - Network Address Translation may be operating between the source and destination. All that matters is that it is consistent, and different to any other allocated source port from the same address.

The transport can also be able to be asked to listen on a specified port or address:port pair, and will return incoming datagrams that show up there. Replies to sent datagrams will also be returned, marked as either a reply to a datagram with no specific source port, or indicating which allocated source port the reply was received at (without necessarily revealing what the actual port is). Incoming datagrams may be provided with a "Congestion Notification" flag to warn that congestion is occurring within the network, and will all be marked with the source address and port they came from so that replies can be sent back.

The transport can also return errors where datagram sending failed; errors will specify the source and destination address and port of the datagram that caused the error.

(This pretty much describes UDP, when datagrams are sent with the Don't Fragment flag set and ECN enabled, and ICMP; but it could also be implemented as a raw IP protocol alongside UDP and TCP, or over serial links of various kinds, as will be described later).

The transport can also provide a "name resolution" mechanism, which converts an arbitrary string into an address:port pair.

Application-visible semantics

  1. Given an address:port combination, you can:
    1. Send a message of arbitrary size, with a drop priority (0-15) and a delivery priority (0-15). If the drop priority is 0, then the caller will be notified (eventually) of the message having arrived safely, or not. If the drop priority is non-zero, then the caller might be notified if the message is known not to arrive for any reason.
    2. Send a request message of arbitrary size, with a delivery priority (0-15). The caller will be notified, eventually, of non-delivery or delivery, and if delivery is confirmed, it will at some point be given a response message, which can be of arbitrary size. The response message may arrive in fragments.
    3. Request a connection, including a request message of arbitrary size, with a delivery priority (0-15). The caller will be notified, eventually, of non-delivery or delivery, and if delivery is confirmed, it will at some point be given a response message of arbitrary size and a connection context.
  2. Set up a listener on a requested port. Given a listener, you can:
    1. Be notified of incoming messages, tagged with the address:port they came from. They may arrive in fragments.
    2. Be notified of incoming request messages (which may arrive in fragments), tagged with the address:port they came from, and once the request has fully arrived, be able to reply with a response message of arbitrary size.
    3. Be notified of incoming connection requests (with a request message, that may arrive in fragments), tagged with the address:port they came from, and be able to reply with a response message of arbitrary size to yield a connection context, which you can immediately close if you don't want the connection.
    4. Close it.
  3. Given a connection context, you can:
    1. Do any of the things you can do with an address:port combination (optionally requesting ordered delivery, as long as the drop priority of any message is 0 and the delivery priority of anything sent ordered is always consistent).
    2. Do any of the things you can do with a listener (including closing it).
    3. Be notified of the other end closing the connection.

Datagram format

  1. Flag byte:
    1. Top 4 bits: protocol version (0000 binary).
    2. Lower 4 bits: datagram type (values defined below).
  2. 56-bit ID
  3. Type of Service (ToS) byte:
    1. Top 4 bits: drop priority.
    2. Lower 4 bits: delivery priority.
  4. Flags byte. From most significant to least significant bits:
    1. Control chunks are present.
    2. Ordered delivery (connection only).
    3. Rest unused (must be set to 0).
  5. If the "control chunks are present" flag is set, then we have one or more control chunks of this form:
    1. Control chunk type byte:
      1. Top bit set means another control chunk follows; unset means this is the last.
      2. Bottom seven bits are the control chunk type (values defined below).
    2. Control chunk contents (depending on the control chunk type):
      1. Type 0000000: Acknowledge
        1. 56-bit ID.
        2. 1 bit flag to indicate that the request being responded to arrived with the Congestion Notification flag set.
        3. 31 bits, encoding an unsigned integer: the number of microseconds between the request datagram arriving and the response datagram being sent.
      2. Type 0000001: Reject datagram
        1. 56-bit ID.
        2. Error code (8 bits).
        3. Optional error data, depending on error code:
          1. Type 00000000: CRC failure.
          2. Type 00000001: Drop priority was nonzero on a datagram other than type 0001 or 0110.
          3. Type 00000010: Spans were contiguous, overlapping, or extended beyond the datagram end in a type 1100 datagram.
          4. Type 00000011: Delivery priority differed between different ordered-delivery datagrams on the same connection.
          5. Type 00000100: Unknown datagram type code. Body is the type code in question, padded to 8 bits.
          6. Type 00000101: Unknown control chunk type. Body is the type code in question, padded to 8 bits.
          7. Type 00000110: Datagram arrived truncated at the recipient. Body is the length it was truncated at, as a 32-bit unsigned integer.
          8. Type 00000111: Datagram dropped due to overload at the recipient.
          9. Type 00001000: Temporary redirect. Body contains an address:port in the correct format for the underlying transport; caller should retry the datagram with the new destination.
          10. Type 00001001: connection ID not known for this source address:port / destination address:port combination.
          11. Type 00001010: Response received to unknown request ID.
          12. Type 00001011: Large datagram fragment is too small.
          13. Type 00001100: Large datagram begin rejected, it's too big and we don't have enough space to store it!
          14. Type 00001101: Large datagram fragment received, but we don't recognise the ID.
          15. Type 00001110: Ordered delivery flag was set on a datagram other than a connection message, request, or sub-connection open, or with a drop priority other than zero.
      3. Type 0000010: Acknowledge connection datagram
        1. 56-bit connection ID
        2. 56-bit datagram ID
        3. Acknowledgement data:
        4. Most significant bit is set to indicate that the request being responded to arrived with the Congestion Notification flag set.
        5. 31 bits, encoding an unsigned integer: the number of microseconds between the request datagram arriving and the response datagram being sent.
      4. Type 0000011: Reject connection datagram
        1. 56-bit connection ID
        2. 56-bit datagram ID
        3. Error code (8 bits).
        4. Error data, depending on error code, as per Type 0000001.
  6. Then follows the datagram contents (depending on the datagram type), all the way up to the end of the datagram minus 32 bits:
    1. Type 0000: Control chunks only - no contents.
    2. Type 0001: Message: Application data.
    3. Type 0010: Request: Application data.
    4. Type 0011: Response:
      1. Embedded ACK:
        1. Most significant bit is set to indicate that the request being responded to arrived with the Congestion Notification flag set.
        2. 31 bits, encoding an unsigned integer: the number of microseconds between the request datagram arriving and the response datagram being sent.
      2. Application data.
    5. Type 0100: Connection Open: Application data.
    6. Type 0101: Connection Open Response:
      1. Embedded ACK (see above).
      2. Application data.
    7. Type 0110: Connection Message:
      1. 56-bit Connection ID.
      2. Application data.
    8. Type 0111: Connection Request:
      1. 56-bit Conection ID.
      2. Application data.
    9. Type 1000: Connection Response:
      1. 56-bit connection ID.
      2. Embedded ACK (see above).
      3. Application data.
    10. Type 1001: Sub-Connection Open:
      1. 56-bit connection ID.
      2. Application data.
    11. Type 1010: Large datagram begin
      1. 4-bit inner datagram type (values marked with * are not valid)
      2. 4-bit size-of-size (unsigned integer). Eg, 0000 means a 16-bit size, 0001 means a 32-bit size, 0010 means a 64-bit size. Values above that are probably unneeded on current hardware.
      3. 16 * 2 ^ size-of-size bits of size (unsigned integer), the size of the inner datagram in bytes.
      4. Some prefix of the inner datagram, at least half the estimated maximum packet size.
    12. Type 1011: Large datagram fragment
      1. 16 * 2 ^ size-of-size bits of offset (unsigned integer), the byte offset in the inner datagram that this fragment begins at.
      2. Some fragment of the inner datagram, at least half the estimated maximum packet size.
    13. Type 1100: Large datagram ack
      1. 16 * 2 ^ size-of-size bits of offset (unsigned integer) of the last "Large datagram fragment" received (or all 0s if the last received fragment of this large datagram was the prefix in the "Large datagram begin")
      2. 1 bit: if set, the list of missing spans in the body of the datagram is complete. If not set, more will be sent later.
      3. 31 bits, encoding an unsigned integer: the number of microseconds between the latest fragment datagram arriving and this acknowledgement being sent.
      4. 16 bits (unsigned integer) for the number of fragments received since the last update.
      5. 16 bits (unsigned integer) for how many of those arrived with the Congestion Notification flag set.
      6. (Repeated until end of datagram contents):
        1. 16 * 2 ^ size-of-size bits of offset (unsigned integer) of a section of the large datagram that has not yet been received.
        2. 16 * 2 ^ size-of-size bits of length (unsigned integer) of that section.
        3. (The offsets must increase from start to end of the datagram, and no sections may be contiguous or overlapping).
    14. Type 1101: Connection close. Contents is a 56-bit connection ID.
    15. Type 1110: Large datagram cancel. No contents.
  7. CRC32 of all of the previous bytes.

What the heck?

That's a lot to take in in one sitting, so I'm now going to go through what the IRIDIUM implementation needs to send in various situations.

Sending a small message to a host

Firstly, pick a random 56-bit ID. What you use for randomness is up to you, but it should be hard to predict in order to make spoofing tricky.

Wrap the message up in a type 0001 datagram, with your chosen delivery and drop priorities in the header, and send it off. If the drop priority was non-zero, then wait a while for a type 0000000 (acknowledge) control chunk to come back acknowledging it (the ID in the control chunk will match the ID we sent). If one doesn't come in a reasonable timeframe, send the datagram again, with the same ID.

If you get back a type 0000001 (rejection) control chunk, if the datagram was truncated, try using the large message process (and using the truncation point as an estimate of the path MTU). For any other rejection type, something probably got corrupted, so try again. Give up if you keep retrying for too long.

If the underlying transport returns a path MTU error, try using the large message process with the advised MTU as the path MTU estimate. If the underlying transport returned some other error, retry as before.

Receiving a small message

If you get a datagram with type 0001, you've received a message!

If the datagram is valid, and the ID isn't in your list of recently seen message IDs (otherwise, just ignore it), you can pass it to your application. If it has a zero drop priority, the send is expecting an acknowledgement, so craft a type 0000000 control chunk acknowledging it. You can send it in any datagram you were going to send to that address:port anyway, or create a type 0000 datagram just for it if there's no other traffic headed that way.

If the datagram is invalid, craft a type 0000001 (rejection) control chunk explaining why. Again, you can embed it in an existing datagram of create a type 0000 datagram just for it.

Sending a small request to a host

This starts off the same as sending a message: you pick an ID, and send your request in a datagram of type 0010 with the chosen delivery priority (drop priority isn't supported for requests).

As with the message case, keep retrying until you receive an acknowledgement. However, you may also directly receive a type 0011 datagram containing a response, which has an implicit acknowledgement inside it.

When you receive a type 0011 response datagram with the same ID as your request, send an acknowledgement in a type 0000000 control chunk.

Receiving a small request

If you get a datagram with type 0010, you've received a request!

If the datagram is invalid, craft a type 0000001 (rejection) control chunk explaining why and send it back.

If the datagram is valid, and the ID isn't in your list of recently seen request IDs (otherwise, just ignore it), you can pass it to your application and await a response.

If you don't get a response from the application within a reasonable timeframe, craft a type 0000000 control chunk to acknowledge the request and send it back.

When you get a response from the application, if it looks like it'll fit into a single datagram and be no larger than twice the size of the request datagram (to avoid amplification attacks), craft a type 0011 datagram with the same ID as the request and send the response back. As with sending a small message, you'll need to wait for acknowledgement, and retransmit the response if you don't get it or if you get a rejection back that looks like it was corruption (don't bother retransmitting if you get a type 00001010 error code, indicating the request ID wasn't recognised).

If your response looks too big for a datagram, or you got a rejection saying it was truncated or the underlying transport rejects it saying it was too big, you need to follow the "sending a large response" flow.

Sending a large message, request, or response

If you want to send a message, request, or response that you know won't fit in a single datagram, or you tried it and got got rejected for being too big or arriving truncated, you need to send it as a large datagram.

Form an estimate of the largest datagram you can send, using whatever information you have to hand. Take a guess.

Send a type 1010 datagram containing the details of the large datagram and as many of its initial bytes as you think you can fit. If this is a response to a single-datagram request outside of a connection, do not send an initial datagram more than twice the size of the request datagram (to mitigate amplification attacks).

If the type 1010 datagram gets acknowledged in a type 0000000 control chunk, start sending type 1011 datagrams with further sections of as many bytes as you think you can fit, in order; if it gets rejected by a type 0000001 control chunk, don't.

When the recipient starts sending back type 1100 "large datagram ack" datagrams, use that to update your knowledge of what fragments of the large datagram have been received. Any fragment that has been sent and not acknowledged within a reasonable timeframe should be retransmitted. If you never get any acknowledgements after a while, give up.

If you receive evidence that the datagrams are too large for the link, reduce the size of datagram you are sending. If you receive a rejection with error code 00001101, abort sending. If you want to give up sending, send a type 1110 large datagram cancel datagram with the same ID.

When the recipient has sent a type 1100 "large datagram ack" that is marked as complete and indicates that no fragments are missing, you can stop.

Receiving a large message, request, response, connection open, or connection open response.

If you receive a type 1010 datagram, you're getting a large datagram! You can reject it with error code 00001011 if it could easily have fitted into a normal datagram. Anyway, look at the ID and the type code and decide if you want to accept it or reject it (eg, if it's a response to an ID you don't have an oustanding request for, it's bogus). You might even examine the prefix of the body, which is included in the request, to make that decision.

If you want to reject it, send a suitable rejection control chunk. If you want to accept it, send an acknowledgement control chunk and pass the total length and prefix to the application as an incoming large datagram. Keep track of what sections of the large datagram you're missing, which will initially be everything except the prefix.

Any type 1010 or 1011 datagram that is too small can be rejected with a control chunk rejection code of 00001011. Most underlying transports have a minimum MTU, and datagrams smaller than that didn't need to be split into fragments, so anybody doing so is probably just trying to waste your resources. The exception, of course, being the final fragment of the datagram!

If you receive a type 1011 datagram with an ID that doesn't match a current large datagram receive in progress, reject it with a control chunk rejection code of 00001101.

All parts of a large datagram should come from the same source address:port and to the same destination port. If any don't, reject them with error code 00001101.

As type 1011 datagrams with the same ID flood in, pass them on to the application along with their fragment offsets, and subtract them from the list of missing sections - unless you already had that fragment, in which case ignore it. There may be an overlap, in which case, only send the new data (for instance, the sender sent two 16KiB fragments, only the second of which arrived; but the send mistakenly thought that it was sending too-large datagrams and tried again, sending 16KiB fragments - the first of which is entirely novel and can be sent to the application, the second of which overlaps the second 16KiB received originally, so the application should be sent only the first, missing, 6KiB; and the third one falls entirely inside the initially-received second 16KiB fragment so can be ignored).

At reasonable intervals, send a type 1100 datagram containing the offset of the most recently received fragment, how long since you received it, the count of fragments received since the last type 1100 datagram, how many of them had congestion warnings, and the current list of missing spans. If the whole list won't fit in a datagram, just cut it short and don't set the "this list is complete" bit; as we send the list in order, we will be omitting spans towards the end of the large datagram, and as the sender sends fragments in order, it is less likely to care about them yet anyway.

If you receive nothing for a while, give up. If you receive a type 1110 large datagram cancel datagram with the same ID, give up. If you want to give up for your own reasons, send a control chunk rejection with the ID of the large datagram and error code 00001101 indicating that you no longer recognise the ID.

When you have all of the large datagram, send a type 1100 datagram confirming that you are missing no parts (and make sure to mark it as complete information), to tell the sender they can finish.

Opening a top-level connection

Every connection should have a dedicated source port (and consistent source address), so ask the underlying transport to allocate you a consistent outgoing port and use it for all traffic pertaining to that top-level connection and all its subconnections. Any datagram pertaining to a particular connection that arrives on the wrong port should be rejected with an error code of 00001001.

To request opening a connection, pick an ID for the connection and send a type 0100 (connection open) datagram containing the initial application data accompanying the request. If it's too large for one datagram, use the large datagram process above.

Much as with a request, you'll either get back a type 1000 datagram with the response, or a control chunk of type 0000000 acknowledging it then a later type 1000 response - or maybe a control chunk of type 0000001 rejecting it. If you hear nothing within a reasonable timeframe, retry. The type 1000 datagram may be a large one, in which case you'll get it embedded inside type 1010 and 1011 datagrams, as usual. Report this back to the application, and keep a track of that connection ID.

Closing a connection

Send a type 1101 datagram with the connection ID inside it and its own unique ID. It must have the "ordered delivery" flag set if you want to ensure that any ordered datagrams sent are actually delivered, as it could arrive before them; or omit it if you just want the connection closed quickly. As usual, if you don't get an ACK, retry for a reasonable timeframe.

Answering a top-level connection request

If you receive a type 0100 datagram (possibly embedded in a large datagram), somebody wants to open a connection to you. Keep track of the source address:port, we'll need it later!

Pass the embedded request data to the application and await its response. If you don't get one soon, send an acknowledgement control chunk to stop the sender retrying. When you get a response, reply with a type 1000 datagram containing the response (again, embedding it in a large datagram if it's too big).

If the application doesn't really want the connection, it will express that somehow in its response, and immediately close the connection afterwards.

Sending unordered messages or requests responses, or connection closes within a connection

These follow exactly the same processes as above, except that we use different datagram types: type 0110 for a message, type 0111 for a request, and type 1000 for a response. We also use type 0000010 control chunks to acknowledge datagrams and type 0000011 to reject; all of these are the same as their normal types except they include an additional 56-bit connection ID.

Receiving unordered messages, requests, responses, or connection closes within a connection

These follow the same processes as above, except we use the different datagram types, and find the connection ID inside the extended datagram formats in order to know which application-level connection to report the datagram to. We also check that the incoming source address and port matches what we expect for the connection ID.

Opening sub-connections within a connection

Again, we follow the same process as opening a top-level connection, except we use a datagram of type 1001, including the parent connection ID. We use the normal connection accept/reject datagram/control chunk types for the rest of the protocol; the link between the connection and its parent is established only during the open process.

Sub-connections use the same source port as the top-level connection.

Answering a sub-connection request

This is just like opening a top-level connection request, but with a type 1001 datagram initiating the process, and indicating a parent connection ID which we use to tell the application what connection this is a sub-connection of.

Sending ordered messages, requests, or sub-connection opens within a connection

Ordered messages must have a drop priority of zero, and the delivery priority of all ordered datagrams must be consistent for the lifetime of a connection - whatever delivery priority we first use, all subsequent ones must be the same. You will receive a rejection with an error code of 00000011 for changing the delivery priority during a connection, or 00001110 for violating the static invariants.

To send ordered datagrams, follow the usual process for that type of datagram, but set the "ordered delivery" flag - and ensure that the ID of the datagram is one larger than the ID of the previous ordered datagram within that connection (with wraparound from 2^56-1 to 0). To keep that ID "available", the process of randomly selecting IDs for datagrams within a connection should avoid ones within some safety margin above the current last-ordered-datagram-ID. Of course, it's fine to re-use datagram IDs "after a while" (see below), and 56 bits is a large space for randomly-chosen values to not collide in, so this should be easy to ensure.

Note that the sequentially increasing ID constraint is independent for each direction of the connection.

Notes

Check source addresses!

As I have mentioned at various points above, but not exhaustively, we can be very picky that the source address and port of a datagram matches what we expect:

  1. A response should come from the address:port we sent the request to.
  2. Any control chunks pertaining to a datagram ID should come from the address:port we sent that datagram to originally.
  3. All fragments, cancel datagrams, and control frames pertaining to a large datagram should come from the same address:port.
  4. All datagrams pertaining to a top-level connection and all its subconnections should come from the same address:port.

This is a measure to make it harder to spoof datagrams to interfere with others' communications - but if a 56-bit random ID is sufficient for that alone, we could relax all those restrictions. This would have the following benefits:

  1. Less storing of expected source ports in protocol state, saving memory.
  2. Less checking of source ports against expected values, saving time.
  3. Most importantly, if a mobile device changes IP address due to switching networks, its in-progress communications will succeed. Responses/acks/rejections/etc send to its old address will be lost into the void, but that will be handled like any other lost packet - the mobile device will retry sending things from its new IP, and the recipient will receive them and subsequent replies to those new datagrams will be routed to its new home.

There is one problem, though: the state for a connection, or an in-progress large datagram send in either direction, MUST include the address:port of the other end, so that new datagrams can be sent there. That is set when the connection or large-datagram starts, but at what point do we update it if we start receiving datagrams for that connection from a different address:port?

If an attacker guesses the connection ID, they could send a single datagram using that connection ID to "hijack" the connection, by having their source address:port set as the new far end of the connection so they receive all subsequent traffic; or an attacker could send a request that triggers a large datagram response and then spoof a packet with the same request ID from a victim host to bombard them with large-datagram fragment traffic.

The cryptographic layer I propose adding on top would fix that (see below) for connections, but more careful thought is required before enabling this. Perhaps when a change of source address:port is detected, communication on that connection/large datagram send should be paused and a probe datagram of some kind sent, containing a random value which must be returned (so a spoofing attacker can't fake the acknowledgement), before traffic resumes?

Note that this needs to happen for both ends of a large datagram transmission - sender or recipient could migrate. And, of course, once opened, connections are symmetrical - so with time, both ends of a connection could migrate to different addresses and ports, multiple times!

Picking IDs

Every datagram has an ID. They are not necessarily unique - a request and a response will have the same ID to tie them together, and the start, fragments, acknowledgement, and cancellation of a large datagram will all have the same ID. In fact, if a request is large and the response is large, then the same ID will be used for the process of sending the request in fragments and then for the process of sending the response back in fragments; these two processes can't overlap in time, so there is no ambiguity.

However, other than those allowed forms of re-use of the same ID because they pertain to related datagrams, IDs must be unique with the context of (for non-connection traffic) the source address:port and destination port, and (for connection traffic) the source address:port, destination port, and connection ID. Receivers will keep track of the IDs which further datagrams are expected for in any given context (eg, the response to a request that was sent out, or an in-progress large datagram transfer) so that incoming datagrams can be routed to the appropriate process, and they will also keep track of "recently used" IDs when those IDs are no longer expected again, so that any duplicated datagrams can be quietly rejected.

The "recently used" IDs should be kept for a reasonable time period - long enough for any lurking datagrams trapped in the network to have drained out. It is suggested that the process of picking a new ID be random (with the exception of ordered datagrams in a connection, although the initial ordered datagram ID in each direction in a connection should be random). This makes it harder for third parties in the network to forge rejection or close datagrams and mess with our communications. A sender could, for some extra robustness, check that a randomly allocated ID is not present in the list of current or recently used IDs in that context, as well as ensuring that a new ID is not too close to the sequential ID counter of the enclosing connection (if any) to risk a possible collision.

Analysis notes

Control chunks are never retransmitted; no record is made of which control chunks were carried with a datagram, so if a datagram is retransmitted, then it will just carry whatever control chunks were waiting at the time of the retransmit. Type 0000 datagrams are never acknowledged and hence never get retransmitted, nor are type 0001 or type 0110 (message and connection message, respectively) datagrams that have a nonzero drop priority, but all others are (although large datagram fragments have their own custom retransmit mechanism rather than per-datagram acknowledgements).

TODO

To make that a proper protocol specification, I need to:

  1. Write each process as a proper state machine.
  2. Define the exact response to every different kind of error code.
  3. Clarify all the validity checks, and how to respond to them with an error, and what to do next.

A few improvements I've already thought of are:

  1. Having duplicate datagram types for connection or connectionless versions of things, differing only in having a connection ID added, is perhaps over-complicated. Consider using one of those unused flag byte bits to say "Is it a connection datagram?", and if so, add a connection ID - only on datagram types that make sense inside a connection.
  2. When a request has been sent and the application at the receiving end is taking a while to respond, an acknowledgement control chunk is sent back and then the caller waits forever for a reply. If the server disappears, the caller has no way of knowing. Also, the caller has no way of requesting that the server abort if the caller realises it doesn't need the response. The server end should be made to send additional acknowledgement control chunks at regular intervals to "keep-alive" the request, so the client can retry or give up if they stop appearing. Also, define a chunk type to cancel a request, sent with the same ID as the request.
  3. The same applies for an open connection - if you've not sent anything on that connection within some reasonable timeframe, send a "keep-alive". If the far end hasn't sent you one within a reasonable timeframe, consider the connection closed by a failure (send the application an error rather than a normal close).
  4. Make the acknowledgement control chunk include the datagram type that it's acknowledging as well as the ID, so datagram senders can differentiate acknowledgement of a request datagram and acknowledgement of the cancellation of that request, the latter so it can stop retransmitting the cancel.
  5. Writing all those type codes in binary and then referring to them in the text by their binary numbers sucks. Use hexadecimal, and refer to the type names in the text rather than the numbers.
  6. I should clarify the existing of a "transaction", that being any operation which requires some synchronisation between two peers. Sending a single-datagram message and awaiting an ACK is a transaction; sending a request and awaiting a response and ACKing the response is a transaction; sending a large message in fragments is a transaction; any connection is a transaction (wich sub-transactions for any of the above happening over that connection, or sub-connections), etc. Making these clearer makes it easier to analyse the protocol. The sections on sending/receiving each side of the transaction should probably be merged together to clarify this.
  7. I have boldly spoken of underlying transport errors being detected in various cases in the spec, but of course we don't know exactly what datagram ID an error is returned by - just the source and destination address and port. This probably isn't a problem as the errors received generally apply to the host as a whole being unreachable or requiring smaller datagrams or the destination port not listening at all, which are broad-scope things that apply to all transactions in progress to that address or address:port pair, but the spec needs to clarify this.
  8. To avoid connectionless responses to be used as an amplification attack (send a small datagram to an innocent server requesting a large response, but with a spoofed source address pointing at your victim, to flood them with unwanted traffic), I've mandated the the response datagram must be no bigger than twice the size of the request or it goes into the large-datagram mode, which sends an initial datagram (again, no larger than twice the size of the original) then awaits an ACK before sending the rest. Might it be worth including optional padding (which must be zeroed) in a connectionless response, so that senders can increase the allowance for single-datagram responses, in effect performing a "bandwidth proof-of-work" by using some of their outgoing bandwidth? I'll need to do the maths on the cost of the padding vs. the cost of extra round-trips and headers to send a small response as a multipart datagram.
  9. I need to go through every datagram sent in the protocol and check that:
    1. The effect of it going missing isn't catastrophic, eg something is retransmitted until the implementation gives up trying.
    2. The effect of it being duplicated isn't catastrophic, eg the deduplication logic can make sense of it.
  10. Should I include any useful functionality at the lower layers, that the normal IP layer fails to provide for us in a helpful manner? The section about source address validation already opens the question of embedding a connection mobility mechanism. The protocol is already designed so that an implementation connecting to a well-known public address and port should get bidirectional communication even if they are behind NAT, by carefully not caring about the source address of datagrams other than to route replies back; a mobility mechanism will make this more robust by allowing recovery from loss of NAT state in the network, treating it as a migration of the connection to a new source address:port. But should we embed STUN support?
  11. To prevent large datagram responses being used as an amplification attack, I need to include a random value in the large datagram begin which must be returned in a special acknowledgement control chunk. Otherwise, an attacker can send a spoofed request followed by a spoofed large datagram begin acknowledgement control chunk, from the victim address.

How to implement this

The implementation of the above splits neatly into layers, which should make implementations easy to reason about. Of course the layers might get all smushed together in practice for performance reasons or something, but in theory at least, they can be isolated relatively cleanly.

UDPv4 underlying transport

The underlying transport semantics have a fairly obvious mapping to UDPv4. The UDPv4 transport needs to request ECN on outgoing packets by default, but it also needs to maintain some per-peer state (using the peer cache defined in the next layer) to detect non-ECN-capable transports and stop requesting ECN where it seems to be causing problems. See section 4 of RFC6679.

For name resolution, we should allow the caller to specify a service name and default port as well as the name. Then if we are passed a DNS hostname, we can attempt to look up the service via SRV records, or if that fails, look up an A record and use the default port. Passing in a raw IP address and skipping resolution (using the default port, or a specified port for IP:PORT strings), or a domain name and a port and skipping SRV lookup, should all be supported. This should work out of the box with mDNS .local hostname resolution if the underlying resolver supports it, of course, and DNS-SD is explicitly out of scope.

Flow control, retransmission, datagram encoding

The lowest level of the stack is the flow control engine, whose job is to send datagrams to and from the underlying transport, handling retransmission and flow control.

The packet formats and protocol above don't define how flow control happens, but they do provide the tools to do it; the actual flow control algorithm is up to the implementation, and may change without changing the protocol specification.

The implementation maintains a cache of known peers, based on the address only (not port). For each peer, we store its address, an estimated background packet loss rate, an estimated available bandwidth to that peer in bytes/tick (the tick is some arbitrary unit of time, perhaps 100ms), a leaky bucket counter in bytes, an estimated maximum datagram size (known as MTU), an estimated round-trip time (RTT), an observed datagram send rate in packets/second, an observed datagram retransmission rate in datagrams/second, an observed datagram congestion-notification rate in datagrams/second, an observed outgoing bandwidth usage in bytes/second, and a last-used timestamp. The peer cache also contains some data for use by the underlying transport for that peer (identified by the address type, if multiple underlying transports are in use), the format of which is opaque to this layer and is merely provided to the underlying transport on every operation.

I've hand-wavingly referred to "reasonable" timeframes for giving up on receiving a datagram or control chunk, in the protocol specification, and the estimated RTT to a peer is used to to tune this, plus some allowance for processing time at the destination - which also places an upper bound on how long an implementation can buffer control chunks for before sending them, and how long to wait for a response from the application before sending an explicit acknowledgement control chunk. I need to pick a reasonable processing timeframe and write it in the specification, as both sender and recipient need to agree on it (but allowing it to be negotiated in the protocol itself might allow for DoS attacks, by setting it unreasonably low to put a burden on the other end to respond quickly, or unreasonably high to disallow the peer from giving up quickly and throwing away state about things you never respond to).

When a request comes in to send to a previously unknown address, we initialise the peer entry with some sensible defaults: assume no background packet loss, and imagine we might have some default amount (say, a 100 kilobytes/second) of bandwidth; the MTU should be estimated (for the Internet, it's probably 1500) and the RTT set to some default (10ms?), and the observed datagram rates and leaky bucket counter should start at 0.

Peers can be dropped from the cache to make space, perhaps based on their last-used timestamp being excessively long ago.

This layer has a queue of outgoing datagrams from the layer above, and a queue of outgoing control chunks. If there are outgoing control chunks that have been waiting for longer than the maximum buffering time and no outgoing datagrams destined to the address:port pair that control chunk is for, then a datagram of type 0000 is automatically fabricated in the outgoing datagram queue. The outgoing queue is ordered by delivery priority, then datagrams that are retransmissions get to go before new datagrams, and datagrams with the same delivery priority and retransmission-ness are round-robin interleaved across IDs and connection IDs to ensure fair delivery.

Outgoing datagrams pulled from the top of the queue are inspected to see if they have any space left (by subtracting their size from the estimated MTU); as many control chunks destined to the same address:port as will fit are slipped into the datagram.

To send a datagram, the leaky bucket counter is inspected. If it's more than the estimated available send bandwidth, we wait until the send bandwidth has reduced. Once the leaky bucket counter is less than the estimated available send bandwidth, we send the datagram to the underlying transport and add its size to the leaky bucket counter. Update the datagrams sent and bytes sent rates, perhaps by using exponentially decaying moving averages.

Every tick, the leaky bucket counter is reduced by the available send bandwidth per tick, but never taking it below zero.

The implementation may choose to consider some datagrams as "urgent" - certainly not ones with application data in - and send them as soon as they're in the queue without waiting for the leaky bucket counter to reduce. Their size is still added to the leaky bucket counter, though. I've not yet thought about the situations where this should happen.

Datagram types that should be acknowledged - everything apart from type 0000 (control chunks only) and types 0001 or 0110 (messages) that have a non-zero drop priority - are kept in case they need to be retransmitted. Received datagrams are inspected to find acknowledgements and rejections (acknowledgements may be explicit control chunks, or implicit in responses), and the corresponding datagrams removed from the retransmission pool. They can also expire from the retransmission pool if no response ever arrives. Expiries are reported up the stack, just like rejections. Retransmissions are sent with the same delivery priority as the original datagram, but ahead of any other queued datagrams at the same delivery priority.

Acknowledgements are carefully examined. Every acknowledgement contains a congestion notification flag and a delay time in microseconds. The presence of the congestion notification flag should be tracked in the observed datagram congestion rate. The delay time should be subtracted from the measured time between when the original datagram was sent and the acknowledgement received, and considered a sample of the round-trip time; use an exponentially decaying moving average to update the current estimate. Rejections can be assumed to have a zero delay, and their raw round-trip time also used to update the RTT estimate.

When datagrams are retransmitted, update the average datagram retransmission rate. The ratio between this and the send rate gives us a recent datagram loss rate. We need to store a bit more state in the peer structure (I've not quite figured out what yet) to help it adjust the estimated send bandwidth up and down to try and find the point at which the datagram loss rate just starts to rise due to congestion, based on the observed datagram loss rate and the observed datagram congestion-notification rate, while being aware of underlying datagram loss due to link problems and not mistaking it for congestion, so we can obtain good utilisation of noisy links.

This layer is also responsible for managing the queuing of incoming datagrams being sent up to upper layers; if the incoming datagram queue is overflowing, datagrams should be dropped (those with the highest drop priority first) and rejections with error code 00000111 sent. The queue is, of course, ordered on delivery priority. It needs to silently discard duplicates, based on the datagram ID, type, and other type-specific fields which can be combined to form a "datagram de-duplication key" (DDDK). It needs to maintain a small cache of recently received DDDKs and reject any datagrams that arrive with the same DDDK.

When datagrams are converted to an actual byte stream to send to the underlying transport, which is the responsiblity of this layer, then any delay time fields in acknowledgements are filled in; this means that the datagram representation in the queue must include the receipt timestamp of the datagram it is in response to, and received datagrams must be timestamps as soon as they arrive, before being queued for upper layers to process.

Large datagrams / path MTU discovery

The previous layer gives us reliable flow-controlled datagram transport, but only for datagrams small enough to fit through the network.

This layer notices when datagrams sent by the layer above are too large for the estimated MTU of the destination peer, and converts them into fragments. It needs to watch the outgoing queue, not flooding it with fragments faster than they can be sent, but also keeping enough in there to keep the layer beneath supplied; perhaps fragment generation should be suspended whenever the queue size is above some threshold.

It also notices incoming underlying-transport errors pertaining to over-sized datagrams, and rejections due to datagrams arriving truncated, and reduces the MTU estimate to suit.

Perhaps it can also occasionally (using some extra state in the peer structure in the form of a last-MTU-update timestamp), if the MTU hasn't changed in some time, try increasing it a bit to see if it works. Sometimes the path MTU will increase, and it would be nice to not be "stuck" at a small MTU forever, harming efficiency.

And, finally, it's responsible for handling reassembly of incoming large datagrams, as described above. It should still issue the fragments as datagrams to the layer above (rather than trying to buffer a massive datagram in memory before passing anything up), but it will do all the duplicate/overlapping fragment removal and error handling as described in the protocol.

Communications between this layer and the one above need not be in the form of a queue, but direct calling - in effect, a queue with length zero.

Request/Response tracking

The next layer up can handle both sides of the request/response protocol; all layers below this have just dealt with datagrams flowing around in isolation, but here we tie requests and responses together. This is pretty trivial, so it might be worth merging it in with the next layer.

Connections

This layer handles keeping track of connections, implementing the connection open/close protocol, and attaching connection IDs to messages, requests, responses, and sub-connection opens within that connection.

ID allocation

To wrap all of the above into an API that can be used by actual applications, the only detail remaining is the ID allocator for new datagrams and connections!

Overall notes

The representation of datagrams (or application data blocks) used in the implementation should be some kind of list-of-segments rather than a block of contiguous memory. This means we can efficiently:

  1. Prepend headers to blocks of data without copying the block of data
  2. Generate fragment datagrams as references to subsequences of the original datagram, again without copying

The peer cache should, ideally, be shared across all users of the physical network interface so that they can share information about peers. On a UNIX system, it should be in some kind of shared memory (perhaps a small Redis instance?) so different processes can share it, if practical!

Future Extensions

There's a reason I included that four-bit protocol version field...

Multicast

Connectionless messages can map naturally to a multicast environment. If the underlying transport supports multicast addresses, then it would be possible to send a message to a multicast address (including large message datagrams having to be fragmented). As there is no scope for retransmission, then the information contained in the "large datagram begin" datagram would need to be replicated into every fragment, requiring a new datagram type code. Not all fragments might arrive, and applications would have to tolerate this.

Flow control would be a matter of being limited by the outgoing network interface, and either sending messages at some underlying rate (eg, sampling a sensor every second, or the data rate of a video stream), and maybe having some kind of degradation due to layering differential-quality-enhancement streams at different drop priorities.

Applications would need to be able to register to receive from a multicast address, a bit like opening a connection that they then receive messages on.

Forward error correction

If we can detect or predict high loss when sending multiple datagrams to a peer (or when sending to a multicast address), we might consider sending additional datagrams that contain error-correction information that can be used to reconstruct lost datagrams. This would probably involve collecting outgoing datagrams to the same destination into "groups" of some size, identified by a group ID, and adding some extra datagrams to the group, using a suitable forward error correction code.

If we notice a lot of CRC mismatch rejections coming back from hosts, we might also decide to start including some forward error correction data within each datagram, so that errors can be corrected!

Encrypted connections

I've kind of assumed that all the application data being thrown around will be signed and encrypted with some suitably secure technology that's beyond the responsibilities of a transport protocol, but there's one place where it would be beneficial to let encryption intrude at this level: the request/response data that gets to piggyback on the top-level connection open and connection open response could also include the first steps of a cryptographic mutual authentication and session key generation protocol, which can be carried on using connection messages and requests/responses if the connection open succeeds. Once a session key is established, then the connection's entire datagrams could be encrypted thereafter. If the connection can be identified by the source and destination address and port, then the appropriate decryption for that connection can be applied to received datagrams. To support this, the stack as presented above would require the ability for the encryption layer above it to specify a datagram-level encryption/decryption engine for a top-level connection, applied just between the flow control layer and the underlying transport, and applied to all datagrams once it has been enabled.

But this has an issue: To make the proposed connection mobility mechanism (not restricting connections to a fixed source address:port) work, we would need to identify connections by destination address:port alone - requiring a dedicated destination port per-connection, while the specification as given only requires connections to have a dedicated source port, allowing them to share the destination port.

To make that work, we could make the recipient of a top-level connection open request pick a dedicated port for the connection on their end, just as the sender of the open request picked a dedicated source port. The recipient then communicates that destination port back to the requester of the connection by ending the connection open response datagram from it, and when that datagram is received, its source address and port are used as the address to send datagrams within that connection to thereafter.

Note that this provides encryption/authentication purely for connections. Single request/response transactions are responsible for providing encrypted and/or authenticated requests and responses which, as they happen outside of a connection, will have no pre-existing security context unless one exists in some higher-level protocol (and I hope to make the connection mechanism in IRIDIUM good enough to largely make that unnecessary), so will need to be akin to PGP; fancy perfect-forward-secrecy ratchet protocols will only be possible via connection-based security protocols.

Serial transport

I have interest in communication via Unix domain sockets, stdin/stdout of subprocesses, over serial lines, and over point-to-point radio links. To do that, I'd like an underlying transport that handles point-to-point serial streams with possible byte loss. This would be a relatively simple matter of encoding datagrams using a suitable framing format that can resynchronise if bytes are lost, and either having null port identifiers or sticking a byte or two of port number in front of the datagrams if we want to multiplex things over a single link. There would be no inherent MTU of such a link, but we should enforce a configurable MTU to improve the responsiveness of multiplexed connections.

The name of the Unix domain socket, or the subprocess command, or the identifier of the serial port, would suffice as the address part for those transports.

UDPv4 VPN support

For cases where there's a pre-defined relationship between a group of hosts, it would be nice to define an encryption layer below IRIDIUM. You can do this at the operating system level with VPNs, which has the advantage of covering ALL traffic between those hosts, but is also a faff to set up and tends to be platform-specific. For IRIDIUM purposes only, the ability to set up an underlying transport on a group of hosts that joins a Tinc VPN entirely in userland could be useful?

UDPv6 transport

It would be a logical extension to support UDPv6 as well as UDPv4.

Other underlying transports

In theory, IRIDIUM could live alongside UDP and TCP as a native IP protocol. This would save us the 16 bits of the UDP checksum (which is superceded by our CRC-32), but that's not a huge deal. The downside would be that we'd need to implement IRIDIUM inside operating system network stacks and get that deployed, and likewise build support into packet shaping routers and firewalls, which would be a tiresome and unrewarding process to save 16 bits per datagram... so, I don't think so! Running on top of UDP is fine for now!

An implementation as a raw Ethernet frame type might be a fun exercise, and potentially useful in some embedded applications, but it's not on my radar personally.

An implementation on top of LoRa or HaLow might be more interesting, though!

Conclusion

This draft is open for discussion, before I go to the work of formalising all the edge cases by converting the protocol into exhaustive state machines, and implementing it. I particularly want to find:

  1. Practical DoS attacks where third parties, knowing there is communication between two address:port pairs, can forge datagrams to disrupt that communication.
  2. Practical DoS attacks where third parties can amplify their ability to overwhelm a target with data by sending datagrams to an IRIDIUM service with forged source addresses, causing response or error datagrams that are much larger than the original datagrams to be sent there.
  3. Practical DoS attacks where, in a situation where multiple processes in different security contexts are all on the same host, a compromised process can use the shared peer flow-control state to disrupt communications for other processes (any more than they could normally, by just hogging lots of bandwidth).
  4. Situations in the protocols where one side or the other can be left waiting forever, without timing out.
  5. Any way it could be cooler, more efficient in latency or bandwidth consumption, or more adaptable to a wider range of applications.

Your comments are eagerly awaited 🙂

Debugging poor home wifi at the Snell-Pym residence (by )

So, we have a fairly complicated network at home - the Snell-Pym Family Mainframe has a dedicated DSL link with a static IP for hosting various Internet-facing things, as well as providing internal services to the home LAN. The home LAN has the usual mix of desktop computers, the laser printer, and two wireless APs for mobile devices to connect to - one in the house and one in the workshop, because one can't get a good signal to both locations. And there's a separate infrastructure LAN for systems control and monitoring.

Now, we've often had on-and-off poor connectivity on the wifi in the house; this used to happen sporadically, usually for around a day, then just get better. The wifi signal strength would remain good, but packet loss was high (10-20%) so stuff just didn't work very well. TCP is poor at high packet loss; it's OK once a connection is open, but packet loss during the initial SYN/SYNACK/ACK handshake causes it to take a long time to retry on most implementations.

I went looking for interfering networks (we live in a pretty wifi-dense urban area) using an app called "Wifi Analyzer" on my Android phone, and it showed a strange network, always on the same channel as the house wifi (as in, if I changed the channel, it would move too). The network never had a name, and the signal strength was about the same as the house wifi; sometimes a bit stronger, sometimes a bit weaker. Read more »

Insomnia (by )

There's something about the combination of having spent many weeks in a row without more than the odd half-hour here and there to myself (time when I get to do whatever I like, rather than merely choosing which of the list of things I need to get done urgently I will do next, or just having no choice at all), and knowing I need to get up even earlier the next morning than usual (to dive straight into a long day of scheduled activities), that makes it very, very, hard for me to sleep.

So, although I got to bed in good time for somebody who has to wake up at six o'clock, I have given up laying there staring at the ceiling, and come down to eat some more food (I get the munchies past midnight), read my book without disturbing Sarah with my bedside light, and potter on my laptop. I need to be up in five hours, so hopefully emptying my brain of whirling thoughts will enable me to sleep.

There's lots of things I want to do. Even though it's something I need to get done by a deadline, I'm actually enthusiastic about continuing the project I was working on today; making an enclosure for our chickens. This is necessary for us to be able to go away from the house for more than one night, which is something we want to do over Christmas; thus the deadline.

Three of the edges of the enclosure will be built onto existing walls or woodwork, but one of them needs to cut across some ground, so I've dug a trench across said bit of ground, laid an old concrete lintel and some concrete blocks in the trench after levelling the base with ballast, and then mixed and rammed concrete around them. When I next get to work on it, I'll mix up a large batch of concrete and use it to level the surface neatly (and then ram any left-overs into remaining gaps) to just below the level of the soil, then lay a row of engineering bricks (frog down) on a mortar bed on top of that in order to make a foundation that I can screw a wooden batten to. With that done, and some battens screwed into the tops of existing walls that don't already have woodwork on, I'll be able to build the frame of the enclosure (including a door), then attach fox-proof mesh to it, and our chickens will have a new home they can run around in safely.

Thinking about how I'm going to lay the next batch of concrete in a nice level run, working around the fact that I only have a short spirit level by placing a long piece of wood in there and levelling it with wedges and then using it as a reference to level the concrete to, has been one of the things running around in my head this evening.

Another has been the next steps from last Friday, when I had a fascinating meeting with a bunch of interesting people in the information security world. You see, I've always been interested in the foundation technologies upon which we build software, such as storage management, distributed computing, parallel computing, programming languages, operating systems, standard libraries, fault tolerance, and security. I was lucky enough to find a way into the world of database development a few years ago, which (with a move to a company that produces software to run SQL queries across a cluster) has broadened to cover storage management, distribution, parallelism, AND programming languages. So imagine my delight when said company starts to develop the security features in the product, and I can get involved in that; and even more when (through old contacts) I'm invited to the inaugural meeting of a prestigious group of peopled interested in security. That landed me an invite to the second meeting (chaired by an actual Lord, and held in the House of Lords!), the highlight of which was of course getting to talk to the participants after the presentations. I found out about the Global Identity Foundation, who are working pn standardising the kind of pseudonymous identity framework I have previous pined for; I'm going to see if I can find a way to get more involved in that. But I need to do a lot of reading-up on the organisations and people involved in this stuff, and figuring out how I can contribute to it with my time and money restrictions.

I'd really like to have some quiet time to work on my secret fiction project, too. And I want to investigate Ugarit bugs. Some bugs in the Chicken Scheme system have been found and fixed lately, so I need to re-test all these bugs to see if any of the more mysterious ones were artefacts of that. I'm in a bit of a vicious circle with that; the longer it is since I've been tinkering with the Ugarit internals, the longer it'll take me to get back into it, and the more nervous I feel about doing so. I think I might need to pick off some lighter bit of work with good rewards (adding a new feature, say) and handle that first, to get back into the swing of things. Either way, I'll need a good solid day to dig into it all again; trying to assemble that from sporadic hours just won't cut it.

I'm still mulling over issues in the design of ARGON. Right now I'm reading a book on handling updates to logical databases - adding new facts to them, and handling the conflicts when the new facts contradict older ones, in order to produce a new state of the database where the new fact is now true, but no contradictions remain. I need to work this out to settle on a final semantics for CARBON, which will be required to implement distributed storage of knowledge within TUNGSTEN. I need a semantics that can converge towards a consensus on the final state of the system, despite interruptions in internal network connectivity within the cluster causing updates to arrive in different orders in different places; doing that efficiently is, well, easier said than done.

I really want to finish rebuilding my furnace, which I hoped to get done this Summer, but I'm still assembling the structural supports for it. I've made a mould to cast shaped refractory bricks for the lining of the furnace, but I've yet to mix up the heatproof insulating material the bricks need to be made out of and start casting the bricks, as I still need to work out how I'll form the tuyere.

I want to get Ethernet cabled to my workshop, because currently I don't have a proper place for working on my laptop; I have to do it on the sofa in the lounge to be within range of the wifi, which isn't very ergonomic, doesn't give me access to my external screens, and is prone to interruption by children. I find it very motivating to be in "my space", too; the computer desk in the workshop is all set up the way I like it. And just for fun, I'd like to rig the workshop with computer-controlled sensors and gizmos (that kind of thing is a childhood dream of mine...).

This past year, I've tried booking two weekend days a month for my projects, in our shared calendar. This worked well at the start of the year, with projects such as the workshop ladder and eaves proceeding well, but it started to falter around the Summer when we got really busy with festivals and the like. I started having to fit half-days in around other things, which meant spending too much time getting started and clearing up compared to actually getting things done, so my morale faltered; and with so much other stuff on, I've been increasingly inclined to spend my free time just relaxing rather than getting anything done. On a couple of occasions I've tried taking a week off work to pursue my projects, but I then feel guilty about it and start allocating days to spending more time with the children or tidying the house, and before I know it, five days off becomes one day of actual project work. I need to stop feeling guilty about taking time to do the things I enjoy, because if I don't, I'll be too tired and miserable to do a good job of the things I should be doing! And rather than booking my monthly project days around other stuff that's going on, next year I'm going to mark out my two days each month in advance, and then move them elsewhere in the month if Sarah needs me to do something on that particular day, to decrease the chance of ending up having to scrape together half-days around the month (or to skip project days entirely, as I ended up doing last month). I feel awful about saying I'm going to spend days doing what I feel like doing rather than the things the rest of my family need me to drive them to, but if I don't, I think I'm going to fall apart!

Now... off and on I've spent forty minutes writing this blog post. So with my whirling thoughts dumped out, I'm going to go back to bed and see if I can sleep this time around. Wish me luck!

Is information security good? (by )

One of the interesting things to have come from Edward Snowden's leaks of classified documents is that the American National Security Agency has been working to introduce flaws into the design and implementation of security technologies, in order to make it easier for them to break said security for their own ends.

There's been a lot of outrage about that. The argument for it is that the ready availability of strong security technology makes it easier for bad folks to conceal their crimes (and, worse, conceal the fact that they are planning such crimes, so they cannot be stopped in advance), so the NSA is right in acting to make sure people don't have strong security technology. However, even if we can trust the NSA (and that is far from certain) such vulnerabilities can be found by people we certainly can't trust: "cyber-criminals" intent on stealing our credit card details in order to rob us of our money, commercial competitors looking for strategic advantage, and so on.

There are also deeper issues that have been raised; this means that the NSA is covertly working to sabotage the products of US companies. Should they be allowed to do that? Can those companies now sue them for damages?

But I think that, at the heart of the debate over this, is an even deeper issue.

We have the NSA - the part of the US government officially responsible for information security - acting to subvert the information security available to US individuals and companies, on the grounds that it is harmful to the public if they have strong security. While on the other hand, we have individuals and companies striving for better security; working to make more secure products, choosing products that claim to provide security benefits, and so on.

This shows, to me, that there's a big unresolved question that US society as a whole - government and non-government together - needs to ask themselves: Is information security good? The government's official position seems to be that information security is harmful, as it makes it harder to provide a more general notion of security that is threatened by criminals, foreign governments, and terrorists; while everyone else's position seems to be that information security is good because they don't want information criminals and foreign governments stealing their secrets (terrorists don't seem to have cottoned to this trick yet) - and, maybe, because they don't want the government knowing ("stealing" is a contentious term here, as the government gets to define what "stealing" is) their secrets, too.

So before they can really debate whether the NSA's actions are justified or not, I think the US needs to step back and look at the bigger question: Should information security be a right, or not? If not, then they should just use legislation to stop companies and people from wasting resources trying to achieve it while other resources are being spent subverting it so they only receive an illusion thereof. That's just plain inefficient. And if information security is deemed good, then the NSA should be prevented from subverting it, and refocus its efforts on ways of doing its job without being able to break encryption; traffic analysis, meta-data analysis, exploiting specific installations of security systems where a threat is suspected, and so on are all time-honoured mechanisms that work even against well-educated adversaries that use encryption systems that the NSA hasn't been able to subvert.

Privacy (by )

I have a looser attitude towards privacy than most people, but I have began to reconsider that lately.

Generally, I believed (and still do) that anything I do in public is pretty much exempt from privacy. I have no privacy objection to pervasive CCTV, because if I do anything in a public place, somebody could be watching me anyway. The fact that my enemies can now just consult massive archives of CCTV to find me rather than having to get somebody to follow me around isn't, in my view, a huge deal. Indeed, I quite like the idea of sousveillance, having my own recording of what happens around me. It might be inappropriate to be doing that in circumstances that the people around me consider "private", so I'd turn it off for their comfort when it seemed right to do so, but I would still assume that anything I do in the presence of other people is basically recorded to some extent - after all, it's in their memory, at least!

Likewise with monitoring my network traffic at my ISP; I have never had any illusion of privacy there. I encrypt traffic that matters, and accept that the existence and destination/origin of encrypted traffic might be used by my enemies for traffic analysis.

So, I didn't really have any objections to mass surveillance; I had far more objection to the facts that encryption is far from ubiquitous and that information security is not taught in schools. My feeling was that if I can't stop an enemy that doesn't abide by the law (eg, organised criminals) from performing traffic analysis on me, then I can't assume it's private; I can stop them reading my stuff or impersonating me by using public key cryptography, so as long as the law doesn't hinder that, I'm content.

As such, I always wished that Web browsers would just include some kind of unique user ID in the headers, ideally backing it up with a public-key signature of the entire HTTP request. Then we could dispense with session cookies, logins, and even things like OpenID; we'd just authenticate to our browser by supplying the keypair in some browser-dependent way, and then head out onto the secure-single-sign-on Web. There's no loss in privacy compared to the current status quo that people are happy to identify themselves to web sites with email addresses, but it'd be a whole lot simpler for users and for developers. And so that, basically, is the security model I developed for ARGON.

However, I am starting to change my mind.

I've always felt that the "hole" in my approach to privacy was that it depended on my own knowledge of security and my enlightened use of encryption; I wanted sufficient education to bring everyone to that level. Encryption tools are generally a bit clunky, but if more people wanted to use them, that would create demand for better tools (or, more pertinently, better integration into the tools they already use). I felt that if we could just get people to encrypt and sign their communications, and encrypt their storage, and use Tor for things where the cost is worth the protection against traffic analysis, everything would be fine.

However, what has made me start to change my mind is the move towards storing one's data on third-party servers. By which I mean, living your life through Facebook, or letting Google store your email and your documents. People are moving away from having a computer full of their stuff, and communicating semi-directly with their peer's computers, towards letting third parties hold all their stuff. Often third parties they don't pay money to and are in no contract with, so they have little or no leverage over.

It's easy to say that educating people in computer security would make them realise that's a bad idea, but I use many of these services despite not trusting them one bit; I do it because network effects force me to. I could run my own StatusNet server on my own hardware, but instead I use Twitter in order to make it easy for people to communicate with me. I use Facebook because it's the easiest way to keep up with my many peers that do, and sometimes because I am forced to; an organisation I am a member of uses a Facebook group for important announcements. Many people do not publish an email address, but instead require me to contact them through various third-party services.

In effect, we are being forced to hand our information to third parties, and to trust them with it. Variations on these services that store your information on hardware you control exist; variations on those services where you actually pay a service provider to store it on their hardware (in exchange for them looking after maintenance, amortizing up-front costs, and so on for you, and where they are more incentivised to keep your stuff secure so you trust them than to try and find ways to make money out of it) also exist.

But they are not popular, as the big "free" providers have the vast majority of the users, and the value of these services is in all your peers already being on them. Now that worries me.

I'd really like to see more push-back against this. If enough people used decentralised software like Diaspora or ran their own mail systems, then the network effects would benefit those, rather than centralised commercial outfits. Clearly, some large incentive needs to be found to push people over, and an unpleasant transition period where everyone needs to be on both. Eventually, organisations like Facebook, Twitter and Google would find themselves forced to interoperate with the decentralised protocol or lose their place in the market, and then would find themselves having to compete on points such as "privacy" when the same ease-of-use and functionality can be had elsewhere for little cost.

But, we need technical measures as well. Build sensible public-key infrastructure into the core of applications (including Web browsers). Ditch cookies, and replace them with explicit authentication: provide a system of public-key-signing HTTP requests as I suggest, but turn it off by default, and force web servers to request it with a status code, as is already done for HTTP authentication (not that that is used for web applications, alas). Let browsers seamlessly support multiple identities, and when a web site requests identification, let the user choose which identity to use; and then colour the border of the Web page according to the identity in use so they don't forget. And while providing identity management through that (controlled) mechanism, try as hard as possible to remove all other means of identification - don't send headers leaking lots of information about the user-agent and its capabilities and settings, and disallow Javascript from querying that sort of thing. Bundle Tor with browsers, so it can be turned on and off with the click of a button, as part of the "private browsing mode" found in many browsers.

I still don't think there's much point in trying to fix this with making information gathering and retention illegal (the recent PRISM scandals suggest that legitimate authorities will find ways to work around limitations on their information gathering, and organised criminals simply won't give a damn anyway); we need better technology that makes us anonymous by default and pseudonymous when we want to be. But there may be some value in legislation helping to break the stranglehold on the social software market held by big centralised organisations!

I'm updating the ARGON security model to work like this (not that that makes a difference to the Real World, mind...)

WordPress Themes

Creative Commons Attribution-NonCommercial-ShareAlike 2.0 UK: England & Wales
Creative Commons Attribution-NonCommercial-ShareAlike 2.0 UK: England & Wales