
Designing software

One thing that has often annoyed me in my career as a software designer is coming into conflict with people who want me to do the worst possible job that is "just good enough".

This often comes out in statements like:

  • "There isn't time to design it properly, just hack something so we can start demoing it"
  • "We'll have to rewrite it anyway, requirements always change"
  • "Supporting more features than we need right now is a waste, add them when we need them"
  • "Can't we just do something quick and simple?"

Reading between the lines, they seem to think that "more designing" will mean more complicated software with more features, that will take more time to build.

I think the problem comes from how product management thinks of software - they want a feature, they request the engineers add it (ideally specifying the feature well enough that they actually describe what they want), they get that feature. And there's often some discussion about reducing the scope of the feature by removing some criteria to get it implemented sooner. It seems very much like "more features with more acceptance criteria equals more work".

I'd like to dispel that myth.

I assert that better-designed software will take less time to write, and will be better through being more flexible rather than through having more features, and through being easier to extend in future when new requirements come up.

There's a paragraph in the Scheme language standard known as the "Prime Clingerism", and it reads thus:

Programming languages should be designed not by piling feature on top of feature, but by removing the weaknesses and restrictions that make additional features appear necessary. Scheme demonstrates that a very small number of rules for forming expressions, with no restrictions on how they are composed, suffice to form a practical and efficient programming language that is flexible enough to support most of the major programming paradigms in use today.

It has been my experience that this approach applies to all forms of software design, not just programming languages. As a software architect, I see designing a feature as more like Michelangelo making the statue of David (famously, he said that the process of making a statue is just to get a big enough block of stone, and remove all the bits of it that aren't the statue).

Rather than thinking in terms of each acceptance criterion in the specification for the feature as a thing that will have to be done separately (so more ACs means more work), I like to work the other way: what's the simplest piece of software that can do all of these things? Most problems have a "core" they can be boiled down to, plus some "details". I try to design software that has to meet specific requirements by making the "core" with the fewest assumptions and then adding "details" to tailor it to the exact requirements - which tends to mean I can add more details later to fulfil more requirements.

An example

For instance, many years ago, I worked at a CRM SaaS company. A big part of the app was sending out HTML emails to subsets of a user's customers to offer them some special offer, serving up the images in the email from our HTTP server (and tracking who saw the email by including the email message ID and the recipient ID in the image URL), and tracking what links they clicked by replacing URLs in the email body with URLs hosted by our HTTP server that would log the email message ID and recipient ID then redirect the user.

I was given a feature request: people were occasionally forwarding the emails to their friends, and when they did so, their viewing of the email images and their clicking of links would be logged against the original recipient, as their ID was in all the URLs. Our server would see the same person viewing the email lots of times and clicking the links lots of times. Learning how recipients responded to these messages was a big priority for our customers (we had extensive reporting systems to analyse this data!), so they were tetchy about this, and desired the ability to put "forward to a friend" forms in the email HTML. The recipient would put their friend's email address into this form and hit a button, and our system would send a fresh copy of the email to their friend, with their own recipient ID - so our user would know the forward had happened, and could see the friend's responses separately.

Now, the "simplest" solution would be to add an extra feature to our HTTP server, accepting a form submission to a URL containing the recipient and message IDs, extracting an email address from the form submission in the usual manner, creating a new recipient in the DB with the email address (or finding an existing one), recording a forward from the original recipient ID to the "friend"'s ID, and sending a copy of the email to the "friend" - then responding to the form submission with either a hardcoded thankyou page or responding with custom thankyou-page HTML uploaded by our user, or a redirect to a URL set by the user.

A quick chat with the boss revealed that being able to host custom thankyou pages was a very desirable feature, and as that would involve embedded CSS and images (and maybe javascript), I clearly needed some kind of generic user-uploaded-content HTTP serving infrastructure. So I threw that together, letting customers create websites in the application and upload files, and an HTTP server that would serve resources from them with a URL containing the website ID and the path of the file "within" that website - and with the ability to tag URLs with email message and recipient IDs for tracking purposes. We already had an engine to "personalise" HTML message content for a specific recipient, which also handled modifying all the URLs to embed email message and recipient IDs, so I used that again for hosting textual content such as HTML, meaning that if a link came in from an email message with message and recipient IDs, they would be fed into all further URLs returned by the system. To prevent spoofing of the IDs in the URLs, I reused the method we used on the image and redirect server URLs: the URL itself was "signed" with a short cryptographic signature when the software generated it.
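To make that concrete, here's a minimal Python sketch of URL signing along those lines - my own illustration, not the original code; the key handling, signature length, and function names are all assumptions:

    import base64
    import hashlib
    import hmac

    SECRET_KEY = b"server-side-secret"  # hypothetical; the post doesn't describe key management

    def sign_path(path: str) -> str:
        # Prefix a short signature so tampered tracking IDs can be detected.
        mac = hmac.new(SECRET_KEY, path.encode(), hashlib.sha256).digest()
        sig = base64.urlsafe_b64encode(mac[:6]).decode().rstrip("=")
        return "/" + sig + path

    def verify_path(signed_path: str) -> str | None:
        # Split off the signature, re-sign the remainder, and compare.
        sig, _, rest = signed_path.lstrip("/").partition("/")
        expected = sign_path("/" + rest).lstrip("/").partition("/")[0]
        return "/" + rest if hmac.compare_digest(sig, expected) else None

Generated URLs then carry their own proof of authenticity, so the server can trust the recipient and message IDs in them without looking anything up.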

But rather than hardcoding the forward-to-a-friend feature into that, I did something that took perhaps ten minutes more programming: I allowed an arbitrary number of named "commands" to be included in a URL. The only named command I implemented was "forward" that would do the forward-to-a-friend logic (finding all form submission fields called "forward-to" and using them as recipient email addresses, so you could make forms that accepted lots of addresses; making it loop over all fields with that name rather than just looking for one was the first assumption I removed to simplify the software while making it more flexible) then proceed with handling the HTTP request in the normal way.
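The core of that "forward" handler might look something like this sketch (again my reconstruction, with the db and mailer interfaces invented for illustration). The point is the loop, which costs nothing extra but removes the one-friend-per-form assumption:

    def handle_forward(form_fields, context, db, mailer):
        # Loop over ALL fields named "forward-to", not just the first, so a
        # form can offer one address box or ten without any server changes.
        for address in form_fields.get("forward-to", []):
            friend_id = db.find_or_create_recipient(address)
            db.record_forward(context["recipient_id"], friend_id, context["message_id"])
            mailer.send_copy(context["message_id"], friend_id)  # the friend gets their own tracking IDs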

But when a requirement came in to support unsubscribe forms, I just added an "unsubscribe" command, which took all of five minutes. Adding an "update" command handler to let people update their details in the master DB took a little longer, just because I had to list all the supported field names in the documentation and include a more lengthy example. And then, because I'd implemented the recipient and email message IDs in the URLs as optional things from the start, adding a "subscribe" command handler that created a new recipient in the DB from scratch, along with a tweak to the UI to let the user get an anonymous URL for any file in their website, took an hour or so - and meant that users could now create subscription forms and generate a URL they could publish. I think I also added a "send" command handler to send a copy of a specific email message to the recipient ID in the tracking context; as the "subscribe" command put the new recipient's ID in the tracking context, a URL with a "subscribe" command followed by a "send" command would handle a subscription and then send a welcome email to the new subscriber...

I added a few other command types that did various other things in the system, and all was good.

Now, I didn't "set out" to design an "overcomplicated super-flexible" HTTP server system that could do all these things when I was asked to add the forward-to-a-friend system. I just spotted that the following assumptions would be really easy to NOT bake into the core of the implementation:

  • Forward to a friend is always to a single friend; it's easy to create multiple form fields with the same name in an HTML form, and easy to make my server loop over any it finds.
  • Every visit to the new generic HTTP server will have a recipient ID and an email message ID; although forward-to-a-friend requests will, as they are linked from an email, it's easy to imagine that we might want to host subscription forms in future. So making the tracking IDs in the URL optional (or allowing for other kinds of tracking IDs in future) by making the URL structure a series of tagged fields (it looked something like http://domain/sss/Rxxx/Eyyy/Wzzz/foo.html for a request for file foo.html in website ID zzz for recipient xxx from email yyy, with sss being the anti-tamper signature) was worth the few minutes' work.
  • Requests to the web server will be forward-to-friend requests or just static resources in support of a response page; it was easy (given we already had an ordered list of tagged components in the URL) to add an ordered list of arbitrary commands (with URL path components like forward, subscribe, etc; any component before the Wzzz/filename part was considered a command or tag, and tags started with an uppercase letter) - see the sketch after this list.
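Putting those three decisions together, the request handling can be sketched like this (my own reconstruction in Python, with invented names; serve_file stands in for the static-resource logic):

    COMMANDS = {}  # command name -> handler; adding a command is one new entry here

    def parse_path(path: str):
        # e.g. /sss/Rxxx/Eyyy/forward/Wzzz/foo.html
        parts = path.strip("/").split("/")
        signature, rest = parts[0], parts[1:]
        tags, commands, file_path = {}, [], None
        for i, part in enumerate(rest):
            if part[:1].isupper():              # uppercase first letter = tracking tag
                tags[part[0]] = part[1:]        # "Rxxx" -> tags["R"] = "xxx"
                if part[0] == "W":              # the website tag ends the tag/command section
                    file_path = "/".join(rest[i + 1:])
                    break
            else:
                commands.append(part)           # lowercase component = command name
        return signature, tags, commands, file_path

    def handle_request(path, form_fields):
        signature, tags, commands, file_path = parse_path(path)
        context = dict(tags)                    # tracking context, which commands may update
        for name in commands:                   # run the commands in URL order
            COMMANDS[name](form_fields, context)
        return serve_file(context.get("W"), file_path, context)

Running the commands in URL order against a shared context is what made stacking work: a "subscribe" handler could drop the new recipient's ID into the context, and a later "send" handler would pick it up.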

The "extra work" required to do things that way rather than the "simple" way was tiny. But it meant that the core of the HTTP server was simple to read and didn't need to change as we extended it with more and more commands (by adding them to a lookup table); it made a powerful system that was easy to extend, easy to understand, and capable of things we didn't have to "add as features" (some or our users did quite creative things by stacking commands!). As the HTTP server core and the commands were small separate modules with a clearly-defined interface, they could be understood individually and were well isolated from each others' implementation details, the benefits of which are well documented.

So please help me fight the assumption that putting thought into designing software means it'll be complicated, more effort to implement, and that "it'll need rewriting anyway so there's no point"! Let me do my job well 🙂

Sharing is Caring, but Resharing is Poison

I've noticed a trend that has led me to develop a theory.

It's widely said that social networks start off fun and then decline; I've usually heard this attributed to some combination of (a) all your colleagues, family, and former schoolmates joining or (b) it "becoming mainstream" and a rabble of ignorant masses pouring in.

This implies an inevitability - such environments are fun when they're occupied by an exclusive bunch of early adopters, but if they're fun they'll become more popular, and before long, they'll be full of Ordinary People who Ruin It. Good social networks are, therefore, destined either to be ruined by going mainstream, or to die out because they never take off.

I disagree. The elitism inherent in that viewpoint is a warning sign that it's a convenient and reassuring fiction, for a start; and I have an alternative theory. As you may have guessed from this post's title, I think that the provision of a facility to reshare (retweet, repost) others' content with a single simple action is a major contributing factor in making a social network descend into a cesspit of fake news and hate.

Back in the early days of Twitter, most of the tweets were things that people had typed out themselves. Many of them were links to other things, but doing that required manually copying the URL and pasting it into a tweet, and most people added a word or two of commentary when they did so.

But Twitter these days is dominated by retweets. In a quick survey of the current tops of my various Twitter timelines, I saw 7 retweets and 5 original tweets. I see less of what my follows are doing, and more of what my follows are liking about what others are doing.

As these centralised social networks are advertising companies, this is a desirable state of affairs for them, for at least two reasons:

  1. Single-click resharing means that content can spread virally across the platform, getting seen by millions of people in a very short timeframe. This is attractive to advertisers, so the network can make money selling tools to help them encourage this, to track the spread of content, and to generally spread the idea that their network is a place where things spread quickly and influence culture.
  2. A big part of their business model is to better profile their users, so they can sell targeted advertising. It's harder for a computer to analyse your prose to learn about you (bearing in mind you might use complicated linguistic tricks such as irony) than to just see if you click a button in response to something or not. The algorithm might not be entirely clear on the meaning of the content you've just reshared, but it now knows that you have something in common with the four million other people who also reshared it; and cross-referencing that with other information it holds about you and them is a powerful predictive tool.

But that same ability for things to rapidly spread is the driving force behind:

  1. The rapid spread of fake news; tools designed to help advertisers are easily adopted by people wanting to control our minds for reasons even worse than mere financial gain.
  2. Hate storms, when something gets widely shared among a community of people who hate the behaviour implied by the original content. They all respond angrily to it within the social network; and often, because the amplified feeling of communal hate and the wide reach bring it to the attention of unhinged and morally dubious people, this leads to crimes being committed against the target as "revenge".
  3. A decreased sense of community, due to seeing more and more content from outside your group. Interacting with the social networks becomes more like watching TV than sitting chatting with your friends.

I think the elitist complaint that social networks go wrong when they "go mainstream" and "the normals come and ruin it" is really just a misguided attempt to put the lingering feeling embodied in that last point into words.

Looking back at the original decentralised social networks such as email, Usenet and IRC, they all lacked a single-click "reshare" facility - but some of the criticisms of email and Usenet (excess crossposting, forwarded chain emails) come down to it still being a bit too easy to share things across community boundaries. IRC escaped this.

I think there's no reason a social network can't scale to cover the planet without becoming a cesspit - but I suspect that making forwarding content on too easy is a great way to drag it down the pan.

Don’t fund your online business with advertising (It’ll only make everyone hate you)

I first got online in 1994 or so, and the Internet was a very different place to how it is now. It was like a busy marketplace - thousands of FTP servers, things you could telnet to, email addresses, Usenet groups, IRC channels, gophers, MUDs and, increasingly, Web sites. Directories like DMOZ and Yahoo!, as well as FAQs for relevant newsgroups and mailing lists, were how I found things. It was cheap to set up servers and run services on them, so lots of people did. Companies and universities got leased lines to provide Internet access to their folks, and ran servers to provide their presence to the Internet; while individuals got dialup Internet access, and basic email/Web hosting capability from their ISPs; or for the nerdier amongst us, wrangled or paid for "colocation", getting somebody with a leased line to let you put your computer on a shelf somewhere, hooked up to their power and network.

It was pretty chaotic, but it worked. Internet usage exploded in that period, but the rate of technological advancement wasn't that fast (relatively speaking). All the technologies we used - TCP/IP itself, DNS, Email, Usenet, IRC, the Web - were built around some documents describing how the system worked (usually in the form of RFCs). Most of these technologies were implemented in two parts: the client that somebody ran on their computer to interact with it, and the server that somebody ran on a big permanently-Internet-connected computer with a fixed IP address and a nice hostname. For instance, with the Web, the client is your Web browser, and the servers are the computers that actually hold all the web pages; your web browser talks over the Internet to the server responsible for the page you want, gets it, and then shows it to you. Because the client and the server talk to each other using the protocol defined in the documents, there would often be several clients and several servers available, written by different people and aimed at various different kinds of users - and they would largely work together.

Kitten Technologies and OhBot Supporting Ada Lovelace Day

Robo Rob, part of the programming board game designed for Cuddly Science

Really wonderful news: the Cuddly Science event, Ada Lovelace's Coding Time, at the Museum of Gloucester on Saturday 14th Oct 2017 will feature not only a wonderful OhBot robot but also Simple Graphics by Kitten Technologies 😀

Kitten Technologies

This event is aimed at kids and is ticketed at £5 per child. There will be puppet storytelling, colouring-in sheets, Robo Rob's Jobs the board game, as well as the OhBot programmable robot head and the simple graphics programme. I am very excited about how this event is shaping up 🙂

OhBot programmable robot head at the Cheltenham Fun Palace 2017

Ada Lovelace Day 2017 – Dr Rebecca Wilson

Today is Ada Lovelace Day - an annual celebration of women in Science, Technology, Engineering and Maths (STEM), named after the Victorian mathematician and visionary Ada Lovelace.

Each year we try to do a little write up on women who have inspired us in the sciences. There are many entries for previous years - in fact later today I am going to make a special category for them all 🙂

This year I have chosen my friend Dr Rebecca Wilson.

Broken lift

Rebecca started off in Geology, studying at Imperial College's Royal School of Mines, where she not only excelled in her own studies but helped me with some of the more advanced geochemistry elements, lending books and explaining things in multiple ways.

She was part of the posse that went with me to the Natural History Museum London to get work experience, and helped me get into the meteoritics department. She went on to a PhD at the Planetary and Space Science Institute, looking for organic material in micrometeorites.

She went on to postdoctoral research and science outreach at Leicester University and the associated Space Centre. During this time she developed some pretty awesome outreach kits. Those that are available to the public/teachers are downloadable here.

Rebecca also won a science journalism internship which took her to Ireland; she has in fact been all over the globe studying, researching and presenting.

She has sidestepped into the medical data visualisation realm, where she is pushing the frontiers of science ever forwards, as well as highlighting the issues of accessibility on her various travels.

Rebecca has rubbed shoulders with the top people in both space and planetary science as well as within the deep data computing spheres, not to mention the odd science communicator such as Brian Cox! Becca is highly versatile and extremely dedicated, and she is also a hell of a lot of fun to be around 🙂

She was even chosen by Jean for a school project on role models and heroes!

Creative Commons Attribution-NonCommercial-ShareAlike 2.0 UK: England & Wales