Snell-Pym

Opening up Open Source (by alaric)

One of the awesome things about free/open source software (FOSS) is that, as you have access to the source code, you have exactly as much power as the original authors of the software to modify and extend it.

When you are upset about something in closed-source software, or in software hosted on the owner's servers on the Internet as a web app or API, all you can do is beg and plead with the owners, and threaten to take your future business elsewhere; they control the source code, so they have ultimate power. With FOSS, in theory, the original authors have no special powers.

However, it often doesn't quite work out like that in practice.

Most FOSS software comes onto people's computers as a precompiled binary; they download it and run it. This is convenient and efficient. But if they then want to use their theoretical right to get in there and modify it, they need to do the following things:

Learn the programming language(s) it's written in, and the libraries/frameworks/APIs it builds on top of. (This is particularly tricky if they have no previous programming experience.)
Find and download the sources (which shouldn't be too hard, but can still be tricky).
Set up a build environment that can compile the thing. This may involve installing compilers, and development versions of libraries in order to get the header files, and so on.
Actually make it compile. A lot of the time, a source package as shipped by the authors won't compile perfectly outright: you need to apply patches written by the people who maintain the package for your platform, because each platform has its own conventions as to where things are placed in the filesystem, and its own ways of interacting with less-standardised bits of infrastructure such as service management frameworks, management of network interfaces, low-level hardware access, and so on.
Learn the workings of the codebase. For a large project, this can be VERY daunting, even to a seasoned programmer.
Actually design, implement, and debug the change. If the codebase is poorly architected, this can be made unnecessarily difficult, and require refactoring of existing code elsewhere within the codebase.
Install the software once it's building. As it's been custom-built, it may not interact nicely with your platform's package manager, so you may end up running it out of /usr/local/bin or ~/bin or similar, and have trouble getting managed packages that depend on the one you're tinkering with to link to it correctly, getting it to interact with system configuration tools, and so on.

I think this is harder than it needs to be. It puts a lower bound on the effort required to make even a trivial change that, in many cases, means it's not worth making it; we're as beholden to the whims of the developers of the package as we would be to a closed-source software company, and the supposed benefits of open source are denied to us.

We can break down the barriers into a number of categories, and look at what can be done to solve each.

Learning the programming environment

Whether you're a seasoned programmer or not, any given piece of software you didn't write will involve a set of languages, tools, libraries, and other things, that you may not be familiar with, and these will need to be learnt.

What might help is better standards for automatically-generated API documentation, so you can find the documentation of API functions as they are used in the software you're learning how to edit by just pressing a button in your editor.

But if it became easier, commonplace, and expected for people to dig around in other people's source code, for curiosity or to make their own changes, developers of infrastructure components would feel more compelled to document their interfaces in ways that casual programmers can quickly pick them up, and to make their interfaces simpler and easier to learn, because the expected audience would be less dominated by seasoned programmers.

And, conversely, if more people were casually getting involved in simple programming tasks in order to improve the software they use (if it became easier to do so), then the general programming ability of the population would also rise, giving more people the grounding in basic conventions and concepts required to understand programming tools...

Migrating from a normal installation of the software to a hackable one

This is perhaps the biggest hurdle, and yet, the most amenable to being overcome with better technology. It covers the whole spectrum from finding and downloading the source code, setting up a build environment, getting it building on your platform, and getting it installed as a first-class citizen in the eyes of your package manager.

I can think of two technical fixes to this problem, and the best bit is, they're both things that already exist out there rather than my usual kinds of crazy new reinvent-the-wheel thinking!

Firstly, package managers like Nix make it easy to establish build environments on your own hardware, as the build environment of any package can be requested, and automatically set up for you. Also, they are built around installing software from sources in the first place, and offer downloading pre-built binaries as an optimisation. It's quite easy to adapt a Nix expression that builds a software package from downloaded sources into one that builds from a source tarball you've made yourself, and to install the result into an isolated "profile", so that it's kept apart from other software you have running and might not want to risk being broken by your experimental changes yet, and so that you can roll the change back if it doesn't work out.

I suspect that many traditional package managers are written with users in mind, and not developers, which sounds laudable; but in practice it forces the distinction between user and developer, not allowing the former to easily migrate into the latter. Nix feels written for developers, of course acknowledging that developers are also users and still want to be able to install off-the-shelf prebuilt binaries easily. The inbuilt package manager for Chicken Scheme is likewise developer-friendly, letting you directly build from arbitrary checked-out source trees into a properly installed package; my development process for most Chicken software is to run chicken-install in my in-progress sources as the first port of call to compile and run it, rather than the usual idiom in most languages of compiling and running from the source checkout then "installing" the binaries as an optional, later, step. And yet a "mere" end-user of my software can type chicken-install ugarit, and Chicken will download the latest public Ugarit release from the Internet and install it for them. If they want to join me in hacking Ugarit, they can check out the latest sources from my web site and get stuck in pretty quickly.

Secondly, the move away from compiling software ahead-of-time into distributable binaries, towards on-demand compilation of source code at run time (with caching of compiled forms, of course), means that the normal installation of some software is the source code; there is no need for an external "build environment" to convert your changes into a runnable package, and changes to the source code can be used immediately without going through any kind of build/install phase. Because this makes tinkering with the software so much easier, tinkering can become a routine part of using the software, rather than something special done only by special people. The typical Emacs user will have overridden various internal functions of Emacs from within their personal configuration; although many configurations can be done without doing so, customisation through function overriding is so accessible that people routinely customise their Emacs installations in ways that the original developers didn't think (or didn't have time/energy) to add as a configurable option. I wish all open-source projects were written in such an open manner, but it will require a lot of migration away from the batch-compilation model of C, C++, and Java.
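As a rough illustration of why overriding at run time is so cheap, here is a small sketch in Python rather than Emacs Lisp. It is only an analogy: the "editor" module, its functions, and load_user_config are all hypothetical names invented for this example, and nothing here describes how Emacs itself is implemented. The point is that the application looks its own function up by name each time it is called, so a user's configuration can swap in a replacement without any rebuild.

```python
import types

# Hypothetical stand-in for an application's shipped module.
editor = types.ModuleType("editor")


def _default_format_title(name):
    """Shipped behaviour: show the file name unchanged."""
    return name


def _window_title(name):
    # Looks up editor.format_title at call time, so a later
    # override is picked up without rebuilding anything.
    return "Editor: " + editor.format_title(name)


editor.format_title = _default_format_title
editor.window_title = _window_title


def load_user_config():
    """Hypothetical personal configuration: wrap the shipped function."""
    original = editor.format_title

    def shouting_title(name):
        # Reuse the original behaviour, then adjust the result.
        return original(name).upper()

    editor.format_title = shouting_title
```

Before load_user_config() runs, editor.window_title("notes.txt") uses the shipped formatter; afterwards the same call goes through the user's replacement, and the user never touched the application's own source files.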

Poorly-architected existing code

This is a thorny issue; even if you can easily get into the code of your application and make the changes you want, and you understand all the tools it's built with, it can still be hard to make the changes you want because of one of a number of kinds of inherent "fragility" in the way the software is constructed.

Usually, this boils down to some variant on the idea of some information being repeated all over the code, rather than kept in one place. If your code relies on communicating between its components by using a special file, for instance, and every place where this file is read or written contains its own code to read and write this file directly, then changes such as storing the file in a different place, or adding some extra information to it, or replacing it with access to a database or something, will be difficult. You'll need to find all the places where the file is used, and individually re-write them to reflect your changes. This is laborious, and you might miss some, leading to bugs when those bits of code are run but don't reflect the changes.

However, if the mechanism of accessing this shared state (reading and writing the file) was isolated into one place in the software, with an interface that is used wherever required and reflects only the essentials of the access to the shared state, then that mechanism can easily be changed to another, as long as it still preserves those essentials, relatively painlessly and safely.
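A minimal sketch of that kind of isolation, in Python, might look like the following. All the names (SettingsStore, JsonFileStore, MemoryStore, record_last_user) are invented for illustration, and the JSON file format is an assumption rather than anything the text prescribes. The application code only ever sees "get" and "put"; where the state actually lives is known in exactly one place.

```python
import json
import os
from typing import Dict, Protocol


class SettingsStore(Protocol):
    """The essentials callers may rely on: look a value up, save a value."""

    def get(self, key: str, default: str = "") -> str: ...

    def put(self, key: str, value: str) -> None: ...


class JsonFileStore:
    """Keeps the shared state in one JSON file; only this class knows that."""

    def __init__(self, path: str) -> None:
        self._path = path

    def _load(self) -> Dict[str, str]:
        # A missing file simply means nothing has been stored yet.
        if not os.path.exists(self._path):
            return {}
        with open(self._path, encoding="utf-8") as handle:
            return json.load(handle)

    def get(self, key: str, default: str = "") -> str:
        return self._load().get(key, default)

    def put(self, key: str, value: str) -> None:
        data = self._load()
        data[key] = value
        with open(self._path, "w", encoding="utf-8") as handle:
            json.dump(data, handle)


class MemoryStore:
    """A drop-in replacement, e.g. for tests or a later move off files."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str, default: str = "") -> str:
        return self._data.get(key, default)

    def put(self, key: str, value: str) -> None:
        self._data[key] = value


def record_last_user(store: SettingsStore, name: str) -> str:
    """Application code: written against the interface, not the file."""
    previous = store.get("last_user", "nobody")
    store.put("last_user", name)
    return previous
```

Moving the state to a different path, a different format, or a database then means writing one more class with the same two methods; record_last_user and every other caller stay as they are.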

Software developers should, before they even write a line of code, be thinking about how their software might be changed in future, and make sure that they split it into modules with clean interfaces, to make that easier. As a side effect, this also makes their code easier to test and debug, as the interfaces serve to define and clarify the responsibilities and expectations of each module, which makes it easy to write comprehensive tests for the modules.

If this seems like hard work, then you're doing it wrong. We've all heard of code (usually in Java, for some reason) that seems to have taken the Design Patterns book as a checklist of things to do, and features pages and pages of AbstractFactoryWrappers that just indirect everything; the actual code that does the task at hand seems to be scattered thinly amongst all this framework. That's not what I mean by designing your software to be extensible. I just mean splitting it into bits with an explicit interface between each, making those interfaces reveal as little as possible about the workings behind them, and putting duplicated code into modules behind interfaces instead of writing the same logic more than once, rather than making it into one big ball of inter-related mud. If you're not saving time by writing software like this in the first place, or if it seems like a burden, then you need to re-think how you write code.

I think programming languages can do a lot to help us write cleaner code, too. I find that when I'm writing C and C++, it's often hard to cleanly pull bits of functionality out into other functions, due to the manual memory management which complicates interfaces, and the lack of lexically-scoped first-class functions. Code written in Lispy languages tends to be a lot easier to refactor as it grows, leading to cleaner interfaces (on average), and the automatic memory management tends to make the interfaces simpler as well.
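To make the point about lexically-scoped first-class functions concrete, here is a small sketch in Python (standing in for a Lispy language; the function names are hypothetical). A retry loop has been pulled out into its own helper, and the caller passes in the work to be retried as a closure. The closure carries "key" and "lines_source" along with it, so the helper's interface needs no extra context parameter, and with automatic memory management nobody has to decide who frees that context.

```python
def with_retries(action, attempts):
    """Call action() until it succeeds, at most `attempts` times."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return action()
        except OSError as error:
            # Remember the failure and try again.
            last_error = error
    raise last_error


def read_setting(lines_source, key):
    """Find `key` in 'name = value' lines, retrying if the source fails."""

    def action():
        # `key` and `lines_source` are captured from the enclosing scope.
        for line in lines_source():
            name, _, value = line.partition("=")
            if name.strip() == key:
                return value.strip()
        return None

    return with_retries(action, attempts=3)
```

In C, the equivalent helper would typically take a function pointer plus a separate pointer to a hand-built context structure, which is exactly the kind of interface clutter described above.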

Also, a culture of open extensibility in software means that extensibility of your code is high in the programmer's mind at all times. Developers of Emacs packages seem to expect bits of their software to be overridden, and write it accordingly.

Conclusion

I think that making programming more accessible has very many good consequences. It gives people more power to get more out of their computers. It gives people more reason to trust computers (and as we move to a more online society, people are forced to place their trust in computers in order to take part; but being forced to place your trust in something you don't trust is a harrowing experience), as they can peer inside the software to see how it works, and fix it if it doesn't. It also means that everyday users of computers have an easy, and natural, transition into learning programming, which is a very rewarding pastime; and more people contributing to open-source projects means we all get to have a better quality of life.

So, open-source software developers, I implore you to consider these points. Try to make your software open and welcoming to newcomers!

Computing, Society | alaric | Thu 16th Jan 2014 7:11 pm

2 Comments

By John Cowan, Thu 16th Jan 2014 @ 11:09 pm

I agree with everything you say, of course (though I do think Chicken isn't a very good example, because the only users are already developers). But there's another point you don't mention, and that's that even a completely non-programming user can pay someone to make changes to an open-source system, whereas even this option is denied to users of closed-source programs. In the familiar metaphor, I cannot make repairs to my car, but I can take it to any competent mechanic; I don't have to deal with a car whose hood is welded shut.

By alaric, Sun 26th Jan 2014 @ 11:52 am

> I agree with everything you say, of course

Sensible, sensible! 🙂

> (though I do think Chicken isn't a very good example, because the only users are already developers).

Not necessarily; statistics are hard to come by, but there's a small and growing set of useful end-user apps as Chicken eggs, so in principle there might be people who install Chicken purely to run chicken-install and then get an app.

> But there's another point you don't mention, and that's that even a completely non-programming user can pay someone to make changes to an open-source system, whereas even this option is denied to users of closed-source programs. In the familiar metaphor, I cannot make repairs to my car, but I can take it to any competent mechanic; I don't have to deal with a car whose hood is welded shut.

This is true! But software developers are expensive, so you have to be fairly well-off to fund that sort of thing.

I think there's probably scope for a few trends to develop from that.

Perhaps there are people willing to do bespoke development work on open-source packages at a reduced rate because they'd like to contribute to the open-source world (and, perhaps, in exchange for some GitHub kudos or some other measure of reputation).

Perhaps there is scope for charitable organisations to form to fund open source development; if people agree that some freely-available software to perform a given function is desirable, then collaborating to fund its development as a public good would seem sensible.

I once (all too briefly!) had a job that paid so horrendously well that I spent some of my spare income on hiring a freelance developer to work on an open-source project that I thought would help in one of my lines of work, but that gravy-train ended before it could get to a usable state, alas... I'd like to do more of that. If I were to become Properly Rich, rather than buying private jets and all that rubbish, I'd like to fund software development and open-source the results...

Sarah and Alaric Snell-Pym living in interesting times
