Who cares about user interfaces?

The Web

The World Wide Web started in the 1990s, but it wasn't really a user interface for applications at first. HTML, the language for creating Web pages, was first published in 1991, and developed slowly over time; it has somewhat suffered from being pulled between various competing interests; I won't get too distracted by the Browser Wars, but what concerns us here is a three-way tug between:

  1. The "ideological" basis of HTML - the intended philosophy, as generally espoused by the people involved in the formal standards - is that of a document description format which, originally, avoided specifying much about how documents would be displayed at all (see, e.g., HTML 2.0, which gives some advice but places very few restrictions on what a conformant user agent may do with the document); it merely specified that documents had paragraphs, headings, links, and things of that ilk. The only nod to presentation in the HTML specification itself was the presence of a few elements like <i> (for italicised text) and <b> (for bold text), alongside the more generic <em> (for emphasis), <strong> (for "strong" emphasis), and so on.
  2. The concept of HTML as a visual medium. As the Web started to spread, there was great interest in being able to control the visual display of a page from within HTML. This enabled companies to present their web pages in their own branding, and enabled everyone to make their sites "look cool". In enthusiastic hands, this led to the visual wonders explored in self-written homepages on places like GeoCities (may it rest in peace); in more artistic hands it let web sites just look nicer than the default black-on-grey of early web browsers.
  3. The use of HTML to build interactive applications. HTML (combined with some features of HTTP, the protocol used to communicate between a Web browser and the Web server hosting a page) allowed simple kinds of interaction from the early days, initially intended for things like accessing search indexes and submitting feedback or requests to the operators of a site - an online equivalent of printing out a paper form, filling it in, and posting it off (see the sketch after this list). However, the capabilities of HTML, and extensions to it such as client-side JavaScript, have been repeatedly stretched by ingenious people trying to build interactive applications within the Web; nowadays these rival conventional user interfaces in capability and sophistication.
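
To make the first and third strands concrete, here's a minimal sketch of early-Web HTML: purely structural markup, plus a form that posts its fields back to the server. The URL and field name are invented for illustration.

```html
<!-- Structural markup: what the document *is*, not how it looks. -->
<h1>Guestbook</h1>
<p>Please <em>do</em> sign below.</p>

<!-- A form: the early Web's entire interaction model. Submitting it
     sends the field to the server, which replies with a whole new
     page. The /cgi-bin/guestbook URL and field name are invented. -->
<form action="/cgi-bin/guestbook" method="post">
  <input type="text" name="message">
  <input type="submit" value="Send">
</form>
```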

There has also been a battle for control over the capabilities of HTML (which, in practice, means control over the Web browsers people use), fought for commercial reasons: the Web has become a vitally important platform for a vast amount of human activity, including commercial activity. That has provided a backdrop to the conflict about the fundamental nature of HTML, and influenced it in various ways. The makers of browser software have been keen to add new functionality to attract users, and so have significantly extended the visual-media and interactive-application capabilities of HTML; they have encouraged Web authors to use these features to make better Web sites, and then encouraged users to adopt their browser in order to use those better sites - while trying to foil each other's competing attempts to do the same. They are also under commercial pressure to make it hard for competing browsers to support the capabilities they add, which means they have no particular motivation to make the HTML/CSS/JavaScript trifecta easy to implement. The result has been a race to pour ever more resources into browser development; at the time of writing there are only really three browser engines in widespread use, and one of them (Google's Blink, the core of the Google Chrome, Chromium, and Microsoft Edge browsers) is clearly ahead, giving Google increasing control of the Web. Google wants to keep it that way by making it harder and harder to develop competing browser engines, so it has no motivation to make the Web any simpler.

But back to the topic of this blog post: User interfaces.

Building an application to provide its user interface via the Web has several attractive implications:

  1. Everybody has a Web browser, so if somebody finds a link to your application, they can click that link and be using it right away, rather than having to download and install software first.
  2. The same application can be used on Windows, Mac OS, Android, iOS, Linux, FreeBSD, NetBSD, Solaris, ... any platform for which a modern browser can be found. You don't have to make multiple versions for different platforms.
  3. The application is tightly linked to your central server, as initially loading and then using the app requires constant communication between the user's browser and the server. This is a mixed blessing: it means you need to maintain (e.g., pay for) server capacity to support all the current users of your app, and if your servers break, users are affected immediately - unlike software they download and run, which runs entirely on their own hardware. However, during the dot-com boom, venture capital money to build lots of servers and ops teams was easily available! The upsides are that it becomes easy to build inter-user functionality into your apps (enabling a whole raft of apps that were meaningless without it, from eBay to social networking), and you can make new features available to users instantly... But most importantly, you can monitor your users' use of your application in real time, building profiles on them that you can then use to sell advertising space in your app - which, once that VC money ran out, was the only real business model available for apps users didn't pay for. I have written before about how THAT turned out...

To begin with, the Web was a very limited environment to build apps in, because HTML and HTTP weren't designed for that sort of thing. But because of the exciting possibilities of Web-based app development, browser manufacturers rushed to add functionality to help. This combination - a basic foundation ill-suited to the task at hand, with various things flung together on top, in a rush, by competing parties - has left the Web a complete mess as a development platform, and quite painful to develop on; painful enough that I've personally grown to avoid building applications on it whenever possible.

Client-side web apps

Originally, Web apps could only work by generating a Web page on the server when the user followed a link, and sending that page back to the user. The page could contain links to other pages, and forms which submitted user-provided data to the server in exchange for another page. This allowed basic interactivity, and the addition of "cookies" (little bits of data stored on your computer by your browser, sent by the server along with a page and then returned to the server with subsequent requests to the same site) made it practical to have a notion of a "session", where the server recognised the same user coming back again; this made it possible to build things like online shops with a "basket" and an order/payment form. But the forms were limited in the kinds of fields they offered, and the fact that no interaction with the application server was possible until the form was submitted precluded more immediately interactive applications. JavaScript came onto the scene in 1995: a language designed to be embedded into Web pages, so that the browser would execute the script when the page was displayed. This enabled applications to ship some or all of their logic to run in the browser instead of on the server. JavaScript embedded in an HTML page could react to user actions such as clicks, mouse movements, keypresses, and interaction with form fields, enabling real-time interactivity; and with the advent of the now rather datedly-named XMLHttpRequest mechanism, JavaScript code could talk to the application's server directly, enabling interaction with the server-side parts of the application within the scope of a single page, without needing to submit a form or follow a link.
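
As a sketch of what that unlocked, a page could now update part of itself from the server without a reload; the /api/basket endpoint and the element ID here are invented for illustration:

```html
<div id="basket">Loading basket…</div>
<script>
// Ask the server for the basket contents without reloading the page.
// The /api/basket endpoint is invented for illustration.
var xhr = new XMLHttpRequest();
xhr.open("GET", "/api/basket");
xhr.onload = function () {
  if (xhr.status === 200) {
    // Put the server's reply into the page, in place.
    document.getElementById("basket").textContent = xhr.responseText;
  }
};
xhr.send();
</script>
```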

Web apps are like X Windows apps

As a user interface platform, however, the Web is very low-level. Being primarily a document description language, HTML lets Web developers specify the display of a page by positioning text and images, with various effects like filled backgrounds and borders, and then provide JavaScript code to react to mouse clicks and keypresses. This is basically the same level of user interface infrastructure as X Windows provides, rather than a widget-based system like the Windows or Mac OS GUIs. But it's worse than that: whereas in X Windows you can say "draw this text in such-and-such a font at such-and-such a position", with HTML you have to provide an HTML element (often a <div>) and then write CSS to override its default presentation (which varies between browsers) to give it a specific font, size, colour, position, and so on. And the specification of size and position in CSS is designed for typesetting onto variable-sized screens: the values you give are inputs into a fairly complex layout algorithm that attempts to best meet the requirements placed upon it in whatever situation it finds itself in. As such, forcing it to just put elements where you want them, and to scale as you intend when the viewport changes, isn't particularly easy. Even a task as simple as centering an object in a space was notoriously difficult.
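
For instance, here's a sketch of the modern incantation for centering one box inside another - three coordinated CSS declarations feeding the layout algorithm, rather than a simple statement of where the thing should go:

```html
<!-- Centre a box in the viewport. Before flexbox this required far
     uglier hacks; even now it's a negotiation with the layout
     algorithm rather than a statement of coordinates. -->
<div style="display: flex; align-items: center;
            justify-content: center; height: 100vh;">
  <div>Centred at last.</div>
</div>
```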

Additionally, the low-level nature of display control means that the browser understands the application only as a bunch of rectangles with text in them, so it can't provide any higher-level functionality, such as alternative ways to interact with the application. The most visible casualties of this were assistive tools such as screen readers, and the commercial pressure to comply with anti-discrimination laws led to the development of a standard called ARIA, which allows HTML elements to be tagged with metadata about their intended meaning to the user. For instance, while a button submitting a traditional HTML form was clearly marked as a button (because the browser needed to know that in order to provide the submit behaviour), in a JavaScript-based app one might just have a <div> element (a generic container) styled to look like a button, with JavaScript code added to react to a mouse click on it; the browser has no way of knowing it's a button - and so no way to, for instance, help a non-visual user find and activate it - unless it carries an ARIA role="button" attribute saying so. But this system depends on developers remembering to add and maintain the ARIA roles; and as the roles are invisible to "normal" users of graphical browsers, it's easy to overlook them.
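
A sketch of the difference (the save() handler is a stand-in for whatever the button actually does):

```html
<!-- A real button: the browser knows exactly what it is. -->
<button onclick="save()">Save</button>

<!-- A <div> styled to look like a button: invisible to assistive
     technology unless the ARIA role is added by hand. tabindex="0"
     makes it keyboard-focusable, but keyboard activation still has
     to be wired up separately in JavaScript. -->
<div class="btn" role="button" tabindex="0" onclick="save()">Save</div>
```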

With X Windows, the difficulty of building user interfaces directly in terms of such low-level primitives led to the development of "toolkits" that provide higher-level functionality on a per-application basis, such as [Tk](https://en.wikipedia.org/wiki/Tk_(software)), GTK, Qt, and myriad lesser-known ones. The vast majority of X applications are written using such toolkits, as they make it easy to produce good results; but equivalents for Web development have been somewhat crippled by at least two main factors I'm aware of:

  1. A cultural expectation that Web-based apps will be individually styled to their own design, because the people paying for commercial Web app development have grown accustomed to having their own "corporate style". Whereas consistency in desktop UIs was seen as a goal to aim for, making applications easy to use and understand, Web applications tend to be judged more on looking fancy and unique; so any reusable shared toolkit that provides its own consistent look and feel is considered boring and ugly - even the most cherished conventions are fair game these days. Such toolkits have arisen to some extent within organisations large enough to have multiple Web apps made by different teams, such as Google's Material Design or the GOV.UK Design System, and others may adopt those toolkits while their original authors are cool and everyone wants to emulate them; but a year later everyone's calling them "dated" and saying it's time to rewrite the app as a "refresh" because it looks "stale".
  2. The complications of the CSS/HTML/HTTP model are hard to abstract over; applications need to directly interact with them to some extent anyway, so any toolkit between the application and the underlying platform will risk getting in the way of things that need to be done.

The only widely-reused toolkits I've seen, therefore, tend to be rather minimal in scope. Most are either CSS that provides standard, basic, low-level presentational building blocks to build your own styles on top of, like Tailwind; or JavaScript libraries that provide some structure for a user interface without directly controlling the display of things, like React or htmx.
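
To give a flavour of how minimal these are, here's a sketch of each style; the /save URL is invented for illustration. Tailwind hands you raw presentational utilities, and htmx wires elements to the server, but neither offers a ready-made widget set:

```html
<!-- Tailwind: low-level utility classes, no widget identity. -->
<button class="px-4 py-2 rounded bg-blue-600 text-white">Save</button>

<!-- htmx: declarative server interaction, no opinion about looks.
     POSTs to /save (invented) and swaps the reply into #result. -->
<button hx-post="/save" hx-target="#result">Save</button>
<div id="result"></div>
```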

There's a dizzying array of different tools out there, none of which yet provides a development or user experience anywhere near as good as the interactive user-interface builders of the 1990s. The frontend JavaScript community is highly divided along ideological lines, probably going back to the original conflict between CSS/HTML/HTTP as a system that separates content, presentation, and interaction versus the Web as a platform for building highly integrated custom applications; so new JavaScript and CSS frameworks crop up with surprising frequency, each justified by blog posts explaining how all the others are terrible and misguided.

This also means that Web applications are hard to understand. In the early days of the Web, if you saw something cool you could right-click, choose "View Source", and see how it was done. Nowadays, the source of a Web app is often minified JavaScript, HTML that's just a meaningless pile of nested <div>s dwarfed by their gnarly class="..." attributes, and CSS compiled from SASS. The actual source the developers work with lives elsewhere; what gets sent to the user is compiled from it, by tools that help paper over the complexity of the platform. The ability to View Source and make sense of what a Web application is doing has fallen by the wayside.
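
An illustrative (and entirely invented) before-and-after of what the build pipeline does to View Source:

```html
<!-- What the developers wrote, in their repository: -->
<script>
function addToBasket(item) {
  basket.push(item);
  renderBasket(basket);
}
</script>

<!-- What View Source actually shows, after minification: -->
<script>function a(n){b.push(n),c(b)}</script>
```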

This is not a world that is easy for newcomers to enter; we're building an ivory tower. Sure, being a good interface designer requires an extensive background in human factors, user research, and various other important disciplines - but adding "mastery of an ever-shifting tangle of complicated technologies" to the list is a distraction from the important (and, probably, more rewarding) parts of the job. Problem-domain design skills can be learnt from scratch by producing something terrible and putting it in front of users, but aspiring Web developers need to learn JavaScript, CSS frameworks, and GitHub before they can even start contributing to a site. The Web offered the potential to democratise publication - but we've squandered it.

To quote a random person from the Internet who put it better than me:

Of all the sins Electron and web design have visited on the world, the idea that the basic ergonomics of a platform's UI controls is the domain of the app designer is the biggest. Save me from designers who insist on "fixing" what a button looks like, and how it works.

And now we have an entire generation of computer users who have never experienced UI consistency, and so they have no idea how hard their life has been made by bad design.

[Russell Keith-Magee, Nov 2024](https://cloudisland.nz/@freakboy3742/113512103924724697)

The Electron mentioned in the quote is a software toolkit for building cross-platform (Windows, Mac, and Linux) desktop apps by writing them as Web apps, rendered by an embedded copy of the Chromium browser. This means that, even on the desktop, each app is starting to come with its own visual language - inconsistent for users, and something developers have to build from scratch for every app.

Oh dear.

Creative Commons Attribution-NonCommercial-ShareAlike 2.0 UK: England & Wales