You’re halfway through a cracking blog post; you’ve comprehensively demolished your imagined opponents with the end of that last paragraph, so it’s probably time to hit the save button. But just as you provisionally title your masterpiece ‘really good blog.doc’, the cursor freezes. An interminable 20 seconds of clicking and cursing later, an error message pops up: the bottom line is that you’ve lost your work.
That ‘computer says no’ is one of the most frustrating things to have to deal with: like a toddler throwing a temper tantrum, these machines we deal with every day seem unable to communicate their needs except in the most incomprehensible of terms, turn sour without warning, and with one swipe of a podgy paw, are more than capable of reducing your favourite china vase to smithereens.
If my brick of a Toshiba is a toddler, then I’m sort of waiting for it to grow up into a real AI (artificial intelligence). At the moment, we seem to be stuck at its pre-teen predecessor, something like Siri: it can say stuff, but sometimes gets it quite wrong.
Siri and her fellow voice-recognition technologies are cool because they allow us to interact with machines using, quite literally, our own language. Most of our interactions with computers instead use visual symbols: we’re looking at a screen, moving the mouse across space, swiping and dragging and pinching to manipulate the images we see; there are icons standing for objects or things to interact with, and boxes or windows in fake 3D sitting on top of each other. Virtual reality (VR) is even cooler: we’re so used to the visual interface that having all our vision taken up by what the headset presents to us – even if we can still feel the seat below us or hear our friends’ voices – is enough to trick our bodies into responding with motion sickness and vertigo as our virtual spaceship circumnavigates the planets. Where VR replaces the real world, augmented reality (AR) has us interact with pixel ‘objects’ integrated into our view of real ones, with no separation between the world of digital symbols and the rest of the visual world.
We interact with the world using our senses, and so our interaction with man-made machines has been crafted to make sense to those senses and to resemble the natural world. Yet the recycle bin is simply a metaphorical name for a group of files you have ‘deleted’, rather than a separate space on the hard drive; your Facebook friend list is only metaphorically a book of faces, and more accurately a selection of results pulled from a database on a server far, far away; what looks to you like a line break in Microsoft Word is in reality just a certain rendering of a symbol stored on the computer which, when read by Word, is converted into the visual space you recognise as the end of a paragraph. Maybe this is why it is so uniquely frustrating when computers go wrong: they have been made to interact with us in ways that feel intuitive, and they can even speak with a human voice, but our differences are made apparent when something goes wrong and we realise they are unable to reason and react as we might expect.
One of these differences lies in how the computer communicates. Underneath the graphical or voice-controlled interfaces we are familiar with, the most basic language that controls the computer’s internal operations and its communications with others is binary. Binary is a bit like using a flashlight to signal ‘Chocks away!’ in Morse code: point A sends electrons to point B, and point B knows how to interpret the bursts (now on, now off) as a stream of 1s and 0s. You can then combine sets of 1s and 0s into more complicated codes. For example, take a string of seven on-or-off digits. There are 128 unique strings of seven 1s and 0s (two options per digit, multiplied across seven digits). Each of these could be used to represent an ASCII character (covering the letters, numbers and some special characters) if you want to show text on a screen, or to send what you’re typing on the keyboard back to the computer. Or take a string of eight binary digits (a unit of information known as a byte). Now there are 256 possible unique strings; each can be read as a single number (0–255) and used, for example, to represent one of the four numbers in an IP address, which allows other computers to send messages to yours. The IP address is a symbol with the same relation to the actual device on your lap as your postcode has to your house: it’s not the house itself, but it tells the post office which part of the country you’re in, and allows the postman to get to your front door.
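Here’s a rough sketch of those translations in Python (the particular bit strings, and the 192.168.0.1 address, are just illustrative):

```python
# Reading strings of 1s and 0s as characters and as IP address parts.

def bits_to_int(bits: str) -> int:
    """Interpret a string of 1s and 0s as a plain number."""
    return int(bits, 2)

# Seven bits give 2**7 = 128 possible strings -- enough for every
# ASCII character. '1000001' is the ASCII code for capital 'A'.
assert 2 ** 7 == 128
print(chr(bits_to_int("1000001")))  # A

# Eight bits (one byte) give 2**8 = 256 possible strings (0-255),
# enough for one of the four numbers in an IPv4 address.
assert 2 ** 8 == 256
octets = ["11000000", "10101000", "00000000", "00000001"]
print(".".join(str(bits_to_int(o)) for o in octets))  # 192.168.0.1
```

The same eight bits mean ‘a letter’ or ‘part of an address’ only because sender and receiver have agreed in advance how to read them.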
So computers have their own set of languages, one on top of another, which means that their operations and failures are not the result of a simple ‘computer says no’, but are actually just lost in translation. The business of understanding these machines, and how they communicate with each other, is really just a big translation exercise. When I load up the Yahoo news homepage, what’s really happening is that my computer is sent a file written in HTML ‘language’; Internet Explorer speaks the lingo and knows how to ‘read’ the file, meaning it’s presented to me not just as 1s and 0s, or even as a string of text translated from the binary, but in sections, with bold headlines and links to other websites. The paratextual instructions (like the <b> tag that, when read by the browser, puts the subsequent text in bold) are as important as the actual data: rather than merely framing our interpretation, they force us to ‘read’ one part of the text as the significant headline, another as a numbered list. Firefox and Chrome speak the same language, but I wouldn’t be able to open that file with Excel or Windows Media Player. And, since there isn’t one wire directly linking my laptop to the Silicon Valley data centres, the 1s and 0s that my browser interprets as words or instructions or pictures are not the only bits that pass over the wire. I can’t just throw a message in a bottle over the white cliffs of Dover and expect it to reach my friend in the Bahamas. No, I need a sturdy envelope and an address. If my friend lives in the same village I may not need the postcode, but if she lives abroad I’ll need to write down the country too. The various addresses, the house number, my friend’s name, the delivery instructions to ‘leave package round corner under first flowerpot next to greenhouse’, are all encoded as more 1s and 0s that various other intermediary devices between me and Yahoo translate and interpret in their own way. And in a fraction of a second.
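To see how a browser treats tags as instructions rather than as content, here’s a toy ‘browser’ built on Python’s standard-library HTML parser; uppercase stands in for bold type, since a terminal has no heavier font:

```python
# A toy 'browser': tags like <b> switch the renderer's state rather
# than appearing in the output themselves.
from html.parser import HTMLParser

class ToyBrowser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.bold = False      # current rendering state
        self.rendered = []     # the 'page' as we build it up

    def handle_starttag(self, tag, attrs):
        if tag == "b":
            self.bold = True

    def handle_endtag(self, tag):
        if tag == "b":
            self.bold = False

    def handle_data(self, data):
        # 'Render' bold text in uppercase instead of a heavier font.
        self.rendered.append(data.upper() if self.bold else data)

browser = ToyBrowser()
browser.feed("Today's <b>headline</b> news")
print("".join(browser.rendered))  # Today's HEADLINE news
```

Notice that the string <b> never reaches the reader; it only changes how the text around it is displayed, which is exactly the sense in which the tags are paratext rather than text.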
Yet I’m also aware that translation has its pitfalls: sometimes there is no way of adequately conveying the original meaning in the target language. Similarly, with computers, one small blip can cause major translation problems further up the chain, which is why there is not always a user-friendly way for the computer to tell me what’s going wrong when I try to save my blog post. Thus the French translator working off a bad edition of a novel in which a panda ‘eats, shoots and leaves’ will paint the panda in a much more homicidal light than is warranted by the original manuscript’s ‘eats shoots and leaves’. Thus one microscopic gene mutation can land you with a life-threatening tumour. Thus one 0 mistakenly changed to a 1, or vice versa – for example, if the transmission is interrupted by a cosmic ray – could land you on gacebook.com instead of facebook.com.
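The facebook/gacebook example is no accident: in ASCII, ‘f’ and ‘g’ really do differ by a single bit, as a quick Python check shows:

```python
# In ASCII, 'f' (1100110) and 'g' (1100111) differ only in the last
# binary digit, so one flipped bit is enough to change the letter.
def flip_lowest_bit(ch: str) -> str:
    return chr(ord(ch) ^ 1)  # XOR with 1 flips the final binary digit

print(format(ord("f"), "07b"))  # 1100110
print(format(ord("g"), "07b"))  # 1100111
print(flip_lowest_bit("f") + "acebook.com")  # gacebook.com
```

(In practice, network protocols carry checksums precisely to catch this sort of corruption, but the principle stands: a tiny change low down the stack can mean something very different higher up.)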
So if computer problems are really just problems of translation and interpretation, successful diagnosis and treatment of these problems must be a question of learning the lingo and using our powers of deduction and inference to draw meaning from what we can see. Is my computer crashing because something was mistranslated; or was there an error in the original text? Like, did the electron transmitters melt because I left it outside without applying sunscreen? Working out what happened is like investigating a crime, in which ultimately I can only attempt to interpret the symbols I’m presented with in the best way I can. If gacebook.com is owned by hackers and they try to steal my online banking details, the investigation will involve piecing together the evidence and recreating what happened from the traces left by the hackers, either on the computer itself, or present somewhere in the 1s and 0s passing between me and gacebook. This is exactly like working from the symptoms to diagnose a disease; or matching the fingerprints the burglar left on the drinks cabinet; or sussing out whether someone is telling the truth from their mannerisms and speech patterns; or getting stuck into a good murder mystery where you know that the narrator is feeding you select information only. So with a bit of skill, maybe it will be possible to move past the uninformative error window to write the computer some instructions using yet another language in the terminal, and maybe then it will let me know whether I can get my blog post back.
It seems to surprise us that dealing with computers, like everything else in life, requires us to read signs and know how to interpret them correctly. The bottom line is that interpreting the world is difficult, whether the stakes are navigating human relationships, thinking through a difficult problem or recovering a cracking piece of argumentative writing – and it’s no different in this new machine domain, even though we invented the darn things. It doesn’t help that there is a massive skills gap: I’ve a better chance of working out why some people wear jeans with knee rips (at first sight, incomprehensible) than why I can’t set my iPhone snooze period to anything other than 9 minutes. To be fair, this looks set to be remedied soon, as coding in schools is on the up while modern-languages uptake is down. That feels a bit sad; although perhaps now that an AI program can beat a human at the 2,500-year-old strategy game Go, it shouldn’t surprise me that we’d rather learn to communicate with machines than with our foreign neighbours.