Explaining my Title

I believe I got the quote from Cory Doctorow, and he credited a designer whose name I've never been able to find or credit, so feel free to let me know if you know the source.

EDIT: Cory was kind enough to let me know that this is from Dr. Don Norman, experiential designer and co-founder of the Nielsen Norman Group.

But the quote as I remember it is this:

"The default state of technology, any technology from stone axes to modern computers, the default state of technology is 'broken'."

It's true, too; technology as a definition is something that is created and thus must be maintained. The axe is possibly the earliest and easiest example: axes that aren't sharp are spectacularly bad axes. For a really, really long time we survived on technology and tools that were only moderately more complicated than our own fingers and teeth, and thus technology was relatively easy to maintain and manage, but still: a dull knife is a failure mode. A snapped bowstring is a failure mode. And it requires time, attention, and effort to keep the technology of life out of failure mode and in a usable condition. And this is a condition that becomes more and more true as systems become more complex and civilization becomes, well... civilization.

The thing about the modern world is that it's become so complex that the average person, while entirely capable of maintaining their own technology, just doesn't have time to do it on their own. In point of fact, there's a better-than-average chance that they're out doing maintenance on someone else's technology so that person can do maintenance on someone else's technology... it's maintenance all the way down, in our society. The miracle isn't that computers make our lives easier; the miracle is that they manage to not fail on a reasonably regular basis.

An anecdote: the FAA requires that every plane that flies meet a strict policy on maintenance -- the industry standard as I understand it is the "five nines", which means that 99.999% of the parts and functionality of the aircraft must be working for the aircraft to be certified as air-worthy. If you accept the idea that the average 737 has a million moving parts (and I personally think that's low), then that means that on every Southwest flight you take there are as many as 10 things on the plane that are broken. The good news is that they often aren't major things -- a seatbelt doesn't lock, a cabin compartment doesn't latch, etc. -- but again, the miracle isn't that planes fly, but rather that planes don't fall out of the sky on a regular basis.
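If you want to check my math on that, here's a minimal sketch in Python. The million-part count and the 99.999% threshold are the figures from the anecdote above, not official numbers, and `allowed_broken` is just a name I made up for illustration. It uses exact fractions so floating-point rounding doesn't muddy a count this small.

```python
from fractions import Fraction


def allowed_broken(total_parts: int, working_percent: str) -> int:
    """Largest whole number of parts that can be broken at a given threshold.

    working_percent is passed as a string (e.g. "99.999") so it can be
    parsed exactly instead of going through a binary float.
    """
    broken_fraction = 1 - Fraction(working_percent) / 100
    # int() truncates toward zero, so a fractional part never rounds up.
    return int(total_parts * broken_fraction)


# Assumed figures from the anecdote: one million parts, "five nines" working.
print(allowed_broken(1_000_000, "99.999"))
```

Bump the part count up (I did say a million felt low) and the number of allowed-broken things scales right along with it.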

The difference between older, more "reliable" technology and the new experience of the Internet Of Things and our Software-based interface with the world is that most older technology has had the edges shaved down and sanded off. By default, these systems have been redesigned and redesigned until the understanding is that the technology persists in a system where the failure mode is understandable and easy to manage (though sometimes the timing of that failure mode is less-than-ideal -- witness anyone who's had a car run out of gas between mileposts on the freeway).

Much of Operational Thinking involves planning for Failure Modes -- how does it fail, why does it fail, what happens to the user / customer / involved systems when it fails -- and working with management and development teams to determine risk matrices for a given situation and the likelihood of business impact. Often the most important question an Ops team member can ask any developer is "how does it fail," because many developers (rightly enough) are extremely focused on delivery modes and success, and it's the job of the Ops person to make sure that failure is a mode the business as a whole and every partner in the business thinks about in order to reduce time spent in that mode.

Another Anecdote: Disaster Recovery planning is very-low-reward work. Often thinking about DR is boring and weird, because it often involves situations that just plain don't happen...until they do. The DR plan for the Datacenter flooding is not something anyone wants to work out, until it's June of 2011 and your company is looking at an emergency relocation of your production environment because your current datacenter is just outside Council Bluffs and there's a record water-release upstream on the Missouri that's about to sweep through and put the first two floors of the building underwater. Then it becomes really valuable to have that white binder with the carefully-laid-out plans for system migration. And it can be both expensive and panic-inducing when it turns out the white binder is empty / out of date, especially since your clients in New York and California aren't really on board with you taking a week off to fix the problem...

Software (and nearly all modern tools are to some extent married to some sort of software) sometimes breaks. Sometimes it breaks in extremely predictable ways, and sometimes it breaks in ways that are not only impossible to predict but sometimes nearly-impossible to replicate (want to have fun? Google "Leap Second Bug" and head down that particular Wikipedia rabbit hole). As an Operational-minded person, I am often looking for new and interesting failure possibilities in the various tools I use. But most people don't think about different types of failure modes; they have a mindset that all tools are either "working" or "broken". And it's important to think about that. And to recognize that the default state of modern civilization and life is much more often "broken".
