Sunday, December 3, 2017

Compare Testing

If you believe that testing is inherently about information then you might enjoy Edward Tufte's take on that term:
Information consists of differences that make a difference.
We identify differences by comparison, something that as a working tester you'll be familiar with. I bet you ask a classic testing question of someone, including yourself, on a regular basis:
  • Our competitor's software is fast. Fast ... compared to what?
  • We must export to a good range of image formats. Good ... compared to what?
  • The layout must be clean. Clean ... compared to what?
But while comparison is an important tool for getting clarification in conversation, to me it feels like testing is more fundamentally about comparisons.

James Bach has said "all tests must include an oracle of some kind or else you would call it just a tour rather than a test." An oracle is a tool that can help to determine whether something is a problem. And how is the value extracted from an oracle? By comparison with observation!

But we've learned to be wary of treating an oracle as an all-knowing arbiter of rightness. Having something to compare with should not lure you into this appealing trap:
I see X, the oracle says Y. Ha ha! Expect a bug report, developer!
Comparison is a two-way street and driving in the other direction can take you to interesting places:
I see X, the oracle says Y. Ho hum. I wonder whether this is a reasonable oracle for this situation?
Cem Kaner has written sceptically about the idea that the engine of testing is comparison to an oracle:
As far as I know, there is no empirical research to support the claim that testers in fact always rely on comparisons to expectations ... That assertion does not match my subjective impression of what happens in my head when I test. It seems to me that misbehaviors often strike me as obvious without any reference to an alternative expectation. One could counter this by saying that the comparison is implicit (unconscious) and maybe it is. But there is no empirical evidence of this, and until there is, I get to group the assertion with Santa Claus and the Tooth Fairy. Interesting, useful, but not necessarily true.
While I don't have any research to point to either, and Kaner's position is a reasonable one, my intuition here doesn't match his. (Though I do enjoy how Kaner tests the claim that testing is about comparisons by comparing it to his own experience.) Where we're perhaps closer is in the perspective that not all comparisons in testing are between the system under test and an oracle with a view to determining whether the system behaviour is acceptable.

Comparing oracles to each other might be one example. And why might we do that? As Elaine Weyuker suggests in On Testing Non-testable Programs, partial oracles (oracles that are known to be incomplete or unreliable in some way) are common. To compare oracles we might gather data from each of them; inspect it; look for ways in which each has utility (such as which has more predictive power in scenarios of interest).
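As a minimal sketch of what gathering and inspecting that data could look like, the Python below compares two made-up partial oracles against a handful of scenarios with trusted outcomes. Everything in it is an assumption for illustration: the oracles (spec_oracle, legacy_oracle), the scenarios, and the choice of "coverage" and "accuracy" as stand-ins for utility are hypothetical, not something taken from Weyuker's paper.

```python
from typing import Callable, Dict, Optional

# An oracle here predicts an expected result, or returns None when it has no opinion.
Oracle = Callable[[int], Optional[int]]


def spec_oracle(n: int) -> Optional[int]:
    """Hypothetical oracle from a spec that only covers inputs below 10."""
    return n * n if n < 10 else None


def legacy_oracle(n: int) -> Optional[int]:
    """Hypothetical oracle from an old version, unreliable for negative inputs."""
    return n * n if n >= 0 else -(n * n)


def summarise(oracle: Oracle, scenarios: Dict[int, int]) -> Dict[str, float]:
    """Report how often the oracle answers, and how often its answers are right."""
    answered = {}
    for n in scenarios:
        prediction = oracle(n)
        if prediction is not None:
            answered[n] = prediction
    correct = sum(1 for n, value in answered.items() if value == scenarios[n])
    return {
        "coverage": len(answered) / len(scenarios),
        "accuracy": correct / len(answered) if answered else 0.0,
    }


# Hypothetical scenarios of interest: input -> outcome we already trust.
scenarios = {-2: 4, 0: 0, 3: 9, 12: 144}

for name, oracle in [("spec", spec_oracle), ("legacy", legacy_oracle)]:
    print(name, summarise(oracle, scenarios))
```

Neither oracle wins outright in this toy data: one is always right but sometimes silent, the other always answers but is sometimes wrong. Which is more useful depends on the scenarios you care about.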

And there we are again! The "more" in "which has more predictive power" is relative: it tells us that we are comparing. In fact, here we're using comparisons to make a decision about which comparisons might be useful in our testing. I find that testing is frequently non-linear like that.

Another way in which comparison is at the very heart of testing is during exploration. Making changes (e.g. to product, data, environment, ...) and seeing what happens as a result is a comparison task. Comparing two states separated by a (believed) known set of actions, irrespective of whether you have an idea about what to expect, is one way of building up knowledge and intuition about the system under test, and of helping to decide what to try next, what to save for later, and what looks uninteresting (for now).
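A rough sketch of that kind of before-and-after comparison, in Python, might look like the following. The snapshot keys (rows, cache, log_lines, temp_files) are invented for illustration, and the sketch assumes the system's state can be captured as a flat dictionary, which real systems rarely make that easy.

```python
from typing import Any, Dict, Tuple

MISSING = "<missing>"  # marker for a key that is absent from one snapshot


def diff_states(before: Dict[str, Any], after: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {key: (before_value, after_value)} for every key whose value differs."""
    changes = {}
    for key in sorted(set(before) | set(after)):
        old = before.get(key, MISSING)
        new = after.get(key, MISSING)
        if old != new:
            changes[key] = (old, new)
    return changes


# Hypothetical snapshots taken either side of a single exploratory action.
before = {"rows": 10, "cache": "warm", "log_lines": 120}
after = {"rows": 11, "cache": "warm", "log_lines": 124, "temp_files": 1}

changes = diff_states(before, after)
for key, (old, new) in changes.items():
    print(f"{key}: {old} -> {new}")
```

The diff doesn't say whether any change is a problem. It only surfaces what moved, which is the raw material for deciding what to look at next.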

Again this throws up meta tasks: how to know which aspects of a system's state to compare? How to know which variables it is even possible to compare? How to access the state of those at the right frequency and granularity to make them usable? And again there's a potential cycle: gather data on what it might be possible to compare; inspect those possibilities; find ways in which they might have utility.

I started here with a Tufte quote about information being differences that make a difference, and said that identifying the differences is an act of comparison. I didn't say so at that point, but identifying the ones that make a difference is also a comparison task. And the same skills and tools that can be used for one can be used for the other: testing skills and tools.

Thursday, November 23, 2017

Six & Bugs & Joke & Droll

Hiccupps just turned six years old. Happy birthday to us. And thank you for reading; I hope you're getting something out of it still.

Unwittingly I've stumbled into a tradition of reflecting on the previous 12 months and picking out a few posts that I liked above the others for some reason. Here's this year's selection:

  • What We Found Not Looking for Bugs: a headrush conversation with Anders Dinsen on the nature and timing of testing 
  • The Dots: a headrush conversation with myself on the connections between the connected things 
  • Fix Up, Look Sharp: a headrush reading experience from Ron Jeffries' Extreme Programming Adventures in C# 
  • Quality != Quality: a headrush of being picked up by Hacker News, my page views going nuts, and developers debating quality 
  • A (Definition of) Testing Story: a headrush last-minute conference proposal accepted at UKSTAR 2018 

And in the meantime my mission to keep my testing mind limber with rule-of-three punning continues too. Check 'em out on Twitter. Join in!

(And apologies to Ian Dury.)

Saturday, November 18, 2017

Don't Knock It

They were chuckling at me when I came back from the kitchen next to the meeting room. They were grinning and smirking at each other because they'd heard me laugh out loud and knew that I was the only person in there.

So I felt compelled to explain that I was laughing because I value highly in testers the ability to find more than one way to look at any given situation. Stated drily like that, it doesn't sound worthy of a solo guffaw, does it? But what I actually said went a bit like this ...

You know that scene in The Lord Of The Rings where they're trying to get into a mine? There's a clue phrase in Elvish above the door that Gandalf translates as "Speak, friend, and enter" but then he can't remember what the password is. Eventually he sees an alternative interpretation, "Say friend and enter", and they get in.

Well, I was in the kitchen looking at the door to the car park and there's a sticker on it which I'm sure I must've read before ...

... but this time I thought: is that door calling me a knob?

Thursday, November 16, 2017

Respond to the Context

Sometimes a phrase just lights up the room when it's spoken.

I encountered one today. One of my team was debriefing us, giving her analysis of our answers to her survey of our experiences of the team pairing experiment that she ran.

I say it lit up the room, but really for me it was writ large in fireworks, sounding a fanfare, and flying loop-the-loops. Here it is:
Respond to the context.
I'll just leave it there for you. And also this.

Wednesday, November 8, 2017

NoSQL for Us

Unfortunately, last night's Cambridge Tester Meetup talk about database unit testing was cancelled due to speaker illness. No problem! We had Lean Coffee instead. Here are a few aggregated comments and questions from the group discussion.

How do you deal with internal conflicts?

  • Give overt, verbal appreciation to the other person and their perspective.
  • Be humble.
  • Leave your ego behind.
  • Conflict is healthier than the alternative. 
  • Conflict betrays a lack of common understanding.
  • I seek conflict.
  • Conflict of personality or of ideas?
  • I want to squeeze out ambiguity and lack of clarity.
  • A stand-up row can be acceptable to achieve that. (Even if it isn't the first thing I'll try.)
  • Some people avoid conflict because they feel they won't win the argument.
  • What is the source of the conflict? That makes a difference.
  • Try to keep discussion to facts; objective not subjective; technical not personal.
  • Try to get to know each other as people.
  • Try to build team spirit.
  • Change your language for different people.
  • Make yourself likeable.
  • Be assertive. That is, be calm, direct and equal.

What does Agile mean to you?

  • The Agile Manifesto is about software engineering and not about other processes.
  • Agile is a good term for marketing to upper management.
  • Extreme Programming is not a good term for marketing to upper management.
  • Agile is for projects where we don't know what we want.
  • It's for when we want to do the right thing but don't know how.
  • It's about early feedback.
  • It's about collaboration.
  • It's about being responsive.
  • Anything-by-the-book is never good.
  • "Painting by numbers doesn't teach you how to paint".
  • Most teams have 30% of their members who don't know what they're doing.
  • I'm a fan of Agile but not a fan of Scrum.
  • Teams at my work mostly use Kanban.
  • It's about knowing things will change and not going overboard on planning.

TDD Difficulty

  • So many people talk about TDD but why is it so hard to get it into use?
  • I like it and my boss likes it, but in five years we've never moved to it.
  • Why?
  • Perhaps it's too big a change for our team.
  • Perhaps no-one wants to make the effort to change it.
  • BDD is a better approach.
  • Is TDD better as personal preference than mandated practice?
  • It only matters that there are tests at the end.
  • Has anyone tried to measure the pros/cons of doing it?
  • Some people think TDD is an overhead; work without benefit.
  • TDD is about design rather than tests.
  • Is TDD really about capturing intent?

How are you using Docker in Testing?

  • To avoid having to deal with dependencies.
  • For installation testing; it's easy to get a known, repeatable environment.
  • Interested in trying to containerise test cases so that we can give something to developers to just run to see an issue.
  • Virtual machines are an unnecessary overhead much of the time.
  • Docker makes it easier to exploit all of the CPU on a host.
  • Docker is no help for kernel development and testing (if you need to use variant kernels).
  • My team haven't found a use for it.

Wednesday, November 1, 2017

A (Definition of) Testing Story

I'm speaking on the Storytelling track at UKSTAR 2018. In eight minutes I'm going to explain how I arrived at my own definition of testing, why I bothered, and what I do with it now I've got it. 

You can find some of the background in these posts:
and I made a kind of sketchnote video thing to promote it too:

If you still want to come after all that, get 10% off, on me:

See you there I hope.

Saturday, October 28, 2017

Quality != Quality

Anne-Marie Charrett delivered a beta version of her TestBash Manchester keynote at the Cambridge Tester Meetup this week. Her core message was that quality and testing are not the same thing:
  • there are non-testing aspects of software development that contribute to product quality
  • there are non-product aspects of quality which should be considered in software development.

A theme of the talk was that customer benefit could be threatened by the second of these, by factors such as code hygiene, speed of delivery, and time to recover after a failure in production. Testers, and others in software development, were urged to reframe their view of quality to encompass these kinds of activities. A Venn diagram represented it like this:

Interesting, but it didn't quite hang together for me. I slept on it.

In the morning, I found myself thinking that what Anne-Marie was trying to visualise really had two notions of quality, and they were not the same. Perhaps she could move from a two-way to a three-way relationship between product quality (features, performance, usability and so on), production quality (for the non-product stuff around producing the software), and customer benefit. (Although I prefer business value to customer benefit because the business might at times prefer things that don't give value to the customer.)

Here's how I've tried to sketch that:

The sweet spot is work that improves the way the software is produced, improves the software and adds value for the business. For example, changing from a product implemented in two languages to a product implemented in a single language could enhance in-product consistency and performance, simplify toolchains, and reduce IDE licensing costs.

But the tripartite division gives other potentially interesting intersections too. There's the traditional new feature which drives an increase in sales (product quality/business value) and then there are situations like moving from weekly to daily drops of a core component to internal teams, which removes wasted time on their side (production quality/business value).

Anne-Marie asked for feedback on her talk so I pinged her a few notes along with a sketch of my idea and she incorporated some of it into the keynote.

Which is gratifying and all that but while my model might be considered an iterative improvement on hers, it's not short of its own quirks. The intersection I haven't mentioned yet (production quality/product quality) could be encountered when ancient build servers are replaced, enabling newer libraries to be used in the product, but adding (at least, at that time) no value.

The caveat, at that time, is interesting because it reflects the idea that there's a granularity effect at play. The example just given, at a certain temporal granularity, adds no value. But once new features that build on the new library are implemented, value is added. Zoom out to a wider time perspective and the action of updating the build server can sit in the sweet spot.

There are other ways in which this model is fuzzy: in a continuous deployment world, the boundary in the pipeline between the product and the production of the product becomes harder to define. Also, there's no good way to represent stuff that's actively detrimental to business value.

And there are ways in which our viewpoint (biased to the technical) can distort the relative importance of our interests too. Remember that business value can be generated without any involvement of the development staff: dropping the price of the product might drive sales and increase overall revenue.

Your perspective on the model alters the value of the model. Quality may not be whatever you think it is. Stay humble.
Images: Nick Pass and Dan Billing (via Twitter)

Edit: This post blew up a bit on Hacker News. The views expressed there on what quality is or isn't, the ease with which quality can be achieved, and the notion of quality as subjective or objective are interesting to see. Anne-Marie used Weinberg's definition of quality in her talk and I recently wrestled with that in In Two Minds.