Tampering

Any of you reading this who have been on my Systems Thinking course will have had the fun of being involved in Dr Deming’s famous red bead experiment.

This post is about Dr Deming’s other (not quite so famous but equally important) Funnel experiment. The experiment teaches about the harm caused by ‘Management by Results’ (MBR), where that harm comes about through tampering.

Here’s the experiment set up:

We have a flat horizontal surface (let’s say a table) with a piece of paper placed on top of it. We also have a kitchen funnel (like we would use to decant a liquid from one bottle to another), and a marble that we will drop through the funnel towards the paper below.

Let’s assume that the funnel is held upright in some sort of stand, say 50cm above the piece of paper.

Now we mark a cross in the middle of the piece of paper.

Goal: to hit the cross with the marble by dropping it through the funnel.

Round 1: We position the funnel over the cross and then drop the marble through the funnel 50 times. For each marble drop, we mark the spot on the paper where it hits.

We are likely to see something like this on the paper (looking from above):

[Image: marble drop pattern with the funnel fixed over the cross (stable)]

Remember, we simply dropped 50 marbles without attempting to make any changes in-between drops and the paper shows the system to be stable. However, note that there is variation as to where the marble landed. It continually landed near the cross (with probably a few direct hits) but there was variation.

Round 2: So this time, we think that by adjusting the position of the funnel between each marble drop, we can ensure that the marble hits the cross on the next drop!

So we drop the 1st marble, note where it lands as compared to the cross and then move the funnel to compensate for this error. i.e.

  • if the marble landed 1cm to the left (west) of the cross, we move the funnel 1cm to the right (east) of its current position….because this ‘fine tuning’ will make the next drop hit the cross, right?;
  • if the 2nd marble then lands 2cm below (south of) the cross, we move the funnel 2cm north from where it is currently positioned;
  • ….and so on, iteratively moving the funnel

So, what happens after we use this approach with our 50 marbles, iteratively adjusting the funnel’s position after each drop?

Well, it’s somewhat disappointing!

[Image: marble drop pattern when the funnel is adjusted after each drop to compensate for the last miss]

Our attempts at compensation have made the variation increase drastically (experiments show the spread of the landing spots growing by approx. 40%). We’ve made things much worse.
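
For anyone who would rather check this at a keyboard than with a real funnel, here is a minimal Python sketch of Rounds 1 and 2. It is an illustration under assumptions of mine, not a record of the real experiment: it looks at one axis only (say east-west), treats the marble’s scatter around the funnel as normally distributed, and uses many more than 50 drops so that the measured spread settles down. The names `drop_marbles`, `rule` and `sigma` are made up for this example.

```python
import random
import statistics

def drop_marbles(rule, drops=50_000, sigma=1.0, seed=1):
    """Simulate marble drops along one axis; the cross is at position 0.

    rule="fixed":      Round 1, the funnel stays over the cross.
    rule="compensate": Round 2, after each drop the funnel is moved from its
                       current position by the opposite of the last miss.
    """
    rng = random.Random(seed)
    funnel = 0.0
    landings = []
    for _ in range(drops):
        landing = funnel + rng.gauss(0.0, sigma)  # scatter around the funnel
        landings.append(landing)
        if rule == "compensate":
            funnel -= landing  # the miss is (landing - 0), so move the other way
    return landings

round1 = statistics.pstdev(drop_marbles("fixed"))
round2 = statistics.pstdev(drop_marbles("compensate"))
print(f"Round 1 spread: {round1:.2f}")
print(f"Round 2 spread: {round2:.2f}")
print(f"Increase: {100 * (round2 / round1 - 1):.0f}%")
```

Under these assumptions the Round 2 spread comes out roughly 40% wider than Round 1, because each landing now carries the scatter of two drops: its own, and the previous one that we ‘corrected’ for.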

Clearly our ‘Round 2’ method of compensating didn’t work as we wished. Is there another way of compensating and therefore getting better at hitting the cross?

Round 3: The new idea is to do the opposite of the last idea! This time, we will move the funnel directly over where the last marble landed and keep doing this for the 50 drops.

Oh dear, the marble is moving away from the cross and will eventually move off the table and (as Deming put it) all the way “off to the Milky Way”.

[Image: marble drop pattern drifting away from the cross when the funnel follows the last marble]

Perhaps using the last marble drop as a guide isn’t a good idea!
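
The drift can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions (one axis only, normally distributed scatter, made-up names such as `final_landing`): it repeats the 50-drop experiment many times and averages the squared miss of the 50th marble, once with the funnel left alone and once with the funnel following the last marble.

```python
import random

def final_landing(follow_last, drops=50, sigma=1.0, seed=0):
    """Return where the last of `drops` marbles lands; the cross is at 0."""
    rng = random.Random(seed)
    funnel = 0.0
    landing = 0.0
    for _ in range(drops):
        landing = funnel + rng.gauss(0.0, sigma)
        if follow_last:
            funnel = landing  # Round 3: move the funnel over the last landing spot
    return landing

def mean_square_miss(follow_last, runs=2000):
    """Average squared distance of the final marble from the cross."""
    total = sum(final_landing(follow_last, seed=s) ** 2 for s in range(runs))
    return total / runs

fixed_miss = mean_square_miss(follow_last=False)
drift_miss = mean_square_miss(follow_last=True)
print(f"Funnel left alone:          {fixed_miss:.1f}")
print(f"Funnel follows last marble: {drift_miss:.1f}")
```

With the funnel left alone, the average squared miss stays near 1 (the scatter of a single drop). When the funnel follows the last marble, every drop’s error is carried forward, so the figure grows roughly in step with the number of drops: a random walk with nothing pulling it back to the cross.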

Conclusions: So which method was best?

  • Round 1 was clearly the best.
  • Rounds 2 and 3 are examples of tampering (though in different ways). They show the effects of tweaking a process based on the ongoing results of that process…it will simply increase the variation and the chances of failures.

So, what should we do? To actually improve a process requires an understanding of the sources of the variation…and then the performance of controlled experiments to identify process improvements.

For our Funnel system we could experiment with:

  • lowering the funnel;
  • decreasing the size of the funnel hole;
  • strengthening the stand holding the funnel to make it more stable;
  • …performing the process in a vacuum 🙂

All of these proposed improvements involve changing the system rather than merely tampering with it based on previous results.

So what?

Now all of the above looks like good fun…I’m already thinking about borrowing a funnel…but it seems an awful long way from our working lives. So let’s explain why in fact it’s not…

Take the production and selling of something, let’s say sandwiches in a sandwich shop, as an example:

  • you sold 10 sandwiches on Monday, so you make 10 for Tuesday…
  • you only sold 2 sandwiches on Tuesday, so you throw 8 in the bin (not fresh anymore) and you only make 2 sandwiches for Wednesday….
  • you have 6 customers on Wednesday, so you sell the 2 sandwiches you made, have 4 disappointed would-be-customers but make 6 sandwiches for Thursday…
  • …and so on. You can expect to have major stock problems and a lot of unhappy customers!
  • it would be much better to make a set number of sandwiches each day, collect information about demand variation over a sensible period of time and then adjust your system accordingly.
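
To put rough numbers on this, here is a small Python sketch of the two approaches. Everything in it is a made-up assumption for illustration: demand is drawn at random between 2 and 10 customers a day with no trend, the ‘fixed’ shop makes 6 sandwiches every day, and the names (`run_shop`, `policy`) are mine.

```python
import random

def run_shop(policy, days=10_000, seed=3):
    """Return (average wasted, average lost sales) per day for a policy.

    policy="chase": make as many sandwiches as there were customers yesterday.
    policy="fixed": make 6 sandwiches every day.
    """
    rng = random.Random(seed)
    made = 6  # both shops start by making 6
    wasted = lost = 0
    for _ in range(days):
        customers = rng.randint(2, 10)      # hypothetical demand, no trend
        wasted += max(made - customers, 0)  # unsold sandwiches go in the bin
        lost += max(customers - made, 0)    # disappointed would-be customers
        if policy == "chase":
            made = customers                # tomorrow's batch copies today
    return wasted / days, lost / days

chase_wasted, chase_lost = run_shop("chase")
fixed_wasted, fixed_lost = run_shop("fixed")
print(f"Chasing yesterday: {chase_wasted:.2f} binned, {chase_lost:.2f} lost sales per day")
print(f"Fixed batch of 6:  {fixed_wasted:.2f} binned, {fixed_lost:.2f} lost sales per day")
```

Under these assumptions the shop that chases yesterday’s number bins more sandwiches and disappoints more customers than the shop that holds steady. The fixed batch size of 6 is only a starting point; the sensible next step is still to study the demand and then change the system.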

The sandwiches are a very simple example of any process. What about taking a call centre as an example:

  • There will be a great deal of natural variation in customer calls throughout a day (with a number of causes, categorised and explained in this earlier ‘Spice of Life’ post)…so the number of ‘calls waiting’ will constantly fluctuate, though likely between predictable bounds. No surprises there.
  • …let’s assume that Bob’s job is to constantly monitor the current ‘calls waiting’…
  • …it gets to a point where Bob thinks the number of calls waiting is high…so he sends out an urgent request for everyone to drop what else they are doing and get on the phones…and they all rush to do so…
  • ….so the number of ‘calls waiting’ now drops really low and even disappears…excellent. Now Bob sends out a screen pop-up message along the lines of “panic over, people who missed out on their breaks can go now”
  • ….so the number of ‘calls waiting’ now jumps up again…and the up-and-down cycle continues.
  • Bob has a really stressful job looking at the numbers and continually tampering (using the ‘Round 2’ method) in a hopeless attempt to exactly match supply to demand.
  • A better way would be to increase our understanding of the system by studying demand (its types and frequencies) and amending its design based on what we learned. There might be:
    • loads of failure demand in there (which is a waste of capacity); and
    • frequency patterns within the different types of value demand

Clarification: Many of you working in contact centres may say “but Steve, of course we analyse incoming calls and look for patterns!” I would note that:

  • whilst you can, and should, look for predictable patterns in the data, I doubt that you can tell me how many calls you will get in the next 15 minutes and how long they will take. There will be variation and this is outside your control….does this make you tamper?
  • Nearly all contact centres simply see incoming calls as ‘work to do’ with an ‘average handling time’. Hardly any will analyse this demand. Can you tell me what types of value and failure demand you get, and their frequencies…and what conclusions you draw from this?

I’m not picking on contact centres – I simply use it as an example that we should all be able to understand. Tampering happens all over the place.

In general, managers often look at the results of the last hour/ day/ week/ month* and attempt to make adjustments accordingly, whether by congratulating or berating staff, moving people between queues, changing work quotas, knee-jerk reacting to defects and so on.

(* where the unit of time will depend on the nature of the transactional service being provided)

In fact, praising someone one week for a perceived outstanding result (likely against the lottery of a target that they had very little control over meeting) and then giving them a ‘talking to’ the next because their result was considered poor is tampering.

The point is to understand the system and the reasons for variation. Then (and only then) you can make meaningful changes instead of merely tampering.

Note: The ‘Round 3’ type of tampering is not as common as the ‘Round 2’ type…but it does happen. Consider the following cases of using the last example to inform the next:

  • Using the last board cut as a pattern for the next; or
  • Train the trainer: Mary trains John who trains Bob who trains Jess.

Both of these examples show that any variation from purpose will be retained and amplified as it is passed on, like a chain of whispers.

Credit: I’ve seen the funnel experiment performed a few times now but, rather than taking the laborious time to recreate it, I have borrowed the 3 marble drop pattern pictures used above from this website.

Note: For those aficionados amongst you, this post represents a slightly simplified version of Deming’s full funnel experiment. There is yet another tampering rule (which I have left out for the sake of brevity) …but, just so you know, it also doesn’t work. You can read all about the funnel experiment in Chapter 9 of Deming’s book ‘The New Economics’.

The Spice of Life

Variety is the spice of life. If everything were the same it would be rather boring. Happily, there is natural variety in everything.

Let me use an example to explain:

I was thinking about this as I was walking the dog the other day. I use the same route, along the beach each day, and it takes me roughly the same time – about 30 minutes.

If I actually timed myself each and every day (and didn’t let this fact change my behaviour) then I would find that the walk might take me on average 30 minutes but it could range anywhere between, say, 26 and 37 minutes.

I think you would agree with me that it would be somewhat bizarre if it always took me, say, exactly 29 minutes and 41 seconds to walk the dog – that would just be weird!

You understand that there are all sorts of reasons as to why the time would vary slightly, such as:

  • how I am feeling (was it a late night last night?);
  • what the weather is doing, whether the tide is up or down, and even perhaps what season it is;
  • who I meet on the way and their desired level of interaction (i.e. they have some juicy gossip vs. they are in a hurry);
  • what the dog is interested in sniffing…which (I presume) depends on what other dogs have been passed recently;
  • if the dog needs to ‘down load’ or not and, if so, how long this will take today!
  • …and so on.

There are, likely, many thousands of little reasons that would cause variation. None of these have anything special about them – they are just the variables that exist within that process, the majority of which I have little or no control over.

Now, I might have timed myself as taking 30 mins. and 20 seconds yesterday, but taken only 29 mins. and 12 seconds today. Is this better? Have I improved? Against what purpose?

Here are 3 weeks of imaginary dog walking data in a control chart:

[Image: control chart of 3 weeks of dog walking times]

A few things to note:

  • You can now easily see the variation: the walk took between 26 and 37 minutes and, on average, 30 mins. This variation stays hidden until you visualise it;
  • The red lines are the upper and lower control limits: they are mathematically worked out from the data…you don’t need to worry about how, but they mark the range within which the times can be expected to fall. The important bit is that all of the times sit within these two red lines and this shows that my dog walking is ‘in control’ (stable) and therefore the time range that it will take tomorrow can be predicted with a high degree of confidence!*
  • If a particular walk had taken a time that sits outside of the two red lines, then I can say with a high degree of confidence that something ‘special’ happened – perhaps the dog had a limp, or I met up with a long lost friend or…..
  • Any movement within the two red lines is likely to just be noise and, as such, I shouldn’t be surprised about it at all. Anything outside of the red lines is what we would call a signal, in that it is likely that something quite different occurred.

* This is actually quite profound. It’s worth considering that I cannot predict if I just have a binary comparison (two pieces of data). Knowing that it took 30 mins 20 secs. yesterday and 29 mins 12 secs. today is what is referred to as driving by looking in the rear view mirror. It doesn’t help me look forward.
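
For those who do want to know how the red lines can be worked out, here is a minimal Python sketch using one common method, the XmR (individuals and moving range) chart. I am assuming this method for illustration; the chart in this post may have been drawn differently. The walk times are invented to match the range described above, and `xmr_limits` is a made-up helper name. The factor 2.66 is the standard XmR scaling constant applied to the average moving range.

```python
import statistics

def xmr_limits(values):
    """Return (lower limit, centre line, upper limit) for an XmR chart."""
    centre = statistics.mean(values)
    # Moving range: the absolute change from one point to the next.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    spread = 2.66 * statistics.mean(moving_ranges)
    return centre - spread, centre, centre + spread

# Invented data: 21 daily walk times in minutes (3 weeks).
walk_minutes = [30, 28, 33, 29, 31, 27, 32, 30, 34, 29, 26,
                31, 30, 37, 28, 30, 29, 32, 31, 27, 30]

lower, centre, upper = xmr_limits(walk_minutes)
signals = [t for t in walk_minutes if t < lower or t > upper]

print(f"Average walk: {centre:.1f} mins")
print(f"Control limits: {lower:.1f} to {upper:.1f} mins")
print(f"Walks outside the limits: {signals}")
```

With this invented data every walk sits inside the limits, so all of the day-to-day movement is noise. A walk of, say, 45 minutes would fall outside the upper limit and would be worth asking questions about.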

Back to the world of work

The above example can equally be applied to all our processes at work…yet we ignore this reality. In fact, worse than ignoring it, we act like this isn’t so! We seem to love making binary comparisons (e.g. this week vs. last week), deriving a supposed reason for the difference and then:

  • congratulating people for ‘improvements’; or
  • chastising people for ‘slipping backwards’ whilst coming up with supposed solutions to do something about it (which is in actual fact merely tampering)

So, hopefully you are happy with my walking the dog scenario….here’s a work-related example:

  • Bob, Jim and Jane have each been tasked with handling incoming calls*. They have each been given a target of handling 80 calls a day as a motivator!

(* you can substitute any sort of activity here instead of handling calls: such as sell something, make something, perform something….)

  • In reality there is so much about a call that the ‘call agent’ cannot control. Using Professor Frances Frei’s 5 types of service demand variation, we can see the following:
    • Arrival variability: when/ whether calls come in. If no calls are coming in at a point in time, the call agent can’t handle one!
    • Request variability: what the customer is asking for. This could be simple or complex to properly handle
    • Capability variability: how much the customer understands. Are they knowledgeable about their need or do they need a great deal explaining?
    • Effort variability: how much help the customer wants. Are they happy to do things for themselves, or do they want the call agent to do it all for them?
    • Subjective preference variability: different customers have different opinions on things e.g. are they happy just to accept the price or are they price sensitive and want the call agent to break it down into all its parts and explain the rationale for each?

Now, the above could cause a huge difference in call length and hence how many calls can be handled…but there’s not a great deal that Bob, Jim and Jane can do about any of it – and nor should they try to! It is pure chance (a lottery) as to which calls they are asked to handle.

As a result, we can expect natural variation as to the number of calls they can handle in a given day. If we were to plot it on a control chart we might see something very similar to the dog walking control chart….something like this:

[Image: control chart of calls handled per day]

We can see that:

  • the process appears to be under control and that, assuming we don’t change the system, the predictable range of calls that a call agent can handle in a day is between 60 and 100;
  • it would be daft to congratulate, say, Bob one day for achieving 95 and then chastise him the next for ‘only’ achieving 77…yet this is what we usually do!

Targets are worse than useless

Let’s go back to that (motivational?!) target of 80 calls a day. From the diagram we can see that:

  • if I set the target at 60 or below then the call agents can almost guarantee that they will achieve it every day;
  • conversely, if I set the target at 100 or above, they will virtually never be able to achieve it;
  • finally, if I set the target anywhere between 60 and 100, it becomes a daily lottery as to whether they will achieve it or not.
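
The three cases can be sketched with a short Python simulation. It rests on an assumption that is mine rather than anything measured: that the number of calls handled in a day is roughly normally distributed around 80, with the control limits of 60 and 100 sitting three standard deviations either side. `hit_rate` is a made-up name.

```python
import random

def hit_rate(target, days=10_000, mean=80.0, sd=20 / 3, seed=11):
    """Fraction of simulated days on which calls handled meets the target."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(days):
        calls = round(rng.gauss(mean, sd))  # calls handled on one day
        if calls >= target:
            hits += 1
    return hits / days

for target in (60, 80, 100):
    print(f"Target {target}: met on {100 * hit_rate(target):.1f}% of days")
```

Under these assumptions the target of 60 is met on virtually every day, the target of 100 on virtually none, and the target of 80 on about half of them, whatever the call agents do.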

….but, without this knowledge, we think that targets are doing important things.

What they actually do is cause our process performers to do things which go against the purpose of the system. I’ve written about the things people understandably do in an earlier post titled The trouble with targets.

What should we actually want?

We shouldn’t be pressuring our call agents (or any of our process performers) to achieve a target for each individual unit (or for an average of a group of units). We should be considering how we can change the system itself (e.g. the process) so that we shift and/or tighten the range of what it can achieve.

So, hopefully you now have an understanding of:

  • variation: that it is a natural occurrence…which we would do well to understand;
  • binary comparisons and that these can’t help us predict;
  • targets and why they are worse than useless; and
  • system, and why we should be trying to improve its capability (i.e. for all units going through it), rather than trying to force individual units through it quicker.

Once we understand the variation within our system we now have a useful measure (NOT target) to consider what our system is capable of, why this variation exists and whether any changes we make are in fact improvements.

Going back to Purpose

You might say to me “but Steve, you could set a target for your dog walks, say 30 mins, and you could do things to make it!”

I would say that, yes, I could and it would change my behaviours…but the crucial point is this: What is the purpose of the dog walk?

  • It isn’t to get it done in a certain time
  • It’s about me and the dog getting what we need out of it!

The same can be said for a customer call: Our purpose should be to properly and fully assist that particular customer, not to meet a target. We should expect much failure demand and rework to be created from behaviours caused by targets.

Do you understand the variation within your processes? Do you rely on binary comparisons and judge people accordingly? Do you understand the behaviours that your targets cause?