## The Toolbox, Volume 1: The Sample Size Calculator

*by Doug Paulin*

*Senior Manager, Logile, Inc.*

Many individuals who find themselves managing labor programs do not have formal training in engineering concepts that can be incredibly helpful in ensuring their success and the success of their company. The Toolbox aims to cover one of these concepts each month, providing useful instruction, templates, and tools that you can put into practice.

This month’s tool download: The Sample Size Calculator

Welcome to the first installment of a monthly Logile offering, The Toolbox. Through my years working with individuals leading workforce management programs, I have come to realize that many of them rose through their organizations into these roles, gaining deep experience about their business, industry and customers along the way. However, in many cases these individuals never received formal training in the concepts, tools and approaches that can greatly assist them in different facets of their current role. The purpose of this regular offering is to provide you with training and tools that you can put into practice right away to achieve better results in your labor management programs.

So, before we dive into the theory or the tool itself this month, let’s discuss a typical challenge posed to those working in labor management programs. Your company is considering a change to a standard operating procedure. Maybe it is introducing new technology to enhance the customer experience, a new marketing strategy, or just changing something for the sake of changing it (we have all been there). In an organization where the labor management team has been integrated into evaluating potential changes to operations (and if yours hasn’t, it is time for the leader of your group to speak up), you may be tasked with evaluating the impact of making this type of change. The company has set up a pilot program in a location and you have traveled to observe the new process. And now what?

If you and your organization use a predetermined time and motion system such as MOST, the answer may be simple (and if you do not, please feel free to reach out to learn about the benefits). You observe the process, write your method descriptions, develop your sequence models, and calculate the time for the overall process, later performing some form of extrapolation across the organization to determine the impact of the potential change.

However, what do you do if you do not use a predetermined time and motion system? Furthermore, systems like MOST are only useful when there is motion to study. What if you are trying to understand the impact of a change not concerned with motion, such as a machine processing time or an interaction between a customer and an associate demonstrating a new product? The answer is that you need to perform a time study.

Assuming that you understand the proper approach to designing and conducting a time study, the question remains: how many times must you observe and measure the process with a stopwatch? The correct answer is, as many times as necessary to achieve the acceptable statistical accuracy prescribed by your organization for such data. But what does that mean?

Time study, along with many other forms of data collection, is a sampling process. This means we can assume that our samples are normally distributed around the unknown population mean (the average across all occurrences of this process), with unknown variance.[i] The number of samples that you will need to collect depends on how large the variance, or spread, is between your samples. Without diving too deep into the statistics or theory, we can utilize statistical approaches related to sample populations to arrive at the following equation for calculating the variance based on your observations:
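In standard notation (following the Niebel and Freivalds text cited in the footnote), where each observed time is x_i, the sample mean is x̄, and the number of observations is n, the sample variance is:

```latex
s^2 = \frac{\sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2}{n - 1}
```

Note the n − 1 in the denominator, which is what distinguishes the sample variance from the population variance.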

With time study we are almost always dealing with a very small initial sample (we recommend close to 30 initial samples to use in this exercise). Due to this, we must use a *t*-distribution to estimate confidence intervals (that statistical accuracy prescribed by your organization mentioned above). This yields the following:
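In the same notation, with t taken from the *t*-distribution table at the chosen probability P and n − 1 degrees of freedom, the confidence interval around the sample mean is:

```latex
\bar{x} \pm t \cdot \frac{s}{\sqrt{n}}
```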

Finally, we can solve for *n* to determine the total number of samples, beyond the initial collection, that we need to measure:
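Requiring that the half-width of the confidence interval be no more than an acceptable fraction k of the mean, and solving for n, gives the familiar sample-size formula; the choice of k (for example, k = 0.05 for ±5 percent) is the conventional one from the work-measurement literature, not something specified here:

```latex
n = \left( \frac{t \, s}{k \, \bar{x}} \right)^{2}
```

Subtracting the initial observations already collected from this n leaves the number of additional samples still to be measured.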

So now that we’ve concluded our statistics lesson for today, how do we actually use this information?

The first thing that you must do is set up your time study utilizing the proper methodology (i.e., document the entire process, break it down into work elements, define start and end points for each element, etc.). Once you have done that, you must collect an initial sampling of times. We recommend collecting 30 initial time samples. Once we have this data, all that remains is to determine the desired accuracy and start using the provided tool.

This accuracy is expressed in the *t*-distribution table as Probability (*P*), which refers to the sum of the two tail areas (right and left) of our distribution. Basically, we are defining the odds that any sample falls in the main portion of our bell-shaped graph (between the tails). As we decrease *P*, we increase the odds that a sample falls between the tails rather than within them, and therefore increase the accuracy of our measurement. However, we also increase the number of samples that we must potentially collect to achieve this accuracy. A general best practice, and what Logile recommends, is to require an accuracy of 95 percent, or *P* = 0.05.

*Figure 1 – An example of a normal distribution (the bell shape) with the tails highlighted in yellow. The tails represent the portion of samples that will fall outside of our accepted accuracy. The lower the *P* value, the smaller the yellow areas and the higher the odds that a sample falls between those yellow areas.*

So now that we’ve discussed the statistics that this process is based on, collected our initial samples, and determined our desired accuracy, let’s explore how to use the tool provided in this installment (download link provided at the top of this post).

The instructions are listed in the document, but we will review them quickly here as well. First, take your samples (in seconds) and type them into the shaded cells in column B (starting in cell B4). Select the desired confidence interval in cell G13 (set by default to 95 percent). Once you have done these two things, any samples beyond acceptable control limits will be highlighted in red – delete these values. Once you have done this, your required sample size will be presented in cell G16, highlighted in green.
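The spreadsheet’s exact control-limit formula is not documented in this post, but a minimal sketch of that red-flagging step, assuming limits of the mean plus or minus two sample standard deviations, might look like:

```python
import statistics

def out_of_control(times, k=2.0):
    """Return samples falling outside mean +/- k sample standard
    deviations. k = 2.0 is an assumed default, not the tool's
    documented setting."""
    mean = statistics.mean(times)
    s = statistics.stdev(times)  # sample standard deviation (n - 1)
    lower, upper = mean - k * s, mean + k * s
    return [t for t in times if t < lower or t > upper]

# Example: one obviously long reading gets flagged for review.
readings = [10.0, 11.0, 12.0, 10.0, 11.0, 12.0, 10.0, 11.0, 12.0, 50.0]
print(out_of_control(readings))  # -> [50.0]
```

As in the tool, flagged values should be reviewed and removed before the required sample size is calculated, since a single bad reading inflates the variance and therefore the sample count.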

*Figure 2 – Screenshot of this month’s tool, The Sample Size Calculator*

The tool performs the calculations presented above on every sample that you enter. Practice using the tool by inputting fabricated values and changing the Confidence selection to see how the calculations and *Required Samples* values change as you increase or decrease the Confidence, as well as how they change as the variance between your values changes.
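For readers who prefer code to spreadsheets, the whole calculation can be sketched in a few lines. This is an illustration of the statistics above, not the spreadsheet’s exact formulas: the *t* value is hard-coded for 29 degrees of freedom at 95 percent confidence (about 2.045, matching 30 initial samples), whereas the tool looks it up for whatever confidence you select, and the ±5 percent precision target is likewise an assumed convention.

```python
import math
import statistics

def required_samples(times, t_value=2.045, precision=0.05):
    """Total observations needed so that the true mean lies within
    +/- precision * mean at the chosen confidence level.
    t_value defaults to t for 29 degrees of freedom at 95 percent."""
    mean = statistics.mean(times)
    s = statistics.stdev(times)  # sample standard deviation (n - 1)
    return math.ceil((t_value * s / (precision * mean)) ** 2)

# Example with fabricated times (seconds):
print(required_samples([10.0, 12.0, 11.0, 13.0, 9.0]))  # -> 35
```

Raising the confidence (a larger *t* value) or tightening the precision drives the required count up quickly, since both appear squared in the formula.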

As with many processes related to workforce management, time study included, there is a right way to conduct the exercise to ensure that you produce the most accurate results possible. If you do not calculate the correct sample size, you risk basing something like a labor standard on data that does not truly represent what is going on in your organization. For processes that occur in great volume, such as register transactions for a retailer, a poor standard built on inadequate measurement can result either in millions of dollars of unnecessary annual labor cost or in understaffing for your customer volume. Now you have one more tool in your toolbox to ensure that this is done correctly.

[i] Benjamin W. Niebel and Andris Freivalds, *Methods, Standards, and Work Design* (McGraw-Hill, 2003), 393.
