Optimizing Service and Resources
It's 5:30 pm, it seems like everybody has just left work, and you're stuck in a long checkout line at the grocery store.
Do you wonder why store management hasn't figured out how many cashiers they need during the evening rush?
If you do, you probably have an intuitive appreciation for the importance of queues.
Whether workers wait to use the office copier, airplanes wait to land, or parts wait on an assembly line, queues are an inevitable, and often frustrating, part of life. Waiting lines affect people every day, which is why a primary goal in many businesses is to provide the best level of service possible. Minimizing those waiting lines is a key part of creating a positive experience for the customer.
How can you achieve that in your organization? Well, there's a whole body of mathematical knowledge dedicated to studying, simulating, and analyzing wait times. It's called queuing theory, and it can help minimize the cost to your business of waiting lines.
It does that by helping you determine the best way to use your staff and other resources, while reducing customer wait times.
Queuing models show you how to make sure you have enough staff working, at any given time, to provide a good level of service – without hurting profitability by having people standing around doing nothing.
Queuing models consider the following:
- The average arrival rate of customers.
- The average rate of servicing customers.
- The cost to the business of customer dissatisfaction resulting from waiting time.
- The cost to provide the service points.
Most queuing models follow the same basic structure: customers arrive for service, they join a line, and they wait to be served. To determine if you have any bottlenecks or other inefficiencies in your queue, you need to figure out what's happening in the queue. Little's Law helps you do this. It states that, over the long run, the average number of customers in the queue (L) is equal to the average arrival rate (λ) multiplied by the average time a customer waits in the queue (W).
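Here's an example: suppose your call center receives 8,000 calls per quarter, and you need to figure out the most efficient way of providing phone support to your customers. The 8,000 calls per quarter is an arrival rate (λ), not a queue length (L), so the first step is to express it in convenient units. Little's Law then links it to the other two quantities:
L = λW
λ = 8,000 calls per quarter × 4 quarters per year
λ = 32,000 calls per year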
If you have two people in the call center, each working an eight-hour shift, and they work 250 days per year, then you have 2 × 8 × 250 = 4,000 working hours of staff time to serve customers.
The number of calls that must be processed per working hour is as follows:
λ = 32,000/4,000 = 8 calls per hour
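To see Little's Law itself at work, you also need an average waiting time. The short Python sketch below repeats the arithmetic above and then applies L = λW. The three-minute average wait is an assumed figure chosen purely for illustration, and the sketch treats the 8 calls per hour as the rate at which calls reach the queue.

```python
# Little's Law: L = lambda * W

calls_per_quarter = 8_000                 # from the example
calls_per_year = calls_per_quarter * 4    # 32,000
working_hours = 2 * 8 * 250               # two people, eight-hour shifts, 250 days

# Treated here as the rate at which calls reach the queue (calls per hour)
arrival_rate = calls_per_year / working_hours

# ASSUMPTION: callers wait three minutes on average (illustrative only)
assumed_wait_hours = 3 / 60

average_calls_waiting = arrival_rate * assumed_wait_hours

print(f"Arrival rate: {arrival_rate:.1f} calls per hour")
print(f"Average number of calls waiting: {average_calls_waiting:.2f}")
```

Under that assumption, an average of 0.4 calls would be waiting at any moment. Measuring any two of the three quantities lets you work out the third in the same way.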
While Little's Law provides useful information about what is happening in your queue, it isn't sufficient on its own as an approach to studying and optimizing that queue. Demand for service, and the other variables that produce inefficiency, are generally not predictable or smooth enough for this to work. Rather, Little's Law is mostly used to help you define the inputs for a more sophisticated queuing model.
To analyze your queuing model for efficiency, you start by analyzing the characteristics of your queue.
Characteristics of a Queuing Model
To build a queuing model, you first have to understand your underlying queue system. In a queue system, customers arrive by some process, then wait in a line for a server. When a server becomes available, a customer is selected by a predetermined queue rule. Once service is complete, the customer leaves the queue system.
Therefore, the queue system is determined by three main factors:
- How customers arrive.
- The rules of the queue.
- How the service is provided.
Let's look at each of these factors in more detail.
1. How Customers Arrive
Generally, you have no control over when customers arrive. In our example, the callers can arrive (call) any time the call center is open. However, you can calculate average arrival rates, as we did above. To create a queuing model, though, you'll need to analyze in more detail how customers arrive. To help you do this, try the following:
- Track the arrivals – For a certain period of time, determine when calls arrive.
- Create a graph – Show the calls received for a variety of time periods – calls per day, calls per shift, calls per hour, and so on.
- Determine the arrival distribution – How are your arrivals, or calls, distributed throughout the day?
Using our example, when you plot the calls you received on a graph, you may find that the calls are spread fairly evenly throughout the day. Therefore, eight calls per hour is a reasonable number to use when analyzing what resources you will need. Because the 4,000 working hours already count both staff members' time, each worker answers about eight calls per hour, or one every seven and a half minutes. If a typical call takes comfortably less time than that, your call center should run efficiently with two staff members.
When you count random events over fixed time intervals, the counts are usually assumed to follow what's called a Poisson Distribution. This assumption fits when arrivals are independent of one another and occur at a steady average rate. When the average number of events per interval is large, the Poisson Distribution approaches the shape of a normal curve, or bell curve. When the average is small, the curve is skewed to the right: most intervals have low counts, with a long tail toward higher counts. When analyzing a queue with mathematical equations, we usually assume a Poisson Distribution.
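As a rough illustration, the Python sketch below computes Poisson probabilities for an average of eight calls per hour, the figure from the example. The function name is our own, and the text bars are only a crude stand-in for a graph.

```python
import math

def poisson_pmf(k, mean):
    """Probability of exactly k arrivals in an interval with the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)

mean_calls_per_hour = 8   # from the call center example

# Print a simple text chart of the distribution
for k in range(0, 17, 2):
    p = poisson_pmf(k, mean_calls_per_hour)
    print(f"{k:2d} calls: {p:.3f} {'#' * round(p * 200)}")

# Chance that a given hour is busier than average (more than 8 calls)
p_more_than_8 = 1 - sum(poisson_pmf(k, mean_calls_per_hour) for k in range(9))
print(f"P(more than 8 calls in an hour) = {p_more_than_8:.3f}")
```

Even when calls arrive at a steady average rate, a substantial share of hours will be busier than average, which is why staffing for the average alone can leave customers waiting.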
Suppose, however, that your customers arrive more unevenly throughout the day. Then you have to use a simulation to build a queue model. If your analysis showed that over half of the calls were received between 10:00am and 2:00pm, then you can't say that the call center receives eight calls per hour; that's not what really happens. You may therefore want to do further analysis of that four-hour interval to determine if those calls were evenly distributed during that time period. In other words, did they follow a Poisson Distribution?
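A simulation of this kind does not have to be elaborate. The Python sketch below is a minimal example, not a full model: it generates random call arrivals hour by hour, serves them in arrival order with two workers, and compares an even day with a peaked one. The hourly volumes and the ten-minute average call length are hypothetical figures, and all function names are our own.

```python
import heapq
import random

def fifo_waits(arrival_times, service_times, servers):
    """Wait (in minutes) of each call when calls are answered in arrival
    order by the first worker to become free."""
    free_at = [0.0] * servers            # when each worker next becomes free
    heapq.heapify(free_at)
    waits = []
    for arrival, service in zip(arrival_times, service_times):
        start = max(arrival, heapq.heappop(free_at))
        waits.append(start - arrival)
        heapq.heappush(free_at, start + service)
    return waits

def random_arrivals(calls_per_hour, rng):
    """Arrival times in minutes. Within each hour the gaps between calls are
    exponential, which gives Poisson-distributed counts for that hour."""
    times = []
    for hour, rate in enumerate(calls_per_hour):
        t = hour * 60.0
        while rate > 0:
            t += rng.expovariate(rate / 60.0)
            if t >= (hour + 1) * 60.0:
                break
            times.append(t)
    return times

def average_wait(calls_per_hour, mean_service_min, servers, days, seed=1):
    """Average wait in minutes over many simulated days."""
    rng = random.Random(seed)
    all_waits = []
    for _ in range(days):
        arrivals = random_arrivals(calls_per_hour, rng)
        services = [rng.expovariate(1.0 / mean_service_min) for _ in arrivals]
        all_waits.extend(fifo_waits(arrivals, services, servers))
    return sum(all_waits) / len(all_waits)

# ASSUMPTION: hypothetical hourly volumes for an eight-hour day starting at 8:00.
# Both days average 64 calls; the peaked day has 40 of them between 10:00 and 2:00.
even_day = [8] * 8
peaked_day = [3, 3, 10, 10, 10, 10, 9, 9]

for label, day in [("even", even_day), ("peaked", peaked_day)]:
    # ASSUMPTION: calls last ten minutes on average
    wait = average_wait(day, mean_service_min=10, servers=2, days=250)
    print(f"{label:>6} day: average wait {wait:.1f} minutes")
```

You would expect the peaked day to show the longer average wait, even though both days bring the same number of calls on average. Changing the number of servers or the hourly volumes lets you test different staffing ideas before you commit to them.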
Another assumption for queue arrivals is that, once someone enters a queue, the customer stays to complete the service. In reality, however, this may not always be the case, so the model should take this into consideration. Two terms are commonly used to describe exceptions to this assumption. Customers are said to 'balk' if they see the line and leave without joining it. And they 'renege' if they join the line, but then change their minds and leave before they are served.
2. The Rules of the Queue
This refers to the maximum possible length of the line, and the service discipline, or sequence, used.
- Maximum queue length – Some queues have a limit, a maximum length. Once that limit is reached, further customers are blocked from joining the line. If you have a waiting room, how many people will it hold? If you use phone lines, how many calls can you place on hold at the same time? Unless you have other information, the model assumes an unlimited queue length.
Determine the capacity of your service. What's the maximum number of service events you can handle for the time interval that you're analyzing?
- Service discipline/sequence – The model also assumes that queues follow the FIFO (first in, first out) rule. Other queue rules include LIFO (last in, first out), shortest job first, random selection, and priority order.
Analyze the nature of your service events. In what sequence do you process customer arrivals? Is this sequence efficient? How does the sequence affect efficiency, customer satisfaction, and the use (or waste) of resources?
Let's go back to our example. You analyze the figures further, and find that the typical customer phone call for service is either (a) quick and easy, such as asking a customer to simply press a reset button, or (b) complex, requiring you to solve a more serious problem. In this case, about 60 percent of the calls are quick and easy.
Right now, you use the FIFO method to process customers, so calls are answered in the order they arrive. As a result, many of the 'quick and easy' customers have to wait in long queues behind a few complex calls. If you dedicate one worker to handle complex calls, and the other to deal with 'quick and easy' calls, you could potentially reduce wait times – and you wouldn't need to hire any new staff to process calls. When the 'complex problem' worker isn't busy, he or she can help out the colleague by receiving other calls, and serving the queue.
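To show how much the service sequence can matter, the Python sketch below takes ten calls that are already waiting for a single worker and compares serving them in arrival order with serving the shortest calls first. This is a simplified stand-in for the dedicated-worker idea, not a model of it. The 60 percent share of quick calls comes from the example, while the call lengths of 2 and 15 minutes are assumptions.

```python
def mean_wait_before_service(service_minutes):
    """Average time a call waits before its service starts, with one worker
    handling the calls in the order given."""
    waits, clock = [], 0
    for duration in service_minutes:
        waits.append(clock)      # this call starts when the previous ones finish
        clock += duration
    return sum(waits) / len(waits)

QUICK, COMPLEX = 2, 15   # ASSUMPTION: illustrative call lengths in minutes

# Ten waiting calls in arrival order; 60 percent are quick, as in the example
arrival_order = [COMPLEX, QUICK, QUICK, COMPLEX, QUICK,
                 QUICK, COMPLEX, QUICK, COMPLEX, QUICK]

fifo_wait = mean_wait_before_service(arrival_order)
shortest_first_wait = mean_wait_before_service(sorted(arrival_order))

print(f"Arrival order:  {fifo_wait:.1f} minutes average wait")
print(f"Shortest first: {shortest_first_wait:.1f} minutes average wait")
```

With these assumed call lengths, the average wait falls from 33.7 minutes to 16.8 minutes. The total work is the same in both cases. The quick calls simply no longer wait behind the long ones, although the complex calls now wait longer, which is a trade-off you would need to weigh.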
3. How the Service Is Provided
Here, you'll need to take into consideration the following:
- Number of servers or workers – Some queues have one service provider, and others have many. Determining the right number of servers, so that you can make best use of your resources, and optimize the service delivered to customers, is a typical outcome of a queuing model.
- Number of stops in the process – The number of stops in a queue is another issue to consider when analyzing the best way to provide service. In manufacturing, one way to reduce the stops, and to shorten various queues, is to limit the number of times a product is moved. Many call centers use an automated system to narrow down the options for customers. However, if the calls can be handled by any of your call center staff, this may actually add more time to the service process.
- Distribution of service – In some queues, the service time is basically the same for every customer, so you can reasonably calculate an average service time. For instance, the average service call for furnace repair may be 1.5 hours. It doesn't matter how many people are waiting to be served, each service call takes about an hour and a half. In these cases, you can accurately predict how long a given customer has to wait in the queue, given the number of service providers you have available.
In other cases, however, the total service time depends on the number of people in the line. This typically happens when you have only one server: the more people there are in the queue, the longer the wait for the next person who arrives. A restaurant is an example of this type of service distribution. A chef can typically prepare two meals more quickly than six, so the smaller your party, the less time you wait. Or, in a fast food line, if there are four orders for a Tasty-Burger ahead of you, you'll wait longer for your own Tasty-Burger than if you ordered the less popular Tasty-Fish.
In our example, call lengths would be skewed, because some calls are very short, and some are very long. When service times have a large standard deviation, average wait times generally increase, even if the average service time stays the same. A key strategy is therefore to look for ways to even out the service distribution.
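One standard way to quantify this effect is the Pollaczek-Khinchine formula, which gives the average wait for a single server with Poisson arrivals and any service-time distribution. The Python sketch below applies it with the arrival rate from the example. The six-minute average call and the three standard deviations are assumed values, and the single-server setup is a simplification of the two-person call center.

```python
def mg1_mean_wait(arrival_rate, mean_service, service_std):
    """Mean wait in queue for a single server with Poisson arrivals
    (Pollaczek-Khinchine formula). All times share one unit, e.g. hours."""
    utilization = arrival_rate * mean_service
    if utilization >= 1:
        raise ValueError("queue is unstable: utilization must be below 1")
    cv_squared = (service_std / mean_service) ** 2   # squared coefficient of variation
    return utilization * mean_service * (1 + cv_squared) / (2 * (1 - utilization))

arrival_rate = 8       # calls per hour, from the example
mean_service = 0.1     # ASSUMPTION: six-minute average call, in hours

# ASSUMPTION: three illustrative levels of variability in call length
for std_minutes in (0, 6, 12):
    wait_hours = mg1_mean_wait(arrival_rate, mean_service, std_minutes / 60)
    print(f"Std dev {std_minutes:2d} min -> average wait {wait_hours * 60:.0f} min")
```

Under these assumptions, the average wait rises from 12 minutes when every call takes exactly six minutes, to 24 minutes with a six-minute standard deviation, and to 60 minutes with a twelve-minute standard deviation. The average call length and the staffing are identical in all three cases.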
Many calculations can be used to measure queue performance. While a detailed discussion is beyond the scope of this article, these are some of the key metrics used:
- Average server usage.
- Average number of customers waiting.
- Average number of customers in the system.
- Average wait time.
- Average time in the system.
- Probability of zero customers in the system.
- Probability of exactly n (a certain number of) customers in the system.
- Cost of servers per time period.
The formulas used to calculate these values depend on the type of queue, and the type of distribution. Models for more complex queues – for example, multiple server queues – need more of these types of values. The whole analysis process is very complex, and the optimal answers are not black and white, even after these metrics are derived.
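For one common textbook case, the formulas are short enough to sketch. The Python example below assumes Poisson arrivals, exponentially distributed service times, identical servers, FIFO order, and an unlimited queue (often written M/M/c). The service rate of ten calls per hour per worker is an assumed figure, and the function and key names are our own. It does not cover the probability of exactly n customers or the cost of servers.

```python
import math

def mmc_metrics(arrival_rate, service_rate, servers):
    """Steady-state metrics for an M/M/c queue. Rates are per hour;
    service_rate is the rate for one server."""
    offered_load = arrival_rate / service_rate
    utilization = offered_load / servers
    if utilization >= 1:
        raise ValueError("arrivals outpace capacity; the queue never settles")

    busy_term = offered_load**servers / (math.factorial(servers) * (1 - utilization))
    idle_terms = sum(offered_load**k / math.factorial(k) for k in range(servers))
    p_zero = 1 / (idle_terms + busy_term)

    waiting = busy_term * p_zero * utilization / (1 - utilization)
    wait_time = waiting / arrival_rate               # Little's Law for the queue
    time_in_system = wait_time + 1 / service_rate
    return {
        "server usage": utilization,
        "customers waiting": waiting,
        "customers in system": arrival_rate * time_in_system,
        "wait time (hours)": wait_time,
        "time in system (hours)": time_in_system,
        "P(zero customers)": p_zero,
    }

# 8 calls per hour from the example; ASSUMPTION: each worker handles 10 per hour
for servers in (1, 2):
    print(f"--- {servers} server(s) ---")
    for name, value in mmc_metrics(8, 10, servers).items():
        print(f"{name:>24}: {value:.3f}")
```

Comparing the one-server and two-server results side by side shows the trade-off the metrics are meant to expose: adding a worker cuts waiting sharply but also lowers server usage, so more paid time is spent idle.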
To gain further understanding of your queue, you can use spreadsheets to simulate a queue over a period of time. Again, building these simulations is beyond our scope here. However, you can buy software to create a queuing model, and determine the optimal configuration of your queue system. A quick Google search of 'queuing software' will yield many results you can begin to explore.
Queues are common in a wide variety of work environments. To maximize customer satisfaction, it's usually best to reduce wait times. However, you have to balance your service rates with the cost of providing that service. Queuing models are often used to find the right resource mix. By analyzing your queue system, and calculating a variety of statistics related to your queue, you can determine how to provide the best service possible – and make the most efficient use of your limited resources.