Southwest Airlines was one of my favorite airlines, so when I heard about their December 2022 debacle, I felt disappointed and decided to find out more about it.
Here are some of my findings and reflections.
Severely cold temperatures and winter snow wreaked havoc on air travel across the United States in late December 2022, causing over a week of flight cancelations and delays and leaving thousands of passengers stranded. According to the Washington Post, over 3,000 flights were canceled across the US on Tuesday, December 27th, alone. Of those, Southwest Airlines accounted for about 2,600 cancelations. The next day, while other airlines like Delta, United, and American were almost back to normal, Southwest still canceled another 2,500 flights – about 62% of its scheduled flights for that day.
Why the debacle at Southwest?
Much has been said in the media by aviation experts and journalists, blaming everything from the airline’s “point-to-point” model to high levels of absenteeism among its ramp agents due to the cold snap. While these issues surely contributed to the problem, everyone seems to agree on one root cause: information technology.
You see, when a flight is canceled, the aircraft and its crew – typically 2 pilots and 3 or 4 flight attendants – are now out of position. The airline then needs to “catch up” and move the aircraft and crew to their correct positions by rescheduling the canceled flight as soon as possible. When airlines are forced to cancel a large number of flights, say, due to bad weather, all of those airplanes and their crews are out of position, and it becomes more difficult to catch up. Most airlines employ modern scheduling software with an optimization capability to reschedule flights and minimize the disruption; however, Southwest Airlines has not updated its scheduling software in almost two decades.
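To make the idea of “optimizing the catch-up” concrete, here is a minimal sketch in Python of the smallest possible version of the problem: three out-of-position crews and three flights that each need a crew. The crew names, flight names, and repositioning costs (in hours) are all invented for illustration, and the brute-force search stands in for what real scheduling software does with far more sophisticated methods. It is not how SkySolver or any airline system works.

```python
from itertools import permutations

# Hypothetical repositioning cost, in hours, of getting each stranded crew
# to each flight that needs a crew. All names and numbers are made up.
COST = {
    "crew_A": {"flight_1": 2, "flight_2": 5, "flight_3": 4},
    "crew_B": {"flight_1": 6, "flight_2": 1, "flight_3": 3},
    "crew_C": {"flight_1": 3, "flight_2": 4, "flight_3": 1},
}

def best_assignment(cost):
    """Try every one-to-one crew/flight pairing and keep the cheapest one."""
    crews = list(cost)
    flights = list(next(iter(cost.values())))
    best_total, best_plan = None, None
    for order in permutations(flights):
        # Pair the i-th crew with the i-th flight of this ordering.
        total = sum(cost[crew][flight] for crew, flight in zip(crews, order))
        if best_total is None or total < best_total:
            best_total, best_plan = total, dict(zip(crews, order))
    return best_total, best_plan

total, plan = best_assignment(COST)
print(total, plan)
```

With only three crews there are just six pairings to check, so the search is instant. The approach stops being practical very quickly as the number of crews and flights grows, which is why purpose-built optimization software matters.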
Southwest uses an off-the-shelf crew scheduling software product called SkySolver, which could not handle the scale of cancelations and delays the airline experienced during the last holiday season. According to a tech analyst from Popular Science, this forced employees to attempt to manually match flight crews and available aircraft while complying with complex federal aviation regulations. The sheer number of possible combinations that a person must consider in a problem of this size is too large for even the best of analysts. To make matters worse, Southwest’s “point-to-point” model – rather than the hub-and-spoke model used by other US airlines – greatly increases the complexity of the problem, according to Dr. Edward Rothberg, chief scientist at Gurobi Optimization.
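A rough back-of-the-envelope calculation shows why manual matching breaks down. The sketch below, in Python, assumes a deliberately simplified version of the problem: n crews must be paired one-to-one with n flights, and every pairing is allowed. Under that assumption the number of possible plans is n factorial. Real crew scheduling has regulatory and logistical restrictions that rule out many pairings, but those restrictions also make each candidate plan harder to check by hand.

```python
import math

def one_to_one_assignments(n):
    """Number of ways to pair n crews with n flights, one crew per flight.

    Simplifying assumption: every crew may legally take every flight.
    """
    return math.factorial(n)

for n in (5, 10, 20, 50):
    print(f"{n} crews and flights: {one_to_one_assignments(n):,} possible plans")
```

Five crews give 120 possible plans and ten give 3,628,800. At twenty the count is already above two quintillion, far beyond what any analyst could enumerate.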
It turns out that SkySolver was developed almost two decades ago to display data about the location of airplanes and crew members. During normal disruptions (i.e., a relatively small number of cancelations and delays), it is fairly simple for a trained analyst to manually figure out the fastest way to “catch up”, as long as the data about the location of the aircraft and crew members is accurate. Unfortunately for Southwest, SkySolver is not fully integrated with other systems, so crews need to call the home office to report their location in the event of a cancelation. During the major disruption of December, the airline’s phone system suffered a near-collapse, as thousands of passengers were calling to try to rebook their flights, leading to wait times of 6 to 8 hours, according to several accounts. As a result, the data in SkySolver was not up to date, making the task practically impossible.
Lessons for Production Schedulers and Supply Chain Managers
The SkySolver software is currently owned and distributed by GE Aviation, a division of General Electric. According to the brochure, the software now includes a module called Recovery Optimization, which they describe as follows (emphasis by the author):
“Once a disruption event has been identified, Network Operations helps you take action to recover, whether it be swapping equipment, rebooking passengers, or moving and extending crews. The goal is to minimize the impact to flights, crew, and passengers. Designed for high-speed computing, the optimizer is used for small and large-scale disruptions.”
It is not clear whether Southwest has upgraded their version of SkySolver to include this module, but according to various tech analysts, Southwest pilots have “begged company executives to update the antiquated systems since at least 2015.” Regardless, it is undeniable that investing in necessary IT upgrades would have helped Southwest recover more quickly from the disruption.
Let this be a lesson for manufacturers and supply chain managers: relying on manual planning and scheduling might not be considered high risk during normal disruptions; however, what happens during a major disruption? What happens when a machine (or two!) breaks down during production, with no sense of when – or if – it might be up and running again? Or what happens when many machine operators call in sick due to – heaven forbid! – a widespread epidemic or a pandemic? Or when many suppliers have their supply lines disrupted at once (sound familiar?)? Or when your production scheduler or your supply chain and logistics analyst, with 20+ years of experience, finally decides to retire and you do not have a backup who can do the job like she did?
Lesson #1: invest in optimization software for production scheduling and supply chain planning, such that your plans are optimal under normal business conditions and can be re-optimized quickly in the event of disruptions
In the case of Southwest, investing in the Recovery Optimization module could have mitigated the problem significantly. However, one of the causes of the problem was that the data was not available when needed. Crew members had no way of updating their current location in the system, and they were forced to literally call it in, having to wait 6 to 8 hours on the line to do so. This is what is known as “technical debt”, according to Columbia University professor Zeynep Tufekci: “a company’s gap between its existing software and its necessary updates to maintain operations.”
According to Andrew Paul, a staff writer for Popular Science, Southwest has experienced multiple logistical collapses since 2021 due to its long-delayed adoption of cloud-based, decentralized, and integrated data systems.
Lesson #2: minimize your “technical debt” by ensuring your systems are well integrated, such that your data is accurate and reliable when it is needed
If your operation relies heavily on experienced planners who use large, complex spreadsheets, or is so complex that off-the-shelf software does not work for you, do not make the mistakes described above. Invest now in state-of-the-art software that is designed to satisfy your specific needs, efficiently optimizes your operation, and integrates with your existing ERP/MES systems. Do not increase your technical debt.