The Same Problem Keeps Happening. Here's How to Actually Fix It.
Why the same quality problem keeps coming back in your shop — and the step-by-step process to find the real cause and fix it for good.
You fixed this problem six months ago. You did the investigation, made the change, closed the ticket. And now it's back.
It's not bad luck. It's not the new guy. It's a sign that the original fix addressed the symptom, not the cause. Until you address the cause, this problem will keep coming back.
Here's how to break that cycle.
Why the Same Problem Keeps Coming Back
Most shops fix problems the same way: something goes wrong, someone figures out what happened, they change something, the problem goes away. For a while.
The issue is the "what happened" step. Most of the time, "what happened" is the most obvious answer. The part was out of tolerance. The weld cracked. The operator assembled it backward. So you fix that thing — you tighten the spec, you have the welder redo it, you add a step to the work instruction.
That is the symptom. Not the cause.
Why was the part out of tolerance? Because the operator was measuring with a worn gauge. Why was the gauge worn? Because there's no calibration schedule. Why is there no calibration schedule? Because nobody owns it.
Now you're at the root cause. Fix that — assign gauge calibration ownership, set a schedule, make it visible — and the out-of-tolerance parts stop coming back.
Fix only the symptom, and the problem returns. Every time.
The Difference Between a Symptom and a Root Cause
Here's a simple test. Ask "why" five times.
A weld cracked on a customer's assembly. Why?
- The bead didn't penetrate deep enough. Why?
- The welding parameter settings were wrong. Why?
- The welder set them from memory instead of the job traveler. Why?
- The job traveler didn't specify weld parameters for this material. Why?
- Nobody updated the traveler when they switched to the thicker-gauge material six months ago.
The root cause is a document that wasn't updated when a material changed. That's fixable. And fixing it doesn't just stop this specific weld from cracking — it prevents the same gap from causing a different failure later.
This method is called the 5 Whys. It's not complicated. You don't need software or a facilitator. You need a few minutes and the willingness to keep asking.
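If you'd rather keep the chain somewhere other than paper, here is a minimal sketch of the weld example as a small Python record. The class name `WhyChain`, its methods, and the default minimum depth of four are illustrative assumptions, not part of any standard tool.

```python
from dataclasses import dataclass, field


@dataclass
class WhyChain:
    """A problem statement plus the ordered answers to each "why?"."""

    problem: str
    answers: list[str] = field(default_factory=list)

    def ask_why(self, answer: str) -> None:
        # Each answer becomes the statement the next "why?" is asked about.
        self.answers.append(answer)

    def root_cause(self, min_depth: int = 4) -> str:
        # Refuse to name a root cause while the chain is still shallow.
        # The threshold of four is an assumption for this sketch.
        if len(self.answers) < min_depth:
            raise ValueError(f"only {len(self.answers)} levels deep; keep asking why")
        return self.answers[-1]


chain = WhyChain("A weld cracked on a customer's assembly.")
chain.ask_why("The bead didn't penetrate deep enough.")
chain.ask_why("The welding parameter settings were wrong.")
chain.ask_why("The welder set them from memory instead of the job traveler.")
chain.ask_why("The job traveler didn't specify weld parameters for this material.")
chain.ask_why("Nobody updated the traveler when the material changed.")

print(chain.root_cause())
```

The depth check is the useful part: it makes it harder to stop at the first obvious answer.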
How to Run a Root Cause Analysis That Actually Works
You don't need a formal process to make this effective. Here's what works for a shop of any size.
Step 1: Write down the problem as specifically as possible.
Not "weld failures." Try: "weld bead cracking on part #4471, specifically at the end of the top horizontal pass, found during inspection on three customer returns in the last 60 days."
Vague problems produce vague root causes. Specific problems produce specific fixes.
Step 2: Gather facts before opinions.
Go to the floor. Look at the part. Talk to the person who made it. Review the work order. Check the inspection record. You're not looking for who is to blame — you're looking for what actually happened.
In almost every case, the person who made the defect is not the cause of the defect. The system they were working in is.
Step 3: Ask why five times.
Write it down — paper is fine. Commit to going at least four levels deep before you stop. Most root causes live at level four or five.
Watch out for "operator error" as an answer. It is almost never the root cause. If someone made a mistake, ask why the system allowed that mistake to happen. Was there a gauge that should have caught it? A step in the work instruction that was unclear? A training gap?
Step 4: Identify the real fix — not just the patch.
There are two kinds of fixes:
- Containment: Stop the bleeding. Pull the bad parts. Notify the customer. Make sure no more bad parts ship today.
- Corrective action: Change the system so this does not happen again.
You need both. But only the second one ends the cycle.
If the root cause is "no calibration schedule for gauges," the corrective action is "create and implement a calibration schedule with a named owner." If the root cause is "work instruction didn't include weld parameters for the new material," the corrective action is "update the instruction and add a change-control step for future material substitutions."
Step 5: Verify that the fix actually worked.
This is the step most shops skip. You made the change. Did it hold?
Set a reminder. Go back in 60 to 90 days and check the data. If the problem hasn't recurred, close it. If it has, you didn't find the real root cause, so go back to Step 2.
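The check itself is simple date arithmetic. Here is a rough sketch, assuming you know the date the fix went in and have a list of dates when the defect was seen again. The function names and the sample dates are hypothetical.

```python
from datetime import date, timedelta


def verification_window(fix_date: date, earliest_days: int = 60,
                        latest_days: int = 90) -> tuple[date, date]:
    """Return the earliest and latest dates to go back and check the fix."""
    return (fix_date + timedelta(days=earliest_days),
            fix_date + timedelta(days=latest_days))


def fix_held(fix_date: date, recurrence_dates: list[date], check_date: date) -> bool:
    """True if the defect was not seen between the fix and the check date."""
    return not any(fix_date < d <= check_date for d in recurrence_dates)


fix_date = date(2024, 3, 1)  # hypothetical date the corrective action went live
earliest, latest = verification_window(fix_date)
print(earliest, latest)

# A defect logged before the fix does not count against it.
print(fix_held(fix_date, [date(2024, 2, 10)], latest))
# A defect logged after the fix means the root cause was missed.
print(fix_held(fix_date, [date(2024, 4, 15)], latest))
```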
Making the Fix Stick
The root cause is one thing. Making the change last is another.
The most common reason fixes fail is that nobody is responsible for them. Someone decides what needs to change. Nobody is explicitly assigned to make it happen.
Assign an owner. Give them a due date. Follow up.
If you're changing a work instruction or procedure, make sure the people doing the work know about it. A paper update that never gets communicated is no update at all. Walk the floor. Show people what changed and why. They'll follow it when they understand the reason.
Document it. Not because of ISO, though it helps there, but because the next time something similar happens, you want to know what you did before and whether it worked.
A simple format: problem, root cause, corrective action, owner, due date, verification date, verified by. A spreadsheet works. The format doesn't matter. The discipline does.
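For a shop that keeps the log as a CSV file, one possible layout looks like this. The column names follow the list above. The owner name and dates are made up for illustration, and the log is held in memory here so the sketch runs without touching a real file.

```python
import csv
import io

# Column names taken from the format described above.
FIELDS = ["problem", "root_cause", "corrective_action", "owner",
          "due_date", "verification_date", "verified_by"]


def open_items(rows: list[dict]) -> list[dict]:
    """Entries nobody has verified yet, i.e. loops that are still open."""
    return [row for row in rows if not row["verified_by"]]


log = io.StringIO()  # stand-in for a real file such as a shared CSV
writer = csv.DictWriter(log, fieldnames=FIELDS)
writer.writeheader()
writer.writerow({
    "problem": "Weld bead cracking on part #4471, top horizontal pass",
    "root_cause": "Traveler not updated when material changed",
    "corrective_action": "Update traveler; add change-control step",
    "owner": "J. Rivera",             # hypothetical owner
    "due_date": "2024-03-15",          # hypothetical dates
    "verification_date": "2024-05-30",
    "verified_by": "",                 # blank until someone confirms the fix held
})

log.seek(0)
rows = list(csv.DictReader(log))
print(len(open_items(rows)), "item(s) still waiting on verification")
```

A blank "verified by" column is the thing to watch. It shows which fixes were made but never checked.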
When the Same Problem Shows Up on Multiple Jobs
If you're seeing the same type of issue across different jobs or part numbers, you're not looking at a single root cause — you're looking at a systemic one.
A machining shop that keeps missing dimensional tolerances across different parts isn't having a one-off problem. It might have aging tooling across the board. Or operators trained differently on different shifts. Or a measurement system that isn't calibrated consistently.
When a problem pattern shows up in multiple places, the investigation needs to widen. Pull three months of quality data. Look for the pattern. What do the failing jobs have in common? Same operator? Same machine? Same material supplier? Same shift? The answer is usually in the data if you look.
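If your quality data is in a spreadsheet export, a short script can do the first pass of pattern hunting. This is a sketch only: the records, field names, and the `common_factors` helper below are invented for illustration, and real data will be messier.

```python
from collections import Counter

# Hypothetical failing jobs pulled from three months of quality data.
failures = [
    {"job": "1001", "operator": "Ana", "machine": "Mill 2", "supplier": "North", "shift": "1st"},
    {"job": "1002", "operator": "Ben", "machine": "Mill 2", "supplier": "South", "shift": "2nd"},
    {"job": "1003", "operator": "Ana", "machine": "Lathe 1", "supplier": "South", "shift": "1st"},
    {"job": "1004", "operator": "Cy", "machine": "Mill 2", "supplier": "North", "shift": "2nd"},
    {"job": "1005", "operator": "Ben", "machine": "Mill 2", "supplier": "South", "shift": "1st"},
]


def common_factors(records: list[dict], fields: list[str]) -> dict:
    """For each field, find the most frequent value and its share of the failures."""
    result = {}
    for name in fields:
        value, count = Counter(r[name] for r in records).most_common(1)[0]
        result[name] = (value, count / len(records))
    return result


factors = common_factors(failures, ["operator", "machine", "supplier", "shift"])

# The field whose top value covers the largest share is the first place to look.
suspect = max(factors, key=lambda name: factors[name][1])
print(suspect, factors[suspect])
```

A high share is a lead, not a verdict. If one machine also runs most of your jobs, it will show up in most of your failures too, so compare against how the good jobs are distributed before drawing a conclusion.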
One More Thing Worth Saying
There is a strong temptation, when the same problem comes back, to escalate the pressure. Work harder. Inspect more. Write stricter procedures. Add more sign-offs.
That rarely works. More pressure on a broken system produces more stress, not better outcomes.
The answer is almost always simpler: find the actual cause, fix the actual cause, and confirm it's actually fixed. That's the full loop. Most shops stop after the fix and never confirm it held.
What CLEO Does With This
If you want a system that handles this automatically, CLEO does it for small manufacturers. When a problem is logged, it guides you through the root cause investigation, tracks the corrective action with an owner and due date, and reminds you to verify the fix held. It keeps the thread from getting dropped, which is the most common reason problems come back.
But the discipline is the same whether you use CLEO or a spreadsheet. Get to the root cause. Assign the fix. Come back and check.
The Short Version
If the same problem keeps happening in your shop:
- Write the problem down specifically — part number, defect type, frequency.
- Ask why five times to find the real cause.
- Fix the system, not just the part.
- Assign someone to own the fix, with a due date.
- Come back in 60-90 days and verify it held.
The problem is not the weld. It's not the tolerance. It's the gap in your system that allowed the defect to happen — and keep happening. Find that gap and close it, and the problem stops coming back.
Protect what you build.
CLEO handles your quality system so you can focus on making things. $99/month, all-inclusive.