We collaborated with a major packaging producer in Saudi Arabia whose three production lines ran 24/7, turning out more than 2 million cardboard boxes, flexible packaging sleeves, and printed cartons every week. Their clients included food, consumer goods, and pharmaceutical companies, so quality failures weren’t only a manufacturing problem. A single missed defect could trigger a return or a reprint, a customer complaint, SFDA compliance risk, or, at worst, the loss of a major pharmaceutical client.
Before we came in, quality inspection relied on a 15-member inspection team spread across three shifts. They were doing careful work, but they were fighting basic physics. One human inspector can reliably check roughly 800–1,000 units per hour. Each line was running at more than 5,000 units per hour. In practice, only 15-20% of total production was checked by sampling. That let bad parts reach customers, delayed the discovery of root causes, and too often meant production teams learned of quality drift only after hundreds or thousands of bad parts had already been made.
At aTeam Soft Solutions, we developed a computer vision AI agent that inspected every unit on the line in real time, automatically rejected high-confidence defects, focused human verification on ambiguous cases, and, most critically, alerted on production drift in time to stop large batches of defective packaging. The result was full inspection coverage, 99.4% defect detection accuracy, customer complaints reduced from 8-12 per month to virtually zero, and cost savings of around SAR 3.5 million per year.
When we began investigating the plant, the problem wasn’t that the company lacked a quality procedure. It had one. The problem was that the process was designed around manual sampling, and the production speed and product variety had grown beyond what manual inspection could realistically cover.
The firm produced a variety of packaging products, including cardboard boxes, flexible packaging materials, and printed cartons for food, pharmaceutical, and consumer goods clients. For some jobs, the visual bar was relatively forgiving. For others, particularly pharmaceutical cartons and branded packaging, there was virtually no allowance for print errors, barcode failures, contamination, or dimensional drift. A tiny print defect on a pharmaceutical carton wasn’t just a cosmetic problem. It could trigger a batch rejection, a compliance issue, or a contract-level escalation.
The manual workflow looked like this.
At the end of every production line, units were visually inspected for print quality, physical condition, dimensional stability, barcode readability, and contamination. Inspectors looked for color variation, smudges, plate misregistration, weak glue flaps, wrong folds, dents, tears, missing text, oil stains, and foreign particles. An experienced inspector on a good day can spot the obvious issues quickly. But even a very good inspector cannot sustain that level of attention on every single unit of a high-speed line across a full shift.
And that is what created the real operational gap.
With reasonable attention, an inspector could check 800-1,000 units per hour. Each line was pushing in excess of 5,000 units per hour. So the plant was, in effect, doing sample-based quality control on a process that needed full inspection. The sampling model let a great deal of good product flow through efficiently, but it also allowed a defect pattern to emerge and run for a considerable length of time before a human noticed it.
This created several kinds of risk at once.
First, defective units simply got through. The factory estimated that manual inspection caught about 85% of defects. That sounds good until you consider what the missed 15% amounts to at the volumes they were producing. It meant real defects were reaching customers.
Second, defect patterns were frequently identified too late. If plate registration drifted, glue quality started deteriorating, or color balance gradually strayed out of tolerance, the problem could run for 30 to 60 minutes before anyone noticed. Human inspectors are far more likely to catch sudden, obvious failures. They are far less able to pick up on slow quality degradation in real time, especially well into a long shift.
Third, inspector fatigue was a real factor. The plant had previously noticed that detection rates decreased as a shift progressed. Early in a shift, human detection may be near 90%. By the eighth hour, it can dip below 70%. That is not a staffing issue. It’s what happens when you rely on human attention as your primary inspection mechanism in a repetitive, high-speed manufacturing environment.
The business impact was visible.
Customer complaints about packaging quality ran around 8-12 per month. Every complaint carried a cost. Some triggered returns. Some required reprints. Some produced emergency conversations with key accounts. Some undermined confidence with particularly demanding customers. The pharmaceutical side was especially sensitive. One faulty pharmaceutical package that finds its way into the market can create a downstream problem far larger than the immediate value of the packaging itself.
The plant was also quietly losing value because defects were frequently detected too late. When a print issue is discovered only after folding and gluing have taken place, the business has already spent the labor, machine time, and material of good output on bad output. Much of that waste could have been avoided through earlier detection.
This is why we did not frame the question as “can we automate end-of-line inspection?” We framed it as a broader AI quality inspection manufacturing problem: “how can we inspect every unit, identify real defects with high accuracy, scale across hundreds of packaging designs, and enable the production line to act early enough to keep quality drift from becoming scrap, complaints, or customer risk?”
The producer had experimented with more basic machine vision and sensor solutions before engaging us. Those solutions worked for very narrow checks, such as whether a barcode could be read or whether a specific mark was missing, but they could not cover the full range of defects the plant actually cared about.
That’s a common problem in manufacturing.
A basic vision sensor can tell you whether an expected feature is present. It has no way to reason about color balance, blurring, structural damage, contamination, glue quality, print alignment, or dimensional drift across 500+ different package designs. When the product changes, the inspection logic often has to be reconfigured manually.
General-purpose cloud vision tools also weren’t the right fit. They are built for general image understanding, not industrial quality control. They have no idea what a significant print shift on a carton is, what degree of color variation is acceptable for one product class but unacceptable for another, or how to check a pack at conveyor speed in real factory conditions.
The environment made the problem even harder. Lighting varied throughout the day. Products came in different colors and finishes. Packaging moved quickly. Slight vibrations or changes in alignment could change the imaging conditions. And the quality standards weren’t uniform. A two-percent variance in color may be fine on a brown shipping box, but not on a pharmaceutical carton governed by strict brand and regulatory requirements.
So this wasn’t a “go buy a camera and plug it into a dashboard” situation. It required a specialized AI agent for industrial inspection, one that could generalize defect patterns by product category, operate at edge speed, activate rejection hardware, and translate defect detection into production intelligence.
That’s why we built the system as a complete computer vision defect detection platform with action logic, rather than a simple visual scoring tool.
We developed the solution in stages, because manufacturing teams need two kinds of confidence before they will trust an automated inspection system. They have to believe it actually sees real defects. And they have to believe it won’t disrupt production through bad reject logic or unstable performance.
So we began with visibility and shadow-mode validation before permitting the system to reject or escalate anything on its own.
The first stage focused on image acquisition and supervised learning.
We fitted high-resolution industrial cameras at three positions on each line: after printing, after die-cutting and folding, and after gluing/assembly. That meant nine camera stations across three production lines. The motivation for this layout was pragmatic: different defects become visible at different process steps. Print issues show up earlier. Structural and fold problems show up later. Glue and seam problems appear after assembly.
The system captured images of every unit, not samples. That alone was a complete change to the inspection model. Rather than checking a fraction of output, the AI agent had visibility over 100% of production.
Then we built up the training set. Over time, we collected and labeled more than 50,000 images of good and defective units across the most common product types. Defects were labeled by type, including color variation, misregistration, contamination, physical damage, dimensional error, barcode failure, smudging, missing print, and related categories.
We did not use a single monolithic model for everything. We employed a combination of YOLOv8 for defect localization and object-level detection, and task-specific custom CNN-based quality scoring models for print-related evaluation. This split mattered because some defects are discrete and spatial, while others are about how far something visually deviates from a reference standard.
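To make the split concrete, here is a minimal sketch of how a YOLOv8 detector and a small print-quality CNN can be combined in one inspection pass. The model weights, class names, and the toy CNN architecture are illustrative assumptions, not the production models.

```python
# Minimal sketch of a two-stage inspection pass: YOLOv8 localizes discrete,
# spatial defects, and a small CNN scores overall print quality.
# Paths, class names, and the CNN itself are illustrative placeholders.
import torch
import torch.nn as nn
from ultralytics import YOLO

detector = YOLO("defect_yolov8.pt")          # hypothetical fine-tuned weights

class PrintQualityCNN(nn.Module):
    """Tiny CNN that outputs a 0..1 print-quality score (untrained stand-in)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(32, 1), nn.Sigmoid())

    def forward(self, x):
        return self.head(self.features(x))

scorer = PrintQualityCNN().eval()

def inspect(image_bgr):
    """Return localized defects plus an overall print-quality score for one frame."""
    detections = detector(image_bgr, verbose=False)[0]
    defects = [
        {"cls": detector.names[int(b.cls)], "conf": float(b.conf)}
        for b in detections.boxes
    ]
    tensor = torch.from_numpy(image_bgr).permute(2, 0, 1).float().unsqueeze(0) / 255.0
    with torch.no_grad():
        quality = float(scorer(tensor))
    return defects, quality
```

The detector answers "where is a discrete defect", while the scorer answers "how far does this print deviate from acceptable", which is why the two were kept as separate models.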
The system ran in shadow mode during this first stage. It inspected every unit, but it was not triggering rejection or production actions yet. Instead, we created a live dashboard comparing the AI agent’s results with the human inspectors’ and tracking defect rates, AI-human agreement, false positives, false negatives, and product-type-specific performance. This gave the quality team breathing room and gave us the feedback we needed to fine-tune the thresholds.
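In shadow mode, the comparison itself is simple bookkeeping against the units the human inspectors also saw. Here is a sketch of how the agreement and error rates might be tracked, treating the inspector's verdict as the reference; the field names are illustrative:

```python
# Sketch of shadow-mode scoring: compare AI verdicts to human inspector verdicts
# on the sampled units both parties saw. The inspector's call is the reference.
from dataclasses import dataclass

@dataclass
class UnitResult:
    human_defective: bool   # inspector's call on the sampled unit
    ai_defective: bool      # AI agent's call on the same unit

def shadow_metrics(results):
    tp = sum(r.human_defective and r.ai_defective for r in results)
    tn = sum(not r.human_defective and not r.ai_defective for r in results)
    fp = sum(not r.human_defective and r.ai_defective for r in results)
    fn = sum(r.human_defective and not r.ai_defective for r in results)
    return {
        "agreement": (tp + tn) / len(results),
        "false_positive_rate": fp / max(tn + fp, 1),
        "false_negative_rate": fn / max(tp + fn, 1),
    }
```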
That stage was crucial. Trust in a factory is earned by performing alongside the team, not through marketing claims.
Once the model’s performance was sufficiently stable, we enabled automatic action.
At this point, when the AI agent identified a defect above a high-confidence threshold, it triggered an automatic reject via a pneumatic diverter or similar line-side mechanism. Medium-confidence defective units were not automatically discarded. They were flagged for human verification, with the suspect region clearly highlighted on the operator or inspector display.
That balanced speed and control well. Obviously bad units were removed immediately. Borderline cases still went through human review.
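The decision layer behind this is a small piece of tiered logic. A simplified sketch; the threshold values and the diverter and review-queue interfaces are assumptions, not the plant's actual configuration:

```python
# Simplified tiered decision logic: auto-reject high-confidence defects,
# flag medium-confidence units for human review, pass the rest.
# Thresholds and the diverter/review interfaces are illustrative.
AUTO_REJECT_CONF = 0.95
HUMAN_REVIEW_CONF = 0.70

def decide(defects, diverter, review_queue, unit_id):
    top_conf = max((d["conf"] for d in defects), default=0.0)
    if top_conf >= AUTO_REJECT_CONF:
        diverter.reject(unit_id)            # pneumatic diverter or similar hardware
        return "rejected"
    if top_conf >= HUMAN_REVIEW_CONF:
        review_queue.put({"unit": unit_id, "defects": defects})
        return "flagged"
    return "passed"
```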
But the greatest value of this phase was not the rejection itself. It came from pattern detection.
The AI agent was not just looking at one defective unit at a time. It was tracking how defect rates evolved. When color variation on one line rose from 0.5% to 3.2% within half an hour, the system surfaced that as a production signal. If misregistration suddenly started appearing regularly after a plate change, it flagged the pattern and suggested a likely source. Rather than waiting for the production team to find an escalating problem on their own, the system raised a warning while the problem was still manageable.
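One simple way to implement this kind of drift alerting is a rolling defect-rate check per line and defect type. A minimal sketch, with the window size, baseline rate, and alert multiplier chosen purely for illustration:

```python
# Rolling-window drift check: alert when a defect type's recent rate climbs
# well above its baseline. Window size and multiplier are illustrative.
from collections import deque

class DriftMonitor:
    def __init__(self, window=2000, baseline_rate=0.005, multiplier=3.0):
        self.window = deque(maxlen=window)   # 1 = defective, 0 = good
        self.baseline_rate = baseline_rate
        self.multiplier = multiplier

    def observe(self, is_defective: bool) -> bool:
        """Record one unit; return True if drift should be alerted."""
        self.window.append(1 if is_defective else 0)
        if len(self.window) < self.window.maxlen:
            return False                      # not enough history yet
        rate = sum(self.window) / len(self.window)
        return rate > self.baseline_rate * self.multiplier

# In practice there would be one monitor per (line, defect type) pair,
# e.g. monitors[("line1", "color_variation")].observe(is_defective)
```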
This is where the project transitioned from automated quality control AI to quality intelligence. Finding one bad carton is helpful. Identifying a machine that’s drifting toward a thousand bad cartons is a whole lot more valuable.
In the subsequent stage, we linked vision output to production context.
We associated defect patterns with machine parameters, environmental factors, operator shift, maintenance records, raw material batch, ink batch, humidity, and machine speed. This gave the AI agent more than visual memory. It gave it operational context.
Once that layer was in place, the system could start predicting risk rather than just reporting current defects. When humidity was rising, a particular ink batch was in use, and early signs of color instability were appearing, the system would warn that the risk of a quality problem was increasing. If a known plate-alignment problem pattern appeared after a changeover, it could point the production team toward the probable fix much sooner.
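Conceptually, the risk layer folds contextual signals into a single escalating score. The sketch below is a deliberately simplified stand-in; the factor names, thresholds, and weights are assumptions, not the model the plant actually ran:

```python
# Toy risk scoring: combine contextual factors into a single 0..1 risk score
# and alert when it crosses a threshold. Factors and weights are illustrative.
def quality_risk_score(humidity_pct, ink_batch_flagged, early_color_drift_rate):
    score = 0.0
    if humidity_pct > 65:
        score += 0.3                      # high humidity correlated with print issues
    if ink_batch_flagged:
        score += 0.3                      # ink batch previously associated with drift
    score += min(early_color_drift_rate / 0.03, 1.0) * 0.4
    return min(score, 1.0)

if quality_risk_score(humidity_pct=70, ink_batch_flagged=True,
                      early_color_drift_rate=0.02) > 0.6:
    print("Rising quality risk: review ink and color calibration")
```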
We also generated shift-level and run-level reports automatically. Instead of waiting on manual quality reports, supervisors received defect breakdowns by type, potential root-cause signals, and benchmarks against plant KPIs. For the pharmaceutical clients, the same inspection data was repurposed to provide quality assurance evidence for entire batches.
This was the stage at which the AI agent ceased to look like an inspection add-on and began to act like a factory-level quality intelligence layer.
Packaging production is an ever-changing environment. The plant was producing 500+ different packaging designs, and new jobs kept coming in. A system that requires weeks of retraining for each new product would not be commercially viable.
So the final stage was built around rapid adaptability.
When a new design went into production, the first hundred good units could be saved as a “golden reference.” The AI agent used that reference to learn what the new job typically looked like and stand up defect detection quickly. This allowed the system to be operational on new SKUs in around 30 minutes instead of going through an entire model redevelopment cycle.
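A rough sketch of the golden-reference idea: summarize the first batch of approved units into reference statistics, then score later units by how far they deviate. The real system learned richer features; simple per-channel color statistics stand in for them here:

```python
# Golden-reference sketch: build per-channel color statistics from the first
# N approved units of a new SKU, then flag units that deviate too far.
# Real features would be richer; mean color per channel is only a stand-in.
import numpy as np

class GoldenReference:
    def __init__(self, n_reference=100, z_threshold=4.0):
        self.samples, self.n_reference, self.z = [], n_reference, z_threshold
        self.mean = self.std = None

    def add_good_unit(self, image):
        self.samples.append(image.reshape(-1, 3).mean(axis=0))
        if len(self.samples) == self.n_reference:
            stacked = np.stack(self.samples)
            self.mean, self.std = stacked.mean(axis=0), stacked.std(axis=0) + 1e-6

    def is_anomalous(self, image) -> bool:
        if self.mean is None:
            return False                      # still building the reference
        feat = image.reshape(-1, 3).mean(axis=0)
        return bool(np.any(np.abs(feat - self.mean) / self.std > self.z))
```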
We also created a continuous learning loop. Every false positive and false negative caught by the remaining human reviewers was fed back into a weekly retraining pipeline. This steadily improved performance on edge cases and newly seen defect types.
For aTeam Soft Solutions, this ongoing adaptation is what makes a manufacturing AI agent development project operationally sustainable. In manufacturing, the initial model is never the final one. The learning loop is part of the product.
The solution integrated edge inference, industrial camera processing, and centralized analytics.
On the camera side, we used industrial imaging hardware connected to NVIDIA Jetson AGX Orin edge devices. Inference had to complete in under 700 milliseconds per unit at more than 5,000 units per hour per line, and cloud-based inference would have added too much latency. With the TensorRT-optimized edge deployment, inference time was about 120 ms per unit, comfortably within line-speed requirements.
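For reference, exporting YOLOv8 weights to a TensorRT engine for Jetson-class devices can be done through the Ultralytics tooling roughly as shown below. The model path is hypothetical, and the exact export options depend on the Ultralytics and JetPack versions in use, so treat this as indicative rather than the deployed command:

```python
# Indicative export of YOLOv8 weights to a TensorRT engine for Jetson inference.
# Model path and options are illustrative; flags depend on JetPack/Ultralytics versions.
from ultralytics import YOLO

model = YOLO("defect_yolov8.pt")                 # hypothetical fine-tuned weights
model.export(format="engine", half=True)         # produces a .engine file via TensorRT

# The exported engine can then be loaded through the same interface for inference:
trt_model = YOLO("defect_yolov8.engine")
results = trt_model("frame.jpg", verbose=False)
```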
We implemented image acquisition and preprocessing with OpenCV and GStreamer. MQTT handled communication between the edge stations and the central platform. YOLOv8 handled defect localization and detection, while custom CNNs performed print-quality scoring and other visual quality checks that were more comparative than object detection.
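Each edge station pushed its per-unit results to the central platform over MQTT. A minimal sketch using paho-mqtt; the broker address, topic structure, and payload schema are assumptions for illustration:

```python
# Minimal edge-to-server result publishing over MQTT using paho-mqtt.
# Broker address, topic structure, and payload fields are illustrative.
import json
import time
import paho.mqtt.client as mqtt

# paho-mqtt 1.x constructor; 2.x additionally requires a CallbackAPIVersion argument.
client = mqtt.Client(client_id="line1-station2")
client.connect("central-platform.local", 1883)
client.loop_start()

def publish_result(line, station, unit_id, defects, decision):
    payload = {
        "ts": time.time(),
        "line": line,
        "station": station,
        "unit": unit_id,
        "defects": defects,          # e.g. [{"cls": "smudge", "conf": 0.91}]
        "decision": decision,        # "passed" / "flagged" / "rejected"
    }
    client.publish(f"quality/{line}/{station}", json.dumps(payload), qos=1)
```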
On the backend, Python-based services managed orchestration, result logging, production alerts, rejection logic, and plant system integration. TimescaleDB held production telemetry so we could correlate time-based machine conditions with defect behavior. PostgreSQL handled the broader operational data. Captured imagery and training datasets were stored in AWS S3, and model training and retraining cycles ran on SageMaker. Redis provided fast task coordination and caching for real-time processing. The quality dashboard was built in React, so production, QA, and plant leadership could view current performance, defect trends, and alert state.
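Correlating defect behavior with machine conditions then becomes a time-bucketed query against the telemetry store. A sketch against TimescaleDB via psycopg2, with hypothetical table and column names:

```python
# Sketch of correlating defect counts with machine telemetry in TimescaleDB.
# Table and column names are hypothetical; time_bucket() is a TimescaleDB function.
import psycopg2

SQL = """
WITH defects AS (
    SELECT time_bucket('15 minutes', detected_at) AS bucket,
           count(*) AS defect_count
    FROM defect_events
    WHERE detected_at > now() - interval '24 hours'
    GROUP BY bucket
), telemetry AS (
    SELECT time_bucket('15 minutes', recorded_at) AS bucket,
           avg(humidity) AS avg_humidity,
           avg(machine_speed) AS avg_speed
    FROM machine_telemetry
    WHERE recorded_at > now() - interval '24 hours'
    GROUP BY bucket
)
SELECT d.bucket, d.defect_count, t.avg_humidity, t.avg_speed
FROM defects d JOIN telemetry t USING (bucket)
ORDER BY d.bucket;
"""

with psycopg2.connect("dbname=quality host=db.local user=qa") as conn:
    with conn.cursor() as cur:
        cur.execute(SQL)
        for bucket, defect_count, humidity, speed in cur.fetchall():
            print(bucket, defect_count, humidity, speed)
```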
This architecture matters because agentic AI manufacturing deployments in Saudi Arabia are not just ML projects. They are real-time industrial systems. The model has to fit into the timing of the machine, the reject hardware, the operator workflow, and plant decision-making.
Line speed was the first major constraint. At over 1.4 units per second, the system had to capture the image, normalize it, run inference, and decide quickly enough for the reject mechanism to do its job. That’s why edge deployment was a must-have.
Lighting variation was another real issue. Even on a factory floor, the light is not uniform. Skylight, aging lamps, vibration, and slight camera angle changes all affect image consistency. We solved that with dynamic normalization using a visible reference card inside the field of view, which the AI agent uses to recalibrate color interpretation on the fly.
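The recalibration itself can be as simple as per-channel gains computed from the reference-card region. A minimal OpenCV-style sketch, with the card's position in the frame and its target gray value as assumptions:

```python
# Minimal per-channel color normalization against a known reference card in
# the field of view. The card's position and target gray level are illustrative.
import numpy as np

CARD_ROI = (20, 20, 60, 60)      # x, y, w, h of the reference card in the frame
TARGET_GRAY = 180.0              # expected card value under ideal lighting

def normalize_frame(frame_bgr):
    x, y, w, h = CARD_ROI
    card = frame_bgr[y:y + h, x:x + w].astype(np.float32)
    channel_means = card.reshape(-1, 3).mean(axis=0)       # current B, G, R response
    gains = TARGET_GRAY / np.maximum(channel_means, 1.0)   # per-channel correction
    corrected = frame_bgr.astype(np.float32) * gains
    return np.clip(corrected, 0, 255).astype(np.uint8)
```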
Quality thresholds weren’t global, either. The factory didn’t want a single definition of “defective.” A pharmaceutical carton required near-zero tolerance. A standard shipping carton allowed somewhat greater variability. We defined product-tier quality profiles, so the AI agent automatically knows which standard to apply to each product it inspects.
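In practice, a quality profile is just a tiered tolerance table keyed by product class. A simplified sketch; the tier names and numeric tolerances are illustrative, not the plant's contracted values:

```python
# Illustrative product-tier quality profiles: the same detection pipeline,
# different tolerances per product class. Tier names and values are examples.
QUALITY_PROFILES = {
    "pharma_carton":   {"max_color_delta_e": 1.0, "min_defect_conf_to_reject": 0.80,
                        "allow_minor_smudge": False},
    "branded_retail":  {"max_color_delta_e": 2.0, "min_defect_conf_to_reject": 0.90,
                        "allow_minor_smudge": False},
    "shipping_carton": {"max_color_delta_e": 4.0, "min_defect_conf_to_reject": 0.95,
                        "allow_minor_smudge": True},
}

def profile_for(product):
    # product["tier"] would come from the job/SKU metadata loaded at changeover
    return QUALITY_PROFILES[product["tier"]]
```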
Another hurdle was operator trust. The production teams needed to be confident that the number of false rejects would be small enough not to unduly disrupt production. Quality teams had to be assured that even very subtle faults would be detected. That’s why the confidence thresholds and the shadow-mode calibration were so critical before turning on automated rejection.
The most immediate consequence was coverage. Inspection changed from 15-20% sampling to a full 100% of units being inspected. That in itself brought a change in the quality mindset within the plant, because the company was no longer depending on luck to know whether a defect would be caught.
Defect detection accuracy reached 99.4%, compared with around 85% for human-only sampling. The AI system was also more consistent over time, since it did not fatigue. As a result, the number of defective units reaching customers dropped by 94 percent.
Customer complaints fell from 8-12 per month to almost none. From a business perspective, the company went from handling regular escalations to complaints being the exception rather than the norm. That mattered particularly for pharmaceutical customers, where a packaging defect carries much greater commercial and regulatory sensitivity.
The inspection team was reduced from 15 people to 5, focused primarily on AI-flagged cases, exception review, and new product setup. The other inspectors were not taken out of quality work. They were shifted toward higher-value quality tasks rather than rote end-of-line sampling.
One of the biggest hidden gains came from the alerting system. Root causes that might have taken hours to notice could now be identified in minutes. That cut production waste by catching drift earlier on the line. The company anticipated a 60% reduction in waste related to late detection of defects.
The false positive rate remained low at approximately 1.2%, which the client was happy with, as rejecting a few good units is much less costly than shipping a defective one in regulated or brand-sensitive packaging.
Most notably, all three pharmaceutical clients renewed and even grew their contracts upon seeing the AI-based quality system running. There’s no clearer indicator of trust than that in this kind of manufacturing environment.
The company projected total annual savings across waste reduction, fewer complaints, lower manual inspection costs, and reduced contract risk at SAR 3.5 million. For us at aTeam Soft Solutions, this case is a strong example of AI quality inspection manufacturing done right in the Middle East: not only better detection, but better prevention, better customer confidence, and stronger production control.
We developed the system using Python services, YOLOv8 and custom CNN models for vision inference, OpenCV and GStreamer for camera processing, NVIDIA Jetson AGX Orin edge devices for low-latency inference, React.js for the quality dashboard, PostgreSQL and TimescaleDB for operational and telemetry data, MQTT for edge-to-server communication, AWS S3 for image storage, SageMaker for model training, and Redis for fast task coordination and caching.
The important realization was that detection is valuable, but prevention is where the really big economics live.
In the beginning, the explicit goal was to catch more of the bad units. We achieved that. But once the system was up and running, the production alert layer turned out to be the most valuable feature. The AI agent could detect gradual degradation patterns that human inspectors would not notice until much later. Catching drift before it became a batch problem was far more valuable than simply discarding defective units at the end of the line.
We also discovered that good inspection AI has to be product-aware. A packaging plant does not manufacture one product. It makes a variety of products with different tolerance requirements. Quality profiles and rapid new-product adaptation were critical to making the system truly usable.
And finally, we found that manufacturing teams adopt AI more quickly when it tells them what’s changing on the line. Once the system could go beyond “defect rate is rising” and say “the likely cause is gradual ink drift after the recent setup change,” it stopped feeling like a black-box detector and started feeling like an operational partner. That is a core tenet of how we develop every AI agent for factory environments.
Manufacturers in Saudi Arabia and the broader Middle East often view vision AI as a way to replace sampling. That’s only part of the value. The greater opportunity is a system that looks at every single unit, identifies defect trends over time, and lets the production team act early enough to prevent scrap, complaints, and customer risk.
At aTeam Soft Solutions, we developed this type of practical production intelligence as a real AI agent rather than just a defect dashboard. That is what enables quality control to move from reactive inspection to ongoing prevention.