How an AI Agent Automates Government Hospital Order Data Extraction — Eliminating 6 Hours of Daily Manual Work for a Saudi Medical Distributor

aTeam Soft Solutions March 16, 2026

Quick Overview

In Saudi Arabia, a leading distributor of medical supplies and pharmaceuticals found itself spending 5-6 hours daily on a manual process. They were pulling data about government hospital orders from a national procurement portal, then cleaning it up, converting units of measure, reconciling with purchase orders, and entering everything into their ERP system. The trickiest part was converting those units of measure. For instance, a hospital order could say ‘1 box,’ ‘1 BX,’ or ‘1 pack of 10,’ but the internal system required everything to be in standard selling units. Even small errors here could lead to shipment mix-ups, hospital complaints, return issues, and potential penalties.

At aTeam Soft Solutions, we built an AI agent to automate this government portal workflow. It managed the downloads, parsed the files, normalized unit of measure variations using product context, flagged mismatches, and gradually took over routine ERP entries, all while keeping human approval checkpoints. The workload fell from 5-6 hours of manual effort by four team members to about 30 minutes of exception review by one person. UoM conversion accuracy reached 98.7%, and the client recorded zero quantity-related complaints after Phase 3.

The Client’s Workflow Before AI

Before designing the system, we took the time to understand the actual workflow instead of assuming that the issue was limited to “data extraction.” What we found was a repetitive operational process that played a crucial role in public healthcare supply delivery in Saudi Arabia.

Every morning, four team members from the distributor’s order operations would begin their day by logging into the government procurement portal. This wasn’t a straightforward one-click export; the portal required several filters to be applied in a specific order: date range, region, hospital group, product category, and order status. Since the data was spread out across various combinations, the team often needed to perform 15-20 separate downloads during a single session.
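The download stage is essentially an enumeration of filter combinations. A minimal Python sketch of that enumeration follows; the filter values are placeholders chosen for illustration, since the portal's actual options are not listed here, and `build_download_jobs` is a hypothetical helper name.

```python
from itertools import product

# Hypothetical filter values; the portal's real options are not given in this article.
FILTERS = {
    "region": ["Riyadh", "Makkah"],
    "hospital_group": ["Group A", "Group B"],
    "product_category": ["Consumables", "Pharmaceuticals"],
    "order_status": ["Open", "Confirmed"],
}


def build_download_jobs(date_range, filters=FILTERS):
    """Return one job per filter combination, keeping the order the portal expects:
    date range, region, hospital group, product category, order status."""
    keys = list(filters)
    jobs = []
    for values in product(*(filters[key] for key in keys)):
        job = {"date_range": date_range}
        job.update(zip(keys, values))
        jobs.append(job)
    return jobs


jobs = build_download_jobs(("2026-03-15", "2026-03-16"))
print(len(jobs))  # 16 combinations for the placeholder values above
```

With two values per filter, this yields 16 separate exports, which is in line with the 15-20 downloads per session described above.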

They would pick one combination, wait for the page to load, export the file, give it a new name, move on to the next filter set, and do it all over again. If the session timed out, they had to log back in. When the portal slowed down, the entire process lagged. If a captcha popped up, they had to pause their progress. Some days, the raw downloads wrapped up by mid-morning, while other days, particularly when the portal was busy, just the download stage could take over two hours.

Once everything was downloaded, the next phase kicked off. Staff opened the CSV or Excel files one by one, removed unnecessary columns, standardized headers, checked that each hospital name was complete, and manually pulled out fields like PO number, hospital, ship-to address, product code, requested quantity, and delivery date. They often copied values between files and internal templates because the export format from the portal didn’t match the structure needed for ERP imports.

Then, they faced the most challenging part: converting units of measure. A hospital order might ask for 1 box of gloves, 2 cartons of syringes, or 10 EA of a consumable. But the distributor’s system kept track of stock in base units. For one SKU, “1 box” might equal 100 pieces, while for another, it could be 50. Similarly, “1 case” could represent 24 units for one product line and 12 for another. Staff had to examine the product description, supplier catalog references, and historical records to manually calculate the correct internal quantity.
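The arithmetic behind this step is simple once the pack size is known; the difficulty is that the pack size depends on the SKU. A minimal sketch, assuming a hypothetical lookup table keyed by SKU and ordering unit (the SKU codes are invented for illustration, and "EA" is assumed to be the base unit):

```python
# Hypothetical per-SKU pack sizes: (sku, ordering unit) -> number of base units.
PACK_SIZES = {
    ("GLOVE-100", "BOX"): 100,
    ("GLOVE-050", "BOX"): 50,
    ("SYR-24", "CASE"): 24,
    ("SYR-12", "CASE"): 12,
}


def to_base_units(sku, quantity, unit):
    """Convert an ordered quantity to base units, or raise if the pairing is unknown."""
    if unit == "EA":
        return quantity  # assumed to already be the base unit
    try:
        return quantity * PACK_SIZES[(sku, unit)]
    except KeyError:
        raise LookupError(f"No pack size for {sku!r} in unit {unit!r}") from None


print(to_base_units("GLOVE-100", 1, "BOX"))  # 100
print(to_base_units("GLOVE-050", 1, "BOX"))  # 50
```

The same "1 box" produces two different internal quantities, which is exactly the kind of case staff had to resolve by hand.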

This is where fatigue became a real challenge. By the time the team arrived at the conversion stage, they had already invested hours navigating the portal and cleaning up spreadsheets. Just one incorrect conversion could lead to the wrong shipment quantity, spark a complaint from a government hospital, and create extra work with returns and credit notes down the line.

Even after conversion, the data still wasn’t quite ready. The staff had to meticulously compare the order lines with existing purchase orders and shipment records. They looked for duplicates, unmatched orders, discrepancies in quantities, and any missing references. Only after that did they manually input the verified information into the ERP or order management system.

This process took about 5-6 hours each day across four team members. The client estimated that around 4-6% of line items had errors related to quantities or units of measure, leading to issues later on. In a procurement setting for public hospitals, this wasn’t just a minor administrative hassle. It impacted service quality, inventory planning, and the distributor’s reliability in a high-pressure healthcare supply chain.

Why Existing Tools Did Not Succeed

This client had already experimented with various forms of automation before reaching out to us, but each attempt failed right where the workflow became complicated.

Their initial venture was with RPA through UiPath. It provided some assistance for a brief time with file downloads, but the frequent changes to the government portal rendered the bots unreliable. The position of buttons shifted, form layouts changed, session behaviors evolved, and new captcha processes emerged. Every couple of weeks, the automation required fixes. As a result, the team still relied on specialists to keep the bot functioning, which essentially undermined the goal of achieving daily operational automation.

They also looked into basic Python scripts. These scripts did a great job cleaning files after download, but they struggled to navigate a JavaScript-heavy Oracle APEX interface, manage authenticated sessions, or respond to browser-side events. Essentially, the scripts were too fragile for the front end and too inflexible for the actual workflow.

The biggest challenge, however, was interpreting units of measurement (UoM). This was not just a straightforward mapping table issue. The same product could be listed as “1 box,” “1 BX,” “pack of 10,” “10 EA,” or even as a fractional carton based on the hospital or export formatting. A converter that relied purely on rules quickly became a maintenance headache.
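To illustrate why a purely rule-based converter is hard to maintain, here is a minimal sketch of one. The alias table and regular expression are assumptions made for this example, not the client's actual rules. It handles the spellings mentioned above, but every new variant needs a new alias or pattern, and it still cannot say how many pieces a "box" holds for a given product.

```python
import re

# Hypothetical alias table; real exports may contain many more spellings.
UNIT_ALIASES = {
    "box": "BOX", "bx": "BOX",
    "pack": "PACK", "pk": "PACK",
    "carton": "CARTON", "ctn": "CARTON",
    "ea": "EA", "each": "EA",
}

# Optional quantity, a unit word, and an optional "of N" pack size.
PATTERN = re.compile(
    r"^\s*(?:(?P<qty>\d+(?:\.\d+)?)\s*)?(?P<unit>[A-Za-z]+)(?:\s+of\s+(?P<size>\d+))?\s*$"
)


def parse_uom(text):
    """Parse strings like '1 BX' or 'pack of 10'; return None when no rule applies."""
    match = PATTERN.match(text)
    if match is None:
        return None
    unit = UNIT_ALIASES.get(match.group("unit").lower())
    if unit is None:
        return None
    quantity = float(match.group("qty")) if match.group("qty") else 1.0
    pack_size = int(match.group("size")) if match.group("size") else None
    return {"quantity": quantity, "unit": unit, "pack_size": pack_size}


print(parse_uom("1 BX"))        # {'quantity': 1.0, 'unit': 'BOX', 'pack_size': None}
print(parse_uom("pack of 10"))  # {'quantity': 1.0, 'unit': 'PACK', 'pack_size': 10}
print(parse_uom("two boxes"))   # None
```

Anything the rules do not recognize falls through as `None`, which is where product context becomes necessary.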

That’s why the solution required agentic AI rather than standard automation. The workflow called for a system capable of doing three things simultaneously: functioning within a shifting browser environment, interpreting semi-structured business data in context, and gaining operational trust through supervised decisions. It wasn’t enough to just click buttons faster; the solution needed to grasp what it was doing well enough to discern when to take action and when to escalate issues.

Our Agentic AI Approach – Architecture

We took a phased approach to implementing the AI agent because handling procurement and hospital supply workflows is not ideal for immediate full autonomy. A misstep could impact actual deliveries. That’s why we set up a gradual rollout, allowing the client to build trust through measurable accuracy.

Phase 1: Examine and Retrieve

During the first month, our focus wasn’t on taking autonomous actions but rather on ensuring reliable data extraction, organized storage, and human validation.

To achieve this, we developed a Chrome extension-based AI agent that logged into the government portal following the client’s authenticated process, navigated through various filter combinations, and downloaded files into a monitored folder. In the event of encountering a captcha or an unexpected login challenge, the system didn’t just fail quietly; it paused the process and notified a human to intervene.

Once the files were downloaded locally, a background service would pick them up, parse the data, and run the line items through a UoM normalization pipeline. This is where we utilized Claude Sonnet in a controlled manner. It didn’t simply assess the UoM field on its own. Instead, it took into account the product description, catalog references, known product mappings, and historical order trends to accurately determine the normalized unit.

Humans played a crucial role in this phase. Each extracted order showed up on the review dashboard, complete with its source file, parsed fields, converted quantities, and confidence scores. Staff members carefully checked the results, corrected any errors, and approved the final output. These corrections were then stored as feedback data. The primary goal for Phase 1 wasn’t just speed; it was to create a dependable training and validation dataset while achieving a stable extraction accuracy that would allow us to move into assisted decision-making.

Phase 2: Propose and Validate

During weeks 5-8, we took a big step by enhancing the AI agent to include reconciliation logic along with extraction.

At this point, the agent was smart enough to automatically compare order lines from the portal with existing purchase orders and shipment records. If the portal showed 500 units while the purchase order indicated only 480, the system would flag this difference and give a reason for it. If a hospital order came in that didn’t have a matching PO, the agent would point out that it might need a new PO to be created or require manual confirmation. Additionally, if there were orders that seemed duplicated across files, the system would flag those before they entered the ERP.
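The three checks described here (quantity mismatch, missing PO, and possible duplicates) can be sketched in a few lines of Python. This is an illustrative simplification, not the production reconciliation service: the record layout, flag names, and the `reconcile` function are assumptions, and quantities are taken to be already normalized to base units.

```python
from collections import Counter


def reconcile(portal_lines, purchase_orders):
    """Flag duplicates, missing POs, and quantity mismatches.

    portal_lines: list of dicts with "po_number", "sku", "quantity" (base units).
    purchase_orders: dict mapping (po_number, sku) to the expected quantity.
    """
    counts = Counter((line["po_number"], line["sku"]) for line in portal_lines)
    flags, reported = [], set()
    for line in portal_lines:
        key = (line["po_number"], line["sku"])
        if counts[key] > 1:
            # Same PO and SKU seen more than once across files: report it once.
            if key not in reported:
                reported.add(key)
                flags.append({"key": key, "type": "possible_duplicate",
                              "reason": f"appears {counts[key]} times in portal exports"})
            continue
        expected = purchase_orders.get(key)
        if expected is None:
            flags.append({"key": key, "type": "missing_po",
                          "reason": "no matching purchase order"})
        elif expected != line["quantity"]:
            flags.append({"key": key, "type": "quantity_mismatch",
                          "reason": f"portal shows {line['quantity']}, PO shows {expected}"})
    return flags


portal = [
    {"po_number": "PO-1", "sku": "GLOVE-100", "quantity": 500},
    {"po_number": "PO-2", "sku": "SYR-24", "quantity": 48},
    {"po_number": "PO-3", "sku": "SYR-12", "quantity": 24},
    {"po_number": "PO-3", "sku": "SYR-12", "quantity": 24},
]
pos = {("PO-1", "GLOVE-100"): 480, ("PO-3", "SYR-12"): 24}
flags = reconcile(portal, pos)
for flag in flags:
    print(flag["type"], "-", flag["reason"])
```

Each flag carries a plain-language reason so that a reviewer can see why the line was held.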

What changed at this stage mattered for the staff. They were no longer merely double-checking extracted data; they were reviewing recommended courses of action. The dashboard displayed not only what the order data said but also what the system recommended doing next: approve for import, hold for review, investigate a UoM mismatch, or confirm duplicate risk.

The humans still had the final say, approving each action with a simple click. This step was crucial because it transformed the system from simply parsing information into a decision support tool, all while keeping the operations team accountable. We set clear internal goals: maintain extraction and conversion accuracy above 95%, ensure that false-positive discrepancy flags stayed within an acceptable review threshold, and enable staff to resolve flagged items more swiftly than before.

This stage helped establish trust since users could see that the recommendations were based on real business context rather than just mysterious AI processes.

Phase 3: Proceed with Caution

During weeks 9-14, we rolled out controlled autonomy.

The AI agent gained the ability to automatically create order records in the ERP, but only when three specific conditions were satisfied. Firstly, the confidence level for UoM conversion needed to be over 97%. Secondly, the purchase order reconciliation had to match perfectly. Lastly, the order pattern had to align with established historical trends for that particular product and account.

If any of those conditions were not met, the order would be sent directly to the human review queue—no guessing involved and no partial actions taken.
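The gate itself is an all-or-nothing check. A minimal sketch, assuming the three conditions have already been computed upstream (the `OrderCheck` fields and the routing labels are hypothetical names):

```python
from dataclasses import dataclass

UOM_CONFIDENCE_THRESHOLD = 0.97  # from the text: confidence must be over 97%


@dataclass
class OrderCheck:
    uom_confidence: float          # 0.0 to 1.0
    po_matches_exactly: bool       # result of the reconciliation step
    fits_historical_pattern: bool  # result of a history check, assumed computed elsewhere


def route(check):
    """Return 'auto_process' only when all three conditions hold; otherwise 'human_review'."""
    if (
        check.uom_confidence > UOM_CONFIDENCE_THRESHOLD
        and check.po_matches_exactly
        and check.fits_historical_pattern
    ):
        return "auto_process"
    return "human_review"


print(route(OrderCheck(0.99, True, True)))   # auto_process
print(route(OrderCheck(0.99, False, True)))  # human_review
```

There is no intermediate outcome: an order that misses any single condition goes to the review queue untouched.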

This is where the architecture moved from supporting decisions to executing operations. The dashboard clearly separated automatically processed orders from flagged ones. Every day, management received a summary email detailing how many orders were processed automatically, how many were held, the UoM confidence rate, and any anomalies observed, broken down by hospital or product group.

From an enterprise operations standpoint, this marked a significant shift. The team no longer spent time on routine, repetitive cases and instead concentrated on the exceptions that genuinely required judgment. For a business use case built on AI data extraction, this is the real value: not just extracting data from a portal, but ensuring that verified data moves safely into the next operational phase.

Phase 4: Complete Independence with Audit Trail

During months 4-6, after the client enjoyed over two months of excellent production accuracy, we took the exciting step of introducing delivery ticket automation.

Now, our AI agent can automatically generate and close delivery tickets for straightforward cases. If the system hits a snag like stock shortages, address mismatches, credit holds, or any other business-rule violations, it pauses the process and sends alerts to the relevant team via the dashboard and WhatsApp notifications.

While this phase didn’t eliminate the need for human oversight, it transformed it. Instead of having to review every single item, staff focused on exception dashboards and regular audit reports. Every action taken by the system was meticulously logged, including timestamps, source data references, confidence levels, and the ability to roll back changes if needed.
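One way to picture such an audit trail is as an append-only list of entries that each keep the prior state, so any action can be reversed. The sketch below is an illustration under that assumption; the field names, the `AuditLog` class, and the sample file name are hypothetical, and the production system stored its audit records in PostgreSQL rather than in memory.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    action: str
    order_id: str
    source_ref: str       # reference to the source file or row
    confidence: float
    previous_state: dict  # what to restore on rollback
    new_state: dict
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


class AuditLog:
    """Append-only log; entries are never edited or removed."""

    def __init__(self):
        self._entries = []

    def record(self, entry):
        self._entries.append(entry)
        return entry

    def rollback_state(self, order_id):
        """Return the state to restore for the most recent action on this order."""
        for entry in reversed(self._entries):
            if entry.order_id == order_id:
                return entry.previous_state
        raise KeyError(order_id)


log = AuditLog()
entry = log.record(AuditEntry(
    action="create_delivery_ticket",
    order_id="ORD-1",
    source_ref="export_01.csv#row-12",  # hypothetical reference format
    confidence=0.99,
    previous_state={"status": "pending"},
    new_state={"status": "ticketed"},
))
print(log.rollback_state("ORD-1"))  # {'status': 'pending'}
```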

This audit trail is particularly important in Saudi Arabia, where operational transparency is vital when serving government hospitals. At aTeam Soft Solutions, we believe that autonomous workflow execution must always be reversible, understandable, and guided by business rules. This approach is what makes agentic AI deployments for procurement in Saudi Arabia practical and safe rather than a risk.

Technical Execution

The solution brought together browser automation, language intelligence, data validation, and workflow orchestration, instead of just depending on a single model or tool.

For our portal interactions, we implemented a Chrome Extension based on Manifest V3. This allowed us to manage browser-side tasks smoothly for things like logging in, navigating, downloading, and handling sessions. Since the portal didn’t have an API, browser automation was really our best bet. A Node.js background service was in charge of managing the extraction pipeline, keeping an eye on the download directories, and sending tasks into Redis queues to ensure that processing could keep going steadily, even when portal downloads occurred in bursts.

For data interpretation, we used Claude Sonnet through the Claude API for contextual unit of measure normalization and semi-structured data extraction. We complemented it with deterministic validation logic to avoid relying solely on model outputs. We drew on product descriptions, supplier catalog references, historical order behavior, and established product-specific mappings to check whether the proposed normalized quantity made business sense. The reconciliation logic ran in Python services, which made it easier to maintain the client’s rules for comparing portal orders, purchase orders, and shipment records in one layer.
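A deterministic check of this kind might look like the following sketch. It assumes the model has proposed a base-unit quantity and that known pack sizes and a historical quantity range are available for the product; the function name, parameters, and rejection messages are hypothetical.

```python
def validate_proposal(proposed_qty, ordered_qty, known_pack_sizes, history_range):
    """Return a list of reasons to reject the proposal; an empty list means it passes.

    proposed_qty: base-unit quantity proposed by the model.
    ordered_qty: quantity as written on the hospital order.
    known_pack_sizes: pack sizes already confirmed for this product (may be empty).
    history_range: (low, high) base-unit quantities seen historically.
    """
    problems = []
    if proposed_qty <= 0:
        problems.append("non-positive quantity")
    if known_pack_sizes:
        allowed = {ordered_qty * size for size in known_pack_sizes}
        if proposed_qty not in allowed:
            problems.append("does not match any known pack size")
    low, high = history_range
    if not (low <= proposed_qty <= high):
        problems.append("outside historical range")
    return problems


print(validate_proposal(200, 2, [100], (50, 1000)))  # []
print(validate_proposal(20, 2, [100], (50, 1000)))
# ['does not match any known pack size', 'outside historical range']
```

A proposal that fails any check would be routed to a reviewer rather than accepted on the model's confidence alone.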

Once parsed and validated, the data was stored in PostgreSQL. The processing services were hosted on AWS EC2, while RDS was responsible for the managed database, and S3 was used for file storage and keeping source documents. Users accessed a user-friendly dashboard built with React.js, which provided a clear view of source files, extracted fields, normalized units of measurement, confidence scores, explanations for discrepancies, and options for one-click approval or corrections.

We put a lot of thought into creating the human-in-the-loop interface. Our goal was to ensure that reviewers didn’t have to juggle multiple tools to make a single decision. Instead, we provided a unified screen that displayed everything the system extracted, explained the reasoning behind its recommendations, and included the supporting evidence. Anytime a user made corrections to a conversion, we made sure those changes were fed back into the product-specific mapping table and evaluation logs, allowing the system to learn and improve over time.
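The feedback loop can be sketched as a small function that runs whenever a reviewer approves or corrects a conversion. This is an assumed simplification: the mapping is an in-memory dict keyed by SKU and unit, and the pack size is derived from the approved quantity only when it divides evenly.

```python
def record_review(mapping, eval_log, sku, unit, ordered_qty,
                  proposed_base_qty, approved_base_qty):
    """Store the reviewer's decision and update the product-specific pack size.

    ordered_qty must be a positive integer.
    """
    pack_size, remainder = divmod(approved_base_qty, ordered_qty)
    if remainder == 0:
        # Only learn a pack size when the approved quantity is a whole multiple.
        mapping[(sku, unit)] = pack_size
    eval_log.append({
        "sku": sku,
        "unit": unit,
        "proposed": proposed_base_qty,
        "approved": approved_base_qty,
        "model_was_correct": proposed_base_qty == approved_base_qty,
    })


mapping, eval_log = {}, []
# The model proposed 100 pieces for 2 boxes; the reviewer corrected it to 200.
record_review(mapping, eval_log, "GLOVE-100", "BOX", 2, 100, 200)
print(mapping)                           # {('GLOVE-100', 'BOX'): 100}
print(eval_log[0]["model_was_correct"])  # False
```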

When it comes to error handling, we made it straightforward. If a portal download didn’t succeed, the job would retry after validating the session. In case a captcha popped up, we triggered a human alert. If confidence levels dipped below a certain threshold, no actions were taken in the ERP. And if an ERP import didn’t go through, the order was reverted to a pending state, making it easy for users to see the exact reason for the failure on the dashboard.
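The download-side rules above can be sketched as a retry loop. The production service ran on Node.js; this Python version is only an illustration, and the callables (`download`, `session_is_valid`, `login`, `notify_human`) and exception classes are hypothetical stand-ins for the real integrations.

```python
class CaptchaRequired(Exception):
    """Raised by the download step when the portal shows a captcha."""


class DownloadFailed(Exception):
    """Raised by the download step on a recoverable failure."""


def download_with_retry(download, session_is_valid, login, notify_human,
                        max_attempts=3):
    """Retry a download after validating the session; escalate captchas to a human."""
    for _ in range(max_attempts):
        if not session_is_valid():
            login()  # re-authenticate before trying again
        try:
            return download()
        except CaptchaRequired:
            notify_human("Captcha shown; waiting for an operator")
            return None
        except DownloadFailed:
            continue
    notify_human(f"Download failed after {max_attempts} attempts")
    return None


# Usage with stand-in callables: the download fails twice, then succeeds.
attempts = {"count": 0}

def flaky_download():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise DownloadFailed()
    return "orders.csv"

alerts = []
result = download_with_retry(flaky_download, lambda: True, lambda: None, alerts.append)
print(result, alerts)  # orders.csv []
```

A captcha is never retried automatically: the function stops and alerts a person, matching the pause-and-notify behavior described above.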

At aTeam Soft Solutions, we believe that implementing an AI system for automated order processing is about much more than picking the right model. It’s about ensuring that every part (browser actions, data interpretation, business rules, review controls, and auditability) works together as one cohesive operational system.

Challenges and Edge Cases

The toughest challenge of this project was dealing with all the messy real-world conditions that hit us at once.

First, the government portal didn’t have an API and was built on Oracle APEX, which produced dynamic and unreliable element IDs. Typical CSS selectors and XPath-heavy automation couldn’t hold up against layout changes. To tackle this, we created a visual recognition layer that could pinpoint important UI elements by using label text, nearby context, and their relative position on the screen. This approach gave our automation a more resilient way to navigate the portal, even when the DOM structure changed.

Then, we had to grapple with the data itself. Both Arabic and English values popped up side by side in exports. Hospital names weren’t always consistent, and the ship-to addresses sometimes differed in abbreviations between the portal and our internal systems. Plus, some product descriptions were quite detailed, while others were short enough to cause confusion without context from supplier catalogs.

Normalizing UoM was the hardest data problem. What “1 box” means varies widely across products such as gloves, syringes, catheters, and consumables, each with its own packaging patterns. The system needed to grasp the context of each product rather than rely on text similarity alone. Conversion accuracy started at 89% in week one and reached 98.7% by month three, as the product-specific mapping table learned from validated corrections.

The human element played a crucial role as well. Staff had seen earlier automation efforts fall short, which made them hesitant to trust the new system initially. Many reviewers were meticulously checking each item, even when the confidence score was high. To tackle this, we created confidence calibration views that displayed historical accuracy by product category. This helped users see where the system was performing well and where it would benefit from additional review.
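The data behind a calibration view like this is a simple aggregation of past review outcomes. A minimal sketch, assuming each reviewed line is stored with its product category and whether the system's conversion was approved unchanged (the record fields and sample values are invented for illustration):

```python
from collections import defaultdict


def accuracy_by_category(reviews):
    """Return {category: share of reviewed lines where the system was correct}."""
    totals = defaultdict(lambda: [0, 0])  # category -> [reviewed, correct]
    for review in reviews:
        counts = totals[review["category"]]
        counts[0] += 1
        if review["model_was_correct"]:
            counts[1] += 1
    return {category: correct / reviewed
            for category, (reviewed, correct) in totals.items()}


# Invented sample data, not the client's figures.
reviews = [
    {"category": "gloves", "model_was_correct": True},
    {"category": "gloves", "model_was_correct": True},
    {"category": "syringes", "model_was_correct": True},
    {"category": "syringes", "model_was_correct": False},
]
print(accuracy_by_category(reviews))  # {'gloves': 1.0, 'syringes': 0.5}
```

Showing the figures per category lets reviewers relax on categories with a strong track record and keep checking the weaker ones.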

This experience is part of why we believe that an AI agent designed for enterprise operations must not only solve the technical challenges but also build trust among the team members who will use it.

Outcomes

The business benefits became apparent pretty quickly, but what really stood out was not just the time we saved. It was also about the reliability we gained.

Before we introduced automation, four team members spent around 5-6 hours each day taking care of downloads, cleaning up spreadsheets, converting units of measurement, checking reconciliations, and entering data into the ERP system. Once we launched Phase 3, this dropped to about 30 minutes of daily review by just one team member, primarily focusing on any exceptions. This led to a significant decrease in manual effort and resulted in an estimated annual labor saving of around $85,000.

The accuracy of UoM conversion shot up from a manual range of about 94-96% to an impressive 98.7% with the help of our supervised AI agent workflow. While that might seem like a small jump, it makes a big difference in medical distribution. Just a few percentage points can mean the difference between smooth operations and hospitals raising regular quantity complaints.

The speed of order processing saw a remarkable transformation, too. What used to take until the next day is now a same-day process that’s wrapped up within just two hours of the portal update. This improvement provided the operations team with better visibility much earlier in the day, which helps them plan for dispatch and stock movement much more effectively.

Additionally, the client experienced an immediate drop in customer-related issues. In the four months following the launch of Phase 3, they recorded zero quantity-related complaints from hospitals, compared to 8-12 such complaints each month before. This improvement has led to fewer returns, less need for corrections, reduced delivery disputes, and a boost in internal confidence.

Three of the four original data entry team members have been reassigned instead of being let go. They’re now focusing on supplier relationship management and demand planning, where their skills can shine. This was important for our client, as they didn’t just want an automation project that replaced people. They aimed for one that liberated talented staff from repetitive clerical tasks.

A significant operational benefit came from the new reconciliation process. The system now flags 15-20 discrepancies each week, which previously would have been spotted only later during shipping or handling complaints. By catching these issues earlier, it enhanced purchase order visibility, minimized shipping risks, and allowed the business to address problems before they reached the hospital.

With fewer shipping errors and less return logistics, the client estimates that it avoids about $120,000 annually in downstream costs, on top of the labor savings.

Currently, the client is in Phase 4 for select low-risk operations, with autonomous ticket generation and exception-driven oversight. In the Middle East healthcare supply landscape, this is the right end state: routine tasks are automated, high-risk cases are escalated, and every action is fully traceable.

Summary of the Technology Stack

This solution used a Chrome Extension based on Manifest V3 to automate the portal, which included managing the login process, navigating the browser, controlling downloads, and handling sessions. Workflow orchestration, file monitoring, and job execution were managed by Node.js services. Claude Sonnet, accessed through the Claude API, handled contextual unit of measure normalization and semi-structured extraction. Python services were responsible for the logic behind purchase order and shipment reconciliation. PostgreSQL stored the normalized operational data, validation history, and audit records. The human review dashboard was built with React.js. REST API connectors integrated with the client’s ERP system. For infrastructure, storage, and processing, we used AWS EC2, RDS, and S3, while Redis managed background queueing and retry control to keep execution stable.

What We Learned

This project highlighted a principle that we now apply to every AI automation project in medical distribution: phased trust is essential.

If we had aimed for full autonomy right from the start, the client might have noticed quicker automation results, but the associated operational risk would have been way too great. Just one incorrect unit of measure conversion sent to a government hospital in Saudi Arabia could have eroded trust much more than a month spent on careful validation. By having that supervised observation period, we allowed the system to build credibility with users, and we were able to fine-tune the product-specific logic.

We also discovered that accuracy alone isn’t enough to ensure adoption. The reviewers needed to see where the system excelled and why. That’s why the confidence calibration feature was so important: it shifted the approach from “review everything” to “review what really matters.”

If we were to start over, we would definitely incorporate category-level UoM reference data into the project much earlier and establish our exception taxonomy sooner. This approach would have helped lessen the amount of manual reviews we had to do initially. At aTeam Soft Solutions, this experience clearly showed us that any agentic AI workflow involving procurement, compliance, or regulated delivery needs to undergo at least eight weeks of supervised operation before we allow it to take autonomous actions.

Collaborate With Us

At aTeam Soft Solutions, we focus on building practical AI agents that automate government portal workflows for businesses still relying on manual processes in areas like procurement, distribution, compliance, and operations. If your team in Saudi Arabia, the UAE, or the broader Middle East spends every day transferring data from portals, emails, PDFs, spreadsheets, or messaging platforms into your internal systems, there’s usually a smarter way to handle that work.

We don’t just jump into theory. We begin with your actual workflow, real exceptions, and the genuine controls your team requires. From there, we develop an automation approach that starts off under supervision, confirms accuracy, and gradually expands safely into full action.

Shyam S March 16, 2026