How an AI-Powered WhatsApp Bot Eliminated 85% of Manual Data Entry for a Logistics Company — Processing 2,000+ Documents Per Day

aTeam Soft Solutions March 10, 2026

The Quick Overview

We created an AI-powered WhatsApp document-processing system for a mid-sized logistics and freight forwarding company in the UAE and Saudi Arabia. This client manages over 500 shipments each week involving sea freight, air freight, and land transport, which means they receive a lot of documents in various messy formats such as PDFs, scans, phone photos, and WhatsApp forwards. Previously, they relied on eight operators to manually read and enter document data into an outdated logistics system, taking 35-45 minutes per shipment.

At aTeam Soft Solutions, we designed a WhatsApp-first intake and AI extraction workflow that classifies documents, extracts important shipment information, checks data across related documents, and inputs verified records into the client’s logistics system. Now, operators handle only exceptions instead of typing everything out.

This led to a remarkable improvement in operations: manual data entry was reduced by 85%, processing time fell to just 4-6 minutes per shipment, accuracy shot up to 97.5%, and during peak periods, the company scaled up to 2,000+ documents per day without needing to hire more staff. This project stands out as a great example of AI-powered data entry automation for logistics teams managing high-volume document workflows.

The Client and Their Problem to Overcome

The client is a mid-sized logistics and freight forwarding company catering to customers throughout the UAE and Saudi Arabia. They handle all the import and export documentation for shipments in various sectors, including FMCG, automotive parts, electronics, and construction materials. Operationally, they find themselves at the heart of a complex network involving clients, shipping lines, customs brokers, transport partners, and warehouses. While they don’t have control over how the documents are produced or delivered, they are accountable for ensuring that the data is accurate and entered into the system swiftly.

That’s where the main challenge lies.

Every shipment entails a set of documents that can include a bill of lading, commercial invoice, packing list, certificate of origin, customs declaration, delivery order, arrival notice, and additional paperwork depending on the route and type of cargo. In practice, this translates to 8-15 documents per shipment. Some of these documents arrive as clean PDFs, but many do not. They often come in as scanned images, low-quality photos taken on mobile phones, forwarded WhatsApp images, and mixed batches of documents from several different parties.

The client developed a manual approach to tackle this situation. A team of eight data entry operators painstakingly reviewed each document individually and input the details into the logistics management system. While the operators were skilled, the process turned out to be quite slow and exhausting. On average, it took 35-45 minutes to process the documents for a single shipment. With over 500 shipments each week, the company was investing a substantial amount of time from its team on tedious typing and cross-checking tasks.

The bigger concern was the potential for errors. The client estimated an 8-12% error rate in the fields entered manually, a figure that was consistent with our findings during sampling. Although these mistakes weren’t always major, even small errors can lead to significant issues in logistics: incorrect container numbers, mismatched weights, wrong consignee names, missing HS codes, or discrepancies across different documents. These types of errors resulted in customs delays, penalty fees, shipment holds, and dissatisfied clients.

Things only got more challenging during peak season. In busy times, the document queue piled up by 2-3 days, causing delays that impacted teams throughout the supply chain. Operations had to wait for the necessary documentation to be entered before they could proceed to the next steps. Client communication became more reactive, as the company was still processing the paperwork from the previous day while new shipments were arriving for today.

The client had already tried to improve performance through better SOPs and operator training; however, the bottleneck wasn’t about training. It came from the volume, the variation, and the manual nature of the work. They were looking for automation but didn’t want a rigid system that required every partner to change their behavior, especially since their ecosystem was heavily based on WhatsApp.

When they reached out to us, they were in search of a solution that could accept documents via WhatsApp, reliably extract fields from various formats, validate data before entry, and minimize manual effort without disrupting their operations. That made the project a natural fit for AI document processing and OCR automation in logistics, with a WhatsApp bot built around operational needs rather than customer support.

Why They Selected aTeam Soft Solutions

The client explored options with several document automation vendors and OCR providers before choosing aTeam Soft Solutions. What really set us apart was that we didn’t just present the project as “OCR software”; instead, we framed it as a comprehensive logistics workflow automation challenge.

During our initial conversations, we requested sample shipment document sets, exception cases, operator workflows, and screenshots of their existing logistics system. We also inquired about the errors that led to customs delays and which fields were most important for downstream processing. This approach helped steer the discussion from general extraction accuracy to focusing on operational results, such as turnaround times, exception rates, and how well the solution would fit into their current process.

As a software development company in India, we were able to design a custom solution that aligned perfectly with their actual intake channel (WhatsApp), their document variability, and the limitations of their legacy system. The client had previously seen a few tools that required users to upload files to an external portal and manually choose document types. That wouldn’t have worked for them; they needed partners and agents to continue sending documents in the way they were already accustomed to.

The client was really impressed that aTeam Soft Solutions could bring together multiple layers in one cohesive build, including WhatsApp intake, OCR, AI document understanding, cross-document validation, exception handling, and system integration. While many vendors excelled in one or two layers, they often fell short of covering the entire workflow.

Our team based in India played a key role in this success. As a web development company in India and a custom engineering partner, we were able to move swiftly, test with sample documents daily, and adjust the system as new document formats emerged. This agility was crucial because the client wanted bill of lading and commercial invoice automation up and running first, with more document types added once the first phase had been validated in production.

Ultimately, the client chose aTeam Soft Solutions because they wanted a team capable of delivering a practical logistics automation solution that could show measurable improvements in operations, rather than just a basic OCR proof of concept.

Our Method — Exploring and Planning

We kicked things off with a discovery phase that looked at how documents actually flowed through the company, rather than how the standard operating procedures outlined them. This distinction is important in logistics document processing. While document intake may seem orderly on paper, the reality is a mix of forwarded messages, partial scans, last-minute revisions, and urgent requests.

We carefully mapped the entire workflow from intake to entry, noting who sends the documents, how they come in, how operators determine the types of documents, which fields they fill in, what manual validations they carry out, and where they escalate any issues. We also gathered a representative dataset of documents that reflected different shipment types, routes, partners, and quality levels. This collection included clean PDFs, subpar scans, skewed phone photos, stamped documents, and multilingual materials.

Additionally, we took a look at the client’s current logistics management system. One key finding was that the system lacked a usable API. This realization had an immediate impact on our architecture plans, as it meant we needed to consider database-level integration instead of a straightforward API sync.

From our discoveries, we created a phased rollout plan. Phase 1 (lasting 10 weeks) focused on providing production support for bill of lading and commercial invoice processing, since these two document types were responsible for a significant amount of manual work and downstream dependencies. Phase 2 (an extra 6 weeks) would expand the system to cover all 12 document types used in the client’s operations.

We put together a team of 4 developers, 1 AI/ML engineer, 1 QA engineer, and 1 project manager. We used Jira for sprint planning, Slack/Teams for daily communication, Figma for reviewing the admin dashboard and operator flows, and Git for version control. We also set up a document annotation and review loop with the client’s operators, as their insights were crucial for establishing field labels, edge cases, and real-world exceptions.

One key decision during our planning was to develop the exception workflow right from the start, instead of waiting until after the extraction phase. We understood that while accuracy would evolve, operators needed a quick way to verify uncertain fields from day one. This choice turned out to be one of the greatest wins for adoption in the project.

The Solution — What We Built

A document intake system on WhatsApp that aligns perfectly with the client’s actual workflow

We developed a WhatsApp-first intake system designed for clients, agents, and partners to effortlessly send documents to the company’s WhatsApp Business number. This was a thoughtful decision. We recognized that the quickest way to derail this project would be to require external parties to use a new portal or adhere to strict upload guidelines.

By leveraging the WhatsApp Business API through Twilio, we established an intake layer that accepts PDFs, images, and forwarded media. It tags these documents to relevant shipment conversations when possible and organizes them for processing. This approach allowed the system to be immediately user-friendly, as it aligned with the clients’ established communication habits.

We’ve also included acknowledgements and status messages, allowing senders and operators to confirm that documents were received and are being processed. This enhancement significantly cut down on follow-up calls and “did you get the file?” messages, which the operations team was grateful for right away.
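The intake step can be sketched roughly as follows. This is a minimal illustration, not the production code: it assumes the form fields Twilio includes on incoming-message webhooks (`From`, `NumMedia`, `MediaUrl0`, `MediaContentType0`, and so on), while the `IntakeItem` type, the allow-list of content types, and the wording of the acknowledgement are hypothetical.

```python
from dataclasses import dataclass

# Content types accepted for processing (an assumed allow-list).
ACCEPTED_TYPES = {"application/pdf", "image/jpeg", "image/png"}


@dataclass
class IntakeItem:
    sender: str        # WhatsApp address of the sender
    media_url: str     # URL from which the media file can be downloaded
    content_type: str  # MIME type reported in the webhook


def parse_incoming_media(form: dict) -> list:
    """Turn one webhook payload into zero or more intake items."""
    count = int(form.get("NumMedia", "0"))
    items = []
    for i in range(count):
        content_type = form.get(f"MediaContentType{i}", "")
        url = form.get(f"MediaUrl{i}", "")
        if url and content_type in ACCEPTED_TYPES:
            items.append(IntakeItem(form.get("From", ""), url, content_type))
    return items


def acknowledgement(items: list) -> str:
    """Build the status reply sent back to the sender (hypothetical wording)."""
    if not items:
        return "No supported document found. Please send a PDF or an image."
    return f"Received {len(items)} document(s). Processing has started."
```

In a real deployment the parsed items would be pushed onto a queue for classification, and the acknowledgement would be sent back through the messaging API.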

Document classification powered by AI, even through messy formats

Once a document makes its way into the system, the very first step is classification. We developed an AI document classification layer that determines the document type based on its content, rather than just relying on the filename. This was important because files often came with inconsistent names like “scan1,” “IMG_2345,” or forwarded files that had lost their original metadata.

To classify documents, we used a mix of OCR output, layout cues, and GPT-4-based document understanding. This allowed us to categorize documents into types such as bills of lading, commercial invoices, packing lists, certificates of origin, customs declarations, delivery orders, and arrival notices. The classification logic was designed to work with PDFs, scans, and photos, including partially cropped images and documents with stamps or handwritten notes.

This classification step was key because the extraction pipeline relies on knowing which field schema to apply. Plus, it significantly reduced the workload for operators, as they no longer had to manually sort documents before entering data.
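To make the idea of content-based classification concrete, here is a deliberately simplified sketch. It swaps in plain keyword scoring over OCR text as a stand-in for the GPT-4 and layout-based approach described above, so it shows the shape of the step rather than the actual method. The cue lists and the `min_score` threshold are hypothetical.

```python
# Hypothetical keyword cues per document type. The real pipeline combined
# OCR output, layout cues, and GPT-4; this is only a stand-in.
CUES = {
    "bill_of_lading": ["bill of lading", "port of loading", "vessel"],
    "commercial_invoice": ["commercial invoice", "invoice no", "unit price"],
    "packing_list": ["packing list", "net weight", "no. of packages"],
}


def classify(ocr_text: str, min_score: int = 2) -> str:
    """Return the best-matching document type, or "unknown" if evidence is weak."""
    text = ocr_text.lower()
    scores = {
        doc_type: sum(cue in text for cue in cues)
        for doc_type, cues in CUES.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] >= min_score else "unknown"
```

Note that the filename never enters the decision, which is the point made above: a file called "IMG_2345" is classified by what is on the page, and weak evidence falls through to "unknown" so that an operator or a stronger model can decide.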

A pre-processing pipeline designed for handling low-quality images and documents in multiple languages

Document quality turned out to be one of the toughest challenges we faced in the project, so we created a special pre-processing pipeline before OCR. Many of the incoming files were phone pictures taken at odd angles, often with poor lighting conditions, shadows, stamps over the text, handwritten notes, or even folded paper edges. Additionally, some documents featured English, Arabic, and sometimes Chinese text, especially for shipments coming from China.

To tackle this issue, we integrated a range of image pre-processing steps like skew correction, contrast enhancement, noise reduction, orientation adjustments, and cleanup of specific regions whenever we could. The aim wasn’t about making the documents look pretty; it was all about boosting OCR reliability for the crucial fields.

We also crafted the OCR workflow to handle multilingual text extraction and document regions in a way that varied based on the type of document. This enhancement significantly improved both the classification accuracy and the field extraction process. Without that pre-processing layer, the subsequent AI logic would have had to exert too much effort dealing with quality issues.

This part of the system illustrates why OCR automation projects in logistics work best when OCR is treated as a pipeline rather than a single API call.
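As a toy illustration of the pipeline idea, the sketch below chains two of the steps named above, noise reduction and contrast enhancement, over a grayscale image stored as nested lists. It is pure Python for readability only. A production pipeline would use an imaging library, and the specific filters here (a 3-pixel horizontal median and a min-max contrast stretch) are assumptions, not the filters used in the project.

```python
def median3(img):
    """Noise reduction: replace each inner pixel with the median of its row neighbors."""
    out = []
    for row in img:
        new_row = row[:]
        for x in range(1, len(row) - 1):
            new_row[x] = sorted(row[x - 1:x + 2])[1]
        out.append(new_row)
    return out


def stretch_contrast(img):
    """Contrast enhancement: rescale pixel values to span the full 0-255 range."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:
        return [row[:] for row in img]  # flat image, nothing to stretch
    return [[(p - lo) * 255 // (hi - lo) for p in row] for row in img]


def preprocess(img, steps=(median3, stretch_contrast)):
    """Run the image through each step in order before handing it to OCR."""
    for step in steps:
        img = step(img)
    return img
```

Keeping each step as a separate function makes it easy to reorder steps or apply a different chain per document type.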

Smart OCR and data extraction tailored for every type of document

After we pre-processed the documents, we used the Google Cloud Vision API to perform OCR. Next, we leveraged GPT-4 for understanding documents and implemented rule-assisted extraction to pinpoint and pull out key fields. The fields we extracted varied based on the document type and included shipment numbers, container numbers, weights, dimensions, as well as details regarding consignees and shippers, HS codes, invoice values, and information about origins and destinations.

Rather than relying on just one extraction method, we adopted a multi-faceted approach. For highly structured fields, such as specific reference numbers, we applied pattern and rule checks. For fields that had variable layouts and required more context, like consignee details or line-item interpretations, we utilized AI-based extraction techniques. This mix helped enhance both the accuracy of our results and their explainability.
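One concrete example of a rule check for a structured field is the check digit on shipping container numbers. The sketch below follows the ISO 6346 scheme (four letters, six digits, one check digit). Whether the project used this exact check is an assumption, and the function name is illustrative.

```python
import re


def _letter_values():
    """ISO 6346 letter values: A=10 upward, skipping multiples of 11."""
    values, v = {}, 10
    for ch in "ABCDEFGHIJKLMNOPQRSTUVWXYZ":
        if v % 11 == 0:
            v += 1
        values[ch] = v
        v += 1
    return values


LETTER_VALUES = _letter_values()
CONTAINER_RE = re.compile(r"^[A-Z]{4}\d{7}$")


def is_valid_container_number(raw: str) -> bool:
    """Check the format and the check digit of a container number."""
    code = re.sub(r"[\s-]", "", raw.upper())  # tolerate spaces and hyphens from OCR
    if not CONTAINER_RE.match(code):
        return False
    total = sum(
        (LETTER_VALUES[ch] if ch.isalpha() else int(ch)) * 2 ** i
        for i, ch in enumerate(code[:10])
    )
    return total % 11 % 10 == int(code[10])
```

A check like this catches many single-character OCR misreads before the value reaches an operator, and a failure is easy to explain: the number is simply not internally consistent.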

Additionally, we created document-specific extraction templates and schemas to clarify which fields were mandatory, optional, or conditional. This setup was a big help for the validation engine when it determined whether a document was complete enough to be posted automatically or if it needed to be reviewed for exceptions.
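A schema of this kind might look like the sketch below. The field names, the `customs_clearance` flag that triggers the conditional rule, and the two routing labels are all hypothetical and shown only to illustrate how mandatory, optional, and conditional fields can drive the auto-post decision.

```python
from dataclasses import dataclass, field


@dataclass
class DocumentSchema:
    mandatory: set
    optional: set = field(default_factory=set)
    # Maps a field name to a predicate that says when the field is required.
    conditional: dict = field(default_factory=dict)


SCHEMAS = {
    "commercial_invoice": DocumentSchema(
        mandatory={"invoice_number", "invoice_value", "consignee", "shipper"},
        optional={"incoterms"},
        conditional={"hs_code": lambda d: d.get("customs_clearance") == "yes"},
    ),
}


def missing_required(doc_type: str, extracted: dict) -> list:
    """List required fields that are absent or blank in the extracted data."""
    schema = SCHEMAS[doc_type]
    required = set(schema.mandatory)
    required |= {name for name, needed in schema.conditional.items() if needed(extracted)}
    return sorted(f for f in required if not str(extracted.get(f) or "").strip())


def route(doc_type: str, extracted: dict) -> str:
    """Decide whether a document can be posted automatically."""
    return "exception_review" if missing_required(doc_type, extracted) else "auto_post"
```

Missing optional fields never block posting, while a missing mandatory or triggered conditional field sends the document to review.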

Extracting bills of lading with shipping line awareness using template matching

Bills of lading presented quite a challenge since there’s no single universal format. Every shipping line has its unique structure, and even within a single line, layouts can differ based on the route or system used. To tackle this issue, we trained our system on over 50 different bill of lading formats and developed a template-matching layer. This layer first identifies the shipping line, then applies the best extraction strategy for that format.

As a result, we saw a reduction in extraction errors and an increase in speed since the system no longer needed to process each bill of lading as if it were unfamiliar. It also simplified system maintenance: when a new format emerged, we could simply add or adjust a template, rather than having to overhaul the entire extraction process.
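The two-stage idea (identify the shipping line, then apply that line's extraction strategy) can be sketched as a small template registry. Everything specific here is hypothetical: the line names, the marker phrases, and the regular expressions are placeholders, and the real templates were derived from the 50+ formats mentioned above.

```python
import re
from dataclasses import dataclass


@dataclass
class Template:
    line_name: str
    markers: tuple  # lowercase phrases that identify the shipping line
    patterns: dict  # field name -> regex with one capture group


TEMPLATES = [
    Template(
        "EXAMPLE LINE A",
        ("example line a",),
        {"bl_number": r"B/L No\.?\s*:\s*(\S+)",
         "gross_weight_kg": r"Gross Weight\s*:\s*([\d.]+)\s*KG"},
    ),
    Template(
        "EXAMPLE LINE B",
        ("example line b",),
        {"bl_number": r"Bill of Lading Number\s+(\S+)",
         "gross_weight_kg": r"Total Weight \(KGS\)\s+([\d.]+)"},
    ),
]


def match_template(text: str):
    """Stage 1: identify the shipping line from marker phrases."""
    lowered = text.lower()
    for template in TEMPLATES:
        if any(marker in lowered for marker in template.markers):
            return template
    return None


def extract_bl(text: str) -> dict:
    """Stage 2: apply the matched template's patterns to the OCR text."""
    template = match_template(text)
    if template is None:
        return {"template": None}  # caller falls back to generic extraction
    fields = {"template": template.line_name}
    for name, pattern in template.patterns.items():
        found = re.search(pattern, text, flags=re.IGNORECASE)
        fields[name] = found.group(1) if found else None
    return fields
```

Supporting a new format then means appending one `Template` entry, which mirrors the maintenance benefit described above.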

This was one of the most crucial engineering decisions made during the project, given that bills of lading are both high-volume and significantly impact downstream customs and shipment processing.

Data validation engine for cross-document use

Extracting data from individual documents was just the start of our challenge. The client’s operators were also spending a lot of time double-checking values across related documents. To help with this, we created a validation engine that compares the extracted fields across a shipment’s document set and highlights any discrepancies.

For instance, the system ensures that the weights match between the bill of lading and the packing list, confirms that consignee details are consistent across all documents, and validates that key references are accurate. It also checks field formats and completeness rules before pushing the data into the logistics system.

This extra validation layer cut down on avoidable errors. Rather than posting the extracted data blindly, the system auto-posts only records that are high-confidence and internally consistent, and flags the rest as exceptions that need human verification. This is one key reason for the large improvement in overall accuracy.
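A minimal sketch of two such cross-document checks is shown below. The field names, the 1% weight tolerance, and the name-normalization rule are assumptions made for illustration. The real engine covered more fields and rules.

```python
import re


def normalize_name(name: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace for comparison."""
    return " ".join(re.sub(r"[^a-z0-9 ]", "", name.lower()).split())


def validate_shipment(docs: dict, weight_tolerance: float = 0.01) -> list:
    """Compare extracted fields across a shipment's documents and list discrepancies."""
    issues = []

    bl = docs.get("bill_of_lading")
    pl = docs.get("packing_list")
    if bl and pl:
        w_bl = float(bl["gross_weight_kg"])
        w_pl = float(pl["gross_weight_kg"])
        if abs(w_bl - w_pl) > weight_tolerance * max(w_bl, w_pl):
            issues.append(
                f"gross weight mismatch: bill of lading {w_bl} kg vs packing list {w_pl} kg"
            )

    names = {normalize_name(d["consignee"]) for d in docs.values() if d.get("consignee")}
    if len(names) > 1:
        issues.append("consignee differs across documents")

    return issues
```

An empty list means the document set is internally consistent. Any entry becomes an exception for an operator to confirm.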

Managing exceptions through WhatsApp instead of a separate dashboard workflow

One of the standout features of our system was the exception workflow. Whenever the AI wasn’t quite sure about a certain field or if a validation check didn’t go through, it didn’t require operators to go through the whole document again. Instead, it would send a specific field back to the operator through WhatsApp for confirmation.

For instance, an operator might get a message asking them to verify a container number or an invoice value, complete with the extracted value and relevant document context. They could easily reply on WhatsApp, and the correction would be recorded and updated in the system.
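The exchange described above could be modeled along these lines. The message wording, the "reply OK to accept" convention, and the record layout are hypothetical. The source only states that operators confirm or correct a single field by replying on WhatsApp.

```python
from dataclasses import dataclass


@dataclass
class FieldException:
    shipment_ref: str
    doc_type: str
    field_name: str
    extracted_value: str


def build_prompt(exc: FieldException) -> str:
    """Compose the confirmation request sent to the operator."""
    return (
        f"Shipment {exc.shipment_ref} ({exc.doc_type}): please confirm {exc.field_name}.\n"
        f"Extracted value: {exc.extracted_value}\n"
        "Reply OK to accept, or reply with the correct value."
    )


def apply_reply(exc: FieldException, reply: str) -> dict:
    """Turn the operator's reply into a structured correction record."""
    text = reply.strip()
    if not text:
        raise ValueError("empty reply")
    accepted = text.upper() in {"OK", "YES", "Y"}
    final = exc.extracted_value if accepted else text
    return {
        "shipment_ref": exc.shipment_ref,
        "doc_type": exc.doc_type,
        "field": exc.field_name,
        "extracted": exc.extracted_value,
        "final": final,
        "corrected": final != exc.extracted_value,
    }
```

Because each reply yields a record holding both the extracted and the final value, the same exchange that unblocks the shipment also produces labeled data for later tuning.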

We originally set up the error review within a web dashboard, but the adoption rate among operators was quite low. They found it much quicker to confirm fields via WhatsApp, as that was already part of their intake workflow. After we redesigned the correction process to make it easier to reply on WhatsApp, we saw a significant increase in adoption and a boost in the speed of corrections.

This adjustment also sped up model improvements because we gathered a lot more structured corrections. This is one of the clearest examples in this project where the design of the workflow was just as important as the performance of the AI model.

Automatic system population, even without an API in the legacy logistics software

The logistics system used by our client didn’t have an API, which is often the case when working with older desktop software in operations teams. Instead of just stopping at data extraction and exports, we created a database-level integration layer. This layer maps validated fields to the client’s system schema and allows us to write records directly to the underlying database after performing integrity checks.

This process required us to meticulously map the fields, ensure transaction safety, and manage rollback handling, since direct database integration comes with higher risks than a typical API. We collaborated closely with the client’s IT team to validate the mapping rules and conducted thorough testing in a staging environment before allowing production writes.

This integration was crucial for the project’s return on investment (ROI). Without it, if the operators had to manually copy data from our tool into their system, much of the advantage would have been lost.
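The transaction-safety idea can be illustrated with a small sketch. It uses SQLite from the Python standard library purely as a stand-in, since the source does not name the client's database engine. The `SHIPMENTS` table, its column names, and the field map are hypothetical.

```python
import sqlite3

# Hypothetical mapping from validated field names to legacy column names.
FIELD_MAP = {
    "shipment_ref": "SHP_REF",
    "container_number": "CNTR_NO",
    "gross_weight_kg": "GROSS_WT",
}


def post_records(conn: sqlite3.Connection, records: list) -> bool:
    """Write a batch of validated records atomically; return False if rolled back."""
    columns = list(FIELD_MAP.values())
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO SHIPMENTS ({', '.join(columns)}) VALUES ({placeholders})"
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            for record in records:
                if not record.get("shipment_ref"):
                    raise ValueError("shipment_ref is required")
                conn.execute(sql, [record.get(key) for key in FIELD_MAP])
    except (ValueError, sqlite3.Error):
        return False
    return True
```

Treating the batch as all-or-nothing means a constraint violation on one row leaves the legacy tables exactly as they were, which is the property that matters most when writing underneath an application that has no API.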

Admin panel dashboard, daily reports, and ongoing learning pipeline

We developed a user-friendly React.js admin dashboard tailored for our operations and management teams, allowing them to easily keep track of processing volumes, accuracy rates, exception counts, turnaround times, and operator productivity. Additionally, we incorporated daily summary reports that provide supervisors with a quick glimpse into throughput, bottlenecks, and trends in errors.

Simultaneously, we established a training feedback pipeline that leverages operator corrections to enhance extraction accuracy for the future. We capture structured corrections from WhatsApp exception replies and dashboard validations, normalize them, and incorporate them into our retraining and rule-tuning workflows.

This ongoing improvement process has transformed the system into a valuable operational asset that evolves with use, instead of being just a static OCR tool that tends to degrade as document variations increase.
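One simple way to use normalized corrections is to compute, per document type and field, how often operators had to change the extracted value. The sketch below assumes a hypothetical event layout (`doc_type`, `field`, `extracted`, `final`, `source`). It shows the normalization and aggregation step only, not any retraining.

```python
from collections import Counter


def normalize_correction(raw: dict) -> dict:
    """Bring a correction event from WhatsApp or the dashboard into one shape."""
    return {
        "doc_type": raw["doc_type"].strip().lower(),
        "field": raw["field"].strip().lower(),
        "extracted": raw["extracted"].strip(),
        "final": raw["final"].strip(),
        "source": raw.get("source", "whatsapp"),
    }


def correction_rates(events: list) -> dict:
    """Share of reviewed values that operators changed, per (doc_type, field)."""
    totals, corrected = Counter(), Counter()
    for event in map(normalize_correction, events):
        key = (event["doc_type"], event["field"])
        totals[key] += 1
        if event["extracted"] != event["final"]:
            corrected[key] += 1
    return {key: corrected[key] / totals[key] for key in totals}
```

Fields with the highest correction rate are natural first candidates for new rules, templates, or prompt changes.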

The Challenges We Encountered and How We Resolved Them

Our first big hurdle was the quality of the documents. Many of the files were just phone pictures taken at awkward angles and in bad lighting, with stamps covering text and handwritten notes scribbled over them. We also had multilingual documents to deal with, and not all of the OCR results were clean enough for us to rely on.

To tackle this, we created a special pre-processing pipeline that corrected and enhanced the images before running OCR. We designed the extraction logic to handle the noise from partial OCR results. Plus, we implemented field extraction rules that were aware of the document types, allowing the system to retrieve important fields even when parts of the page were a bit messy.

The second challenge was the variation in bills of lading. There aren’t any universal formatting standards among shipping lines, so the layouts vary widely. This meant that a one-size-fits-all extractor didn’t perform very well.

We tackled this by training on over 50 different bill of lading formats, and we added template matching that first identifies the shipping line before applying the appropriate extraction strategy. As a result, we were able to lower both the error rates and the volume of exceptions for one of the most frequently used document types.

The third challenge we faced was integrating with the client’s old logistics system, which didn’t have an API. This posed both a process and technical risk because the client needed complete automation, not just extracted CSV files.

We created a database-level integration with strict field mapping, validation, and staged rollout controls. We began with read-only verification, then progressed to controlled write operations in a staging environment, and finally enabled production posting. This approach minimized risk while still achieving full automation.

A fourth challenge cropped up after going live: getting users to adopt the correction workflow. Though our initial dashboard-based correction process was technically sound, it was too slow for the operators. By redesigning the corrections around WhatsApp replies, operator adoption skyrocketed from 40% to 95%, and we started to gather a lot more training data. This improvement enhanced the system faster than any individual model adjustment.

The Results — Tangible Outcomes

The results were clear within just weeks after the production rollout and became even more noticeable as we incorporated additional document types.

The most significant operational benefit was a reduction in labor for repetitive typing tasks. The manual data entry workload decreased by a remarkable 85%, enabling the client to trim down from 8 operators to just 2. These two team members now concentrate primarily on handling exceptions and overseeing processes rather than managing full document processing. This shift not only led to cost savings but also lessened the risk of backlog and bolstered resilience during busy periods.

We also saw a dramatic improvement in processing speed. The average time to process a set of documents per shipment dropped from 35-45 minutes to just 4-6 minutes. Furthermore, the average turnaround time for documents went down to under 3 minutes from receipt to system entry for standard cases. This allowed operations downstream to speed up and minimize avoidable delays throughout the shipment lifecycle.
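For a sense of scale, the per-shipment times can be turned into weekly operator hours. This is a back-of-envelope calculation using only figures stated in this article (about 500 shipments per week, 35-45 minutes before, 4-6 minutes after). It treats 500 as a flat weekly volume, which is a simplifying assumption.

```python
SHIPMENTS_PER_WEEK = 500  # "over 500 shipments each week", taken as a round figure


def weekly_hours(minutes_per_shipment: float) -> float:
    """Operator hours per week spent on document processing."""
    return SHIPMENTS_PER_WEEK * minutes_per_shipment / 60


before = (weekly_hours(35), weekly_hours(45))  # manual process
after = (weekly_hours(4), weekly_hours(6))     # automated process with exceptions

# Most conservative and most favorable reduction in processing time.
reduction_low = 1 - after[1] / before[0]
reduction_high = 1 - after[0] / before[1]

print(f"before: {before[0]:.0f}-{before[1]:.0f} h/week")
print(f"after:  {after[0]:.0f}-{after[1]:.0f} h/week")
print(f"reduction: {reduction_low:.0%} to {reduction_high:.0%}")
```

Under these assumptions the weekly effort falls from roughly 290-375 hours to roughly 33-50 hours, a reduction of about 83% to 91% in processing time. That range brackets the 85% figure reported for manual data entry, although the two measure slightly different things.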

Accuracy improved as well. The client’s effective data accuracy rose from around 88-92% to 97.5% following extraction, validation, and exception confirmation. This helped cut down on customs-related rework, reduced penalty risks, and minimized client complaints.

Moreover, capacity increased without needing to hire more staff. The company scaled from managing about 500 shipments per week to supporting over 2,000 shipments per week while maintaining the same staffing model for document operations. During peak times, the system handled more than 2,000 documents daily without any degradation in performance.

From a cost perspective, the client estimated a remarkable 70% reduction in operational costs for data entry. Even better, they established a reliable document processing pipeline, moving away from an unpredictable queue that used to grow during busy seasons.

Client satisfaction also received a boost; scores climbed from 3.5/5 to 4.6/5, mainly due to quicker document processing and fewer errors related to data entry. In the logistics field, clients tend to notice improvements in speed and accuracy before anything else catches their attention.

For us, this project stands out as a great example of AI document processing development and AI-powered data entry automation tailored for a high-volume logistics scenario. It also highlights why aTeam Soft Solutions is often chosen when companies seek automation that aligns with real-world operations rather than just theoretical workflows.

Summary of the Technology Stack 

  • Back-end: Python (FastAPI) for document intake orchestration, classification, extraction workflows, validation engine, and integrations
  • AI / Document Understanding: OpenAI GPT-4 for document classification and contextual field extraction
  • OCR: Google Cloud Vision API for text extraction from PDFs, scans, and images
  • Messaging / Intake: WhatsApp Business API via Twilio for document intake and exception confirmation workflows
  • Database: PostgreSQL for document metadata, extracted fields, validations, audit logs, and processing history
  • Queue / Performance: Redis for task queuing, caching, and asynchronous processing coordination
  • Document Storage: AWS S3 for raw documents, processed assets, and archival storage
  • Admin Interface: React.js dashboard for operations monitoring, accuracy tracking, exceptions, and reporting
  • Infrastructure / Deployment: Docker on AWS ECS for scalable containerized services
  • Testing: OCR/extraction validation testing, integration testing, exception workflow QA, and performance/load testing

What We Learned

The most important takeaway from our experience was about designing a correction workflow. We noticed that the AI improved significantly faster when operators were able to correct errors directly within WhatsApp, the channel they were already familiar with. Initially, our approach was technically sound, but it didn’t align with user behavior. We were flagging errors on a web dashboard, which operators only accessed when they had spare time. As a result, our adoption rate hovered around 40%, limiting the feedback we received.

However, once we revamped the correction process to allow operators to reply directly in WhatsApp to verify uncertain fields, our adoption skyrocketed to 95%. This shift provided us with approximately 3x more structured correction data, and extraction accuracy improved about 3x faster than before, thanks to learning from a larger volume of real corrections.

This experience has transformed how we design human-in-the-loop AI systems at aTeam Soft Solutions. We now focus on creating correction workflows that prioritize the user’s natural operating channel first, followed by admin dashboards. For logistics and operations teams, it’s all about convenience and speed to ensure that feedback loops are effective.

We also learned that for logistics automation projects, it’s essential to evaluate integration constraints (like the absence of an API) as early as the model feasibility stage. The AI may work well in isolation, but the real business impact hinges on how well the end-to-end system operates.

Collaborate With Us

If your logistics team is spending a lot of time manually entering shipment documents, handling OCR errors, or facing backlogs during busy seasons, we can help you automate your workflow without making your partners change how they send files. aTeam Soft Solutions builds systems for AI document processing, OCR automation in logistics, and AI-powered data entry automation that are tailored to fit real operations.

Whether you’re looking for a WhatsApp-based intake flow, integration with legacy systems, or a comprehensive logistics automation solution, we can often outline the initial production phase within a week of our first conversation. If you’re considering a software development company in India or a web development company in India for document automation, feel free to share a few sample document sets and your current process with us, and we’ll show you where automation can save the most time and prevent the most errors.

Shyam S March 10, 2026