January 9, 2025

Document Processing: Experience and Learnings

At Ballast Lane Applications, we are helping several clients optimize their document processing. One receives purchase orders in non-standardized formats that need to be processed and entered; another processes handwritten applications. We have been using Large Language Models (LLMs), which are specialized Machine Learning (ML) models, to extract and summarize the information in these documents.

For the non-technical: ML is a broad field of study within AI that involves training algorithms to perform specific tasks by learning patterns from data. ML models are trained on data such as tabular datasets, images, or time series. What makes a program ML is that the intelligence is not written into the program but learned from the data the system is exposed to. ML techniques can be applied to a wide range of problems, such as image recognition, data classification, natural language processing, fraud detection, and recommendation systems.

LLMs are a type of deep learning model trained on massive amounts of textual data, such as books, websites, and articles. These models are designed to understand and generate human-like text.

A quick overview of what we do for our clients:

For the client processing handwritten applications, we analyze documents using Amazon Textract, an ML-powered service that extracts text, handwriting, and data from PDFs and images containing structured content such as forms, invoices, and tables. A benefit of Textract is that you can start right away without much prior knowledge, since it is all set up. To get results that are more specific to the business, we create adapters, train them, and improve the results over time. After getting the output from Textract, we post-process it and map it to our models.
For the client receiving and entering purchase orders in non-standardized formats, we use Claude, an LLM developed by Anthropic. We gather and preprocess the input, generate an appropriate prompt, extract the data and text using a script we wrote, send everything to the LLM, then validate and use the response. This works for a PDF or an image.
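
The purchase order steps above can be sketched roughly as follows. This is a minimal illustration, not our production code: the field names (`po_number`, `vendor`, `line_items`), the prompt wording, and the `call_llm` parameter are all assumptions made for the example. The model call is passed in as a plain function so the sketch runs without an API key; in practice it would wrap a call to the LLM provider's SDK.

```python
import json
from typing import Callable

# Hypothetical fields for illustration; a real purchase order schema will differ.
REQUIRED_FIELDS = ("po_number", "vendor", "line_items")


def build_prompt(document_text: str) -> str:
    """Ask the model for JSON only, so the reply can be parsed mechanically."""
    return (
        "Extract the purchase order below into JSON with exactly these keys: "
        + ", ".join(REQUIRED_FIELDS)
        + ". Reply with JSON only.\n\n"
        + document_text.strip()
    )


def parse_response(raw: str) -> dict:
    """Parse the model reply and reject it if it is not usable."""
    # Models sometimes wrap JSON in prose, so keep only the outermost {...}.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object in model reply")
    data = json.loads(raw[start : end + 1])
    missing = [field for field in REQUIRED_FIELDS if field not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return data


def process_order(document_text: str, call_llm: Callable[[str], str]) -> dict:
    """Preprocessed text -> prompt -> LLM -> validated dict."""
    return parse_response(call_llm(build_prompt(document_text)))


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call, used only to exercise the pipeline."""
    return 'Here you go: {"po_number": "PO-1", "vendor": "Acme", "line_items": []}'


order = process_order("Purchase order PO-1 from Acme", fake_llm)
```

Asking for JSON with a fixed set of keys is one way to make the "validate" step mechanical: a reply that cannot be parsed, or that is missing a field, raises an error instead of flowing silently into the client's system.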

It’s amazing to look over someone’s shoulder (figuratively speaking, of course) and watch handwritten scribble get extracted, summarized, and presented in a meaningful way.

Using tools like this does not mean you can “set it and forget it”; in fact, it’s quite the opposite. The team members need to:

Know the client’s business and the data they are working with.
Be available for constant validation and tweaks. Some errors are inevitable; for example, a price can come out wrong when a document uses a different unit than expected.
Constantly review the tools and assess whether there are lower-cost, better alternatives out there that could provide even more value to the client.
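
The unit-related price error mentioned above is the kind of thing a simple automated check can flag for human review. Here is a minimal sketch, assuming each extracted line item carries a quantity, a unit, a unit price, and a line total (all hypothetical field names), and assuming a small hand-maintained table of pack sizes. It is an illustration of the idea, not the validation we actually run.

```python
from decimal import Decimal

# Hypothetical unit table: how many single items each unit of measure holds.
UNITS_PER = {"each": Decimal(1), "dozen": Decimal(12), "case": Decimal(24)}


def check_line_item(item: dict, tolerance: Decimal = Decimal("0.01")) -> list[str]:
    """Return the problems found; an empty list means the line looks consistent."""
    problems = []
    unit = str(item.get("unit", "")).lower()
    if unit not in UNITS_PER:
        problems.append(f"unknown unit: {unit!r}")
        return problems

    # Decimal avoids floating-point rounding surprises with currency.
    quantity = Decimal(str(item["quantity"]))
    unit_price = Decimal(str(item["unit_price"]))
    total = Decimal(str(item["total"]))

    expected = quantity * unit_price
    if abs(expected - total) <= tolerance:
        return problems
    problems.append(f"total {total} != quantity x unit_price {expected}")

    # If the two figures differ by exactly a pack size, the price was probably
    # quoted per item while the quantity was counted in packs, or vice versa.
    for name, size in UNITS_PER.items():
        if size == 1:
            continue
        if (abs(expected * size - total) <= tolerance
                or abs(total * size - expected) <= tolerance):
            problems.append(f"possible unit mix-up involving {name!r}")
    return problems


consistent = check_line_item(
    {"quantity": 2, "unit": "dozen", "unit_price": "6.00", "total": "12.00"}
)
suspicious = check_line_item(
    {"quantity": 2, "unit": "dozen", "unit_price": "0.50", "total": "12.00"}
)
```

In this example `consistent` comes back empty, while `suspicious` reports both the mismatched total and a likely mix-up around "dozen". A check like this does not replace the team's knowledge of the client's business; it only narrows down which documents a person needs to look at.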

The solutions we have been developing save time by automating repetitive tasks, allowing people to focus on higher-value activities. Also, faster processing can help businesses scale efficiently by eliminating data entry bottlenecks.

These are exciting times and I am pleased to be riding the coattails of my talented team members, who are willing to patiently explain the great work they are doing for our clients.