Document AI

How ThinkTrends' Document AI stands out from the rest
Jyotiska Biswas
December 1st, 2022
What is Document AI?

Document AI refers to techniques for automatically extracting information from documents such as PDFs. This is a difficult task with many nuances, because documents come in so many formats. It is nonetheless an important research area: time-consuming document review that is currently handled manually could be replaced by efficient automated processes powered by Document AI. This would allow for two things:
  1. Experts can spend more time applying their knowledge to analysis rather than sifting through documents
  2. Companies will free up resources to be spent in other areas

Parts of Document AI

Table Extraction

A very challenging aspect of Document AI is table extraction. Tables can show up in many different forms, with plenty of edge cases to throw off even the best Document AI model. What do you do when certain table cells are missing? Or when there is a table within a table? We've scratched our heads and whiteboarded many ideas to tackle these very problems.
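To make the missing-cell problem concrete, here is a minimal Python sketch of one small cleanup step: padding ragged rows so that every row has the same number of columns. It is an illustration only, not ThinkTrends' method. The function name `normalize_table` and the sample rows are hypothetical, and the sketch assumes that any cells missing from a row are trailing ones, which real documents do not guarantee.

```python
def normalize_table(rows):
    """Return a rectangular copy of `rows`, using None for missing cells.

    Assumption (for illustration only): rows arrive as lists of strings,
    and a row that is too short is missing its trailing cells.
    """
    width = max((len(row) for row in rows), default=0)
    normalized = []
    for row in rows:
        # Treat blank or whitespace-only cells as missing.
        cells = [cell.strip() or None for cell in row]
        # Pad short rows so every row has `width` columns.
        cells += [None] * (width - len(cells))
        normalized.append(cells)
    return normalized


# Hypothetical extracted rows: one row is short, one has a blank cell.
raw_rows = [
    ["Item", "Qty", "Price"],
    ["Widget", "2"],
    ["Gadget", " ", "9.99"],
]
table = normalize_table(raw_rows)
```

Marking missing cells explicitly with `None` keeps column positions aligned, so later steps can tell "no value" apart from a value that shifted into the wrong column. Nested tables need more than this, since the inner table has to be detected and extracted separately.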

Handwritten Text

Documents often contain handwritten text such as signatures. The model has to account for these so that the information is not lost.

Classifying Text

What if you need to know which text is an address? Or a title? It’s helpful when a document is clearly formatted with tags, but that cannot be relied on. Instead, the models need to be trained on robust, annotated data sets in order to properly determine what kind of data is being pulled out.
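As a rough illustration of learning text types from annotated examples, the Python sketch below trains a tiny naive Bayes classifier with add-one smoothing on a handful of labeled snippets. Everything in it is an assumption made for the example: the class name `NaiveBayesTextClassifier`, the `tokenize` helper, and the six training snippets are hypothetical, and a production model would need far more data and richer features such as layout and position on the page.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase word tokens; every digit run becomes the shared token '<num>'."""
    tokens = re.findall(r"[a-z]+|\d+", text.lower())
    return ["<num>" if tok.isdigit() else tok for tok in tokens]


class NaiveBayesTextClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.label_counts = Counter()             # examples seen per label
        self.token_counts = defaultdict(Counter)  # token frequencies per label
        self.vocab = set()

    def fit(self, examples):
        for text, label in examples:
            self.label_counts[label] += 1
            for tok in tokenize(text):
                self.token_counts[label][tok] += 1
                self.vocab.add(tok)

    def predict(self, text):
        tokens = tokenize(text)
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(count / total)
            denom = sum(self.token_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.token_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical annotated snippets, standing in for a real labeled data set.
training_examples = [
    ("12 Main Street, Springfield", "address"),
    ("450 Oak Avenue, Suite 3", "address"),
    ("78 Elm Road, Riverside", "address"),
    ("Quarterly Financial Report", "title"),
    ("Annual Sales Summary", "title"),
    ("Employee Onboarding Guide", "title"),
]

clf = NaiveBayesTextClassifier()
clf.fit(training_examples)
print(clf.predict("99 Pine Street"))
print(clf.predict("Annual Report"))
```

With so few examples the classifier only picks up surface cues, such as leading numbers and words like "street", which is why the quality and coverage of the annotated data matter so much.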

Extracting Meaning

Extracting information from a document is one part of this process, but the more valuable part is making a decision or taking an action based on the extracted information. This is where

Document AI has become a familiar area for ThinkTrends as we've gained experience pulling tricky information out of a variety of documents. We don't use a one-size-fits-all model (we know that this doesn't work). Instead, we've developed a general approach for creating AI models that are applied to specific cases. As a result, we've been able to help businesses automate their workflows, processing thousands of documents on their behalf.