The Large Language Model GPT is simply fascinating, and by now, everyone has let it write a poem for their (grand)parents’ wedding anniversary. But what are possible business use cases for Large Language Models? Especially in medium-sized businesses, so-called Document AI (intelligent document processing with AI, often based on the Large Language Model GPT) has big potential. In this article, we discuss how such Document AI works and how specific challenges (reliability of output, access management, data security) can be overcome.
Before We Get Started: What is Document AI?
When we talk about Document AI in this article, we mean chatbots that use Large Language Models (usually GPT) to process internal documents. These documents are usually not open to the public – e-mails, contracts, or patents, for example. If your AI for documents is based on GPT, it can also be referred to as Document GPT.
And What Are Large Language Models?
A Large Language Model (LLM for short) is a type of artificial intelligence system designed to understand and generate human-like text at scale. The special thing about LLM-based chatbots is their ability to understand and generate natural language – it feels like a real conversation. Currently, GPT is the most famous LLM. It stands for “Generative Pre-trained Transformer” and refers to a specific architecture of artificial neural networks developed by OpenAI. OpenAI had been working on GPT models for some time, but with the release of the chatbot ChatGPT (based on GPT-3.5) at the end of 2022, the technology entered the mass market.
How Can Businesses Benefit From Document AI?
Working with AI, and GPT in particular, has already become prevalent in businesses. However, this is frequently limited to optimizing core business operations (smart products, IIoT in production, etc.). Businesses can benefit from using Document AI for internal processes and operations by making internal knowledge more accessible and efficiently utilized. AI for documents can speed up or simplify numerous processes, such as:
…Onboarding
…Sales
…Maintenance
…Legal Advice
…Customer Support
The best Document AI use case depends on several factors. Fill out this questionnaire to get an individual assessment of where the greatest potential lies for your business.
Particularly for medium-sized businesses, the digitalization of these areas has not been profitable until now. Developing an AI for documents of the necessary quality simply required a very high level of development effort. Relative to the number of salespeople, customer service representatives, and so on, the costs would not have been proportionate. With GPT, this has changed.
The Benefits of Document AI
Artificial intelligence doc search, which processes and indexes internal technical documents, can save a lot of time and resources. Why? Because finding the right document to answer a request takes a long time. According to this McKinsey study, highly skilled knowledge workers spend 19% of their work time searching for and gathering information. The documents are difficult to grasp – it takes a professional to understand what actions need to be derived – and relevant information is usually scattered across different places. These are exactly the things that LLMs, and GPT in particular, are good at:
- Document GPT can find relevant information quickly.
- AI for documents can consolidate all relevant information, even if it is located in different places or in different documents.
- When needed, information can be better formulated for understanding, as LLM-based document intelligence can also translate tech lingo. This is especially helpful for newcomers or non-experts.
In addition, Document AI services are accessible 24/7 and take care of “tedious” to-dos. Searching through documents is generally considered to be a less inspiring task. Instead, employees can focus on exciting questions (problem-solving, product innovation, etc.).
Nice, But How Does Document AI Work?
The advantages of Document AI are obvious. Accordingly, there are already a couple of providers offering proofs of concept (PoCs) for this. They all basically follow the same approach and use a pre-trained Large Language Model (LLM) to generate answers based on a specific document or its paragraphs. This is how they operate:
Preprocessing: The document or section is preprocessed to remove unnecessary formatting and divide it into smaller, more manageable segments such as paragraphs or sentences.
Indexing: These segments are then encoded with an embedding model and stored in a database (e.g. Pinecone).
Question encoding: The asked question is encoded with the same embedding model.
Contextual search: Usually, the embedding of the question is compared with the embeddings of the segments to identify relevant contexts. More advanced approaches use multiple search methods here.
Input formatting: The encoded question and context segments are concatenated and formatted into a single prompt. Essentially, the LLM is asked to answer the following question based on the following context.
LLM query: Finally, the model is prompted to formulate the answer based on the provided context.
Moderation (optional): Most solutions do not yet include this part. However, you should consider adding moderation to check the answer for policy violations.
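The steps above can be sketched in a few lines of Python. Note that the embedding model here is a deliberately simplified bag-of-words stand-in, and the sample document and function names are our own illustrative assumptions – a real system would use a dedicated embedding model, a vector database such as Pinecone, and an LLM API call for the final answer.

```python
# Minimal sketch of the Document AI pipeline described above.
import math
import re
from collections import Counter

def preprocess(document: str) -> list[str]:
    """Step 1: split the document into paragraph segments."""
    return [p.strip() for p in document.split("\n\n") if p.strip()]

def embed(text: str) -> Counter:
    """Steps 2-3: toy 'embedding' as a bag-of-words vector (stand-in
    for a real embedding model applied to segments and question)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def contextual_search(question: str, segments: list[str], k: int = 2) -> list[str]:
    """Step 4: rank segments by similarity to the question."""
    q = embed(question)
    return sorted(segments, key=lambda s: cosine(q, embed(s)), reverse=True)[:k]

def build_prompt(question: str, context: list[str]) -> str:
    """Step 5: concatenate question and context into a single prompt,
    which would then be sent to the LLM (step 6)."""
    ctx = "\n".join(f"- {c}" for c in context)
    return f"Answer the question based on the context.\nContext:\n{ctx}\nQuestion: {question}"

doc = "The warranty covers two years.\n\nReturns are accepted within 30 days.\n\nSupport is available by e-mail."
segments = preprocess(doc)
question = "How long is the warranty?"
prompt = build_prompt(question, contextual_search(question, segments, k=1))
print(prompt)
```

The moderation step (step 7) would sit after the LLM response, checking the generated answer against your content policies before it reaches the user.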
At Motius, we also developed a Document GPT application that follows this principle.
And how do Document AI systems stay accurate as documents and language evolve? By continuous updates: newly added or changed documents are re-indexed, and the underlying embedding model and LLM can be updated or retrained on newer data as formats and terminology shift.
Arising Challenges for Document AI
So far, so good. However, as always, there are challenges, and currently, few existing solutions address them. As we have been working with LLMs for years, we have thoroughly analyzed the following challenges when it comes to AI for documents:
→ Reliability of the output: If you follow the schema explained above, finding the “right” paragraph is essential for the correctness of your answer. Without a paragraph that is relevant to the question, there is no meaningful answer. We recommend using not only the commonly used “Embedded Search” but also a conventional “Keyword Search.” In our experience, this significantly increases the reliability of your Document AI’s output.
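One way to combine embedded search with a conventional keyword search is a weighted blend of both scores. The sketch below is a simplified illustration under our own assumptions: the embedding similarity is a token-overlap stand-in for real vector similarity, and the weighting scheme is just one plausible choice.

```python
# Hedged sketch of a hybrid search: blending an "embedding" score
# with a conventional keyword-match score before picking a segment.
import re

def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z]+", text.lower()))

def embedding_score(question: str, segment: str) -> float:
    """Stand-in for cosine similarity between real embeddings
    (here: Jaccard overlap of token sets)."""
    q, s = tokens(question), tokens(segment)
    return len(q & s) / len(q | s) if q | s else 0.0

def keyword_score(question: str, segment: str) -> float:
    """Conventional keyword search: fraction of query terms
    that appear verbatim in the segment."""
    q = tokens(question)
    return sum(1 for t in q if t in tokens(segment)) / len(q) if q else 0.0

def hybrid_search(question: str, segments: list[str], w: float = 0.5) -> str:
    """Return the segment with the best blended score;
    w weights the embedding side against the keyword side."""
    return max(
        segments,
        key=lambda s: w * embedding_score(question, s) + (1 - w) * keyword_score(question, s),
    )
```

In practice, the keyword side catches exact terms (product codes, legal clause numbers) that embeddings can miss, which is where the reliability gain comes from.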
→ Clean integration: Integrating Document AI tools into your existing infrastructure is not trivial. In addition to a fragmented tool landscape, good access management is essential. Off-the-shelf Document AI tools, as they are currently popping up everywhere, are not tailored to your (operational) environment. Customized tooling significantly increases security. At Motius, we have already supported several integrations and developed a Document AI solution that can be quickly and precisely tailored to your environment.
→ Data security: If you are processing sensitive data with your document intelligence tool, you want to be sure it is done safely. There are many reasons (GDPR compliance, ethical concerns, reputation, …) to ensure your data’s safety. In our experience, the best way to do this is to assess your data security requirements upfront.
Document AI for Critical Applications
The accuracy of the output, secure access management, and data security are particularly important in some areas, for example legal or medical applications. For such artificial intelligence doc search tools, it is crucial that only certain individuals have data access and that information and decisions are traceable. It must be clear what information decisions are based on – and, of course, ensured that this information is correct.
As mentioned above, we see ways to ensure these points. However, if certifiability is important, so-called knowledge graphs can be a better approach for your Document AI tool – either in addition to or instead of the setup above.
A knowledge graph is a semantic data structure that represents knowledge as entities, attributes, and relationships between those entities. This organizes information in a way that is readable for both humans and machines. Humans can therefore verify the information before it is unlocked for the bot: connections extracted by the AI from the documents can be reviewed before they are provided to the user.
With LLMs like GPT, context-aware knowledge graphs can be built, which helps eliminate false statements from a Document AI tool. Moreover, it is clear which entities a response is based on – for example, which legal precedents the tool considered when making its decision.
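The review-before-unlock idea can be illustrated with a minimal triple store: facts extracted by the AI start out unverified, and only facts a human has approved are visible to the bot. This is a simplified sketch, not a production knowledge graph; all entity and relation names are hypothetical examples.

```python
# Minimal sketch of a knowledge graph as (subject, relation, object)
# triples with a human-review flag.
from dataclasses import dataclass, field

@dataclass
class Triple:
    subject: str
    relation: str
    obj: str
    verified: bool = False  # a human reviewer flips this after checking

@dataclass
class KnowledgeGraph:
    triples: list[Triple] = field(default_factory=list)

    def add(self, subject: str, relation: str, obj: str) -> Triple:
        """Record a fact extracted by the AI; it starts out unverified."""
        t = Triple(subject, relation, obj)
        self.triples.append(t)
        return t

    def query(self, subject: str) -> list[Triple]:
        """Only verified facts are unlocked for the bot."""
        return [t for t in self.triples if t.subject == subject and t.verified]

kg = KnowledgeGraph()
fact = kg.add("Contract-42", "governed_by", "German law")
print(kg.query("Contract-42"))  # empty: not yet reviewed
fact.verified = True            # a human reviewer approves the fact
print(kg.query("Contract-42"))  # now the bot may use it
```

Because every answer can be traced back to specific verified triples, this structure also provides the traceability that legal and medical applications require.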
This leaves us with the last challenge mentioned above: data security. Which techniques manage and restrict access to sensitive documents processed by Document AI systems? Typically encryption, role-based access controls, and secure data storage practices, which together ensure that only authorized users can access specific documents.
Depending on your requirements, there are different ways to ensure the security of your data. If your data must not leave your company, you need an on-premise deployment. If your data may leave your company and even Europe, you can go with a commercial cloud provider. However, oftentimes this isn’t the case (e.g. due to GDPR regulations). In those cases, you should check whether your data can be shared with a European commercial cloud provider. Not an option? Then we’d suggest a VPC deployment for AI document processing.
Leverage LLMs for Your Business
Large Language Models and their application as internal Document AI will drive a lot of change – especially for small and medium-sized enterprises (SMEs), as models like GPT are an opportunity to advance digitization relatively easily. With Document AI based on GPT or other LLMs, you can accelerate resource-intensive processes and create further value-adding ones. Given the growing shortage of skilled workers, processes can be completed faster and more efficiently with the help of LLMs. This will not only free up capacities but also create a more attractive working environment, since employees can hand off repetitive or tedious tasks. That is a competitive advantage not to be underestimated in the notorious “war for talent”.
However, there are some challenges that you should keep in mind when creating an AI for (internal) documents. We can help you ensure reliable output and watertight access management. Let’s define and implement a value-adding use case for LLMs in your company.