{"id":311,"date":"2025-01-22T22:00:00","date_gmt":"2025-01-23T04:00:00","guid":{"rendered":"https:\/\/douglasstarnes.dev\/?p=311"},"modified":"2025-01-16T16:36:55","modified_gmt":"2025-01-16T22:36:55","slug":"getting-started-with-azure-ai-document-intelligence","status":"publish","type":"post","link":"https:\/\/douglasstarnes.dev\/index.php\/2025\/01\/22\/getting-started-with-azure-ai-document-intelligence\/","title":{"rendered":"Getting Started with Azure AI Document Intelligence"},"content":{"rendered":"\n<p>It&#8217;s amazing how much of the world&#8217;s data, especially the business world&#8217;s data, is still contained in documents, forms, and other physical paper media.  Until recently, extracting data from physical documents was a manual, time-consuming, and error-prone process.  Microsoft Azure AI Document Intelligence lets you leverage the power of artificial intelligence to automate the process of extracting data from physical documents.  In this post we will look at what it takes to get started.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">What is Azure AI Document Intelligence?<\/h3>\n\n\n\n<p>Azure AI Document Intelligence is part of the Azure AI Services.  The Azure AI Services allow almost any developer to add AI features to their applications while knowing little if anything about the inner workings of artificial intelligence.  A lot of the time, the &#8220;hard work&#8221; of using an AI model to make predictions is reduced to a single line of code.  The rest of the code related to AI services is boilerplate configuration and analyzing the results, which is what makes your application unique.<\/p>\n\n\n\n<p>Azure AI Document Intelligence offers solutions to the problem of extracting information from documents.  For example, Azure AI Document Intelligence can extract printed and handwritten text from documents and forms.  But AI Document Intelligence goes far beyond optical character recognition (OCR).  
AI Document Intelligence can recognize structures in a document such as headings, tables and paragraphs.  It can also extract data from forms and tell which checkboxes are selected.<\/p>\n\n\n\n<p>For common types of documents, Azure AI Document Intelligence offers a number of prebuilt models.  There are prebuilt models for invoices, contracts, business cards, and US tax forms, among others.  There is also a generic layout model that will extract text, tables, figures and selected checkboxes.  It can even recognize page numbers and table and figure captions.  And if you have other needs, you can train custom models as well.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Getting Started with Azure AI Document Intelligence<\/h2>\n\n\n\n<p>To use Azure AI Document Intelligence you must have an Azure account.  If you do not have one, you can sign up for a free Azure account at <a href=\"https:\/\/azure.microsoft.com\">https:\/\/azure.microsoft.com<\/a>.  New customers will receive a $200 credit for the first 30 days.  Then you can get 12 months of popular services &#8211; including Azure AI Document Intelligence &#8211; for free!<\/p>\n\n\n\n<p>With your Azure account, log in to the Azure Portal at <a href=\"https:\/\/portal.azure.com\">https:\/\/portal.azure.com<\/a>. You&#8217;ll need to provision an instance of AI Document Intelligence.  At the top of the portal, in the search box, look for <em>Document Intelligence<\/em>.  
<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"885\" src=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2a5577c8-1024x885.png\" alt=\"\" class=\"wp-image-312\" srcset=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2a5577c8-1024x885.png 1024w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2a5577c8-300x259.png 300w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2a5577c8-768x664.png 768w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2a5577c8.png 1048w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>On the next page, click the <strong>Create <\/strong>button.  You will be taken to a form to configure a new AI Document Intelligence instance.  Select or create a resource group and select a region.  Give the instance a unique name and select a pricing tier.  Again, you can get the free F0 tier of AI Document Intelligence for 12 months.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"436\" src=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2a58ac76-1024x436.png\" alt=\"\" class=\"wp-image-313\" srcset=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2a58ac76-1024x436.png 1024w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2a58ac76-300x128.png 300w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2a58ac76-768x327.png 768w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2a58ac76.png 1447w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Click the <strong>Review + create<\/strong> button at the bottom of the page to validate the configuration.  Once validation has succeeded, click the <strong>Create <\/strong>button to provision the new instance.  
There are other ways to provision Azure AI Document Intelligence such as multi-service instances and containers.  Those are beyond the scope of this post.  Click the <strong>Go to resource<\/strong> button when provisioning is complete.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"259\" src=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2a5b917d-1024x259.png\" alt=\"\" class=\"wp-image-314\" srcset=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2a5b917d-1024x259.png 1024w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2a5b917d-300x76.png 300w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2a5b917d-768x194.png 768w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2a5b917d-1536x389.png 1536w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2a5b917d-2048x519.png 2048w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\">Azure Document Intelligence Studio<\/h3>\n\n\n\n<p>The easiest way to see how Azure AI Document Intelligence works is to explore the features using Azure Document Intelligence Studio.  From the resource overview, under the <strong>Get Started<\/strong> tab, click the <strong>Go to Document Intelligence Studio<\/strong> button. 
<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"565\" src=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2a5e19de-1024x565.png\" alt=\"\" class=\"wp-image-315\" srcset=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2a5e19de-1024x565.png 1024w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2a5e19de-300x165.png 300w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2a5e19de-768x424.png 768w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2a5e19de-1536x847.png 1536w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2a5e19de-2048x1130.png 2048w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Azure Document Intelligence Studio lets you explore the features of Azure AI Document Intelligence without writing any code or installing any software.  You can explore the major features through a web-based portal.  Note that analyses you run in Azure Document Intelligence Studio count against your free quota or, on a paid tier, are billed as usage.<\/p>\n\n\n\n<p>Looking at the features, you can see the Document Analysis section, which has more generic models for extracting printed and handwritten text and structures such as tables and checkboxes.  Then there are a number of prebuilt models.  We will be looking at this section in just a minute.  And finally there are custom models for extracting data and also a custom model for classification.  <\/p>\n\n\n\n<p>For this post, let&#8217;s take a look at a couple of the prebuilt models. 
Click on the <strong>Try it out<\/strong> link for <strong>Invoices<\/strong>.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"660\" src=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2e5d25d4-1024x660.png\" alt=\"\" class=\"wp-image-323\" srcset=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2e5d25d4-1024x660.png 1024w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2e5d25d4-300x193.png 300w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2e5d25d4-768x495.png 768w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2e5d25d4-1536x990.png 1536w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2e5d25d4.png 1920w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>You will see a few sample images containing invoices. Select the first one and click the <strong>Run analysis<\/strong> button.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"437\" src=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2e55e69b-1024x437.png\" alt=\"\" class=\"wp-image-322\" srcset=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2e55e69b-1024x437.png 1024w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2e55e69b-300x128.png 300w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2e55e69b-768x328.png 768w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2e55e69b-1536x655.png 1536w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2e55e69b.png 1920w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Using AI Document Intelligence, the Document Intelligence Studio has extracted different entities that are commonly found on invoices such as contact information, items billed, payment 
information along with a summary of charges. Document Intelligence Studio has overlaid the boundaries of those entities onto the image of the invoice. The list on the right shows the value of each entity and a confidence score. So AI Document Intelligence is 95.1% sure that this invoice contains an amount due of 610.<\/p>\n\n\n\n<p>Of course, the provided samples are always going to work great. So let&#8217;s try an invoice that the model hasn&#8217;t seen before. Here is an invoice for a fictional purchase and company.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"795\" height=\"1024\" src=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2e5f6ab3-795x1024.png\" alt=\"\" class=\"wp-image-324\" srcset=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2e5f6ab3-795x1024.png 795w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2e5f6ab3-233x300.png 233w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2e5f6ab3-768x989.png 768w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2e5f6ab3-1193x1536.png 1193w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2e5f6ab3-1024x1319.png 1024w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2e5f6ab3.png 1219w\" sizes=\"auto, (max-width: 795px) 100vw, 795px\" \/><\/figure>\n\n\n\n<p>In Document Intelligence Studio, drag the image onto the space in the left sidebar.  
Click the <strong>Run analysis<\/strong> button again.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"506\" src=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2e60e434-1024x506.png\" alt=\"\" class=\"wp-image-326\" srcset=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2e60e434-1024x506.png 1024w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2e60e434-300x148.png 300w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2e60e434-768x379.png 768w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2e60e434-1536x759.png 1536w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2e60e434.png 1919w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Azure AI Document Intelligence has correctly identified the important entities on the invoice. It recognized the address, company name, billed items, and amounts, most of them with confidence scores of 90% or higher.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Using an SDK<\/h2>\n\n\n\n<p>The Azure Document Intelligence Studio is great for experimenting with the features of AI Document Intelligence.  But to integrate AI Document Intelligence into your application, you&#8217;ll need to access it with code.  Like all the Azure AI Services, AI Document Intelligence is exposed as a REST API.  But for common languages like C#, JavaScript, and Python (the language I will use) there are SDKs.  
Again, in a lot of scenarios, this reduces the &#8220;hard work&#8221; of making predictions to a single line of code and you can work with language native structures when analyzing the results instead of parsing JSON data.<\/p>\n\n\n\n<p>The SDK is distributed as a Python package and can be installed with pip.<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" data-code=\"$ pip install azure-ai-documentintelligence==1.0.0b4\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 
2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #88C0D0\">$<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">pip<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">install<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">azure-ai-documentintelligence==<\/span><span style=\"color: #B48EAD\">1.0<\/span><span style=\"color: #A3BE8C\">.0b4<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p>For this demo, the image or PDF of the invoice needs to be accessible at a URL. This could be in Azure Blob Storage, but to keep it simple I&#8217;ll use a raw link in a public GitHub repo. And if you want to secure your documents, you can always use the Azure security features to restrict document access to AI Document Intelligence.<\/p>\n\n\n\n<p>Next, you&#8217;ll need to go back to the Azure Portal and the overview of the AI Document Intelligence resource you created earlier.  On the left side, expand <strong>Resource Management<\/strong> and click <strong>Keys and Endpoints<\/strong>.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"562\" height=\"969\" src=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2a7f1113.png\" alt=\"\" class=\"wp-image-320\" srcset=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2a7f1113.png 562w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2a7f1113-174x300.png 174w\" sizes=\"auto, (max-width: 562px) 100vw, 562px\" \/><\/figure>\n\n\n\n<p>Copy one of the keys and the endpoint for use in your code.  Remember to keep the key a secret.  
It will be used to authenticate access to the AI Document Intelligence resource.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"367\" src=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2a804ef2-1024x367.png\" alt=\"\" class=\"wp-image-321\" srcset=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2a804ef2-1024x367.png 1024w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2a804ef2-300x108.png 300w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2a804ef2-768x276.png 768w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2025\/01\/Snag_2a804ef2.png 1396w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-pullquote has-text-align-left\"><blockquote><p>You might notice that the endpoint has a domain of <em>cognitiveservices<\/em>.azure.com. Originally, the Azure AI Services were called the Azure Cognitive Services.  They were rebranded in 2023.  
You still might see the term Cognitive Services used in some documentation and code.<\/p><\/blockquote><\/figure>\n\n\n\n<p>In a new Python file, import the modules needed for Azure AI Document Intelligence and define three constants for the key, the endpoint, and the URL of the PDF.<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" data-code=\"from azure.core.credentials import AzureKeyCredential\nfrom azure.ai.documentintelligence import DocumentIntelligenceClient\nfrom azure.ai.documentintelligence.models import AnalyzeDocumentRequest\n\nDOCINTEL_ENDPOINT = &quot;{YOUR_VALUE_HERE}&quot;\nDOCINTEL_KEY = &quot;{YOUR_VALUE_HERE}&quot;\nDOCUMENT_URL = &quot;{YOUR_VALUE_HERE}&quot;\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 
00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #81A1C1\">from<\/span><span style=\"color: #D8DEE9FF\"> azure<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">core<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">credentials <\/span><span style=\"color: #81A1C1\">import<\/span><span style=\"color: #D8DEE9FF\"> AzureKeyCredential<\/span><\/span>\n<span class=\"line\"><span style=\"color: #81A1C1\">from<\/span><span style=\"color: #D8DEE9FF\"> azure<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">ai<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">documentintelligence <\/span><span style=\"color: #81A1C1\">import<\/span><span style=\"color: #D8DEE9FF\"> DocumentIntelligenceClient<\/span><\/span>\n<span class=\"line\"><span style=\"color: #81A1C1\">from<\/span><span style=\"color: #D8DEE9FF\"> azure<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">ai<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">documentintelligence<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">models <\/span><span style=\"color: #81A1C1\">import<\/span><span style=\"color: #D8DEE9FF\"> AnalyzeDocumentRequest<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">DOCINTEL_ENDPOINT <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: 
#EBCB8B\">{YOUR_VALUE_HERE}<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">DOCINTEL_KEY <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #EBCB8B\">{YOUR_VALUE_HERE}<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">DOCUMENT_URL <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #EBCB8B\">{YOUR_VALUE_HERE}<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p>Create an <code>AzureKeyCredential <\/code>using the <code>DOCINTEL_KEY <\/code>and use the credential and <code>DOCINTEL_ENDPOINT <\/code>to create a <code>DocumentIntelligenceClient<\/code>.<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" data-code=\"credential = AzureKeyCredential(DOCINTEL_KEY)\nclient = 
DocumentIntelligenceClient(endpoint=DOCINTEL_ENDPOINT, credential=credential)\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #D8DEE9FF\">credential <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #88C0D0\">AzureKeyCredential<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9FF\">DOCINTEL_KEY<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">client <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #88C0D0\">DocumentIntelligenceClient<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9\">endpoint<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\">DOCINTEL_ENDPOINT<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">credential<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\">credential<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p>Call the <code>begin_analyze_document 
<\/code>method on the <code>client<\/code>.  It expects the name of the model, which is <code>prebuilt-invoice<\/code> for this demo.  The second argument is an instance of <code>AnalyzeDocumentRequest <\/code>built from the <code>DOCUMENT_URL<\/code>.  The method will return an <code>AnalyzeDocumentLROPoller<\/code>.  Analyzing a document can take a while, so calling the poller&#8217;s <code>result <\/code>method waits until the analysis has completed and then returns the analysis of the document.<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" data-code=\"poller = client.begin_analyze_document(\n    &quot;prebuilt-invoice&quot;, AnalyzeDocumentRequest(url_source=DOCUMENT_URL)\n)\nresult = poller.result()\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 
2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #D8DEE9FF\">poller <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> client<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #88C0D0\">begin_analyze_document<\/span><span style=\"color: #ECEFF4\">(<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">    <\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">prebuilt-invoice<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #88C0D0\">AnalyzeDocumentRequest<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9\">url_source<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\">DOCUMENT_URL<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #ECEFF4\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">result <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> poller<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #88C0D0\">result<\/span><span style=\"color: #ECEFF4\">()<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p>Now the result contains the data extracted from the invoice.  
These will be in the <code>fields <\/code>for each document in the result.<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" data-code=\"result.documents[0].fields\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #D8DEE9FF\">result<\/span><span style=\"color: 
#ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">documents<\/span><span style=\"color: #ECEFF4\">[<\/span><span style=\"color: #B48EAD\">0<\/span><span style=\"color: #ECEFF4\">].<\/span><span style=\"color: #D8DEE9FF\">fields<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p>For example, to get the billed items, you could iterate over the <code>Items<\/code> key:<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" data-code=\"for item in result.documents[0].fields[&quot;Items&quot;].value_array:\n    print(item.value_object[&quot;Amount&quot;].value_currency.amount)\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path 
class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #81A1C1\">for<\/span><span style=\"color: #D8DEE9FF\"> item <\/span><span style=\"color: #81A1C1\">in<\/span><span style=\"color: #D8DEE9FF\"> result<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">documents<\/span><span style=\"color: #ECEFF4\">[<\/span><span style=\"color: #B48EAD\">0<\/span><span style=\"color: #ECEFF4\">].<\/span><span style=\"color: #D8DEE9FF\">fields<\/span><span style=\"color: #ECEFF4\">[<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">Items<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #ECEFF4\">].<\/span><span style=\"color: #D8DEE9FF\">value_array<\/span><span style=\"color: #ECEFF4\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">    <\/span><span style=\"color: #88C0D0\">print<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9FF\">item<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">value_object<\/span><span style=\"color: #ECEFF4\">[<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">Amount<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #ECEFF4\">].<\/span><span style=\"color: #D8DEE9FF\">value_currency<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">amount<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p>The values printed are the amounts of the six line items in the invoice.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>5000.0\n7500.0\n10000.0\n15000.0\n5000.0\n10000.0<\/code><\/pre>\n\n\n\n<p>You 
can see a complete list of the entities in the prebuilt invoice document schema on GitHub at <a href=\"https:\/\/github.com\/Azure-Samples\/document-intelligence-code-samples\/blob\/main\/schema\/2024-07-31-preview\/invoice.md\">https:\/\/github.com\/Azure-Samples\/document-intelligence-code-samples\/blob\/main\/schema\/2024-07-31-preview\/invoice.md<\/a>.<\/p>\n\n\n\n<p>Of course, there is much more that you can do with the invoice model, and with Azure AI Document Intelligence in general.  Those are good topics for future posts.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Summary<\/h3>\n\n\n\n<p>In this post you learned about Azure AI Document Intelligence for automating the processing of documents and forms.  You saw how to provision a new instance of Azure AI Document Intelligence.  You also saw how to use Azure Document Intelligence Studio to experiment with AI Document Intelligence without writing any code.  And you saw how to use the Python SDK to analyze documents using the prebuilt invoice model and parse the results.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>It&#8217;s amazing how much of the world&#8217;s data, especially the business world&#8217;s data, is 
still&#8230;<\/p>\n","protected":false},"author":1,"featured_media":328,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[10],"tags":[7,8,39,4],"class_list":["post-311","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-microsoft-azure","tag-ai","tag-azure","tag-document-intelligence","tag-python"],"_links":{"self":[{"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/posts\/311","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/comments?post=311"}],"version-history":[{"count":1,"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/posts\/311\/revisions"}],"predecessor-version":[{"id":327,"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/posts\/311\/revisions\/327"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/media\/328"}],"wp:attachment":[{"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/media?parent=311"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/categories?post=311"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/tags?post=311"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}