{"id":186,"date":"2024-11-19T10:00:00","date_gmt":"2024-11-19T16:00:00","guid":{"rendered":"https:\/\/douglasstarnes.dev\/?p=186"},"modified":"2024-11-18T22:42:24","modified_gmt":"2024-11-19T04:42:24","slug":"azure-ai-computer-vision-service-ocr","status":"publish","type":"post","link":"https:\/\/douglasstarnes.dev\/index.php\/2024\/11\/19\/azure-ai-computer-vision-service-ocr\/","title":{"rendered":"Azure AI Vision Service &#8211; OCR"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">Images often contain text. This text can take many forms, such as the name on a storefront or handwriting. There are many use cases for identifying and transcribing text in an image. The Azure AI Vision Service makes it possible to do this in one line of code.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The wonderful thing about using the Vision Service for optical character recognition (OCR) is that if you read the previous post on image analysis, you already know most of what you need. Everything is the same except the visual features passed to the <code>analyze <\/code>or <code>analyze_from_url <\/code>methods. 
To perform OCR, simply add <code>VisualFeatures.READ<\/code> to the list of features.<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#d8dee9ff;--cbp-line-number-width:calc(1 * 0.6 * .875rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" data-code=\"result = client.analyze(image_data, [VisualFeatures.READ])\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" 
tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #D8DEE9FF\">result <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> client<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #88C0D0\">analyze<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9FF\">image_data<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #ECEFF4\">[<\/span><span style=\"color: #D8DEE9FF\">VisualFeatures<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">READ<\/span><span style=\"color: #ECEFF4\">])<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">The result will contain a <code>read <\/code>attribute that will be <code>None <\/code>if no text was found.  If text was found, it is organized into blocks, and each block is organized into lines.  Each <code>line <\/code>has the <code>text <\/code>of the entire line and the coordinates of the <code>bounding_polygon <\/code>surrounding the <code>line<\/code>.  The same information is provided for each word in the line.  Each word also has a confidence score, but the line as a whole does not.  
Consider this image: (<a href=\"https:\/\/unsplash.com\/photos\/a-sign-that-says-call-me-if-you-get-lost-4TiH3m8yt6A\">https:\/\/unsplash.com\/photos\/a-sign-that-says-call-me-if-you-get-lost-4TiH3m8yt6A<\/a>)<\/p>\n\n\n\n<figure class=\"wp-block-image size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"683\" height=\"1024\" src=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/call_me_if_you_get_lost-683x1024.jpg\" alt=\"\" class=\"wp-image-187\" style=\"width:291px;height:auto\" srcset=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/call_me_if_you_get_lost-683x1024.jpg 683w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/call_me_if_you_get_lost-200x300.jpg 200w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/call_me_if_you_get_lost-768x1152.jpg 768w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/call_me_if_you_get_lost-1024x1536.jpg 1024w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/call_me_if_you_get_lost-1365x2048.jpg 1365w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/call_me_if_you_get_lost-scaled.jpg 1707w\" sizes=\"auto, (max-width: 683px) 100vw, 683px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">We can use this code to iterate over the <code>blocks <\/code>and <code>lines<\/code>:<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#d8dee9ff;--cbp-line-number-width:calc(1 * 0.6 * .875rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g 
fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" data-code=\"for idx, block in enumerate(result.read.blocks):\n  print(f&quot;-- Block {idx + 1}&quot;)\n  for ln_idx, line in enumerate(block.lines):\n    print(f&quot;   -- Line {ln_idx + 1}&quot;)\n    print(f&quot;      {line.text}&quot;)\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #81A1C1\">for<\/span><span style=\"color: #D8DEE9FF\"> idx<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> block <\/span><span style=\"color: #81A1C1\">in<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #88C0D0\">enumerate<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9FF\">result<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">read<\/span><span style=\"color: 
#ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">blocks<\/span><span style=\"color: #ECEFF4\">):<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">  <\/span><span style=\"color: #88C0D0\">print<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #81A1C1\">f<\/span><span style=\"color: #A3BE8C\">&quot;-- Block <\/span><span style=\"color: #EBCB8B\">{<\/span><span style=\"color: #D8DEE9FF\">idx <\/span><span style=\"color: #81A1C1\">+<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #B48EAD\">1<\/span><span style=\"color: #EBCB8B\">}<\/span><span style=\"color: #A3BE8C\">&quot;<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">  <\/span><span style=\"color: #81A1C1\">for<\/span><span style=\"color: #D8DEE9FF\"> ln_idx<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> line <\/span><span style=\"color: #81A1C1\">in<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #88C0D0\">enumerate<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9FF\">block<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">lines<\/span><span style=\"color: #ECEFF4\">):<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">    <\/span><span style=\"color: #88C0D0\">print<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #81A1C1\">f<\/span><span style=\"color: #A3BE8C\">&quot;   -- Line <\/span><span style=\"color: #EBCB8B\">{<\/span><span style=\"color: #D8DEE9FF\">ln_idx <\/span><span style=\"color: #81A1C1\">+<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #B48EAD\">1<\/span><span style=\"color: #EBCB8B\">}<\/span><span style=\"color: #A3BE8C\">&quot;<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">    <\/span><span style=\"color: 
#88C0D0\">print<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #81A1C1\">f<\/span><span style=\"color: #A3BE8C\">&quot;      <\/span><span style=\"color: #EBCB8B\">{<\/span><span style=\"color: #D8DEE9FF\">line<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">text<\/span><span style=\"color: #EBCB8B\">}<\/span><span style=\"color: #A3BE8C\">&quot;<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">And here is what we will see in the console:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>-- Block 1\n   -- Line 1\n      CALL\n   -- Line 2\n      ME IF\n   -- Line 3\n      YOU\n   -- Line 4\n      GET\n   -- Line 5\n      LOST\n   -- Line 6\n      +18554448888<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">This is pretty straightforward.  There are 6 lines of text.  The second line has two words.  Let&#8217;s try a more sophisticated image: (<a href=\"https:\/\/unsplash.com\/photos\/a-sign-advertising-safety-shoes-on-a-city-street-HlL8vMcp4cM\">https:\/\/unsplash.com\/photos\/a-sign-advertising-safety-shoes-on-a-city-street-HlL8vMcp4cM<\/a>)<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"768\" src=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/safety_shoes-1024x768.jpg\" alt=\"\" class=\"wp-image-189\" srcset=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/safety_shoes-1024x768.jpg 1024w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/safety_shoes-300x225.jpg 300w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/safety_shoes-768x576.jpg 768w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/safety_shoes-1536x1152.jpg 1536w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/safety_shoes-2048x1536.jpg 2048w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p 
class=\"wp-block-paragraph\">There is a lot going on here.  Let&#8217;s see what the <code>blocks <\/code>and <code>lines <\/code>are.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>-- Block 1\n   -- Line 1\n      OUCH!\n   -- Line 2\n      Play it\n   -- Line 3\n      IT SHOULDN'T\n   -- Line 4\n      SAFE!\n   -- Line 5\n      HAPPEN TO A DOG\n   -- Line 6\n      WEAR\n   -- Line 7\n      Vi it the\n   -- Line 8\n      Safety Shoe Store\n   -- Line 9\n      Safety Shoes<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">If you compare the results to the image, you&#8217;ll see that the <code>lines <\/code>are being detected left to right.  Line 2 (&#8220;Play it&#8221;) is to the right of line 1 (&#8220;OUCH!&#8221;) even though line 2 is semantically related to line 4 (&#8220;SAFE!&#8221;).  For this reason, it might be helpful to identify the lines by drawing their bounding polygons on the image.  Recall that in addition to the <code>text <\/code>attribute, each <code>line <\/code>has a <code>bounding_polygon<\/code> attribute.  The <code>bounding_polygon <\/code>is a list of four points (upper left, upper right, lower right, and lower left), each with an <code>x<\/code> and <code>y<\/code> coordinate.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To visualize the <code>bounding_polygon<\/code>, we can use Pillow, the maintained fork of the Python Imaging Library (PIL).  
Use <code>pip <\/code>to install the <code>pillow <\/code>package.<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#d8dee9ff;--cbp-line-number-width:calc(1 * 0.6 * .875rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" data-code=\"$ pip install pillow\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span 
class=\"line\"><span style=\"color: #D8DEE9\">$<\/span><span style=\"color: #D8DEE9FF\"> pip install pillow<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">From the <code>PIL<\/code> package, import the <code>Image<\/code> and <code>ImageDraw<\/code> modules needed to draw on the image, along with <code>BytesIO<\/code> from the standard library.<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#d8dee9ff;--cbp-line-number-width:calc(1 * 0.6 * .875rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" data-code=\"from io import BytesIO\n\nfrom PIL import Image, ImageDraw\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 
2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #81A1C1\">from<\/span><span style=\"color: #D8DEE9FF\"> io <\/span><span style=\"color: #81A1C1\">import<\/span><span style=\"color: #D8DEE9FF\"> BytesIO<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #81A1C1\">from<\/span><span style=\"color: #D8DEE9FF\"> PIL <\/span><span style=\"color: #81A1C1\">import<\/span><span style=\"color: #D8DEE9FF\"> Image<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> ImageDraw<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">Open the image file in binary mode and read its data.  Then create a new <code>Image<\/code>.  Notice that the raw bytes are wrapped in a <code>BytesIO<\/code> object, because <code>Image.open<\/code> expects a filename or a file-like object rather than raw bytes.<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#d8dee9ff;--cbp-line-number-width:calc(1 * 0.6 * .875rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" 
stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" data-code=\"filename = &quot;safety_shoes.jpg&quot;\n\nwith open(filename, &quot;rb&quot;) as image_file:\n  image_data = image_file.read()\n  image = Image.open(BytesIO(image_data))\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #D8DEE9FF\">filename <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">safety_shoes.jpg<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #81A1C1\">with<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #88C0D0\">open<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9FF\">filename<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">rb<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #ECEFF4\">)<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: 
#81A1C1\">as<\/span><span style=\"color: #D8DEE9FF\"> image_file<\/span><span style=\"color: #ECEFF4\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">  image_data <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> image_file<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #88C0D0\">read<\/span><span style=\"color: #ECEFF4\">()<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">  image <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> Image<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #88C0D0\">open<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #88C0D0\">BytesIO<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9FF\">image_data<\/span><span style=\"color: #ECEFF4\">))<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">Now get the points of the <code>bounding_polygon <\/code>for the first line in the result.  Notice that the first point is duplicated and appended to the points.  
This closes the polygon when it is drawn, because the <code>line <\/code>method used below draws an open path.<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#d8dee9ff;--cbp-line-number-width:calc(1 * 0.6 * .875rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" data-code=\"points = [(point.x, point.y) for point in result.read.blocks[0].lines[0].bounding_polygon]\npoints.append(points[0])\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 
2\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #D8DEE9FF\">points <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #ECEFF4\">[(<\/span><span style=\"color: #D8DEE9FF\">point<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">x<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> point<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">y<\/span><span style=\"color: #ECEFF4\">)<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #81A1C1\">for<\/span><span style=\"color: #D8DEE9FF\"> point <\/span><span style=\"color: #81A1C1\">in<\/span><span style=\"color: #D8DEE9FF\"> result<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">read<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">blocks<\/span><span style=\"color: #ECEFF4\">[<\/span><span style=\"color: #B48EAD\">0<\/span><span style=\"color: #ECEFF4\">].<\/span><span style=\"color: #D8DEE9FF\">lines<\/span><span style=\"color: #ECEFF4\">[<\/span><span style=\"color: #B48EAD\">0<\/span><span style=\"color: #ECEFF4\">].<\/span><span style=\"color: #D8DEE9FF\">bounding_polygon<\/span><span style=\"color: #ECEFF4\">]<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">points<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #88C0D0\">append<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9FF\">points<\/span><span style=\"color: #ECEFF4\">[<\/span><span style=\"color: #B48EAD\">0<\/span><span style=\"color: #ECEFF4\">])<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">To draw on the <code>image<\/code>, create an <code>ImageDraw.Draw<\/code> object from the image.  
Then call its <code>line <\/code>method.  It takes the points of the bounding polygon, a <code>fill<\/code> color, and a line <code>width<\/code>.<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#d8dee9ff;--cbp-line-number-width:calc(1 * 0.6 * .875rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" data-code=\"image_draw = ImageDraw.Draw(image)\nimage_draw.line(points, fill=&quot;red&quot;, width=10)\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 
012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #D8DEE9FF\">image_draw <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> ImageDraw<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #88C0D0\">Draw<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9FF\">image<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">image_draw<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #88C0D0\">line<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9FF\">points<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">fill<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">red<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">width<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #B48EAD\">10<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">Create a new filename for the output.  
Then save the <code>image<\/code> (not the <code>image_draw<\/code>) using the new filename.<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#d8dee9ff;--cbp-line-number-width:calc(1 * 0.6 * .875rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" data-code=\"from pathlib import Path\n\np = Path(filename)\n\nimage.save(f&quot;{p.stem}_output{p.suffix}&quot;)\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 
2\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #81A1C1\">from<\/span><span style=\"color: #D8DEE9FF\"> pathlib <\/span><span style=\"color: #81A1C1\">import<\/span><span style=\"color: #D8DEE9FF\"> Path<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">p <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #88C0D0\">Path<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9FF\">filename<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">image<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #88C0D0\">save<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #81A1C1\">f<\/span><span style=\"color: #A3BE8C\">&quot;<\/span><span style=\"color: #EBCB8B\">{<\/span><span style=\"color: #D8DEE9FF\">p<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">stem<\/span><span style=\"color: #EBCB8B\">}<\/span><span style=\"color: #A3BE8C\">_output<\/span><span style=\"color: #EBCB8B\">{<\/span><span style=\"color: #D8DEE9FF\">p<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">suffix<\/span><span style=\"color: #EBCB8B\">}<\/span><span style=\"color: #A3BE8C\">&quot;<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">And this is the output.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"768\" src=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/safety_shoes_output_ouch-1024x768.jpg\" alt=\"\" class=\"wp-image-190\" 
srcset=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/safety_shoes_output_ouch-1024x768.jpg 1024w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/safety_shoes_output_ouch-300x225.jpg 300w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/safety_shoes_output_ouch-768x576.jpg 768w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/safety_shoes_output_ouch-1536x1152.jpg 1536w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/safety_shoes_output_ouch-2048x1536.jpg 2048w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Take a look at the second line (&#8220;Play it&#8221;).<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"768\" src=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/safety_shoes_output_play_it-1024x768.jpg\" alt=\"\" class=\"wp-image-192\" srcset=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/safety_shoes_output_play_it-1024x768.jpg 1024w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/safety_shoes_output_play_it-300x225.jpg 300w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/safety_shoes_output_play_it-768x576.jpg 768w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/safety_shoes_output_play_it-1536x1152.jpg 1536w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/safety_shoes_output_play_it-2048x1536.jpg 2048w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">The bounding polygon is not always an upright rectangle. The text in this line is angled, and the Vision Service still recognizes it accurately.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Notice line 7 (&#8220;Vi it the&#8221;).  This should read &#8220;Visit the&#8221;, but damage to the billboard has removed the &#8220;s&#8221;.  
However, line 8 (&#8220;Safety Shoe Store&#8221;) is recognized despite the damage.  Let&#8217;s take a look at the <code>confidence <\/code>scores for each word in the line.<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#d8dee9ff;--cbp-line-number-width:calc(1 * 0.6 * .875rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" data-code=\"for word in lines[7].words:\n  print(f&quot;{word.text} {word.confidence * 100:.1f}%&quot;)\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 
002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #81A1C1\">for<\/span><span style=\"color: #D8DEE9FF\"> word <\/span><span style=\"color: #81A1C1\">in<\/span><span style=\"color: #D8DEE9FF\"> lines<\/span><span style=\"color: #ECEFF4\">[<\/span><span style=\"color: #B48EAD\">7<\/span><span style=\"color: #ECEFF4\">].<\/span><span style=\"color: #D8DEE9FF\">words<\/span><span style=\"color: #ECEFF4\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">  <\/span><span style=\"color: #88C0D0\">print<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #81A1C1\">f<\/span><span style=\"color: #A3BE8C\">&quot;<\/span><span style=\"color: #EBCB8B\">{<\/span><span style=\"color: #D8DEE9FF\">word<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">text<\/span><span style=\"color: #EBCB8B\">}<\/span><span style=\"color: #A3BE8C\"> <\/span><span style=\"color: #EBCB8B\">{<\/span><span style=\"color: #D8DEE9FF\">word<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">confidence <\/span><span style=\"color: #81A1C1\">*<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #B48EAD\">100<\/span><span style=\"color: #81A1C1\">:.1f<\/span><span style=\"color: #EBCB8B\">}<\/span><span style=\"color: #A3BE8C\">%&quot;<\/span><span style=\"color: #ECEFF4\">)<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">The output shows that the score for &#8220;Safety&#8221; is much lower than the scores for the other words.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>Safety 32.5%\nShoe 96.1%\nStore 99.4%<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">This example also demonstrates how the Vision Service is able to recognize text in multiple styles within the same image.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The 
Vision Service can also recognize handwritten text. Take a look at this example:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"817\" height=\"1024\" src=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/writing-817x1024.jpg\" alt=\"\" class=\"wp-image-194\" style=\"width:409px;height:auto\" srcset=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/writing-817x1024.jpg 817w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/writing-239x300.jpg 239w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/writing-768x962.jpg 768w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/writing-1226x1536.jpg 1226w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/writing-1634x2048.jpg 1634w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/writing.jpg 1995w\" sizes=\"auto, (max-width: 817px) 100vw, 817px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">The block and lines that the Vision Service recognized are:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>-- Block 1\n   -- Line 1\n      Azure AI Services\n   -- Line 2\n      Computer Vision<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">And it correctly detects the bounds of the lines. 
Here is line 2:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"817\" height=\"1024\" src=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/writing_output-817x1024.jpg\" alt=\"\" class=\"wp-image-195\" style=\"width:390px;height:auto\" srcset=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/writing_output-817x1024.jpg 817w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/writing_output-239x300.jpg 239w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/writing_output-768x962.jpg 768w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/writing_output-1226x1536.jpg 1226w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/writing_output-1634x2048.jpg 1634w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/writing_output.jpg 1995w\" sizes=\"auto, (max-width: 817px) 100vw, 817px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">The Vision Service also recognizes text in languages other than English. It can recognize handwritten text in <a href=\"https:\/\/learn.microsoft.com\/en-us\/azure\/ai-services\/computer-vision\/language-support#handwritten-text\" data-type=\"link\" data-id=\"https:\/\/learn.microsoft.com\/en-us\/azure\/ai-services\/computer-vision\/language-support#handwritten-text\">9 languages<\/a> and printed text in over <a href=\"https:\/\/learn.microsoft.com\/en-us\/azure\/ai-services\/computer-vision\/language-support#print-text\" data-type=\"link\" data-id=\"https:\/\/learn.microsoft.com\/en-us\/azure\/ai-services\/computer-vision\/language-support#print-text\">150 languages<\/a>. 
Here is an example of printed text in French: (<a href=\"https:\/\/www.pexels.com\/photo\/store-front-entrance-in-old-building-in-france-20407572\/\">https:\/\/www.pexels.com\/photo\/store-front-entrance-in-old-building-in-france-20407572\/<\/a>)<\/p>\n\n\n\n<figure class=\"wp-block-image size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"574\" height=\"1024\" src=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/french-574x1024.jpg\" alt=\"\" class=\"wp-image-196\" style=\"width:289px;height:auto\" srcset=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/french-574x1024.jpg 574w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/french-168x300.jpg 168w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/french-768x1369.jpg 768w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/french-862x1536.jpg 862w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/french-1149x2048.jpg 1149w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/french-scaled.jpg 1436w\" sizes=\"auto, (max-width: 574px) 100vw, 574px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">The Vision Service correctly recognizes the text &#8220;COUVERTURE PLOMBERIE&#8221;, which translates roughly to &#8220;ROOFING, PLUMBING&#8221; in English.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">And just like the Image Analysis service, you can experiment with the features of OCR in Azure Vision Studio.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">After logging in to Azure Vision Studio, on the <strong>Optical character recognition<\/strong> tab, click on the card for <strong>Extract text from images<\/strong>.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"597\" src=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/01-10-1024x597.png\" alt=\"\" class=\"wp-image-201\" 
srcset=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/01-10-1024x597.png 1024w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/01-10-300x175.png 300w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/01-10-768x448.png 768w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/01-10-1536x896.png 1536w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/01-10.png 1848w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">On the next page, you can select one of the sample images, or upload your own.  <\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"486\" src=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/02-8-1024x486.png\" alt=\"\" class=\"wp-image-202\" srcset=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/02-8-1024x486.png 1024w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/02-8-300x142.png 300w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/02-8-768x364.png 768w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/02-8-1536x729.png 1536w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/02-8-2048x972.png 2048w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">I&#8217;ll upload the Safety Shoes image from before.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"597\" src=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/03_blog-1-1024x597.png\" alt=\"\" class=\"wp-image-204\" srcset=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/03_blog-1-1024x597.png 1024w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/03_blog-1-300x175.png 300w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/03_blog-1-768x448.png 768w, 
https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/03_blog-1-1536x896.png 1536w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/03_blog-1.png 2018w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">As you can see, Azure Vision Studio recognized the same text that was recognized using the SDK. Also, the bounding polygon of each word has been drawn on top of the image. Hovering over a word in the results will highlight the bounding polygon in the image.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Summary<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">In this post, you saw how to use the Azure AI Vision Service to recognize text in images. If you are coming from the previous post about image analysis, this is simple, as the only difference is the visual feature requested for the image. You saw that Azure AI can recognize both printed and handwritten text in multiple languages. You used the Pillow package to highlight the areas in the image identified as text. And you saw how to detect text in images using Azure Vision Studio in the browser.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Many times, an image will contain text. 
This text can come in many forms such&#8230;<\/p>\n","protected":false},"author":1,"featured_media":206,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[10,18],"tags":[7,8,20,4,15],"class_list":["post-186","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-microsoft-azure","category-python","tag-ai","tag-azure","tag-ocr","tag-python","tag-vision"],"_links":{"self":[{"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/posts\/186","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/comments?post=186"}],"version-history":[{"count":10,"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/posts\/186\/revisions"}],"predecessor-version":[{"id":220,"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/posts\/186\/revisions\/220"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/media\/206"}],"wp:attachment":[{"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/media?parent=186"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/categories?post=186"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/tags?post=186"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}