{"id":96,"date":"2024-11-08T10:00:00","date_gmt":"2024-11-08T16:00:00","guid":{"rendered":"https:\/\/douglasstarnes.dev\/?p=96"},"modified":"2024-11-04T23:43:23","modified_gmt":"2024-11-05T05:43:23","slug":"azure-ai-language-service-named-entity-recognition","status":"publish","type":"post","link":"https:\/\/douglasstarnes.dev\/index.php\/2024\/11\/08\/azure-ai-language-service-named-entity-recognition\/","title":{"rendered":"Azure AI Language Service: Named Entity Recognition"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">In the <a href=\"https:\/\/douglasstarnes.dev\/index.php\/2024\/11\/04\/getting-started-with-the-azure-ai-language-service\/\" data-type=\"post\" data-id=\"18\">previous post<\/a> on the Azure AI Language service, you saw how to provision and configure an instance of the service, along with a simple example of sentiment analysis and key phrase extraction.  In this post we will look at another feature of the Language service, Named Entity Recognition (NER).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The NER feature of the Language service detects notable entities, such as people, places and organizations, in a body of text.  For example, NER would recognize Satya Nadella as a person and Microsoft as an organization.  In total, NER can recognize 14 different entity categories.  A complete list can be found <a href=\"https:\/\/learn.microsoft.com\/en-us\/azure\/ai-services\/language-service\/named-entity-recognition\/concepts\/named-entity-categories\">here<\/a>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The prerequisites for using the NER feature are the same as for the sentiment analysis feature.  You need to provision an instance of the Language service and get the endpoint and keys to create a <code>TextAnalyticsClient<\/code> that you will use to communicate with the Language service.  
So we&#8217;ll start from there.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Using the NER feature<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">As I said in the previous post about the Language service, most of the &#8220;hard work&#8221; of getting a prediction from an AI model is reduced to a single line of code.  The NER feature is no exception.  Once you have a client, simply call the <code>recognize_entities<\/code> method and pass it a list of strings, which are referred to as <em>documents<\/em>.  For example, let&#8217;s use this short passage about interesting things in Nashville, Tennessee (courtesy of ChatGPT).<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\">Nashville, Tennessee, is often known as &#8220;Music City,&#8221; a title that rings true thanks to its rich history and vibrant cultural scene. While Nashville has become synonymous with country music, the city is also celebrated for its notable residents, groundbreaking music venues, and contributions to art, education, and civic life.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Nashville\u2019s story is one of creativity, resilience, and tradition. From the legacies of Dolly Parton and Andrew Jackson to the lively scenes on Broadway Street, Nashville continues to attract visitors and inspire artists worldwide. 
Whether you&#8217;re a music lover, a history buff, or simply someone looking to experience a slice of Southern culture, Nashville offers a warm welcome and countless stories to uncover.<\/p>\n<\/blockquote>\n\n\n\n<p class=\"wp-block-paragraph\">Assuming a <code>TextAnalyticsClient<\/code> instance named <code>client<\/code>, the single line of code that does the &#8220;hard work&#8221; is this.<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#d8dee9ff;--cbp-line-number-width:calc(1 * 0.6 * .875rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" data-code=\"results = client.recognize_entities(\n  documents=[nashville_document]\n)\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 
0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #D8DEE9FF\">results <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> client<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #88C0D0\">recognize_entities<\/span><span style=\"color: #ECEFF4\">(<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">  <\/span><span style=\"color: #D8DEE9\">documents<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #ECEFF4\">[<\/span><span style=\"color: #D8DEE9FF\">nashville_document<\/span><span style=\"color: #ECEFF4\">]<\/span><\/span>\n<span class=\"line\"><span style=\"color: #ECEFF4\">)<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">The return value will contain one result for each document.  Each result has an <code>entities<\/code> attribute, a list of the entities recognized in that document.  
Each entity contains data such as the entity <code>text<\/code>, the <code>category<\/code> and a <code>confidence_score<\/code>.  The loop below groups each document&#8217;s entities by category.<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#d8dee9ff;--cbp-line-number-width:calc(2 * 0.6 * .875rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" data-code=\"for document in results:\n  entity_categories = {}\n  for entity in document.entities:\n    if entity.category not in entity_categories:\n      entity_categories[entity.category] = []\n    entity_categories[entity.category].append(\n      {\n        &quot;text&quot;: entity.text, \n        &quot;confidence_score&quot;: entity.confidence_score\n      }\n    )\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 
2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #81A1C1\">for<\/span><span style=\"color: #D8DEE9FF\"> document <\/span><span style=\"color: #81A1C1\">in<\/span><span style=\"color: #D8DEE9FF\"> results<\/span><span style=\"color: #ECEFF4\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">  entity_categories <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #ECEFF4\">{}<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">  <\/span><span style=\"color: #81A1C1\">for<\/span><span style=\"color: #D8DEE9FF\"> entity <\/span><span style=\"color: #81A1C1\">in<\/span><span style=\"color: #D8DEE9FF\"> document<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">entities<\/span><span style=\"color: #ECEFF4\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">    <\/span><span style=\"color: #81A1C1\">if<\/span><span style=\"color: #D8DEE9FF\"> entity<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">category <\/span><span style=\"color: #81A1C1\">not<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #81A1C1\">in<\/span><span style=\"color: #D8DEE9FF\"> entity_categories<\/span><span style=\"color: #ECEFF4\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">      entity_categories<\/span><span style=\"color: #ECEFF4\">[<\/span><span style=\"color: #D8DEE9FF\">entity<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">category<\/span><span style=\"color: #ECEFF4\">]<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #ECEFF4\">[]<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">    entity_categories<\/span><span 
style=\"color: #ECEFF4\">[<\/span><span style=\"color: #D8DEE9FF\">entity<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">category<\/span><span style=\"color: #ECEFF4\">].<\/span><span style=\"color: #88C0D0\">append<\/span><span style=\"color: #ECEFF4\">(<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">      <\/span><span style=\"color: #ECEFF4\">{<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">        <\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">text<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #ECEFF4\">:<\/span><span style=\"color: #D8DEE9FF\"> entity<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">text<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">        <\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">confidence_score<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #ECEFF4\">:<\/span><span style=\"color: #D8DEE9FF\"> entity<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">confidence_score<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">      <\/span><span style=\"color: #ECEFF4\">}<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">    <\/span><span style=\"color: #ECEFF4\">)<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">Here is an excerpt of the entities extracted:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>{\n    \"Location\": &#91;\n        {\n            \"text\": \"Nashville\",\n            \"confidence_score\": 1.0\n        },\n        {\n            \"text\": \"Tennessee\",\n            \"confidence_score\": 0.98\n        }\n    ],\n    \"PersonType\": &#91;\n        {\n            \"text\": 
\"residents\",\n            \"confidence_score\": 0.88\n        }\n    ],\n    \"Skill\": &#91;\n        {\n            \"text\": \"art\",\n            \"confidence_score\": 0.91\n        }\n    ],\n    \"Quantity\": &#91;\n        {\n            \"text\": \"one\",\n            \"confidence_score\": 0.8\n        }\n    ],\n    \"Person\": &#91;\n        {\n            \"text\": \"Dolly Parton\",\n            \"confidence_score\": 1.0\n        },\n        {\n            \"text\": \"Andrew Jackson\",\n            \"confidence_score\": 1.0\n        }\n    ],\n    \"Address\": &#91;\n        {\n            \"text\": \"Broadway Street\",\n            \"confidence_score\": 0.94\n        }\n    ]\n}<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">The NER feature found entities in 6 different categories:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Location<\/li>\n\n\n\n<li>PersonType<\/li>\n\n\n\n<li>Skill<\/li>\n\n\n\n<li>Quantity<\/li>\n\n\n\n<li>Person<\/li>\n\n\n\n<li>Address<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">This example used a generalized model.  The Azure AI Language service also offers a domain-specific model for health data.  This model extends the entity categories with topics including anatomy, examinations and medication.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To extract entities from health data, you call the <code>begin_analyze_healthcare_entities<\/code> method of the <code>TextAnalyticsClient<\/code>.  Because this is a long-running operation, the method returns a poller rather than the results themselves.  
To get the results for each document, you must call the <code>result<\/code> method.<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#d8dee9ff;--cbp-line-number-width:calc(1 * 0.6 * .875rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" data-code=\"result = client.begin_analyze_healthcare_entities(\n  [allegriclear_study]\n).result()\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki 
nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #D8DEE9FF\">result <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> client<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #88C0D0\">begin_analyze_healthcare_entities<\/span><span style=\"color: #ECEFF4\">(<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">  <\/span><span style=\"color: #ECEFF4\">[<\/span><span style=\"color: #D8DEE9FF\">allegriclear_study<\/span><span style=\"color: #ECEFF4\">]<\/span><\/span>\n<span class=\"line\"><span style=\"color: #ECEFF4\">).<\/span><span style=\"color: #88C0D0\">result<\/span><span style=\"color: #ECEFF4\">()<\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">And this is an excerpt from the entities extracted from a fictional study about a new allergy treatment called <em>AllergiClear<\/em>. (courtesy of ChatGPT)<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>{\n    \"Diagnosis\": &#91;\n        {\n            \"text\": \"Allergic rhinitis\",\n            \"confidence_score\": 1.0\n        },\n        {\n            \"text\": \"seasonal allergies\",\n            \"confidence_score\": 1.0\n        }\n    ],\n    \"SymptomOrSign\": &#91;\n        {\n            \"text\": \"sneezing\",\n            \"confidence_score\": 1.0\n        },\n        {\n            \"text\": \"congestion\",\n            \"confidence_score\": 1.0\n        }\n    ],\n    \"TreatmentName\": &#91;\n        {\n            \"text\": \"treatments\",\n            \"confidence_score\": 0.61\n        },\n        {\n            \"text\": \"symptom management\",\n            \"confidence_score\": 0.65\n        }\n    ],\n    \"MedicationName\": &#91;\n        {\n            \"text\": \"AllergiClear\",\n            \"confidence_score\": 0.95\n        }\n    ],\n    \"Frequency\": &#91;\n        {\n            \"text\": \"daily\",\n            
\"confidence_score\": 0.99\n        }\n    ],\n    \"Time\": &#91;\n        {\n            \"text\": \"12 weeks\",\n            \"confidence_score\": 0.99\n        },\n        {\n            \"text\": \"end of 12 weeks\",\n            \"confidence_score\": 0.92\n        }\n    ]\n}\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">The complete list of healthcare categories can be found <a href=\"https:\/\/learn.microsoft.com\/en-us\/azure\/ai-services\/language-service\/text-analytics-for-health\/concepts\/health-entity-categories\">here<\/a>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Linked Entities<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The Azure AI Language service can also provide references to additional information about entities.  You get this information by calling the <code>recognize_linked_entities<\/code> method of the <code>TextAnalyticsClient<\/code>.  Each linked entity includes data about the linked source, including a <code>url<\/code> and a <code>data_source<\/code>.  
The result also includes the <code>text<\/code> that was matched in the document for each linked entity and a <code>confidence_score<\/code>.<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#d8dee9ff;--cbp-line-number-width:calc(2 * 0.6 * .875rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span style=\"display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#2e3440ff\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"54\" height=\"14\" viewBox=\"0 0 54 14\"><g fill=\"none\" fill-rule=\"evenodd\" transform=\"translate(1 1)\"><circle cx=\"6\" cy=\"6\" r=\"6\" fill=\"#FF5F56\" stroke=\"#E0443E\" stroke-width=\".5\"><\/circle><circle cx=\"26\" cy=\"6\" r=\"6\" fill=\"#FFBD2E\" stroke=\"#DEA123\" stroke-width=\".5\"><\/circle><circle cx=\"46\" cy=\"6\" r=\"6\" fill=\"#27C93F\" stroke=\"#1AAB29\" stroke-width=\".5\"><\/circle><\/g><\/svg><\/span><span role=\"button\" tabindex=\"0\" data-code=\"result = client.recognize_linked_entities([nashville_document])\n\nfor entity in result[0].entities:\n    link = {\n      &quot;name&quot;: entity.name, \n      &quot;url&quot;: entity.url, \n      &quot;data source&quot;: entity.data_source, \n      &quot;matches&quot;: [\n        {\n          &quot;text&quot;: match.text,\n          &quot;confidence_score&quot;:match.confidence_score\n        } \n        for match in entity.matches\n      ]\n    }\n    print(json.dumps(link, indent=4)) \" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" 
stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #D8DEE9FF\">result <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> client<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #88C0D0\">recognize_linked_entities<\/span><span style=\"color: #ECEFF4\">([<\/span><span style=\"color: #D8DEE9FF\">nashville_document<\/span><span style=\"color: #ECEFF4\">])<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #81A1C1\">for<\/span><span style=\"color: #D8DEE9FF\"> entity <\/span><span style=\"color: #81A1C1\">in<\/span><span style=\"color: #D8DEE9FF\"> result<\/span><span style=\"color: #ECEFF4\">[<\/span><span style=\"color: #B48EAD\">0<\/span><span style=\"color: #ECEFF4\">].<\/span><span style=\"color: #D8DEE9FF\">entities<\/span><span style=\"color: #ECEFF4\">:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">    link <\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #ECEFF4\">{<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">      <\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">name<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #ECEFF4\">:<\/span><span style=\"color: #D8DEE9FF\"> entity<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">name<\/span><span style=\"color: 
#ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">      <\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">url<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #ECEFF4\">:<\/span><span style=\"color: #D8DEE9FF\"> entity<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">url<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">      <\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">data source<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #ECEFF4\">:<\/span><span style=\"color: #D8DEE9FF\"> entity<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">data_source<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">      <\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">matches<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #ECEFF4\">:<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #ECEFF4\">[<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">        <\/span><span style=\"color: #ECEFF4\">{<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">          <\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">text<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #ECEFF4\">:<\/span><span style=\"color: #D8DEE9FF\"> match<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">text<\/span><span style=\"color: #ECEFF4\">,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">          <\/span><span style=\"color: 
#ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">confidence_score<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #ECEFF4\">:<\/span><span style=\"color: #D8DEE9FF\">match<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">confidence_score<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">        <\/span><span style=\"color: #ECEFF4\">}<\/span><span style=\"color: #D8DEE9FF\"> <\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">        <\/span><span style=\"color: #81A1C1\">for<\/span><span style=\"color: #D8DEE9FF\"> match <\/span><span style=\"color: #81A1C1\">in<\/span><span style=\"color: #D8DEE9FF\"> entity<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #D8DEE9FF\">matches<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">      <\/span><span style=\"color: #ECEFF4\">]<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">    <\/span><span style=\"color: #ECEFF4\">}<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">    <\/span><span style=\"color: #88C0D0\">print<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9FF\">json<\/span><span style=\"color: #ECEFF4\">.<\/span><span style=\"color: #88C0D0\">dumps<\/span><span style=\"color: #ECEFF4\">(<\/span><span style=\"color: #D8DEE9FF\">link<\/span><span style=\"color: #ECEFF4\">,<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">indent<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #B48EAD\">4<\/span><span style=\"color: #ECEFF4\">))<\/span><span style=\"color: #D8DEE9FF\"> <\/span><\/span><\/code><\/pre><\/div>\n\n\n\n<p class=\"wp-block-paragraph\">Here is the linked entity data for the Nashville document.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>{\n    \"name\": \"Nashville, Tennessee\",\n    \"url\": 
\"https:\/\/en.wikipedia.org\/wiki\/Nashville,_Tennessee\",\n    \"data source\": \"Wikipedia\",\n    \"matches\": &#91;\n        {\n            \"text\": \"Nashville, Tennessee\",\n            \"confidence_score\": 0.56\n        },\n        {\n            \"text\": \"Music City\",\n            \"confidence_score\": 0.55\n        }\n    ]\n}\n{\n    \"name\": \"Dolly Parton\",\n    \"url\": \"https:\/\/en.wikipedia.org\/wiki\/Dolly_Parton\",\n    \"data source\": \"Wikipedia\",\n    \"matches\": &#91;\n        {\n            \"text\": \"Dolly Parton\",\n            \"confidence_score\": 0.88\n        }\n    ]\n}\n{\n    \"name\": \"Andrew Jackson\",\n    \"url\": \"https:\/\/en.wikipedia.org\/wiki\/Andrew_Jackson\",\n    \"data source\": \"Wikipedia\",\n    \"matches\": &#91;\n        {\n            \"text\": \"Andrew Jackson\",\n            \"confidence_score\": 0.21\n        }\n    ]\n}\n{\n    \"name\": \"Broadway (Nashville, Tennessee)\",\n    \"url\": \"https:\/\/en.wikipedia.org\/wiki\/Broadway_(Nashville,_Tennessee)\",\n    \"data source\": \"Wikipedia\",\n    \"matches\": &#91;\n        {\n            \"text\": \"Broadway Street\",\n            \"confidence_score\": 0.13\n        }\n    ]\n}\n{\n    \"name\": \"Culture of the Southern United States\",\n    \"url\": \"https:\/\/en.wikipedia.org\/wiki\/Culture_of_the_Southern_United_States\",\n    \"data source\": \"Wikipedia\",\n    \"matches\": &#91;\n        {\n            \"text\": \"Southern culture\",\n            \"confidence_score\": 0.81\n        }\n    ]\n}<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Again, this is coming from a generically trained model.  If you have data specific to a domain such as a company or event that is not widely known, you can train a model to recognize entities from that data.  There is no SDK for training custom NER models, but you can use the REST API or Language Studio.  Training a custom NER model is outside the scope of this post.  
But you can try out the other features from before using Language Studio so let&#8217;s take a look at it.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Using Language Studio<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">If you want to use the features of the Azure AI Language Service without creating an application, you can use Language Studio.  Access the Language Studio by going to <a href=\"https:\/\/language.cognitive.azure.com\/\">https:\/\/language.cognitive.azure.com\/<\/a> in your browser.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"568\" src=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/01-5-1024x568.png\" alt=\"\" class=\"wp-image-97\" srcset=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/01-5-1024x568.png 1024w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/01-5-300x166.png 300w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/01-5-768x426.png 768w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/01-5-1536x851.png 1536w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/01-5-2048x1135.png 2048w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption class=\"wp-element-caption\">The Azure Language Studio<\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">The <strong>Extract Information<\/strong> tab lets you use the NER and linked entity features discussed in this post, but you can also use the extract key phrases feature from the previous post.  And under the <strong>Classify text<\/strong> tab you can find sentiment analysis.  
For this demo, I&#8217;ll click on the <strong>Extract named entities<\/strong> link.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To use named entity recognition in Azure Language Studio you&#8217;ll have to choose a model version (the most recent is selected), the language of the text to analyze, and the name of a Language service instance in your Azure subscription.  In the textarea paste the text to analyze or select one of the provided examples.  <\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"592\" src=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/02-4-1024x592.png\" alt=\"\" class=\"wp-image-98\" srcset=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/02-4-1024x592.png 1024w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/02-4-300x174.png 300w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/02-4-768x444.png 768w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/02-4-1536x888.png 1536w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/02-4-2048x1184.png 2048w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Below the examples is a checkbox.  You must check the box to acknowledge that using Language Studio will consume resources from your Language Service instance. 
Then click the <strong>Run<\/strong> button to start.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"120\" src=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/03-4-1024x120.png\" alt=\"\" class=\"wp-image-99\" srcset=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/03-4-1024x120.png 1024w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/03-4-300x35.png 300w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/03-4-768x90.png 768w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/03-4-1536x179.png 1536w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/03-4.png 1626w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Azure Language Studio will show the results in different ways.  The cards view shows a card for each extracted entity, with the category and subcategory at the top, followed by the matched text and the confidence score.  You can also click on the <strong>JSON<\/strong> tab to see the raw data returned by the REST API.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"528\" src=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/04-4-1024x528.png\" alt=\"\" class=\"wp-image-100\" srcset=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/04-4-1024x528.png 1024w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/04-4-300x155.png 300w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/04-4-768x396.png 768w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/04-4-1536x793.png 1536w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/04-4-2048x1057.png 2048w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">The other view is the original text of the document.  
In this view, each entity is annotated with a link.  Hovering over an entity shows its card with additional information such as the category.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"621\" src=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/05-3-1024x621.png\" alt=\"\" class=\"wp-image-101\" srcset=\"https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/05-3-1024x621.png 1024w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/05-3-300x182.png 300w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/05-3-768x465.png 768w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/05-3-1536x931.png 1536w, https:\/\/douglasstarnes.dev\/wp-content\/uploads\/2024\/11\/05-3.png 1812w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Summary<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">In this post, you saw how to use the named entity recognition (NER) feature of the Azure AI Language Service.  You learned how to use the Python SDK to get the details of each entity.  The SDK provides methods to extract named entities with both a general-purpose model and a domain-specific model for health care.  
The Azure Language Studio provides access to entity recognition and the other features of the Azure AI Language Service in the browser.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In the previous post on the Azure AI Language, you saw how to provision and&#8230;<\/p>\n","protected":false},"author":1,"featured_media":107,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[10],"tags":[7,8,9,14,4],"class_list":["post-96","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-microsoft-azure","tag-ai","tag-azure","tag-language","tag-ner","tag-python"],"_links":{"self":[{"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/posts\/96","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/comments?post=96"}],"version-history":[{"count":2,"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/posts\/96\/revisions"}],"predecessor-version":[{"id":103,"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/posts\/96\/revisions\/103"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/media\/107"}],"wp:attachment":[{"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/media?parent=96"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/categories?post=96"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/douglasstarnes.dev\/index.php\/wp-json\/wp\/v2\/tags?post=96"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/
{rel}","templated":true}]}}