Date: 2023-10-19

The goal of Natural Language Processing (NLP) is to enable computers to “understand” human language, whether written or spoken. It’s not a new technology. In fact, it’s been in development for over 60 years, since researchers first tried to use early computing technologies to translate languages automatically. It’s given us the current proliferation of digital assistants and chatbots, but there are many other types of NLP. Our teams focus on four of them: entity extraction, key phrase extraction, text classification, and semantic text similarity.



Entity extraction plays a pivotal role in our NLP capabilities, changing how we pull pertinent information out of text. It works like a detective, combing through large volumes of text to identify specific entities such as individuals, organizations, and geographical locations. In our work, we use entity extraction to support Know Your Customer (KYC) processes by pinpointing entities associated with financial misconduct and regulatory infringements.
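
To make the idea concrete, here is a minimal sketch of entity extraction using a lookup list (a gazetteer) and the Python standard library. It is an illustration only: the names in GAZETTEER and the extract_entities function are hypothetical, and a production system would more likely rely on a trained named entity recognition model than on a fixed list.

```python
import re

# Hypothetical gazetteer for illustration; a real system would typically
# use a trained named entity recognition model instead of a fixed list.
GAZETTEER = {
    "PERSON": ["Jane Doe"],
    "ORGANIZATION": ["Acme Holdings", "Northwind Bank"],
    "LOCATION": ["Geneva", "Singapore"],
}


def extract_entities(text):
    """Return (entity_text, label, start_offset) tuples found in text."""
    found = []
    for label, names in GAZETTEER.items():
        for name in names:
            # \b stops "Geneva" from matching inside a longer word.
            pattern = r"\b" + re.escape(name) + r"\b"
            for match in re.finditer(pattern, text):
                found.append((match.group(), label, match.start()))
    # Sort by position so entities come back in reading order.
    return sorted(found, key=lambda item: item[2])


sample = "Regulators in Geneva fined Acme Holdings after Jane Doe resigned."
for entity_text, label, offset in extract_entities(sample):
    print(label, entity_text, offset)
```

Each result carries the matched text, its entity type, and where it appears, which is the kind of output a KYC screening step could then compare against a watchlist.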



Key phrase extraction is designed to unlock valuable insights from textual data. It identifies the essential keywords or phrases within a document, distilling the most critical information. In compliance and third-party risk management, key phrase extraction pinpoints and highlights terms related to compliance, risks, and other important data points.
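
As a rough illustration of how key phrases can be identified, the sketch below scores one- and two-word phrases by how often they occur, skipping common stopwords. The stopword list, the scoring rule (occurrences multiplied by phrase length), and the function names are assumptions made for this example, not a description of our production method.

```python
import re
from collections import Counter

# Small illustrative stopword list; real lists are much longer.
STOPWORDS = {"a", "an", "and", "for", "in", "is", "of", "on", "the", "to", "with"}


def candidate_segments(text):
    """Split text into runs of consecutive non-stopword words."""
    segments, current = [], []
    # Each token is either a lowercase word or a single non-letter character.
    for token in re.findall(r"[a-z]+|[^a-z\s]", text.lower()):
        if token.isalpha() and token not in STOPWORDS:
            current.append(token)
        elif current:
            # A stopword or punctuation mark ends the current run.
            segments.append(current)
            current = []
    if current:
        segments.append(current)
    return segments


def key_phrases(text, top_n=3, max_words=2):
    """Return the top_n phrases, scored by occurrences times phrase length."""
    scores = Counter()
    for segment in candidate_segments(text):
        for size in range(1, max_words + 1):
            for start in range(len(segment) - size + 1):
                phrase = " ".join(segment[start:start + size])
                # Adding the phrase length on each occurrence lets a repeated
                # two-word phrase outrank the single words it contains.
                scores[phrase] += size
    return [phrase for phrase, _ in scores.most_common(top_n)]


sample = (
    "The supplier failed a sanctions screening. "
    "The sanctions screening flagged the supplier for bribery risk."
)
print(key_phrases(sample, top_n=1))  # ['sanctions screening']
```

Weighting by phrase length is one simple design choice among many; other approaches weigh terms against a background corpus so that words common to every document score lower.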



Text classification is the task of assigning predefined categories to free text. Classifiers can be used to organize, structure, and categorize text. For example, a document, paragraph, or sentence can be classified as risk relevant or non-risk relevant. A sentence might also be classified as factual (not to be confused with true) or as opinion.
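
The sketch below shows one way a classifier could assign the risk relevant and non-risk relevant labels mentioned above. It uses a small Naive Bayes model with add-one smoothing, written with the standard library only. The training sentences are invented for illustration, and Naive Bayes is our choice for this example rather than a statement about the models we run in production.

```python
import math
from collections import Counter, defaultdict

# Invented training examples, for illustration only.
TRAINING_DATA = [
    ("company fined for bribery", "risk relevant"),
    ("executive charged with fraud", "risk relevant"),
    ("bank sanctioned for money laundering", "risk relevant"),
    ("company opens new office", "non-risk relevant"),
    ("executive joins charity run", "non-risk relevant"),
    ("bank launches mobile app", "non-risk relevant"),
]


def train(examples):
    """Count words per label and documents per label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log-probability for text."""
    word_counts, label_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Start from the log prior: how common this label is overall.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one (Laplace) smoothing keeps unseen words from zeroing
            # out the probability of a label.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAINING_DATA)
print(classify("supplier charged with bribery", model))  # risk relevant
print(classify("bank opens new office", model))          # non-risk relevant
```

The same structure works at the document, paragraph, or sentence level; only the unit of text passed to classify changes.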



Semantic text similarity offers a powerful way to measure the resemblance between pieces of text. Think of it as a language understanding wizard that not only recognizes identical phrases but also recognizes when different wording carries the same underlying meaning. In our work, we use semantic text similarity to identify duplicate news articles, so that businesses can efficiently spot replicated content, maintain content quality, and avoid redundancy. Our algorithms analyze the semantic structure of text to determine its likeness, helping organizations maintain a strong editorial stance while ensuring that their readers receive fresh and valuable insights every time.
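
The comparison step behind duplicate detection can be sketched as follows. Note the simplification: this example builds word-count vectors, which capture only shared vocabulary, as a stand-in for the embedding vectors a semantic model would produce. The cosine comparison and the thresholding work the same way on either kind of vector. The 0.8 threshold and the function names are assumptions for illustration.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn text into a word-count vector (a stand-in for an embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        # An empty text has no direction, so treat it as dissimilar.
        return 0.0
    return dot / (norm_a * norm_b)


def is_duplicate(first, second, threshold=0.8):
    """Flag two texts as duplicates when similarity meets the threshold."""
    score = cosine_similarity(vectorize(first), vectorize(second))
    return score >= threshold


original = "Acme Holdings fined for bribery in Geneva"
rewrite = "Acme Holdings was fined for bribery in Geneva"
unrelated = "Northwind Bank launches a new mobile app"

print(is_duplicate(original, rewrite))    # True
print(is_duplicate(original, unrelated))  # False
```

With embeddings in place of word counts, two articles that report the same event in entirely different words could also score above the threshold, which is what distinguishes semantic similarity from simple word overlap.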