Every day, hundreds of thousands of news stories are published online. At the same time, governments and agencies constantly update and refine their watchlists and databases. It's an unimaginably huge volume of data, and it's also unstructured, meaning traditional database technology can't easily query it. Yet hidden within this vast resource are vital insights that can protect businesses from counterparty risk, such as an adverse media mention or a new listing on a sanctions database.
Natural language processing (NLP) is a building block of AI and a fast-evolving technology ideally suited to making sense of unstructured data, at a time when organizations depend more than ever on the insights that data can provide.
The goal of NLP is to enable computers to "understand" human language, whether written or spoken. It's not a new technology; it has been in development for over 60 years, since researchers first tried to use early computing technology to translate languages automatically. It has given us the current proliferation of digital assistants and chatbots, but there are many types of NLP. Our teams focus on entity extraction, key phrase extraction, text classification, and semantic text similarity.
Entity extraction plays a pivotal role in our NLP capabilities, fundamentally changing how we pull relevant information from text. It works like a discerning detective, combing through large volumes of text to identify specific entities such as people, organizations, and geographical locations. In our work, we use entity extraction to reinforce KYC processes, efficiently pinpointing entities associated with financial misconduct and regulatory infringements.
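To make this concrete, here is a minimal sketch of entity extraction in Python. The watchlist, labels, and names below are hypothetical, and production systems use trained statistical models rather than simple dictionary lookups; this only illustrates the input and output of the task.

```python
# Illustrative entity extraction via a hand-built gazetteer (watchlist).
# All names and labels here are made up for the example.
WATCHLIST = {
    "Acme Holdings": "ORGANIZATION",
    "Jane Doe": "PERSON",
    "Ruritania": "LOCATION",
}

def extract_entities(text: str) -> list[tuple[str, str]]:
    """Return (entity, label) pairs found in the text."""
    return [(entity, label) for entity, label in WATCHLIST.items()
            if entity in text]

article = "Regulators fined Acme Holdings after Jane Doe moved funds via Ruritania."
print(extract_entities(article))
```

A real extractor would also resolve spelling variants and disambiguate entities that share a name, which is where the statistical models earn their keep.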
Key phrase extraction unlocks valuable insights from textual data by identifying the essential keywords or phrases within a document and distilling its most critical information. In the context of compliance and third-party risk management, it plays a crucial role by pinpointing and highlighting keywords related to compliance issues, risks, and key data points.
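A simple frequency-based sketch shows the idea: score words after removing stopwords and keep the highest-scoring ones. This is an assumption for illustration, not our production algorithm, which would use richer statistics such as TF-IDF or graph-based ranking.

```python
# Toy key phrase extraction: rank non-stopword terms by frequency.
from collections import Counter
import re

STOPWORDS = {"the", "a", "an", "of", "to", "and", "was", "for", "in", "on"}

def key_phrases(text: str, top_n: int = 3) -> list[str]:
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [w for w, _ in counts.most_common(top_n)]

doc = ("The regulator issued a sanctions notice. The sanctions "
       "notice cited money laundering and laundering of funds.")
print(key_phrases(doc))
```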
Text classification is the task of assigning predefined categories to free text. Classifiers can be used to organize, structure, and categorize documents. For example, a document, paragraph, or sentence can be classified as risk relevant or non-risk relevant, and a sentence might be classified as factual (which is not the same as true) or opinion.
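The risk-relevant example can be sketched with a toy keyword classifier. The risk lexicon below is hypothetical, and in practice this decision is made by a trained model rather than a keyword match, but the input and output are the same shape.

```python
# Toy binary classifier: a sentence is "risk-relevant" if it mentions
# any term from a (hypothetical) risk lexicon.
RISK_TERMS = {"fraud", "sanctions", "bribery", "laundering", "embezzlement"}

def classify(sentence: str) -> str:
    tokens = set(sentence.lower().replace(".", "").split())
    return "risk-relevant" if tokens & RISK_TERMS else "non-risk-relevant"

print(classify("The court convicted the director of fraud."))   # risk-relevant
print(classify("The company opened a new office in Berlin."))   # non-risk-relevant
```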
Semantic text similarity offers a powerful way to measure the resemblance between pieces of text: it not only recognizes identical phrases but also comprehends their underlying meaning. In our work, we harness semantic text similarity to identify duplicate news articles, ensuring that businesses can efficiently spot replicated content, maintain content quality, and avoid redundancy. Our algorithms analyze the semantic structure of text to determine its likeness, helping organizations maintain a strong editorial stance while ensuring that their readers receive fresh, valuable insights every time.
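As a simplified stand-in, here is cosine similarity over bag-of-words vectors. Real semantic similarity systems compare dense embeddings, so that two sentences with different wording but the same meaning also score highly; this sketch captures only the scoring mechanics.

```python
# Cosine similarity between two texts using word-count vectors.
from collections import Counter
import math

def cosine_similarity(a: str, b: str) -> float:
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

s1 = "regulator fines bank for compliance failures"
s2 = "bank fined by regulator for compliance failures"
print(round(cosine_similarity(s1, s2), 2))  # high overlap, near-duplicate
```

Duplicate detection then reduces to flagging article pairs whose score exceeds a chosen threshold.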
Because NLP analyzes human language, it relies on training data to interpret text accurately, measure context, and determine which parts are important. The more data you expose the algorithm to, the more comprehensive the results.
Our data reservoir is fed daily by 200,000 sources across 210 jurisdictions, covering 70+ languages, and our database contains more than 19 million curated profiles built to surface unique risk. To address the challenge of analyzing and interpreting this mass of structured and unstructured data, we've been using NLP in our screening engine for more than a decade.
For unstructured media, we use tools for word tokenization, text classification, entity-event extraction, and key phrase extraction. Using both NLP and machine learning, we provide unique risk profiles: summaries of material risks, annotated with sources. Coupled with a bibliography of relevant media reports and a full audit trail of creation and modification, this process solves long-standing industry problems. It removes multiple reference points, eliminates redundant and duplicate media reports, and avoids the proliferation of separate reports housed under a common name.
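Word tokenization, the first step in that pipeline, can be sketched with a short regex-based tokenizer. Production tokenizers handle far more cases (abbreviations, hyphenation, and the 70+ languages mentioned above); this version is illustrative only.

```python
# Minimal word tokenizer: words (with internal apostrophes) plus
# punctuation as separate tokens.
import re

def tokenize(text: str) -> list[str]:
    return re.findall(r"\w+(?:'\w+)?|[^\w\s]", text)

print(tokenize("Moody's screened 19 million profiles, daily."))
```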
Using our rich set of training data, we're always working to refine and improve the effectiveness of our screening technology. Currently, we're experimenting with Large Language Models (LLMs) and the smart use of Gen AI to help our customers become more efficient.
Large Language Models are, at their core, AI systems designed to understand and generate human-like text. They learn from massive amounts of text data to grasp the nuances of language, using neural networks to predict what comes next in a sentence or to generate coherent responses. Our work harnesses the power of Large Language Models to provide cutting-edge solutions in natural language understanding, text generation, and content optimization. By leveraging these models, we enable businesses to enhance customer interactions, automate content creation, and gain deeper insights from textual data. Whether it's improving chatbots, automating content generation, or enhancing content quality, our expertise in LLMs empowers organizations to stay at the forefront of AI-driven innovation.
Generative AI, or Gen AI, is a remarkable technology that brings a touch of creativity to the realm of artificial intelligence. At its essence, Gen AI is designed to understand patterns in data and generate new, human-like content. Imagine it as a digital artist capable of crafting text, images, and more. In our work, we leverage the power of Gen AI to offer cutting-edge solutions like automated content generation and chat-based research and investigation tools. Gen AI analyzes vast datasets to produce written content that's not only coherent but also tailored to specific needs. Whether it's creating engaging articles or assisting in complex research tasks, our expertise in Gen AI enables us to deliver innovative solutions that streamline processes and unlock new possibilities.
Explainable AI, or XAI, is the key to demystifying the magic behind advanced AI systems. It's all about making AI more transparent and understandable for everyone. At its core, XAI unravels the inner workings of AI models, shedding light on how they arrive at their predictions or decisions. In our work, we specialize in explaining Natural Language Processing (NLP) models by revealing the predictive power of words. This means we can not only tell you what the model predicts but also why it makes those predictions. By making AI more explainable, we empower businesses to trust and utilize these powerful technologies with confidence, ensuring transparency and accountability in every AI-driven decision-making process.
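The idea of revealing the predictive power of words can be sketched with a linear text classifier, where each word's learned weight is its direct contribution to the score, so a prediction decomposes word by word. The weights below are hypothetical, not taken from a real trained model.

```python
# Illustrative word-level explanation for a linear text classifier.
# Positive weights push toward "risk", negative weights away from it.
WEIGHTS = {"fraud": 2.1, "fined": 1.4, "charity": -0.8, "donation": -1.2}

def explain(sentence: str) -> list[tuple[str, float]]:
    """Return (word, contribution) pairs, strongest influence first."""
    contributions = [(w, WEIGHTS[w])
                     for w in sentence.lower().split() if w in WEIGHTS]
    return sorted(contributions, key=lambda wc: abs(wc[1]), reverse=True)

print(explain("director fined over fraud at charity"))
# Each pair shows how much that word pushed the risk score up or down.
```

For non-linear models the same question is answered with attribution techniques, but the output has the same flavor: a per-word account of why the model decided what it did.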
Moody’s Analytics KYC is transforming risk and compliance, creating a world where risk is understood so decisions can be made with confidence.
Our intelligent screening solutions are designed to help organizations with screening and ongoing risk monitoring, as well as a host of other areas of compliance and third-party risk management. By bringing together data and leading-edge algorithms purposely developed for AML screening, Moody’s Analytics can dramatically improve screening efficiency, eliminate false positives, and improve the productivity of compliance analysts.
Customers build their own unique know your customer (KYC) ecosystems. They use our innovative AI, flexible workflow orchestration, access to real-time data, analytical insights, and integrations with other global data providers to support a risk-based approach to compliance.
Contact us to learn more about how language-based AI can improve your risk management programs.