
Text comprehension and automated text generation with NLP, NLU and NLG

So far, the practical examples in our introduction to AI have largely steered clear of text comprehension and text generation by ML. For good reason, we have focused primarily on two types of problems: classifying images and predicting numerical values.

This is because in such tasks it is obvious what the input and output of a machine learning algorithm are: images can be expressed in numbers as a sequence of colour and brightness values. Numerical problems such as “How many times per minute do crickets chirp at a summer temperature of 27°C?” are already expressed in numbers, so it is intuitively clear that a neural network, for example, can “calculate” with them.

Challenge for the algorithm: language and text

But what if, for example, you want to know whether an email is spam or not? Or if you want to pick out the most urgent and the most negative messages from a mass of customer requests in order to prioritise them? These are also classification problems. However, it is not immediately clear how an email or a request can be expressed in numbers in such a way that machine learning algorithms can be applied to it.
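One simple way to make the idea concrete is to count words. The following is only a minimal sketch of such a "bag of words" representation, not necessarily the method used later in this series; the example emails and the function names `build_vocabulary` and `to_count_vector` are assumptions made up for illustration.

```python
from collections import Counter


def build_vocabulary(texts):
    """Map every distinct lowercase word in the texts to a fixed index."""
    words = sorted({word for text in texts for word in text.lower().split()})
    return {word: index for index, word in enumerate(words)}


def to_count_vector(text, vocabulary):
    """Turn one text into a list of word counts, one slot per vocabulary word."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


# Made-up example emails (assumption, not real data).
emails = [
    "win money now",
    "meeting moved to monday",
    "win a free prize now",
]

vocabulary = build_vocabulary(emails)
vectors = [to_count_vector(email, vocabulary) for email in emails]

for email, vector in zip(emails, vectors):
    print(vector, "<-", email)
```

Every email becomes a list of numbers of the same length, which is the kind of input a classification algorithm can work with. Word order is lost in this representation, which is one reason why more refined methods exist.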

In the next articles, we will look specifically at which methods and functions are used in NLP, that is, in the automated understanding and generation of texts. Today, however, we first want to give an overview. What exactly is NLP? Which sub-areas does it cover, and what are the current challenges and possible applications?

Natural Language Processing - NLP

The technical term for understanding and processing text is “natural language processing”, usually abbreviated to NLP. In German, the term “computational linguistics” is still often used. However, the term NLP is increasingly gaining acceptance in this country as well, which is why we also want to use it.

For outsiders, it may sound strange that we always speak of “natural” language. Is there such a thing as “unnatural” language? The emphasis comes from the fact that computers were able to understand languages perfectly very early on - but only those that were made for them, namely programming languages. To make a clear distinction, the term “natural” was chosen, since human language (with a few exceptions) is not formally constructed but has developed naturally.

Written language vs. spoken word in ML

To clear up a common misunderstanding right away: NLP is about processing written language, not the spoken word. The conversion of spoken language into text is called speech-to-text (STT). The reverse direction, used for example by screen readers, is called text-to-speech (TTS). Since audio signal processing also plays a role here, neither is usually counted as part of NLP.

The same applies to the conversion of scanned documents into text files (optical character recognition - OCR) and to handwriting recognition. Since these tasks are more about recognising visual components, they (like STT and TTS) do not properly belong to NLP. Simply put, NLP is anything where a simple text file (.txt) can serve as input and/or output to an AI system.

Breakdown of NLP into Understanding (NLU) and Generation (NLG)

NLP breaks down further into natural language understanding (NLU) and natural language generation (NLG).

Understanding of text by an AI model (NLU)

In NLU, a computer program must “understand” some aspect of the text in order to solve a problem. Most of the time, however, this understanding of text by the AI is only focused on a very specific aspect, so it is not the same understanding that a human has when reading a text, for example.

NLU starts with tasks that we still know from primary school - grammatical analysis of texts, e.g.:

  • Distinguishing types of words: What is a verb, what is an adjective?
  • Recognising plural, singular and grammatical gender
  • Recognising cases

NLU becomes more complex when the meaning of words (semantics) plays a role alongside simple grammatical tasks. The tasks then become more difficult, but also more exciting and useful:

  • Distinguishing between subject and object (Who does something to what?)
  • Recognising people, places, products, brands etc. (What is being talked about in a text?)
  • Recognising key words in a text (Which words are particularly important? What is a customer complaining about?)
  • Classifying text (Contract or invoice? Spam or important email? Urgent task or small talk? Positive or negative customer feedback?)

With systems trained to handle such tasks, simple office tasks that were previously reserved for humans can be automated. Documents and emails can be recognised automatically and sent to the right recipients, and important information such as invoice recipients or order numbers can be extracted automatically and transferred to CRM and ERP systems.
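The kind of extraction and routing described above can be illustrated with a rule-based stand-in. The sketch below is not a trained NLU model: it uses a regular expression and simple keyword checks instead. The order-number format (`ORD-` followed by six digits), the department names and the function names are assumptions made up for illustration.

```python
import re

# Hypothetical order-number format: "ORD-" followed by exactly six digits.
ORDER_NUMBER = re.compile(r"\bORD-\d{6}\b")


def extract_order_numbers(text):
    """Return all order numbers found in the text, in order of appearance."""
    return ORDER_NUMBER.findall(text)


def route(text):
    """Pick a (hypothetical) department based on simple keyword rules."""
    if "invoice" in text.lower():
        return "accounting"
    if extract_order_numbers(text):
        return "order_management"
    return "general_inbox"


email = "Hello, my order ORD-123456 arrived damaged. Please also check ORD-654321."

print(extract_order_numbers(email))
print(route(email))
```

Hand-written rules like these break as soon as the wording or the number format changes, which is why trained systems are attractive for such tasks.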

Generation of texts by an AI model (NLG)

The counterpart to NLU is natural language generation (NLG). Here the focus is on language as the output of the system. Classic NLG applications include:

  • Generation of text based on machine-readable data, such as weather reports, sports reports, financial texts or product descriptions.
  • Summarising texts
  • Translation of texts

NLG systems open up new business areas that would not have been profitable in the past because of the high cost of manual text creation, i.e. the “long tail”. For example, a large number of differentiated landing pages can be created for SEO, which would not be affordable without machine support.

Combined with a good NLU system, NLG systems become even more interesting, as the automatic generation of text makes it possible to fully automate significantly more tasks than understanding alone. Chatbots are a typical example: they have to understand the request of a conversation partner (NLU) and then generate a suitable answer (NLG).

In the next articles, we will take a closer look at how machine learning and, in particular, deep learning systems can solve such problems.


What is NLP?

The abbreviation NLP stands for natural language processing and refers to a branch of machine learning in which natural (i.e. human) language is processed automatically by algorithms.

NLP is in turn divided into two sub-areas, namely NLU (natural language understanding), i.e. the automated understanding of language, and NLG (natural language generation), the automated generation of texts. Ideally, both tasks are solved by combining different ML methods (e.g. determination of word type, case or gender, classification of different text genres, recognition of proper names or keywords).

One challenge in NLP is that the inputs differ from text to text, for example in sentence length, whereas many ML methods expect inputs of a fixed size. This variation must be taken into account.
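One common way to deal with inputs of varying length is to cut or pad every sequence to the same length. The sketch below assumes that the sentences have already been converted into lists of word indices; the index values, the reserved padding index `0` and the function name `pad_sequences` are assumptions made up for illustration.

```python
PAD = 0  # index reserved for padding (assumption)


def pad_sequences(sequences, length, pad_value=PAD):
    """Truncate or right-pad every sequence to exactly `length` items."""
    return [
        seq[:length] + [pad_value] * (length - len(seq))
        for seq in sequences
    ]


# Three "sentences" of different length, already mapped to word indices.
sentences = [[4, 9], [7, 2, 5, 1], [3]]

padded = pad_sequences(sentences, length=3)
print(padded)
```

After this step every input has the same size, at the price of losing the end of long sentences and adding filler values to short ones.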

What is the difference between NLU and NLG?

Both NLU and NLG are approaches to automatically process natural language using artificial intelligence and machine learning.

NLU stands for natural language understanding - the algorithm is trained to interpret machine-readable texts correctly. Example applications are spam filters that reliably sort out unwanted mails, ML systems that correctly classify and file documents, and sentiment analysis, which can be used to pre-sort and prioritise customer enquiries or reviews based on the sentiment of the text.

NLG (short for natural language generation) refers to the automated generation of texts. As with other ML approaches, the error-free input of machine-readable data and a successful training phase are necessary for this. Correctly trained, the system can then produce, for example, weather reports or sports news.

Why do we speak of natural language in NLP (natural language processing)?

The term “natural language” is used explicitly to exclude, for example, computer or programming languages. NLP (also known as computational linguistics) is about the processing of human language.

Is NLP (natural language processing) also applied to the processing and generation of spoken language?

NLP refers exclusively to machine-readable text. In order to process spoken language, it must first be converted into text. This conversion is called speech-to-text (STT); the reverse direction, used for example by a screen reader, is called text-to-speech (TTS). Since audio signal processing also plays a role here, neither is usually counted as part of NLP.

The same applies to document capture or handwriting recognition by scanning, where methods of image recognition (optical character recognition - OCR) are applied.