As long as Artificial Intelligence helps us to get more out of the natural language, we see more tasks and fields mushrooming at the intersection of AI and linguistics. In one of our previous articles, we discussed the difference between Natural Language Processing and Natural Language Understanding. Both fields, however, have natural languages as input. At the same time, the urge to establish two-way communication with computers has lead to the emergence of a separate subcategory of tasks dealing with producing (quasi)-natural speech. This subcategory, called Natural Language Generation will be the focus of this blog post.
What is NLG?
Natural Language Generation, as defined by Artificial Intelligence: Natural Language Processing Fundamentals, is the “process of producing meaningful phrases and sentences in the form of natural language.” In its essence, it automatically generates narratives that describe, summarize or explain input structured data in a human-like manner at the speed of thousands of pages per second.
However, while NLG software can write, it can’t read. The part of NLP that reads human language and turns its unstructured data into structured data understandable to computers is called Natural Language Understanding.
In general terms, NLG (Natural Language Generation) and NLU (Natural Language Understanding) are subsections of a more general NLP domain that encompasses all software which interprets or produces human language, in either spoken or written form:
- NLU takes up the understanding of the data based on grammar, the context in which it was said and decide on intent and entities.
- NLP converts a text into structured data.
- NLG generates a text based on structured data.
Major applications of NLG
NLG makes data universally understandable making the writing of data-driven financial reports, product descriptions, meeting memos, and more much easier and faster. Ideally, it can take the burden of summarizing the data from analysts to automatically write reports that would be tailored to the audience.The main practical present-day applications of NLG are, therefore, connected with writing analysis or communicating necessary information to customers:
Practical Applications of NLG
At the same time, NLG has more theoretical applications that make it a valuable tool not only in Computer Science and Engineering, but also in Cognitive Science and Psycholinguistics. These include:
NLG Applications in Theoretical Research
Evolution of NLG Design and Architecture
In the attempts to mimic human speech, NLG systems used different methods and tricks to adapt their writing style, tone and structure according to the audience, the context and purpose of the narrative. In 2000 Reiter and Dale pipelined NLG architecture distinguishing three stages in the NLG process:
1. Document planning: deciding what is to be said and creating an abstract document that outlines the structure of the information to be presented.
2. Microplanning: generation of referring expressions, word choice, and aggregation to flesh out the document specifications.
3.Realisation: converting the abstract document specifications to a real text, using domain knowledge about syntax, morphology, etc.
Three Stages of the NLG Process
This pipeline shows the milestones of natural language generation, however, specific steps and approaches, as well as the models used, can vary significantly with the technology development.
There are two major approaches to language generation: using templates and dynamic creation of documents. While only the latter is considered to be “real” NLG, there was a long and multistage way from basic straightforward templates to the state-of-the-art and each new approach expanded functionality and added linguistic capacities:
Simple Gap-Filling Approach
One of the oldest approaches is a simple fill-in-the-gap template system. In texts that have a predefined structure and need just a small amount of data to be filled in, this approach can automatically fill in such gaps with data retrieved from a spreadsheet row, database table entry, etc. In principle, you can vary certain aspects of the text: for example, you can decide whether to spell numbers or leave them as is, this approach is quite limited in its use and is not considered to be “real” NLG.
Scripts or Rules-Producing Text
Basic gap-filling systems were expanded with general-purpose programming constructs via a scripting language or by using business rules. The scripting approach, such as using web templating languages, embeds a template inside a general-purpose scripting language, so it allows for complex conditionals, loops, access to code libraries, etc. Business rule approaches, which are adopted by most document composition tools, work similarly, but focus on writing business rules rather than scripts. Though more powerful than straightforward gap filling, such systems still lack linguistic capabilities and cannot reliably generate complex high-quality texts.
Word-Level Grammatical Functions
A logical development of template-based systems was adding word-level grammatical functions to deal with morphology, morphophonology, and orthography as well as to handle possible exceptions. These functions made it easier to generate grammatically correct texts and to write complex template systems.
Dynamic Sentence Generation
Finally taking a step from template-based approaches to dynamic NLG, this approach dynamically creates sentences from representations of the meaning to be conveyed by the sentence and/or its desired linguistic structure. Dynamic creation means that the system can do sensible things in unusual cases, without needing the developer to explicitly write code for every boundary case. It also allows the system to linguistically “optimise” sentences in a number of ways, including reference, aggregation, ordering, and connectives.
Dynamic Document Creation
While dynamic sentence generation works at a certain “micro-level”, the “macro-writing” task produces a document which is relevant and useful to its readers, and also well-structured as a narrative. How it is done depends on the goal of the text. For example, a piece of persuasive writing may be based on models of argumentation and behavior change to mimic human rhetoric; and a text that summarizes data for business intelligence may be based on an analysis of key factors that influence the decision.
Even after NLG shifted from templates to dynamic generation of sentences, it took the technology years of experimenting to achieve satisfactory results. As a part of NLP and, more generally, AI, natural language generation relies on a number of algorithms that address certain problems of creating human-like texts:
The Markov chain was one of the first algorithms used for language generation. This model predicts the next word in the sentence by using the current word and considering the relationship between each unique word to calculate the probability of the next word. In fact, you have seen them a lot in earlier versions of the smartphone keyboard where they were used to generate suggestions for the next word in the sentence.
Recurrent neural network (RNN)
Neural networks are models that try to mimic the operation of the human brain. RNNs pass each item of the sequence through a feedforward network and use the output of the model as input to the next item in the sequence, allowing the information in the previous step to be stored. In each iteration, the model stores the previous words encountered in its memory and calculates the probability of the next word. For each word in the dictionary, the model assigns a probability based on the previous word, selects the word with the highest probability and stores it in memory. RNN’s “memory” makes this model ideal for language generation because it can remember the background of the conversation at any time. However, as the length of the sequence increases, RNNs cannot store words that were encountered remotely in the sentence and makes predictions based on only the most recent word. Due to this limitation, RNNs are unable to produce coherent long sentences.
To address the problem of long-range dependencies, a variant of RNN called Long short-term memory (LSTM) was introduced. Though similar to RNN, LSTM models include a four-layer neural network. The LSTM consists of four parts: the unit, the input door, the output door and the forgotten door. These allow the RNN to remember or forget words at any time interval by adjusting the information flow of the unit. When a period is encountered, the Forgotten Gate recognizes that the context of the sentence may change and can ignore the current unit state information. This allows the network to selectively track only relevant information while also minimizing the disappearing gradient problem, which allows the model to remember information over a longer period of time.
Still, the capacity of the LSTM memory is limited to a few hundred words due to their inherently complex sequential paths from the previous unit to the current unit. The same complexity results in high computational requirements that make LSTM difficult to train or parallelize.
A relatively new model was first introduced in the 2017 Google paper “Attention is all you need”, which proposes a new method called “self-attention mechanism.” The Transformer consists of a stack of encoders for processing inputs of any length and another set of decoders to output the generated sentences. In contrast to LSTM, the Transformer performs only a small, constant number of steps, while applying a self-attention mechanism that directly simulates the relationship between all words in a sentence. Unlike previous models, the Transformer uses the representation of all words in context without having to compress all the information into a single fixed-length representation that allows the system to handle longer sentences without the skyrocketing of computational requirements.
One of the most famous examples of the Transformer for language generation is OpenAI, their GPT-2 language model. The model learns to predict the next word in a sentence by focusing on words that were previously seen in the model and related to predicting the next word. A more recent upgrade by Google, the Transformers two-way encoder representation (BERT) provides the most advanced results for various NLP tasks.
You can see that natural language generation is a complicated task that needs to take into account multiple aspects of language, including its structure, grammar, word usage and perception. Luckily, you probably won’t build the whole NLG system from scratch as the market offers multiple ready-to-use tools, both commercial and open-source.
Commercial NLG Tools
Arria NLG PLC is believed to be one of the global leaders in NLG technologies and tools and can boast the most advanced NLG engine and reports generated by NLG narratives. The company has patented NLG technologies available for use via Arria NLG platform.
AX Semantics: offers eCommerce, journalistic and data reporting (e.g. BI or financial reporting) NLG services for over 100 languages. It is a developer-friendly product that uses AI and machine learning to train the platform’s NLP engine.
Yseop is known for its smart customer experience across platforms like mobile, online or face-to-face. From the NLG perspective, it offers Compose that can be consumed on-premises, in the cloud or as a service, and offers Savvy, a plug-in for Excel and other analytics platforms.Quill by Narrative Science is an NLG platform powered by advanced NLG. Quill converts data to human-intelligent narratives by developing a story, analysing it and extracting the required amount of data from it.
Wordsmith by Automated Insights is an NLG engine that works chiefly in the sphere of advanced template-based approaches. It allows users to convert data into text in any format or scale. Wordsmith also provides a plethora of language options for data conversion.
Open-Source NLG Tools
Simplenlg is probably the most widely used open-source realiser, especially by system-builders. It is an open-source Java API for NLG written by the founder of Arria. It has the least functionality but also is the easiest to use and best documented.
NaturalOWL is an open-source toolkit which can be used to generate descriptions of OWL classes and individuals to configure an NLG framework to specific needs, without doing much programming.
NLG capabilities have become the de facto option as analytical platforms try to democratize data analytics and help anyone understand their data. Close to human narratives automatically explain insights that otherwise could be lost in tables, charts, and graphs via natural language and act as a companion throughout the data discovery process. Besides, NLG coupled with NLP are the core of chatbots and other automated chats and assistants that provide us with everyday support.
As NLG continues to evolve, it will become more diversified and will provide effective communication between us and computers in a natural fashion.