{"id":9205,"date":"2020-08-04T07:15:48","date_gmt":"2020-08-04T07:15:48","guid":{"rendered":"https:\/\/www.experfy.com\/blog\/?p=9205"},"modified":"2023-11-23T13:20:56","modified_gmt":"2023-11-23T13:20:56","slug":"temporal-graph-networks","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/ai-ml\/temporal-graph-networks\/","title":{"rendered":"Temporal Graph Networks"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"9205\" class=\"elementor elementor-9205\" data-elementor-post-type=\"post\">\n\t\t\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-1bba9292 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"1bba9292\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-6fcabcd4\" data-id=\"6fcabcd4\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-137b1ce7 elementor-widget elementor-widget-text-editor\" data-id=\"137b1ce7\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\n<p><strong><em>TL;DR<\/em><\/strong><em>\u00a0Many real-world problems involving networks of transactions of various nature and social interactions and engagements are dynamic and can be modelled as graphs where nodes and edges appear over time. In this post, we describe Temporal Graph Network, a generic framework for deep learning on dynamic graphs.<\/em><\/p>\n\n\n\n<p><em>This post was co-authored with\u00a0<\/em><a href=\"https:\/\/www.emanuelerossi.co.uk\/\" target=\"_blank\" rel=\"noreferrer noopener\"><em>Emanuele Rossi<\/em><\/a><em>.<\/em><\/p>\n\n\n<hr class=\"wp-block-separator\" \/>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-eafeeca elementor-widget elementor-widget-text-editor\" data-id=\"eafeeca\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\n<p><a href=\"https:\/\/www.experfy.com\/blog\/simple-scalable-graph-neural-networks\/\" target=\"_blank\" rel=\"noreferrer noopener\">Graph neural networks<\/a> (GNNs) research has surged to become one of the\u00a0<a href=\"https:\/\/twitter.com\/prlz77\/status\/1178662575900368903\" target=\"_blank\" rel=\"noreferrer noopener\">hottest topics<\/a>\u00a0in machine learning this year. GNNs have seen a series of recent successes in problems from the fields of biology, chemistry, social science, physics, and many others. So far, GNN models have been primarily developed for\u00a0<em>static<\/em>\u00a0graphs that do not change over time. However, many interesting real-world graphs are\u00a0<em>dynamic<\/em>\u00a0and evolving in time, with prominent examples including social networks, financial transactions, and recommender systems. 
In many cases, it is the dynamic behaviour of such systems that conveys important insights, otherwise lost if one considers only a static graph.

![](https://miro.medium.com/max/1023/0*kWGR5wsMTeL_PYZV)

*A dynamic network of Twitter users interacting with tweets and following each other. All the edges have a timestamp. Given such a dynamic graph, we want to predict future interactions, e.g., which tweet a user will like or whom they will follow.*

A dynamic graph can be represented as an ordered list or asynchronous "stream" of timed events, such as additions or deletions of nodes and edges [1]. A social network like Twitter is a good illustration to keep in mind: when a person joins the platform, a new node is created. When they follow another user, a follow edge is created. When they change their profile, the node is updated.

![](https://miro.medium.com/max/1185/0*RCKsJWbDLibekTDc)
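To make the event-stream representation concrete, here is a minimal Python sketch of what such a stream might look like; the class and field names are illustrative, not taken from the TGN paper or its reference implementation:

```python
from dataclasses import dataclass
from typing import Optional
import numpy as np

@dataclass
class Event:
    """One timed event in a continuous-time dynamic graph (illustrative names)."""
    t: float                            # timestamp of the event
    kind: str                           # "node_add", "node_update" or "edge_add"
    src: int                            # node being added/updated, or edge source
    dst: Optional[int] = None           # edge destination (edge events only)
    feats: Optional[np.ndarray] = None  # node or edge features, if any

# A tiny Twitter-like stream: a user joins, follows another user,
# then updates their profile.
stream = [
    Event(t=0.0, kind="node_add", src=42),
    Event(t=1.5, kind="edge_add", src=42, dst=7),
    Event(t=3.2, kind="node_update", src=42, feats=np.random.rand(16)),
]

# The model consumes events in chronological order.
for event in sorted(stream, key=lambda e: e.t):
    print(event.t, event.kind)
```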
This stream of events is ingested by an encoder neural network that produces a time-dependent embedding for each node of the graph. The embedding can then be fed into a decoder designed for a specific task. One example task is predicting future interactions by answering the question: what is the probability of having an edge between nodes *i* and *j* at time *t*? The ability to answer this question is crucial for recommender systems that, e.g., suggest social network users whom to follow or decide which content to display. The following figure illustrates this scenario:

![](https://miro.medium.com/max/1600/0*V_gp4Mq1eJNDnUJq)

The key piece in the above setup is the encoder, which can be trained with any decoder. On the aforementioned task of future interaction prediction, the training can be done in a self-supervised manner: during each epoch, the encoder processes the events in chronological order and predicts the next interaction based on the previous ones [2].

Temporal Graph Network (TGN) is a general encoder architecture we developed at Twitter with colleagues Fabrizio Frasca, Davide Eynard, Ben Chamberlain, and Federico Monti [3]. This model can be applied to various problems of learning on dynamic graphs represented as a stream of events. In a nutshell, a TGN encoder acts by creating compressed representations of the nodes based on their interactions, updating them upon each event. To accomplish this, a TGN has the following main components:

**Memory.** The memory stores the states of all the nodes, acting as a compressed representation of each node's past interactions. It is analogous to the hidden state of an RNN; however, here we have a separate state vector **s**ᵢ(*t*) for each node *i*. When a new node appears, we add a corresponding state initialised as a vector of zeros. Moreover, since the memory for each node is just a state vector (and not a parameter), it can also be updated at test time, when the model ingests new interactions.
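As a rough sketch of this component (not the reference implementation), the memory can be thought of as a table of state vectors keyed by node id, zero-initialised when a node first appears:

```python
import torch

class Memory:
    """Per-node state vectors s_i(t); a sketch with illustrative names.

    The states are plain tensors rather than learned parameters, which is why
    they can keep being updated at test time as new interactions arrive.
    """
    def __init__(self, memory_dim: int):
        self.memory_dim = memory_dim
        self.state = {}        # node id -> current state vector s_i
        self.last_update = {}  # node id -> time of the last memory update

    def get(self, node: int) -> torch.Tensor:
        # A node seen for the first time starts with an all-zero state.
        if node not in self.state:
            self.state[node] = torch.zeros(self.memory_dim)
            self.last_update[node] = 0.0
        return self.state[node]

    def set(self, node: int, new_state: torch.Tensor, t: float) -> None:
        self.state[node] = new_state
        self.last_update[node] = t
```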
**Message Function** is the main mechanism for updating the memory. Given an interaction between nodes *i* and *j* at time *t*, the message function computes two messages (one for *i* and one for *j*), which are used to update the memory. This is analogous to the messages computed in message-passing graph neural networks [4]. The message is a function of the memory of nodes *i* and *j* at the instant of time *t⁻* preceding the interaction, the interaction time *t*, and the edge features [5]:

![](https://miro.medium.com/max/2760/1*7GDtYEv0FD0Yk9SyrFTnsA.png)

![](https://miro.medium.com/max/1600/0*8805s5Nc7AMq2E9d)

**Memory Updater** is used to update the memory with the new messages. This module is usually implemented as an RNN:

![](https://miro.medium.com/max/2766/1*kuyeVC6PLG0C9FAjgMFMbQ.png)

![](https://miro.medium.com/max/1600/0*mpkrKms6NMu1psok)

Given that the memory of a node is a vector updated over time, the most straightforward approach would be to use it directly as the node embedding. In practice, however, this is a bad idea due to the *staleness* problem: since the memory is updated only when a node is involved in an interaction, a long period of inactivity causes a node's memory to go out of date. As an illustration, think of a user staying off Twitter for a few months. When the user returns, they might have developed new interests in the meantime, so the memory of their past activity is no longer relevant. We therefore need a better way to compute the embeddings.
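Here is a minimal sketch of how these two modules could be wired together, assuming a small MLP as a learnable message function and a GRU cell as the memory updater; the time encoding below is a simplified stand-in for the learnable time encoding used in the paper, and all names and dimensions are illustrative:

```python
import torch
import torch.nn as nn

class MemoryModules(nn.Module):
    """Message function + memory updater (a sketch; names are illustrative)."""
    def __init__(self, memory_dim: int, edge_dim: int, time_dim: int):
        super().__init__()
        # Message function: an MLP over [s_i(t-), s_j(t-), time encoding, edge features].
        self.msg = nn.Sequential(
            nn.Linear(2 * memory_dim + time_dim + edge_dim, memory_dim),
            nn.ReLU(),
        )
        # Simplified stand-in for a learnable time encoding.
        self.time_enc = nn.Linear(1, time_dim)
        # Memory updater: an RNN cell folding the message into the previous state.
        self.updater = nn.GRUCell(input_size=memory_dim, hidden_size=memory_dim)

    def message(self, s_i, s_j, delta_t, e_ij):
        """Message m_i(t) for node i from an interaction (i, j) with edge features
        e_ij; a second, symmetric call produces the message for node j."""
        phi = torch.cos(self.time_enc(delta_t.view(-1, 1)))
        return self.msg(torch.cat([s_i, s_j, phi, e_ij], dim=-1))

    def update(self, s_prev, m):
        """New memory s_i(t) from the previous state s_i(t-) and the message m_i(t)."""
        return self.updater(m, s_prev)
```

In the full model, a message aggregator first combines multiple messages arriving at the same node within a batch (for instance by keeping only the most recent one, or averaging them) before the memory update is applied.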
**Embedding.** A solution is to look at a node's neighbours. To solve the staleness problem, the embedding module computes the temporal embedding of a node by performing a graph aggregation over the spatiotemporal neighbours of that node. Even if a node has been inactive for a while, it is likely that some of its neighbours have been active, and by aggregating their memories, TGN can compute an up-to-date embedding for the node. In our example, even when a user stays off Twitter, their friends continue to be active, so when the user returns, the friends' recent activity is likely to be more relevant than the user's own history.

![](https://miro.medium.com/max/1600/0*_vccf5DvvDcpCylp)

*The graph embedding module computes the embedding of a target node by performing an aggregation over its temporal neighbourhood. In the above diagram, when computing the embedding for node 1 at some time t greater than t₂, t₃ and t₄, but smaller than t₅, the temporal neighbourhood includes only edges that occurred before time t. Therefore, the edge with node 5 is not involved in the computation, as it happens in the future. Instead, the embedding module aggregates both the features (v) and the memory (s) of neighbours 2, 3 and 4, as well as the features on the edges, to compute a representation for node 1. The best-performing graph embedding module in our experiments is graph attention, which is able to learn which neighbours are the most important based on their memory, features, and time of interaction.*
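Below is a minimal sketch of a single temporal attention layer built on a standard multi-head attention primitive. The actual TGN embedding module also feeds time encodings and edge features into the keys and values, so treat this purely as an illustration of the core idea: mask out future neighbours and attend over the rest (all names are ours):

```python
import torch
import torch.nn as nn

class TemporalAttentionEmbedding(nn.Module):
    """One-layer temporal graph attention (a sketch, not the paper's exact module)."""
    def __init__(self, dim: int, num_heads: int = 2):
        super().__init__()
        # dim must be divisible by num_heads.
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, h_target: torch.Tensor, h_neigh: torch.Tensor,
                neigh_times: torch.Tensor, t: float) -> torch.Tensor:
        # h_target: (dim,)     memory + features of the target node
        # h_neigh:  (K, dim)   memory + features of K sampled temporal neighbours
        # neigh_times: (K,)    interaction times of those neighbours
        # Only neighbours whose interaction happened strictly before the query
        # time t may be attended to; future edges are masked out.
        future = neigh_times >= t                  # True = ignore this neighbour
        q = h_target.view(1, 1, -1)                # (batch=1, query_len=1, dim)
        kv = h_neigh.unsqueeze(0)                  # (1, K, dim)
        out, _ = self.attn(q, kv, kv, key_padding_mask=future.unsqueeze(0))
        return out.view(-1)                        # (dim,) embedding of the target
```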
The overall computations performed by TGN on a batch of training data are summarised in the following figure:

![](https://miro.medium.com/max/1600/0*bS68H4ztcv6LfU9X)

*Computations performed by TGN on a batch of training data. On the one side, embeddings are produced by the embedding module using the temporal graph and the nodes' memory (1). The embeddings are then used to predict the batch interactions and compute the loss (2, 3). On the other side, these same interactions are used to update the memory (4, 5).*

Looking at the above diagram, you may be wondering how the memory-related modules (*Message function*, *Message aggregator*, and *Memory updater*) are trained, given that they seem not to directly influence the loss and therefore receive no gradient. For these modules to influence the loss, we would need to update the memory before predicting the batch interactions. However, that would cause leakage, since the memory would then already contain information about what we are trying to predict. The strategy we propose to get around this problem is to update the memory with messages coming from *previous* batches, and only then predict the interactions. The diagram below shows the flow of operations necessary to train the memory-related modules:

![](https://miro.medium.com/max/1600/0*VJJMrYj-X-Grj-DJ)

*Flow of operations of TGN necessary to train the memory-related modules. A new component is introduced, the raw message store, which stores the information necessary to compute messages (which we call raw messages) for interactions the model has processed in the past. This allows the model to delay the memory update brought by an interaction to later batches. First, the memory is updated using messages computed from raw messages stored in previous batches (1 and 2). The embeddings can then be computed using the just-updated memory (grey link) (3). In this way, the computation of the memory-related modules directly influences the loss (4, 5), and they receive a gradient. Finally, the raw messages for the interactions in the current batch are stored in the raw message store (6), to be used in future batches.*
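A sketch of such a store, with the resulting order of operations per batch spelled out in the comments (the numbers refer to the figure caption above; the class and method names are hypothetical):

```python
from collections import defaultdict

class RawMessageStore:
    """Keeps the raw material (i, j, t, edge features) needed to compute messages
    for interactions the model has already processed, so the corresponding memory
    updates can be delayed to a later batch. A sketch with made-up names."""
    def __init__(self):
        self._store = defaultdict(list)  # node id -> pending raw messages

    def push(self, interactions):
        # (6) After predicting a batch, stash its raw messages for future batches.
        for (i, j, t, e) in interactions:
            self._store[i].append((i, j, t, e))
            self._store[j].append((j, i, t, e))

    def pop(self, nodes):
        # (1) Before predicting the next batch involving `nodes`, retrieve (and
        # clear) their pending raw messages; these are turned into messages that
        # update the memory (2). Embeddings are then computed from the freshly
        # updated memory (3), the loss is evaluated on the batch (4, 5), and
        # gradients flow back into the memory-related modules.
        return [m for n in nodes for m in self._store.pop(n, [])]
```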
In extensive experimental validation on various dynamic graphs, TGN significantly outperformed competing methods [6] on the tasks of future edge prediction and dynamic node classification, both in terms of accuracy and speed. One such dynamic graph is Wikipedia, where users and pages are nodes, and an interaction represents a user editing a page. An encoding of the edit text is used as the interaction features. The task in this case is to predict which page a user will edit at a given time. We compared different variants of TGN with baseline methods:

![](https://miro.medium.com/max/1600/0*phEv4nTgVlfIlIor)

*Comparison of various configurations of TGN and older methods (TGAT and Jodie) on future link prediction on the Wikipedia dataset, in terms of prediction accuracy and time. We wish more papers reported both of these important criteria in a rigorous way.*

This ablation study sheds light on the importance of the different TGN modules and allows us to draw a few general conclusions. First, the memory is important: its absence leads to a large drop in performance [7]. Second, the use of an embedding module (as opposed to directly outputting the memory state) is important, and graph attention-based embedding appears to perform best. Third, having the memory makes it sufficient to use only one graph attention layer (which drastically reduces the computation time), since the memory of the 1-hop neighbours gives the model indirect access to 2-hop neighbour information.

As a concluding remark, we consider learning on dynamic graphs an almost virgin research area, with numerous important and exciting applications and a significant potential impact. We believe that our TGN model is an important step towards advancing the ability to learn on dynamic graphs, consolidating and extending previous results. As this research field develops, better and bigger benchmarks will become essential. We are now working on creating new dynamic graph datasets and tasks as part of the [Open Graph Benchmark](https://ogb.stanford.edu/docs/team/).

---

[1] This scenario is usually referred to as a "continuous-time dynamic graph".
For simplicity, here we consider only interaction events between pairs of nodes, represented as edges in the graph. Node insertions are treated as a special case of new edges, where one of the endpoints is not yet present in the graph. We do not consider node or edge deletions, implying that the graph can only grow over time. Since there may be multiple edges between a pair of nodes, technically speaking, the object we deal with is a [multigraph](https://en.wikipedia.org/wiki/Multigraph).

[2] Multiple interactions can have the same timestamp, and the model predicts each of them independently. Moreover, mini-batches are created by splitting the *ordered* list of events into contiguous chunks of a fixed size.

[3] E. Rossi et al. [Temporal graph networks for deep learning on dynamic graphs](https://arxiv.org/abs/2006.10637) (2020). arXiv:2006.10637.

[4] For simplicity, we assume the graph to be undirected. In the case of a directed graph, two distinct message functions, one for sources and one for destinations, would be needed.

[5] J. Gilmer et al. [Neural Message Passing for Quantum Chemistry](https://arxiv.org/abs/1704.01212) (2017). arXiv:1704.01212.

[6] There are only a handful of methods for deep learning on dynamic graphs, such as TGAT of D. Xu et al. [Inductive representation learning on temporal graphs](https://arxiv.org/abs/2002.07962) (2020), arXiv:2002.07962, and Jodie of S. Kumar et al. [Predicting dynamic embedding trajectory in temporal interaction networks](https://arxiv.org/abs/1908.01207) (2019), arXiv:1908.01207. We show that both methods can be obtained as particular configurations of TGN; for this reason, TGN appears to be the most general model for learning on dynamic graphs so far. Earlier works considered dynamic knowledge graphs, e.g. A. García-Durán et al. [Learning sequence encoders for temporal knowledge graph completion](https://www.aclweb.org/anthology/D18-1516.pdf) (2018). Proc. EMNLP.
[7] While the memory contains information about *all* past interactions of a node, the graph embedding module has access only to a sample of the temporal neighbourhood (for computational reasons) and may therefore lack information critical for the task at hand.

---

*The authors are grateful to Giorgos Bouritsas, Ben Chamberlain, and Federico Monti for proof-reading this post, and to Stephan Günnemann for pointing to earlier works on dynamic knowledge graphs.*