{"id":1409,"date":"2019-02-15T10:32:08","date_gmt":"2019-02-15T10:32:08","guid":{"rendered":"http:\/\/kusuaks7\/?p=1014"},"modified":"2023-07-14T11:51:12","modified_gmt":"2023-07-14T11:51:12","slug":"next-and-prior-pointing-in-data-models","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/bigdata-cloud\/next-and-prior-pointing-in-data-models\/","title":{"rendered":"Next and Prior: Pointing in Data Models"},"content":{"rendered":"<p><strong><em>Ready to learn Data Science? Browse\u00a0<a href=\"https:\/\/www.experfy.com\/training\/tracks\/data-science-training-certification\">Data Science Training and Certification<\/a> courses developed by industry thought leaders and Experfy in Harvard Innovation Lab.<\/em><\/strong><\/p>\n<p>Pointers\u00a0have been in and out of data models. From the advent of the rotating disk drive in the 60s and until around 1990, pointers were all over the place (together with \u201chierarchies\u201d, which were early versions of aggregates of co-located data). But relational and SQL made them go away, only to reappear around year 2000 as parts of Graph Databases.<\/p>\n<p>Although summer is over, this post is easy going and comes around. 
Join me on a hopefully fascinating journey through the history of pointers in data models.<\/p>\n<h4><strong>Pointing at Things<\/strong><\/h4>\n<p>Pointing is a very basic gesture for humans.<\/p>\n<p style=\"text-align: center;\"><a href=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic1a.png?x38402\" rel=\"nofollow noopener\"><img fetchpriority=\"high\" decoding=\"async\" src=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic1a.png?x38402\" sizes=\"(max-width: 500px) 100vw, 500px\" srcset=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic1a.png?x38402 800w, https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic1a-300x200.png?x38402 300w, https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic1a-768x512.png?x38402 768w\" alt=\"experfy-blog\" width=\"500\" height=\"333\" \/><\/a><\/p>\n<p style=\"text-align: center;\"><em>by\u00a0<a href=\"https:\/\/commons.wikimedia.org\/wiki\/File%3ACute_young_afro_american_boy_child.jpg\" target=\"_blank\" rel=\"noopener noreferrer nofollow\">Hillebrand Steve<\/a>, U.S. Fish and Wildlife Service [Public domain], via Wikimedia Commons.<\/em><\/p>\n<p>Nobody is in doubt about what the intention is, regardless of culture, language and circumstance. Pointers are one of the strongest visualization tools in our possession. That is why I am so much in favor of directed graphs, as in e.g. concept maps and property graphs.<\/p>\n<p>Now I am pointing at pointers \u2013 over there, please!<\/p>\n<h4><strong>Episode 1: Pointer DBMS\u2019s<\/strong><\/h4>\n<p>One of the very first data stores was from IBM (in 1960). Its predecessor, which ran on magnetic tape, was called BOMP. It stands for Bill of Materials Processor. 
It was developed specifically for some large manufacturers, who needed to be able to do what later became material requirements planning.<\/p>\n<p>Once BOMP was adapted for the new, revolutionary disk storage technology, it changed its name to DBOMP, this time meaning \u201cDatabase Organization and Maintenance Processor\u201d! This happened in 1960. It was generalized somewhat and it does qualify as one of the first disk-based data stores.<\/p>\n<p>It was based on the concept of pointers, which was the new and very exciting possibility offered by rotating disks. Some common sense was also added \u2013 the developers soon moved away from raw disk addresses to schemes with blocks of data and index numbers for rows within the block. Plus some free space administration. The blocks (and with them the data) could then be moved around on the disk. (In the beginning there was typically only one disk).<\/p>\n<p>Here is a greatly simplified bill of materials structure (tree) of a bicycle:<\/p>\n<p style=\"text-align: center;\"><a href=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic2.png?x38402\" rel=\"nofollow noopener\"><img decoding=\"async\" style=\"width: 700px; height: 410px;\" src=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic2.png?x38402\" sizes=\"(max-width: 711px) 100vw, 711px\" srcset=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic2.png?x38402 711w, https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic2-300x176.png?x38402 300w\" alt=\"experfy-blog\" \/><\/a><\/p>\n<p>Now, when you read and manipulate bills of materials, you need to:<\/p>\n<ul>\n<li>Go from the item master for a product (or intermediate) to the items used in that product, commonly seen as the product structure tree<\/li>\n<li>Go to the where used item (the owner)<\/li>\n<li>Go to the next in the structure<\/li>\n<li>Go to the next in the \u201cwhere used\u201d set of records<\/li>\n<li>Sometimes you need to go to the prior, 
also in both of the sets (structure and where used)<\/li>\n<li>Sometimes you need to position yourself by way of a PART-NUMBER lookup and then continue downwards (structure) or upwards (where used) from there.<\/li>\n<\/ul>\n<p>For all of that you need a lot of relationships, which at that time translated into pointers.<\/p>\n<p>Here is a simplified view of the basic DBOMP\u00a0pointer structures:<\/p>\n<p style=\"text-align: center;\"><a href=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic3.png?x38402\" rel=\"nofollow noopener\"><img decoding=\"async\" src=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic3-1008x1024.png?x38402\" sizes=\"(max-width: 565px) 100vw, 565px\" srcset=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic3-1008x1024.png?x38402 1008w, https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic3-295x300.png?x38402 295w, https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic3-768x780.png?x38402 768w, https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic3.png?x38402 1288w\" alt=\"experfy-blog\" width=\"565\" height=\"574\" \/><\/a><\/p>\n<p>I have only shown a few of the actual relationships that the pointers\u00a0materialize.<\/p>\n<p>The pointer-based disk addressing found its way into later DBMS products, including IDMS\u00a0and Total. Pointers at that time tended to be rather physical, which meant that you risked serious data corruption if, for example, somebody pulled the power plug of the disk drive.<\/p>\n<p>However, the pointer still lives on (in a non-physical, key\/value manner) today in the many graph DBMS\u2019s.<\/p>\n<h4><strong>Episode 2: The Programmer as Navigator<\/strong><\/h4>\n<p>The 1973 Turing Award lecture (Communications of the ACM, Vol. 16, No 11, November 1973) was given by Charles M. 
Bachman, the inventor of the first database system (1964), the Integrated Data Store, I-D-S, running on General Electric (later Honeywell) mainframes. He described the transition from sequential processing to new opportunities coming from the introduction of random-access disk storage devices. The title of his talk was \u201cThe Programmer as Navigator\u201d. Primary Data Keys were the principal identifiers of data records, but Secondary Data Keys were also employed.<\/p>\n<p style=\"text-align: center;\"><a href=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic4.png?x38402\" rel=\"nofollow noopener\"><img decoding=\"async\" src=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic4.png?x38402\" alt=\"experfy-blog\" width=\"150\" height=\"168\" \/><\/a><\/p>\n<p style=\"text-align: center;\"><em><a href=\"http:\/\/www.sis.pitt.edu\/mbsclass\/hall_of_fame\/bachman.html\" target=\"_blank\" rel=\"noopener noreferrer\">Charles Bachman<\/a><\/em><\/p>\n<p>Besides records and keys, Charles Bachman chose the concept of the set (for relationships). Not in the strict mathematical sense, but simply on the logical level, that of having a database with a Department-Employee set which consists of sets of Employees working in the Departments.<\/p>\n<p>The physical model was based on \u201cDatabase keys\u201d, which were close to physical pointers\u00a0(using one level of mapping in a page\/index structure) to the target record, or the originating record, for that matter. Relationships (\u201csets\u201d) would be implemented with forward (NEXT) as well as backward (PRIOR) and upwards (OWNER) pointers using the aforementioned database keys.<\/p>\n<p>Hence the notion of the programmer as navigator: navigating sets either forwards or backwards, or climbing hierarchical relationships. 
The paradigm was called the \u201cnetwork database\u201d at that time.<\/p>\n<p>Charlie Bachman actually invented the first kind of entity-relationship modeling. He called it a Data Structure Diagram. A version of it was\/is used with the IDMS database system, one of the handful of CODASYL network database systems that appeared around 1970.<\/p>\n<p>The \u201cnetwork\u201d database paradigms explained above were standardized by a committee called CODASYL (Conference on Data Systems Languages) in 1966.<\/p>\n<p>The major drawback of IDMS from a practical point of view was broken chains after a power failure. That meant manually repairing pointer chains\u2026sometimes in the middle of the night. I know \u2013 I was in IDMS tech support and I got those phone calls!<\/p>\n<h4><strong>Episode 3: Chen, Entities, Attributes, and Relationships<\/strong><\/h4>\n<p>The network databases (like IDMS and Total) fit nicely with the emerging Entity-Relationship\u00a0data model paradigm (Peter Chen, ACM Transactions on Database Systems Volume 1 Issue 1, March 1976).<\/p>\n<p>Over the years, Chen became the father of much of entity-relationship data modeling.<\/p>\n<p>The diagram below is a Chen-style Entity-Relationship\u00a0data model of Departments and Employees based on examples found in his ACM article mentioned above. 
Note that not all attributes are included (for space-saving reasons).<\/p>\n<p style=\"text-align: center;\"><a href=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic5.png?x38402\" rel=\"nofollow noopener\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic5.png?x38402\" sizes=\"(max-width: 600px) 100vw, 600px\" srcset=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic5.png?x38402 783w, https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic5-300x233.png?x38402 300w, https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic5-768x595.png?x38402 768w\" alt=\"experfy-blog\" width=\"600\" height=\"465\" \/><\/a><\/p>\n<p>Notice that in the original Chen-style, the attributes are somewhat independent and the relationships between entities are named and carry cardinalities.<\/p>\n<p>There is no doubt that Peter Chen\u00a0wanted the diagrams to be on the top level, business facing (i.e. what was then known as conceptual models). I like the explicitness of classic Chen modeling. 
Attributes are related to their \u201cowner table\u201d in what undoubtedly is a functional dependency.<\/p>\n<p><em>Actually,\u00a0<\/em>Peter Chen\u00a0went all the way to the \u201cbinary level\u201d before constructing his Entity-Relationship paradigm from those bottom-level drawings, something like this:<\/p>\n<p style=\"text-align: center;\"><a href=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic6.png?x38402\" rel=\"nofollow noopener\"><img decoding=\"async\" style=\"width: 600px; height: 539px;\" src=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic6.png?x38402\" sizes=\"(max-width: 600px) 100vw, 600px\" srcset=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic6.png?x38402 803w, https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic6-300x269.png?x38402 300w, https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic6-768x690.png?x38402 768w\" alt=\"experfy-blog\" \/><\/a><\/p>\n<p>Chen\u2019s diagram above is, in fact, a directed graph representing a piece of a data model at the most \u201catomic level\u201d. And Chen\u2019s work really is quite similar to the graph models we will look at later. For the first time, graphs came close to data models.<\/p>\n<p>But then came SQL.<\/p>\n<h4><strong>Episode 4: SQL and Bill-of-Materials Processing<\/strong><\/h4>\n<p>Ted Codd, the inventor of the relational model (first published in 1970), spent a lot of energy arguing against having pointers\u00a0in databases. Admittedly, some of the physical implementations of pointers had some serious drawbacks (\u201cbroken chains\u201d).<\/p>\n<p><em>Nevertheless, pointers\u00a0are here to stay. Today they proliferate in corporate relational databases under the guise of \u201csurrogate keys\u201d.<\/em><\/p>\n<p>\u201cRelational\u201d slowly materialized as SQL, which first appeared in 1974 and became a standard as late as 1986. 
Ted Codd got his well-deserved Turing Award in 1981. Note that SQL was more an IBM thing than something that Ted Codd designed.<\/p>\n<p>SQL was a paradigm shift, indeed. Let us do some Bill-of-Materials processing with SQL.<\/p>\n<p>BOM handling is essentially graph traversal, which is standard functionality in Graph Databases. In SQL databases, BOM handling requires recursive SQL\u00a0and that is not for the faint-hearted, as many of you know. Here is an approximate example:<\/p>\n<p style=\"text-align: center;\"><a href=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic7.png?x38402\" rel=\"nofollow noopener\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic7-1024x380.png?x38402\" sizes=\"(max-width: 565px) 100vw, 565px\" srcset=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic7-1024x380.png?x38402 1024w, https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic7-300x111.png?x38402 300w, https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic7-768x285.png?x38402 768w, https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic7.png?x38402 1033w\" alt=\"experfy-blog\" width=\"565\" height=\"210\" \/><\/a><\/p>\n<p>In a best-practice (relational) data model, the traversal would be based on a surrogate key, probably called \u201cItemID\u201d or something similar. Surrogate keys are (in principle) keys that only carry information about relationship structure. So they are pointers, essentially.<\/p>\n<p><em>With the advent of SQL, relationships were no longer named. It is strange that this happened, since foreign key relationships are constraints, and constraints may have names in most SQL implementations. From a business semantics point of view, this is a very sad loss of information.<\/em><\/p>\n<h4><strong>Episode 5: Graph Models<\/strong><\/h4>\n<p>Graphs emerged as data models in the late 1990s. 
The development took three paths:<\/p>\n<ul>\n<li>The Semantic Web standards (RDF, OWL, SPARQL)<\/li>\n<li>Document referencing in Document Databases (like in MongoDB and several other products)<\/li>\n<li>Pure directed graph technology (like Neo4j and other products)<\/li>\n<\/ul>\n<p>Not that graphs were new concepts\u00a0even at that time. Graph theory has been a part of mathematics since 1736! The first paper was written by Leonhard Euler and addressed the now famous military problem of the Seven Bridges of K\u00f6nigsberg.<\/p>\n<p style=\"text-align: center;\"><a href=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic8.png?x38402\" rel=\"nofollow noopener\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic8.png?x38402\" sizes=\"(max-width: 302px) 100vw, 302px\" srcset=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic8.png?x38402 302w, https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic8-300x236.png?x38402 300w\" alt=\"experfy-blog\" width=\"302\" height=\"238\" \/><\/a><\/p>\n<p style=\"text-align: center;\"><em>by\u00a0<a href=\"https:\/\/commons.wikimedia.org\/wiki\/File%3AKonigsberg_bridges.png\" target=\"_blank\" rel=\"noopener noreferrer nofollow\">Bogdan Giu\u015fc\u0103<\/a>\u00a0[Public domain (PD)]<\/em><\/p>\n<p>Aiming to solve the optimization problem, Euler designed a network of four nodes and seven edges. The nodes represent the \u201cland masses\u201d at the ends of the bridges, whereas the edges (the relationships) represent the bridges. Working with that particular representation is called \u201cgraph traversal\u201d today.<\/p>\n<p>Formal graphs are now part of mathematics and are a well-researched area. In the data modeling community, however, graphs emerged considerably later.<\/p>\n<p><em><strong>Caveat:<\/strong>\u00a0the world is full of relationships, and they express vivid dynamics. 
This is the space that the graph data models explore. If you ask me, structure (relationships) is of higher importance than contents (the list of properties), if and when your challenge is to look at a somewhat complex context and learn the business semantics from it. Visuals are a great help and visualizing structure is the same as saying \u201cdraw a graph of it\u201d.<\/em><\/p>\n<p>Let us look at a business case (borrowed from\u00a0<a href=\"http:\/\/www.neo4j.com\/\" target=\"_blank\" rel=\"noopener noreferrer\">www.neo4j.com<\/a>\u00a0) and how it looks in graph:<\/p>\n<p>\u201cWhich Employee had the Highest Cross-Selling Count of \u2018Chocolate\u2019 and Which Other Product\u201d?<\/p>\n<p>The graph could look somewhat like this:<\/p>\n<p style=\"text-align: center;\"><a href=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/OrderGraph.jpg?x38402\" rel=\"nofollow noopener\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/OrderGraph.jpg?x38402\" sizes=\"(max-width: 466px) 100vw, 466px\" srcset=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/OrderGraph.jpg?x38402 466w, https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/OrderGraph-300x106.jpg?x38402 300w\" alt=\"experfy-blog\" width=\"466\" height=\"165\" \/><\/a><\/p>\n<p>Formulated in Neo4J\u2019s query language, Cypher, the traversal looks like this:<\/p>\n<p style=\"text-align: center;\"><a href=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic10.png?x38402\" rel=\"nofollow noopener\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic10-1024x187.png?x38402\" sizes=\"(max-width: 565px) 100vw, 565px\" srcset=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic10-1024x187.png?x38402 1024w, https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic10-300x55.png?x38402 300w, 
https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic10-768x140.png?x38402 768w, https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic10.png?x38402 1030w\" alt=\"experfy-blog\" width=\"565\" height=\"103\" \/><\/a><\/p>\n<p>Notice how the path goes from Product via Order to Employee to find the chocolate-selling Employees and then back via Order to Product to find those other products. \u201co2\u201d is just a label (a variable name) for what we later want to count.<\/p>\n<p>Relationships are basically connections, but they can also carry a name and some properties.<\/p>\n<p>The property graph\u00a0data model\u00a0is a simple, powerful, general-purpose data model. It is for those reasons that I recommend using it as a generic representation of any data model.<\/p>\n<h4><strong>Coming Full Circle: Tying the Pointers Together<\/strong><\/h4>\n<p>Can we use pointers in today\u2019s Graph Databases? I did a little experiment using Neo4j. And I was able to set up a little data model containing Parent and Child as well as three named relationships:<\/p>\n<p style=\"text-align: center;\"><a href=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic11.png?x38402\" rel=\"nofollow noopener\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic11.png?x38402\" sizes=\"(max-width: 230px) 100vw, 230px\" srcset=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic11.png?x38402 230w, https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic11-189x300.png?x38402 189w\" alt=\"experfy-blog\" width=\"230\" height=\"365\" \/><\/a><\/p>\n<p>I then added just a little bit of data, and got this graph:<\/p>\n<p style=\"text-align: center;\"><a href=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic12.png?x38402\" rel=\"nofollow noopener\"><img loading=\"lazy\" decoding=\"async\" 
src=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic12.png?x38402\" sizes=\"(max-width: 468px) 100vw, 468px\" srcset=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic12.png?x38402 468w, https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic12-300x242.png?x38402 300w\" alt=\"experfy-blog\" width=\"468\" height=\"377\" \/><\/a><\/p>\n<p>This now lets me read the little set of data either forwards or backwards:<\/p>\n<p style=\"text-align: center;\"><a href=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic13.png?x38402\" rel=\"nofollow noopener\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic13-1024x395.png?x38402\" sizes=\"(max-width: 565px) 100vw, 565px\" srcset=\"https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic13-1024x395.png?x38402 1024w, https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic13-300x116.png?x38402 300w, https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic13-768x296.png?x38402 768w, https:\/\/d3an9kf42ylj3p.cloudfront.net\/uploads\/2018\/09\/091018-pic13.png?x38402 1034w\" alt=\"experfy-blog\" width=\"565\" height=\"218\" \/><\/a><\/p>\n<p>But, wait, the semantics are slightly different today. In past decades, \u201cforwards\u201d and \u201cbackwards\u201d meant exactly that. The result set was delivered, one record at a time, in the order that the records were chained together. Today cursor-based delivery is not used that much; much has been packed away in \u201cpagination\u201d mechanisms in code and in platforms.<\/p>\n<h4><strong>The Business Semantics of Pointers<\/strong><\/h4>\n<p>So, are there any useful semantics of \u201cnext\u201d, \u201cprior\u201d, \u201cfirst\u201d and \u201clast\u201d, even today?<\/p>\n<p>I think so. 
First of all, these concepts smell very much of time series, time-ordered events and the like. And obviously you can come a long way with \u201cchained\u201d events as outlined above. But you can do that too using ORDER BY backed by a good index. And there are also temporal designs, as-is\/as-of, data vaults etc.<\/p>\n<p>But, if the connectedness is on the high side, graph data models are superior performers. Which means that for some classes of business contexts \u201cchains\u201d (graph-based relationships) will offer top performance based on an intuitive data model.<\/p>\n<p>First and last offer special opportunities. Some use cases (versioned data, for example) ask for complete, time-stamped version history. But 90% of all access is to the current (latest) version. A quick \u201cfirst\u201d or \u201clast\u201d (\u201cpointer\u201d) will beat the overhead of a large index responsible for sorting the events and for doing efficient \u201cmax(transaction_date)\u201d selection to get the latest transaction within a position in a portfolio, today (for example).<\/p>\n<p>There are also use cases where ORDER BY is not sufficient. Order may only be maintained by business logic and is not always ascending. Late arriving facts (like corrections of historical data) are a good example.<\/p>\n<p>And of course, \u201cnext\u201d and \u201cprior\u201d are just generalized cases of specific relationship types. Actors acting in movies may be grouped by an \u201cacted in\u201d relationship, which has richer business semantics than \u201cnext\u201d. That is the real power of the labeled property graph data model.<\/p>\n<p>Don\u2019t shy away from pointing at things! Do it as you find appropriate.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Pointers&nbsp;have been in and out of data models. 
From the advent of the rotating disk drive in the 60s and until around 1990, pointers were all over the place together with &ldquo;hierarchies&rdquo;, which were early versions of aggregates of co-located data. But relational and SQL made them go away, only to reappear around year 2000 as parts of Graph Databases. Here is the fascinating journey of the history of pointers in data models.<\/p>\n","protected":false},"author":361,"featured_media":3333,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"content-type":"","footnotes":""},"categories":[187],"tags":[94],"ppma_author":[2944],"class_list":["post-1409","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-bigdata-cloud","tag-data-science"],"authors":[{"term_id":2944,"user_id":361,"is_guest":0,"slug":"thomas-frisendal","display_name":"Thomas Frisendal","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/?s=96&d=mm&r=g","user_url":"","last_name":"Frisendal","first_name":"Thomas","job_title":"","description":"Thomas Frisendal, Owner and database consultant for TF Informatik, is an experienced database consultant with more than 30 years on the IT vendor side and as an independent consultant. 
As writer and speaker, his area of excellence lies within the art of turning data into information and knowledge.&nbsp;, He has published &quot;Graph Data Modeling for NoSQL and SQL - Visualize Structure and Meaning&quot;, &quot;Visual Design of GraphQL Data&quot;, and &quot;Design Thinking Business Analysis - Business Concept Mapping Applied&quot; books."}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1409","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/361"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=1409"}],"version-history":[{"count":2,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1409\/revisions"}],"predecessor-version":[{"id":29204,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1409\/revisions\/29204"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/3333"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=1409"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/categories?post=1409"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=1409"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=1409"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}