{"id":1696,"date":"2019-05-14T03:26:36","date_gmt":"2019-05-14T03:26:36","guid":{"rendered":"http:\/\/kusuaks7\/?p=1301"},"modified":"2023-07-07T13:59:50","modified_gmt":"2023-07-07T13:59:50","slug":"by-jupyter-is-this-the-future-of-open-science","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/bigdata-cloud\/by-jupyter-is-this-the-future-of-open-science\/","title":{"rendered":"By Jupyter-Is This the Future of Open Science?"},"content":{"rendered":"<h3 style=\"color: #aaa; font-style: italic;\"><strong><em>Taking the scientific paper to the next level.<\/em><\/strong><\/h3>\n<p>In a recent article, I explained why open source is <a href=\"https:\/\/www.experfy.com\/blog\/open-science-means-open-source-or-least-it-should\">a vital part of open science<\/a>.\u00a0As I pointed out, alongside a massive failure on the part of funding bodies to make open source a key aspect of their strategies, there&#8217;s also a similar lack of open-source engagement with the needs and challenges of open science. There&#8217;s not much that the Free Software world can do to change the priorities of funders. But, a lot can be done on the other side of things by writing good open-source code that supports and enhances open science.<\/p>\n<p>People working in science potentially can benefit from every piece of free software code\u2014the operating systems and apps, and the tools and libraries\u2014so the better those become, the more useful they are for scientists. But there&#8217;s one open-source project in particular that already has had a significant impact on how scientists work\u2014<a href=\"https:\/\/github.com\/jupyter\/design\/wiki\/Jupyter-Logo\" rel=\"noopener\">Project Jupyter<\/a>:<\/p>\n<blockquote><p>Project Jupyter is a set of open-source software projects that form the building blocks for interactive and exploratory computing that is reproducible and multi-language. 
The main application offered by Jupyter is the Jupyter Notebook, a web-based interactive computing platform that allows users to author documents that combine live code, equations, narrative text, interactive dashboard and other rich media.<\/p><\/blockquote>\n<p><a href=\"https:\/\/jupyter.org\/index.html\" rel=\"noopener\">Project Jupyter<\/a>\u00a0was spun off from\u00a0<a href=\"https:\/\/ipython.org\/\" rel=\"noopener\">IPython<\/a>\u00a0<a href=\"https:\/\/speakerdeck.com\/fperez\/project-jupyter\" rel=\"noopener\">in 2014 by Fernando P\u00e9rez<\/a>. Although it began as an environment for programming Python, its ambitions have grown considerably. Today, dozens of Jupyter kernels exist that allow other languages to be used. Indeed,\u00a0<a href=\"https:\/\/jupyter.org\/about\" rel=\"noopener\">the project itself speaks<\/a>\u00a0of supporting &#8220;interactive data science and scientific computing across all programming languages&#8221;. As well as this broad-based support for programming languages, Jupyter is noteworthy for its power. It enables users to create and share documents that contain live code, equations, visualizations and narrative text. Uses include data cleaning and transformation, numerical simulation, statistical modeling, data visualization and machine learning.<\/p>\n<p>In a way, Project Jupyter is the ultimate scientific tool, since it can be used in any discipline and for multiple purposes. As an article in the\u00a0<em>Atlantic<\/em>\u00a0rightly put it, it also can be thought of as\u00a0<a href=\"https:\/\/www.theatlantic.com\/science\/archive\/2018\/04\/the-scientific-paper-is-obsolete\/556676\/\" rel=\"noopener\">the scientific paper taken to the next level<\/a>\u00a0by exploiting the possibilities of digital technology. A key aspect is that it&#8217;s interactive\u2014readers can use the embedded code to explore the data and carry out limitless &#8220;what ifs&#8221;. 
It&#8217;s such an obvious idea, you may wonder why it hasn&#8217;t been done before. And the answer is that it has, notably in the form of\u00a0<a href=\"https:\/\/www.wolfram.com\/mathematica\/\" rel=\"noopener\">Mathematica from Wolfram Research<\/a>.<\/p>\n<p>Mathematica is an innovative and powerful program with one huge flaw: it&#8217;s proprietary. As such, it suffers all the usual downsides, one of which more or less disqualifies it for science: you can&#8217;t check the code. That means you don&#8217;t really know why it produces the results it does; you just have to take it on trust. That&#8217;s not science; that&#8217;s voodoo.<\/p>\n<p>Its closed-source nature means that Mathematica can&#8217;t tap into the community of users in the same way open-source projects can. Whatever advantages Mathematica once had, it&#8217;s only a matter of time before open-source alternatives like Jupyter surpass it. Indeed, it&#8217;s interesting that the Google Trends comparison of searches for Mathematica and searches for Jupyter shows that interest in the latter is rising, while Google searches for the former are falling. It&#8217;s an inexact metric, of course, but\u00a0<a href=\"https:\/\/trends.google.com\/trends\/explore?date=today%205-y&amp;q=mathematica,jupyter\" rel=\"noopener\">the overall trends are clear<\/a>: Mathematica, like Microsoft Windows, is the past, and Jupyter, like GNU\/Linux, is the future.<\/p>\n<p>It&#8217;s not just about the familiar dynamics of open-source development. There&#8217;s a key reason why Jupyter has beaten Mathematica, as the academic\u00a0<a href=\"https:\/\/paulromer.net\/jupyter-mathematica-and-the-future-of-the-research-paper\" rel=\"noopener\">Paul Romer explained in a perceptive post<\/a>:<\/p>\n<blockquote><p>Mathematica failed, despite technical accomplishments, because the norms of its developers clashed so obviously with the norms of its intended users. 
Jupyter is succeeding because the norms of the community that is developing it are aligned with the norms of its users.<\/p><\/blockquote>\n<p>As well as its culture, there&#8217;s another aspect of Jupyter that makes it a perfect fit for open science. On the page listing dozens of Jupyter notebooks\u2014all freely accessible\u2014there&#8217;s a section titled &#8220;<a href=\"https:\/\/github.com\/jupyter\/jupyter\/wiki\/A-gallery-of-interesting-Jupyter-Notebooks#reproducible-academic-publications\" rel=\"noopener\">Reproducible academic publications<\/a>&#8221;:<\/p>\n<blockquote><p>This section contains academic papers that have been published in the peer-reviewed literature or pre-print sites such as the ArXiv that include one or more notebooks that enable (even if only partially) readers to reproduce the results of the publication.<\/p><\/blockquote>\n<p>Coupled with the transparency of the underlying code, this ability for anybody to check the logic and results of a publication is a real breakthrough in open science. At the moment, most academic papers can be read only superficially. In theory, anyone could set about reproducing the final conclusions\u2014at least, provided the relevant datasets are freely available. Few will take the trouble to do so, though, because there are no academic incentives to expend all that time and energy. With papers published not as static documents, but as dynamic Jupyter notebooks with full open datasets, it is possible to check the results properly, as well as to plug in other datasets or tweak the underlying assumptions. In this way, Jupyter notebooks are the perfect marriage of open source, open access and open data. This is exactly how open science should work, but until now it almost never has.<\/p>\n<p>The power and flexibility of the Jupyter environment make it a strong foundation for open-ended experimentation of the kind the Free Software community relishes. 
Moreover, coding in this domain could have a major impact on scientists using the notebook format and on the science they produce. That combination of satisfying intellectual challenge with real-world practical benefits makes it a perfect candidate for open-source coders looking for new and meaningful challenges.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>People working in science potentially can benefit from every piece of free software code&mdash;the operating systems and apps, and the tools and libraries&mdash;so the better those become, the more useful they are for scientists. But there&#8217;s one open-source project in particular that already has had a significant impact on how scientists work&mdash;Project Jupyter. Project Jupyter is a set of open-source software projects that form the building blocks for interactive and exploratory computing that is reproducible and multi-language.<\/p>\n","protected":false},"author":542,"featured_media":2724,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"content-type":"","footnotes":""},"categories":[187],"tags":[94],"ppma_author":[3202],"class_list":["post-1696","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-bigdata-cloud","tag-data-science"],"authors":[{"term_id":3202,"user_id":542,"is_guest":0,"slug":"glyn-moody","display_name":"Glyn Moody","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/?s=96&d=mm&r=g","user_url":"","last_name":"Moody","first_name":"Glyn","job_title":"","description":"Glyn Moody&nbsp;is a Freelance Journalist, Author, and Speaker. His book, &quot;Rebel Code,&quot; is the first and only detailed history of the rise of open source, while his subsequent work, &quot;The Digital Code of Life,&quot; explores bioinformatics - the intersection of computing with genomics. 
He is a contributor to Ars Technica UK, Techdirt, The Guardian, The Daily Telegraph, The Financial Times, The Economist, Wired, New Scientist, and numerous computing titles. He has written over 1500 articles for Techdirt, and over 400 for Ars Technica UK, and 427 columns in the Computer Weekly.\n\n&nbsp;"}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1696","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/542"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=1696"}],"version-history":[{"count":3,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1696\/revisions"}],"predecessor-version":[{"id":29067,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/1696\/revisions\/29067"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/2724"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=1696"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/categories?post=1696"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=1696"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=1696"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}