{"id":2359,"date":"2020-04-06T03:48:50","date_gmt":"2020-04-06T03:48:50","guid":{"rendered":"http:\/\/kusuaks7\/?p=1964"},"modified":"2023-12-20T14:28:59","modified_gmt":"2023-12-20T14:28:59","slug":"practical-spark-tips-for-data-scientists","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/bigdata-cloud\/practical-spark-tips-for-data-scientists\/","title":{"rendered":"Practical Spark Tips for Data Scientists"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"2359\" class=\"elementor elementor-2359\" data-elementor-post-type=\"post\">\n\t\t\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-6018a5bc elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"6018a5bc\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-6f640bd8\" data-id=\"6f640bd8\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-30bd5bbc elementor-widget elementor-widget-text-editor\" data-id=\"30bd5bbc\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<strong><em>I know \u2014 Spark is sometimes frustrating to work with.<\/em><\/strong>\n\n<strong><em>Although sometimes we can manage our big data using tools like\u00a0<a href=\"https:\/\/towardsdatascience.com\/minimal-pandas-subset-for-data-scientist-on-gpu-d9a6c7759c7f?source=---------5------------------\" target=\"_blank\" rel=\"nofollow noopener noreferrer\" class=\"broken_link\">Rapids<\/a>\u00a0or\u00a0<a href=\"https:\/\/towardsdatascience.com\/add-this-single-word-to-make-your-pandas-apply-faster-90ee2fffe9e8?source=---------11------------------\" target=\"_blank\" rel=\"nofollow noopener noreferrer\" class=\"broken_link\">Parallelization<\/a>,<\/em><\/strong>\u00a0there is no way around using Spark if you are working with Terabytes of data.\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-92ebf4b elementor-widget elementor-widget-text-editor\" data-id=\"92ebf4b\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\tWhy? 
<p>Why? Because Spark throws memory errors all the time, and it is only when you work on genuinely big datasets that you learn how to truly work with Spark.</p>

<p><strong>This post is about practical Spark and memory-management tips for data scientists.</strong></p>

<h2 id="1-map-side-joins">1. Map Side Joins</h2>

<img src="https://mlwhiz.com/images/practicalspark/0.png" alt="Joining Dataframes" />

<p>The syntax of joins in Spark is pretty similar to pandas:</p>

<pre><code>df3 = df1.join(df2, df1.column == df2.column, how='left')
</code></pre>

<p>But I faced a problem. df1 had around 1 billion rows, while df2 had around 100 rows. When I tried the join above, it failed with memory-exhausted errors after running for 20 minutes.</p>

<p>I was writing this code on a pretty big cluster with more than 400 executors, each with more than 4 GB of RAM. I was stumped: I tried to repartition my data frames using multiple schemes, but nothing seemed to work.</p>
<p>So what should I do? Is Spark not able to handle a mere billion rows? Not really. I just needed to use a map-side join, called a broadcast join in Spark terminology.</p>

<pre><code>from pyspark.sql.functions import broadcast

df3 = df1.join(broadcast(df2), df1.column == df2.column, how='left')
</code></pre>

<p>Using the simple broadcast call above, I was able to send the smaller df2 to all the nodes without taking much time or memory. What happens in the backend is that a copy of df2 is sent to every partition, and each partition uses that copy to do the join. That means there is no data movement at all for df1, which is far bigger than df2.</p>
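<p>A quick way to confirm that the broadcast actually happened is to inspect the physical plan. Below is a minimal, hypothetical sketch: the toy DataFrames and the join key <code>key</code> are stand-ins for df1 and df2, and it assumes a running SparkSession named <code>spark</code>.</p>

<pre><code>from pyspark.sql.functions import broadcast

# Hypothetical stand-ins for the big table (df1) and the small lookup table (df2).
big = spark.range(0, 1_000_000).withColumnRenamed("id", "key")
small = spark.createDataFrame([(i, f"cat_{i}") for i in range(100)], ["key", "category"])

joined = big.join(broadcast(small), on="key", how="left")
joined.explain()  # the plan should show a BroadcastHashJoin instead of a SortMergeJoin
</code></pre>

<p>Spark can also broadcast small tables on its own when they fall under spark.sql.autoBroadcastJoinThreshold (10 MB by default), but an explicit broadcast() makes the intent clear.</p>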
<h2 id="2-spark-cluster-configurations">2. Spark Cluster Configurations</h2>

<img src="https://mlwhiz.com/images/practicalspark/1.png" alt="" />

<p style="text-align: center;"><em>Set the parallelism and worker nodes based on your task size</em></p>

<p>What also made my life difficult when I started working with Spark was the way a Spark cluster needs to be configured. Your Spark cluster might need a lot of custom configuration and tuning based on the job you want to run. Some of the most important configurations and options are as follows:</p>

<h3 id="a-spark-sql-shuffle-partitions-and-spark-default-parallelism">a. spark.sql.shuffle.partitions and spark.default.parallelism</h3>

<p>spark.sql.shuffle.partitions configures the number of partitions used when shuffling data for joins or aggregations. spark.default.parallelism is the default number of partitions in RDDs returned by transformations like join, reduceByKey, and parallelize when it is not set by the user. spark.sql.shuffle.partitions defaults to 200, while spark.default.parallelism depends on the cluster manager and is typically the total number of executor cores.</p>

<p><strong><em>In simple words, these set the degree of parallelism you want to have in your cluster.</em></strong></p>

<p>If you don’t have a lot of data, the default of 200 is fine, but for huge data you might want to increase these numbers. It also depends on the number of executors you have. My cluster was pretty big, with 400 executors, so I kept these between 800 and 1200. A rule of thumb is to keep them a multiple of the number of executors, so that each executor ends up with multiple tasks.</p>

<pre><code>sqlContext.setConf("spark.sql.shuffle.partitions", 800)
sqlContext.setConf("spark.default.parallelism", 800)
</code></pre>
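<p>On Spark 2.x and later, the same settings can also be applied through the SparkSession API. Here is a minimal sketch; the app name is hypothetical, and it assumes you create the session yourself rather than getting one from a notebook:</p>

<pre><code>from pyspark.sql import SparkSession

# spark.default.parallelism is read when the SparkContext starts,
# so pass it while building the session (or via spark-submit --conf).
spark = (SparkSession.builder
         .appName("practical-spark-tips")   # hypothetical app name
         .config("spark.default.parallelism", "800")
         .getOrCreate())

# spark.sql.shuffle.partitions can be changed at runtime, even per job.
spark.conf.set("spark.sql.shuffle.partitions", "800")
</code></pre>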
<h3 id="b-spark-sql-parquet-binaryasstring">b. spark.sql.parquet.binaryAsString</h3>

<p>I was working with .parquet files in Spark, and most of my data columns were strings. But whenever I loaded the data into Spark, the string columns showed up in a binary format on which I could not use any string-manipulation functions. The way I solved this was:</p>

<pre><code>sqlContext.setConf("spark.sql.parquet.binaryAsString", "true")
</code></pre>

<p>This configuration converts the binary format to strings while loading parquet files, and it is now a default configuration I set whenever I work with Spark.</p>
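<p>A quick way to see the effect is to check the schema after flipping the flag and then run a string function. A minimal sketch follows, where the parquet path and the country column are hypothetical placeholders and a SparkSession named <code>spark</code> is assumed:</p>

<pre><code>from pyspark.sql import functions as F

spark.conf.set("spark.sql.parquet.binaryAsString", "true")

df = spark.read.parquet("hdfs:///data/events/")  # hypothetical location
df.printSchema()  # string columns should now show as `string`, not `binary`

# Normal string functions work as expected again:
df.select(F.upper(F.col("country")).alias("country_upper")).show(5)
</code></pre>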
<h3 id="c-yarn-configurations">c. YARN Configurations</h3>

<p>There are other configurations that define your cluster and that you might need to tune. These have to be set when the cluster starts up and are not as dynamic as the ones above. The two I want to put down here manage the memory overhead on the executor nodes, which helps when individual executors end up doing a lot of work:</p>

<ul>
	<li>spark.yarn.executor.memoryOverhead: 8192</li>
	<li>yarn.nodemanager.vmem-check-enabled: False</li>
</ul>

<p>There are many more configurations you might want to tune while setting up your Spark cluster. You can take a look at them in the <a href="https://spark.apache.org/docs/latest/configuration.html" target="_blank" rel="nofollow noopener noreferrer">official docs</a>.</p>

<h2 id="3-repartitioning">3. Repartitioning</h2>

<img src="https://mlwhiz.com/images/practicalspark/2.png" alt="" />

<p style="text-align: center;"><em>Keeping the workers happy by having them handle an equal amount of data</em></p>

<p>You might want to repartition your data if you feel it has become skewed after a series of transformations and joins. The simplest way to do it is:</p>

<pre><code>df = df.repartition(1000)
</code></pre>

<p>Sometimes you might also want to repartition by a known scheme, because that scheme will be used by a join or aggregation later on. You can repartition on multiple columns using:</p>

<pre><code>df = df.repartition('cola', 'colb', 'colc', 'cold')
</code></pre>

<p>You can get the number of partitions in a data frame using:</p>

<pre><code>df.rdd.getNumPartitions()
</code></pre>

<p>You can also check the distribution of records across partitions using the glom function on the underlying RDD. This helps in understanding the skew that creeps into your data while working with various transformations.</p>

<pre><code>df.rdd.glom().map(len).collect()
</code></pre>
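<p>To make this concrete, here is a small self-contained sketch that builds a toy DataFrame, repartitions it by a column, and inspects how records are spread across partitions; all names and sizes are made up for illustration:</p>

<pre><code>from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("repartition-demo").getOrCreate()

df = spark.range(0, 100_000).withColumn("cola", (F.col("id") % 4).cast("string"))

df = df.repartition(8, "cola")      # hash-partition on `cola` into 8 partitions
print(df.rdd.getNumPartitions())    # -> 8

# Records per partition; with only 4 distinct keys, several of the 8 partitions
# stay empty, which is exactly the kind of imbalance glom() helps you spot.
print(df.rdd.glom().map(len).collect())
</code></pre>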
<hr />

<h2 id="conclusion">Conclusion</h2>

<p>There are things we don’t know we don’t know. These are the unknown unknowns. It is only through multiple code failures and plenty of reading on Stack Overflow that we learn what we actually need.</p>

<p>Here I have tried to summarize a few of the memory and configuration problems I faced while working with Spark, and how to solve them. There are many other configuration options in Spark that I have not covered, but I hope this post gave you some clarity on how to set them and use them.</p>