{"id":22865,"date":"2021-04-26T15:15:59","date_gmt":"2021-04-26T15:15:59","guid":{"rendered":"https:\/\/www.experfy.com\/blog\/the-operationalized-data-library-using-your-data-library-to-create-value-quickly-and-efficiently\/"},"modified":"2023-08-23T13:17:00","modified_gmt":"2023-08-23T13:17:00","slug":"the-operationalized-data-library-using-your-data-library-to-create-value-quickly-and-efficiently","status":"publish","type":"post","link":"https:\/\/www.experfy.com\/blog\/bigdata-cloud\/the-operationalized-data-library-using-your-data-library-to-create-value-quickly-and-efficiently\/","title":{"rendered":"The \u201cOperationalized\u201d Data Library- Using Your Data Library To Create Value Quickly And Efficiently"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"22865\" class=\"elementor elementor-22865\" data-elementor-post-type=\"post\">\n\t\t\t\t\t\t<section class=\"has_eae_slider elementor-section elementor-top-section elementor-element elementor-element-4d56865 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"4d56865\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"has_eae_slider elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-5092432\" data-id=\"5092432\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-f48d56d elementor-widget elementor-widget-text-editor\" data-id=\"f48d56d\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>In previous articles in this series on the usage of a data library I dove into the first two of the four characteristics of a data library. 
This article will explain how the last two characteristics come together in the \u201coperationalization\u201d of your data library.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-258dbca elementor-widget elementor-widget-heading\" data-id=\"258dbca\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">What is a data library?<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-47a8717 elementor-widget elementor-widget-text-editor\" data-id=\"47a8717\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>* A set of principles for data management, not a technology stack.\u00a0<\/p>\n<p>* An informal, loosely but adequately connected data architecture consisting of data ponds, analytics datasets, and reporting datasets.<\/p>\n<p>* A balance of speed of development, agility, usability, and cost.<\/p>\n<p>* Prioritizes inclusion of data based on potential business value, difficulty, and data privacy concerns for a particular data source.<\/p>\n<p>At a previous company I reported to a former engineer-turned-data-analytics-leader who often urged us to not only create, but \u201coperationalize\u201d our data products. After being confused for some time I asked him to explain what he meant by bringing that engineering term into the data context. \u201cOperationalizing\u201d means making your report, analysis, dashboard, model, etc. into a mature product. Like any external product it should have a defined purpose and audience and be launched into the \u201cmarket\u201d in a state that is ready to bring value to a customer. 
A data library is a great foundation to ensure that the work your team does can be effectively and efficiently operationalized.\u00a0<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-34fa0e9 elementor-widget elementor-widget-heading\" data-id=\"34fa0e9\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Data Libraries balance speed of development, agility, usability, and cost<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-543c042 elementor-widget elementor-widget-heading\" data-id=\"543c042\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\">Data Libraries support fast development<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-f9f8745 elementor-widget elementor-widget-text-editor\" data-id=\"f9f8745\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Data wrangling, the collection, cleaning and preparation of data for analysis, <a href=\"https:\/\/www.forbes.com\/sites\/gilpress\/2016\/03\/23\/data-preparation-most-time-consuming-least-enjoyable-data-science-task-survey-says\/?sh=16147baf6f63\" target=\"_blank\" rel=\"noreferrer noopener\">can easily take more time than any other aspect of the data analytics workflow, as much as 80-90%.<\/a> But in a <a href=\"http:\/\/www.experfy.com\/blog\/bigdata-cloud\/a-tech-agnostic-principled-approach-to-grassroots-data-management\/\" target=\"_blank\" rel=\"noreferrer noopener\">data library the usage of data ponds<\/a>, analytics 
reservoirs, and reporting reservoirs means that your data should be ready for most new analysis, modelling, and\/or reports with little wrangling.<\/p>\n<p>The data pond will be updated automatically and stored in a location that is accessible to the needed tool for the project. Adequate documentation on the data is in place so that any analyst can take advantage of the data ponds that have been built by others on the team. The data library principles provide the foundation for faster development time.\u00a0<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-970a3b3 elementor-widget elementor-widget-heading\" data-id=\"970a3b3\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\">Data libraries are agile and accommodate many use cases<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-cfb79bd elementor-widget elementor-widget-text-editor\" data-id=\"cfb79bd\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>When a new project demands adding a column to a table this can expand the project scope to include editing a database and the data collection process. <a href=\"http:\/\/www.experfy.com\/blog\/bigdata-cloud\/a-tech-agnostic-principled-approach-to-grassroots-data-management\/\" target=\"_blank\" rel=\"noreferrer noopener\">Data ponds with my preferred long table structure can easily accommodate new fields without needing to add a column to a table. <\/a><\/p>\n<p>By using data ponds you should have the data you need for a project most of the time anyway. 
The data pond construction process requires long-term planning about data that might be needed for a variety of use cases rather than focusing merely on the data needed for a particular project.<\/p>\n<p>As I described <a href=\"http:\/\/www.experfy.com\/blog\/bigdata-cloud\/a-tech-agnostic-principled-approach-to-grassroots-data-management\/\" target=\"_blank\" rel=\"noreferrer noopener\">previously<\/a> data libraries do not require a particular tech stack and can utilize what your company\/team already has available to it with little to no additional investment in BI tools or servers. Using Power BI instead of Excel is preferable, as is storing data in a database rather than flat files; but these investments are not absolutely necessary if cost is an issue. The data library principles are tech stack agnostic.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-bcd757e elementor-widget elementor-widget-heading\" data-id=\"bcd757e\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\">When building a data library, prioritize data based on business value, difficulty, and data privacy concerns<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-885f34e elementor-widget elementor-widget-text-editor\" data-id=\"885f34e\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Just as a public library\u2019s collection is not acquired at once, data libraries should be built incrementally with iterations through a standard cataloguing process for each data source. This allows you to get value from the library quickly. 
If you want to build a data library while also keeping up with regular projects, reporting, and ad hoc requests, then set a goal for the pace at which you will build the library. A simple goal is to add one to three data sources per quarter. Spending 25% of the team\u2019s time on cataloguing new data sources should be enough to make significant progress while still keeping up with its most important priorities.<\/p>\n<p>If you set a goal to add two data sources per quarter then you may need to spend anywhere from six months to multiple years to build out the initial library. Given those time constraints it is paramount that you only <a href=\"http:\/\/www.experfy.com\/blog\/bigdata-cloud\/introduction-to-data-libraries-for-small-data-science-teams-2\/\" target=\"_blank\" rel=\"noreferrer noopener\">catalogue data sources that have significant business value<\/a> and that you rank them. I rank data sources according to three elements:<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-c2f1bfe elementor-widget elementor-widget-heading\" data-id=\"c2f1bfe\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\">Prioritize data sources according to these elements<\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-f9882ed elementor-widget elementor-widget-text-editor\" data-id=\"f9882ed\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<ol>\n<li><strong>Business Value<\/strong><\/li>\n<li><strong>Difficulty<\/strong><\/li>\n<li><strong>Privacy concerns<\/strong><\/li>\n<\/ol>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-eb3b999 
elementor-widget elementor-widget-heading\" data-id=\"eb3b999\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\">Business Value<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-3a141a3 elementor-widget elementor-widget-text-editor\" data-id=\"3a141a3\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>For many problems business value is difficult to quantify. In my experience the value of data products has usually been to better understand the business and support future decision-making. Straightforward, direct cost-savings or revenue increases may be more common for certain types of analytics teams and industries but just getting better at <a href=\"https:\/\/www.forbes.com\/sites\/blakemorgan\/2019\/02\/21\/descriptive-analytics-prescriptive-analytics-and-predictive-analytics-for-customer-experience\/?sh=11c347dc69e0\" target=\"_blank\" rel=\"noreferrer noopener\">descriptive analytics, the most common type<\/a>, is a worthy goal for most.<\/p>\n<p>While quantifying the value is difficult it is possible, especially in relative terms when comparing data sources. A consistent scoring rubric should be used with questions that reflect the value that your company receives from analytics. This may include things like saving a busy person time, increasing the frequency at which reports can be updated, and making new analyses possible that were previously too difficult or time-consuming to pursue. 
In my next article I will share the rubric that I use at TechSmith to evaluate and compare the business value of more than 25 data sources.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-71fe08d elementor-widget elementor-widget-heading\" data-id=\"71fe08d\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\">Difficulty<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-891cecd elementor-widget elementor-widget-text-editor\" data-id=\"891cecd\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Difficulty is even more subjective than business value because it depends on the skills of, and the technology available to, an analytics team. What is easy for one team may be very difficult for another. Similarly, one data source may be very easy to collect, store, document, and clean but nearly impossible to automate. As with business value, these measures should be developed for your company and each data source scored against them. 
The next article will show what we use at TechSmith.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-21f57f1 elementor-widget elementor-widget-heading\" data-id=\"21f57f1\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\">Privacy concerns<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-3234747 elementor-widget elementor-widget-text-editor\" data-id=\"3234747\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Data privacy is often overlooked or intentionally disregarded. <a href=\"https:\/\/www.osano.com\/articles\/data-privacy-laws\" target=\"_blank\" rel=\"noreferrer noopener\">Until recently there existed little regulatory incentive<\/a>, though the ethics of data privacy should be considered regardless of where your company operates. An analytics team that works with personal information should have standards regulating things like how long that data is retained, who has access, and how it is secured.\u00a0<\/p>\n<p>It is not necessary to score data sources based on privacy concerns but the concerns should be identified and a \u201ct-shirt size\u201d estimate (small, medium, or large) given. 
This will help to identify the investments in data privacy policy and\/or technology that are needed to build a data library ethically (and comply with applicable regulations).<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-28bd8d2 elementor-widget elementor-widget-heading\" data-id=\"28bd8d2\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h3 class=\"elementor-heading-title elementor-size-default\">Visualize your prioritization in a matrix<\/h3>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t<div class=\"elementor-element elementor-element-d6191e9 elementor-widget elementor-widget-text-editor\" data-id=\"d6191e9\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>I recommend visualizing the results of this prioritization work in a matrix that shows where data sources land on the dimensions of value and difficulty, as well as the level of privacy concerns. In the next article I will share the design of the visual I use as well as the R code to create it.<\/p>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>In previous articles in this series on the usage of a data library I dove into the first two of the four characteristics of a data library. This article will explain how the last two characteristics come together in the \u201coperationalization\u201d of your data library. What is a data library? 
* A set of principles<\/p>\n","protected":false},"author":1135,"featured_media":23739,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"content-type":"","footnotes":""},"categories":[187],"tags":[988,985,977],"ppma_author":[3185],"class_list":["post-22865","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-bigdata-cloud","tag-analytics-datasets","tag-data-library","tag-data-management"],"authors":[{"term_id":3185,"user_id":1135,"is_guest":0,"slug":"chris-umphlett","display_name":"Chris Umphlett","avatar_url":"https:\/\/www.experfy.com\/blog\/wp-content\/uploads\/2021\/05\/Chris-Umphlett-150x150.jpg","user_url":"","last_name":"Umphlett","first_name":"Chris","job_title":"","description":"Chris Umphlett is the Manager of Data Analysis and Data Privacy at TechSmith, the makers of great software like Snagit and Camtasia. Before that he worked on analytics teams in the consumer packaged goods, life insurance, and utility industries. 
He lives in East Lansing, Michigan with his wife and young children."}],"_links":{"self":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/22865","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/users\/1135"}],"replies":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/comments?post=22865"}],"version-history":[{"count":15,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/22865\/revisions"}],"predecessor-version":[{"id":31247,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/posts\/22865\/revisions\/31247"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media\/23739"}],"wp:attachment":[{"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/media?parent=22865"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/categories?post=22865"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/tags?post=22865"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.experfy.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=22865"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}