R is an extraordinarily powerful language with a vast community and a wealth of great resources, but where should you start when all you want to do is get your data into a usable format? How do you know when your data is ready? What pitfalls should you watch for so that you don’t perform an analysis on bad data? This course teaches you, from start to finish, how to get your data into R efficiently and polish it until it is as good as it can be. You or your team can then focus on the statistical modeling, visualization, reporting, sharing, or any other post-processing task you wish to perform. Confidence, reliability, and reproducibility in data acquisition and preparation are the linchpins of maximizing your data’s value.
This course uses a variety of real-world data sets that contain real-world data quality, formatting, and other issues. It ensures that you understand not just the R syntax to perform a task, but also the sources of quality issues, how to recognize hidden data problems, and the benefits and adverse effects of the most common data manipulations. You will gain real experience in the art and science of data preparation that you can carry forward to your next real project with confidence.
The capstone project uses open agricultural industry data, which you prepare for a future statistical analysis of the companies’ products and brands. As in a real project, the project goals and background are provided but the step-by-step data preparation is not; the course will have provided the methods and insights needed to prepare this data. The instructor reviews each capstone project and provides individual feedback to each student, along with a full project solution.
What am I going to get from this course?
- Understand the R syntax needed to perform each data preparation task
- Identify sources of quality issues
- Recognize hidden data problems
- Understand the benefits and adverse effects of the most common data manipulations
- Prepare a real-world data set for future statistical analysis and use the capstone project as a portfolio piece