
Cleaning a Historical End-of-Day Financial Dataset to Create an Accurate Dataset (Data Prep / Data Quality)

Industry: Financial Services

Specialization or Business Function: Finance, R&D

Technical Function: Analytics (Data Preparation)

Technology & Tools: Big Data and Cloud (MySQL), Data Warehouse Appliances, Programming Languages and Frameworks (C#, Python)

COMPLETED Dec 04, 2015

Project Description

You will clean our historical end-of-day financial dataset, which contains 8 fields: date, symbolid, symbol, open, high, low, close, and volume. The dataset contains about 17M records. Most records are clean; however, some contain NULLs, zeros, or invalid values, and there are some duplicate records. See the attached sample_data.csv.
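As a rough illustration of the checks described above, the following Python sketch flags records with missing values, zero prices, inconsistent values, and duplicates. It uses only the standard library. The field names come from the description; everything else is an assumption made for illustration: the inline sample rows are invented (they are not taken from sample_data.csv), "invalid" is interpreted as a high below or a low above the other prices, duplicates are keyed on (date, symbolid), and a zero volume is not flagged because it can be legitimate for an untraded symbol.

```python
import csv
import io
import math

PRICE_FIELDS = ("open", "high", "low", "close")

# Invented sample rows for illustration only (not the real sample_data.csv).
SAMPLE = """date,symbolid,symbol,open,high,low,close,volume
2015-11-02,1,AAA,10.0,10.5,9.8,10.2,1000
2015-11-02,1,AAA,10.0,10.5,9.8,10.2,1000
2015-11-03,1,AAA,NULL,10.6,10.0,10.4,1200
2015-11-04,1,AAA,10.4,0,10.1,10.3,900
2015-11-05,1,AAA,10.3,10.2,10.0,10.4,800
"""

def parse_number(text):
    """Return a finite float, or None for empty, NULL, or non-numeric text."""
    if text is None:
        return None
    text = text.strip()
    if text == "" or text.upper() == "NULL":
        return None
    try:
        value = float(text)
    except ValueError:
        return None
    return value if math.isfinite(value) else None

def find_problems(row):
    """Return a list of problem labels for one record (empty list means clean)."""
    problems = []
    values = {f: parse_number(row.get(f)) for f in PRICE_FIELDS + ("volume",)}
    for field, value in values.items():
        if value is None:
            problems.append(f"{field}: missing or non-numeric")
        elif field in PRICE_FIELDS and value <= 0:
            problems.append(f"{field}: zero or negative")
        elif field == "volume" and value < 0:
            problems.append("volume: negative")
    prices = [values[f] for f in PRICE_FIELDS]
    if all(p is not None for p in prices):
        # Assumed validity rule: high is the largest price, low the smallest.
        if values["high"] < max(prices) or values["low"] > min(prices):
            problems.append("high/low inconsistent with open/close")
    return problems

def audit(rows):
    """Split rows into clean records and (record, problems) pairs."""
    seen = set()
    clean, flagged = [], []
    for row in rows:
        key = (row.get("date"), row.get("symbolid"))  # assumed duplicate key
        problems = find_problems(row)
        if key in seen:
            problems.append("duplicate (date, symbolid)")
        seen.add(key)
        if problems:
            flagged.append((row, problems))
        else:
            clean.append(row)
    return clean, flagged

clean, flagged = audit(csv.DictReader(io.StringIO(SAMPLE)))
for row, problems in flagged:
    print(row["date"], row["symbol"], problems)
```

For the full 17M-record file, the same `audit` function could be fed from `csv.DictReader(open(path, newline=""))`, since it processes one row at a time and only keeps the set of seen keys and the flagged rows in memory; loading the data into MySQL and expressing the checks in SQL would be an equally reasonable choice.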

This project will involve reviewing the full CSV of this data and extracting supplemental data from Xignite for comparison. You will also gather free end-of-day data from Yahoo Finance and Google Finance as an additional comparison. The datasets must then be compared. The goal is to create a single, most accurate dataset, which will likely be a combination of the sources.
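One possible way to combine the sources is sketched below in Python. The description does not specify a reconciliation rule, so the rule here is an assumption: for each (date, symbol) and field, take the median of the values that the sources provide, and record any source that differs from that median by more than a relative tolerance. The source names (`internal`, `vendor_a`, `vendor_b`), the `reconcile` function, the tolerance, and the sample values are all hypothetical; fetching data from Xignite, Yahoo Finance, or Google Finance is deliberately left out.

```python
from statistics import median

FIELDS = ("open", "high", "low", "close", "volume")

def reconcile(sources, rel_tol=0.005):
    """Merge several sources into one dataset using a per-field median.

    sources maps a source name to {(date, symbol): {field: float or None}}.
    Returns (merged, disputes), where disputes lists (key, field, note).
    """
    keys = set()
    for records in sources.values():
        keys.update(records)

    merged, disputes = {}, []
    for key in sorted(keys):
        merged[key] = {}
        for field in FIELDS:
            # Collect the values that are actually present for this field.
            candidates = {
                name: records[key][field]
                for name, records in sources.items()
                if key in records and records[key].get(field) is not None
            }
            if not candidates:
                merged[key][field] = None
                disputes.append((key, field, "no source has a value"))
                continue
            consensus = median(candidates.values())
            merged[key][field] = consensus
            # Record sources that deviate from the consensus beyond tolerance.
            outliers = sorted(
                name for name, value in candidates.items()
                if abs(value - consensus) > rel_tol * max(abs(consensus), 1e-12)
            )
            if outliers:
                disputes.append((key, field, "disagrees: " + ", ".join(outliers)))
    return merged, disputes

# Hypothetical data: the internal record has a zero close that the others lack.
key = ("2015-11-02", "AAA")
good = {"open": 10.0, "high": 10.5, "low": 9.8, "close": 10.2, "volume": 1000}
sources = {
    "internal": {key: dict(good, close=0.0)},
    "vendor_a": {key: dict(good)},
    "vendor_b": {key: dict(good)},
}
merged, disputes = reconcile(sources)
print(merged[key]["close"], disputes)
```

A median needs at least three sources to outvote a single bad value; with only two sources it reduces to their average, so disagreements would still need manual review or a source-priority rule. The dispute list is kept so that every field taken from a non-unanimous vote can be audited later.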

Qualifications: Quality-oriented. Programming and database skills (SQL, XML, and some method of connecting to APIs, e.g. Python or C#). Detail-oriented data cleaning experience. A financial industry background or an interest in the stock market is helpful.

In your proposal, please estimate how many hours this project will require.

Project Overview

  • Posted
    November 05, 2015
  • Planned Start
    November 14, 2015
  • Preferred Location
    From anywhere

EXPERTISE REQUIRED
Data Quality
