Experfy Big Data, Analytics, and BI Projects

381 Projects that match your criteria

Curate and Analyze Publicly Available Financial Data for Business Intelligence

We would like to aggregate both structured and unstructured financial data to inform our decision-making process.

13F data: Funds with assets of over $100M are required to publish their equity holdings on a quarterly basis via SEC Form 13F. We want to incorporate the information disclosed in these filings to evaluate current and prospective investments. Our goal would be to analyze the data to help inform our decision-making process. A solution would involve a mix of data feeds: 13F filings and stock prices for the holdings listed in the 13F. We currently have multiple
Public Company Filings & Relevant News: Feed relevant public filing and news data on companies within our portfolio. This would include holdings brought in via 13F data as well as any additional companies that we wish to track.
Stock Price Alerts: Track any large fluctuations in tracked companies listed above.
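
As a rough illustration of the stock price alert item above, the sketch below flags day-over-day moves beyond a threshold. The function name, the 5% default threshold, and the sample tickers and prices are all hypothetical; the posting does not define what counts as a "large fluctuation" or name a price feed.

```python
from typing import Dict, List, Tuple


def find_price_alerts(
    closes: Dict[str, List[float]], threshold: float = 0.05
) -> List[Tuple[str, int, float]]:
    """Return (ticker, day_index, pct_change) for each day-over-day move
    whose absolute size is at least `threshold` (an assumed cutoff)."""
    alerts = []
    for ticker, prices in closes.items():
        for i in range(1, len(prices)):
            prev, curr = prices[i - 1], prices[i]
            if prev == 0:
                continue  # skip bad data rather than divide by zero
            change = (curr - prev) / prev
            if abs(change) >= threshold:
                alerts.append((ticker, i, round(change, 4)))
    return alerts


# Hypothetical closing prices, for illustration only
sample = {"AAA": [100.0, 101.0, 94.0], "BBB": [50.0, 50.5, 51.0]}
alerts = find_price_alerts(sample)
print(alerts)
```

In a real setup the prices would presumably come from the SQL Server database mentioned below, and the threshold would be set per holding.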

Technology Stack:

Database: We maintain a SQL Server at AWS and would anticipate utilizing this going forward.
Business Intelligence Tool: We have evaluated multiple BI solutions (Tableau, MSFT Power BI, Domo, Qlik, etc.). The goal would be to transition our current reporting setup (via SSRS) to a more dynamic BI solution that is available on demand. The solution needs to be mobile friendly.

Please provide your approach, how you would structure the milestones and how much time it would take.

Financial Services

Portfolio Optimization

Risk and Compliance

$100/hr - $150/hr

Starts Jan 19, 2017

11 Proposals Status: COMPLETED

Client: S****** ***** ***

Posted: Dec 26, 2016

Health Economics / Outcomes Research Data Scientist

We need the following skill set to help support a project and to train another data scientist.

Required Skills:

Deep understanding of healthcare databases (e.g., claims, EHR, hospital, registry) and the pros/cons of various sources
Experience in conducting a range of real world health research studies (e.g., retrospective database analyses, cost effectiveness, comparative effectiveness)
Strong background in research methodology and study design
Experience developing Research Plans and Protocols
Experienced in creating and programming epidemiologic and economic models
Ability to manage time and prioritize tasks

Qualifications:

Master’s or PhD in relevant area (e.g., health economics, epidemiology)
Minimum 3 years of combined experience in outcomes research, health economics, epidemiology, or directly related field

Pharmaceutical and Life Sciences

Biology, Health and Medicine

Analytics

$50/hr - $150/hr

Starts Jan 01, 2017

10 Proposals Status: CLOSED

Client: D*** ***** ********

Posted: Dec 22, 2016

Development of a Resume Scoring Algorithm Follow-up

We are a provider of eRecruitment technology which is used by our clients to manage the workflow of recruiting new hires including the following steps: posting vacancies, providing online application forms, integration of recruitment tests, communication with candidates etc.

This is a follow-up project and will be awarded to the same data scientist as before. The details have been discussed already.

Professional Services

Job Applicant Scoring

Human Resources

$24,000 - $25,000

Starts Feb 02, 2017

4 Proposals Status: IN PROGRESS

Net 30

Client: W*** ***

Posted: Dec 15, 2016

Cluster Analysis and Wheel Visualization.

Building a series of breakthrough visualizations for many analysis tasks on the DOMO platform. You are seeking a qualitatively improved way to view clusters of information, compared to existing methods. Viewing data that naturally “clusters together” is of value in many application domains, including data formatted as surveys, transactions, and text. In preparation for this project, we collaborated on a UI sketch, which we have rendered in a mockup image below.

Key elements include the following:

1) Data for the cluster analysis is derived by a process similar to the one we followed in Market Basket. A transaction file is converted to a cluster file by our back-end AWS framework, which is then returned to the front end.

The difference is that, here, clusters may contain more than two items. Some details:

a. As with the Market Basket application, we will create an iFrame within the DOMO application that the user can use to send transaction data to a back-end processor, which does combinatoric search to find the clusters with greatest support. Note that, as with Market Basket, this back-end task is nontrivial and if configured incorrectly can take a very long time.

b. To generate the clusters, we will use the frequent itemset method. This approach uses heuristics to reduce the time required for the 2^n search for all clusters; the algorithm involves
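
The posting is cut off before it describes the algorithm, so the following is only a minimal sketch of one common frequent itemset approach (an Apriori-style search), not necessarily the method the client has in mind. It prunes the 2^n search by counting a size-k itemset only when all of its (k-1)-subsets are already frequent. The function name, the sample baskets, and the `min_support` value are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set


def frequent_itemsets(
    transactions: List[Set[str]], min_support: int
) -> Dict[FrozenSet[str], int]:
    """Return every itemset that appears in at least `min_support`
    transactions, mapped to its support count."""
    # Level 1: count single items
    counts: Dict[FrozenSet[str], int] = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(current)
    k = 2
    while current:
        # Build size-k candidates by joining frequent (k-1)-itemsets,
        # keeping only those whose (k-1)-subsets are all frequent
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current
                for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        # Count support only for the surviving candidates
        current = {}
        for cand in candidates:
            support = sum(1 for t in transactions if cand <= t)
            if support >= min_support:
                current[cand] = support
        result.update(current)
        k += 1
    return result


# Hypothetical transaction data, for illustration only
baskets = [
    {"bread", "milk", "eggs"},
    {"bread", "milk"},
    {"bread", "eggs", "milk"},
    {"milk", "tea"},
]
clusters = frequent_itemsets(baskets, min_support=2)
for itemset, support in sorted(clusters.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), support)
```

Setting `min_support` too low lets the candidate set grow quickly, which is one way the back-end task can end up taking a very long time.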

Market Research

Customer Analytics

Domo

$22,000

Starts Dec 13, 2016

1 Proposal Status: COMPLETED

Client: V********

Posted: Dec 12, 2016

Predict the Trustworthiness and Background of a Person Using Multiple Sources of Data

Summary

We would like to build a product that can accept some basic identifying information about a person of interest and then use both traditional and non-traditional public data to determine whether that person is safe and trustworthy.

We want to do this by replicating the steps taken by a great private investigator when they perform this very task for their clients. A great private investigator will do this in the following ways:

Obtain some basic identifying parameters about the person of interest. This typically includes things like a photo, name, date of birth, email address, birth place, last known city and state, etc.
Run a background check on the person of interest using TLO or Checker or similar. These background checks are fairly commoditized and are primarily checking things like state and county criminal conviction records and other traditional public records.
Check all public social media going back as many years as possible to see if the person posted anything racist or otherwise egregious or salacious.
Search news and media archives to see if the person was named in a scandal or anything else adverse that wouldn't necessarily show up in a traditional background check.
Do a Google search on the person of interest and go several pages deep looking for anything adverse.
Search any data that was hacked and then dumped in the dark web. HaveIBeenPwned.com is a good aggregator of this data.
Do all of the above on not only the person of interest but also on their 5-10 closest friends and family members or anyone they shared an address or phone number with in the last few years. Looking, for example, to see if they have a clean record but live with 5 family members who were all convicted of fraud in the last few years. This would imply that the person of interest is at least questionable and not entirely clean.
Present all of their findings in a nice, simple report. This report will have a summary page that looks like a credit report summary page showing some score or grade on trust & safety, a count of how many negative hits there were, a count of how many positive hits there were, and a count of anything neutral or in need of further investigation. The remainder of the report will have the raw or detailed results. See sample reports attached.

Scope of Work

We are taking a phased approach, so this specific project will consist of building a minimum viable product demonstrating the basic ability to take some basic identifying information about a person of interest and return results or report showing their trust and safety score or grade, a summary of negative / positive / inconclusive hits, and a verbose list of detailed results.

The MVP should consist of a few main parts:

Data Sources

The MVP should be limited to these data sources:

Traditional background check data using TLO or Checker or a similar API (it is ok if this costs money - we want quality data)
A News API (Google news? Yahoo! news? AP?)
The HaveIBeenPwned data that was recently made available

Algorithm/s

You will need to come up with some way to match the parameters we get about the person of interest with the data sets that we are searching against.
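
One possible starting point for this matching step is a weighted fuzzy comparison of identifying fields, sketched below with Python's standard `difflib`. The field names, weights, and sample records are hypothetical, and a production matcher would likely need more robust techniques (for example, date normalization and name-variant handling).

```python
from difflib import SequenceMatcher
from typing import Dict


def field_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] using difflib's ratio."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def match_score(
    query: Dict[str, str], record: Dict[str, str], weights: Dict[str, float]
) -> float:
    """Weighted average similarity over fields present in both inputs.
    Returns 0.0 when the two inputs share no weighted field."""
    total, weight_sum = 0.0, 0.0
    for name, w in weights.items():
        if name in query and name in record:
            total += w * field_similarity(query[name], record[name])
            weight_sum += w
    return total / weight_sum if weight_sum else 0.0


# Hypothetical fields, weights, and records, for illustration only
weights = {"name": 0.5, "dob": 0.3, "city": 0.2}
person = {"name": "Jane Q. Doe", "dob": "1980-02-14", "city": "Austin"}
candidate = {"name": "jane q doe", "dob": "1980-02-14", "city": "Austin"}
score = match_score(person, candidate, weights)
print(f"match confidence: {score:.0%}")
```

A score like this could serve as the per-match percentage shown to reviewers, with their accept/correct decisions later used to tune the weights.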

Training / Machine Learning

We will need a way for our employees and investigators to review the result sets and reports, see a % for each match showing how accurate you think it is, and either accept it as accurate or correct it. This human feedback should train the algorithm to match better the more times we do this. We assume this will be done through some form of machine learning but are open to specific suggestions.

Budget and Timeline

Please provide an estimate of hours required to build the minimum viable product, which takes data from the three sources and displays the desired report on a dashboard. The user can then mark each match with "thumbs up" or "thumbs down."

In your response, please also provide details of the technology stack you would use. Please keep in mind that the MVP will have a user interface but it need not be polished.

The budget for the entire version 1 which can be used in production is USD 100K-150K. We would expect the expert selected to work with us for the entire project, however we currently need an estimate for phase 1, i.e. MVP suggested above.

Professional Services

Fraud Identification and Prevention

Risk and Compliance

$50,000 - $75,000

15 Proposals Status: CLOSED

Client: T********

Posted: Dec 11, 2016

Data Mining of a Small Text-based Dataset

Data mining with RapidMiner

Doctoral candidate requesting a RapidMiner professional experienced with text data mining. Must be certified and will need to provide evidence of certification.

Looking for help with trend/topic detection in a ~2000 record dataset. 1 column/field. As many as 400-500 words in a record. See attached data sample.

The data is extracted from a Help Desk database. I'm interested in data mining, trend/topic detection for column F (Description).

Looking for someone to do the mining tasks. I can do data cleansing. I'm looking for a partner that I can discuss the project with. I'll tell you what I'm trying to do, you tell me how you need the data cleaned and prepped, then I'll tell you what (I think) I need from mining round 1. You provide the results and details of the methodology, I'll review and request a second round. We'll do this a few times. My ultimate goal is something like a ranked list of topics reported to the help desk.

As this is for a doctoral dissertation, I'll need clear details on exactly what steps you executed.

Data is .csv or Excel, or anything else you need. Attached example is in Excel.

Deliverables are the results of the initial analyses and some secondary analyses. Final deliverable is a list of trends across the records in the dataset, with frequency (etc) information. Excel or Word or whatever format you like. Will need to discuss analyses and results with the miner.

When providing proposal, please consider this is a student project, and not a commercial project. Please submit proposal accordingly.

I am a doctoral candidate and paying for this out of pocket. It's not a huge pay day but it's not much work either! I can provide a research assistant credit if you'd like. (Dates in the posting are approximate).

Education

Non-Profit

Education

$80/hr - $100/hr

Starts Dec 21, 2016

7 Proposals Status: COMPLETED

Client: J**** ******

Posted: Dec 01, 2016

Market Segmentation Using Cluster Analysis Based On Survey Data

We've conducted a large consumer survey (5,000 respondents) that will yield approximately 100 variables about individual survey takers and are looking to use this data set to inform a market segmentation using cluster analysis (Two-Step, Latent Class and/or K-means), followed by discriminant analysis to identify the most important predictors of segment membership. Variables include relatively standard consumer market segmentation qualities:

Demographics
Psychographics
Technographics
Attitudinal Statements
Shopping/Buying Habits & Behaviors

We have a set of hypotheses that can be used as a starting point for this analysis and are looking to test these and formalize them through statistical analysis of the data, which currently resides in Qualtrics and can be easily exported in the desired format.

Key deliverables include:

Receive briefing on the data and project
Collaborate on hypothesis establishment for segment
Formulate recommended clustering approach and methodology for approval
Run clustering analysis (3 rounds, with refinements between)
Descriptive analysis of clusters and identification of interesting similarities/differences
Final report document on segments and key characteristics
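
For orientation, here is a minimal sketch of K-means, one of the clustering methods named in the posting. It uses plain Python and a deterministic initialization (the first k points) purely to keep the example reproducible; real survey data would need standardized variables and a better initialization, and the two-variable sample points are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(
    points: List[Point], k: int, iterations: int = 20
) -> Tuple[List[int], List[List[float]]]:
    """Lloyd's algorithm: alternate between assigning each point to its
    nearest centroid and moving each centroid to the mean of its members."""
    centroids = [list(p) for p in points[:k]]  # simplistic init (assumption)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step; an empty cluster keeps its previous centroid
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical respondents scored on two standardized variables
respondents = [
    (1.0, 1.0), (9.0, 9.0), (1.5, 2.0),
    (8.0, 8.5), (1.0, 0.5), (9.5, 8.0),
]
labels, centroids = k_means(respondents, k=2)
print(labels)
print(centroids)
```

The resulting labels could then feed the discriminant analysis step to see which variables best predict segment membership.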

Market segmentation

Customer Segmentation

$2,500 - $5,000

Starts Nov 22, 2016

20 Proposals Status: CLOSED

Client: P*******

Posted: Nov 20, 2016

3D Cubing For Top E-commerce Retailer

About us:

As part of our fulfillment, we process online orders and find the best package (box/bag) to group items and ship to the customer.

The problem:

We want to make the best use of box capacity in order to reduce the number of boxes/bags required to cube a customer order, lower overall shipping cost, and improve the customer experience.

The solution should contain mock data, test simulation and results analysis

Expertise Required: Expert in optimization and/or algorithms to design an approach for solving the above problem. If the expert is proficient in Java and can code the solution, that will be an added bonus.

Our current technology Stack: Java 8

Deliverables: Algorithm, pseudocode, simulated test, and analysis of the results. Making the algorithm work in Java is an added bonus. The deliverables will be deployed in our infrastructure.

Sample Data: We can provide data in CSV format. Here is what it looks like

Number of different sized boxes: N
Width of nth box: a_n
Depth of nth box: b_n
Height of nth box: c_n
Weight of nth box: w_n
Volume of nth box: v_n
Box type (Bag/Box) of nth box: t_n
Box weight of nth box: bw_n
Box volume of nth box: bv_n

Order Matrix

Number of Items/Item attributes ..... : M*N (M - Number of Items and N number of attributes)
N(1) - Unit Weight of the item
N(2) - Unit Volume of the item
N(3) - Unit Height of the item
N(4) - Unit Width of the item
N(5) - Unit Depth of the item
N(6) - Items combine indicator (We can group items having the same indicator; items with different indicators cannot be grouped into the same package)
N(7) - Orientation indicator(Garment items can be folded) Y/N
N(8) - Comma separated Box Type (Box,Bag....)
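
As a baseline for comparison, the sketch below applies a first-fit decreasing heuristic using only volume, weight, and the combine indicator. It is a simplification under stated assumptions: it uses a single box size, ignores the width/depth/height geometry, orientation, and box type attributes, and is written in Python for brevity even though the client's stack is Java 8. All class names and sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    name: str
    volume: float
    weight: float
    group: str  # items combine indicator, N(6)


@dataclass
class Box:
    max_volume: float
    max_weight: float
    group: str
    items: List[Item] = field(default_factory=list)

    def fits(self, item: Item) -> bool:
        used_volume = sum(i.volume for i in self.items)
        used_weight = sum(i.weight for i in self.items)
        return (
            item.group == self.group
            and used_volume + item.volume <= self.max_volume
            and used_weight + item.weight <= self.max_weight
        )


def first_fit_decreasing(
    items: List[Item], max_volume: float, max_weight: float
) -> List[Box]:
    """Place items, largest volume first, into the first open box that can
    take them; open a new box when none can."""
    boxes: List[Box] = []
    for item in sorted(items, key=lambda i: i.volume, reverse=True):
        if item.volume > max_volume or item.weight > max_weight:
            raise ValueError(f"{item.name} does not fit in an empty box")
        for box in boxes:
            if box.fits(item):
                box.items.append(item)
                break
        else:
            boxes.append(Box(max_volume, max_weight, item.group, [item]))
    return boxes


# Hypothetical order, for illustration only
order = [
    Item("A", 6, 1, "1"), Item("B", 5, 1, "1"), Item("C", 4, 1, "1"),
    Item("D", 3, 1, "2"), Item("E", 2, 1, "1"),
]
packed = first_fit_decreasing(order, max_volume=10, max_weight=5)
for number, box in enumerate(packed, start=1):
    print(number, [i.name for i in box.items])
```

A volume-only heuristic can accept combinations that do not physically fit, so a full solution would still need a geometric check on the three box dimensions.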

Consumer Goods and Retail

Hi-Tech

Transportation and Warehousing

$80/hr - $200/hr

Starts Dec 15, 2016

12 Proposals Status: COMPLETED

Client: M***** ******* *** **********

Posted: Nov 08, 2016

Analytics Strategy: Capabilities Evaluation and Redesign

For starters we are looking for an expert with the best in class understanding of the broadest big data/analytics landscape, including systems. We are looking for a consultant who can:

Understand - our broad business strategy and needs

Evaluate - the current state of our analytics technology and data

Design - create a high-level design/architecture and structure for our to-be analytics capabilities

Identify Gaps - with respect to addressing our business strategy

Recommend - various paths/options to achieve our objectives given constraints

We would like a very quick turnaround of the project (~3-5 weeks) and would like to work only with the best experts who have a very good sense for what the leading companies are doing, know the best available technology options, and possess the consultative skills to work with us and help solve problems effectively. Some understanding of health care data may be valuable but it is not critical. We would like to start the project as early as possible.

We are a leading Health Care Technology, Research and Consulting company working with most of the largest Hospital Systems and other constituents of the industry.
Deliverable: Report with recommendations

Healthcare

Biology, Health and Medicine

Consumer Experience

$200/hr - $250/hr

Starts Nov 14, 2016

20 Proposals Status: CLOSED

Client: A******** *****

Posted: Nov 06, 2016

Predict Store Traffic for a Top E-commerce Retailer

Problem Description:

We are a top ten E-commerce retailer looking to create a working mathematical model or algorithm to predict customer traffic hours so that store associates can be fully available to connect with customers and engage in their journey.

The goal is to deliver a working algorithm to predict store traffic and pilot it in a few stores.

Technology stack: open to suggestions, but the current in-house stack includes Java, Gurobi, Spark, Hadoop, and the TIBCO suite (ActiveSpaces, BW, BE, EMS).

In your proposal, please tell us your approach and suggest what factors we should look into to predict customer traffic: demographics, nearby locations, office or residential area, time of the year, upcoming events, one-day sales, etc. We can use online data, as we do not have or own any of the external data. We will be glad to provide you with any relevant data, if needed and if we own it.