Browse Projects

290 Projects that match your criteria

Data Anomaly Detection and Suggestions

We are planning to develop an unsupervised learning solution to detect anomalies in structured data. Our data will be single tables only. We need a solution that detects anomalies across various data types (time series, configuration, etc.). It should identify normalcy patterns in individual columns and in inter-column relationships, identify anomalies within those patterns, and suggest what a correct value could be (with some confidence interval).

I realize the description above is a bit generic, but that's because we are looking to develop a base solution that works (to some level) in an unsupervised manner across a wide variety of data types. We expect to need further tuning in order to maximize signal from different data types.
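
As a rough illustration of the single-column case only, here is a minimal sketch that flags outliers in one numeric column with a robust z-score (median and median absolute deviation) and proposes the column median as a replacement. The function name, the 3.5 threshold, and the sample column are assumptions made for this example; the posting does not specify a method, and inter-column relationships and confidence intervals are not covered.

```python
from statistics import median

def flag_anomalies(values, threshold=3.5):
    """Return (index, value, suggested_value) for points with a large robust z-score.

    Hypothetical helper: the threshold and the median-based suggestion are
    illustrative choices, not requirements from the posting.
    """
    med = median(values)
    # Median absolute deviation: a spread estimate that outliers barely move.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no measurable spread; this sketch flags nothing
    flagged = []
    for i, v in enumerate(values):
        # 0.6745 rescales MAD so the score is comparable to a standard z-score.
        score = 0.6745 * (v - med) / mad
        if abs(score) > threshold:
            flagged.append((i, v, med))
    return flagged

# Made-up sample column with one obvious outlier.
column = [10.1, 9.8, 10.3, 10.0, 55.0, 9.9, 10.2]
print(flag_anomalies(column))
```

A median-based score is used here because a plain mean and standard deviation are themselves distorted by the outliers being searched for.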

Stage 1: Planning (current project)

For this stage we are interested in finding data scientists with relevant background and experience in data anomaly detection similar to what is described above. During this stage we will hire (pay) 1-3 data scientists to consult for one phone call on project planning and strategy. Please apply to this project and explain why we should consult with you.

Stage 2: Execution (future project)

A future project posting will provide additional details and ask for proposals. We will hire data scientist(s) to develop the solution.

Financial Services
Hi-Tech

$100/hr - $200/hr

4 Proposals Status: HIRING

Net 30

Company small

Client: C*******

Posted: Nov 22, 2017

Schedule Reconciliation Problem

We have challenges reconciling data related to Subject Visits. It is difficult to aggregate this data accurately when the values used to represent a Visit vary by source. We currently have 3 sources of visit data:

  • Protocol Document
  • EDC
  • IRT

We are seeking a data scientist who can propose an AI technique to automate generating a concordance mapping between the systems. It would need to detect similarities between the Names and, with supporting actual dates data, determine the mapping with high confidence. Our efforts to manage this issue manually have never gotten any traction.

There is a business need not only to reconcile the naming, but also to enhance the data with known values about the visit: Was this a Dosing Visit, Unplanned Visit, End of Study Visit, Payable Visit?
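
To make the name-matching part of the request concrete, here is a minimal sketch that pairs each visit name from one source with its most similar name in another, using the standard-library `difflib.SequenceMatcher`. The visit names, the `normalize` and `best_matches` helpers, and the 0.6 cut-off are all hypothetical; the sketch ignores the supporting date data and is a baseline to compare against, not the AI technique the posting asks for.

```python
from difflib import SequenceMatcher

def normalize(name):
    """Lower-case a visit name and treat '_' and '-' as spaces (assumed cleanup rules)."""
    cleaned = name.lower().replace("_", " ").replace("-", " ")
    return " ".join(cleaned.split())

def best_matches(source_names, target_names, min_score=0.6):
    """Map each source name to (best target name or None, similarity score)."""
    mapping = {}
    for source in source_names:
        scored = [
            (SequenceMatcher(None, normalize(source), normalize(target)).ratio(), target)
            for target in target_names
        ]
        score, target = max(scored)
        # Below the cut-off the pair is left unmapped for manual review.
        mapping[source] = (target if score >= min_score else None, round(score, 2))
    return mapping

# Hypothetical visit names from two of the systems.
edc_names = ["Screening Visit", "Week 4", "End of Study"]
irt_names = ["SCREENING", "WEEK_4", "END-OF-STUDY"]
for name, (match, score) in best_matches(edc_names, irt_names).items():
    print(name, "->", match, score)
```

The similarity score doubles as a crude confidence value; pairs that fall below the cut-off are exactly the ones where the date data would have to decide.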

Pharmaceutical and Life Sciences

$100/hr - $150/hr

Starts Nov 27, 2017

3 Proposals Status: HIRING

Net 30

Company small

Client: C*******

Posted: Nov 21, 2017

MVP - The Future of Litigation: Practicing Law With a Crystal Ball

Technology and the law currently overlap in meaningful, but largely incomplete, ways.  This is an opportunity to bridge the gap and change the legal landscape forever. 

  1. Understanding the problem.

The entire legal industry is premised on the notion that rules, statutes, and prior decisions by judges and courts (caselaw) govern the decision-making process for lawyers when representing clients.  Of course, a lawyer's education, perspective, and unique problem-solving abilities will affect the decision making process, but not as much as you might think.  Almost invariably this is how a lawyer makes a decision for a client:

  • What does the rule of procedure say we must do?
  • Is there a statute that controls this situation? 
  • How have prior judges and/or courts ruled on similar arguments under similar circumstances?

Stated another way, lawyers rely almost exclusively on the already-existing law (rules, statutes, caselaw) to create arguments and advance a certain strategy instead of another--nothing else.  This is how we are trained in law school; this is what other lawyers and judges expect; this is how the system has always worked.  But there's more data/information out there that already exists and is NOT being utilized.  Data that could prove to be far more valuable in terms of correct decision making than any rule, statute, or case.  

  2. Technology and the Law: Some Meaningful Overlap, but Something is Missing

It wasn't that long ago that if a lawyer needed to research a rule, statute, or caselaw, he or she would have to physically go to a law library, locate and retrieve books, and read them. Then, with the advance of technology, came web-based databases that stored the information and allowed lawyers to browse it at lightning speed (e.g. Westlaw and LexisNexis). That marketplace exists, is controlled by major players, and lawyers' use of their products is ubiquitous.

More recently, law firms were managed in paper-heavy, intensive physical environments. Physical files and endless documents all made for inefficient management of legal operations. Needless to say, creating and managing task assignments and workflow was really challenging. Then came seemingly dozens of vendors all proclaiming to have the solution (e.g. PracticePanther, RocketMatter, FileVine, etc.). And they all help in many ways. Lawyers are beginning to use them more and more.

In conclusion, systems exist to help lawyers research the law, communicate with their colleagues, organize and store information, and even automate certain processes (e.g. timekeeping, form document generation, etc.). And while that may seem like a complete solution, it's not.

  3. The Most Important Data Lawyers Should Consider Before Making a Decision Is Not Currently Available to Them: We Are Going to Create It!

Lawyers who handle litigation (that is, lawsuits or criminal prosecutions where cases are actually fought in courts) are working at a serious disadvantage--they just don't know it.  (Trust me, I am a litigation lawyer).  

Yes, whether a rule or statute exists and addresses a particular concern is important to know.  

Yes, a judge or court's prior decision in a written opinion is important, but doesn't give the full picture.  It's too surface level.

What if a criminal defense lawyer currently represents a 26-year-old white male with no prior criminal record, charged with DUI in Miami, Florida, before Judge John Smith and opposite prosecutor Jane Doe, and wants to know:

  • Does this particular prosecutor ever negotiate plea bargains to amend the charge to a lower level instead of just DUI?
  • What is the likelihood that this prosecutor will offer jail time in exchange for a guilty plea?
  • If my client pleads guilty, what is the likelihood that this judge will sentence him to jail?  What if we go to trial instead of a guilty plea but at trial the jury still finds my client guilty?  Will that affect the way this judge sentences my client?
  • There are literally hundreds if not thousands of other similar queries litigation lawyers always think of but can never answer...until now.

We are looking for one or more creative, innovative, hard-working people to help develop a web-based environment for litigation lawyers that pulls data from available public records and is also driven by those same user-lawyers, who input case- and client-specific information about judges, other lawyers, witnesses, insurance adjusters, jurors, etc. The eventual goal is to show statistical probabilities of certain outcomes based on specific queries and, in some instances, to demonstrate predictive outputs. This environment will be largely driven by users from around the country who input data while being prompted repeatedly in a non-exhausting, inviting way.

At first, my own law firm can provide lots of data (and guidance) to help build this platform.  We can even beta test it.  The ultimate goal is to commercialize the platform.  We are looking for long term developers, not just a one and done.  

This project will require immense and particular knowledge of how litigation works. The nuance is so complex that only a lawyer would understand. However, I am confident that I can translate it and work diligently to help whoever works on this project get it done.

Imagine if lawyers could predict the future?  This could and would change the practice of law forever.  Completely disrupt the market.

  4. We Are Open Minded to the Best Path to Reach Our Goal, But Here Is My Rough Idea of the Journey
  • Explore - Using my own law firm and tons of data for personal injury and criminal defense cases in Florida, we can provide both data and tons of example queries and factors you should consider when developing the platform. This is more exploration for you to gain deeper insight into how litigation works, what types of data will be needed, and what types of queries the system will need to be able to process.
  • Test and Fix - We can first start beta testing with my own law firm.
  • Limited Launch - Launch the platform and begin to market it to other attorneys we know.
  • Fix the Bugs - Absorb the feedback and fix the problems that need to be addressed
  • Major Launch - Develop a website to accompany the platform and to drive traffic/potential users to it. That website will also be where users log in, where info on our platform is stored, and other basic company stuff. But this is the first meaningful step in commercializing the product via a subscription or license model.
  • Ongoing Development and Support - Even after the launch, keep you on board for the future to help continually develop, fine tune, and better the platform.

We are looking to first build an MVP. In your proposal please submit the milestones for the MVP.

Legal
Customer Behavior Analysis
Consumer Experience

$10,000 - $30,000

Starts Nov 18, 2017

3 Proposals Status: HIRING

Company small

Client: F********************

Posted: Nov 21, 2017

Statistical Model to Establish Relative Strength of a Business Based on Online Ratings and Reviews

We have an existing statistical model that accomplishes the scope as outlined below. We want to enhance it as we have more data points. The model calculates an apartment community’s online reputation as compared to the entire population.

We collect ratings and number of reviews from various sites on over 70,000 apartment communities on a monthly basis.

As of now, there are about 19 sites that we collect data on. The number of sites is growing. For each site, we gather two variables i.e. the aggregate star rating of the property and the number of reviews that make up the star rating. Most sites have a 5 point scale. There is one site apartmentratings.com that lists the percentage of people recommending the property in addition to the star rating and the number of reviews. Please see the attached sample data.

All data and analysis will be in Excel. The model should be in Excel too.

The goal is to generate an overall rating that aggregates the websites’ reviews. The rating should be simple to implement, stable, and accurate. The end result should be a score for each property on a scale of 100 and should serve as a relative ranking of a property’s online reputation.

We have data on which sites are more important to prospects while looking for an apartment. We’ll work with the consultant to fine-tune the weighting of the various sites. The resulting scoring methodology needs to be tested using various methods such as a sigmoid function, Bayesian weighting, and mean absolute error.
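
As an illustration of what Bayesian weighting could look like for this kind of data, the sketch below shrinks each site's star rating toward a prior mean in proportion to how few reviews back it, then combines the sites with importance weights and rescales to 100. It is written in Python only to show the arithmetic, which carries over directly to spreadsheet formulas. The function names, the prior mean, the prior weight, and the site weights are assumptions for the example, not values from the existing model.

```python
def bayesian_site_rating(rating, n_reviews, prior_mean, prior_weight):
    """Shrink a site's average rating toward prior_mean.

    prior_weight acts like a number of 'phantom' reviews at the prior mean,
    so ratings backed by few reviews move less far from the prior.
    """
    return (n_reviews * rating + prior_weight * prior_mean) / (n_reviews + prior_weight)

def property_score(site_data, site_weights, prior_mean=3.5, prior_weight=10, scale_max=5.0):
    """Combine per-site (rating, n_reviews) pairs into one score out of 100."""
    weighted_sum = 0.0
    weight_total = 0.0
    for site, (rating, n_reviews) in site_data.items():
        weight = site_weights.get(site, 0.0)
        adjusted = bayesian_site_rating(rating, n_reviews, prior_mean, prior_weight)
        weighted_sum += weight * adjusted
        weight_total += weight
    if weight_total == 0:
        return None  # no weighted site has data for this property
    return 100.0 * (weighted_sum / weight_total) / scale_max

# Hypothetical property: one site with 10 reviews, one with none yet.
sites = {"site_a": (4.0, 10), "site_b": (5.0, 0)}
weights = {"site_a": 1.0, "site_b": 1.0}
print(property_score(sites, weights, prior_mean=3.0, prior_weight=10))
```

With this form, a 5-star rating from zero reviews contributes only the prior mean, which is one way to keep the score stable month to month.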

Tools Used - Excel

Professional Services
Real Estate
Market Research

$3,000 - $7,500

Starts Nov 13, 2017

11 Proposals Status: HIRING

Company small

Client: J*****************

Posted: Nov 08, 2017

Assessing the Value of User Data in Online Native Advertising

Background:

Taboola is the leading global recommendation platform, serving over 470 billion recommendations to over 1.3 billion people every month on some of the Web’s most innovative digital properties, including USA TODAY, Huffington Post, MSN, Business Insider, Chicago Tribune and The Weather Channel. Headquartered in New York City, Taboola also has offices in Los Angeles, London, Tel Aviv, New Delhi, Bangkok, São Paulo, Shanghai, Beijing, Seoul, Istanbul, Sydney and Tokyo. Taboola’s global reach is second only to Google’s and currently, 88% of America and 83% of the UK sees a Taboola recommendation 2-3 times a day. Over 60% of the business is in mobile web and mobile apps.

Taboola collects and analyzes a vast amount and range of non-PII user data, most of which is related to the online behavior and content consumption of users. In addition, Taboola operates the industry's most comprehensive data marketplace, allowing advertisers to utilize data and segments from numerous 3rd party providers in addition to Taboola segments.

User data at Taboola creates value through 2 primary mechanisms: 

  • Targeting (used by clients)
  • Personalization (used by Taboola's recommendation engine and other products)

Both of these mechanisms are ultimately related to the Revenue per Thousand page views (RPM) that Taboola is able to generate for its supply partners (web publishers, apps, browsers etc. that use Taboola's product to monetize traffic). Both are also linked to the Cost Per Action (CPA) Taboola advertisers effectively pay.

Objectives:

  • To estimate the value of user data in RPM and/or CPA terms
  • To estimate the impact of a change in the average persistence of user data on these metrics

Relevant Know-How and Experience:

Knowledge of real-life cases of the use of data in online advertising where insights were derived on the impact/value of data. Cases in other verticals (retail, healthcare...) may be relevant given a high degree of similarity in other dimensions.

Please do not submit a proposal if you don't have relevant data to share (i.e. a story through which interesting data points related to the research question can be learned). We aren't looking for amazing data scientists with a great idea as to how they would go about solving this (not yet :) ). We are looking for interesting information on relevant businesses (even anonymized). So ask yourself:

  1. Do you have a good case study with quantitative information?
  2. Can you share information about how companies in the advertising space have calculated the value of data?

Please note this will be a paid exploratory call for 1-2 hours. We are open to engaging with multiple experts if you do have relevant insight as per questions and information above.  There will be no interviews for this project. This could lead to further paid engagements depending on the call.

Media and Advertising
Customer Lifetime Value
Web Analytics

$100/hr - $300/hr

Starts Nov 13, 2017

5 Proposals Status: HIRING

Net 30

Company small

Client: T*******

Posted: Nov 03, 2017

NLP Machine Learning Expert to Build Predictive Model on Unstructured Data

We require a highly experienced data scientist specializing in analyzing unstructured data and building predictive self-learning algorithms.

Problem: Analyze unstructured documents (blogs, PDFs, etc.) to predict more than 300 pre-identified categories/key phrases/tags/topics etc., and then build a self-learning algorithm that learns from every human intervention/feedback. Sample data for building the model will be about 5000 documents shared by the end client.

Looking for a data scientist who can build the model, lead the entire project execution and be the single point of contact with the client. Or open to a team that can provide 2-3 resources i.e. a project manager and data scientists.

Should have strong expertise in text analytics, NLP-based machine learning, cognitive, and AI applications leveraging open source technologies like R, Python etc.

The final deliverable will be a self-learning algorithm built in Python.

Machine Learning
cognitive analytics
Text Analytics

$50/hr - $150/hr

Starts Nov 06, 2017

19 Proposals Status: IN PROGRESS

Company small

Client: G***********************************

Posted: Oct 26, 2017

Quantitative Back Testing Developer

Founded in 2005, we are a small company in Santa Cruz, California that harnesses the collective wisdom of online investors to gain an edge in the stock market. Our Research Department works with one of the most extensive investor sentiment databases in the world.  As part of these activities, we use back-testing systems to evaluate investment strategies built from this proprietary data.

 

Skills required:

  • Computer Science or Software Engineering education or equivalent industry experience
  • 5+ years developing in Python, preferably in a quantitative equity trading environment.
  • 3+ years’ experience in SQL and relational databases.
  • Writes highly readable, maintainable, and testable code.
  • Knowledge of equity financial markets.

 

Additionally, we highly value the following:

  • Experience scaling systems and addressing the performance issues that come with scale.
  • Experience creating back-testing systems, both loop-based and event-based, which handle share quantities, deal with corporate actions, and produce daily reports with multiple statistics (e.g. descriptive stats and performance metrics).
  • Experience in moving from simulation to a production environment with multiple trading platforms.
  • Knowledge of APIs for Interactive Brokers, Wolverine Execution Services, and/or other brokers.

 

As a small team, we value accuracy and quality. You will report to the Head of Research and operate with a high degree of independence. You value tackling difficult problems and like to work on projects that make you proud. If you are at the intersection of Finance and Engineering, are smart, have a strong work ethic, are enthusiastic, driven and know how to get things done, we want to talk to you.

Back testing
Unmet Need Analysis
R&D

$50/hr - $75/hr

Starts Oct 25, 2017

7 Proposals Status: HIRING

Company small

Client: i******

Posted: Oct 26, 2017

Predict Rate of Recruitment for a Clinical Trial

Knowledgent is a precision-focused data intelligence firm with consistent, field-proven results across industries. Rather than follow the latest industry hype, we rise above the noise to craft innovative and reliable data and analytics solutions that help organizations use information as a strategic asset.

Problem:

We wish to develop a model capable of predicting the rate of recruitment (number of patients per site per month) for a clinical trial on a country-by-country basis and with a known confidence interval. Data include free-text elements and structured data, although it is expected that because of the limited time available, structured data will play a larger role in the analysis.

Expertise Required: 

We will need someone with data science expertise, preferably with some experience in the clinical trial space. Expertise in NLP may be beneficial, as much of the data available for the project that differentiates one clinical trial from another is in the form of free text inclusion/exclusion criteria.

Data sources:

Internal trial monitoring data from one company, public clinical trial data (e.g., clinicaltrials.gov), incidence/prevalence data by country (public and 3rd party licensed data). Data is available in Hive tables in an AWS environment, as well as in the original formats if needed/desired (varies by data source, but includes XML and xls files). It is known that this is not a comprehensive list of data that would influence the rate of recruitment. While any publicly available data may be incorporated to improve predictions, it is expected that a basic model can be produced using only the data provided.

We cannot provide a sample of the licensed data. But, much of the relevant data is publicly available. Data from clinicaltrials.gov can be most easily accessed from http://aact.ctti-clinicaltrials.org/

Technology stack:

Data is available on Hive on AWS. A Python server runs on an AWS EC2 instance with Jupyter notebook. Note that any solution must be provided in Python.

Deliverable:

An algorithm capable of predicting the rate of recruitment (number of patients per site per month) for a clinical trial within a fixed ±0.05 range (typical recruitment rates can be expected to be in the 0.1 to 1.0 range) with 80% of predictions within the target range. Predictions must be made for each country for which incidence/prevalence data is provided (~7), as well as for the trial as a whole, although the success criteria stated above only apply to the trial-level prediction.
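
The success criterion above can be checked mechanically. The sketch below computes the share of predictions that fall within ±0.05 of the observed rate and compares it with the 80% target. The function names and the sample rates are made up for illustration; only the 0.05 tolerance and the 80% threshold come from the posting.

```python
def fraction_within(predicted, actual, tolerance=0.05):
    """Share of predictions whose absolute error is at most `tolerance`."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("predicted and actual must be non-empty and the same length")
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= tolerance)
    return hits / len(predicted)

def meets_target(predicted, actual, tolerance=0.05, required=0.8):
    """True when at least `required` of the predictions are within the tolerance."""
    return fraction_within(predicted, actual, tolerance) >= required

# Made-up trial-level rates (patients per site per month).
predicted_rates = [0.30, 0.52, 0.75, 0.20, 0.90]
observed_rates = [0.32, 0.50, 0.71, 0.40, 0.93]
print(fraction_within(predicted_rates, observed_rates))
print(meets_target(predicted_rates, observed_rates))
```

Because typical rates are stated to lie between 0.1 and 1.0, a fixed ±0.05 band is proportionally much tighter for slow-recruiting trials than for fast ones.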

Location Preference:

We have some preference for people located close to our Warren, NJ office, although this is not a strict requirement.

Healthcare
Pharmaceutical and Life Sciences
Biology, Health and Medicine

$75/hr - $150/hr

Starts Oct 23, 2017

10 Proposals Status: CLOSED

Company small

Client: K***********

Posted: Oct 17, 2017

Data Platform Advisory Consultation for Investment Due Diligence

We are a venture fund looking for a senior technologist to help with the following due diligence activities for a data platform with search and collaboration features:

  • Validation discussion with CEO for mock of application functionality
  • Diligence review of software alternatives
  • Diligence on likelihood of adoption by target consumers
  • Analysis of technology stack

Payment will be made for a minimum of two hours.  Additional work will be compensated based on the time spent.

Hi-Tech
Strategy and Planning
Analytics

$250/hr

Starts Oct 10, 2017

10 Proposals Status: IN PROGRESS

Net 30

Company small

Client: S****************

Posted: Oct 05, 2017

Business Plan for Global Philanthropy Platform

Our company is developing and offering a unique global philanthropic platform which will couple the latest technology with best-in-class accountability and transparency practices to unleash the catalytic potential of Philanthropy. The intent is to give donors an unprecedented choice to direct funds strategically and effectively towards the world’s major humanitarian and developmental challenges.

For this project, we are seeking assistance in developing our business model and documenting it as a plan suitable for potential donors and investors. The plan should include, as a minimum, an Executive Summary, a General Company Description, a description of our Platform, Marketing, an Operational Plan, a description of our Management team, a Financial plan including projections of Startup Expenses and Capitalization and Appendices as required.

This work is a follow-on project to our previous project. We intend to award this effort to our existing consultant.

Non-Profit
Strategy and Planning
Analytics

$10,000 - $20,000

Starts Oct 04, 2017

1 Proposal Status: IN PROGRESS

Net 30

Company small

Client: P*******

Posted: Oct 04, 2017
