
Galina Olejnik

About Me

Galina Olejnik, a natural language processing and machine learning scientist, is a Data Scientist at NDA, where she works on creating, improving, and supporting machine learning model pipelines using parallel processing frameworks.

Word embeddings: exploration, explanation, and exploitation (with code in Python)

Word embeddings have been a staple topic among natural language processing scientists for many years. The idea behind every word embedding is to capture as much semantic, morphological, contextual, hierarchical, and other information as possible, but in practice some methods are clearly better than others for a particular task. Choosing the best embeddings for a particular project is always a matter of trial and error, so understanding why one model works better than another in a particular case helps significantly in real work.

Debugging your tensorflow code right (without so many painful mistakes)

Data scientists who are developing their first TensorFlow models often struggle with the non-obvious behavior of some parts of the framework, which are hard to understand and quite complicated to debug. The main point is that making a lot of mistakes when working with this library is perfectly fine (as it is with anything else), and so is asking questions, diving deep into the docs, and debugging every goddamn line. Everything comes with practice, and I hope this article will make that practice a bit more pleasant and interesting.

