Install Polyglot on Windows
Updated: May 17
There is a heated debate about how useful (or useless) Windows is for data science. While the majority of Python packages work perfectly on Linux, there is still a huge Windows community that struggles to get all the dependencies and packages installed and running.
This issue is definitely beyond the scope of a simple article, so we are not going to discuss it here. What we are going to do instead is install one of the most complex packages (in my opinion): Polyglot.
Polyglot is a natural language pipeline that supports massive multilingual applications.
This is a quote from the official package documentation.
And if you have ever tried to install this package on Windows, you have definitely seen this
or even this
So here is the complete list of steps I followed on my machine to install Polyglot and finally get it running.
Up we go!
Windows 10, 64 bit
Visual Studio 2017 with the C++ compiler installed (in the end this didn't help, so you don't need it)
Download the tar.gz from the Polyglot page on PyPI
Unpack the tar.gz, then unpack the inner tar.gz, and finally go to
Install polyglot by running
pip install polyglot
At this stage there is no problem. However, when you import polyglot.text, it raises an error saying that the icu package is not found.
I tested multiple approaches: installing a C++ compiler (which didn't succeed because of a C++ header format that is no longer supported on Windows 10) and installing from a wheel, but none of them worked. So finally I went to the unofficial Windows binaries site, and here is the real tip: use PyICU-2.3.1-cp36-cp36m-win_amd64.whl (Python 3.6, 64-bit system). I also tried on other machines, and this was the most stable version.
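The cp36 and win_amd64 parts of the file name have to match your own interpreter, otherwise pip will refuse the wheel. Here is a minimal sketch of how you could check that before downloading. The helper names (wheel_tags, interpreter_tags, wheel_matches) are my own and not part of pip or Polyglot, and the sketch assumes CPython on Windows.

```python
import struct
import sys

# Hypothetical helpers for illustration; only the standard library is used.


def wheel_tags(filename):
    """Split a wheel file name into its (python, abi, platform) tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError("unexpected wheel name: " + filename)
    # The last three dash-separated fields are always the tags.
    return tuple(parts[-3:])


def interpreter_tags():
    """Tags of the running interpreter, assuming CPython on Windows."""
    python_tag = "cp{}{}".format(sys.version_info[0], sys.version_info[1])
    bits = struct.calcsize("P") * 8  # pointer size in bits: 32 or 64
    platform_tag = "win_amd64" if bits == 64 else "win32"
    return python_tag, platform_tag


def wheel_matches(filename, python_tag, platform_tag):
    """True if the wheel's Python and platform tags equal the given ones."""
    wheel_python, _abi, wheel_platform = wheel_tags(filename)
    return wheel_python == python_tag and wheel_platform == platform_tag


if __name__ == "__main__":
    name = "PyICU-2.3.1-cp36-cp36m-win_amd64.whl"
    py_tag, plat_tag = interpreter_tags()
    print(name, "matches this interpreter:", wheel_matches(name, py_tag, plat_tag))
```

If the script prints False, look for a wheel on the binaries site whose tags match your Python version and architecture.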
So download this wheel and run:
pip install PyICU-2.3.1-cp36-cp36m-win_amd64.whl
Great! Now if we try to import our package, it raises another error saying that the pycld2 package is not found.
Again, go to the binaries site and now look for pycld2-0.31-cp36-cp36m-win_amd64.whl, then install it:
pip install pycld2-0.31-cp36-cp36m-win_amd64.whl
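If you need to repeat these installs on several machines, the three pip commands can be scripted. This is only a sketch under my own assumptions: the function names are hypothetical, and the two wheel files are assumed to sit in the current directory. It runs pip through the current interpreter (python -m pip) so the packages land in the Python you are actually using.

```python
import subprocess
import sys

# Hypothetical helpers for illustration; the wheel files are assumed to be
# in the current working directory.
INSTALL_ORDER = [
    "polyglot",
    "PyICU-2.3.1-cp36-cp36m-win_amd64.whl",
    "pycld2-0.31-cp36-cp36m-win_amd64.whl",
]


def pip_install_commands(targets):
    """Build one 'python -m pip install <target>' command per target."""
    return [[sys.executable, "-m", "pip", "install", target] for target in targets]


def run_commands(commands):
    """Run each command in order and stop at the first failure."""
    for command in commands:
        subprocess.check_call(command)


if __name__ == "__main__":
    # Dry run: only print the commands. Pass the same list to
    # run_commands() to perform the installs.
    for command in pip_install_commands(INSTALL_ORDER):
        print(" ".join(command))
```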
And now run
import polyglot
from polyglot.text import Text, Word
Works like a charm.
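If it does not work on your machine, a quick check like the one below can tell you which of the three packages is still missing. It is a minimal sketch: missing_modules is my own helper name, and it only checks that each module can be found, not that it actually loads without errors.

```python
import importlib.util

# Import names of the packages installed above (PyICU is imported as "icu").
REQUIRED = ["polyglot", "icu", "pycld2"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Still missing:", ", ".join(missing))
    else:
        print("All required modules were found.")
```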
Hope this was useful.