Word2Vec
pt-br demo
Trained on pt-br Wikipedia (~1.2 billion words)
300 dimensions
550,000 words in vocabulary
Skip-gram with negative sampling (10 negative samples), window size 10
10 epochs
~10 hours training time, using Gensim
Find Most Similar Words
Find Word That Doesn't Match
Find Closest Vector
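
The three queries above can be illustrated without a trained model. The sketch below uses plain Python and hypothetical 3-dimensional toy vectors (the demo's vectors have 300 dimensions), and ranks words by cosine similarity. It is an approximation of what such queries compute, not this demo's server code. In Gensim, the corresponding calls are most_similar (with its positive and negative arguments) and doesnt_match on a model's KeyedVectors; the function names closest_vector, most_similar, and doesnt_match here are local helpers written for illustration.

```python
import math

# Hypothetical toy vectors, hand-picked for illustration only.
VECTORS = {
    "rei":    [0.9, 0.8, 0.1],
    "rainha": [0.9, 0.1, 0.8],
    "homem":  [0.1, 0.9, 0.1],
    "mulher": [0.1, 0.1, 0.9],
    "banana": [-0.8, 0.3, 0.3],
}


def unit(v):
    """Scale a vector to length 1 so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


UNIT = {word: unit(vec) for word, vec in VECTORS.items()}


def closest_vector(positive, negative=(), topn=3):
    """Add the positive word vectors, subtract the negative ones, and
    rank the remaining vocabulary by cosine similarity to the result."""
    dim = len(next(iter(UNIT.values())))
    query = [0.0] * dim
    for word in positive:
        query = [q + x for q, x in zip(query, UNIT[word])]
    for word in negative:
        query = [q - x for q, x in zip(query, UNIT[word])]
    query = unit(query)
    used = set(positive) | set(negative)  # input words are not candidates
    scored = [(w, dot(query, v)) for w, v in UNIT.items() if w not in used]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:topn]


def most_similar(word, topn=3):
    """Most similar words: the closest-vector query with one positive word."""
    return closest_vector([word], topn=topn)


def doesnt_match(words):
    """Return the word least similar to the mean of the group's unit vectors."""
    summed = [sum(column) for column in zip(*(UNIT[w] for w in words))]
    mean = unit(summed)
    return min(words, key=lambda w: dot(UNIT[w], mean))


print(most_similar("rei", topn=2))
print(doesnt_match(["rei", "rainha", "homem", "mulher", "banana"]))
print(closest_vector(positive=["rei", "mulher"], negative=["homem"], topn=1))
```

With these toy vectors, the last query (rei + mulher - homem) ranks rainha first, which is the classic analogy pattern that adding and subtracting word vectors is meant to expose.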