min word frequency, default=5
word vector dimensionality, default=100
window size, default=5
training epochs, default=1
batch size, default=12
sampling rate, default=0 (no sampling)
learning rate, default=1E-1
minimum learning rate, default=1E-6
whether to use AdaGrad, default=false
tokenizer factory, default=new DefaultTokenizerFactory
token preprocessor, default=new CommonPreprocessor
seed for random generator, default=2018
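The defaults above correspond to the deeplearning4j Word2Vec builder (DefaultTokenizerFactory and CommonPreprocessor are DL4J classes). Below is a minimal sketch of wiring these parameters together, assuming DL4J is on the classpath; the corpus path "corpus.txt" is a placeholder, not something from this project:

```java
import org.deeplearning4j.models.word2vec.Word2Vec;
import org.deeplearning4j.text.sentenceiterator.BasicLineIterator;
import org.deeplearning4j.text.sentenceiterator.SentenceIterator;
import org.deeplearning4j.text.tokenization.tokenizer.preprocessor.CommonPreprocessor;
import org.deeplearning4j.text.tokenization.tokenizerfactory.DefaultTokenizerFactory;
import org.deeplearning4j.text.tokenization.tokenizerfactory.TokenizerFactory;

public class W2VExample {
    public static void main(String[] args) throws Exception {
        // One sentence per line; "corpus.txt" is a hypothetical path.
        SentenceIterator sentences = new BasicLineIterator("corpus.txt");

        // Default tokenization: whitespace tokenizer plus the common
        // lowercasing/punctuation-stripping preprocessor.
        TokenizerFactory tokenizer = new DefaultTokenizerFactory();
        tokenizer.setTokenPreProcessor(new CommonPreprocessor());

        // Mirror the documented defaults.
        Word2Vec w2v = new Word2Vec.Builder()
                .minWordFrequency(5)    // min word frequency
                .layerSize(100)         // word vector dimensionality
                .windowSize(5)          // window size
                .epochs(1)              // training epochs
                .batchSize(12)          // batch size
                .sampling(0)            // 0 = no sub-sampling
                .learningRate(1e-1)     // initial learning rate
                .minLearningRate(1e-6)  // minimum learning rate
                .useAdaGrad(false)      // linear decay instead of AdaGrad
                .seed(2018)             // seed for the random generator
                .tokenizerFactory(tokenizer)
                .iterate(sentences)
                .build();

        w2v.fit();
        System.out.println(w2v.wordsNearest("day", 10));
    }
}
```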
W2V
Word2Vec: an implementation of the skip-gram model with negative sampling.
Complexity: O = E * T * Q, where Q = C * (D + D * log2(V)); E = training epochs, T = total words in the training data, C = maximum window size, D = embedding size, V = vocabulary size.
Reference: Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient Estimation of Word Representations in Vector Space. In Proceedings of the ICLR Workshop, 2013.
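As a concrete illustration of the complexity formula (the vocabulary size here is a made-up example value, not a project default): with the defaults above, C = 5 and D = 100, and a hypothetical vocabulary of V = 100,000 words, Q = 5 * (100 + 100 * log2(100000)) ≈ 5 * (100 + 1,661) ≈ 8,800 operations per training word, so one epoch over T words costs roughly 8,800 * T operations.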