min word frequency, default=5
word vector dimensionality, default=100
window size, default=5
training epochs, default=1
batch size, default=12
sampling rate, default=0 (no sampling)
learning rate, default=1E-1
minimum learning rate, default=1E-6
whether to use AdaGrad, default=false
tokenizer factory, default=new DefaultTokenizerFactory
token preprocessor, default=new CommonPreprocessor
seed for random generator, default=2018
whether to train word vectors together with document vectors, default=true (PV-DM); set to false for PV-DBOW
label source, default=None
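
The defaults listed above can be collected in one place. The following is a minimal sketch in plain Java, assuming nothing about the component's real API: the class name Doc2VecDefaults, every field name, and the mode() helper are hypothetical and chosen only for illustration. The tokenizer factory and token preprocessor are recorded by class name rather than instantiated.

```java
import java.util.Optional;

// Hypothetical holder for the defaults listed above. Field names are
// illustrative assumptions, not the component's actual parameter keys.
final class Doc2VecDefaults {
    int minWordFrequency = 5;        // min word frequency
    int vectorSize = 100;            // word vector dimensionality
    int windowSize = 5;              // window size
    int epochs = 1;                  // training epochs
    int batchSize = 12;              // batch size
    double samplingRate = 0.0;       // 0 means no sampling
    double learningRate = 1e-1;      // learning rate
    double minLearningRate = 1e-6;   // minimum learning rate
    boolean useAdaGrad = false;      // whether to use AdaGrad
    String tokenizerFactory = "DefaultTokenizerFactory";
    String tokenPreprocessor = "CommonPreprocessor";
    long seed = 2018L;               // seed for random generator
    boolean trainWordVectors = true; // true = PV-DM, false = PV-DBOW
    Optional<String> labelSource = Optional.empty(); // default=None

    // Name of the training mode implied by trainWordVectors.
    String mode() {
        return trainWordVectors ? "PV-DM" : "PV-DBOW";
    }
}
```

With the defaults untouched, mode() returns "PV-DM"; setting trainWordVectors to false switches it to "PV-DBOW".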
D2V: Doc2Vec (Paragraph Vector, ParagraphVectors)
Quoc Le, Tomas Mikolov. Distributed Representations of Sentences and Documents. In ICML, 2014.