Yesterday, while reading the gensim LDA Model documentation, I came across this passage:
We find bigrams in the documents. Bigrams are sets of two adjacent words. Using bigrams we can get phrases like “machine_learning” in our output (spaces are replaced with underscores); without bigrams we would only get “machine” and “learning”. Note that in the code below, we find bigrams and then add them to the original data, because we would like to keep the words “machine” and “learning” as well as the bigram “machine_learning”.
Does "bigram" mean a phrase made up of two words?
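To check my reading of the passage, here is a rough sketch in plain Python of what "find bigrams and then add them to the original data" could look like. It is not gensim's actual implementation: gensim's own bigram detection uses a statistical score, whereas this sketch just uses a raw count threshold. The names `add_bigrams` and `min_count` are made up for illustration.

```python
from collections import Counter

def add_bigrams(docs, min_count=2):
    """Append frequent adjacent word pairs (joined by "_") to each document.

    Hypothetical helper: a plain frequency threshold stands in for
    whatever scoring a real phrase detector would use.
    """
    # Count every pair of adjacent tokens across all documents.
    pair_counts = Counter(
        pair for doc in docs for pair in zip(doc, doc[1:])
    )
    result = []
    for doc in docs:
        # Keep only pairs seen at least min_count times in the corpus.
        bigrams = [
            f"{a}_{b}"
            for a, b in zip(doc, doc[1:])
            if pair_counts[(a, b)] >= min_count
        ]
        # The original tokens are kept; the bigrams are added after them.
        result.append(doc + bigrams)
    return result

docs = [
    ["machine", "learning", "is", "fun"],
    ["i", "study", "machine", "learning"],
]
augmented = add_bigrams(docs)
print(augmented[0])
```

In this sketch each document keeps "machine" and "learning" as separate tokens and also gains the token "machine_learning", which matches what the passage describes.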
1 reply to this post; last reply on 2021-6-21 09:02