Gensim Official Documentation Tutorials


Language: Others
Size: 27.96 MB
Downloads: 3
Views: 125
Published: 2021-01-23
Category: General programming
Publisher: 好学IT男
File format: .zip
Points required: 2


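Description

[Overview]
This resource contains the tutorials that accompany the official Gensim documentation. It includes the corpora and Jupyter notebook tutorials for common algorithms, such as Word2vec and similarity comparison.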
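As a rough illustration of what "similarity comparison" means in this context, the sketch below ranks a few toy documents against a query by cosine similarity over bag-of-words counts. It uses only the Python standard library and is not Gensim's API or implementation; the helper names `bag_of_words` and `cosine_similarity` and the sample documents are assumptions made for this example. The notebooks in the archive (for instance `Similarity_Queries.ipynb`) show how the library itself handles the task.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase, split on whitespace, and count each token."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Toy documents, invented for this illustration (hypothetical data).
docs = [
    "human machine interface for computer applications",
    "a survey of user opinion of computer system response time",
    "graph minors a survey",
]
query = bag_of_words("computer survey")

# Score every document against the query, then sort best match first.
scores = [cosine_similarity(query, bag_of_words(d)) for d in docs]
ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
for doc_id, score in ranked:
    print(doc_id, round(score, 3))
```

Raw counts are the simplest possible document representation. A weighting scheme such as TF-IDF, or learned vectors such as Word2vec, would replace `bag_of_words` while the cosine step stays the same.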
[Core code]
286bdcb2-2e31-4d4d-9704-7a7084f20592/gensim-develop
├── appveyor.yml
├── CHANGELOG.md
├── continuous_integration
│   ├── appveyor
│   │   ├── install.ps1
│   │   ├── requirements.txt
│   │   └── run_with_env.cmd
│   └── travis
│       ├── flake8_diff.sh
│       ├── install.sh
│       └── run.sh
├── CONTRIBUTING.md
├── COPYING
├── docker
│   ├── check_fast_version.py
│   ├── Dockerfile
│   ├── README.md
│   └── start_jupyter_notebook.sh
├── docs
│   ├── notebooks
│   │   ├── annoytutorial.ipynb
│   │   ├── annoytutorial-text8.ipynb
│   │   ├── atmodel_tutorial.ipynb
│   │   ├── Coherence.gif
│   │   ├── Convergence.gif
│   │   ├── Corpora_and_Vector_Spaces.ipynb
│   │   ├── datasets
│   │   │   ├── keras_classifier_training_data.csv
│   │   │   ├── mycorpus.txt
│   │   │   ├── news_corpus
│   │   │   ├── news_corpus.index
│   │   │   ├── news_corpus.vocab
│   │   │   ├── news_dictionary
│   │   │   ├── questions-words.txt
│   │   │   └── word_vectors_training_data.txt
│   │   ├── deepir.ipynb
│   │   ├── Diff.gif
│   │   ├── distance_metrics.ipynb
│   │   ├── distributed.md
│   │   ├── doc2vec-IMDB.ipynb
│   │   ├── doc2vec-lee.ipynb
│   │   ├── doc2vec-wikipedia.ipynb
│   │   ├── doc_lda_pca.png
│   │   ├── doc_lda_tsne.png
│   │   ├── dtm_example.ipynb
│   │   ├── Dynamic Topic Model.png
│   │   ├── FastText_Tutorial.ipynb
│   │   ├── gensim_news_classification.ipynb
│   │   ├── gensim Quick Start.ipynb
│   │   ├── index
│   │   ├── index.d
│   │   ├── keras_wrapper.ipynb
│   │   ├── lda_model_difference.ipynb
│   │   ├── ldaseqmodel.ipynb
│   │   ├── lda_training_tips.ipynb
│   │   ├── Monkey Brains New.png
│   │   ├── Monkey Brains.png
│   │   ├── online_w2v_tutorial.ipynb
│   │   ├── pca.png
│   │   ├── Perplexity.gif
│   │   ├── Similarity_Queries.ipynb
│   │   ├── sklearn_api.ipynb
│   │   ├── summarization_tutorial.ipynb
│   │   ├── Tensorboard.png
│   │   ├── Tensorboard_visualizations.ipynb
│   │   ├── test_notebooks.py
│   │   ├── topic_coherence_model_selection.ipynb
│   │   ├── topic_coherence-movies.ipynb
│   │   ├── topic_coherence_tutorial.ipynb
│   │   ├── Topic_dendrogram.ipynb
│   │   ├── topic_methods.ipynb
│   │   ├── topic_network.ipynb
│   │   ├── Topics_and_Transformations.ipynb
│   │   ├── topic_with_coordinate.png
│   │   ├── Training_visualizations.ipynb
│   │   ├── translation_matrix.ipynb
│   │   ├── tsne.png
│   │   ├── Varembed.ipynb
│   │   ├── visdom_graph.png
│   │   ├── WMD_tutorial.ipynb
│   │   ├── Word2Vec_FastText_Comparison.ipynb
│   │   ├── word2vec.ipynb
│   │   ├── Wordrank_comparisons.ipynb
│   │   └── WordRank_wrapper_quickstart.ipynb
│   └── src
│       ├── about.rst
│       ├── apiref.rst
│       ├── changes_080.rst
│       ├── conf.py
│       ├── corpora
│       │   ├── bleicorpus.rst
│       │   ├── corpora.rst
│       │   ├── csvcorpus.rst
│       │   ├── dictionary.rst
│       │   ├── hashdictionary.rst
│       │   ├── indexedcorpus.rst
│       │   ├── lowcorpus.rst
│       │   ├── malletcorpus.rst
│       │   ├── mmcorpus.rst
│       │   ├── sharded_corpus.rst
│       │   ├── svmlightcorpus.rst
│       │   ├── textcorpus.rst
│       │   ├── ucicorpus.rst
│       │   └── wikicorpus.rst
│       ├── dist_lda.rst
│       ├── dist_lsi.rst
│       ├── distributed.rst
│       ├── gensim_theme
│       │   ├── domainindex.html
│       │   ├── genindex.html
│       │   ├── layout.html
│       │   ├── page.html
│       │   ├── search.html
│       │   ├── static
│       │   │   ├── doctools.js
│       │   │   ├── jquery.js
│       │   │   └── underscore.js
│       │   └── theme.conf
│       ├── indextoc.rst
│       ├── install.rst
│       ├── interfaces.rst
│       ├── intro.rst
│       ├── Makefile
│       ├── matutils.rst
│       ├── models
│       │   ├── atmodel.rst
│       │   ├── coherencemodel.rst
│       │   ├── doc2vec.rst
│       │   ├── hdpmodel.rst
│       │   ├── keyedvectors.rst
│       │   ├── lda_dispatcher.rst
│       │   ├── ldamodel.rst
│       │   ├── ldamulticore.rst
│       │   ├── ldaseqmodel.rst
│       │   ├── lda_worker.rst
│       │   ├── logentropy_model.rst
│       │   ├── lsi_dispatcher.rst
│       │   ├── lsimodel.rst
│       │   ├── lsi_worker.rst
│       │   ├── models.rst
│       │   ├── normmodel.rst
│       │   ├── phrases.rst
│       │   ├── rpmodel.rst
│       │   ├── tfidfmodel.rst
│       │   ├── word2vec.rst
│       │   └── wrappers
│       │       ├── dtmmodel.rst
│       │       ├── fasttext.rst
│       │       ├── ldamallet.rst
│       │       ├── ldavowpalwabbit.rst
│       │       ├── varembed.rst
│       │       ├── wordrank.rst
│       │       └── wrappers.rst
│       ├── parsing
│       │   ├── porter.rst
│       │   └── preprocessing.rst
│       ├── scripts
│       │   ├── glove2word2vec.rst
│       │   ├── make_wikicorpus.rst
│       │   └── word2vec_standalone.rst
│       ├── similarities
│       │   ├── docsim.rst
│       │   ├── index.rst
│       │   └── simserver.rst
│       ├── simserver.rst
│       ├── sklearn_integration
│       │   └── sklearn_wrapper_gensim_ldamodel.rst
│       ├── _static
│       │   ├── css
│       │   │   ├── anythingslider.css
│       │   │   ├── jquery.qtip.min.css
│       │   │   └── style.css
│       │   ├── favicon.ico
│       │   ├── images
│       │   │   ├── arrows.png
│       │   │   ├── bg.png
│       │   │   ├── bullets.png
│       │   │   ├── checker.png
│       │   │   ├── default.png
│       │   │   ├── direct-install.png
│       │   │   ├── download.png
│       │   │   ├── favicon.ico
│       │   │   ├── features
│       │   │   │   ├── converters.png
│       │   │   │   ├── efficient_implementations.png
│       │   │   │   ├── free_lgpl.png
│       │   │   │   ├── memory_independence.png
│       │   │   │   ├── platform_independence.png
│       │   │   │   ├── robust.png
│       │   │   │   ├── similarity_queries.png
│       │   │   │   └── support.png
│       │   │   ├── forkme_left_white_ffffff.png
│       │   │   ├── gensim_code.png
│       │   │   ├── gensim_compact.png
│       │   │   ├── gensim-footer.png
│       │   │   ├── gensim.png
│       │   │   ├── get-started.png
│       │   │   ├── googlegroups.png
│       │   │   ├── loading.gif
│       │   │   ├── logo-gensim_compact.png
│       │   │   ├── logo-gensim.png
│       │   │   ├── menubutton.png
│       │   │   ├── references
│       │   │   │   ├── logo_dtu.gif
│       │   │   │   ├── logo_dynadmic.png
│       │   │   │   ├── logo_eudml.png
│       │   │   │   ├── logo_ghent.png
│       │   │   │   ├── logo_ibcn.png
│       │   │   │   ├── logo_issuu.jpeg
│       │   │   │   ├── logo_roistr.png
│       │   │   │   ├── logo_sportsauthority.png
│       │   │   │   └── logo_tailwind.png
│       │   │   ├── tagline_compact.png
│       │   │   ├── tagline.png
│       │   │   ├── twitterbird.png
│       │   │   ├── ukazka2.png
│       │   │   └── ukazka.png
│       │   └── js
│       │       ├── jquery-1.9.1.min.js
│       │       ├── jquery.anythingslider.min.js
│       │       ├── jquery-migrate-1.1.1.min.js
│       │       └── jquery.qtip.min.js
│       ├── summarization
│       │   ├── bm25.rst
│       │   ├── commons.rst
│       │   ├── graph.rst
│       │   ├── keywords.rst
│       │   ├── pagerank_weighted.rst
│       │   ├── summariser.rst
│       │   ├── syntactic_unit.rst
│       │   └── textcleaner.rst
│       ├── support.rst
│       ├── _templates
│       │   └── indexcontent.html
│       ├── topic_coherence
│       │   ├── aggregation.rst
│       │   ├── direct_confirmation_measure.rst
│       │   ├── indirect_confirmation_measure.rst
│       │   ├── probability_estimation.rst
│       │   └── segmentation.rst
│       ├── tut1.rst
│       ├── tut2.rst
│       ├── tut3.rst
│       ├── tutorial.rst
│       ├── utils.rst
│       └── wiki.rst
├── ez_setup.py
├── gensim
│   ├── corpora
│   │   ├── bleicorpus.py
│   │   ├── csvcorpus.py
│   │   ├── dictionary.py
│   │   ├── hashdictionary.py
│   │   ├── indexedcorpus.py
│   │   ├── __init__.py
│   │   ├── lowcorpus.py
│   │   ├── malletcorpus.py
│   │   ├── mmcorpus.py
│   │   ├── sharded_corpus.py
│   │   ├── svmlightcorpus.py
│   │   ├── textcorpus.py
│   │   ├── ucicorpus.py
│   │   └── wikicorpus.py
│   ├── examples
│   │   └── dmlcz
│   │       ├── dmlcorpus.py
│   │       ├── gensim_build.py
│   │       ├── gensim_genmodel.py
│   │       ├── gensim_xml.py
│   │       ├── __init__.py
│   │       ├── runall.sh
│   │       └── sources.py
│   ├── __init__.py
│   ├── interfaces.py
│   ├── matutils.py
│   ├── models
│   │   ├── atmodel.py
│   │   ├── basemodel.py
│   │   ├── callbacks.py
│   │   ├── coherencemodel.py
│   │   ├── doc2vec_inner.c
│   │   ├── doc2vec_inner.pyx
│   │   ├── doc2vec.py
│   │   ├── fasttext.py
│   │   ├── hdpmodel.py
│   │   ├── __init__.py
│   │   ├── keyedvectors.py
│   │   ├── lda_dispatcher.py
│   │   ├── ldamodel.py
│   │   ├── ldamulticore.py
│   │   ├── ldaseqmodel.py
│   │   ├── lda_worker.py
│   │   ├── logentropy_model.py
│   │   ├── lsi_dispatcher.py
│   │   ├── lsimodel.py
│   │   ├── lsi_worker.py
│   │   ├── normmodel.py
│   │   ├── phrases.py
│   │   ├── rpmodel.py
│   │   ├── tfidfmodel.py
│   │   ├── translation_matrix.py
│   │   ├── voidptr.h
│   │   ├── word2vec_inner.c
│   │   ├── word2vec_inner.pxd
│   │   ├── word2vec_inner.pyx
│   │   ├── word2vec.py
│   │   └── wrappers
│   │       ├── dtmmodel.py
│   │       ├── fasttext.py
│   │       ├── __init__.py
│   │       ├── ldamallet.py
│   │       ├── ldavowpalwabbit.py
│   │       ├── varembed.py
│   │       └── wordrank.py
│   ├── nosy.py
│   ├── parsing
│   │   ├── __init__.py
│   │   ├── porter.py
│   │   └── preprocessing.py
│   ├── scripts
│   │   ├── glove2word2vec.py
│   │   ├── __init__.py
│   │   ├── make_wikicorpus.py
│   │   ├── make_wiki_lemma.py
│   │   ├── make_wiki_online_lemma.py
│   │   ├── make_wiki_online_nodebug.py
│   │   ├── make_wiki_online.py
│   │   ├── make_wiki.py
│   │   ├── word2vec2tensor.py
│   │   └── word2vec_standalone.py
│   ├── similarities
│   │   ├── docsim.py
│   │   ├── index.py
│   │   └── __init__.py
│   ├── sklearn_api
│   │   ├── atmodel.py
│   │   ├── d2vmodel.py
│   │   ├── hdp.py
│   │   ├── __init__.py
│   │   ├── ldamodel.py
│   │   ├── ldaseqmodel.py
│   │   ├── lsimodel.py
│   │   ├── phrases.py
│   │   ├── rpmodel.py
│   │   ├── text2bow.py
│   │   ├── tfidf.py
│   │   └── w2vmodel.py
│   ├── summarization
│   │   ├── bm25.py
│   │   ├── commons.py
│   │   ├── graph.py
│   │   ├── __init__.py
│   │   ├── keywords.py
│   │   ├── pagerank_weighted.py
│   │   ├── summarizer.py
│   │   ├── syntactic_unit.py
│   │   └── textcleaner.py
│   ├── test
│   │   ├── basetmtests.py
│   │   ├── __init__.py
│   │   ├── simspeed2.py
│   │   ├── simspeed.py
│   │   ├── svd_error.py
│   │   ├── test_aggregation.py
│   │   ├── test_atmodel.py
│   │   ├── test_big.py
│   │   ├── test_coherencemodel.py
│   │   ├── test_corpora_dictionary.py
│   │   ├── test_corpora_hashdictionary.py
│   │   ├── test_corpora.py
│   │   ├── test_data
│   │   │   ├── alldata-id-10.txt
│   │   │   ├── bgwiki-latest-pages-articles-shortened.xml.bz2
│   │   │   ├── cp852_fasttext.bin
│   │   │   ├── DTM
│   │   │   │   └── sstats_test.txt
│   │   │   ├── dtm_test.dict
│   │   │   ├── dtm_test.mm
│   │   │   ├── EN.1-10.cbow1_wind5_hs0_neg10_size300_smpl1e-05.txt
│   │   │   ├── enwiki-latest-pages-articles1.xml-p000000010p000030302-shortened.bz2
│   │   │   ├── head500.noblanks.cor
│   │   │   ├── head500.noblanks.cor.bz2
│   │   │   ├── head500.noblanks.cor_tfidf.model
│   │   │   ├── head500.noblanks.cor_wordids.txt
│   │   │   ├── IT.1-10.cbow1_wind5_hs0_neg10_size300_smpl1e-05.txt
│   │   │   ├── large_tag_doc_10_iter50
│   │   │   ├── ldamodel_python_2_7
│   │   │   ├── ldamodel_python_2_7.expElogbeta.npy
│   │   │   ├── ldamodel_python_2_7.id2word
│   │   │   ├── ldamodel_python_2_7.state
│   │   │   ├── ldamodel_python_3_5
│   │   │   ├── ldamodel_python_3_5.expElogbeta.npy
│   │   │   ├── ldamodel_python_3_5.id2word
│   │   │   ├── ldamodel_python_3_5.state
│   │   │   ├── ldavowpalwabbit.dict.txt
│   │   │   ├── ldavowpalwabbit.txt
│   │   │   ├── lee_background.cor
│   │   │   ├── lee.cor
│   │   │   ├── lee_fasttext
│   │   │   ├── lee_fasttext.bin
│   │   │   ├── lee_fasttext_new.bin
│   │   │   ├── lee_fasttext.vec
│   │   │   ├── mihalcea_tarau.kwpos.txt
│   │   │   ├── mihalcea_tarau.kw.txt
│   │   │   ├── mihalcea_tarau.summ.txt
│   │   │   ├── mihalcea_tarau.txt
│   │   │   ├── miIslita.cor
│   │   │   ├── mini_newsgroup
│   │   │   ├── non_ascii_fasttext.bin
│   │   │   ├── OPUS_en_it_europarl_train_one2ten.txt
│   │   │   ├── para2para_text1.txt
│   │   │   ├── para2para_text2.txt
│   │   │   ├── PathLineSentences
│   │   │   │   ├── 1.txt
│   │   │   │   └── 2.txt.bz2
│   │   │   ├── pre_0_13_2_model
│   │   │   ├── pre_0_13_2_model.state
│   │   │   ├── questions-words.txt
│   │   │   ├── similarities0-1.txt
│   │   │   ├── simlex999.txt
│   │   │   ├── small_tag_doc_5_iter50
│   │   │   ├── testcorpus.blei
│   │   │   ├── testcorpus.blei.index
│   │   │   ├── testcorpus.blei.vocab
│   │   │   ├── testcorpus.low
│   │   │   ├── testcorpus.low.index
│   │   │   ├── testcorpus.mallet
│   │   │   ├── testcorpus.mallet.index
│   │   │   ├── testcorpus.mm
│   │   │   ├── testcorpus.mm.index
│   │   │   ├── test_corpus_ok.mm
│   │   │   ├── test_corpus_small.mm
│   │   │   ├── testcorpus.svmlight
│   │   │   ├── testcorpus.svmlight.index
│   │   │   ├── testcorpus.txt
│   │   │   ├── testcorpus.uci
│   │   │   ├── testcorpus.uci.index
│   │   │   ├── testcorpus.uci.vocab
│   │   │   ├── test_glove.txt
│   │   │   ├── testlowdistinctwords.txt
│   │   │   ├── testrepeatedkeywords.txt
│   │   │   ├── testsummarization_unrelated.txt
│   │   │   ├── varembed_lee_subcorpus.cor
│   │   │   ├── varembed_morfessor.bin
│   │   │   ├── varembed_vectors.pkl
│   │   │   ├── word2vec_pre_kv_c
│   │   │   ├── word2vec_pre_kv_py2
│   │   │   ├── word2vec_pre_kv_py3
│   │   │   ├── word2vec_pre_kv_py3_4
│   │   │   ├── word2vec_pre_kv_sep_py2
│   │   │   ├── word2vec_pre_kv_sep_py2.neg_labels.npy
│   │   │   ├── word2vec_pre_kv_sep_py2.syn0_lockf.npy
│   │   │   ├── word2vec_pre_kv_sep_py2.syn0.npy
│   │   │   ├── word2vec_pre_kv_sep_py2.syn1neg.npy
│   │   │   ├── word2vec_pre_kv_sep_py3
│   │   │   ├── word2vec_pre_kv_sep_py3_4
│   │   │   ├── word2vec_pre_kv_sep_py3_4.neg_labels.npy
│   │   │   ├── word2vec_pre_kv_sep_py3_4.syn0_lockf.npy
│   │   │   ├── word2vec_pre_kv_sep_py3_4.syn0.npy
│   │   │   ├── word2vec_pre_kv_sep_py3_4.syn1neg.npy
│   │   │   ├── word2vec_pre_kv_sep_py3.neg_labels.npy
│   │   │   ├── word2vec_pre_kv_sep_py3.syn0_lockf.npy
│   │   │   ├── word2vec_pre_kv_sep_py3.syn0.npy
│   │   │   ├── word2vec_pre_kv_sep_py3.syn1neg.npy
│   │   │   └── wordsim353.tsv
│   │   ├── test_direct_confirmation.py
│   │   ├── test_doc2vec.py
│   │   ├── test_dtm.py
│   │   ├── test_fasttext.py
│   │   ├── test_fasttext_wrapper.py
│   │   ├── test_glove2word2vec.py
│   │   ├── test_hdpmodel.py
│   │   ├── test_indirect_confirmation.py
│   │   ├── test_keras_integration.py
│   │   ├── test_keywords.py
│   │   ├── test_ldamallet_wrapper.py
│   │   ├── test_ldamodel.py
│   │   ├── test_ldaseqmodel.py
│   │   ├── test_ldavowpalwabbit_wrapper.py
│   │   ├── test_lee.py
│   │   ├── test_logentropy_model.py
│   │   ├── test_lsimodel.py
│   │   ├── test_miislita.py
│   │   ├── test_normmodel.py
│   │   ├── test_parsing.py
│   │   ├── test_phrases.py
│   │   ├── test_probability_estimation.py
│   │   ├── test_rpmodel.py
│   │   ├── test_segmentation.py
│   │   ├── test_sharded_corpus.py
│   │   ├── test_similarities.py
│   │   ├── test_similarity_metrics.py
│   │   ├── test_sklearn_api.py
│   │   ├── test_summarization.py
│   │   ├── test_text_analysis.py
│   │   ├── test_tfidfmodel.py
│   │   ├── test_tmdiff.py
│   │   ├── test_translation_matrix.py
│   │   ├── test_utils.py
│   │   ├── test_varembed_wrapper.py
│   │   ├── test_wikicorpus.py
│   │   ├── test_word2vec.py
│   │   └── test_wordrank_wrapper.py
│   ├── topic_coherence
│   │   ├── aggregation.py
│   │   ├── direct_confirmation_measure.py
│   │   ├── indirect_confirmation_measure.py
│   │   ├── __init__.py
│   │   ├── probability_estimation.py
│   │   ├── segmentation.py
│   │   └── text_analysis.py
│   └── utils.py
├── gensim Quick Start.ipynb
├── ISSUE_TEMPLATE.md
├── jupyter_execute_cell.png
├── jupyter_home.png
├── MANIFEST.in
├── README.md
├── setup.cfg
├── setup.py
└── tutorials.md

43 directories, 479 files
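
Since the resource ships as a `.zip` file, one way to locate the Jupyter tutorials without extracting everything is to filter the archive's member names. The following is a minimal sketch using Python's standard `zipfile` module. The helper name `list_notebooks` is hypothetical, and because the downloaded file's actual name is not given here, the example builds a tiny in-memory archive that mimics a few paths from the listing above.

```python
import io
import zipfile


def list_notebooks(archive):
    """Return the sorted paths of all .ipynb members in a zip archive.

    `archive` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        return sorted(n for n in zf.namelist() if n.endswith(".ipynb"))


# Build a tiny in-memory archive so the sketch runs without the download.
# The member paths imitate the listing above; their contents are placeholders.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("gensim-develop/docs/notebooks/word2vec.ipynb", "{}")
    zf.writestr("gensim-develop/docs/notebooks/pca.png", "")
    zf.writestr("gensim-develop/README.md", "")
buffer.seek(0)

notebooks = list_notebooks(buffer)
print(notebooks)
```

To run it against the real download, pass the path of the saved `.zip` file to `list_notebooks` in place of `buffer`.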
