NLP Basics: NLTK

May 21, 2022, 11:03:13

Install the library:

conda install nltk

Then import it in Python:

import nltk

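The first time you use it, run nltk.download() in Python to fetch the data packages.

The downloader lets you set where the data is downloaded.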
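As a rough sketch of the non-interactive route, the snippet below points NLTK at a custom data directory and downloads into it. The helper names `use_nltk_data_dir` and `fetch_resources` are made up for illustration, and the resource names are the ones the tokenizer and tagger have traditionally needed; newer NLTK releases may ask for differently named resources (the LookupError message names the one it wants).

```python
import os

import nltk


def use_nltk_data_dir(data_dir):
    """Create data_dir if needed and make NLTK search it first.

    Hypothetical helper, not part of NLTK itself.
    """
    os.makedirs(data_dir, exist_ok=True)
    # nltk.data.path is the list of directories NLTK searches for resources
    if data_dir not in nltk.data.path:
        nltk.data.path.insert(0, data_dir)
    return data_dir


def fetch_resources(data_dir, names=("punkt", "averaged_perceptron_tagger")):
    """Download the named resources into data_dir (needs network access).

    The default names are an assumption; adjust them to whatever your
    NLTK version reports as missing.
    """
    for name in names:
        nltk.download(name, download_dir=data_dir)
```

Typical use would be `fetch_resources(use_nltk_data_dir(os.path.expanduser("~/nltk_data")))`, run once before tokenizing.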


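NLTK does not support vector conversion for Chinese text. A workaround is to do the word segmentation and vector conversion with jieba first, and then process the result with nltk.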
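A minimal sketch of the segmentation half of that workflow follows; vector conversion is not shown. It assumes jieba is installed (`pip install jieba`) and uses `jieba.lcut` for segmentation and NLTK's `FreqDist` for a simple downstream step. The function names `segment` and `word_frequencies` are invented for this example.

```python
from nltk import FreqDist


def segment(text, cut=None):
    """Split text into word tokens.

    `cut` is any callable that maps a string to a list of tokens.
    When omitted, jieba.lcut is used (assumes jieba is installed).
    """
    if cut is None:
        import jieba  # third-party; imported lazily so it stays optional
        cut = jieba.lcut
    # Drop empty and whitespace-only tokens
    return [tok for tok in cut(text) if tok.strip()]


def word_frequencies(tokens):
    """Count tokens with NLTK. FreqDist needs no downloaded corpora."""
    return FreqDist(tokens)
```

For example, `word_frequencies(segment('本人喜欢折腾,倒腾大数据和AI人工智能滴一些技术。')).most_common(5)` lists the most frequent tokens jieba produced. The `cut` parameter is there only so another segmenter can be swapped in.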

 

#-*- coding:utf-8 -*-

import nltk
# Note: a space is required after each period, otherwise the sentences are not split
text = 'Join thousands of learners from around the world who are improving their English listening skills with our online courses. Join thousands of learners from around the world who are improving their English listening skills with our online courses.'

# Sentence splitting
sens = nltk.sent_tokenize(text, language='english')
print(sens)

# Word tokenization: one token list per sentence
words = [nltk.word_tokenize(sent) for sent in sens]
print(words)

# Part-of-speech tagging: one list of (token, tag) pairs per sentence
tags = [nltk.pos_tag(sentence_tokens) for sentence_tokens in words]
print(tags)

textzh = '本人喜欢折腾,倒腾大数据和AI人工智能滴一些技术。'
# From a quick test, this does not appear to handle Chinese; it also expects a space after the period
sens_zh = nltk.sent_tokenize(textzh)
print(sens_zh)
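Since sent_tokenize does not seem to split Chinese text, one possible stand-in is a small regex splitter from the standard library. This is only a sketch, not an NLTK feature: the function name `split_sentences_zh` and the set of sentence-ending punctuation marks are assumptions and are not exhaustive.

```python
import re

# One or more sentence-ending marks (an assumed, non-exhaustive set),
# followed by any closing quotes or brackets that belong to the sentence.
_SENT_END = re.compile(r'([。！？!?…]+[”’」』）)]*)')


def split_sentences_zh(text):
    """Split Chinese text into sentences, keeping the ending punctuation."""
    # With a capturing group, re.split returns
    # [text, delimiter, text, delimiter, ..., trailing text]
    parts = _SENT_END.split(text)
    sentences = []
    for i in range(0, len(parts) - 1, 2):
        sentence = (parts[i] + parts[i + 1]).strip()
        if sentence:
            sentences.append(sentence)
    # Text after the last delimiter has no ending punctuation
    tail = parts[-1].strip()
    if tail:
        sentences.append(tail)
    return sentences
```

Unlike the English tokenizer, this does not need a space after the full stop, because it splits on the punctuation itself.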
