I just want to do POS tagging, but I got an error.
import nltk

# Read the raw article and echo it
Text=open('news/article.txt')
t=Text.read()
print t
# Tokenize, then POS-tag the tokens
text=nltk.word_tokenize(t)
posTagged=nltk.pos_tag(text)
print posTagged
and got this:
Maybe that is why whenever we go to watch any live sport in India they lock us within cages. Thanks to the Cricket Lovers in the Barabati Stadium in Cuttack, this is probably only going to get worse.
But right now you, the Cricket Lovers at the Barabati Stadium, have a bigger problem to deal with. I hope you realize what you have done. You didn’t just disrupt a game last evening, you may have just ensured you won’t get international cricket in your city. So much for your love!
Thanks to a bunch of hooligans, every Indian fan has been blackened. We are all hanging our heads in shame. This feeling is far worse than losing just a cricket match.
Traceback (most recent call last):
  File "C:\Python27\TestProj1.py", line 12, in <module>
    posTagged=nltk.pos_tag(text)
  File "C:\Python27\lib\site-packages\nltk\tag\__init__.py", line 106, in pos_tag
    return tagger.tag(tokens)
  File "C:\Python27\lib\site-packages\nltk\tag\sequential.py", line 61, in tag
    tags.append(self.tag_one(tokens, i, tags))
  File "C:\Python27\lib\site-packages\nltk\tag\sequential.py", line 81, in tag_one
    tag = tagger.choose_tag(tokens, index, history)
  File "C:\Python27\lib\site-packages\nltk\tag\sequential.py", line 634, in choose_tag
    featureset = self.feature_detector(tokens, index, history)
  File "C:\Python27\lib\site-packages\nltk\tag\sequential.py", line 736, in feature_detector
    'prevtag+word': '%s+%s' % (prevtag, word.lower()),
UnicodeDecodeError: 'ascii' codec can't decode byte 0x92 in position 4: ordinal not in range(128)
But it works fine for some other text files. How do I solve this?
UnicodeDecodeError? Because a few minutes ago your question had a different error (sorry for the rollback confusion). - alexis, 06.10.2015
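For what it's worth, byte 0x92 is the curly apostrophe (U+2019) in the Windows-1252 code page, and position 4 matches a token like "didn’t" in the printed article. That would explain why files containing only plain ASCII work. A minimal sketch of one possible fix is to decode the file when reading it, so the tagger receives unicode text instead of raw bytes. Note the assumptions here: the file is assumed to be cp1252-encoded (it may well be something else), `read_text` is a hypothetical helper name, and a temporary file stands in for news/article.txt.

```python
import io
import os
import tempfile


def read_text(path, encoding="cp1252"):
    """Read a file and return decoded (unicode) text.

    The default encoding is an assumption based on byte 0x92 in the
    traceback; change it if the file was saved with another encoding.
    """
    with io.open(path, encoding=encoding) as handle:
        return handle.read()


# Stand-in for news/article.txt: the same sentence fragment, with the
# apostrophe stored as the single byte 0x92.
raw = b"You didn\x92t just disrupt a game"
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as handle:
    handle.write(raw)

try:
    text = read_text(path)
finally:
    os.remove(path)

# text is now unicode, with 0x92 decoded to U+2019
print(repr(text))
```

In the original script this would mean replacing the open(...).read() pair with something like t = read_text('news/article.txt') and then passing t to nltk.word_tokenize and nltk.pos_tag as before.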