Python - PoS Tagging and Lemmatization using spaCy

spaCy is one of the best libraries for text analysis. It excels at large-scale information extraction and is among the fastest NLP libraries available. It is also well suited to preparing text for deep learning. For tagging, spaCy is generally much faster and more accurate than NLTKTagger and TextBlob.

How to install?

pip install spacy
python -m spacy download en_core_web_sm
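
spaCy models such as en_core_web_sm are installed as ordinary Python packages, so one way to confirm that the download step worked is to check whether the package can be found. The snippet below is a minimal sketch using only the standard library; the helper name is_installed is hypothetical and is not part of spaCy.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package with this name can be imported."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # Both names come from the install commands above.
    for name in ("spacy", "en_core_web_sm"):
        status = "installed" if is_installed(name) else "missing"
        print(name, "->", status)
```

If en_core_web_sm is reported as missing, rerun the download command before calling spacy.load.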

Example

# Import the library and load the small English pipeline
import spacy
# Install the model first: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")
# POS tagging
# Process the whole document
text = "My name is Vishesh. I love to work on data science problems. Please check out my github profile!"
doc = nlp(text)
# Print each token with its coarse-grained part-of-speech tag
for token in doc:
    print(token, token.pos_)
# Collect only the tokens tagged as verbs
print("Verbs:", [token.text for token in doc if token.pos_ == "VERB"])
# Lemmatization: the process of grouping together the inflected forms of a
# word so they can be analyzed as a single item, identified by the word's
# lemma, or dictionary form.
import spacy
# Load the English pipeline (tokenizer, tagger, parser, lemmatizer and NER)
nlp = spacy.load("en_core_web_sm")
# Process the whole document
text = "My name is Vishesh. I love to work on data science problems. Please check out my github profile!"
doc = nlp(text)
# Print each token with its lemma
for token in doc:
    print(token, token.lemma_)
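
Since token.pos_ and token.lemma_ are both available on every token after a single call to nlp, the two examples above can be combined in one pass. The following is a rough sketch under that assumption; the function names group_lemmas_by_pos and analyze are hypothetical helpers introduced here for illustration, not part of spaCy.

```python
from collections import defaultdict


def group_lemmas_by_pos(rows):
    """Group lemmas by part-of-speech tag.

    rows is an iterable of (text, pos, lemma) tuples.
    Returns a dict mapping each tag to a sorted list of unique lowercase lemmas.
    """
    grouped = defaultdict(set)
    for _text, pos, lemma in rows:
        grouped[pos].add(lemma.lower())
    return {pos: sorted(lemmas) for pos, lemmas in grouped.items()}


def analyze(text, model="en_core_web_sm"):
    """Run the spaCy pipeline once and group the lemmas by PoS tag."""
    import spacy  # imported here so the grouping helper works without spaCy

    nlp = spacy.load(model)
    doc = nlp(text)
    # Skip punctuation and whitespace tokens; keep text, tag and lemma.
    rows = [
        (token.text, token.pos_, token.lemma_)
        for token in doc
        if not token.is_punct and not token.is_space
    ]
    return group_lemmas_by_pos(rows)


if __name__ == "__main__":
    sample = "My name is Vishesh. I love to work on data science problems."
    for pos, lemmas in analyze(sample).items():
        print(pos, lemmas)
```

Keeping the grouping logic separate from the spaCy call means the pipeline is loaded only once and the helper can be reused with tags produced by another tagger.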