"Text Mining: How to split data according to language"

New Altair Community Member

Jul 10, 2014

Updated Nov 5, 2024 by Jocelyn

Hi there,

I am trying to split my text corpus by the language each text is written in, but I can't get it to work and need help.
First, I classified the language of each text in the corpus using a Naive Bayes based language detector, so I already know which texts are, for example, German or English. Now I want to select only the German or only the English texts in order to analyze them separately, but I don't know which operators to use. I already tried the Filter Examples operator, but it looks like only the prediction labels for the languages are filtered and the corresponding texts are omitted.
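
To make clearer what result I am after, here is a minimal sketch in plain Python. It is not a RapidMiner process, and the field names `text` and `prediction`, the sample sentences, and the helper `split_by_language` are all made up for illustration. The point is that each text should stay attached to its predicted language when the data is split:

```python
from collections import defaultdict

# Hypothetical records: each text together with the language label
# predicted for it (names and sample data are illustrative only).
documents = [
    {"text": "Das ist ein Beispiel.", "prediction": "German"},
    {"text": "This is an example.", "prediction": "English"},
    {"text": "Noch ein Satz.", "prediction": "German"},
]


def split_by_language(records, label_key="prediction"):
    """Group whole records (text included) by their predicted language."""
    groups = defaultdict(list)
    for record in records:
        groups[record[label_key]].append(record)
    return dict(groups)


by_language = split_by_language(documents)

# The texts remain available after the split, one list per language.
german_texts = [r["text"] for r in by_language.get("German", [])]
english_texts = [r["text"] for r in by_language.get("English", [])]
```

In other words, filtering on the prediction should keep the full row, not just the label column.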

Can anybody help?

Thanks in advance!!

Ute
