AMOUNT OF EXAMPLES DOES NOT CORRELATE WITH INPUT DATA LOADED FROM PDFs

Question

I tried to tokenize PDF articles, but the result contains only 21 examples. Why does this happen? There should be many more. I used "Process data from files", and inside it I included "Tokenize" and "filter stopwords". This works, but not across all the documents. What should I do to fix it? Cheers, Antonio

lionelderkrikor · Answer

Hi @antonio_heredia,

Do you have a lot of files?

Can you share these files so that we can reproduce what you observe?

Regards,

Lionel

NB: The first line of your XML process is broken; however, I was able to repair it.