Hello,
I would like to first process one or more documents (tokenization, n-grams, etc.; this part is done) and then compare each document with several lists of sample data. If a document matches or is similar to a list, the name of that list should be assigned to the document. If a document contains common tokens but does not match any list, it should additionally be assigned "Others". Later, it should be possible to trace which lists matched which document. I imagine this to be similar to sentiment analysis with a trained model, except that instead of just positive and negative there are many possible labels, and a document can receive more than one. Unfortunately, I haven't found an approach for how to proceed.
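To make the goal more concrete, here is a rough Python sketch of the kind of mapping I mean. Everything in it is a placeholder assumption rather than a working solution: the function name `match_lists`, the example lists and documents, the overlap measure (the share of a list's terms found in the document), the 0.5 threshold, and the reading that "Others" is assigned only when no list matches.

```python
from typing import Dict, List, Set


def match_lists(
    doc_tokens: List[str],
    sample_lists: Dict[str, Set[str]],
    threshold: float = 0.5,  # assumed cut-off, would need tuning
) -> List[str]:
    """Return the names of all lists that match the document, or ["Others"]."""
    doc_set = set(doc_tokens)
    matched = []
    for name, terms in sample_lists.items():
        if not terms:
            continue  # skip empty lists to avoid dividing by zero
        # Share of the list's terms that also occur in the document
        overlap = len(doc_set & terms) / len(terms)
        if overlap >= threshold:
            matched.append(name)
    # Assumption: "Others" is the fallback when no list reaches the threshold
    return matched or ["Others"]


# Hypothetical sample lists and already-tokenized documents
sample_lists = {
    "fruit": {"apple", "banana", "cherry"},
    "vehicles": {"car", "truck", "bike"},
}
docs = {
    "doc1": ["i", "ate", "an", "apple", "and", "a", "banana"],
    "doc2": ["the", "weather", "is", "nice"],
}

# Keep the mapping per document so it can be traced later
labels = {doc_id: match_lists(tokens, sample_lists) for doc_id, tokens in docs.items()}
print(labels)
```

The result is a dictionary from document ID to list names, which is the traceability I am after. What I don't know is whether a simple overlap like this is reasonable or whether a trained multi-label model would be the better route.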
I would appreciate your help. :)