Compare and analyse text documents

Question

Hi Experts,

I'm experimenting with text mining and analysis. I've created neighborhood co-occurrences from one text and am trying to analyse them and compare them with a larger corpus. My Example Set looks like this:

Document | Word1 | Word2 | n
1 | aaa | bbb | 2
1 | bbb | ddd | 3
1 | aaa | bbb | 4
2 | aaa | ccc | 3
2 | aaa | bbb | 4
2 | ccc | aaa | 3

This is my process (excerpt from the embedded R script):

    ... %>%
    # filter(!word1 %in% stopwords("de")) %>%
    # filter(!word2 %in% stopwords("de"))
    return(list(devided_bigrams))
    }

I'm out of ideas on how to compare and analyse them. Does anyone have an idea how I can do this?

Regards
Tobias

MartinLiebig · Answer

Ok,

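I would concatenate the two words into a single attribute, Pivot so that each document becomes one row with one column per word pair, Replace Missings with 0, and then use Cross Distance to compare the documents.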
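To make the idea concrete outside of RapidMiner, here is a minimal sketch in plain Python using the rows from the question. It is only an illustration, not the operators themselves: the helper names (`pivot`, `vector`, `euclidean`) are made up, duplicate word pairs within a document are assumed to be summed, and Euclidean distance is assumed as the distance measure.

```python
import math
from collections import defaultdict

# Rows from the question: (document, word1, word2, n)
rows = [
    (1, "aaa", "bbb", 2),
    (1, "bbb", "ddd", 3),
    (1, "aaa", "bbb", 4),
    (2, "aaa", "ccc", 3),
    (2, "aaa", "bbb", 4),
    (2, "ccc", "aaa", 3),
]

def pivot(rows):
    """Concatenate each word pair and build {document: {pair: count}}."""
    table = defaultdict(lambda: defaultdict(int))
    for doc, w1, w2, n in rows:
        # Assumption: repeated pairs within one document are summed.
        table[doc][f"{w1}_{w2}"] += n
    pairs = sorted({p for counts in table.values() for p in counts})
    return table, pairs

def vector(table, pairs, doc):
    """One row of the pivoted table; missing pairs are replaced with 0."""
    return [table[doc].get(p, 0) for p in pairs]

def euclidean(a, b):
    """Euclidean distance between two equally long count vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

table, pairs = pivot(rows)
v1 = vector(table, pairs, 1)
v2 = vector(table, pairs, 2)
distance = euclidean(v1, v2)

print(pairs)     # column names of the pivoted table
print(v1, v2)    # one count vector per document
print(distance)  # smaller means more similar co-occurrence profiles
```

With more documents, the same `vector` and `euclidean` calls can be applied to every pair of documents; raw counts favour long texts, so normalising the vectors first may be worth considering.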

Best,

Martin

TobiasNehrig · Answer

Hi @mschmitz,

As I understand it, these should be tuples.

Regards

Tobias

MartinLiebig · Answer

Hi @TobiasNehrig,

are you working on texts or on tuples? And does the order of the two words matter? I guess the solution is something like Pivot + Cross Distance or Aggregate + Cross Distance, but the precise solution depends on your use case.
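Why the order question matters can be shown with a small Python sketch of the aggregation step, using the rows from the original question. The function name `aggregate` and its `ordered` flag are hypothetical, and summing `n` is an assumption about how the counts should be combined.

```python
from collections import Counter

# Rows from the question: (document, word1, word2, n)
rows = [
    (1, "aaa", "bbb", 2),
    (1, "bbb", "ddd", 3),
    (1, "aaa", "bbb", 4),
    (2, "aaa", "ccc", 3),
    (2, "aaa", "bbb", 4),
    (2, "ccc", "aaa", 3),
]

def aggregate(rows, ordered=True):
    """Sum n per (document, word pair).

    ordered=True  keeps (aaa, ccc) and (ccc, aaa) as different pairs.
    ordered=False sorts the two words so both count as the same pair.
    """
    totals = Counter()
    for doc, w1, w2, n in rows:
        pair = (w1, w2) if ordered else tuple(sorted((w1, w2)))
        totals[(doc, pair)] += n
    return dict(totals)

with_order = aggregate(rows, ordered=True)
without_order = aggregate(rows, ordered=False)

print(with_order)
print(without_order)
```

If order is ignored, document 2 ends up with a single aaa/ccc entry instead of two separate ones, which changes the vectors that any later distance calculation sees.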

Cheers,

Martin