"Traversing an XML DOM object (Using process documents?)"

Question

Hello,

My goal is to traverse an XML document that has various attributes associated with features. Here is an example:

no consensus has yet emerged on the question of whether a dividend tax penalty is capitalized into the return on a firm's common stock. The purpose of this paper is to provide additional evidence on this question.

The tags follow this format. I have several hundred of these files, and my goal is to traverse them and assemble a 'bag of words' associated with each attribute and, if possible, each type as well.

So far I have tried:

'process documents from files' > Extract content (WebMining/HTMLprocessing/) > Tokenize > Stemmer (snowball)
store example set
store word list

Any advice would be helpful. Thank you.
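
For comparison outside RapidMiner, the goal described above (one bag of words per attribute value, and per type) could be sketched in Python with only the standard library. This is a rough sketch under assumptions, not a description of the actual files: the tag name `feature` and the attribute names `attribute` and `type` are hypothetical placeholders, since the real tag format is not shown here, and the helper names are made up for illustration. Stemming is left out; a Snowball stemmer would need a third-party package.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

# Hypothetical sample document. The tag name "feature" and the attribute
# names "attribute" and "type" are assumptions, not the real file format.
SAMPLE = """<doc>
  <feature attribute="background" type="claim">no consensus has yet emerged on the question</feature>
  <feature attribute="aim" type="statement">The purpose of this paper is to provide additional evidence on this question.</feature>
</doc>"""

TOKEN = re.compile(r"[a-z]+")


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return TOKEN.findall(text.lower())


def bags_by_attribute(root, attr_name, bags=None):
    """Count words per value of attr_name over every element that carries it."""
    if bags is None:
        bags = defaultdict(Counter)
    for elem in root.iter():
        value = elem.get(attr_name)
        if value is None:
            continue  # element does not carry this attribute
        # itertext() also collects the text of nested child elements
        text = "".join(elem.itertext())
        bags[value].update(tokenize(text))
    return bags


def bags_from_files(paths, attr_name):
    """Accumulate the same counts across several XML files."""
    bags = defaultdict(Counter)
    for path in paths:
        bags_by_attribute(ET.parse(path).getroot(), attr_name, bags)
    return bags


root = ET.fromstring(SAMPLE)
attribute_bags = bags_by_attribute(root, "attribute")
type_bags = bags_by_attribute(root, "type")
```

With several hundred files, `bags_from_files(pathlib.Path("some_dir").glob("*.xml"), "attribute")` would merge the counts into one dictionary keyed by attribute value (the directory name is again a placeholder).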