"Regarding Text Mining"
Hi,
I have a text document. How can I delete the content between two special characters (for example, my document contains #something#)? I want to delete the special characters as well. I tried TextCleaner, but there we have to specify the exact content we want to delete, so I think this will not work for a huge amount of data. Are there any operators available in RM for this?
Thanks,
Maria
You might add a TokenReplace operator before the Tokenizer during text processing and then use regular expressions to capture whatever you want to remove.
Here's an example process setup:

For more information about regular expressions, you could visit Wikipedia at http://en.wikipedia.org/wiki/Regular_expression, and for trying something out without executing the process, you could use the online form at http://en.wikipedia.org/wiki/Regular_expression.
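To illustrate the kind of regular expression that fits the #something# case, here is a small sketch in plain Python rather than in RapidMiner. The function name strip_delimited and the sample string are made up for the illustration. The pattern #[^#]*# is an assumption about what you want: it matches a '#', everything up to the next '#', and that closing '#'. It uses only basic regex syntax, so it should carry over to the operator's regular expression field with an empty replacement, but please check it against your own data first.

```python
import re

# Matches an opening '#', any run of characters that are not '#',
# and the closing '#'. Assumes the markers are not nested.
DELIMITED = re.compile(r"#[^#]*#")


def strip_delimited(text: str) -> str:
    """Remove every #...# span, including the two '#' characters."""
    return DELIMITED.sub("", text)


if __name__ == "__main__":
    sample = "keep this #something# and this"
    print(strip_delimited(sample))
```

Note that the text on both sides of the removed span keeps its own spaces, so you may be left with a double space where the span used to be.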
Greetings,
Sebastian