I am dealing with malformed XML that contains illegal characters as well as stray quotation marks all over the place. For the first file from the client, I went to the effort of correcting the XML issues after importing the file and before sending it through to the Read XML operator.
The same fixes will be impossible to apply to the next file from the client, as that file is badly broken. I was wondering what the best practice is for reading such a file into a MySQL database: should I be using RapidMiner, or should I be calling an external program to fix the issues before processing the data through the operator?
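To make the external option concrete, here is a minimal Python sketch of the kind of pre-cleaning step I have in mind. It is only an illustration under assumptions: the function name `clean_xml_text` is hypothetical, it only strips characters that XML 1.0 does not allow and escapes bare ampersands, and it does not attempt to repair the misplaced quotation marks, which would probably need rules specific to the client's files.

```python
import re

# Characters that XML 1.0 does not allow: control characters other than
# tab, line feed and carriage return, plus surrogates, U+FFFE and U+FFFF.
ILLEGAL_XML_CHARS = re.compile(
    r"[\x00-\x08\x0B\x0C\x0E-\x1F\uD800-\uDFFF\uFFFE\uFFFF]"
)

# An ampersand that does not start a named or numeric entity reference.
BARE_AMPERSAND = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)"
)


def clean_xml_text(text: str) -> str:
    """Remove illegal XML 1.0 characters and escape bare ampersands.

    Hypothetical helper: it does not fix unbalanced tags or stray quotes.
    """
    text = ILLEGAL_XML_CHARS.sub("", text)
    return BARE_AMPERSAND.sub("&amp;", text)


if __name__ == "__main__":
    sample = "<a>x\x00y & z &amp; &#65;</a>"
    print(clean_xml_text(sample))
```

The cleaned text would then be written back to a file and passed to the Read XML operator as before.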
If you recommend staying inside RapidMiner, what would your approach be to tackling the problem?
Kind regards
Robin