Totally different results with model depending on storage method
Hi there,
Are there limitations on which models you can safely store using the 'store model' operator? I noticed huge differences between using a model stored as XML and the same model stored directly in the server repository.
The model I used was the Weka MultiBayes, and originally I stored it as XML. Unfortunately, while my training / test results were more than OK, applying the model to new data resulted in almost no matches at all. When I did the same thing with the same model stored in the repo, the results were as expected. So it seems there is a huge difference between how models are read from XML and how they are read from the repo.
I also wanted to try saving it as binary rather than XML, but it won't even load, since it seems to expect XML by default. Is there a special extension that needs to be used when saving as binary?
I can help myself for the time being by storing the model in the database, but since this is not always the best approach for really heavy models, I'd like to understand where I go wrong with the 'store model' option.
Agreed, I have had issues both in viewing results and in storing models from various Weka modeling operators before. I don't know what the root cause is, although I suspect it is simply some kind of deep software bug stemming from incompatibilities between Weka's and RapidMiner's internal configuration. This has been true for many recent versions of RapidMiner. Unfortunately I haven't ever really heard an explanation from tech support of the issue or a plan to remedy it either.
While I can't comment on the storing XML vs. repo part, what I can say is that I've always had problems storing models derived from the Weka extension. I seem to remember that @Telcontar120 may have chimed in on this topic before.