Is it worth it to apply TurboPrep?
tonyboy9
New Altair Community Member
I finished reading RapidMiner's "What you need to know about Data Preparation."
In the real world of work, it's good to know how screwed up data can present itself. When I import a data set in RapidMiner Studio, at the bottom right it says, "No problems." To me that says the data set is clean for learning purposes. Is it worth it to apply TurboPrep?
Best Answers
Hi @tonyboy9,

The words "no problems" can have a lot of meanings. In the case of importing your data, they simply state that this particular task went well (no corrupted files, wrong data types, unreadable date formats, and so on).

At this stage, RapidMiner can't give you any feedback on the actual data quality or on whether it's worth applying TurboPrep (my guess: it probably is). How to prepare and improve a data set always depends on the actual use case. In some cases you might want to keep the data as raw as possible (for education, to define a baseline for improvement, for compliance, and so on). In other cases, you might spend 90% of your time on data preparation and understanding to solve a particular problem.

To summarize: "No problems" during import is only a technical statement that the import will run smoothly. Data preparation is an additional and independent step afterwards.

Best,
David
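To make the distinction above concrete, here is a minimal sketch in plain Python (not RapidMiner or TurboPrep). The sample data and the `quality_report` helper are made up for illustration. The CSV text parses without any error, which is roughly what a clean import tells you, yet a basic check still turns up a missing value and a repeated row.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data: it parses cleanly, yet contains a blank
# value (age for id 2) and a fully repeated row (id 3).
SAMPLE_CSV = """id,age,city
1,34,Boston
2,,Chicago
3,29,Denver
3,29,Denver
"""


def quality_report(csv_text):
    """Count empty cells per column and fully duplicated rows."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))

    # A cell counts as missing if it is absent or only whitespace.
    missing = Counter()
    for row in rows:
        for column, value in row.items():
            if value is None or str(value).strip() == "":
                missing[column] += 1

    # Every occurrence of a row beyond its first counts as a duplicate.
    row_counts = Counter(tuple(row.items()) for row in rows)
    duplicates = sum(count - 1 for count in row_counts.values())

    return {
        "rows": len(rows),
        "missing": dict(missing),
        "duplicate_rows": duplicates,
    }


report = quality_report(SAMPLE_CSV)
print(report)
```

Whether a blank or a duplicate actually matters depends on the use case, which is why this kind of check is a separate step from the import itself.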
Nicely phrased response indeed. Thank you for your time.