Is it worth it to apply TurboPrep?
tonyboy9
I finished reading RapidMiner "What you need to know about Data Preparation."
In the real world of work, it's good to know how screwed up data can present itself. When I import a data set in RapidMiner Studio, at the bottom right it says, "No problems." To me that says the data set is clean for learning purposes. Is it worth it to apply TurboPrep?
Accepted answers
David_A
Hi @tonyboy9,
The words "No problems" can mean a lot of things. In the case of importing your data, the message simply states that this particular task went well (no corrupted files, wrong data types, unreadable date formats, and so on).
At this stage, RapidMiner can't give you any feedback on the actual data quality or on whether it's worth applying TurboPrep (my guess: it probably is). How to prepare and improve a data set always depends on the actual use case. In some cases you might want to keep the data as raw as possible (for education, to define a baseline for improvement, for compliance, and so on). In other cases, you might spend 90% of your time on data preparation and understanding to solve a particular problem.
To summarize:
"No problems" during import is only a technical statement that the import will run smoothly. Data preparation is an additional, independent step afterwards.
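To make the distinction concrete, here is a minimal Python sketch using only the standard library. It is not how RapidMiner or TurboPrep works internally; the sample data, the function names `import_rows` and `quality_report`, and the three checks are all made up for illustration. The point is only that a file can parse without any error and still contain missing values, duplicate rows, and implausible entries.

```python
import csv
import io

# Hypothetical sample data: it parses cleanly, yet has quality issues.
RAW = """id,age,city
1,34,Berlin
2,,Paris
3,34,Berlin
3,34,Berlin
4,-5,Rome
"""


def import_rows(text):
    """Technical import step: succeeds as long as the text parses as CSV."""
    return list(csv.DictReader(io.StringIO(text)))


def quality_report(rows):
    """Separate data-quality step: count a few common problems."""
    # Empty cells count as missing values.
    missing = sum(1 for r in rows for v in r.values() if v == "")

    # A row is a duplicate if an identical row was already seen.
    seen = set()
    duplicates = 0
    for r in rows:
        key = tuple(r.items())
        if key in seen:
            duplicates += 1
        seen.add(key)

    # Ages that are valid integers but negative are implausible.
    negative_ages = sum(
        1
        for r in rows
        if r["age"].lstrip("-").isdigit() and int(r["age"]) < 0
    )
    return {
        "missing": missing,
        "duplicates": duplicates,
        "negative_ages": negative_ages,
    }


rows = import_rows(RAW)      # the import itself raises no error
report = quality_report(rows)
print(len(rows), report)
```

The import step reports nothing wrong, while the separate report finds one problem of each kind. Which of those findings actually needs fixing still depends on the use case.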
Best,
David
tonyboy9
Nicely phrased response indeed. Thank you for your time.