"Can RapidMiner be used to make predictions using aggregated data?"
Hi,
I would like to ask whether RapidMiner can use aggregated data to develop failure prediction models.
I have aggregated vehicle test data for cars that were tested in 2016 by the national vehicle testing authority.
I would like to analyse the aggregated data to see if I can produce a prediction model that will allow me to predict the failure rates (and reasons for failures) for vehicles based on the manufacturer (brand), model and year of manufacture.
For example, what is the likelihood that a 2008 Toyota Camry will fail the national vehicle test and, if it does fail, what are the reasons that it will fail, e.g. Brakes, Lights, Emissions, etc.?
I have aggregated test data that shows the total number tested, the number that passed, the number that failed and the reasons for failure.
See sample data below.
                                                  Test result Reason for failure
Manufacturer  Model  Year of Manuf  Total tested  Pass  Fail  Brakes  Lights  Electrical  Emissions  etc.
Toyota        Camry  2010           1600          1000  600   100     600     250         120
Toyota        Camry  2009           2000          800   1200  500     1200    200         100
Vehicles can fail for multiple reasons. Whichever reason has the highest failure rate determines the overall failure rate.
For example, in the table above, for test year 2016, a total of 2,000 Camrys manufactured in 2009 were tested, i.e. they were 7 years old.
Of the 2,000 that were tested, the highest failure rate was for lighting, where 1,200 Camrys failed.
This means that overall, 1,200 (60%) of the 2,000 Camrys tested failed.
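The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the calculation, not a RapidMiner process; the dictionary keys and the function name `failure_rates` are made up for the example, and the figures are the two sample rows from the table.

```python
# Sample rows from the table above. Key names are illustrative assumptions.
rows = [
    {"make": "Toyota", "model": "Camry", "year": 2010, "tested": 1600,
     "passed": 1000, "failed": 600,
     "reasons": {"Brakes": 100, "Lights": 600, "Electrical": 250, "Emissions": 120}},
    {"make": "Toyota", "model": "Camry", "year": 2009, "tested": 2000,
     "passed": 800, "failed": 1200,
     "reasons": {"Brakes": 500, "Lights": 1200, "Electrical": 200, "Emissions": 100}},
]


def failure_rates(row):
    """Return (overall failure rate, rate per reason, most common reason)."""
    tested = row["tested"]
    overall = row["failed"] / tested
    # A vehicle can fail for several reasons, so these rates need not sum to `overall`.
    by_reason = {reason: count / tested for reason, count in row["reasons"].items()}
    top_reason = max(by_reason, key=by_reason.get)
    return overall, by_reason, top_reason


# Index the results by (manufacturer, model, year of manufacture).
rates = {(r["make"], r["model"], r["year"]): failure_rates(r) for r in rows}

overall, by_reason, top_reason = rates[("Toyota", "Camry", 2009)]
print(f"2009 Camry: {overall:.1%} failed overall, most common reason: {top_reason}")
```

For the 2009 Camry row this gives an overall failure rate of 1,200 / 2,000 = 60%, with Lights as the most common reason, matching the figures above.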
Would anyone be able to advise or assist in developing a RapidMiner process that would allow me to use the aggregated test results to predict the failure rates (and reasons) for vehicles that will be tested in future?
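One possible way to feed aggregated counts to a learner, sketched here in Python rather than as a RapidMiner process, is to turn each aggregated row into one "pass" example and one "fail" example, each weighted by its count. This is an assumption about how the data could be reshaped, not a confirmed approach; the names `to_weighted_examples` and `fail_probability` are hypothetical, and vehicle age is derived from the 2016 test year.

```python
from collections import defaultdict

TEST_YEAR = 2016  # test year of the sample data

# (manufacturer, model, year of manufacture, passed, failed) from the sample rows
aggregated = [
    ("Toyota", "Camry", 2010, 1000, 600),
    ("Toyota", "Camry", 2009, 800, 1200),
]


def to_weighted_examples(rows, test_year=TEST_YEAR):
    """Expand each aggregated row into a weighted 'pass' and a weighted 'fail' example."""
    examples = []
    for make, model, year, passed, failed in rows:
        age = test_year - year  # vehicle age at the time of the test
        for label, weight in (("pass", passed), ("fail", failed)):
            examples.append({"make": make, "model": model, "age": age,
                             "label": label, "weight": weight})
    return examples


def fail_probability(examples, make, model, age):
    """Weighted share of 'fail' examples for one make/model/age, or None if unseen."""
    totals = defaultdict(int)
    for ex in examples:
        if (ex["make"], ex["model"], ex["age"]) == (make, model, age):
            totals[ex["label"]] += ex["weight"]
    tested = totals["pass"] + totals["fail"]
    return totals["fail"] / tested if tested else None


examples = to_weighted_examples(aggregated)
print(fail_probability(examples, "Toyota", "Camry", 7))
```

With only these two rows the estimate for a 7-year-old Camry is simply the observed 60%. The weighted layout would only start to matter once a model has to generalise across many manufacturers, models and ages.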
Regards
Tom