"Beginner Machine Learning Question"

User: "Ghostrider"
New Altair Community Member
Updated by Jocelyn
Say I want to predict the price of an automobile from its attributes. Assume I know things such as tire size, date of manufacture, number of doors, etc. I could throw all of these attributes into a decision tree learner and hope it finds some relation to the cost of the car. But can I get a better result by using relations that I already know about the attributes?
For example, assume that I don't know how much horsepower the engine produces, but I do know attributes that correlate with horsepower, such as engine displacement, number of cylinders, and number of gears in the transmission. Although I don't know the horsepower, assume that I can roughly calculate it from these parameters. Doesn't it make more sense to isolate these attributes from the others and use them exclusively to build a model for engine horsepower, whose output can then be supplied to a higher-layer learner that tries to figure out how horsepower and other factors affect an automobile's price?
Obviously, if I have no idea how the attributes relate, it's probably better to just supply them all to one learning algorithm. But if I know something about the relations among certain attributes, it seems like a better approach would be to isolate the attributes into groups, build a model for what each group represents, and then use these sub-models to train another model. This would be like a hierarchy of learning: going from detailed attributes (number of cylinders, engine displacement, gears in transmission) to higher-level attributes (horsepower, torque), and finally predicting the price of the auto from these higher-level attributes (horsepower, quality of interior, car maker's reputation, etc.).
Is this a good approach? The idea is to use information about relationships that I already know to direct the learning process.
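To make the idea concrete, here is a minimal sketch of the two-stage setup in plain Python. Everything in it is an assumption made for illustration: the four rows of car data are made up, `estimate_horsepower` is a hypothetical rule of thumb (not a real engineering formula), and the higher layer is a simple one-variable least-squares fit rather than a decision tree.

```python
# Made-up rows: (displacement in litres, cylinders, gears, price in USD).
cars = [
    (1.6, 4, 5, 17500.0),
    (2.0, 4, 6, 20500.0),
    (3.0, 6, 6, 27000.0),
    (5.0, 8, 8, 40700.0),
]


def estimate_horsepower(displacement_l, cylinders, gears):
    """Stage 1: hypothetical hand-written relation standing in for prior knowledge."""
    return 60.0 * displacement_l + 5.0 * cylinders + 2.0 * gears


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Stage 1: collapse the low-level engine attributes into one derived attribute.
horsepower = [estimate_horsepower(d, c, g) for d, c, g, _ in cars]
prices = [price for _, _, _, price in cars]

# Stage 2: the higher-layer model sees only the derived attribute.
intercept, slope = fit_line(horsepower, prices)


def predict_price(displacement_l, cylinders, gears):
    """Run a new car through both stages."""
    hp = estimate_horsepower(displacement_l, cylinders, gears)
    return intercept + slope * hp


new_car_price = predict_price(2.5, 4, 6)
```

In a real setting the second stage would also receive the other attributes (doors, interior quality, and so on) alongside the derived horsepower value; the sketch keeps a single input only to stay short.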
Second question: what if I don't know how to calculate horsepower from those low-level attributes and only know that they are related?
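One possible way to picture this second case, offered only as a sketch: without observed horsepower values there is nothing to train a supervised sub-model against, so the related attributes can instead be collapsed into a single unsupervised composite. The example below standardizes each attribute and averages the results. The data rows, the name `engine_score`, and the assumption that every attribute moves in the same direction as horsepower are all hypothetical.

```python
from statistics import mean, pstdev

# Made-up rows: (displacement in litres, cylinders, gears).
engine_rows = [
    (1.6, 4, 5),
    (2.0, 4, 6),
    (3.0, 6, 6),
    (5.0, 8, 8),
]


def zscores(values):
    """Rescale one attribute to zero mean and unit (population) standard deviation."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]


# Standardize each attribute column so no single unit dominates the composite.
columns = list(zip(*engine_rows))
standardized = [zscores(col) for col in columns]

# Composite per car: the average of its standardized attributes.
# Assumes each attribute is positively related to the hidden quantity.
engine_score = [mean(vals) for vals in zip(*standardized)]
```

The resulting `engine_score` could then be passed to the higher-layer learner in place of a computed horsepower value. It is a stand-in for "engine size" in general, not a horsepower estimate.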
