"Rule induction: correlated variables and cross-validation of individual rules"

Question

Hi, I have a couple of questions about rule induction in RapidMiner. I am a bit of a novice in data mining.

First, the reason I am using rule induction instead of other learning techniques is that it is very important for us to generate a classifier that a human can interpret. A classifier like an SVM might perform well, but its rules would likely not be of any use to us.

My first problem is that certain variables are always going to be positively correlated (this is known ahead of time and is just the nature of the variables). These variables may have different thresholds, but if one of them is found to need a certain minimum value for a given class, then the other should also have a minimum value, not a maximum value. I often get rules where one of these positively correlated variables is given a minimum (x > ..) and the other is given a maximum (y < ..), which clearly indicates overfitting. Is there a way to specify in the dataset that certain variables are positively correlated, so that such rules will never be examined?

In my dataset it also only makes sense for certain variables to have thresholds, not ranges. So whenever I see a rule like x > min and x < max, it is likewise a case of overfitting, and I would like to tell the learner not to attempt such combinations, if this is possible.

I was also wondering if there is a way to perform cross-validation on rules independently. We only have a subset of the variables that would be required to build a proper classifier, and we are aware that in many, probably most, cases we cannot classify based on the variables we have. I am, however, interested in the cases where we can classify, that is, the individual rules that provide strong evidence for a given outcome. Cross-validation in RapidMiner, as I have used it, performs poorly because it is based on trying to classify everything. I would instead like to see how well the best individual rules perform on unseen data, rather than how well the entire rule set performs on unseen data. Is this possible?

Does anybody have any suggestions? Thanks in advance, Barsh
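
To make the two requests concrete, here is a minimal sketch in plain Python of what such checks could look like if done outside the learner. It is not RapidMiner code and does not use any RapidMiner API. The rule representation (a list of `(attribute, operator, threshold)` conditions), the function names, and the `label` key are all assumptions made up for illustration. `is_plausible` rejects rules that put a range on one variable or that mix directions within a group of positively correlated variables; `evaluate_rule` scores a single rule on held-out rows, counting only the rows that the rule actually covers.

```python
import operator

# Hypothetical rule format: a list of (attribute, op, threshold) conditions.
OPS = {">": operator.gt, "<": operator.lt}


def rule_matches(conditions, row):
    """Return True if the row satisfies every condition of the rule."""
    return all(OPS[op](row[attr], thr) for attr, op, thr in conditions)


def is_plausible(conditions, correlated_groups):
    """Reject rules with a range on one attribute, or with mixed
    directions inside a group of positively correlated attributes."""
    ops_by_attr = {}
    for attr, op, _ in conditions:
        ops_by_attr.setdefault(attr, set()).add(op)

    # A range such as x > min and x < max uses both operators on x.
    if any(len(ops) > 1 for ops in ops_by_attr.values()):
        return False

    # Within a correlated group, all thresholds must point the same way.
    for group in correlated_groups:
        directions = set()
        for attr in group:
            directions |= ops_by_attr.get(attr, set())
        if len(directions) > 1:
            return False
    return True


def evaluate_rule(conditions, predicted, rows, label_key="label"):
    """Score one rule on unseen rows: how many rows it covers, and how
    often its prediction is right on those covered rows only."""
    covered = [r for r in rows if rule_matches(conditions, r)]
    correct = sum(1 for r in covered if r[label_key] == predicted)
    return {
        "covered": len(covered),
        "coverage": len(covered) / len(rows) if rows else 0.0,
        # Precision is undefined when the rule covers nothing.
        "precision": correct / len(covered) if covered else None,
    }
```

Under these assumptions, one would learn rules on the training folds, drop the rules for which `is_plausible` returns `False`, and then call `evaluate_rule` for each remaining rule on the held-out fold. Rows that no rule covers simply do not count against any rule, which is the difference from scoring the whole rule set on every example.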