How does RM Decision Tree algorithm choose between two attributes that produce equally good splits?
It seems possible that a DT algorithm will find cases that are split equally well into two different groups (branches) by say two different case attributes. How does it choose one attribute to use, in these circumstances? And what can we say about the likely structure of the tree that would have been made had the neglected attribute been used? I have not seen anything like this discussed in the literature on DTs