Determining value for parameter

joandcruz New Altair Community Member
edited November 2024 in Community Q&A

Please help me. I am stuck.

I have a general decision tree and also CHAID and ID-3.

The parameters are:

- minimal size for split
- minimal leaf size
- minimal gain
- maximal depth
- confidence

My training set has 400 examples.
My data has 6707 features.
My total amount of text is 27910.

How can I determine good values for these parameters without test runs? Test runs would take too much time because of the enormous amount of data.
Does anyone have an idea for me?

Thank you!!!

Answers

  • mafern76 New Altair Community Member
    What do you mean by total text?

    If you are working with text and a lot of attributes and are short on time, you could give Naive Bayes a try.

    You can also try pruning some of your text vectors and removing correlated attributes.
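
One common way to prune text vectors is by document frequency: drop tokens that appear in too few or too many documents. The snippet below is a minimal sketch of that idea in plain Python. The function name `prune_by_document_frequency`, the thresholds `min_df` and `max_df_ratio`, and the toy documents are all assumptions made for illustration, not settings taken from this thread or from any particular tool.

```python
from collections import Counter


def prune_by_document_frequency(docs, min_df=2, max_df_ratio=0.9):
    """Keep only tokens that occur in at least `min_df` documents and in
    at most `max_df_ratio` of all documents (hypothetical helper)."""
    n_docs = len(docs)
    # Count in how many documents each token appears (not total occurrences).
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))
    keep = {
        tok for tok, df in doc_freq.items()
        if df >= min_df and df / n_docs <= max_df_ratio
    }
    # Filter every document down to the surviving vocabulary.
    pruned = [[tok for tok in tokens if tok in keep] for tokens in docs]
    return pruned, sorted(keep)


# Toy, already-tokenized documents (made up for this example).
docs = [
    ["the", "tree", "splits", "data"],
    ["the", "svm", "separates", "data"],
    ["the", "bayes", "model", "counts", "words"],
    ["the", "tree", "model", "overfits"],
]
pruned_docs, vocabulary = prune_by_document_frequency(docs)
print(vocabulary)
```

With several thousand features and only a few hundred examples, this kind of pruning can shrink the attribute set considerably before any learner is trained. Suitable thresholds depend on the data.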
  • MartinLiebig, Altair Employee
    Hello joandcruz,

    The data is not as big as you might think. It sounds pretty reasonable to run a parameter optimization on it. You can do this either with a grid search or with an evolutionary approach.
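
To make the grid idea concrete, here is a minimal sketch in plain Python. It tries every combination of candidate values and keeps the best-scoring one. The parameter names mirror the ones listed in the question, but the candidate values, the `grid_search` helper, and the `toy_score` function are assumptions for illustration only. In practice, `evaluate` would be replaced by something like a cross-validated accuracy on the real data.

```python
from itertools import product


def grid_search(param_grid, evaluate):
    """Try every combination in `param_grid` and return the best one.

    `evaluate` is a caller-supplied function that takes a dict of
    parameter values and returns a score where higher is better.
    """
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Illustrative candidate values only; these are not recommendations.
param_grid = {
    "minimal_size_for_split": [2, 4, 8],
    "minimal_leaf_size": [1, 2, 4],
    "minimal_gain": [0.01, 0.05, 0.1],
    "maximal_depth": [5, 10, 20],
}


def toy_score(params):
    # Stand-in for a real validation run: peaks at depth 10 and gain 0.05.
    return -abs(params["maximal_depth"] - 10) - abs(params["minimal_gain"] - 0.05)


best_params, best_score = grid_search(param_grid, toy_score)

# Number of model fits the grid requires (times the number of folds, if
# cross-validation is used inside `evaluate`).
n_combinations = 1
for candidates in param_grid.values():
    n_combinations *= len(candidates)
print(best_params, n_combinations)
```

The cost of a grid is the product of the candidate counts, so keeping each list short keeps the number of runs manageable. An evolutionary approach samples this space instead of enumerating it.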

    If this is text mining, I would recommend an SVM. They usually score better, and in the linear case you only have one parameter to optimize (C).

    Cheers,

    Martin