I want to try out the 3 new algorithms that came with 7.2 on my dataset (4,500 examples with 25 numerical attributes). What are the most important parameters to tune in a grid optimization operator for them, and over what intervals? Does anyone have experience with this?
Yeah, I'd also like some ideas about the best params and their ranges to start tweaking with.
I ran some sweeps and was rather disappointed.
I cannot run a grid over 10 params, so any pointers are welcome.
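To put a rough number on that, here is a quick sketch of how fast a full grid grows. The counts (3 candidate values per parameter) are just an assumption for illustration, not anything specific to the 7.2 operators.

```python
import math

def grid_size(values_per_param):
    """Number of model fits in a full grid: the product of the value counts."""
    return math.prod(values_per_param)

# Assumed for illustration: 3 candidate values for each parameter.
full = grid_size([3] * 10)   # grid over 10 parameters
small = grid_size([3] * 3)   # grid over the 3 most important ones

print(full, small)  # 59049 27
```

With cross-validation every one of those fits is repeated once per fold, so narrowing down to a handful of parameters matters a lot.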
Also, surprisingly, I get better generalization from a smaller set than from a bigger one (my total data set is just a few thousand examples). What gives?!
No free hunch :smileywink:
For deep learning, it basically depends on the network design and specific domain knowledge:
how to choose the activation function, number of epochs, hidden layer sizes, learning rate, parameters for avoiding overfitting, etc.
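If a full grid over all of those is too expensive, one common workaround is to sample a fixed number of random combinations instead of trying every one. Below is a minimal sketch of that idea in plain Python. Note that this is random search rather than the grid optimization operator itself, and all parameter names and candidate values are hypothetical placeholders, not recommended ranges and not the actual operator or H2O parameter names.

```python
import random

# Hypothetical search space: names and values are placeholders only.
SEARCH_SPACE = {
    "activation": ["tanh", "rectifier", "maxout"],
    "epochs": [10, 50, 100],
    "hidden_layer_sizes": [(50,), (50, 50), (100, 100)],
    "learning_rate": [0.001, 0.005, 0.01],
    "l1_regularization": [0.0, 1e-5, 1e-4],
}

def sample_configs(space, n, seed=0):
    """Draw n random parameter combinations (with replacement) from the space."""
    rng = random.Random(seed)  # fixed seed so the same configs come back each run
    return [
        {name: rng.choice(values) for name, values in space.items()}
        for _ in range(n)
    ]

# 20 candidate configurations instead of the 3**5 = 243 of the full grid.
configs = sample_configs(SEARCH_SPACE, 20)
for cfg in configs[:3]:
    print(cfg)
```

Each sampled configuration would then be trained and validated the same way a grid point would be, keeping the best one.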
Why not download the booklet and take a look at the reference for the supervised models you just mentioned, from
http://www.h2o.ai/docs/
You will find more helpful information there.