Hi,
I have a question, of course.

The documentation for the neural network operators says that the (hidden) layer size will be set to (number of attributes + number of classes) / 2 + 1.
Two questions:
1. Why not use 2*num_attrs+1 hidden neurons? As far as I know, with that number the network is guaranteed to be able to approximate any continuous function, as you might know.

2. What is the number of classes if I am doing a regression? 1?
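
To make the comparison concrete, here is a small sketch of the two sizing rules as I understand them. The function names are my own, and the integer division is an assumption on my part, since the documentation does not say how the result is rounded. It is not the operator's actual code.

```python
def documented_hidden_size(num_attrs: int, num_classes: int) -> int:
    """Rule quoted from the documentation.

    Assumption: the division is an integer (floor) division.
    """
    return (num_attrs + num_classes) // 2 + 1


def two_n_plus_one_hidden_size(num_attrs: int) -> int:
    """Alternative rule from question 1: 2 * num_attrs + 1."""
    return 2 * num_attrs + 1


if __name__ == "__main__":
    # Hypothetical example: 10 attributes, 2 classes.
    print(documented_hidden_size(10, 2))
    print(two_n_plus_one_hidden_size(10))
```

For 10 attributes and 2 classes, the documented rule gives 7 hidden neurons, while 2*num_attrs+1 gives 21. If the number of classes were 1 for regression (which is exactly what I am unsure about), the documented rule would give 6.
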
Thank you very much