I'm afraid you are right. There is no documentation on how the decay influences the learning rate.
I've just looked it up in the source code: in every learning iteration, the learning rate is divided by the number of learning iterations plus one. Thus it drops to 1/2, 1/3, 1/4, etc. of the initial learning rate.
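To make the schedule concrete, here is a minimal sketch of how I read it. This is not the library's actual code: the function name `decayed_learning_rate` is made up for illustration, and I am assuming the iteration counter starts at 0, so the first iteration still uses the full rate.

```python
def decayed_learning_rate(initial_rate, iteration):
    """Hypothetical helper (not part of the library).

    Returns the learning rate used in the given iteration, assuming
    a 0-based iteration counter: rate = initial_rate / (iteration + 1).
    """
    if iteration < 0:
        raise ValueError("iteration must be non-negative")
    return initial_rate / (iteration + 1)


# Rates for the first four iterations, starting from an initial rate of 1.0.
# They are the fractions 1, 1/2, 1/3 and 1/4 of the initial rate.
rates = [decayed_learning_rate(1.0, i) for i in range(4)]
```

So with decay enabled the rate falls off quickly at first and then flattens out: after 9 iterations it is already down to a tenth of the initial value.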
Best,
Nils