If those files are in different directories, you could use the Process Documents from Files operator. It lets you tag each directory with a label, so that when you build a model (e.g. Naive Bayes) you can see how well the documents from each directory classify.
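To make the "one label per directory" idea concrete, here is a minimal sketch in plain Python rather than RapidMiner. It only shows the labeling step, not the model. The function name `load_labeled_documents`, the `*.txt` pattern, and the example folder names are assumptions for illustration, not anything the operator itself defines.

```python
from pathlib import Path


def load_labeled_documents(class_dirs):
    """Return (label, text) pairs, one per file, labeled by source directory.

    class_dirs maps a label (hypothetical, e.g. "sports") to the directory
    that holds the documents for that class.
    """
    examples = []
    for label, directory in class_dirs.items():
        # Sorted so the order of examples is reproducible between runs.
        for path in sorted(Path(directory).glob("*.txt")):
            text = path.read_text(encoding="utf-8", errors="replace")
            examples.append((label, text))
    return examples
```

You would call it with something like `load_labeled_documents({"sports": "docs/sports", "politics": "docs/politics"})` and hand the labeled pairs to whatever classifier you are evaluating.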
Since the links are in an XLS file, you could use the Get Pages operator in conjunction with a loop to read each URL, get the page, and save it.
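The same loop (take each URL, get the page, save it) can be sketched outside RapidMiner in plain Python. This is only an illustration under assumptions: the URLs are taken to be already exported from the spreadsheet into a list, and the names `fetch`, `save_pages`, and the `page_0000.html` naming scheme are made up for the example.

```python
from pathlib import Path
from urllib.request import urlopen


def fetch(url):
    """Download one page and return its raw bytes."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def save_pages(urls, out_dir, fetch_page=fetch):
    """Fetch every URL in turn and write each page to its own file."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, url in enumerate(urls):
        # Hypothetical naming scheme: page_0000.html, page_0001.html, ...
        target = out / f"page_{index:04d}.html"
        target.write_bytes(fetch_page(url))
        saved.append(target)
    return saved
```

The `fetch_page` parameter is there so you can swap in a different downloader (or a stub while testing) without touching the loop.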
You could append the macro %{t} to them, which will give you a timestamp. You'd then have to build a process that converts that timestamp to a date, say 2017-05-16, which you can then aggregate the documents on.
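The convert-then-aggregate step could look roughly like this, again as a plain Python sketch rather than a RapidMiner process. The timestamp layout in `TIMESTAMP_FORMAT` is an assumption made for the example; check what %{t} actually produces on your installation and adjust the format string. The document names are made up as well.

```python
from collections import defaultdict
from datetime import datetime

# Assumed layout of the timestamp; change this to match your real values.
TIMESTAMP_FORMAT = "%Y_%m_%d-%H_%M_%S"


def to_date(timestamp, fmt=TIMESTAMP_FORMAT):
    """Convert a raw timestamp string to an ISO date such as 2017-05-16."""
    return datetime.strptime(timestamp, fmt).date().isoformat()


def group_by_date(names_with_timestamps):
    """Group (document_name, timestamp) pairs by calendar date."""
    groups = defaultdict(list)
    for name, timestamp in names_with_timestamps:
        groups[to_date(timestamp)].append(name)
    return dict(groups)


documents = [
    ("a.txt", "2017_05_16-09_30_00"),
    ("b.txt", "2017_05_16-17_45_10"),
    ("c.txt", "2017_05_17-08_00_00"),
]
by_date = group_by_date(documents)
# by_date == {"2017-05-16": ["a.txt", "b.txt"], "2017-05-17": ["c.txt"]}
```

Once every document carries a plain date like this, aggregating on it is just a group-by on that one value.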