New Forum Board for Theoretical Machine Learning?

wessel New Altair Community Member
edited November 5 in Community Q&A
Dear All,

Is it possible to create an extra board in the sub forum "General Community"?

Over the past few years there have been quite a few interesting discussions on this forum on theoretical questions.
Things like performance measures, cross-validation versus fixed-split validation, Gain Ratio vs. Gini Index, MDL, the sigmoid function, SVM parameters, etc.

These discussions are now scattered all over the forum in different boards.
It would be cool to have a board where these theoretical discussions are grouped together.

Best regards,

Wessel

Answers

  • IngoRM
    IngoRM New Altair Community Member
    Hi Wessel,

    In principle that's no problem, and I like the idea. However, this type of discussion is actually what we intended for the board "General Community" - "Data Mining", and I would say that most of the discussions you mentioned in your post are already in that board.

    What would you say: would it be a good idea to update the descriptions of the boards so that it is clearer where somebody should actually post? Or is a new board the better idea (but what would then distinguish it from "General Community" - "Data Mining")?

    Cheers,
    Ingo
  • wessel
    wessel New Altair Community Member
    My hope is that a new board with a very clear description will have a self organizing effect on the community.

    You are right: currently the "Data Mining" board is the most suitable place for these types of discussions,
    and most theoretical discussions (although not all) are already located there.
    That said, for many people the "Data Mining" board is a good place to post just about anything.
    It contains posts on an enormous variety of topics, and my expectation is that this variety will only keep growing.

    The distinction from other boards would be: theory, methodology, and algorithms.

    I would say any material that broadens or deepens what you typically find in machine-learning textbooks is a good fit.
    It is hard to give examples that are not biased towards my own interests, but here are some:
    - What is over-fitting?
    - How to compare two decision tree models?
    - Why does PCA work very well on some problems and very badly on others?
    - How can we try to prove that feature selection is unlikely to over-fit?
    - How do Support Vector Machines work?
    - What assumptions does the "No Free Lunch Theorem Proof" use?
    - What is the correspondence between Logistic Regression and Gaussian Naive Bayes?
    - How does boosting work?
    - Is binary classification a regression or classification problem?
    - Can we model the process of being right or wrong on N test instances using the binomial distribution?
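    To make the last question concrete, here is a minimal sketch in Python of the binomial model it refers to: treating each of N test instances as an independent coin flip that is misclassified with some fixed probability. The error rate 0.1 and N = 20 are made-up numbers purely for illustration.

```python
import math

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k errors among n independent test
    instances, each misclassified with probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical example: a classifier with true error rate 0.1,
# evaluated on 20 test instances. Probability of at most 2 errors:
p_at_most_2 = sum(binomial_pmf(k, 20, 0.1) for k in range(3))  # ~ 0.677
```

    Whether the independence assumption behind this model actually holds for a given test set is, of course, exactly the kind of question such a board could discuss.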

    It is interesting to see what problems other people are trying to tackle.
    Sometimes you can help out, and oftentimes you learn a lot.