Continuous and categorical mixed features

Yin
Yin New Altair Community Member
edited November 5 in Community Q&A
What function can I use to perform PCA on a dataset with mixed continuous and categorical features in RapidMiner? I applied PCA to the continuous features after standardizing them and left the categorical variables dummy-encoded, without PCA. Is there a dimensionality reduction method (e.g., FAMD) that can be used on such a dataset? Thanks in advance.

Answers

  • MartinLiebig
    MartinLiebig
    Altair Employee
    Hi,
    PCA is simply not defined on nominal values. You would need to transform them to numerical values first (e.g., using target encoding).

    Best,
    Martin
  • Yin
    Yin New Altair Community Member
    Hi, my data is transformed, but I am asking whether the PCA in RapidMiner can handle categorical variables. Generally speaking, it is not good practice to apply PCA to one-hot or dummy-encoded categorical variables. There should be a dedicated implementation for mixed data, and I'm asking whether one is already integrated here.
  • MartinLiebig
    MartinLiebig
    Altair Employee
    Answer ✓
    Hi,
    PCA itself is simply not defined on non-numerical types. Any other solution would not be a PCA.

    Best,
    Martin
  • Yin
    Yin New Altair Community Member
    Agreed. I should have said dimensionality reduction. I will resolve this question and start a new one.
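
For readers who land here looking for the mixed-data reduction mentioned in the question, the following is a minimal sketch of the usual FAMD-style recipe written outside RapidMiner, in Python with NumPy. It is not a RapidMiner operator and does not show how RapidMiner handles this. The function name `famd_scores` and the toy columns (`height`, `weight`, `color`) are hypothetical and chosen only for illustration. The sketch assumes no continuous column is constant and no values are missing. Continuous columns are standardized, each category indicator is divided by the square root of its proportion and centered, and an ordinary PCA (via SVD) is run on the combined table.

```python
import numpy as np


def famd_scores(continuous, categorical, n_components=2):
    """Return row scores and explained-variance ratios for a FAMD-style reduction.

    continuous:  (n, p) array of numeric columns (assumed non-constant).
    categorical: list of length-n label sequences, one per categorical variable.
    """
    X = np.asarray(continuous, dtype=float)
    # Standardize each continuous column to zero mean and unit variance.
    blocks = [(X - X.mean(axis=0)) / X.std(axis=0)]

    for labels in categorical:
        labels = np.asarray(labels)
        levels = np.unique(labels)
        # One 0/1 indicator column per category level.
        indicators = (labels[:, None] == levels[None, :]).astype(float)
        proportions = indicators.mean(axis=0)
        # Divide by sqrt(proportion), then center each column.
        scaled = indicators / np.sqrt(proportions)
        blocks.append(scaled - scaled.mean(axis=0))

    Z = np.hstack(blocks)
    # Ordinary PCA on the transformed table, computed via SVD.
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]


# Hypothetical toy data: two continuous columns and one categorical column.
rng = np.random.default_rng(0)
height = rng.normal(170, 10, size=50)
weight = 0.9 * height + rng.normal(0, 5, size=50)
color = rng.choice(["red", "green", "blue"], size=50)

scores, explained = famd_scores(np.column_stack([height, weight]), [color])
print(scores.shape)
print(explained)
```

The 1/sqrt(proportion) scaling is what separates this from running PCA directly on dummy-encoded columns: it balances the contribution of each categorical variable against the standardized continuous ones, which is the concern raised in the thread.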