"SVM label attribute (positive or Negative examples)"

Shubha
Shubha New Altair Community Member
edited November 5 in Community Q&A
Hi,

I need to clarify whether the attribute indicating the positive and negative examples (label attribute: 1, -1) of an SVM example set is an input to the SVM or an output of it. I am a bit confused about this.

To put my question in another way,

Suppose I want to run a HyperHyper SVM on the example set in .....rm_workspace\sample\data/polynomial.dat. I cannot run the HyperHyper SVM on it because RM asks for the label/class attribute. But how do I create this attribute for the data in polynomial.dat? On what basis should I assign 1 or -1 to the examples of the dataset? (I thought the SVM would give the +1/-1 column as an output.)

Sorry if I did not understand the topic well; I am a bit confused by this...

Thanks for your help,
Shubha

Answers

  • steffen
    steffen New Altair Community Member
    Hello

    Sorry, I tried this

    <operator name="Root" class="Process" expanded="yes">
        <operator name="Input" class="ExampleSource">
            <parameter key="attributes" value="C:\workspaces\workspace1-datamining\sample\data\polynomial.aml"/>
        </operator>
        <operator name="HyperHyper" class="HyperHyper">
        </operator>
    </operator>
    without issues. What did you do?

    regards,

    Steffen
  • Shubha
    Shubha New Altair Community Member
    Thanks for your reply...

    I was doing

    <operator name="Root" class="Process" expanded="yes">
        <operator name="ExampleSource" class="ExampleSource">
            <parameter key="attributes" value="C:\Documents and Settings\shubhak\My Documents\rm_workspace\del.aml"/>
        </operator>
        <operator name="HyperHyper" class="HyperHyper">
        </operator>
    </operator>

    I was using del.aml (attached) but reading polynomial.dat. I thought there was no difference between del.aml and polynomial.aml (attached), but the label (continuous values) is only created with polynomial.aml. I am sure I missed some basics, but this is an off-topic issue...

    Coming to my question: there should be a 'label' attribute in SVM. What is this label attribute? Is it a column of +1s and -1s? What are negative and positive examples in SVM? I am doing classification with SVM (not regression). I am confused about whether the column of (+1, -1) is an input to the SVM or an output of it...

    Thanks for your time,

    Shubha

    [attachment deleted by admin]
  • steffen
    steffen New Altair Community Member
    Hello Shubha

    I cannot avoid the feeling that you have not understood the whole concept of classification.  I suggest this article for the basic concept:
    http://en.wikipedia.org/wiki/Statistical_classification

    So... before you can learn anything, you need a label attribute. If you do not have a label attribute yet, you can build one manually, which is equivalent to specifying a target to learn. But normally the label attribute is prespecified (especially for the examples delivered with RapidMiner).

    In RapidMiner, a label attribute for classification can be either binominal (two arbitrary values of type nominal, not just -1 and 1) or multinomial. If you press F1 after selecting a learner operator (e.g. HyperHyper), you can see this requirement among others. To build a label attribute, use the various transformation operators, e.g. changeAttributeRole.

    regards,

    Steffen

    PS: If I have told you things that you already know, I apologize. Otherwise I suggest getting a good book. As I always say, RapidMiner is a tool for performing data mining tasks, not a substitute for a well-written data mining book.
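
Building a label by hand, as described in the reply above, can be sketched outside RapidMiner as follows. This is only an illustration in plain Python: the rows, the attribute names `a1` and `a2`, the threshold rule, and the helper `add_label` are all hypothetical, and the values "red" and "green" stand in for any two nominal values. In practice the rule would come from domain knowledge or observed outcomes, not from the learner.

```python
# Hypothetical unlabeled rows: two numeric attributes each, no label yet.
rows = [
    {"a1": 0.5, "a2": 1.0},
    {"a1": 2.0, "a2": 1.5},
    {"a1": 1.0, "a2": 1.0},
    {"a1": 3.0, "a2": 0.2},
]


def add_label(rows, rule, pos="red", neg="green"):
    """Return copies of the rows with a 'label' key chosen by the rule.

    The two label values are arbitrary nominal values; they do not
    have to be +1 and -1.
    """
    return [dict(r, label=pos if rule(r) else neg) for r in rows]


# Assumed rule for illustration: an example is "red" when a1 + a2 > 2.0.
labeled = add_label(rows, lambda r: r["a1"] + r["a2"] > 2.0)

for row in labeled:
    print(row)
```

The point of the sketch is that the label column exists before any learner runs: it is the target the learner is asked to reproduce.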

  • Shubha
    Shubha New Altair Community Member
    Thanks for your reply Steffen.

    Yes, I started on the SVM technique two days ago. I was reading the SVM links, and all of them say to have a label attribute, the C class. (I tried with the R software, which did the same by taking a vector of binaries/categories.) And RapidMiner too asked for the label attribute.

    But I had done some 'cluster analysis' before (again a classification technique), which does not take any label attribute but creates the category (label) as the output, so I expected the same from SVM, which is again a classification technique (any comments here?). So I wanted to clear up my confusion.

    So in an SVM, one already knows which category each vector belongs to in the training set; the model learns from that and can then be applied to any test data (which doesn't have the label) to create labels for the test data. Is this correct?

    (One more stupid doubt: creating a binomial label attribute is based on some theory, right? For example, certain vectors belong to a Red box and the rest belong to a Green box. It is not just a random sequence of binary values.)

    Thanks for clearing the issues I have.

    Shubha
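
The train-then-apply workflow described in this post can be sketched in a few lines of plain Python. Note the assumptions: the learner here is a simple perceptron, used as a stand-in for an SVM because it fits in a few lines (it also learns a separating line from +1/-1 labels, but it does not maximize the margin as an SVM does). The data and the function names `train_perceptron` and `predict` are hypothetical.

```python
def train_perceptron(examples, labels, epochs=100):
    """Learn weights w and bias b from labeled examples (labels are +1 or -1)."""
    w = [0.0] * len(examples[0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(examples, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:  # misclassified (or on the boundary): update
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:  # every training example is classified correctly
            break
    return w, b


def predict(model, x):
    """Return +1 or -1 for an unlabeled example."""
    w, b = model
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score > 0 else -1


# Training set: the label column is an INPUT to the learner.
train_x = [[2.0, 3.0], [3.0, 3.5], [2.5, 4.0],
           [-2.0, -1.0], [-3.0, -2.5], [-1.5, -3.0]]
train_y = [1, 1, 1, -1, -1, -1]
model = train_perceptron(train_x, train_y)

# Test set: no labels given; the predicted label is the OUTPUT.
test_x = [[3.0, 2.0], [-2.5, -2.0]]
predicted = [predict(model, x) for x in test_x]
print(predicted)
```

So the +1/-1 column plays both roles, at different stages: it is supplied with the training data, and it is produced for new data once the model exists.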
  • haddock
    haddock New Altair Community Member
    With respect, I feel you should work through the examples and stop posting until you have. Why? Because you are getting the most basic things wrong. Surely you noticed that SVMs are supervised learners, and that the cluster operators are unsupervised. But you did not understand even that fundamental difference  ::)

    A quick Google on "Shubha Karanth" indicates that you, or your named double, work for computer consultancies. Should we send them the bill for our time? Or would it be better for you to read the manuals and do the exercises?
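
For readers unsure about the supervised/unsupervised distinction raised in this thread, here is a minimal sketch of the unsupervised side in plain Python. It assumes hypothetical one-dimensional data and a bare-bones k-means loop with k = 2; the function name `two_means` is made up for illustration and is not a RapidMiner operator. No label column is passed in anywhere.

```python
def two_means(points, iterations=10):
    """Split 1-D points into two groups with a basic k-means loop (k = 2)."""
    centers = [min(points), max(points)]  # deterministic starting centers
    for _ in range(iterations):
        groups = [[], []]
        for p in points:
            nearest = 0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1
            groups[nearest].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return [0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1 for p in points]


# Only attribute values go in; there is no label to learn from.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
cluster_ids = two_means(points)
print(cluster_ids)
```

The resulting cluster ids are arbitrary group numbers found from the data's structure alone. A supervised learner such as an SVM instead needs the categories to be given up front for the training examples, which is why RapidMiner asks for a label attribute.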

  • Shubha
    Shubha New Altair Community Member
    I sincerely apologize if this thread caused any problems or waste of time...

    BR, Shubha