Can I not use "Select Attributes" with an Amazon S3 file?

YB_Choi
YB_Choi New Altair Community Member
edited November 5 in Community Q&A
Hello, I'm a newbie using RapidMiner Studio 9.6.
(Sorry for my poor English.)

I downloaded the Titanic data set from Kaggle.com and analyzed it with RapidMiner.
There was no problem when I used the data stored on my local drive.

Then I uploaded the same data to AWS S3 and imported it.
I used "Read Amazon S3" and "Read CSV". "Read CSV" showed the right result, but "Select Attributes" displayed nothing.

*Please refer to the attached picture.
 - In A (local drive file), Select Attributes displayed the list: Age, Cabin, Embarked, ...
 - But in B (AWS S3), Select Attributes displayed nothing in the dialog box.
   There is no security or access problem. When I connected the "Read CSV" output port directly to the result port,
   it displayed the right output.

   PS. Please ignore the yellow warning icon on "Read Amazon S3". It is only there because I'm not
        connected to AWS right now.

Can I not use "Select Attributes" with an Amazon S3 file?
Or did I make a mistake?

Thank you. 



Answers

  • varunm1
    varunm1 New Altair Community Member
    edited March 2020
    Hello @YB_Choi

    You can use it. I guess it is not pre-fetching the attribute names to display in the Select Attributes operator. Can you set a breakpoint before the Select Attributes operator? To do this, right-click Select Attributes and choose "Breakpoint Before". Then run the process to see whether it displays the data. If it does, then it is fine; you just need to add the attribute names manually in the "Selected Attributes" section and press the PLUS symbol.
  • YB_Choi
    YB_Choi New Altair Community Member
    Thank you, varunm.

    I don't understand "add the attribute names manually in the "Selected Attributes" section and press the PLUS symbol."
    Can you explain it in more detail, please?

    Fig. 1. Read CSV from S3 succeeds


    Fig. 2. But Select Attributes displays nothing

  • YB_Choi
    YB_Choi New Altair Community Member
    Sorry, I got it.
    Press the green + button and type the column names manually!!
    (Oh my gosh. Really manually??)

    Thank you very much. 
  • varunm1
    varunm1 New Altair Community Member
    edited March 2020 Answer ✓
    @YB_Choi

    In general, RapidMiner propagates metadata (attribute names, etc.) through most operators. This is enabled by selecting PROCESS --> VALIDATE AUTOMATICALLY. Sometimes it cannot, and I think this is one of those cases. I am tagging @David_A and @mschmitz in case they have more information for you.
  • David_A
    David_A New Altair Community Member

    As Varun has pointed out, RapidMiner normally propagates the metadata and therefore knows the attribute names, so you can, for example, add them more easily in the attribute selection.

    In your case the data is read from a remote S3 bucket, so until the process runs, there is no way to know the layout of the data. What you can do (if the data in the S3 bucket does not change regularly) is run the first part of the process, store the result of the Read CSV operator in your RapidMiner repository, and retrieve it in another process for further work. Then you also have the full metadata and the more convenient user interface.

    Best,
    David
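
For readers who think in code, the caching idea above can be sketched outside RapidMiner. This is only a minimal Python illustration, not RapidMiner code: `fetch_remote_csv` is a hypothetical stand-in for the S3 download (it returns a small hard-coded Titanic-like sample), and the cache file name is an assumption. It shows that the column names only become known once the remote data has been read, and that a stored local copy makes them available to later steps without contacting the remote source again.

```python
import csv
import tempfile
from pathlib import Path


def fetch_remote_csv():
    """Hypothetical stand-in for the S3 download; returns sample CSV text."""
    return "PassengerId,Survived,Age,Cabin,Embarked\n1,0,22,,S\n2,1,38,C85,C\n"


def cache_locally(cache_path):
    """Run the 'remote' read once and store the result as a local file."""
    cache_path.write_text(fetch_remote_csv(), encoding="utf-8")


def column_names(cache_path):
    """Read only the header row of the cached copy."""
    with cache_path.open(newline="", encoding="utf-8") as handle:
        return next(csv.reader(handle))


def select_columns(cache_path, wanted):
    """Keep only the wanted columns from every row of the cached copy."""
    with cache_path.open(newline="", encoding="utf-8") as handle:
        return [{name: row[name] for name in wanted}
                for row in csv.DictReader(handle)]


# Step 1: fetch once and store (analogous to storing the Read CSV result).
cache_file = Path(tempfile.gettempdir()) / "titanic_cache.csv"  # assumed name
cache_locally(cache_file)

# Step 2: later work uses the stored copy, so the column names are known
# up front (analogous to retrieving the stored data in another process).
names = column_names(cache_file)
subset = select_columns(cache_file, ["Age", "Embarked"])
print(names)
print(subset)
```

As in the RapidMiner workaround, this only makes sense when the remote data does not change often; otherwise the cached copy and its column list can go stale.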
  • YB_Choi
    YB_Choi New Altair Community Member
    edited March 2020
    David_A 
    Now I'm doing that. ^^  Thank you.