
"Error messages using stringtextinput"

User: "rdmckinney"
New Altair Community Member
Updated by Jocelyn
Here is my code:
<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSource" class="ExampleSource">
        <parameter key="attributes" value="C:\Documents and Settings\rkenney\My Documents\rm_workspace\Comments09.aml"/>
    </operator>
    <operator name="StringTextInput" class="StringTextInput" expanded="yes">
        <parameter key="remove_original_attributes" value="true"/>
        <parameter key="default_content_language" value="english"/>
        <list key="namespaces">
        </list>
        <operator name="StringTokenizer" class="StringTokenizer">
        </operator>
        <operator name="EnglishStopwordFilter" class="EnglishStopwordFilter">
        </operator>
        <operator name="TokenLengthFilter" class="TokenLengthFilter">
            <parameter key="min_chars" value="3"/>
        </operator>
        <operator name="PorterStemmer" class="PorterStemmer">
        </operator>
    </operator>
    <operator name="SVDReduction" class="SVDReduction">
        <parameter key="keep_example_set" value="true"/>
        <parameter key="return_preprocessing_model" value="true"/>
        <parameter key="dimensions" value="15"/>
    </operator>
    <operator name="EMClustering" class="EMClustering">
        <parameter key="k" value="5"/>
    </operator>
    <operator name="ExcelExampleSetWriter" class="ExcelExampleSetWriter">
        <parameter key="excel_file" value="C:\Projects\Memb Sat Survey\2009\Data\RapidMinerOutput\RMClusters.xls"/>
    </operator>
</operator>

While the StringTextInput operator runs, I get a warning (not an error) for each one of my text documents. For example, this one:

P Jul 14, 2009 3:03:44 PM: [Warning] StringTextInput: File C:\Program Files\Rapid-I\RapidMiner\RE BILLING; I GET THE FORM SHOWING WHAT CC HAS PAID, BUT UNDER  PATIENT RESPONSIBILITY,  IT SHOWS 0.00 WHICH ISN'T TRUE. I DON'T RECALL EVER SEEING ONE THAT HAD A FIGURE.  not found. Assuming the text is directly encoded as document source...
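Reading the warning literally, the operator seems to try each string as a file name first and, when no such file exists, fall back to treating the string as the document text itself. Here is a minimal Python sketch of that kind of fallback, only to illustrate my reading of the message. The function name `resolve_document_source` and the `base_dir` parameter are hypothetical and are not part of RapidMiner.

```python
from pathlib import Path
from typing import Tuple


def resolve_document_source(value: str, base_dir: Path) -> Tuple[str, str]:
    """Return ("file", contents) if `value` names an existing file under
    `base_dir`; otherwise return ("text", value) unchanged.

    Hypothetical illustration of the fallback described in the warning,
    not RapidMiner's actual implementation.
    """
    candidate = base_dir / value
    try:
        if candidate.is_file():
            return "file", candidate.read_text(encoding="utf-8")
    except (OSError, ValueError):
        # Strings that are not valid paths (too long, odd characters)
        # are treated the same as "file not found".
        pass
    # No such file: assume the string is the document text itself.
    return "text", value
```

Under this reading, a survey comment such as "RE BILLING; I GET THE FORM..." is never a real file name, so every example would take the fallback branch and produce one warning.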

My input file has about 300 examples with three columns. Col 1 is the comments from surveys, set as a string variable; col 2 is the member number, set as an ID variable, which I use to attach demographic data; and col 3 is a grouping variable, set as a label. The part of the warning in all caps is the actual text that I want to analyze. The output file looks fine, but I'm worried about the warnings. What do you think?
