"text filter"

helen
helen New Altair Community Member
edited November 5 in Community Q&A
Could somebody help me with a text filter? Here is my data loaded in RapidMiner.

key           description
1               loan to help me to start up a new company
2               loan to help to make a new house
3               loan to help me to pay down credit card debt
4               loan to help me to pay down credit card debt
.                 .................
.                 ................
25             loan to help me to pay down credit card debt
.                 ...................
n               ..................

I'd like to use the attribute filter to get the rows whose description contains "credit card". The result should look like this:
key           description
3               loan to help me to pay down credit card debt
4               loan to help me to pay down credit card debt
25             loan to help me to pay down credit card debt

Should I use the attribute filter from the preprocessing operators on the Root Process? Which category should I choose for condition_class? And should I type "credit card" in the parameter_string field?

Many thanks in advance.

Answers

  • land
    land New Altair Community Member
    Hi,
    if I understood you correctly, you want to filter your examples (= rows) and not the attributes (= columns). Hence you need the example filter, not the attribute filter.

    Here's an example:

    <operator name="Root" class="Process" expanded="yes">
        <operator name="ExampleFilter" class="ExampleFilter">
            <parameter key="condition_class" value="attribute_value_filter"/>
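            <!-- Keep rows whose "description" matches the regular expression; the .* on both sides allow any text before and after "credit card" -->
            <parameter key="parameter_string" value="description=.*credit card.*"/>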
        </operator>
    </operator>
    If you need help with regular expressions, you could take a look at the tutorial or just use Google.

    Greetings,
      Sebastian
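
The `.*` on both sides of `credit card` matters if, as the pattern in the example above suggests, the filter compares the regular expression against the whole attribute value instead of searching inside it. That whole-value behaviour is an assumption here, not something confirmed in this thread. A minimal Python sketch of the idea, using `re.fullmatch` as a stand-in; the rows and the helper name `filter_rows` are made up for illustration, and this is not RapidMiner code:

```python
import re

# Illustrative rows (key, description), modelled on the data in the question.
rows = [
    (1, "loan to help me to start up a new company"),
    (2, "loan to help to make a new house"),
    (3, "loan to help me to pay down credit card debt"),
    (4, "loan to help me to pay down credit card debt"),
    (25, "loan to help me to pay down credit card debt"),
]


def filter_rows(rows, pattern):
    """Keep rows whose description matches the pattern as a whole (hypothetical helper)."""
    regex = re.compile(pattern)
    return [(key, text) for key, text in rows if regex.fullmatch(text)]


# With .* on both sides, any text may surround "credit card".
matching_keys = [key for key, _ in filter_rows(rows, ".*credit card.*")]

# Without the wildcards, a whole-value match only accepts the exact text "credit card".
no_wildcards = filter_rows(rows, "credit card")

print(matching_keys)  # [3, 4, 25]
print(no_wildcards)   # []
```

Under that assumption, typing only `credit card` into parameter_string would return no rows, which is why the pattern is wrapped in `.*`.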
  • helen
    helen New Altair Community Member
    Highly appreciated! Thanks a lot, Sebastian  ;D
  • helen
    helen New Altair Community Member
    Could I ask a further question on this? Sometimes I need to find words in the description that don't stand next to each other. E.g. if I want to filter for "pay debt" in the description, it should give me the same output as above:
    key          description
    3              loan to help me to pay down credit card debt
    4              loan to help me to pay down credit card debt
    25            loan to help me to pay down credit card debt
    What should I type in the code? value="description=.*pay. debt*" ???
    Thanks a lot in advance
  • Ryujakk
    Ryujakk New Altair Community Member
    Hello!

    That sounds like you need some more... regular expressions! I learned most of it here: http://www.regular-expressions.info/

    For your particular example, I'd use this:

    <operator name="ExampleFilter" class="ExampleFilter">
      <parameter key="condition_class" value="attribute_value_filter"/>
      <parameter key="parameter_string" value="description = .*pay.*debt.*"/>
    </operator>
    - R
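
To see why the pattern from the question fails while `.*pay.*debt.*` works, here is a small Python check. It is only a sketch: `re.fullmatch` stands in for the filter on the assumption that the whole value has to match, and the sample strings and the lookahead variant at the end are illustrations that do not come from the thread.

```python
import re

description = "loan to help me to pay down credit card debt"

# Attempt from the question: "." is exactly one character and "t*" is zero or
# more "t", so the text would need "pay", one character, then " deb".
attempt = re.fullmatch(r".*pay. debt*", description) is not None

# Suggested pattern: ".*" between the two words allows any text in between.
suggested = re.fullmatch(r".*pay.*debt.*", description) is not None

# The suggested pattern is order-sensitive: "debt" has to come after "pay".
reversed_order = re.fullmatch(r".*pay.*debt.*", "debt I have to pay") is not None

# Two lookaheads accept both words in either order.
either_order = re.fullmatch(r"(?=.*pay)(?=.*debt).*", "debt I have to pay") is not None

print(attempt, suggested, reversed_order, either_order)  # False True False True
```

If the order of the two words can vary in the data, the lookahead form may be the safer choice; whether it behaves the same inside the filter should be tried on a few rows first.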
  • land
    land New Altair Community Member
    Hi,
    I don't know whether the website mentioned above offers something similar, but I have always found this one helpful:
    http://www.fileformat.info/tool/regex.htm
    It lets you test regular expressions on arbitrary text.

    By the way: the new RapidMiner 5 contains a builder for regular expressions. For all of us who always forget those damned group names...

    Greetings,
      Sebastian