How can I extract a sentence that contains a keyword using operators?
Are you referring to tokenization? The Tokenize operator lets you tokenize based on "Linguistic Sentences." Just select that mode in the parameter window.
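For readers who want to see the idea outside RapidMiner, here is a minimal Python sketch of "split into sentences, then keep the ones with the keyword." It is only a rough stand-in: the naive split on `.`, `!` or `?` is an assumption and not how the Tokenize operator detects linguistic sentences, and the function name `sentences_with_keyword` is made up for illustration.

```python
import re


def sentences_with_keyword(text, keyword):
    """Return the sentences of `text` that contain `keyword` as a whole word."""
    # Naive sentence split (assumption): break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    # Match the keyword as a whole word, ignoring case.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]


if __name__ == "__main__":
    sample = "I like pears. I ate an apple today. Bananas are fine."
    print(sentences_with_keyword(sample, "apple"))
```

Abbreviations and decimal numbers will confuse a split this simple, which is one reason to prefer a real sentence tokenizer for anything beyond a quick check.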
Maybe he's referring to the "extract information" operator?
You have to place it inside of a "process documents" operator, to feed it your documents. Once there, select your extraction options and run. Just make sure that "add meta information" is checked. Here's a sample process.
<?xml version="1.0" encoding="UTF-8"?>
<process version="7.2.000">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="7.2.000" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="generate_nominal_data" compatibility="7.2.000" expanded="true" height="68" name="Generate Nominal Data" width="90" x="112" y="85"/>
      <operator activated="true" class="nominal_to_text" compatibility="7.2.000" expanded="true" height="82" name="Nominal to Text" width="90" x="246" y="85"/>
      <operator activated="true" class="text:process_document_from_data" compatibility="7.2.000" expanded="true" height="82" name="Process Documents from Data" width="90" x="380" y="85">
        <parameter key="create_word_vector" value="false"/>
        <parameter key="keep_text" value="true"/>
        <list key="specify_weights"/>
        <process expanded="true">
          <operator activated="true" class="text:extract_information" compatibility="7.2.000" expanded="true" height="68" name="Extract Information" width="90" x="179" y="34">
            <list key="string_machting_queries">
              <parameter key="test" value="value1.value3"/>
            </list>
            <list key="regular_expression_queries"/>
            <list key="regular_region_queries"/>
            <list key="xpath_queries"/>
            <list key="namespaces"/>
            <list key="index_queries"/>
            <list key="jsonpath_queries"/>
            <description align="center" color="green" colored="true" width="126">Define attribute names here</description>
          </operator>
          <connect from_port="document" to_op="Extract Information" to_port="document"/>
          <connect from_op="Extract Information" from_port="document" to_port="document 1"/>
          <portSpacing port="source_document" spacing="0"/>
          <portSpacing port="sink_document 1" spacing="0"/>
          <portSpacing port="sink_document 2" spacing="0"/>
        </process>
      </operator>
      <connect from_op="Generate Nominal Data" from_port="output" to_op="Nominal to Text" to_port="example set input"/>
      <connect from_op="Nominal to Text" from_port="example set output" to_op="Process Documents from Data" to_port="example set"/>
      <connect from_op="Process Documents from Data" from_port="example set" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
Hope that helps!
Hi,
Thanks, friends, for your input. I actually used the Cut Document operator, specified the regular expression "([^.]*?apple[^.]*\.)" in it, and was able to extract the sentence.
Thanks and Regards,
Sachin
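To see what that regular expression picks up, here is a small Python sketch that applies the same pattern to plain strings. It assumes Python's `re` module treats this particular pattern the same way as the regex engine behind Cut Document; the helper name `extract_sentences` and the sample texts are made up for illustration.

```python
import re

# The pattern from the post above: a run of non-period characters,
# the keyword "apple", more non-period characters, then the closing period.
SENTENCE_WITH_APPLE = re.compile(r"([^.]*?apple[^.]*\.)")


def extract_sentences(text):
    """Return every period-terminated chunk of `text` that contains 'apple'."""
    # findall returns the single capture group for each match;
    # strip() drops the space left over from the previous sentence.
    return [match.strip() for match in SENTENCE_WITH_APPLE.findall(text)]


if __name__ == "__main__":
    print(extract_sentences("I like pears. I ate an apple today. Bananas are fine."))
    # A period inside the sentence (here a version number) cuts the match short.
    print(extract_sentences("Version 1.5 of the apple picker is out."))
```

Because the pattern treats every period as a sentence boundary, decimals and abbreviations truncate the result, and sentences ending in "?" or "!" are not matched as written. It also matches "apple" inside longer words such as "pineapple".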
Leaving this here for future users: here is a KB article describing other techniques.
http://community.rapidminer.com/t5/Text-Analytics/Splitting-text-into-sentences/ta-p/31845