How use Read XML Operator
Hello.
I have troubles to use Read XML Operator.
I read this xml file:
<?xml version="1.0" encoding="UTF-8"?>
<ROOT>
<RECORD>
<ID>id1</ID>
<TEXT name="text1">
<KEYWORD>kw1</KEYWORD>
<KEYWORD>kw2</KEYWORD>
<KEYWORD>kw3</KEYWORD>
</TEXT>
</RECORD>
<RECORD>
<ID>ID2</ID>
<TEXT name="text2">
<KEYWORD>kw4</KEYWORD>
<KEYWORD>kw5</KEYWORD>
<KEYWORD>kw6</KEYWORD>
</TEXT>
</RECORD>
</ROOT>
and want to get this table:
ID | name | KEYWORD |
id1 | text1 | kw1 |
id1 | text1 | kw2 |
id1 | text1 | kw3 |
ID2 | text2 | kw4 |
ID2 | text2 | kw5 |
ID2 | text2 | kw6 |
How I can do it?
Xml version of process is here:
<?xml version="1.0" encoding="UTF-8"?><process version="7.2.003">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="6.0.002" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="advanced_file_connectors:read_xml" compatibility="7.2.003" expanded="true" height="68" name="Read XML (2)" width="90" x="45" y="136">
<parameter key="file" value="D:\1\new 4.xml"/>
<parameter key="xpath_for_examples" value="//ROOT/RECORD"/>
<enumeration key="xpaths_for_attributes">
<parameter key="xpath_for_attribute" value="TEXT[1]/attribute::name"/>
<parameter key="xpath_for_attribute" value="TEXT[1]/KEYWORD[1]/text()"/>
</enumeration>
<list key="namespaces"/>
<parameter key="use_default_namespace" value="false"/>
<list key="annotations"/>
<list key="data_set_meta_data_information">
<parameter key="0" value="TEXT[1]/attribute::name.true.attribute_value.attribute"/>
<parameter key="1" value="TEXT[1]/KEYWORD[1]/text().true.attribute_value.attribute"/>
</list>
</operator>
<operator activated="true" class="advanced_file_connectors:read_xml" compatibility="7.2.003" expanded="true" height="68" name="Read XML" width="90" x="45" y="34">
<parameter key="file" value="D:\1\new 4.xml"/>
<parameter key="xpath_for_examples" value="//ROOT/RECORD"/>
<enumeration key="xpaths_for_attributes">
<parameter key="xpath_for_attribute" value="ID[1]/text()"/>
<parameter key="xpath_for_attribute" value="TEXT[1]/attribute::name"/>
</enumeration>
<list key="namespaces"/>
<parameter key="use_default_namespace" value="false"/>
<list key="annotations"/>
<list key="data_set_meta_data_information">
<parameter key="0" value="ID[1]/text().true.attribute_value.attribute"/>
<parameter key="1" value="TEXT[1]/attribute::name.true.attribute_value.attribute"/>
</list>
</operator>
<operator activated="true" class="join" compatibility="7.2.003" expanded="true" height="82" name="Join" width="90" x="313" y="34">
<parameter key="use_id_attribute_as_key" value="false"/>
<list key="key_attributes">
<parameter key="TEXT[1]/attribute::name" value="TEXT[1]/attribute::name"/>
</list>
</operator>
<connect from_op="Read XML (2)" from_port="output" to_op="Join" to_port="right"/>
<connect from_op="Read XML" from_port="output" to_op="Join" to_port="left"/>
<connect from_op="Join" from_port="join" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
Find more posts tagged with
Sort by:
1 - 5 of
51
Hi, yyhuang
Thank you! It's really helpful.
But I made the xml file is more complex, and again difficulties arose.
I added some levels of nesting.
Because of this, an error occurs in the de-pivot operator
May be I need use more then one Read xml operator and then join tables? When I try this way I haven't any attribute make join correctly.
Here is new xml:
<?xml version="1.0" encoding="windows-1251"?>
<message>
<datetime>
<day>20160101</day>
</datetime>
<area>
<name>AreaName1</name>
<measuringpoint name="Device1">
<measuringchannel desc="active">
<period start="0000" end="0030">
<value>3456</value>
</period>
<period start="0030" end="0100">
<value>1287</value>
</period>
<period start="0100" end="0130">
<value>4565</value>
</period>
</measuringchannel>
<measuringchannel desc="passive">
<period start="0000" end="0030">
<value>1234</value>
</period>
<period start="0030" end="0100">
<value>1345</value>
</period>
<period start="0100" end="0130">
<value>1232</value>
</period>
</measuringchannel>
</measuringpoint>
</area>
<area>
<name>AreaName2</name>
<measuringpoint name="Device2">
<measuringchannel desc="active">
<period start="0000" end="0030">
<value>1343</value>
</period>
<period start="0030" end="0100">
<value>6745</value>
</period>
<period start="0100" end="0130">
<value>8767</value>
</period>
</measuringchannel>
<measuringchannel desc="passive">
<period start="0000" end="0030">
<value>5455</value>
</period>
<period start="0030" end="0100">
<value>2345</value>
</period>
<period start="0100" end="0130">
<value>1234</value>
</period>
</measuringchannel>
</measuringpoint>
</area>
</message>
this table I need to make:
day | name | name2 | desc | start | end | value |
20160101 | AreaName1 | Device1 | active | 0 | 30 | 3456 |
20160101 | AreaName1 | Device1 | active | 30 | 100 | 1287 |
20160101 | AreaName1 | Device1 | active | 100 | 130 | 4565 |
20160101 | AreaName1 | Device1 | passive | 0 | 30 | 1234 |
20160101 | AreaName1 | Device1 | passive | 30 | 100 | 1345 |
20160101 | AreaName1 | Device1 | passive | 100 | 130 | 1232 |
20160101 | AreaName2 | Device2 | active | 0 | 30 | 1343 |
20160101 | AreaName2 | Device2 | active | 30 | 100 | 6745 |
20160101 | AreaName2 | Device2 | active | 100 | 130 | 8767 |
20160101 | AreaName2 | Device2 | passive | 0 | 30 | 5455 |
20160101 | AreaName2 | Device2 | passive | 30 | 100 | 2345 |
20160101 | AreaName2 | Device2 | passive | 100 | 130 | 1234 |
This proccess I made:
<?xml version="1.0" encoding="UTF-8"?><process version="7.2.003">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="7.2.003" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="advanced_file_connectors:read_xml" compatibility="7.2.003" expanded="true" height="68" name="Read XML" width="90" x="45" y="34">
<parameter key="file" value="D:\1\new 10.xml"/>
<parameter key="xpath_for_examples" value="//message/area"/>
<enumeration key="xpaths_for_attributes">
<parameter key="xpath_for_attribute" value="name[1]/text()"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/attribute::name"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[1]/attribute::desc"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[1]/period[1]/attribute::end"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[1]/period[1]/attribute::start"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[1]/period[1]/value[1]/text()"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[1]/period[2]/attribute::end"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[1]/period[2]/attribute::start"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[1]/period[2]/value[1]/text()"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[1]/period[3]/attribute::end"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[1]/period[3]/attribute::start"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[1]/period[3]/value[1]/text()"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[2]/attribute::desc"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[2]/period[1]/attribute::end"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[2]/period[1]/attribute::start"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[2]/period[1]/value[1]/text()"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[2]/period[2]/attribute::end"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[2]/period[2]/attribute::start"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[2]/period[2]/value[1]/text()"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[2]/period[3]/attribute::end"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[2]/period[3]/attribute::start"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[2]/period[3]/value[1]/text()"/>
</enumeration>
<list key="namespaces"/>
<parameter key="use_default_namespace" value="false"/>
<list key="annotations"/>
<list key="data_set_meta_data_information">
<parameter key="0" value="name[1]/text().true.attribute_value.attribute"/>
<parameter key="1" value="measuringpoint[1]/attribute::name.true.attribute_value.attribute"/>
<parameter key="2" value="measuringpoint[1]/measuringchannel[1]/attribute::desc.true.attribute_value.attribute"/>
<parameter key="3" value="measuringpoint[1]/measuringchannel[1]/period[1]/attribute::end.true.attribute_value.attribute"/>
<parameter key="4" value="measuringpoint[1]/measuringchannel[1]/period[1]/attribute::start.true.attribute_value.attribute"/>
<parameter key="5" value="measuringpoint[1]/measuringchannel[1]/period[1]/value[1]/text().true.attribute_value.attribute"/>
<parameter key="6" value="measuringpoint[1]/measuringchannel[1]/period[2]/attribute::end.true.attribute_value.attribute"/>
<parameter key="7" value="measuringpoint[1]/measuringchannel[1]/period[2]/attribute::start.true.attribute_value.attribute"/>
<parameter key="8" value="measuringpoint[1]/measuringchannel[1]/period[2]/value[1]/text().true.attribute_value.attribute"/>
<parameter key="9" value="measuringpoint[1]/measuringchannel[1]/period[3]/attribute::end.true.attribute_value.attribute"/>
<parameter key="10" value="measuringpoint[1]/measuringchannel[1]/period[3]/attribute::start.true.attribute_value.attribute"/>
<parameter key="11" value="measuringpoint[1]/measuringchannel[1]/period[3]/value[1]/text().true.attribute_value.attribute"/>
<parameter key="12" value="measuringpoint[1]/measuringchannel[2]/attribute::desc.true.attribute_value.attribute"/>
<parameter key="13" value="measuringpoint[1]/measuringchannel[2]/period[1]/attribute::end.true.attribute_value.attribute"/>
<parameter key="14" value="measuringpoint[1]/measuringchannel[2]/period[1]/attribute::start.true.attribute_value.attribute"/>
<parameter key="15" value="measuringpoint[1]/measuringchannel[2]/period[1]/value[1]/text().true.attribute_value.attribute"/>
<parameter key="16" value="measuringpoint[1]/measuringchannel[2]/period[2]/attribute::end.true.attribute_value.attribute"/>
<parameter key="17" value="measuringpoint[1]/measuringchannel[2]/period[2]/attribute::start.true.attribute_value.attribute"/>
<parameter key="18" value="measuringpoint[1]/measuringchannel[2]/period[2]/value[1]/text().true.attribute_value.attribute"/>
<parameter key="19" value="measuringpoint[1]/measuringchannel[2]/period[3]/attribute::end.true.attribute_value.attribute"/>
<parameter key="20" value="measuringpoint[1]/measuringchannel[2]/period[3]/attribute::start.true.attribute_value.attribute"/>
<parameter key="21" value="measuringpoint[1]/measuringchannel[2]/period[3]/value[1]/text().true.attribute_value.attribute"/>
</list>
</operator>
<operator activated="true" class="rename_by_replacing" compatibility="7.2.003" expanded="true" height="82" name="Rename by Replacing (2)" width="90" x="246" y="34">
<parameter key="replace_what" value="/text\(\)|measuringpoint\[1\]|/|attribute::"/>
<description align="center" color="transparent" colored="false" width="126">dity up the column names</description>
</operator>
<operator activated="true" class="de_pivot" compatibility="7.2.003" expanded="true" height="82" name="De-Pivot" width="90" x="447" y="34">
<list key="attribute_name">
<parameter key="name2" value="measuringchannel\[.*]desc"/>
<parameter key="start" value="measuringchannel\[*.]period\[*.]start"/>
<parameter key="end" value="measuringchannel\[.*]period\[.*]start"/>
<parameter key="value" value="measuringchannel\[.*]period\[.*]value\[1]"/>
</list>
<parameter key="index_attribute" value="id1"/>
</operator>
<operator activated="true" class="select_attributes" compatibility="7.2.003" expanded="true" height="82" name="Select Attributes (2)" width="90" x="715" y="34">
<parameter key="attribute_filter_type" value="subset"/>
<parameter key="attributes" value="id1|id2|id3"/>
<parameter key="invert_selection" value="true"/>
</operator>
<connect from_op="Read XML" from_port="output" to_op="Rename by Replacing (2)" to_port="example set input"/>
<connect from_op="Rename by Replacing (2)" from_port="example set output" to_op="De-Pivot" to_port="example set input"/>
<connect from_op="De-Pivot" from_port="example set output" to_op="Select Attributes (2)" to_port="example set input"/>
<connect from_op="Select Attributes (2)" from_port="example set output" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<description align="center" color="yellow" colored="false" height="84" resized="true" width="126" x="31" y="120">May be I need use more then one Read XML operator to load?</description>
<description align="center" color="yellow" colored="false" height="120" resized="false" width="126" x="438" y="131">Somthing wrong here<br>What if in file edded measuringpoint name=&quot;Device3&quot;?</description>
</process>
</operator>
</process>
Thanks for giving me the new XML data. I fixed some of the function expressions for 'de-pivot'. It is always tricky to make the regular expressions work for that.
<?xml version="1.0" encoding="UTF-8"?><process version="7.3.000">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="7.3.000" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="advanced_file_connectors:read_xml" compatibility="7.3.000" expanded="true" height="68" name="Read XML" width="90" x="45" y="34">
<parameter key="file" value="C:\Users\YuanyuanHuang\Documents\RMCommunity\new 10.xml"/>
<parameter key="xpath_for_examples" value="//message/area"/>
<enumeration key="xpaths_for_attributes">
<parameter key="xpath_for_attribute" value="name[1]/text()"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/attribute::name"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[1]/attribute::desc"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[1]/period[1]/attribute::end"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[1]/period[1]/attribute::start"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[1]/period[1]/value[1]/text()"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[1]/period[2]/attribute::end"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[1]/period[2]/attribute::start"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[1]/period[2]/value[1]/text()"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[1]/period[3]/attribute::end"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[1]/period[3]/attribute::start"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[1]/period[3]/value[1]/text()"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[2]/attribute::desc"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[2]/period[1]/attribute::end"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[2]/period[1]/attribute::start"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[2]/period[1]/value[1]/text()"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[2]/period[2]/attribute::end"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[2]/period[2]/attribute::start"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[2]/period[2]/value[1]/text()"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[2]/period[3]/attribute::end"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[2]/period[3]/attribute::start"/>
<parameter key="xpath_for_attribute" value="measuringpoint[1]/measuringchannel[2]/period[3]/value[1]/text()"/>
</enumeration>
<list key="namespaces"/>
<parameter key="use_default_namespace" value="false"/>
<list key="annotations"/>
<list key="data_set_meta_data_information">
<parameter key="0" value="name[1]/text().true.attribute_value.attribute"/>
<parameter key="1" value="measuringpoint[1]/attribute::name.true.attribute_value.attribute"/>
<parameter key="2" value="measuringpoint[1]/measuringchannel[1]/attribute::desc.true.attribute_value.attribute"/>
<parameter key="3" value="measuringpoint[1]/measuringchannel[1]/period[1]/attribute::end.true.attribute_value.attribute"/>
<parameter key="4" value="measuringpoint[1]/measuringchannel[1]/period[1]/attribute::start.true.attribute_value.attribute"/>
<parameter key="5" value="measuringpoint[1]/measuringchannel[1]/period[1]/value[1]/text().true.attribute_value.attribute"/>
<parameter key="6" value="measuringpoint[1]/measuringchannel[1]/period[2]/attribute::end.true.attribute_value.attribute"/>
<parameter key="7" value="measuringpoint[1]/measuringchannel[1]/period[2]/attribute::start.true.attribute_value.attribute"/>
<parameter key="8" value="measuringpoint[1]/measuringchannel[1]/period[2]/value[1]/text().true.attribute_value.attribute"/>
<parameter key="9" value="measuringpoint[1]/measuringchannel[1]/period[3]/attribute::end.true.attribute_value.attribute"/>
<parameter key="10" value="measuringpoint[1]/measuringchannel[1]/period[3]/attribute::start.true.attribute_value.attribute"/>
<parameter key="11" value="measuringpoint[1]/measuringchannel[1]/period[3]/value[1]/text().true.attribute_value.attribute"/>
<parameter key="12" value="measuringpoint[1]/measuringchannel[2]/attribute::desc.true.attribute_value.attribute"/>
<parameter key="13" value="measuringpoint[1]/measuringchannel[2]/period[1]/attribute::end.true.attribute_value.attribute"/>
<parameter key="14" value="measuringpoint[1]/measuringchannel[2]/period[1]/attribute::start.true.attribute_value.attribute"/>
<parameter key="15" value="measuringpoint[1]/measuringchannel[2]/period[1]/value[1]/text().true.attribute_value.attribute"/>
<parameter key="16" value="measuringpoint[1]/measuringchannel[2]/period[2]/attribute::end.true.attribute_value.attribute"/>
<parameter key="17" value="measuringpoint[1]/measuringchannel[2]/period[2]/attribute::start.true.attribute_value.attribute"/>
<parameter key="18" value="measuringpoint[1]/measuringchannel[2]/period[2]/value[1]/text().true.attribute_value.attribute"/>
<parameter key="19" value="measuringpoint[1]/measuringchannel[2]/period[3]/attribute::end.true.attribute_value.attribute"/>
<parameter key="20" value="measuringpoint[1]/measuringchannel[2]/period[3]/attribute::start.true.attribute_value.attribute"/>
<parameter key="21" value="measuringpoint[1]/measuringchannel[2]/period[3]/value[1]/text().true.attribute_value.attribute"/>
</list>
</operator>
<operator activated="true" class="rename_by_replacing" compatibility="7.3.000" expanded="true" height="82" name="Rename by Replacing (2)" width="90" x="179" y="34">
<parameter key="replace_what" value="/text\(\)|measuringpoint\[1\]|/|attribute::"/>
<description align="center" color="transparent" colored="false" width="126">dity up the column names</description>
</operator>
<operator activated="true" class="de_pivot" compatibility="7.3.000" expanded="true" height="82" name="De-Pivot" width="90" x="313" y="34">
<list key="attribute_name">
<parameter key="start1" value="measuringchannel\[\d\]period\[1\]start"/>
<parameter key="end1" value="measuringchannel\[\d\]period\[1\]end"/>
<parameter key="value1" value="measuringchannel\[\d\]period\[1\]value\[1\]"/>
<parameter key="desc" value="measuringchannel\[\d\]desc"/>
<parameter key="start2" value="measuringchannel\[\d\]period\[2\]start"/>
<parameter key="end2" value="measuringchannel\[\d\]period\[2\]end"/>
<parameter key="value2" value="measuringchannel\[\d\]period\[2\]value\[1\]"/>
<parameter key="start3" value="measuringchannel\[\d\]period\[3\]start"/>
<parameter key="end3" value="measuringchannel\[\d\]period\[3\]end"/>
<parameter key="value3" value="measuringchannel\[\d\]period\[3\]value\[1\]"/>
</list>
<parameter key="index_attribute" value="channel_id"/>
<description align="center" color="transparent" colored="false" width="126">first de-pivot, focus on the first digit for channel[d]</description>
</operator>
<operator activated="true" class="de_pivot" compatibility="7.3.000" expanded="true" height="82" name="De-Pivot (2)" width="90" x="514" y="34">
<list key="attribute_name">
<parameter key="start" value="start\d"/>
<parameter key="end" value="end\d"/>
<parameter key="value" value="value\d"/>
</list>
<parameter key="index_attribute" value="period_id"/>
<description align="center" color="transparent" colored="false" width="126">second de-pivot, focus on the second digit for period[d]</description>
</operator>
<connect from_op="Read XML" from_port="output" to_op="Rename by Replacing (2)" to_port="example set input"/>
<connect from_op="Rename by Replacing (2)" from_port="example set output" to_op="De-Pivot" to_port="example set input"/>
<connect from_op="De-Pivot" from_port="example set output" to_op="De-Pivot (2)" to_port="example set input"/>
<connect from_op="De-Pivot (2)" from_port="example set output" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
Happy RapidMinining, :smileyvery-happy:
YY
Hi dimons91,
Have you tried to use the Import Wizard for Read XML? It will generate the xpaths for attributes automatically. All you need is to select the beans in the step 4 of configuration wizard for your wanted attributes.
I have the sample process here for you
HTH,
YY