"[SOLVED] How to create documentation for extension"
StaryVena
New Altair Community Member
Hello,
how can I create documentation for the operators in our extension, including a tutorial process? I know the equivalent of com.rapidminer.resources.i18n.OperatorsCoreDocumentation.xml and wiki.rapid-i.com, but no tutorial process is defined there. Also, do you have an automatic generator for the wiki pages?
Thanks for any tips.
Best,
Vaclav
Answers
Hi Vaclav,
yes, we have changed our mechanism for operator documentation, and it is currently not documented very well.
Here is how it works for extensions:
First, create a folder named after your extension's namespace in your resources folder (the namespace can be found in the build.xml property extension.namespace). For the Text Mining extension, this is a folder called 'text'.
Then create an XML file for each operator you want to document. The schema of the XML file is defined here: http://rapid-i.com/schemas/documentation/reference/1.0/documentation.xsd
The XML file for an operator must be named after the key used to register the operator in the Operators.xml file, and it must be placed in a folder hierarchy that mirrors the hierarchy of the operator tree.
For example, the create_document operator from the Text Mining extension needs an XML file called create_document.xml, placed at resources/text/Text Processing/create_document.xml.
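The naming rule can be sketched as a small helper. This is only an illustration, not part of RapidMiner: the function name doc_file_path and its arguments are hypothetical, and it assumes the operator tree hierarchy is passed as a list of folder names.

```python
from pathlib import PurePosixPath


def doc_file_path(namespace, group_path, operator_key, resources_root="resources"):
    """Return the expected location of an operator's documentation XML.

    namespace    -- value of extension.namespace in build.xml, e.g. "text"
    group_path   -- folder names mirroring the operator tree, e.g. ["Text Processing"]
    operator_key -- key used to register the operator in Operators.xml
    """
    # <resources>/<namespace>/<operator tree folders>/<operator key>.xml
    return PurePosixPath(resources_root, namespace, *group_path, operator_key + ".xml")


# The create_document example from the Text Mining extension:
print(doc_file_path("text", ["Text Processing"], "create_document"))
```

For the Text Mining example this prints resources/text/Text Processing/create_document.xml.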
I hope this helps,
Nils
Hi Nils,
yes, this helps. Thank you.
Best,
Vaclav
Hello Nils,
it works, I was able to add a process example to an operator. But I have one more question: can the descriptions of the operator parameters be generated automatically somehow, or do I have to write them manually?
Thanks for the advice.
Best,
Vaclav
Hi Vaclav,
I'm afraid that at the moment there is no script to generate a template XML for each of your extension's operators.
But we would appreciate any input on this topic :-)
Best,
Nils
Hmm, I will think about that.
Could you post an example of the documentation for one operator here, so that I can create exactly the same template? I mean with emphasis etc.
Thank you.
Vaclav
Sure, here it is:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="../../../../documentation2html.xsl"?>
<p1:documents xmlns:p1="http://rapid-i.com/schemas/documentation/reference/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://rapid-i.com/schemas/documentation/reference/1.0 http://rapid-i.com/schemas/documentation/reference/1.0/documentation.xsd">
<!-- each operator should have a key that consists of "operator." plus the operator's key. -->
<operator key="operator.set_role" locale="en" version="5.1.012">
<title>Set Role</title>
<synopsis>This operator is used to change the role of one or more attributes.</synopsis>
<text>
<paragraph>The Role of an attribute reflects the part played by that attribute in an ExampleSet. Changing the role of an attribute may change the part played by that attribute in a process. One attribute can have exactly one role. This operator is used to change the role of one or more attributes of the input ExampleSet. This is a very simple operator, all you have to do is to select an attribute and select a new role for it. Different learning operators require attributes with different roles. This operator is frequently used to set the right roles for attributes before applying the desired operator. The change in role is only for the current process, i.e. the role of the attribute is not changed permanently in the ExampleSet. The Set Role operator should not be confused with the Rename operator or Type Conversion operators. The Rename operator is used to change the name of an attribute. Many Type Conversion operators are available (at Data Transformation/Type conversion/) to change the type of attributes e.g. the Nominal to Binominal operator, the Numerical to Polynomial operator and many more.</paragraph>
<paragraph>Broadly roles are classified into two types i.e. regular and special. Regular attributes simply describe the examples. Regular attributes are usually used during learning processes. One ExampleSet can have numerous regular attributes. Special attributes are those which identify the examples separately. Special attributes have some specific task. Special roles are: label, id, prediction, cluster, weight, and batch. An ExampleSet can have numerous special attributes but one special role cannot be repeated. If one special role is assigned to more than one attribute in an ExampleSet, all these attributes will be dropped except the last one. This concept can be easily understood by studying the attached Example Process. Explanation of various roles is given in the parameters section.</paragraph>
</text>
<inputPorts>
<port name="example set" type="com.rapidminer.example.ExampleSet">This input port expects an ExampleSet. It is output of the Retrieve operator in our Example Process. Output of other operators may also be used as input. It is essential that meta data should be attached with the data for the input because the role of an attribute is specified in the meta data of the ExampleSet. The Retrieve operator provides meta data along with the data.</port>
</inputPorts>
<outputPorts>
<port name="example set" type="com.rapidminer.example.ExampleSet">The ExampleSet with modified role(s) is output of this port.</port>
<port name="original" type="com.rapidminer.example.ExampleSet">The ExampleSet that was given as input is passed without changing to the output through this port. This is usually used to reuse the same ExampleSet in further operators or to view the ExampleSet in the Results Workspace.</port>
</outputPorts>
<parameters>
<!-- description of the parameters and the corresponding values -->
<parameter key="name" type="string">The name of the attribute whose role should be changed is specified through this parameter. You can select the attribute either from the drop down list or type it manually.</parameter>
<parameter key="target_role" type="string" default="regular">The target role of the selected attribute is the new role assigned to it. The following target roles are possible:
<values>
<value value="regular">Attributes without a special role, i.e. those which simply describe the examples are called regular attributes and just leave out the role designation in most cases. Regular attributes are used as input variables for learning tasks.</value>
<value value="id">This is a special role, it acts as id attribute for the ExampleSet and it is usually unique in every example of the ExampleSet. The id role is used to clearly identify the examples of concerned ExampleSet. In this case the attribute adopts the role of an identifier and is called ID for short. Unique ids can be given to all the examples using the Generate ID operator.</value>
<value value="label">This is a special role, it acts as a target attribute for learning operators e.g. the Decision Tree operator. Labels identify the examples in any way and they must be predicted for new examples that are not yet characterized in such a manner. The label is also called 'goal variable'.</value>
<value value="prediction">This is a special role, it acts as predicted attribute of a learning scheme. For example when a predictive model is learnt through any learning operator and then it is applied using the Apply Model operator, in the output we have a new attribute with role <em>prediction</em> which holds the values of <em>label</em> predicted by the given model. The <em>label</em> and <em>prediction</em> attributes are also used for evaluating the performance of a model.</value>
<value value="cluster">This is a special role, it indicates the membership of an example of the ExampleSet to a particular cluster. For example, the output of the k-Means operator adds a column with <em>cluster</em> role.</value>
<value value="weight">This is a special role, it indicates the weight of the examples with regard to the <em>label</em>. Weights are used in learning processes to give different importance to examples with different weights. Attribute weights are used in numerous operators e.g. the Select By Weights operator. Weights can also be used in evaluating the performance of models e.g. the Performance operator has a <em>use example weights</em> parameter to consider the <em>weight</em> of examples during the performance evaluation process.</value>
<value value="batch">This is a special role, it indicates the membership to an example batch.</value>
<value value="user defined">Any role can be provided by directly typing in the textbox instead of selecting a role from the dropdown menu. If 'ignore' is written in the textbox, that attribute will be ignored by the coming operators in the process. This is also a special role, thus it needs to be unique. To ignore multiple attributes unique roles can be assigned like ignore01, ignore02, ignore03 and so on.</value>
</values>
</parameter>
<parameter key="set_additional_roles" type="menu">Click this button to modify roles of more than one attribute. A click on this button opens a new menu which allows you to select any attribute and assign any role to it. It also allows assigning multiple roles to the same attribute. But, as an attribute can have exactly one role, only the last role assigned to that attribute is actually assigned to it and all previous roles assigned to it are ignored.</parameter>
<!-- ... -->
</parameters>
<tutorialProcesses>
<tutorialProcess key="process.set_role.setting_roles" title="Setting roles of attributes">
<description>
<paragraph>In this Example Process, the 'Labor-Negotiation' data set is loaded using the Retrieve operator. The roles of its attributes are changed using the Set Role operator. Here is an explanation of what happens when this process is executed:
<ul>
<li>
the attributes <em>name</em> and <em>shift-differential</em> are dropped because <em>standby-pay</em> is also given the <em>label</em> role. As <em>label</em> is a special role and only one attribute of the same special role can exist, the first attributes are dropped and the last attribute (standby-pay) is assigned to the <em>label</em> role.</li>
<li>
<em>duration</em> is assigned to <em>weight</em> role</li>
<li>
<em>wage-inc-1st, longterm-disability-assistance, pension, bereavement-assistance </em>and<em> wage-inc-2nd</em> are given a <em>regular</em> role. They were regular attributes even before the reassignment of the same role. Thus assigning the same role will not make any change. As there can be numerous regular attributes, no attribute is dropped.</li>
<li>
<em>wage-inc-3rd</em> and<em> working-hours</em> roles were not modified. Thus they retain their original roles i.e. <em> regular</em>.</li>
<li>
<em>col-adj</em> is assigned to <em>id</em> role.</li>
<li>
<em>education-allowance</em> is assigned to <em>batch</em> role.</li>
<li>
<em>statutory-holidays</em> and <em>vacation</em> are assigned to <em>ignore0</em> and <em>ignore1</em> roles respectively.</li>
<li>
<em>contrib-to-dental-plan</em> is assigned to <em>prediction</em> role.</li>
<li>
<em>contrib-to-health-plan</em> is assigned to <em>cluster</em> role.</li>
</ul>
</paragraph>
<paragraph>Some attributes are dropped as explained earlier but note that the number of examples remains the same. Roles assigned in this Example Process were just to show how the Set Role operator works; in real scenarios such assignments of role may not be very useful. This also highlights another point that the Set Role operator is not context-aware. It assigns roles set by the users irrespective of its context. So users must have the knowledge of what role to be assigned in which scenario. Thanks to the Problems View and quick fixes, it becomes easy to set the right roles before applying different learning operators. Note that the Problems View displays two warnings even in this Example Process.</paragraph>
<!-- tutorialProcess description: What is done and shown here? You can use formatted text here -->
</description>
<!-- Copy process from RapidMiner's XML view to here -->
<process version="5.1.011">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.1.011" expanded="true" name="Root">
<process expanded="true" height="584" width="596">
<operator activated="true" class="retrieve" compatibility="5.1.011" expanded="true" height="60" name="Retrieve" width="90" x="112" y="120">
<parameter key="repository_entry" value="//Samples/data/Labor-Negotiations"/>
</operator>
<operator activated="true" class="set_role" compatibility="5.1.011" expanded="true" height="76" name="Set Role" width="90" x="313" y="120">
<parameter key="name" value="bereavement-assistance"/>
<list key="set_additional_roles">
<parameter key="col-adj" value="id"/>
<parameter key="contrib-to-dental-plan" value="prediction"/>
<parameter key="contrib-to-health-plan" value="cluster"/>
<parameter key="duration" value="weight"/>
<parameter key="education-allowance" value="batch"/>
<parameter key="longterm-disability-assistance" value="regular"/>
<parameter key="pension" value="regular"/>
<parameter key="shift-differential" value="label"/>
<parameter key="standby-pay" value="label"/>
<parameter key="statutory-holidays" value="ignore0"/>
<parameter key="vacation" value="ignore1"/>
<parameter key="wage-inc-1st" value="regular"/>
<parameter key="wage-inc-2nd" value="regular"/>
</list>
</operator>
<connect from_op="Retrieve" from_port="output" to_op="Set Role" to_port="example set input"/>
<connect from_op="Set Role" from_port="example set output" to_port="result 1"/>
<connect from_op="Set Role" from_port="original" to_port="result 2"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="90"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
</process>
</operator>
</process>
</tutorialProcess>
</tutorialProcesses>
</operator>
</p1:documents>
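Since there is no official template generator, a skeleton in the format shown above could be produced with a short script. The following is a minimal sketch, not an existing RapidMiner tool: the function name build_template, the input format (a mapping from parameter key to type) and the TODO placeholders are assumptions. Only the element and attribute names are taken from the Set Role example, and whether the schema requires further attributes such as version has not been checked here.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the Set Role example above.
DOC_NS = "http://rapid-i.com/schemas/documentation/reference/1.0"


def build_template(operator_key, title, parameters, locale="en"):
    """Return a skeleton documentation XML string for one operator.

    operator_key -- e.g. "set_role", or "text:create_document" for an extension
    parameters   -- hypothetical input: mapping of parameter key to type
    """
    ET.register_namespace("p1", DOC_NS)
    root = ET.Element("{%s}documents" % DOC_NS)
    op = ET.SubElement(root, "operator", key="operator." + operator_key, locale=locale)
    ET.SubElement(op, "title").text = title
    ET.SubElement(op, "synopsis").text = "TODO"
    text = ET.SubElement(op, "text")
    ET.SubElement(text, "paragraph").text = "TODO"
    params = ET.SubElement(op, "parameters")
    for key, ptype in parameters.items():
        # One placeholder entry per parameter; the description is written by hand.
        ET.SubElement(params, "parameter", key=key, type=ptype).text = "TODO"
    return ET.tostring(root, encoding="unicode")


print(build_template("set_role", "Set Role", {"name": "string", "target_role": "string"}))
```

The parameter descriptions still have to be filled in manually; the script only saves typing the surrounding structure.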
The icon is automatically retrieved via the operator key. It should look like this: operator.%NAMESPACE:%OPERATOR_KEY (e.g. 'operator.text:create_document').
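The key format can be illustrated with a small helper. The function name doc_operator_key is hypothetical and not part of RapidMiner; it assumes that core operators carry no namespace prefix, as in the Set Role example (operator.set_role).

```python
def doc_operator_key(operator_key, namespace=None):
    """Build the key attribute for the <operator> element of a documentation file.

    namespace -- the extension namespace (e.g. "text"), or None for a core operator
    """
    prefix = namespace + ":" if namespace else ""
    return "operator." + prefix + operator_key


print(doc_operator_key("create_document", namespace="text"))
print(doc_operator_key("set_role"))
```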
Best,
Nils
Hi,
thank you. I forgot the '%NAMESPACE:' part.
Best,
Vaclav