Draw a scatter plot for clusters using Execute Python
halaalrobassy
New Altair Community Member
I used the k-Means clustering operator to cluster two columns, and I want to draw a scatter plot of the resulting clusters. I tried using the Execute Python operator, but its input parameters don't accept text as the cluster. Can you tell me how I can do it, please?
Answers
Hi,
You really do not need to use Python for the plotting, to be honest. I have attached a process below which generates two columns and clusters the data. At the end, you can simply click on Visualizations and set up the chart as desired.
The process is below.
Hope this helps,
Ingo
<?xml version="1.0" encoding="UTF-8"?><process version="9.3.000-SNAPSHOT">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="9.3.000-SNAPSHOT" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="UTF-8"/>
    <process expanded="true">
      <operator activated="true" class="generate_data" compatibility="9.3.000-SNAPSHOT" expanded="true" height="68" name="Generate Data" width="90" x="45" y="34">
        <parameter key="target_function" value="gaussian mixture clusters"/>
        <parameter key="number_examples" value="1000"/>
        <parameter key="number_of_attributes" value="2"/>
        <parameter key="attributes_lower_bound" value="-10.0"/>
        <parameter key="attributes_upper_bound" value="10.0"/>
        <parameter key="gaussian_standard_deviation" value="10.0"/>
        <parameter key="largest_radius" value="10.0"/>
        <parameter key="use_local_random_seed" value="false"/>
        <parameter key="local_random_seed" value="1992"/>
        <parameter key="datamanagement" value="double_array"/>
        <parameter key="data_management" value="auto"/>
      </operator>
      <operator activated="true" class="select_attributes" compatibility="9.3.000-SNAPSHOT" expanded="true" height="82" name="Select Attributes" width="90" x="179" y="34">
        <parameter key="attribute_filter_type" value="single"/>
        <parameter key="attribute" value="label"/>
        <parameter key="attributes" value=""/>
        <parameter key="use_except_expression" value="false"/>
        <parameter key="value_type" value="attribute_value"/>
        <parameter key="use_value_type_exception" value="false"/>
        <parameter key="except_value_type" value="time"/>
        <parameter key="block_type" value="attribute_block"/>
        <parameter key="use_block_type_exception" value="false"/>
        <parameter key="except_block_type" value="value_matrix_row_start"/>
        <parameter key="invert_selection" value="true"/>
        <parameter key="include_special_attributes" value="true"/>
      </operator>
      <operator activated="true" class="concurrency:k_means" compatibility="9.3.000-SNAPSHOT" expanded="true" height="82" name="Clustering" width="90" x="313" y="34">
        <parameter key="add_cluster_attribute" value="true"/>
        <parameter key="add_as_label" value="false"/>
        <parameter key="remove_unlabeled" value="false"/>
        <parameter key="k" value="4"/>
        <parameter key="max_runs" value="10"/>
        <parameter key="determine_good_start_values" value="true"/>
        <parameter key="measure_types" value="BregmanDivergences"/>
        <parameter key="mixed_measure" value="MixedEuclideanDistance"/>
        <parameter key="nominal_measure" value="NominalDistance"/>
        <parameter key="numerical_measure" value="EuclideanDistance"/>
        <parameter key="divergence" value="SquaredEuclideanDistance"/>
        <parameter key="kernel_type" value="radial"/>
        <parameter key="kernel_gamma" value="1.0"/>
        <parameter key="kernel_sigma1" value="1.0"/>
        <parameter key="kernel_sigma2" value="0.0"/>
        <parameter key="kernel_sigma3" value="2.0"/>
        <parameter key="kernel_degree" value="3.0"/>
        <parameter key="kernel_shift" value="1.0"/>
        <parameter key="kernel_a" value="1.0"/>
        <parameter key="kernel_b" value="0.0"/>
        <parameter key="max_optimization_steps" value="100"/>
        <parameter key="use_local_random_seed" value="false"/>
        <parameter key="local_random_seed" value="1992"/>
      </operator>
      <connect from_op="Generate Data" from_port="output" to_op="Select Attributes" to_port="example set input"/>
      <connect from_op="Select Attributes" from_port="example set output" to_op="Clustering" to_port="example set"/>
      <connect from_op="Clustering" from_port="clustered set" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
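If you would still rather stay with Execute Python, here is a minimal sketch of one way around the "text as cluster" problem. It splits the points by their text cluster label and draws each group separately, so the label never has to be passed as a numeric color. It assumes that the operator passes the example set to rm_main as a pandas DataFrame, that the two columns are named att1 and att2, and that the cluster column is named cluster with values such as cluster_0. The helper name group_by_cluster and the output file clusters.png are made up for illustration, so adjust all of these to your own data.

```python
from collections import defaultdict


def group_by_cluster(xs, ys, labels):
    """Group (x, y) points by their text cluster label (hypothetical helper)."""
    groups = defaultdict(lambda: ([], []))
    for x, y, label in zip(xs, ys, labels):
        groups[label][0].append(x)
        groups[label][1].append(y)
    return dict(groups)


def rm_main(data):
    # Imported here so the helper above also works without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render off-screen, no display needed
    import matplotlib.pyplot as plt

    # Assumed column names: att1, att2 (the two inputs) and cluster (k-Means).
    groups = group_by_cluster(data["att1"], data["att2"], data["cluster"])

    fig, ax = plt.subplots()
    for name in sorted(groups):
        xs, ys = groups[name]
        # One scatter call per cluster: matplotlib picks a new color each
        # time, and the text label is only used for the legend.
        ax.scatter(xs, ys, s=10, label=name)
    ax.set_xlabel("att1")
    ax.set_ylabel("att2")
    ax.legend(title="cluster")
    fig.savefig("clusters.png")  # hypothetical file name; use a path you can find
    plt.close(fig)

    # Pass the example set through unchanged.
    return data
```

Because each cluster gets its own scatter call, the text labels never need to be converted to numbers.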
Thank you so much.
But I want to make a 3-dimensional visualization. Can I do it without using Python code?
Yes, that is possible as well. You simply select "Scatter 3D" as the plot type.