Best Of
Re: The type of attribute is Chinese text, how to segment attribute value for word cloud analysis
Hi @ners,
For Mandarin Chinese text mining, there is an extension on the Marketplace called HanMiner.
Best,
Cesar
Caperez
5
Re: Column flagged
Hi!
This means exactly what it says. If the attribute "race" (meaning groups of people, not sporting events) is built into a model, the model could be used to discriminate against people.
Similar warnings exist for attributes with names such as Gender or Age.
This is just a message to the user. RapidMiner will process the data normally.
Regards,
Balázs
Altair RapidMiner AI Studio competition: Solar Panel Power Generation Forecast
Join us for a fun competition using Altair RapidMiner AI Studio. The task is to build a model that predicts, one day in advance, how much solar power will be generated, based on the weather forecast. Participants are ranked by the error between their predicted values and the actual measured values.
This competition is hosted by Altair's long-time Japanese partner KSK Analytics on the Kaggle community. You must verify your Kaggle account to participate. All the details for the competition can be found here.
The competition is open until March 31, 2024 and there will be recognition and prizes for the highest ranking participants.
Participate Now
Jocelyn
5
"Xpath - extract text from unordered list (ul)"
Hi All,
I want to extract all list items (li) from an unordered list (ul) with an XPath query.
The website is http://ec.europa.eu/sanco_pesticides/public/index.cfm?event=substance.info&id=53 and I need to extract the content below the footnotes.
With the following XPath queries I always get only the first list item ("(R) = The residue definition differs for..."):
- //h:div[@class='col60']/h:ul/text()
or
- //h:div[@class='col60']/h:ul/descendant::*/text()
Does anyone have an idea how I can get the whole list?
Thanks a lot in advance!
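One possible explanation, offered as an assumption rather than a confirmed diagnosis: in XPath 1.0, converting a node-set to a string uses only the first node in document order, so a tool that returns one string per query will often show just the first match. Selecting the li elements themselves and reading each one separately avoids that. The sketch below shows the idea with Python's standard library on a made-up XHTML fragment; the markup, the `list_items` helper, and the sample footnote texts are illustrative stand-ins, not the real page.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the page; the real markup is an assumption.
XHTML = """
<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <div class="col60">
      <ul>
        <li>(R) = The residue definition differs</li>
        <li>(F) = Fat <b>soluble</b></li>
        <li>(*) = Lower limit of determination</li>
      </ul>
    </div>
  </body>
</html>
"""

# Same "h" prefix as in the queries above, bound to the XHTML namespace.
NS = {"h": "http://www.w3.org/1999/xhtml"}


def list_items(xhtml: str) -> list[str]:
    """Return the text of every li under div.col60/ul, one string per item."""
    root = ET.fromstring(xhtml.strip())
    items = root.findall(".//h:div[@class='col60']/h:ul/h:li", NS)
    # itertext() also collects text nested in child tags such as <b>;
    # split/join collapses runs of whitespace.
    return [" ".join("".join(li.itertext()).split()) for li in items]


footnotes = list_items(XHTML)
for line in footnotes:
    print(line)
```

The key difference from the two queries above is that the path stops at h:li and the text is gathered per element, instead of asking for text() on the ul.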
currant
5
Re: Operator Crawl: Process failed
Hi all,
I get exactly the same error message with version 10.2 (it works with version 9.x).
Maybe something is missing in the Java bundle?
Any hint is appreciated.
Regards, Günther
The .rmp is as follows:
<?xml version="1.0" encoding="UTF-8"?>
<process version="10.2.000">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="10.2.000" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="UTF-8"/>
    <process expanded="true">
      <operator activated="true" class="web:crawl_web_modern" compatibility="10.0.000" expanded="true" height="68" name="Crawl Web" width="90" x="179" y="34">
        <parameter key="url" value="https://osa.fh-potsdam.de"/>
        <list key="crawling_rules"/>
        <parameter key="max_crawl_depth" value="1"/>
        <parameter key="retrieve_as_html" value="true"/>
        <parameter key="enable_basic_auth" value="false"/>
        <parameter key="add_content_as_attribute" value="false"/>
        <parameter key="write_pages_to_disk" value="true"/>
        <parameter key="include_binary_content" value="true"/>
        <parameter key="output_dir" value="E:/_tmp"/>
        <parameter key="output_file_extension" value="html"/>
        <parameter key="max_pages" value="10"/>
        <parameter key="max_page_size" value="1000"/>
        <parameter key="delay" value="200"/>
        <parameter key="max_concurrent_connections" value="100"/>
        <parameter key="max_connections_per_host" value="50"/>
        <parameter key="user_agent" value="rapidminer-web-mining-extension-crawler"/>
        <parameter key="ignore_robot_exclusion" value="false"/>
      </operator>
      <connect from_op="Crawl Web" from_port="example set" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
gneher
5
Operator Crawl: Process failed
Hi,
I have installed the latest version, 10.1.001, and I have a problem with the Crawl operator.
The process fails with the error message below.
I checked my Java version: it is 1.8.0_361.
- Exception: java.lang.NoClassDefFoundError
- Message: org/apache/tika/parser/html/HtmlParser
- Stack trace:
- edu.uci.ics.crawler4j.parser.TikaHtmlParser.<init>(TikaHtmlParser.java:34)
- edu.uci.ics.crawler4j.parser.Parser.<init>(Parser.java:42)
- edu.uci.ics.crawler4j.crawler.CrawlController.<init>(CrawlController.java:85)
- com.rapidminer.operator.web.crawler.CrawlerOperator.doWork(CrawlerOperator.java:269)
- com.rapidminer.operator.Operator.execute(Operator.java:1024)
- com.rapidminer.operator.execution.SimpleUnitExecutor.execute(SimpleUnitExecutor.java:77)
- com.rapidminer.operator.ExecutionUnit$2.run(ExecutionUnit.java:804)
- com.rapidminer.operator.ExecutionUnit$2.run(ExecutionUnit.java:799)
- java.base/java.security.AccessController.doPrivileged(Native Method)
- com.rapidminer.operator.ExecutionUnit.execute(ExecutionUnit.java:799)
- com.rapidminer.operator.OperatorChain.doWork(OperatorChain.java:423)
- com.rapidminer.operator.Operator.execute(Operator.java:1024)
- com.rapidminer.Process.executeRoot(Process.java:1476)
- com.rapidminer.Process.lambda$executeRootInPool$5(Process.java:1452)
- com.rapidminer.studio.concurrency.internal.AbstractConcurrencyContext$AdaptedCallable.exec(AbstractConcurrencyContext.java:362)
- java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
- java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown Source)
- java.base/java.util.concurrent.ForkJoinPool.scan(Unknown Source)
- java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
- java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)
Thanks
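A NoClassDefFoundError for org/apache/tika/parser/html/HtmlParser suggests that no jar on the runtime classpath provides that class. The java.base/ frames in the trace also suggest the process ran on a Java 9 or later runtime, so the separately installed Java 1.8 may not be the one in use. One way to check where the class lives is to scan the jar files of the installation. This is a minimal sketch: `jars_containing` is a hypothetical helper, and the folder to search is an assumption you need to replace with your own Studio or extensions directory.

```python
import zipfile
from pathlib import Path


def jars_containing(class_name: str, search_root: Path) -> list[Path]:
    """Return every .jar under search_root that contains the given class."""
    # A class a.b.C is stored in a jar as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(search_root.rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as archive:
                if entry in archive.namelist():
                    hits.append(jar)
        except zipfile.BadZipFile:
            # Skip files that are not valid archives.
            continue
    return hits


if __name__ == "__main__":
    # Assumption: searches the current directory; point this at your
    # RapidMiner installation or extensions folder instead.
    for jar in jars_containing("org.apache.tika.parser.html.HtmlParser", Path(".")):
        print(jar)
```

An empty result would be consistent with a missing dependency in the bundle, although it does not prove it, since classes can also be loaded from locations outside the searched folder.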
domo
5
Re: The attribute A) was already present in the example set.
Hi @akos_banar,
I think it is a known bug: the root cause is the parenthesis in the attribute name.
Please remove the parenthesis from the attribute name(s) before submitting your data to AutoModel.
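If the data is prepared outside RapidMiner, the renaming could be done before import. The sketch below is only an illustration: `strip_parentheses` is a hypothetical helper, and the numeric suffix it adds is one arbitrary way to keep names unique, since removing a parenthesis can make two names collide (for example "A)" and "A").

```python
import re


def strip_parentheses(names: list[str]) -> list[str]:
    """Remove '(' and ')' from attribute names and keep the results unique."""
    cleaned = []
    seen = set()
    for name in names:
        # Drop the parentheses, then collapse whitespace left behind.
        base = re.sub(r"[()]", "", name)
        base = re.sub(r"\s+", " ", base).strip()
        candidate = base
        suffix = 2
        # Add a numeric suffix if the cleaned name already exists.
        while candidate in seen:
            candidate = f"{base}_{suffix}"
            suffix += 1
        seen.add(candidate)
        cleaned.append(candidate)
    return cleaned


print(strip_parentheses(["A)", "A", "Price (USD)"]))
```

Inside RapidMiner, the same renaming would be done with one of the rename operators before the data reaches AutoModel.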
Hope this helps,
Regards,
Lionel
Re: Time Series Prediction (Forecast into the Future)
Hi @timothy_rij!
Time series forecasting processes are done differently from the usual classification or regression methods.
There's an entire course in the Academy about forecasting:
https://academy.rapidminer.com/courses/time-series-analytics
Please check this out.
Your process applies windowing with a window of 7 days. This builds a model that can only predict the next day's value given the last 7 values. In theory this could be extended to learn a model from the last 200 days, with days -200 to -101 as the input attributes and days -100 to -1 as the labels. Such models could be applied to current data in a rolling manner. But this is cumbersome, and you won't get good results from these models in the stock market setting, because stock prices depend on more factors than just their past values.
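To make the windowing idea concrete, here is a minimal sketch in plain Python. It is not how the RapidMiner operator is implemented; `make_windows`, `window`, and `horizon` are illustrative names. With window=7 and horizon=1 each row holds the last 7 values and the next day's value; window=100 and horizon=100 corresponds to the 200-day layout described above.

```python
def make_windows(series, window=7, horizon=1):
    """Turn a series into (inputs, labels) rows for supervised learning.

    Each row uses `window` consecutive values as inputs and the
    following `horizon` values as labels.
    """
    rows = []
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):
        inputs = list(series[start:start + window])
        labels = list(series[start + window:start + window + horizon])
        rows.append((inputs, labels))
    return rows


# Ten days of made-up values, 7-day window, predict one day ahead.
rows = make_windows(list(range(10)), window=7, horizon=1)
for inputs, labels in rows:
    print(inputs, "->", labels)
```

A series shorter than window + horizon yields no rows, which is one reason long horizons need a lot of history.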
Regards,
Balázs
Model deployments for RapidMiner Studio 10
Hi,
I am new to RapidMiner products. I have gone through the tutorials to build predictive models in RapidMiner Studio 10, and I want to use the best model to make predictions on a new dataset. I assumed I would need to deploy the model, but unlike RapidMiner Studio 9, Studio 10 has no Deployments tab (view). Could someone direct me on how to deploy the model and use it to make predictions in RapidMiner Studio 10?
Thanks,
xc
chenx
5
Re: What are the most important attributes that distinguish 3 nominal labels from each other?
Hi!
I don't have a readily available sample workflow. This is a complex process. But you're trying to solve a complex problem, so that's expected.
The outer loop (for the samples) could be a plain Loop operator.
Inside that I would use Loop Attributes and select the three possible labels as the attributes to loop on.
Inside that loop, you could use another Loop and Select Subprocess with the different learning operators to get the weights.
At the end of most loops you'll receive a Collection of tables. You can use Append to convert these collections of tables to simpler tables.
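The nesting described above can be sketched outside RapidMiner as three loops that fill one flat table. Everything in this sketch is an assumption made for illustration: the labels are modelled as 0/1 columns, and the two "learners" are toy weighting functions standing in for real learning operators.

```python
from statistics import mean


def split(rows, feature, label):
    """Feature values inside (label == 1) and outside (label == 0) the class."""
    inside = [r[feature] for r in rows if r[label] == 1]
    outside = [r[feature] for r in rows if r[label] == 0]
    return inside, outside


def mean_gap(rows, feature, label):
    """Toy weight: distance between the class mean and the rest."""
    inside, outside = split(rows, feature, label)
    return abs(mean(inside) - mean(outside)) if inside and outside else 0.0


def max_gap(rows, feature, label):
    """Toy weight: distance between the class maximum and the rest."""
    inside, outside = split(rows, feature, label)
    return abs(max(inside) - max(outside)) if inside and outside else 0.0


def collect_weights(samples, labels, features, learners):
    table = []
    for sample_id, rows in enumerate(samples):        # outer Loop over samples
        for label in labels:                          # Loop Attributes over labels
            for name, learner in learners.items():    # Loop + Select Subprocess
                for feature in features:
                    table.append({
                        "sample": sample_id, "label": label, "learner": name,
                        "attribute": feature,
                        "weight": learner(rows, feature, label),
                    })
    return table                                      # flat table, like Append


# Made-up data: one sample, three 0/1 label columns, two attributes.
sample = [
    {"x": 1.0, "y": 5.0, "is_a": 1, "is_b": 0, "is_c": 0},
    {"x": 3.0, "y": 5.0, "is_a": 1, "is_b": 0, "is_c": 0},
    {"x": 10.0, "y": 6.0, "is_a": 0, "is_b": 1, "is_c": 0},
    {"x": 20.0, "y": 7.0, "is_a": 0, "is_b": 0, "is_c": 1},
]
table = collect_weights([sample], ["is_a", "is_b", "is_c"], ["x", "y"],
                        {"mean_gap": mean_gap, "max_gap": max_gap})
```

Writing every result into one list of rows plays the role of Append: the output is a single table with one row per sample, label, learner, and attribute.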
Regards,
Balázs