How to convert image data to structured data
Hello all,
I am working on a project on image and text mining and I want to know how to convert the image data to structured data.
I have already downloaded the Image Processing extension, and I found some useful information on this website:
The power of machine learning for image mining and analytics
http://www.simafore.com/blog/the-power-of-machine-learning-for-image-mining-and-analytics?success=true
and
New case study: Image mining and unstructured data science
http://www.simafore.com/blog/new-case-study-image-mining-and-unstructured-data-science?success=true
Can anyone please help me figure out what is inside the Loop Files operator? I need to know how they converted the image data to structured data. I have spent more than four months working on my final project, but I can't finish it because I'm stuck at this point.
Thanks for any help,
@sgenzer it is compatible with RM 7.
If you want to play with it you can get it here: http://www.burgsys.com/
I'll try out the Watson API. For much of my work, sending data to cloud services isn't an option, but perhaps it solves the original poster's problem.
Thank you @sgenzer and @JEdward
Yes, the Image Processing extension no longer appears in the Marketplace, but it is compatible with RM 7.
Thanks a lot @sgenzer for your suggestions. I am interested in trying them, but I think they work best with data on the web, while I need to work with images stored on my computer.
Thanks @JEdward for this useful website.
Actually, I found the B-Designer extension, which includes all the features I need. But I can't get it until they send it to me, so I contacted them and am still waiting for their response.
I am still looking for a way to run OCR on images to extract the text.
Thank you again,
I have not had much need for OCR but again I would suggest using the RapidMiner "Enrich Data by Webservice" operator (under the Web Mining extension) to call an external API. There are very good sources out there - a quick search found that Google has a free OCR API: https://cloud.google.com/vision/
Here is an example of an Enrich Data by Webservice operator that connects to the Google Maps API. I have deleted my API key, so you would need to insert your own to see this working, but you should get the idea.
Scott
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="7.1.000">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="7.1.000" expanded="true" name="Process">
    <process expanded="true">
      <operator activated="true" class="web:enrich_data_by_webservice" compatibility="7.0.000" expanded="true" height="68" name="Google Maps Distance Lookup" width="90" x="313" y="34">
        <parameter key="query_type" value="XPath"/>
        <list key="string_machting_queries"/>
        <list key="regular_expression_queries"/>
        <list key="regular_region_queries"/>
        <list key="xpath_queries">
          <parameter key="Distance" value="//distance/text/text()"/>
        </list>
        <list key="namespaces"/>
        <parameter key="assume_html" value="false"/>
        <list key="index_queries"/>
        <list key="jsonpath_queries"/>
        <parameter key="service_method" value="fgfgfgf"/>
        <parameter key="body" value="text=&lt;%title%&gt;"/>
        <parameter key="url" value=";"/>
        <parameter key="delay" value="150"/>
        <list key="request_properties">
          <parameter key="key" value="mykey"/>
        </list>
      </operator>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
    </process>
  </operator>
</process>
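For the OCR case with images stored on a local machine, it may help to see what such a web service call has to send and what comes back. The following is only a rough Python sketch outside RapidMiner, assuming the Google Cloud Vision REST endpoint `images:annotate` with an API key passed as the `key` query parameter. The function names `build_ocr_request`, `extract_text`, and `ocr_image` are made up for illustration, and `ocr_image` needs network access plus your own key.

```python
import base64
import json
import urllib.request

# Assumed endpoint: Google Cloud Vision REST API, v1.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_ocr_request(image_bytes: bytes) -> dict:
    """Build the JSON body for a single-image TEXT_DETECTION request."""
    return {
        "requests": [
            {
                # Local image data is sent inline as base64 text.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "TEXT_DETECTION"}],
            }
        ]
    }


def extract_text(response: dict) -> str:
    """Return the recognized text from a parsed response, or "" if none."""
    results = response.get("responses", [])
    if not results:
        return ""
    annotations = results[0].get("textAnnotations", [])
    # The first annotation carries the full text; the rest are single words.
    return annotations[0]["description"] if annotations else ""


def ocr_image(path: str, api_key: str) -> str:
    """Hypothetical helper: send one local image file and return its text."""
    with open(path, "rb") as f:
        body = json.dumps(build_ocr_request(f.read())).encode("utf-8")
    req = urllib.request.Request(
        f"{VISION_ENDPOINT}?key={api_key}",
        data=body,  # supplying data makes this a POST request
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return extract_text(json.load(resp))
```

Calling `ocr_image` once per file in a folder and collecting the file name and returned text would give one row per image, which is the structured form the rest of a text mining process can work with. Note that this does send the image data to a cloud service.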
Good luck.
Scott