"[Text Mining] How to feed SGML format file into dictionary?"
geamksj
Hello, is there any way to load an SGML-formatted file into the TextInput operator? If not, is there a way to convert an SGML file into another format that the TextInput operator can load? The loaded file will be combined with the dictionary, as in the big picture of my experiment below:
<< Big Picture >>
(1) Documents ==> (2) Dictionary Creation ==> (3) Text Representation (based on either the counts of the most frequently occurring words in the documents, or a Boolean indicating whether specific topic words appear in the documents) ==> (4) Model Induction (e.g. rule-based induction) ==> (5) Document Classification Rules
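To make steps (2) and (3) concrete, here is a minimal Python sketch of what I mean, outside of RapidMiner. The function names (`tokenize`, `build_dictionary`, `represent`) and the simple lowercase-letters tokenizer are my own assumptions for illustration, not anything RapidMiner provides.

```python
import re
from collections import Counter

# Assumption: a token is any run of ASCII letters, lowercased.
TOKEN_RE = re.compile(r"[a-z]+")


def tokenize(text):
    """Split a document into lowercase word tokens."""
    return TOKEN_RE.findall(text.lower())


def build_dictionary(documents, size):
    """Step (2): keep the `size` most frequent words across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return [word for word, _ in counts.most_common(size)]


def represent(document, dictionary, boolean=False):
    """Step (3): one vector entry per dictionary word.

    boolean=False gives word counts; boolean=True gives 1/0 for
    whether the word occurs in the document at all.
    """
    counts = Counter(tokenize(document))
    if boolean:
        return [1 if counts[word] > 0 else 0 for word in dictionary]
    return [counts[word] for word in dictionary]
```

For example, `represent("cocoa cocoa sugar", ["cocoa", "prices", "sugar"])` gives the count vector, and passing `boolean=True` gives the presence/absence vector. The resulting vectors would then go to the model induction step (4).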
The input is the Reuters-21578 Text Categorization Collection Data Set from the UCI Machine Learning Repository; the data set files are SGML-formatted (.sgm files).
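One conversion route I am considering is to split each .sgm file into one plain-text file per article before loading. The sketch below is only a rough idea using the Python standard library: it assumes each article is wrapped in `<REUTERS ...>` ... `</REUTERS>` with a `NEWID` attribute, topic labels in `<TOPICS><D>...</D></TOPICS>`, and the text in `<TITLE>` and `<BODY>`. The function names and the output file naming are hypothetical, and whether TextInput accepts the resulting files is exactly what I am unsure about.

```python
import html
import re
from pathlib import Path

# Assumed tag layout of the Reuters-21578 .sgm files (see note above).
DOC_RE = re.compile(r"<REUTERS\b([^>]*)>(.*?)</REUTERS>", re.DOTALL)
NEWID_RE = re.compile(r'NEWID="(\d+)"')
TOPICS_RE = re.compile(r"<TOPICS>(.*?)</TOPICS>", re.DOTALL)
D_RE = re.compile(r"<D>(.*?)</D>", re.DOTALL)
TITLE_RE = re.compile(r"<TITLE>(.*?)</TITLE>", re.DOTALL)
BODY_RE = re.compile(r"<BODY>(.*?)</BODY>", re.DOTALL)


def _first(pattern, text):
    """Return the first match with entities decoded, or '' if absent."""
    match = pattern.search(text)
    return html.unescape(match.group(1)).strip() if match else ""


def parse_reuters_sgml(sgml_text):
    """Extract id, topics, title and body for every article in the text."""
    docs = []
    for attrs, inner in DOC_RE.findall(sgml_text):
        newid = NEWID_RE.search(attrs)
        topics_block = TOPICS_RE.search(inner)
        topics = D_RE.findall(topics_block.group(1)) if topics_block else []
        docs.append({
            "id": newid.group(1) if newid else "",
            "topics": topics,
            "title": _first(TITLE_RE, inner),
            "body": _first(BODY_RE, inner),
        })
    return docs


def write_text_files(docs, out_dir):
    """Write one plain-text file per article (hypothetical naming scheme)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for doc in docs:
        path = out / f"reuters_{doc['id']}.txt"
        path.write_text(doc["title"] + "\n\n" + doc["body"], encoding="utf-8")
```

Usage would be something like `write_text_files(parse_reuters_sgml(Path("reut2-000.sgm").read_text(encoding="latin-1")), "reuters_txt")`. The `latin-1` encoding is a guess on my part so that every byte decodes without error. Articles without a `<BODY>` element come out with an empty body and may need to be filtered out.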