Patents mining
pfb
Hello,
I'm a real newbie and am posting here to ask for help.
As part of a patent set analysis, I have an extraction (csv/xlsx) of a list of patents in a semi-structured format: each row is a patent and each column is an attribute (e.g. patent title, abstract, novelty).
Given the large size of the patent set (>6500 hits), I would like to automate the patent analysis as follows:
1- identify topics (keywords) for each patent
2- cluster patents based on these keywords
3- display clusters with their respective weights
I assume that steps 1 and 2 can be done in RapidMiner, while step 3 could be done with Gephi. But this is only an assumption, as I am a real beginner here: I have never used RapidMiner.
Therefore, any indication of feasibility, or guidance on how to start, would be really appreciated.
Thank you,
Peter
fras
Hi Peter,
Indeed, there are a couple of projects where RapidMiner is the key tool for analysing patent data. Using the text mining extension, documents can be tokenized and clustered based on word vectors. It doesn't matter whether your documents/patents are spread over a file system or already stored in an Excel sheet or database.
In particular, TF-IDF transformation and n-grams are used to segment patents effectively.
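To make the idea concrete outside of RapidMiner, here is a minimal plain-Python sketch of TF-IDF weighting over unigrams and bigrams. It is only an illustration under stated assumptions, not the RapidMiner process itself: the function names (`tokenize`, `ngrams`, `tfidf`, `top_keywords`, `cosine`) and the three sample abstracts are hypothetical, and a real patent set would be read from the csv/xlsx export instead.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def ngrams(tokens, n_max=2):
    """Return all n-grams of length 1..n_max, joined with spaces."""
    terms = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            terms.append(" ".join(tokens[i:i + n]))
    return terms


def tfidf(docs, n_max=2):
    """Return one {term: weight} dict per document.

    Weight = (term count / terms in document) * log(N / document frequency).
    """
    counts = [Counter(ngrams(tokenize(doc), n_max)) for doc in docs]
    doc_freq = Counter()
    for count in counts:
        doc_freq.update(count.keys())
    n_docs = len(docs)
    vectors = []
    for count in counts:
        total = sum(count.values())
        vectors.append({
            term: (freq / total) * math.log(n_docs / doc_freq[term])
            for term, freq in count.items()
        })
    return vectors


def top_keywords(vector, k=3):
    """Return the k highest-weighted terms (ties broken alphabetically)."""
    ranked = sorted(vector.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical sample abstracts, standing in for the "abstract" column.
abstracts = [
    "Lithium battery electrode with a silicon anode coating",
    "Silicon anode coating process for a lithium battery cell",
    "Wind turbine blade with an adjustable pitch control system",
]

vectors = tfidf(abstracts)
for text, vector in zip(abstracts, vectors):
    print(text, "->", top_keywords(vector))

# Pairwise similarities are what a clustering step would operate on.
print(cosine(vectors[0], vectors[1]), cosine(vectors[0], vectors[2]))
```

The per-document keyword lists correspond roughly to step 1 of the question, and the cosine similarities between the weighted vectors are the kind of input a clustering step (step 2) would work from. Terms that occur in every document get a weight of zero, which is why TF-IDF tends to push generic patent wording out of the keyword lists.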
We offer a training course on this on 21-22 May 2014 in Dortmund.
- Frank