Coreference resolution with RapidMiner: how to begin?

User: "maciej_ogrodnic"
New Altair Community Member
Updated by Jocelyn
Dear All,

I have been playing with RM for some time, but now it's time to do something real, and I don't quite know how to proceed. The task is direct nominal coreference resolution, i.e. grouping the mentions in a text into clusters that refer to the same entity. As training data I have a series of documents whose mentions are already correctly clustered.

To keep it as simple as possible, I guess we can leave text processing out of the process entirely and represent the data as a table with tokens in rows and attributes in columns (the attributes being the usual properties, from gender and number up to some more complex ones).

Issue 1: does such a representation make sense? How can we represent different documents (with another attribute, e.g. a document number?) and clusters (with a cluster number?) How should validation be organized? If the samples are documents rather than individual tokens, how should the clusters be represented? Please advise.
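
To make Issue 1 concrete, here is a minimal sketch of the kind of table meant above, written in plain Python rather than as an RM process. Everything in it is an assumption made for illustration: the column names (`doc_id`, `mention_id`, `cluster_id`), the toy rows, and the two helper functions are hypothetical, and rows are taken to be mentions rather than arbitrary tokens. It also shows one common way such a table could be turned into mention-pair examples and validated per document, which may or may not be the right approach here.

```python
from itertools import combinations

# Hypothetical toy data: one row per mention. cluster_id is the gold cluster
# and is only meaningful within its own document.
MENTIONS = [
    {"doc_id": 1, "mention_id": 0, "head": "president", "gender": "m", "number": "sg", "cluster_id": 0},
    {"doc_id": 1, "mention_id": 1, "head": "politician", "gender": "m", "number": "sg", "cluster_id": 0},
    {"doc_id": 1, "mention_id": 2, "head": "ministers", "gender": "m", "number": "pl", "cluster_id": 1},
    {"doc_id": 2, "mention_id": 0, "head": "company", "gender": "f", "number": "sg", "cluster_id": 0},
    {"doc_id": 2, "mention_id": 1, "head": "firm", "gender": "f", "number": "sg", "cluster_id": 0},
]


def mention_pairs(mentions):
    """Build one example per pair of mentions taken from the same document."""
    pairs = []
    for a, b in combinations(mentions, 2):
        if a["doc_id"] != b["doc_id"]:
            continue  # mentions from different documents are never paired
        pairs.append({
            "doc_id": a["doc_id"],
            "pair": (a["mention_id"], b["mention_id"]),
            "same_gender": a["gender"] == b["gender"],
            "same_number": a["number"] == b["number"],
            # Label: do both mentions belong to the same gold cluster?
            "coreferent": a["cluster_id"] == b["cluster_id"],
        })
    return pairs


def leave_one_document_out(pairs):
    """Yield (held_out_doc, train, test) so that no document is split across folds."""
    for doc in sorted({p["doc_id"] for p in pairs}):
        train = [p for p in pairs if p["doc_id"] != doc]
        test = [p for p in pairs if p["doc_id"] == doc]
        yield doc, train, test


PAIRS = mention_pairs(MENTIONS)

if __name__ == "__main__":
    for doc, train, test in leave_one_document_out(PAIRS):
        print(f"held-out doc {doc}: {len(train)} training pairs, {len(test)} test pairs")
```

In this sketch the document number is an ordinary column that is used only for grouping, never as a learning attribute, and the cluster number serves only to derive the pair label. Whether splitting by document is the right validation scheme is exactly the open question.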

Issue 2: how should the process be organized to make it work? Can you suggest anything?

Best,
Andreas
