"vectorize text features but do regression on a numerical column?"

User: "Legacy User"
New Altair Community Member
Updated by Jocelyn
I've looked through the forums, documentation, and examples, but can't figure out how to learn a numerical function from text features. That is, the data has text columns and one numeric column, and I want to do feature extraction/vectorization on the text columns in order to predict the numeric column.
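
To make the goal concrete, here is a rough sketch of what I mean, written in plain Python rather than as a RapidMiner process. Everything in it is an assumption for illustration: the rows are made up, the helper names (build_vocabulary, vectorize, fit_linear, predict) are hypothetical, and the tiny gradient-descent learner is only a stand-in for a real regression learner. Only the "views" column name comes from my data. The point is that the text gets vectorized while the numeric column is passed through untouched as the target.

```python
from collections import Counter

# Made-up rows: (text, views). "views" is the numeric target and is never tokenized.
rows = [
    ("cheap flights to paris", 120),
    ("paris hotel deals", 90),
    ("learn python fast", 300),
    ("python regression tutorial", 340),
]


def build_vocabulary(texts):
    """Map each distinct whitespace-separated token to a column index."""
    tokens = sorted({tok for text in texts for tok in text.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}


def vectorize(text, vocab):
    """Turn one text into a term-count vector ordered like the vocabulary."""
    counts = Counter(text.lower().split())
    return [float(counts.get(tok, 0)) for tok in vocab]


def predict(x, w, b):
    """Linear prediction w.x + b for one feature vector."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def fit_linear(X, y, lr=0.05, epochs=2000):
    """Fit y ~ w.x + b by batch gradient descent on mean squared error."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        errors = [predict(x, w, b) - t for x, t in zip(X, y)]
        for j in range(d):
            w[j] -= lr * sum(e * x[j] for e, x in zip(errors, X)) / n
        b -= lr * sum(errors) / n
    return w, b


texts = [text for text, _ in rows]
targets = [float(views) for _, views in rows]  # numeric label, kept as-is

vocab = build_vocabulary(texts)
X = [vectorize(text, vocab) for text in texts]  # only the text is vectorized
w, b = fit_linear(X, targets)

for text, target, x in zip(texts, targets, X):
    print(f"{text!r}: actual={target:.0f} predicted={predict(x, w, b):.1f}")
```

So the text columns become the regular (input) attributes and the numeric column stays a real-valued label. What I can't work out is how to express that same split in the process setup.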

There is an example in the Text mining samples that shows how to create an example set that contains both text and other columns. Great.

The problem is how to get a regression learner to use the numeric column as the target.

I'm pulling data from a database, so I use StringTextInput. If I set the numerical column as the 'label' (using operator attributes) so that it's the target for the learner, the text processor complains:

      [Fatal] Process failed: The label attribute (#3: views (integer/single_value)) must be nominal for wvtool.

If I don't set the target column as the 'label' attribute, then the learner complains:

      [Fatal] UserError occured in 1st application of JMySVMLearner (JMySVMLearner)
      [Fatal] Process failed: Input example set does not have a label attribute

So:
1) How do I get the text processor to let the numerical column pass through unmodified as the target?

2) How do I specify which column is used as the target for regression (not classification) learning?

Thanks,
Gary
