prediction modeling for text analysis
I am trying to perform prediction modeling on text resources. I chose 272 resources for training and 116 for testing, but only 190 of the training resources and 80 of the test resources were actually modeled, and accuracy, precision, and recall values were shown only for those. I want to get those results for all of the data. Please help.
Best Answers
I don't understand exactly what you want to do and what you performed. Are your training and test datasets both labeled?
Given the information provided, I suggest you perform a cross validation with your 272 training resources to build a model. You will then have the performance (accuracy, recall, precision) of your model based on those 272 training resources. Then apply this model to your 116 (labeled?) test resources with a Performance operator, so you can measure the performance of your built model on "unseen" data. The process looks like this: [screenshot of the process]
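If it helps to see the logic spelled out, here is a minimal equivalent sketch in Python with scikit-learn rather than RapidMiner operators. It is only a sketch under assumptions: the file names and the "text"/"label" columns are hypothetical, and I've used a LinearSVM as the learner.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Hypothetical files: one row per resource, columns "text" and "label"
train = pd.read_csv("train_resources.csv")   # 272 labeled resources
test = pd.read_csv("test_resources.csv")     # 116 labeled resources

model = make_pipeline(TfidfVectorizer(), LinearSVC())

# Step 1: cross-validate on the 272 training resources
cv = cross_validate(model, train["text"], train["label"], cv=10,
                    scoring=["accuracy", "precision_macro", "recall_macro"])
print("CV accuracy: ", cv["test_accuracy"].mean())
print("CV precision:", cv["test_precision_macro"].mean())
print("CV recall:   ", cv["test_recall_macro"].mean())

# Step 2: fit on all training data, then measure performance on the
# 116 "unseen" test resources (the Performance operator step)
model.fit(train["text"], train["label"])
pred = model.predict(test["text"])
print("Test accuracy: ", accuracy_score(test["label"], pred))
print("Test precision:", precision_score(test["label"], pred, average="macro"))
print("Test recall:   ", recall_score(test["label"], pred, average="macro"))
```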
Alternatively, you can perform a cross validation with all 388 resources (272 training + 116 test) to build a better model. You will have the performance (accuracy, recall, precision) of your model based on those 388 "training" resources, and you can then apply this model to future unseen data. The process looks like this: [screenshot of the process]
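Continuing the sketch above (the hypothetical train, test, and model are defined there), this second option just pools the two sets before cross-validating:

```python
import pandas as pd
from sklearn.model_selection import cross_validate

# Pool all 388 labeled resources (272 training + 116 test)
all_data = pd.concat([train, test], ignore_index=True)

cv = cross_validate(model, all_data["text"], all_data["label"], cv=10,
                    scoring=["accuracy", "precision_macro", "recall_macro"])
print("CV accuracy on 388 resources:", cv["test_accuracy"].mean())

# Fit on everything; this model is then applied to future unseen data
model.fit(all_data["text"], all_data["label"])
```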
For a better response, can you share your process and your data source, please?
Regards,
Lionel
Hi @lambamanika07 again,
To complete my response, the Cross Validation sub-process looks like this: [screenshot of the Cross Validation sub-process]
Regards,
Lionel
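In code terms, each fold of that sub-process trains a model on one side and applies it plus a performance measurement on the other. Here is a hand-written sketch of that loop in Python/scikit-learn, reusing the hypothetical train data from the sketches above:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score

# What the Cross Validation sub-process does on each fold:
#   training side -> build the model; testing side -> Apply Model + Performance
scores = []
for tr_idx, te_idx in StratifiedKFold(n_splits=10).split(train["text"], train["label"]):
    fold_model = make_pipeline(TfidfVectorizer(), LinearSVC())
    fold_model.fit(train["text"].iloc[tr_idx], train["label"].iloc[tr_idx])  # training side
    pred = fold_model.predict(train["text"].iloc[te_idx])                    # Apply Model
    scores.append(accuracy_score(train["label"].iloc[te_idx], pred))         # Performance
print("Mean fold accuracy:", sum(scores) / len(scores))
```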
Answers
Are you using Cross Validation? Post your process using the < / > option.
@lambamanika07 I would not build a text classification model as you've shown; I would do it the way @lionelderkrikor shows. Also, if the LinearSVM doesn't give good results, I would try Naive Bayes and/or Deep Learning. You could even use a Stacking or Voting operator.
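To make that concrete, here is a hedged scikit-learn sketch that tries Naive Bayes and a (hard) Voting ensemble next to the LinearSVM, reusing the hypothetical train data from the earlier sketches; RapidMiner's Voting/Stacking operators play the same role as the ensemble here:

```python
from sklearn.ensemble import VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

candidates = {
    "LinearSVM": LinearSVC(),
    "NaiveBayes": MultinomialNB(),
    # Hard voting: each learner casts one vote per prediction
    "Voting": VotingClassifier([("svm", LinearSVC()),
                                ("nb", MultinomialNB())]),
}
for name, clf in candidates.items():
    pipe = make_pipeline(TfidfVectorizer(), clf)
    acc = cross_val_score(pipe, train["text"], train["label"], cv=10).mean()
    print(f"{name}: mean CV accuracy = {acc:.3f}")
```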