How to Extract Numbers from Text Mining

danong
danong New Altair Community Member
edited November 5 in Community Q&A
Hi,

I have tokenized the text and filtered out some words, which leaves only numbers and English words.

My problem now is that I want to extract the numbers and the English words separately and put them in different results.

How can I achieve that?

By the way, I'm using the text mining tool here; the file is in .txt format and is semi-structured.


Thanks for helping.

Answers

  • IngoRM
    IngoRM New Altair Community Member
    Hi,

    Sorry, I did not get your point. Can you give us an example, ideally of the data before the transformation and of the result you would like to achieve?

    Cheers,
    Ingo
  • danong
    danong New Altair Community Member
    Hi, thanks for the reply.

    I have actually solved the problem already.

    Okay, I will rephrase my problem here:


    I had a text file containing, for example: "Bobbie goes to school today in the morning at 8 oclock with his 30 packs of noodles."

    I would like to pull out the English words (bobbie, goes, to ... etc) as well as the numbers (8, 30).

    But I found that the filter allows only one thing at a time, either English words or numbers,
    and does not allow filtering for both.


    I could not find another way,
    but in the end I loaded the file twice and did the filtering separately, and that solved it.


    Thanks.
  • LiZeyuan
    LiZeyuan New Altair Community Member
    Hey mate,

    I am a beginner with RapidMiner.
    I am facing a similar issue: I want to extract the numeric values from a text, e.g.:
    "the task finished at the year 2018"
    I just need the numeric information, "2018". How can I filter out the words when tokenizing?

    Thanks,
    much appreciated.

  • IngoRM
    IngoRM New Altair Community Member
    Hi,
    There was a similar discussion recently in the community: https://community.rapidminer.com/discussion/54737/how-to-extract-year-from-a-string
    Maybe this can give you some hints.
    Cheers,
    Ingo
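
For readers who want to see the idea outside RapidMiner, here is a minimal sketch in plain Python. It is not taken from the linked discussion; the function names `extract_numbers` and `extract_years` are hypothetical, and treating any stand-alone four-digit number as a year is an assumption made only for illustration.

```python
import re

def extract_numbers(text):
    """Return every run of digits in the text, in order of appearance."""
    return re.findall(r"\d+", text)

def extract_years(text):
    """Return stand-alone four-digit numbers (assumed here to be years)."""
    return re.findall(r"\b\d{4}\b", text)

sentence = "the task finished at the year 2018"
print(extract_numbers(sentence))  # ['2018']
print(extract_years(sentence))    # ['2018']
```

The results are strings; convert them with `int()` if numeric values are needed downstream.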
  • kayman
    kayman New Altair Community Member
    You could also have done it with a single load by using the Multiply operator. Use one output port to filter for 'number-style strings' and the other to do the opposite.

    The outcome is the same, of course, but the data is loaded only once.
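
The "load once, branch twice" idea can be sketched in plain Python as follows. This is only an illustration of the pattern, not a RapidMiner process: the function name `split_words_and_numbers` is hypothetical, and the tokenization rule (runs of ASCII letters or runs of digits) is an assumption.

```python
import re

def split_words_and_numbers(text):
    """Tokenize once, then send each token to one of two result lists."""
    # Single pass over the text: a token is a run of letters or a run of digits.
    tokens = re.findall(r"[A-Za-z]+|\d+", text)
    # Branch 1 keeps the word tokens, branch 2 keeps the number tokens.
    words = [t for t in tokens if t.isalpha()]
    numbers = [t for t in tokens if t.isdigit()]
    return words, numbers

sentence = ("Bobbie goes to school today in the morning at 8 oclock "
            "with his 30 packs of noodles.")
words, numbers = split_words_and_numbers(sentence)
print(words)
print(numbers)  # ['8', '30']
```

Both result lists come from the same token list, so the text is read and tokenized only once, which mirrors copying the data to two filter branches.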
  • Ahmedte1234
    Ahmedte1234 New Altair Community Member
    How can I post a question in this forum? I need help very much.
  • varunm1
    varunm1 New Altair Community Member
    Hello @Ahmedte1234

    Please see the screenshots below. There is a big "Ask Question" icon at the top right of this community window. If you click it, you can read some quick tips on posting a question. You need to provide a title for the question and a detailed description of your process and issue.



    Once you click this, you get the screen below. Read the three steps shown there and provide a detailed explanation of your issue.