"How to impute missing values"
ccapra
New Altair Community Member
I have a survey dataset.
The survey design allows people to enter information about a single event more than once without repeating some details such as the host's contact info, and the event name & date.
This creates rows where some columns are empty, and the missing data is essentially the same as the value in the same column of the previous row.
Like this:
Sally Smith sally.smith@email.com Jan 1 2012 Special Event Downtown One cool thing about the event
Joe Shchoe joe.sch@email.com Feb 2 2012 Dumb Event Riverside One cool thing about the event
Another cool thing about the event
Joe had a lot to say about this event
Betty Boop betty.boop@email.com Jan 5 2012 Odd Event Out in the Boonies One mildly cool thing about the event
********
So as you can see - Joe entered 3 rows of data & only had to put in the redundant info once - and now I need to fill in the missing cells from the data above (i.e. copy Joe's contact & event data from the second row into the third & fourth rows).
I'm very new to RM & have only used some simple operators, and have never worked with either a subprocess or a 'learner' - but I think I need to use the 'impute missing values' operator here - is that right?
And if so - how do I proceed? (I don't know how it's supposed to look, but when I go into the impute missing values operator, there is nothing 'inside' it. I sorta thought it would have the subprocesses contained within, but it does not - so, am I just expecting the wrong thing, or is there something wrong with my program?)
Or - should I create a macro? Something like 'if a cell is empty, copy the cell from above'? If so, how would I do that?
Thanks!
Answers
If you always want to replace a missing value with the nearest non-missing value above it in the same column, you can follow these steps:
1. Install the Series Extension from the marketplace (Tools -> Updates and Extensions).
2. Use the operator Replace Missing Values (Series) with replacement set to "previous value".
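For anyone who wants to see what "previous value" replacement amounts to, here is a rough sketch in plain Python of the "if a cell is empty, copy the cell from above" idea. It is not RapidMiner code and not how the operator is implemented; the function name fill_from_previous, the column split of the sample rows, and the choice to treat empty strings and None as missing are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence

Cell = Optional[str]


def fill_from_previous(rows: Sequence[Sequence[Cell]],
                       missing: tuple = ("", None)) -> List[List[Cell]]:
    """Return a copy of rows where each missing cell takes the value
    from the same column of the row above (hypothetical helper)."""
    filled: List[List[Cell]] = []
    previous: Optional[List[Cell]] = None
    for row in rows:
        current = list(row)  # copy, so the input rows are left untouched
        if previous is not None:
            for i, cell in enumerate(current):
                if cell in missing and i < len(previous):
                    # previous is already filled, so values carry down
                    # across several consecutive incomplete rows
                    current[i] = previous[i]
        filled.append(current)
        previous = current
    return filled


# Sample rows from the question; the split into six columns is assumed.
rows = [
    ["Sally Smith", "sally.smith@email.com", "Jan 1 2012", "Special Event",
     "Downtown", "One cool thing about the event"],
    ["Joe Shchoe", "joe.sch@email.com", "Feb 2 2012", "Dumb Event",
     "Riverside", "One cool thing about the event"],
    ["", "", "", "", "", "Another cool thing about the event"],
    ["", "", "", "", "", "Joe had a lot to say about this event"],
    ["Betty Boop", "betty.boop@email.com", "Jan 5 2012", "Odd Event",
     "Out in the Boonies", "One mildly cool thing about the event"],
]

filled = fill_from_previous(rows)
for row in filled:
    print(row)
```

Note that a missing cell in the very first row stays missing, since there is nothing above it to copy from.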
Best regards,
Marius