how to use output from proc optimalbin

Michael_O_Neil Altair Community Member

I want to use the output datasets from proc optimalbin as input for weight-of-evidence (WoE) encoding of data before training a logistic model. I also need them to WoE-encode incoming data, to prepare it for scoring with the logistic regression score code.

Has anyone developed code that processes the output datasets from proc optimalbin and generates if-then-else code, or proc sql case statements, that create the new variables from the bin boundaries and the statistics for each bin?

Is there any code you are willing to share?

Answers

  • Nico Chart_21517
    Altair Employee

    Hi Michael,

    I think our support team are working on a reply in our ticketing system.

    Nico

  • Michael_O_Neil
    Michael_O_Neil Altair Community Member

    Thanks Nico,

    Support are working on an issue where proc optimalbin reports different numbers of bins across the different datasets it produces. I have isolated the cause to some underlying behaviour of the proc when the variable to be binned has no format attribute associated with it in the dataset.

    What I was seeking here was any code that members of this discussion group might have that produces WoE-encoding if-then-else or case statements from the output datasets produced by proc optimalbin.

    The procedure does have the code file= option, but this produces the if-then-else statements for the bin encoding. It would be nice to have a setting that tells this option to produce the WoE-encoding statements instead.

    I have since gone on to create a rough working version of a program to do this, but the content of the two output datasets it uses varies a lot, and I don't think I have all the variations covered yet.
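
For anyone attempting the same thing, here is a minimal sketch of the code-generation step, written in Python rather than in the SAS language. It is not based on the real proc optimalbin output layout: the (upper_bound, woe) bin structure, the function name woe_if_then_else, the woe_ prefix and the default WoE of 0.0 for missing values are all assumptions made for illustration. You would first need to map the columns of your own output datasets onto this shape, and check whether your bin boundaries are inclusive or exclusive.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical bin layout: (upper_bound, woe). upper_bound=None marks the
# open-ended last bin. This is NOT the real PROC OPTIMALBIN output schema.
Bin = Tuple[Optional[float], float]


def woe_if_then_else(var: str, bins: Sequence[Bin],
                     missing_woe: float = 0.0,
                     prefix: str = "woe_") -> str:
    """Return SAS-language IF/THEN/ELSE text that assigns a WoE value per bin.

    Assumes each bin covers values up to and including its upper bound.
    """
    target = f"{prefix}{var}"
    # Test for missing first: SAS sorts missing below every number, so it
    # would otherwise fall into the lowest bin.
    lines = [f"if missing({var}) then {target} = {missing_woe!r};"]
    finite = [(ub, w) for ub, w in bins if ub is not None]
    open_ended = [w for ub, w in bins if ub is None]
    # Emit the bounded bins in ascending order of upper bound.
    for ub, w in sorted(finite):
        lines.append(f"else if {var} <= {ub!r} then {target} = {w!r};")
    # Without an open-ended bin, values above the last bound stay unassigned.
    if open_ended:
        lines.append(f"else {target} = {open_ended[0]!r};")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up boundaries and WoE values, purely for illustration.
    print(woe_if_then_else("age", [(25.0, -0.42), (40.0, 0.1), (None, 0.57)]))
```

The generated text can be written to a file and pulled into a data step with %include, for both the training data and incoming data to be scored. Producing a proc sql case expression instead would only change the string templates.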
