Saving a data step as a view

IanBD
Altair Employee
edited September 2022 in Altair RapidMiner

A DATA step view is a DATA step that has been saved in a format that enables it to be run and used as a data source by other DATA steps or procedures. The data output by the DATA step view is read as if it is a dataset.

There are many reasons you might want to use a DATA step view. You can use DATA step views to create algorithmic sources of data; you could, for example, create a view that returns numbers in the Fibonacci sequence. Alternatively, you might want to transform data in some way before use but prefer not to store the transformed data permanently.

To create a DATA step view, specify the /VIEW= option on the DATA statement. The statement required to create a view has the format:

DATA ds_name /VIEW=ds_name;

where ds_name specifies both the output dataset and the name of the view. The name of the view must be the same as the name of the dataset that would be created by the DATA step. The dataset cannot be a null dataset created using the _NULL_ keyword, because the dataset name is used to recall the view and read the data, and _NULL_ is not a valid data source name.

For example, the following DATA step creates a view called myview in the default Work library:

DATA myview /VIEW=myview;
  a = 1;
RUN;

You can save the view in any folder or directory by using the LIBNAME global statement to specify a library. For example, you could specify:

LIBNAME temp 'c:\temp';
DATA temp.myview /VIEW=temp.myview;
  a = 1;
RUN;

The view is stored in the folder c:\temp.

You can then use the DATA step view as an input dataset. For example:

DATA newds;
  SET temp.myview;
RUN;

This DATA step locates the DATA step view temp.myview and then executes it to generate the required data. It creates a new dataset newds that contains the data created in the DATA step view.

You can also use the view as the data source for a procedure:

PROC PRINT DATA=temp.myview;
RUN;

This prints the data generated in the view temp.myview to the listing file created by the procedure.

In Workbench, the data generated by a DATA step view can be viewed in the dataset viewer. Find the view in the Work library, or in the library you specified, and double-click it. The dataset viewer opens and displays the data created by the DATA step view.

The examples above show how the data generated by a DATA step view is used as the input to other DATA steps or procedures. In those examples, the data was a single value assigned to a variable. You can, however, use a DATA step of any complexity to generate the data in a view.
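As a minimal sketch of a view that transforms data without storing the result, the following converts temperatures on the fly. The dataset temps, the view temps_f, and the variables city, celsius and fahrenheit are hypothetical names chosen for illustration only.

```sas
/* Hypothetical input data, created here so the example is self-contained. */
DATA work.temps;
  INPUT city $ celsius;
  DATALINES;
Oslo 4
Cairo 31
Lima 19
;
RUN;

/* The view stores only the DATA step; fahrenheit is computed each time the view is read. */
DATA work.temps_f /VIEW=work.temps_f;
  SET work.temps;
  fahrenheit = celsius * 9 / 5 + 32;
RUN;

/* Reading the view runs the DATA step above and prints the converted rows. */
PROC PRINT DATA=work.temps_f;
RUN;
```

Because the conversion runs whenever the view is read, later changes to work.temps should be reflected the next time the view is used.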

In the following example, a Microsoft Excel file is read. Only rows where the value of the variable meets a specified condition are read into the DATA step. The view is stored in the specified library temp.

LIBNAME books XLSX 'c:\temp\books\lib_books.xlsx';
LIBNAME temp 'c:\temp';

DATA temp.myview /VIEW=temp.myview;
  SET books.libbooks;
  WHERE dewey_decimal_number EQ '823';
RUN;

Only rows in the workbook that have a Dewey Decimal number equal to 823 are read by the DATA step.

You could then use this DATA step view to create further subsets of the data:

LIBNAME books XLSX 'c:\temp\books\lib_books.xlsx';
LIBNAME temp 'c:\temp';

DATA out;
  SET temp.myview;
  WHERE author EQ 'Rendell, Ruth';
RUN;

Note that if you use the DATA step view in a new session, such as from the command line, you need to execute the LIBNAME statements again. This makes the Excel file available to the DATA step view, and the DATA step view available to this DATA step.

Because a DATA step view can be used to create data algorithmically, it is possible that the stream of data created might never end (for example, the Fibonacci sequence is infinite). When reading from such a view, take care that only the required data is read and that the reading DATA step can terminate.
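One way to guard against a runaway view is sketched below. It is only an illustration: the names fib, first10, n, value, prev and curr are hypothetical, the loop cap of 50 is arbitrary, and the sketch assumes that the OBS= dataset option limits rows read from a view in the same way it does for an ordinary dataset.

```sas
/* View that generates Fibonacci numbers. The loop is capped so that  */
/* the view always terminates, even if a reader asks for every row.   */
DATA fib /VIEW=fib;
  prev = 0;
  curr = 1;
  DO n = 1 TO 50;
    value = curr;        /* the n-th Fibonacci number */
    OUTPUT;
    curr = prev + curr;  /* advance to the next number */
    prev = value;
  END;
  KEEP n value;
RUN;

/* Reader that requests only the first 10 rows from the view. */
DATA first10;
  SET fib(OBS=10);
RUN;
```

Capping the loop inside the view and limiting the rows in the reader are independent safeguards; either one alone is enough for the reading step to finish.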