SAS JSON libname engine support

LexJansen
Altair Community Member
edited October 11 in Community Q&A

Running Altair SLC 2023 - Community Edition (05.23.01.00.000106), I am trying to use the JSON libname engine from SAS.
I get an error. Is this engine not supported? Are there plans to support it?

4         libname xxx '/altair/dataset-json'; /* SAS library */
NOTE: Library xxx assigned as follows:
      Engine:        WPD
      Physical Name: C:\altair\dataset-json

5         libname ex32 json '/altair/dataset-json/json_out/adam\adtte.json'
                  ^
ERROR: * Line 1, Column 32758
         Syntax error: value, object or array expected.

6         map='user32.map' automap=create ordinalcount=all; /* JSON file */
7         proc copy in=ex32 out=xxx;
                            ^
ERROR: Library "ex32" cannot be found
NOTE: Procedure COPY was not executed because of errors detected
8         run;

Answers

  • Nico Chart_21517
    Altair Employee
    edited June 2023

    Hi Lex,

    You should find that PROC JSON and LIBNAME JSON are both working in WPS/SLC 5.23.1.

    The error message you are seeing, "value, object or array expected", indicates that the format of the JSON file does not meet the requirements of our LIBNAME JSON engine. Are you able to show us some of the top-level structure of this JSON file? Does the opening line begin with a "{" (left brace) character, for example?

    Best Regards,
    Nico

  • Philip Holland
    Altair Community Member
    edited June 2023

    Hi Lex,

    What happens if you put LIBNAME EX32 above LIBNAME XXX, and are more explicit about your locations by prefixing them with 'C:'?

    The default folder can't be changed in the Workbench and defaults to your current Workbench folder.

    ............Phil

  • LexJansen
    Altair Community Member
    edited June 2023

    (quoting Phil's reply above)

    Thanks, Phil.

    No that is not the issue. I found the issue. See my other response.

    Lex

  • LexJansen
    Altair Community Member
    edited June 2023

    (quoting Nico's reply above)

    Hi Nico,

    Thanks! I found the issue.
    The JSON file is valid (see: https://github.com/lexjansen/dataset-json-sas/blob/master/json/adam/adtte.json).

    I did not see earlier that the code actually continues after the error, and does create the datasets.
    The error seems to happen because the JSON file is just one long record.
    When I format and indent the JSON file, I do not get the issue.
    I think this should be considered a bug.

    Thanks again!

    Best,
    Lex

  • Nico Chart_21517
    Altair Employee
    edited June 2023

    Thanks Lex. I'll raise a bug report with our developers and let you know how it is received.
    Best Regards,
    Nico

  • Nico Chart_21517
    Altair Employee
    edited June 2023

    Hi Lex,

    The original JSON file is a single line of 56694 characters. With shorter JSON examples consisting of one long line, I can get SLC to read them by setting LRECL=32767; however, LRECL cannot be set as high as 56694.

    So it appears that SLC's JSON engine is limited by the LRECL line length, and you were able to read your JSON file by reformatting it, thereby ensuring that no line was longer than LRECL.

    Best Regards,
    Nico
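
    For files whose longest line does fit within the 32767-byte ceiling, the workaround Nico describes can be sketched roughly as below; the fileref name is illustrative and the path is taken from the original log:

    /* Assign a fileref with an explicit logical record length, then  */
    /* point the JSON engine at the fileref rather than at a path.    */
    filename jj disk '/altair/dataset-json/json_out/adam/adtte.json' lrecl=32767;
    libname ex32 json fileref=jj;

    proc contents data=ex32._all_;
    run;

    This only helps when no single line of the file exceeds the LRECL value, which is exactly why the single-line 56694-character file cannot be read this way.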


  • LexJansen
    Altair Community Member
    edited June 2023

    Hi Nico.

    Reformatting is not a solution, especially now that the FDA is starting a Dataset-JSON pilot and has indicated that file sizes are an issue. We therefore need to make the JSON files as small as possible, by using the NOPRETTY option on PROC JSON.

    Thanks, Lex
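
    For reference, writing compact single-line JSON with PROC JSON looks roughly like this; the output path and input dataset are placeholders:

    /* NOPRETTY suppresses indentation and line breaks, producing the */
    /* smallest possible file -- exactly the single-line form that    */
    /* tripped the LRECL limit on read.                               */
    proc json out='/altair/dataset-json/dm.json' nopretty;
       export sashelp.class;
    run;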

  • LexJansen
    Altair Community Member
    edited October 10

    Hi,

    Was this ever solved?

    Especially now that Dataset-JSON is gaining traction, it is important that Altair fully supports the JSON libname engine, that is, supports record lengths larger than LRECL=32767.

    Thanks,
    Lex

  • Lorenz Albert Koh
    Altair Employee
    edited October 11

    Hi Lex,

    I hope you're doing well!

    I wanted to let you know that in our more recent SLC maintenance versions, we've made improvements that allow you to set a higher value for the LRECL option. The value must be specified on the FILENAME statement when working with files.

    Here's a sample for your reference:

    filename tx disk "C:\temp\pretty.json" lrecl=1073741823;
    libname bean JSON fileref=tx;

    proc contents data=bean._all_;
    run;

    proc datasets lib=bean;
    quit;


    Regards

    Lorenz

  • LexJansen
    Altair Community Member
    edited October 11

    Hi Lorenz,

    Thank you for the prompt reply!
    Also thank you for the great news about this fix.
    I will certainly try it out.

    Best,
    Lex


  • LexJansen
    Altair Community Member
    edited October 11

    Here is some feedback from trying out the JSON libname engine. The record length limitation seems solved. That's great!

    I tried the following:

    filename tx disk "C:\_github\lexjansen\dataset-json-sas\json\sdtm\dm.json" lrecl=1073741823;
    libname bean JSON fileref=tx;

    proc contents data=bean._all_;
    run;

    proc datasets lib=bean;
    quit;

    proc copy in=bean out=work;
    run;

    The JSON file can be found at: 
    https://github.com/lexjansen/dataset-json-sas/blob/master/json/sdtm/dm.json

    115       proc copy in=bean out=work;
    116       run;
    NOTE: Copying Member "BEAN.root" to WORK.root (memtype=DATA)
    WARNING: Copy is overwriting existing dataset WORK.root
    NOTE: 1 observations read from input dataset BEAN.root
    NOTE: Output dataset WORK.root has 1 observations and 14 variables
    NOTE: Member "BEAN.root.DATA" (memtype=DATA) copied
    NOTE: Copying Member "BEAN.columns" to WORK.columns (memtype=DATA)
    WARNING: Copy is overwriting existing dataset WORK.columns
    NOTE: 27 observations read from input dataset BEAN.columns
    NOTE: Output dataset WORK.columns has 27 observations and 8 variables
    NOTE: Member "BEAN.columns.DATA" (memtype=DATA) copied
    NOTE: Copying Member "BEAN.rows" to WORK.rows (memtype=DATA)
    WARNING: Copy is overwriting existing dataset WORK.rows
    NOTE: 0 observations read from input dataset BEAN.rows
    NOTE: Output dataset WORK.rows has 0 observations and 29 variables
    NOTE: Member "BEAN.rows.DATA" (memtype=DATA) copied
    NOTE: Copying Member "BEAN.sourceSystem" to WORK.sourceSystem (memtype=DATA)
    WARNING: Copy is overwriting existing dataset WORK.sourceSystem
    NOTE: 1 observations read from input dataset BEAN.sourceSystem
    NOTE: Output dataset WORK.sourceSystem has 1 observations and 4 variables
    NOTE: Member "BEAN.sourceSystem.DATA" (memtype=DATA) copied
    NOTE: Copying Member "BEAN.alldata" to WORK.alldata (memtype=DATA)
    WARNING: Copy is overwriting existing dataset WORK.alldata
    NOTE: 648 observations read from input dataset BEAN.alldata
    NOTE: Output dataset WORK.alldata has 648 observations and 6 variables
    NOTE: Member "BEAN.alldata.DATA" (memtype=DATA) copied
    NOTE: 5 members copied
    NOTE: Procedure copy step took :
          real time : 0.030
          cpu time  : 0.031

    I was expecting the WORK.rows dataset to have 18 observations, as I get with SAS, but it has 0 observations.
    Below is the SAS log.

    Also, the WORK.ALLDATA dataset is structured quite differently from SAS, although it does seem to contain all the data.

    10   proc copy in=bean out=work;
    11   run;
    NOTE: Copying BEAN.ALLDATA to WORK.ALLDATA (memtype=DATA).
    NOTE: BUFSIZE is not cloned when copying across different engines. System Option for BUFSIZE was used.
    INFO: Data set block I/O cannot be used because:
    INFO:   -  The data sets use different engines, have different variables or have attributes that may differ.
    NOTE: There were 674 observations read from the data set BEAN.ALLDATA.
    NOTE: The data set WORK.ALLDATA has 674 observations and 5 variables.
    NOTE: Copying BEAN.COLUMNS to WORK.COLUMNS (memtype=DATA).
    NOTE: BUFSIZE is not cloned when copying across different engines. System Option for BUFSIZE was used.
    INFO: Data set block I/O cannot be used because:
    INFO:   -  The data sets use different engines, have different variables or have attributes that may differ.
    NOTE: There were 27 observations read from the data set BEAN.COLUMNS.
    NOTE: The data set WORK.COLUMNS has 27 observations and 8 variables.
    NOTE: Copying BEAN.ROOT to WORK.ROOT (memtype=DATA).
    NOTE: BUFSIZE is not cloned when copying across different engines. System Option for BUFSIZE was used.
    INFO: Data set block I/O cannot be used because:
    INFO:   -  The data sets use different engines, have different variables or have attributes that may differ.
    NOTE: There were 1 observations read from the data set BEAN.ROOT.
    NOTE: The data set WORK.ROOT has 1 observations and 14 variables.
    NOTE: Copying BEAN.ROWS to WORK.ROWS (memtype=DATA).
    NOTE: BUFSIZE is not cloned when copying across different engines. System Option for BUFSIZE was used.
    INFO: Data set block I/O cannot be used because:
    INFO:   -  The data sets use different engines, have different variables or have attributes that may differ.
    NOTE: There were 18 observations read from the data set BEAN.ROWS.
    NOTE: The data set WORK.ROWS has 18 observations and 29 variables.
    NOTE: Copying BEAN.SOURCESYSTEM to WORK.SOURCESYSTEM (memtype=DATA).
    NOTE: BUFSIZE is not cloned when copying across different engines. System Option for BUFSIZE was used.
    INFO: Data set block I/O cannot be used because:
    INFO:   -  The data sets use different engines, have different variables or have attributes that may differ.
    NOTE: There were 1 observations read from the data set BEAN.SOURCESYSTEM.
    NOTE: The data set WORK.SOURCESYSTEM has 1 observations and 4 variables.
    NOTE: PROCEDURE COPY used (Total process time):
          real time           0.05 seconds
          cpu time            0.01 seconds