Federalist Papers
btibert
New Altair Community Member
Has anyone had success in bringing in the Federalist Papers dataset? The JSON form can be found here: http://ptrckprry.com/course/ssd/data/federalist.json
These are the steps I have attempted:
- Parse the JSON into a CSV, but the newline characters seem to get stuck when using Read CSV (a workaround for this is sketched below the pandas snippet).
- Use the Python extension operator configured to a local conda environment. Same result.
Regarding point 2 above, in pandas outside of RapidMiner, the dataframe is exactly what I want.
For context, I use this in class to show that we can use the text, and the similarity within it, to classify the author.
import pandas as pd

URL = "http://ptrckprry.com/course/ssd/data/federalist.json"
fed = pd.read_json(URL, lines=True)
fed.head()
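One way around the stuck-newline problem in point 1, assuming the file is newline-delimited JSON (one paper per line, as the lines=True call above suggests), is to flatten the embedded newlines in the string columns before writing the CSV, so that each record stays on a single physical line for Read CSV. A minimal sketch, not tested inside RapidMiner:

import pandas as pd

URL = "http://ptrckprry.com/course/ssd/data/federalist.json"

# The file is newline-delimited JSON: one paper per line.
fed = pd.read_json(URL, lines=True)

# Collapse embedded newlines (and runs of whitespace) in every string column
# so that each record occupies a single physical line in the CSV.
for col in fed.select_dtypes(include="object").columns:
    fed[col] = fed[col].astype(str).str.replace(r"\s+", " ", regex=True)

fed.to_csv("federalist.csv", index=False)

With the newlines collapsed, Read CSV should no longer split records mid-text.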
Best Answer
Thanks, I was actually able to port another file from SAS using Read SAS, which did the trick.
Answers
Hi @btibert,
As a partial answer, have you tried the Read Document (after downloading the federalist.json file to your computer) and JSON To Data operators?
Below is the process.
Hope this helps,
Regards,
Lionel

<?xml version="1.0" encoding="UTF-8"?>
<process version="9.5.000">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="9.5.000" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true">
      <operator activated="true" breakpoints="after" class="text:read_document" compatibility="8.2.000" expanded="true" height="68" name="Read Document" width="90" x="179" y="136">
        <parameter key="file" value="C:\Users\Lionel\Desktop\json.json"/>
        <parameter key="extract_text_only" value="true"/>
        <parameter key="use_file_extension_as_type" value="true"/>
        <parameter key="content_type" value="txt"/>
        <parameter key="encoding" value="SYSTEM"/>
      </operator>
      <operator activated="true" class="text:json_to_data" compatibility="8.2.000" expanded="true" height="82" name="JSON To Data" width="90" x="380" y="136">
        <parameter key="ignore_arrays" value="false"/>
        <parameter key="limit_attributes" value="false"/>
        <parameter key="skip_invalid_documents" value="false"/>
        <parameter key="guess_data_types" value="true"/>
        <parameter key="keep_missing_attributes" value="false"/>
        <parameter key="missing_values_aliases" value=", null, NaN, missing"/>
      </operator>
      <connect from_op="Read Document" from_port="output" to_op="JSON To Data" to_port="documents 1"/>
      <connect from_op="JSON To Data" from_port="example set" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
Thanks for this, but right now that only parses the first entry; there are ~85 entries in the file.
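If the whole-file read is what limits JSON To Data to the first entry, one possible workaround (sketched in Python outside of RapidMiner, with a hypothetical output folder name) is to split the line-delimited file into one .json document per paper, which the Text Processing operators can then loop over:

import json
from pathlib import Path
from urllib.request import urlopen

URL = "http://ptrckprry.com/course/ssd/data/federalist.json"
out_dir = Path("federalist_docs")  # hypothetical output folder
out_dir.mkdir(exist_ok=True)

# Each line of the file is a complete JSON object (one paper),
# so write every non-empty line out as its own .json document.
with urlopen(URL) as resp:
    lines = resp.read().decode("utf-8").splitlines()

for i, line in enumerate(lines):
    if line.strip():
        record = json.loads(line)  # make sure the line parses on its own
        (out_dir / f"paper_{i:02d}.json").write_text(json.dumps(record), encoding="utf-8")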