How to work with a very large .csv file?
Hi folks, it looks like there have been one or two similar posts in the past, but they are years old at this point, so I thought it would be helpful to refresh the topic.
I need to load a huge .csv file (4.72 GB, ~23 million lines) and break it up into smaller .csv files according to one polynominal attribute. This is public State of Texas data, so the attribute I want to split on is "County", and I want the resulting .csv files written to my local disk.
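To make the goal concrete, here is a rough sketch of the split I have in mind, written in plain Python rather than as a RapidMiner process. It streams the file row by row so the whole 4.72 GB never has to sit in memory, and it buffers rows per county so only one output file is open at a time. The function name, the `flush_every` parameter, the UTF-8 encoding, and the assumption that the header is literally `County` are all mine, so treat it as an illustration and not a tested solution for this data set.

```python
import csv
import os
import re
from collections import defaultdict


def split_csv_by_column(src_path, out_dir, column="County", flush_every=100_000):
    """Stream src_path and write one CSV per distinct value of `column`.

    Returns a dict mapping each value to the number of rows written for it.
    Assumes the file has a header row and is UTF-8 encoded (both assumptions).
    """
    os.makedirs(out_dir, exist_ok=True)
    buffers = defaultdict(list)  # value -> rows waiting to be written
    counts = defaultdict(int)    # value -> rows seen so far
    started = set()              # output paths that already have a header
    buffered = 0

    def out_path(value):
        # Lowercase and replace characters that are unsafe in file names.
        safe = re.sub(r"[^A-Za-z0-9_-]+", "_", value.strip()).lower() or "blank"
        return os.path.join(out_dir, f"{safe}.csv")

    def flush(header):
        # Append buffered rows to each output file, writing the header once.
        for value, rows in buffers.items():
            path = out_path(value)
            is_new = path not in started
            with open(path, "w" if is_new else "a", newline="", encoding="utf-8") as f:
                writer = csv.writer(f)
                if is_new:
                    writer.writerow(header)
                    started.add(path)
                writer.writerows(rows)
        buffers.clear()

    with open(src_path, newline="", encoding="utf-8") as src:
        reader = csv.reader(src)
        header = next(reader)
        idx = header.index(column)  # raises ValueError if the column is missing
        for row in reader:
            value = row[idx]
            buffers[value].append(row)
            counts[value] += 1
            buffered += 1
            if buffered >= flush_every:
                flush(header)
                buffered = 0
        flush(header)  # write whatever is left in the buffers
    return dict(counts)
```

Calling something like `split_csv_by_column("texas.csv", "by_county")` (both paths are placeholders) would leave one file per county in the `by_county` folder. A larger `flush_every` trades memory for fewer file opens.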
What's the best, most computationally efficient way to do this?
Thanks!!
cc @sgenzer