UTF-16 vs. system encoding in document processing?
I used the "processing data from files" operator to perform some text mining tasks. There are 1800 short files, most of which are emails. When I use the "system" encoding, it works just fine. However, when I choose the "UTF-16" encoding, a "memory usage" warning appears. How does the "system" encoding differ from "UTF-16"? Are there any guidelines for choosing an encoding for document processing?