Altair RapidMiner
"Crawl Web", "Get Page[s]", "Extract Content" and document encoding/charset
avk
Hi all
I crawl websites in Russian. Some of them return content in UTF-8; others use Windows-1251 encoding. Is there a way to convert the retrieved pages to a single encoding (preferably UTF-8) based on the Content-Type server headers and the META tags in the document?
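To make the question concrete, here is a minimal sketch in plain Python of the conversion I have in mind. It is not a RapidMiner operator or API, and the function names (`charset_from_header`, `charset_from_meta`, `to_utf8`) are hypothetical. It assumes the raw page bytes and the Content-Type header value are available, and it falls back to Windows-1251 only because that fits my sites.

```python
import re
from typing import Optional

# Finds "charset=..." in a Content-Type header value.
_HEADER_RE = re.compile(r"charset\s*=\s*[\"']?\s*([A-Za-z0-9._:-]+)", re.IGNORECASE)
# Finds "charset=..." inside a <meta ...> tag in the raw (undecoded) bytes.
_META_RE = re.compile(rb"<meta[^>]+charset\s*=\s*[\"']?\s*([A-Za-z0-9._:-]+)", re.IGNORECASE)


def charset_from_header(content_type: Optional[str]) -> Optional[str]:
    """Return the charset named in a Content-Type header, if any."""
    if not content_type:
        return None
    match = _HEADER_RE.search(content_type)
    return match.group(1).lower() if match else None


def charset_from_meta(raw: bytes, scan_bytes: int = 4096) -> Optional[str]:
    """Return the charset named in a META tag near the start of the page, if any."""
    match = _META_RE.search(raw[:scan_bytes])
    return match.group(1).decode("ascii").lower() if match else None


def to_utf8(raw: bytes, content_type: Optional[str] = None,
            fallback: str = "windows-1251") -> bytes:
    """Decode raw page bytes with the best available charset guess, re-encode as UTF-8.

    Order tried: header charset, META charset, UTF-8, then the fallback
    (an assumption suited to Russian-language sites).
    """
    candidates = (charset_from_header(content_type), charset_from_meta(raw),
                  "utf-8", fallback)
    for name in candidates:
        if name is None:
            continue
        try:
            return raw.decode(name).encode("utf-8")
        except (LookupError, UnicodeDecodeError):
            # Unknown codec name or bytes that do not fit it: try the next guess.
            continue
    # Last resort: decode with the fallback, replacing undecodable bytes.
    return raw.decode(fallback, errors="replace").encode("utf-8")
```

For example, `to_utf8(page_bytes, "text/html; charset=windows-1251")` would return the same page as UTF-8 bytes. One caveat with this ordering: a single-byte encoding such as Windows-1251 accepts almost any byte sequence, so a wrong header declaration can decode without an error and still yield garbled text.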