"Problems with special characters in server logs"
Are
Hey community,
I have an issue with a log file that I want to read with the LogFileSource operator.
Here is the XML code of the operator:
<operator name="xxx" class="LogFileSource">
  <parameter key="config_file" value="/Users/xxx/ConfigurationFile.xml"/>
  <parameter key="log_dir" value="/Users/xxx"/>
  <parameter key="filetype_filter" value="ico|gif|jpg|jpeg|css|js|GIF|JPG|png|PNG|flash|xml|Xml|DropIT|Default|Login|axd|404|edit|robots|util|css|NotFound|Util|PlugIn|Sites|admin|Templates|templates|bmp|pdf"/>
  <parameter key="only_HTTP_200" value="true"/>
  <list key="browser_matcher">
  </list>
  <list key="os_matcher">
  </list>
  <list key="language_matcher">
  </list>
</operator>
Here is the line in the file that causes the problem:
2010-03-09 00:37:48 141.76.45.35 - W3SVC9 SEAREWS002 192.168.97.8 80 GET /NotFound.aspx 404;http://www.are360.com/sv/upplevelser/Skidakning-Alpint/Are/[glow=yellow,2,300]ctl00_ã≤∂Êr,w+HÕèæ30ûfi®8Ø?∏95?.ÜCMÖ^iâGy∞uˇÜTõd¡´•´∑˚Q‡⁄¬Ñ7ù˜Æ∆}∆ü,fm[/glow] 500 0 1596 281 16 HTTP/1.1 www.are360.com - - -
As you can see, this log line contains many special characters. Since the URI should be valid (at least the underlined part), these characters should not be allowed.
When reading the file, the message viewer always writes the "WARNING: could not read line" message.
Can anyone help?
Regards,
Are
Answers
Hi,
if I understand you correctly, the problem is already inside the log files? Then I would try opening them in a text editor. If the special characters are still there, try opening the file with another encoding. The log file might be stored as UTF-16 while you are reading it as ANSI.
Greetings,
Sebastian
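The encoding check suggested above can also be done outside a text editor. The following is only a minimal Python sketch, not part of the LogFileSource operator: the function name `decodable_encodings` and the list of candidate encodings are assumptions chosen for illustration. It reports which candidate encodings can decode a chunk of raw bytes without errors.

```python
# Candidate encodings are an assumption; adjust them to what the web server
# may plausibly have written ("cp1252" is what Windows often calls "ANSI").
CANDIDATES = ("utf-8", "utf-16", "cp1252")


def decodable_encodings(raw, candidates=CANDIDATES):
    """Return the candidate encodings that decode `raw` without errors."""
    ok = []
    for enc in candidates:
        try:
            raw.decode(enc)
        except UnicodeDecodeError:
            # This encoding cannot represent the byte sequence.
            continue
        ok.append(enc)
    return ok


if __name__ == "__main__":
    import sys

    # Usage: python check_encoding.py <path-to-log-file>
    # The whole file is read so that no multi-byte character is cut in half.
    if len(sys.argv) > 1:
        with open(sys.argv[1], "rb") as fh:
            print(decodable_encodings(fh.read()))
```

A successful decode does not prove that the encoding is the right one, since several encodings can accept the same bytes. A failed decode does rule an encoding out, so the result is best used to narrow down what to try in the text editor.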
Hi Sebastian,
thanks for your help.
Unfortunately, it was not the solution to my problem.
But since these entries make up less than 0.01% of all entries, we decided to just leave them out.
On closer inspection, they all turned out to have status 404, which makes them uninteresting for our analyses anyway.
Best,
Edin
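For anyone who wants to leave such entries out before the file reaches the operator, here is a rough Python sketch of one way to do it. It is not how the entries were removed in this thread; the function names `split_clean_lines` and `clean_log` are hypothetical, and it assumes that every valid log entry consists of ASCII characters only, so any line with a non-ASCII character is treated as damaged.

```python
def split_clean_lines(lines):
    """Split lines into (kept, dropped); a line is dropped if it contains
    any non-ASCII character. Assumes valid entries are pure ASCII."""
    kept, dropped = [], []
    for line in lines:
        if line.isascii():
            kept.append(line)
        else:
            dropped.append(line)
    return kept, dropped


def clean_log(src_path, dst_path):
    """Write only the ASCII-only lines of src_path to dst_path and
    return the counts (kept, dropped)."""
    # latin-1 maps every byte to a character, so reading never fails and
    # every byte above 0x7F shows up as a non-ASCII character.
    with open(src_path, encoding="latin-1", newline="") as src:
        kept, dropped = split_clean_lines(src)
    with open(dst_path, "w", encoding="ascii", newline="") as dst:
        dst.writelines(kept)
    return len(kept), len(dropped)
```

Returning the number of dropped lines makes it easy to confirm that only a small share of the entries is affected before relying on the cleaned file.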