"Problems with special characters in server logs"

Are New Altair Community Member
edited November 5 in Community Q&A
Hey community,

I have an issue with a log file that I want to read in with the LogFileSource operator.

Here is the XML code of the operator:

    <operator name="xxx" class="LogFileSource">
        <parameter key="config_file" value="/Users/xxx/ConfigurationFile.xml"/>
        <parameter key="log_dir" value="/Users/xxx"/>
        <parameter key="filetype_filter" value="ico|gif|jpg|jpeg|css|js|GIF|JPG|png|PNG|flash|xml|Xml|DropIT|Default|Login|axd|404|edit|robots|util|css|NotFound|Util|PlugIn|Sites|admin|Templates|templates|bmp|pdf"/>
        <parameter key="only_HTTP_200" value="true"/>
        <list key="browser_matcher">
        </list>
        <list key="os_matcher">
        </list>
        <list key="language_matcher">
        </list>
    </operator>




Here is the line in the file that causes the problem:

2010-03-09 00:37:48 141.76.45.35    - W3SVC9 SEAREWS002 192.168.97.8 80 GET /NotFound.aspx 404;http://www.are360.com/sv/upplevelser/Skidakning-Alpint/Are/[glow=yellow,2,300]ctl00_ã≤&#6;∂Êr,&#26;w+HÕè&#21;&#127;æ30ûfi®&#21;8Ø?∏95&#29;?.ÜCMÖ^iâGy&#28;∞uˇÜTõd¡´•´&#6;∑˚Q‡⁄¬Ñ7ù˜Æ∆}∆&#3;&#23;ü,f&#17;m[/glow] 500 0 1596 281 16 HTTP/1.1 www.are360.com - - -




As you can see, there are many special characters in this log line. For the URI to be valid (at least the underlined part), these characters should not be allowed.
When reading in the file, the Message Viewer always writes the "WARNING: could not read line" message for this line.

Can anyone help?

Regards,
Are

Answers

  • land New Altair Community Member
    Hi,
    if I understand you correctly, the special characters are already inside the log files? Then I would open the files in a text editor first. If the characters are still there, try opening the file with another encoding. It might be that the log file is stored as UTF-16 and you are reading it as ANSI.

    Greetings,
    Sebastian
  • Are New Altair Community Member
    Hi Sebastian,

    Thanks for your help.

    Unfortunately, that was not the solution to my problem :(

    But since these entries make up less than 0.01% of all entries, we decided to just leave them out.

    On closer inspection, they all turned out to have status 404, which makes them uninteresting for our analyses anyway.

    Best,

    Edin
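
For anyone who ends up here with the same warning: the two ideas from this thread (checking the file's encoding, then leaving the broken lines out) could be done as a pre-processing step before the file reaches LogFileSource. The following is only a rough sketch under a few assumptions: the function names are made up for illustration, the encoding guess looks at a byte order mark only and falls back to latin-1, and a "valid" log line is assumed to contain nothing but printable ASCII and tabs. It does not use any RapidMiner API.

```python
import codecs
import re

# Byte order marks, UTF-32 first so that UTF-32 LE (FF FE 00 00)
# is not mistaken for UTF-16 LE (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

# Assumption: a usable log line holds only printable ASCII and tabs.
_SUSPICIOUS = re.compile(r"[^\x20-\x7e\t]")


def guess_encoding(head: bytes, default: str = "latin-1") -> str:
    """Guess an encoding from the first bytes of a file (BOM check only)."""
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return default


def split_lines(lines):
    """Return (clean, rejected) lists; line endings are stripped."""
    clean, rejected = [], []
    for line in lines:
        text = line.rstrip("\r\n")
        if _SUSPICIOUS.search(text):
            rejected.append(text)
        else:
            clean.append(text)
    return clean, rejected


def clean_log(src: str, dst: str) -> int:
    """Copy only the clean lines of src to dst; return the number rejected."""
    with open(src, "rb") as f:
        encoding = guess_encoding(f.read(4))
    # Undecodable bytes become U+FFFD, which the filter then rejects.
    with open(src, encoding=encoding, errors="replace") as f:
        clean, rejected = split_lines(f)
    with open(dst, "w", encoding="ascii") as f:
        f.writelines(text + "\n" for text in clean)
    return len(rejected)
```

Calling `clean_log` with the original file and a new output path, and then pointing `log_dir` at the folder with the cleaned copy, should keep such entries away from the operator. The return value is worth checking against the total line count: if it is far above the roughly 0.01% mentioned above, the filter is probably too strict for your logs.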