Special characters encoding in Panopticon Dataset

Michal Gaida
edited August 2021 in Community Q&A

Hello,

I've pulled some JSON data into Panopticon. In the desktop version the characters are encoded properly. However, after publishing to the Visualisation Server the encoding is broken:

[screenshot of the broken characters on the server]

Every Polish character is replaced by a placeholder character. Is there a way to change the character encoding for a JSON dataset on the server?

Best Answer

  • Adam Marchewka
    edited August 2021 Answer ✓

    Hey!

    This needs a bit of explanation, but I have found a solution to this problem. The root cause was that the character encoding was declared incorrectly in the second or third step of the chain (source-of-data) -> (write-to-http) -> (display-in-browser). That produced encoding paths such as UTF-8 -> ISO-8859-1 -> ISO-8859-1 or UTF-8 -> UTF-8 -> ISO-8859-1, and either of those results in broken characters being displayed.
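
To make the broken paths concrete, here is a minimal sketch in Python. Python only stands in for the server and the browser here, and the sample string is borrowed from the question's calculated column; nothing in it is specific to Panopticon or Tomcat.

```python
# Sketch of how UTF-8 text breaks when one hop assumes ISO-8859-1.
text = "POZOSTAŁE"  # sample string containing a Polish character

# The data source emits UTF-8 bytes; "Ł" becomes two bytes.
utf8_bytes = text.encode("utf-8")

# Broken case A: a later hop decodes those bytes as ISO-8859-1,
# so the two bytes of "Ł" turn into two unrelated characters.
misread = utf8_bytes.decode("iso-8859-1")

# Broken case B: a later hop re-encodes the text as ISO-8859-1,
# which has no "Ł", so a placeholder is substituted.
replaced = text.encode("iso-8859-1", errors="replace").decode("iso-8859-1")

# Consistent case: UTF-8 at every hop round-trips cleanly.
round_trip = utf8_bytes.decode("utf-8")

print(repr(misread))
print(repr(replaced))
print(repr(round_trip))
```

Case B is the one that matches the symptom in the question, where each Polish character is swapped for a placeholder.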

    To resolve the issue, we need to force the whole setup to use only UTF-8 (or whichever encoding we choose):

    • in .\Apache Software Foundation\Tomcat 9.0\conf\web.xml set fileEncoding to UTF-8
    • add the filter setCharacterEncodingFilter to force UTF-8 (if it is not present by default!), listed at the end of this reply
    • in .\Apache Software Foundation\Tomcat 9.0\conf\server.xml modify every Connector in use to enforce UTF-8, like so: <Connector [...] URIEncoding="UTF-8" />
    • in .\Apache Software Foundation\Tomcat 9.0\conf\Catalina\localhost\panopticon.xml enforce UTF-8 if that has not been done already (<?xml version="1.0" encoding="UTF-8"?>)
    • in .\Apache Software Foundation\Tomcat 9.0\webapps\panopticon enforce UTF-8 in the files index.html, workbook\index.html and WEB-INF\index.html
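
For the HTML files in the last bullet, a quick way to see which charset a page declares is sketched below. It assumes the declaration lives in a <meta> tag, which I have not confirmed for the Panopticon pages, and declared_charset is a hypothetical helper name, not part of any tool mentioned here.

```python
import re

# Matches both <meta charset="..."> and the older
# <meta http-equiv="Content-Type" content="text/html; charset=..."> form.
META_CHARSET = re.compile(
    r"<meta[^>]+charset\s*=\s*[\"']?([\w-]+)", re.IGNORECASE
)


def declared_charset(html):
    """Return the charset declared in a <meta> tag, upper-cased, or None."""
    match = META_CHARSET.search(html)
    return match.group(1).upper() if match else None
```

Reading each of the three files as text and passing the contents to declared_charset should show whether any of them still declares something other than UTF-8, or nothing at all.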

    Also, depending on how our Tomcat instance is run (as a service or not):

    • if it is NOT running as a service, create a script setenv.bat in
      .\Apache Software Foundation\Tomcat 9.0\bin containing the following line:
      set JAVA_OPTS=%JAVA_OPTS% -Dfile.encoding=UTF-8
    • if it IS running as a service, use
      .\Apache Software Foundation\Tomcat 9.0\bin\Tomcat9w.exe to add the Java option
      -Dfile.encoding=UTF-8

    After all of that, we need to restart our Tomcat instance/service.

     

    We were successful in resolving this encoding issue on Windows Server 2016, with Tomcat running as a service. As a side note, some of these changes may be redundant. I suspect that in most cases just adding the Java option (either to the service or in setenv.bat) would be sufficient.

    Cheers!

     

    Filter listing:

    <filter>
        <filter-name>setCharacterEncodingFilter</filter-name>
        <filter-class>org.apache.catalina.filters.SetCharacterEncodingFilter</filter-class>
        <init-param>
            <param-name>encoding</param-name>
            <param-value>UTF-8</param-value>
        </init-param>
        <async-supported>true</async-supported>
    </filter>

Answers

  • Theodor Stenevang Klemming_21338
    edited August 2021

    Hi Michal

     

    Is the JSON data loading from a local JSON-file or from a URL? The problem is caused by a character encoding error. If you can share some or all of the data, it will help us understand what the solution is.

  • Michal Gaida
    edited August 2021

    Hi!

    It is loaded from a URL, and the tool's API enforces UTF-8 in the headers:

    • Content-Type: application/json;charset=UTF-8
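
For anyone who wants to check what charset a given Content-Type value actually declares, here is a small sketch using only the Python standard library. The function name charset_from_content_type is made up for illustration.

```python
from email.message import Message


def charset_from_content_type(header_value):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns the charset in lower case,
    # or None when the header carries no charset parameter.
    return msg.get_content_charset()


print(charset_from_content_type("application/json;charset=UTF-8"))
```

A result of None would mean the consumer has to fall back to its own default, which is one place where an ISO-8859-1 assumption can creep in.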

    Unfortunately this is a CRM system with sensitive data, so sharing the request would expose some critical information.

    I'll try to reproduce this with some other, similar source that contains Polish characters, and if I get the same results I'll post the connection info / data dump.

    EDIT: I should mention that the column in the screenshot is actually a calculated column, with some string values placed inside an "IF" statement that goes like this:

     

    IF([CDAT Tag Count] =1,"CDAT",
    IF([EBOOK Tag Count]=1,"EBOOK",
    IF([NEXT Tag Count] =1,"NEXT",
    "POZOSTAŁE"
    )
    )
    )

    So, as you can see, this is based on a numeric counter column, and the "POZOSTAŁE" string contains the special character. The string is not part of the JSON dataset; it is calculated, yet it still gives the erroneous output (and works fine in the Desktop Designer version).

  • Theodor Stenevang Klemming_21338
    edited August 2021

    Hi

    I suspect that the problem is introduced when publishing from Desktop Designer to the Visualization server. I am not able to reproduce this encoding error when I create a calculated text column from scratch in the data table editor of the visualization server. 

    Since Desktop Designer has reached end of life and all maintenance and support will have ended by December 31, 2021, a general recommendation is to always work directly in the web interface of the server.

    Best regards, Theo

  • Adam Marchewka
    edited August 2021

    I forgot to mention that the filter must also be mapped as follows (if the mapping is missing; the filter is built in, so it should be there):

    <filter-mapping>
        <filter-name>setCharacterEncodingFilter</filter-name>
        <url-pattern>/*</url-pattern>
    </filter-mapping>
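
To check whether a given web.xml already has both the filter and its mapping active, something like the following sketch could be used. It is only an illustration: encoding_filter_status and local are hypothetical helper names, and the check looks only at top-level elements of the file. Commented-out entries are ignored by the parser, so they count as missing.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip an XML namespace prefix such as '{http://...}filter'."""
    return tag.rsplit("}", 1)[-1]


def encoding_filter_status(web_xml_text, name="setCharacterEncodingFilter"):
    """Return (filter_defined, filter_mapped) for the given filter name."""
    root = ET.fromstring(web_xml_text)
    defined = mapped = False
    for element in root:
        kind = local(element.tag)
        if kind not in ("filter", "filter-mapping"):
            continue
        # Collect the <filter-name> values inside this element.
        names = [
            child.text.strip()
            for child in element
            if local(child.tag) == "filter-name" and child.text
        ]
        if name in names:
            if kind == "filter":
                defined = True
            else:
                mapped = True
    return defined, mapped
```

Passing in the contents of conf\web.xml should give (True, True) once both entries are in place; (True, False) would point to the missing mapping described above.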