Querying DB Logs Using GDI in Graph Studio

Kedar

Introduction

This article provides a developer-friendly view of how to:

Access and interpret GDI logs
Use the Query Context tab for safe query debugging
Execute Elasticsearch queries against indexed unstructured data
Tune performance using concurrency, limits, and highlights

_____________________________________________________________________________________

DB Log Querying (CLI + Query Builder)

Log Location and Cron Job Cleanup

GDI logs can be accessed only from the command line, because a cron job cleans up pods older than 8 hours to reduce costs. List the resources in the namespace to find the AnzoGraph pod, then open a shell inside it:

kubectl get all -n <namespace>

kubectl -n <namespace> exec -it <anzograph-pod> -- bash

Inside the pod, change to /opt/ and list its contents:

cd /opt/

ls -al

From /opt/, navigate to lib/jar/ or look for the log configuration files.
Find log.config and make sure @udx =true is set when you need to debug user-defined extensions.
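
If you run these steps often, it can help to generate the command lines from one place. The following is a minimal sketch, not part of any Altair tooling: the function name gdi_log_commands and the example namespace and pod names are hypothetical. The first two commands are the ones shown above; the third, kubectl logs with --tail, is an addition, and whether GDI messages appear in the container output at all is an assumption.

```python
import shlex


def gdi_log_commands(namespace: str, pod: str, tail: int = 200) -> list:
    """Return shell-safe kubectl command lines for inspecting one pod."""
    commands = [
        # From the article: list everything in the namespace to find the pod.
        ["kubectl", "get", "all", "-n", namespace],
        # From the article: open an interactive shell inside the pod.
        ["kubectl", "-n", namespace, "exec", "-it", pod, "--", "bash"],
        # Not in the article: print the last `tail` lines of container output.
        ["kubectl", "-n", namespace, "logs", pod, f"--tail={tail}"],
    ]
    return [shlex.join(argv) for argv in commands]


if __name__ == "__main__":
    # Placeholder names; substitute your own namespace and pod.
    for line in gdi_log_commands("my-namespace", "anzograph-0"):
        print(line)
```

Because shlex.join quotes each argument, a namespace or pod name containing unusual characters cannot break the command line when it is pasted into a shell.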

Query Context Tab & Secure Variables

The Query Context tab in Graph Studio helps test database connections securely.

URL hashes and sensitive parameters are obfuscated (s:url uses hashed identifiers).
XML Schema-based types such as xsd:string and xsd:dateTime are returned from upstream sources and interpreted natively by GDI.

When you create a Graph Mart, Graph Studio automatically encrypts the database credentials used for querying, so downstream pipelines can reuse the connection securely and repeatably.

GDI Parameterization & Limits

GDI lets you control query behavior using parameters like:

s:limit 5000

Use this to:

Limit query results (default behavior fetches everything)
Prevent downstream overloads

A limit is also useful when partitioning tables with more than 1 million rows, or when managing concurrency.

When working with multilingual datasets or unstructured text, you can also specify the language with:

s:locale "en"
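
When these parameters are set from a script, a small helper can keep the literal formatting consistent. This is a minimal sketch under stated assumptions: render_gdi_params is a hypothetical name, only the s:limit and s:locale parameters come from this article, and where the rendered lines belong inside a full GDI query is not shown here. The trailing semicolon is standard SPARQL predicate-list syntax.

```python
def render_gdi_params(params: dict) -> str:
    """Render {"limit": 5000} style settings as `s:name value ;` lines."""
    lines = []
    for name, value in params.items():
        if isinstance(value, bool):
            # Check bool first, since bool is a subclass of int in Python.
            literal = "true" if value else "false"
        elif isinstance(value, (int, float)):
            literal = str(value)
        else:
            # Quote strings and escape backslashes and double quotes.
            escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
            literal = f'"{escaped}"'
        lines.append(f"  s:{name} {literal} ;")
    return "\n".join(lines)


print(render_gdi_params({"limit": 5000, "locale": "en"}))
```

Numbers are emitted bare and strings are quoted, matching the two forms used above (s:limit 5000 and s:locale "en").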

________________________________________________________________________________________

Troubleshooting & Log Analysis

Scenario	What to Look For in Logs
Connection fails	Credential issues, upstream DB down
Query times out	Partitioning needed, row count too large
Ingestion too slow	Concurrency limits too low
Highlighting not working	Highlight fields not set or not indexed
Log data missing in UI	Pod cleaned up (check via CLI)
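
A first pass over a saved log file can be scripted along the lines of the table above. This is a rough sketch only: the keyword patterns are guesses chosen for illustration, not actual GDI log messages, so adjust them to what your logs really contain. The function name triage is hypothetical.

```python
import re

# Hypothetical keyword patterns per scenario; real GDI log wording may differ.
SCENARIO_PATTERNS = {
    "Connection fails": re.compile(
        r"authentication|credential|connection refused", re.IGNORECASE
    ),
    "Query times out": re.compile(r"timed? ?out", re.IGNORECASE),
    "Ingestion too slow": re.compile(r"concurrency", re.IGNORECASE),
    "Highlighting not working": re.compile(r"highlight", re.IGNORECASE),
}


def triage(log_lines):
    """Group log lines under every scenario whose pattern they match."""
    hits = {name: [] for name in SCENARIO_PATTERNS}
    for line in log_lines:
        for name, pattern in SCENARIO_PATTERNS.items():
            if pattern.search(line):
                hits[name].append(line.rstrip("\n"))
    # Report only the scenarios that matched at least one line.
    return {name: lines for name, lines in hits.items() if lines}
```

The function accepts any iterable of lines, so an open file object can be passed directly. The "Log data missing in UI" row is left out because a cleaned-up pod leaves no log lines to match.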

______________________________________________________________________________________

Best Practices

Task	Recommended Action
Querying large datasets	Use `s:limit`, partitioning
Preventing overload	Tune concurrency by CPU/node
Highlighting results	Enable `es:highlight` and use `fullText`
Checking logs	Open a shell in the pod with kubectl exec and tail logs
Testing	Use mock APIs or public HTTP endpoints

Performance Quick Tips

Use s:limit to reduce data transfer volume.
Avoid partitioning for small tables (<1M rows).
Monitor concurrency with pod CPU usage.
Use logs to tune highlighting and paging settings.
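
The row-count rule of thumb from these tips can be written down as a small decision helper. This is a sketch under stated assumptions: QueryPlan and plan_query are hypothetical names, the 1,000,000-row threshold and the 5000 limit are the figures used in this article, and applying the limit only to exploratory runs is an illustrative choice rather than a GDI requirement.

```python
from dataclasses import dataclass
from typing import Optional

# From the article: partitioning is only worthwhile above roughly 1M rows.
PARTITION_THRESHOLD_ROWS = 1_000_000


@dataclass(frozen=True)
class QueryPlan:
    partition: bool        # whether to partition the source table
    limit: Optional[int]   # value for s:limit, or None to fetch everything


def plan_query(
    row_count: int, exploratory: bool = False, sample_limit: int = 5000
) -> QueryPlan:
    """Apply the article's rules of thumb to one table."""
    return QueryPlan(
        partition=row_count > PARTITION_THRESHOLD_ROWS,
        limit=sample_limit if exploratory else None,
    )
```

For example, plan_query(2_000_000, exploratory=True) suggests partitioning with a limit of 5000, while plan_query(500_000) suggests neither.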
