Introduction
This article provides a developer-friendly view of how to:
- Access and interpret GDI logs
- Use the Query Context tab for safe query debugging
- Execute Elasticsearch queries against indexed unstructured data
- Tune performance using concurrency, limits, and highlights
_____________________________________________________________________________________
DB Log Querying (CLI + Query Builder)
Log Location and Cron Job Cleanup
GDI logs can be accessed only via the command line, because a cron job cleans up pods older than 8 hours to reduce costs. To find the AnzoGraph pod and open a shell in it:
kubectl get all -n <namespace>
kubectl -n <namespace> exec -it <anzograph-pod> -- bash
cd /opt/
ls -al
- Navigate to lib/jar/ or check for log configuration files.
- Look for log.config and ensure @udx =true is enabled for debugging user-defined extensions.
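Because pods are removed after 8 hours, it can help to check how close a pod is to that cutoff before starting a debugging session. The following is a minimal Python sketch, assuming the pod list comes from `kubectl get pods -n <namespace> -o json`. The function name `pods_near_cleanup`, the one-hour warning margin, and the sample pod names are hypothetical and not part of GDI or Graph Studio.

```python
import json
from datetime import datetime, timedelta, timezone

CLEANUP_AGE = timedelta(hours=8)  # cleanup threshold stated in this article


def pods_near_cleanup(pod_list_json, now, margin=timedelta(hours=1)):
    """Return (name, age) for pods within `margin` of the cleanup age, or past it."""
    result = []
    for item in json.loads(pod_list_json)["items"]:
        meta = item["metadata"]
        # Kubernetes reports creationTimestamp in UTC, e.g. 2024-01-01T00:00:00Z
        created = datetime.strptime(
            meta["creationTimestamp"], "%Y-%m-%dT%H:%M:%SZ"
        ).replace(tzinfo=timezone.utc)
        age = now - created
        if age >= CLEANUP_AGE - margin:
            result.append((meta["name"], age))
    return result


# Hypothetical sample standing in for `kubectl get pods -o json` output.
sample = json.dumps({
    "items": [
        {"metadata": {"name": "anzograph-0",
                      "creationTimestamp": "2024-01-01T00:00:00Z"}},
        {"metadata": {"name": "anzograph-1",
                      "creationTimestamp": "2024-01-01T06:30:00Z"}},
    ]
})
now = datetime(2024, 1, 1, 7, 30, tzinfo=timezone.utc)
for name, age in pods_near_cleanup(sample, now):
    print(f"{name}: {age} old, copy its logs before cleanup")
```

In practice `now` would be `datetime.now(timezone.utc)`; it is passed in explicitly here so the check stays deterministic.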
Query Context Tab & Secure Variables
The Query Context tab in Graph Studio helps test database connections securely.
- URL hashes or sensitive params are obfuscated (s.url uses hashed identifiers).
- XML Schema–based types like xsd:string, xsd:dateTime are returned from upstream sources and interpreted natively by GDI.
When creating a Graph Mart, Graph Studio automatically encrypts the database credentials used for querying, enabling secure, repeatable connections in downstream pipelines.
GDI Parameterization & Limits
GDI lets you control query behavior using parameters like:
s:limit 5000
Use this to:
- Limit query results (default behavior fetches everything)
- Prevent downstream overloads
A limit is also useful when partitioning tables with more than 1 million rows, or when managing concurrency.
When working with multilingual datasets or unstructured text, you can also specify the language with:
s:locale "en"
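The two parameters in this section can be kept in one place and validated before they are pasted into a query. The Python sketch below only renders the `s:limit` and `s:locale` lines shown above. The helper name `gdi_options` and the `;` line joining are assumptions for illustration, and the result is a fragment rather than a complete GDI query.

```python
def gdi_options(limit=None, locale=None):
    """Render s:limit / s:locale lines for pasting into a GDI query pattern."""
    lines = []
    if limit is not None:
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(limit, bool) or not isinstance(limit, int) or limit <= 0:
            raise ValueError("limit must be a positive integer")
        lines.append(f"s:limit {limit}")
    if locale is not None:
        if not locale or '"' in locale:
            raise ValueError("locale must be a non-empty string without quotes")
        lines.append(f's:locale "{locale}"')
    return " ;\n".join(lines)


print(gdi_options(limit=5000, locale="en"))
```

Rejecting a zero or negative limit early avoids sending a query that silently falls back to fetching everything.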
________________________________________________________________________________________
Troubleshooting & Log Analysis
Scenario | What to Look For in Logs |
---|---|
Connection fails | Credential issues, upstream DB down |
Query times out | Partitioning needed, row count too large |
Ingestion too slow | Concurrency limits too low |
Highlighting not working | Highlight fields not set or not indexed |
Log data missing in UI | Pod cleaned up (check via CLI) |
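When a log file is long, a first pass can count how many lines point at each scenario in the table. The Python sketch below shows one way to do that. The keyword patterns and sample log lines are hypothetical, since real GDI log messages may be worded differently, so treat it as a starting point to adapt rather than a parser for the actual log format.

```python
import re

# Hypothetical keyword patterns for two scenarios from the table above.
SCENARIOS = [
    ("Connection fails",
     re.compile(r"authentication failed|connection refused", re.I)),
    ("Query times out", re.compile(r"timed? ?out", re.I)),
]


def triage(log_lines):
    """Count log lines per scenario so the most frequent one is checked first."""
    counts = {name: 0 for name, _ in SCENARIOS}
    for line in log_lines:
        for name, pattern in SCENARIOS:
            if pattern.search(line):
                counts[name] += 1
                break  # attribute each line to the first matching scenario
    return counts


# Hypothetical log lines, for illustration only.
sample_lines = [
    "ERROR Connection refused by upstream database",
    "WARN query timed out after 300s",
    "INFO partition 3 finished",
    "ERROR read timeout on partition 7",
]
print(triage(sample_lines))
```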
______________________________________________________________________________________
Best Practices
Task | Recommended Action |
---|---|
Querying large datasets | Use s:limit and partitioning |
Preventing overload | Tune concurrency by CPU/node |
Highlighting results | Enable es:highlight and use fullText |
Checking logs | Open a shell in the pod with kubectl exec and tail logs |
Testing | Use mock APIs or public HTTP endpoints |
Performance Quick Tips
- Use s:limit to reduce data transfer volume.
- Avoid partitioning for small tables (<1M rows).
- Monitor concurrency with pod CPU usage.
- Use logs to tune highlighting and paging settings.
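The partitioning tip comes down to simple arithmetic: read small tables in one pass, and split large ones into roughly equal ranges. The Python sketch below illustrates that arithmetic only. The function name `plan_partitions` and the offset-based ranges are assumptions; it does not describe how GDI splits work internally.

```python
PARTITION_THRESHOLD = 1_000_000  # below this, read the table in one pass


def plan_partitions(row_count, partitions):
    """Split row_count rows into contiguous (offset, size) ranges."""
    if row_count < 0 or partitions < 1:
        raise ValueError("row_count must be >= 0 and partitions >= 1")
    if row_count < PARTITION_THRESHOLD:
        return [(0, row_count)]  # small table: partitioning adds overhead only
    base, extra = divmod(row_count, partitions)
    ranges, offset = [], 0
    for i in range(partitions):
        # spread the remainder over the first `extra` partitions
        size = base + (1 if i < extra else 0)
        ranges.append((offset, size))
        offset += size
    return ranges


print(plan_partitions(50_000, 4))
print(plan_partitions(2_500_000, 4))
```

The number of partitions would normally follow the concurrency the pods can sustain, which is why the tips pair partitioning with CPU monitoring.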
_______________________________________________________________________________________
Further Reading