{"id":962,"date":"2024-10-14T17:06:04","date_gmt":"2024-10-15T00:06:04","guid":{"rendered":"http:\/\/184.72.63.26\/?p=962"},"modified":"2024-12-09T22:04:54","modified_gmt":"2024-12-10T05:04:54","slug":"dynatrace-grail-a-revolutionary-data-platform-for-observability","status":"publish","type":"post","link":"https:\/\/www.wallacel.com\/index.php\/2024\/10\/14\/dynatrace-grail-a-revolutionary-data-platform-for-observability\/","title":{"rendered":"Dynatrace Grail: A Revolutionary Data Platform for Observability"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">In my previous blog posts, I explored how to enhance observability in an AWS environment by ingesting logs and metrics into Dynatrace. Now I will dive deeper into one of Dynatrace\u2019s most revolutionary features: <strong>Dynatrace Grail<\/strong>. This cutting-edge observability data lakehouse not only unifies logs, metrics, and traces into a single platform but also leverages <strong>AI-driven analytics<\/strong> to deliver real-time insights and automated root cause analysis. In this post, I\u2019ll go beyond basic integrations and focus on the unique value Grail brings to observability, especially when compared to AWS CloudWatch and OpenSearch. I\u2019ll explore how Grail\u2019s <strong>Dynatrace Query Language (DQL)<\/strong>, <strong>real-time ingestion<\/strong>, and <strong>AI-powered automation<\/strong> can dramatically improve operational efficiency and reduce the time to resolve incidents.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>What is Dynatrace Grail?<\/strong><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Dynatrace Grail is the backbone of Dynatrace\u2019s observability platform, designed to unify all your observability data\u2014logs, metrics, traces, events, and security information\u2014into a single, queryable data lakehouse. Unlike traditional log management tools that handle only logs, Grail integrates <strong>all observability data types<\/strong> into one system. 
This allows for <strong>real-time ingestion<\/strong>, <strong>instant querying<\/strong>, and <strong>AI-powered root cause analysis<\/strong>, all from a single platform.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"626\" height=\"393\" src=\"http:\/\/184.72.63.26\/wp-content\/uploads\/2024\/10\/dt-grail.png\" alt=\"\" class=\"wp-image-996\" srcset=\"https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/dt-grail.png 626w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/dt-grail-300x188.png 300w\" sizes=\"auto, (max-width: 626px) 100vw, 626px\" \/><figcaption class=\"wp-element-caption\"><em>Image source: https:\/\/dynatrace.com\/news\/blog\/new-approach-to-software-intelligence<\/em><\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Key Features of Dynatrace Grail:<\/strong><\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Real-Time Ingestion:<\/strong> Seamlessly ingests logs, metrics, and traces from multiple sources in real time.<\/li>\n\n\n\n<li><strong>AI-Powered Insights:<\/strong> Uses <strong>Davis AI<\/strong> for automated anomaly detection and root cause analysis.<\/li>\n\n\n\n<li><strong>Querying and Analysis:<\/strong> Leverages the <strong>Dynatrace Query Language<\/strong> (DQL) to query across logs, metrics, and traces.<\/li>\n\n\n\n<li><strong>Scalable Log Analytics:<\/strong> Designed for dynamic, cloud-native environments, offering real-time scalability without manual reconfiguration.<\/li>\n<\/ol>\n\n\n\n<p class=\"wp-block-paragraph\">Let\u2019s break down these features to have a deeper understanding of Dynatrace Grail with a comparison of similar services from AWS.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>1. 
Real-Time Ingestion and Unified Observability<\/strong><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>AWS CloudWatch:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>CloudWatch ingests logs and metrics from AWS services, applications, and on-prem systems, but it splits metrics and logs into separate services (CloudWatch Logs and CloudWatch Metrics).<\/li>\n\n\n\n<li>While AWS CloudWatch is powerful for monitoring individual AWS resources, correlating logs and metrics often requires multiple tools and a fair amount of manual effort. For instance, you need to link CloudWatch with <strong>X-Ray<\/strong> for tracing, or with <strong>AWS Lambda<\/strong> for serverless monitoring.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Dynatrace Grail:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Grail ingests <strong>logs, metrics, and traces<\/strong> in real time and stores them in a unified data lake.<\/li>\n\n\n\n<li>With <strong>automatic correlation<\/strong> between these data types, Grail eliminates the need for manual integration between different monitoring services. 
For example, a spike in CPU usage is automatically linked to related logs and traces, giving you an immediate holistic view of the issue.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">To see this in action, let&#8217;s use AWS CDK to deploy Dynatrace OneAgent to an AWS EC2 instance and collect its host metrics:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"typescript\" class=\"language-typescript\">import * as cdk from 'aws-cdk-lib';\nimport { StackProps } from 'aws-cdk-lib';\nimport { Vpc, Instance, InstanceType, AmazonLinuxImage, AmazonLinuxGeneration, SecurityGroup, Subnet } from 'aws-cdk-lib\/aws-ec2';\nimport { Role } from 'aws-cdk-lib\/aws-iam';\nimport { Construct } from 'constructs';\n\nexport class DynatraceOneAgentAndDashboardStack extends cdk.Stack {\n  constructor(scope: Construct, id: string, props?: StackProps) {\n    super(scope, id, props);\n\n    const region = \"us-west-2\"; \/\/ Replace with your region\n\n    \/\/ Import the existing VPC\n    \/\/ Note: Vpc.fromLookup requires the stack to be created with an explicit env (account and region)\n    const vpc = Vpc.fromLookup(this, 'vpc', {\n      vpcId: 'vpc-xxxxxxxxxxxxxxxxx', \/\/ Replace with your VPC ID\n    });\n\n    \/\/ Lookup the existing security group by ID\n    const securityGroup = SecurityGroup.fromSecurityGroupId(this, 'MySG', \n    'sg-xxxxxxxxxxxxxxxxx'); \/\/ Replace with your security group ID\n\n    \/\/ Use an existing IAM role for the EC2 instance.\n    \/\/ fromRoleArn expects a role ARN (...:role\/...), not an instance profile ARN.\n    \/\/ The role must allow secretsmanager:GetSecretValue on the Dynatrace secret.\n    const role = Role.fromRoleArn(this, 'InstanceRole', 'arn:aws:iam::xxxxxxxxxxxx:role\/YourRoleName'); \/\/ Replace with your IAM role ARN\n\n    \/\/ Define the EC2 instance with Amazon Linux\n    const instance = new Instance(this, 'OneAgentInstance', {\n      vpc,\n      instanceType: new InstanceType('t2.micro'),\n      machineImage: new AmazonLinuxImage({\n        generation: AmazonLinuxGeneration.AMAZON_LINUX_2,\n   
   }),\n      securityGroup: securityGroup,\n      role: role,\n      vpcSubnets: {\n        subnets: [\n          Subnet.fromSubnetAttributes(this, 'MySubnet', {\n            subnetId: 'subnet-xxxxxxxxxxxxxxxxx', \/\/ Replace with your subnet ID\n            availabilityZone: 'us-west-2a',\n          }),\n        ],\n      },\n    });\n\n    \/\/ User Data script for installing Dynatrace OneAgent\n    const userDataScript = `\n    #!\/bin\/bash\n    yum update -y\n    yum install -y wget aws-cli\n    mkdir -p \/opt\/dynatrace\n    cd \/opt\/dynatrace\n  \n    # Retrieve Dynatrace download token from Secrets Manager and extract it using sed\n    DT_TOKEN=\\$(aws secretsmanager get-secret-value --secret-id dynatrace-secret --query SecretString --output text --region ${region} | sed 's\/.*\"token\":\"\\\\([^\"]*\\\\)\".*\/\\\\1\/')\n  \n    # Download the latest OneAgent installer and run it\n    wget -O Dynatrace-OneAgent-Linux.sh \"https:\/\/your-dynatrace-url.com\/api\/v1\/deployment\/installer\/agent\/unix\/default\/latest?arch=x86\" --header=\"Authorization: Api-Token \\$DT_TOKEN\"\n    sudo \/bin\/bash Dynatrace-OneAgent-Linux.sh --set-monitoring-mode=fullstack --set-app-log-content-access=true --set-host-group=my-host-group --set-host-tag=environment:prod\n    `;\n\n    instance.addUserData(userDataScript);\n\n    new cdk.CfnOutput(this, 'InstanceId', {\n      value: instance.instanceId,\n      description: 'The ID of the EC2 instance running Dynatrace OneAgent',\n    });\n  }\n}\n<\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"426\" src=\"http:\/\/184.72.63.26\/wp-content\/uploads\/2024\/10\/image-1024x426.png\" alt=\"\" class=\"wp-image-963\" srcset=\"https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/image-1024x426.png 1024w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/image-300x125.png 300w, 
https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/image-768x319.png 768w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/image.png 1183w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Once OneAgent is deployed, it automatically discovers and collects the relevant monitoring data for your host, such as CPU usage, network health, processes, and services.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"585\" src=\"http:\/\/184.72.63.26\/wp-content\/uploads\/2024\/10\/awsclassic-1024x585.png\" alt=\"\" class=\"wp-image-965\" srcset=\"https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/awsclassic-1024x585.png 1024w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/awsclassic-300x171.png 300w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/awsclassic-768x439.png 768w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/awsclassic.png 1073w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Navigate to <strong>Smartscape Topology<\/strong> in the Dynatrace UI to view your AWS host, its processes, and their dependencies in real time, without manual configuration.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"772\" src=\"http:\/\/184.72.63.26\/wp-content\/uploads\/2024\/10\/image-2-1024x772.png\" alt=\"\" class=\"wp-image-1032\" srcset=\"https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/image-2-1024x772.png 1024w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/image-2-300x226.png 300w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/image-2-768x579.png 768w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/image-2.png 1138w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<h3 
class=\"wp-block-heading\"><strong>2. AI-Powered Root Cause Analysis with Davis AI<\/strong><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>AWS CloudWatch:<\/strong> <\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>In AWS, you can use <strong>CloudWatch alarms<\/strong> and basic anomaly detection to alert you when metrics deviate from thresholds. However, identifying the root cause still involves manually piecing together data from <strong>CloudWatch Logs<\/strong>, <strong>X-Ray<\/strong>, and other AWS services.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Dynatrace Grail:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Dynatrace\u2019s <strong>Davis AI<\/strong> goes beyond basic alerting by automatically correlating data across logs, metrics, and traces to identify the <strong>root cause<\/strong> of an issue. This is a massive time-saver, especially in complex environments where problems often span multiple services.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">To simulate a performance issue in my application, I will run a simple Python script (<strong>cpu.py<\/strong>) that drives CPU usage to roughly 90% on an EC2 instance:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"python\" class=\"language-python\">import multiprocessing\nimport time\n\ndef cpu_load():\n    while True:\n        # Busy-wait for 0.9 seconds (90% CPU usage)\n        end = time.time() + 0.9\n        while time.time() &lt; end:\n            pass\n        # Sleep for 0.1 seconds (10% idle)\n        time.sleep(0.1)\n\nif __name__ == \"__main__\":\n    # Run the load on multiple CPU cores (adjust as necessary)\n    for i in range(multiprocessing.cpu_count()):\n        process = multiprocessing.Process(target=cpu_load)\n        process.start()\n<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Open Dynatrace Hosts Classic, and after a while Davis AI will automatically detect the anomaly. 
It provides a <strong>root cause analysis<\/strong>, showing the metric and host that contributed to the issue.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"597\" src=\"http:\/\/184.72.63.26\/wp-content\/uploads\/2024\/10\/davis-ai-1024x597.png\" alt=\"\" class=\"wp-image-978\" srcset=\"https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/davis-ai-1024x597.png 1024w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/davis-ai-300x175.png 300w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/davis-ai-768x448.png 768w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/davis-ai.png 1429w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">From the same page, we can easily see that the Python script &#8216;cpu.py&#8217; is causing the high CPU usage. Davis AI automates what would otherwise require hours of manual log parsing and metric correlation, drastically reducing your <strong>MTTR (Mean Time to Resolution)<\/strong>.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"583\" src=\"http:\/\/184.72.63.26\/wp-content\/uploads\/2024\/10\/cpu-issue-1024x583.png\" alt=\"\" class=\"wp-image-979\" srcset=\"https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/cpu-issue-1024x583.png 1024w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/cpu-issue-300x171.png 300w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/cpu-issue-768x437.png 768w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/cpu-issue.png 1480w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>3. 
Dynatrace Query Language (DQL): Unlocking the Power of Unified Data<\/strong><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>AWS CloudWatch:<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">When using AWS CloudWatch, you have <strong>CloudWatch Logs Insights<\/strong> for querying logs, <strong>CloudWatch Metrics<\/strong> for querying metrics, and <strong>X-Ray<\/strong> for traces. Each of these has its own interface and query syntax, making cross-data queries difficult.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Dynatrace Grail:<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Dynatrace introduces <strong>DQL (Dynatrace Query Language)<\/strong>, a single language to explore, query, and process all data persisted in Dynatrace Grail. DQL is a pipeline-based data processing language: you chain a series of commands, and the data is processed step by step. Each command returns an output containing a set of records, and that output becomes the input of the next command. 
You continue this process until you are satisfied with the analysis results.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"199\" src=\"http:\/\/184.72.63.26\/wp-content\/uploads\/2024\/10\/dql-1024x199.png\" alt=\"\" class=\"wp-image-981\" srcset=\"https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/dql-1024x199.png 1024w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/dql-300x58.png 300w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/dql-768x150.png 768w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/dql.png 1191w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption class=\"wp-element-caption\"><em>Image source: https:\/\/docs.dynatrace.com\/docs\/platform\/grail\/dynatrace-query-language\/dql-guide<\/em><\/figcaption><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Let&#8217;s write a simple DQL query to count the problem events that occurred over time for the aforementioned EC2 host. First, I go to Smartscape Topology and copy the host ID from the URL:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"794\" src=\"http:\/\/184.72.63.26\/wp-content\/uploads\/2024\/10\/hostid-1-1024x794.png\" alt=\"\" class=\"wp-image-983\" srcset=\"https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/hostid-1-1024x794.png 1024w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/hostid-1-300x232.png 300w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/hostid-1-768x595.png 768w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/hostid-1.png 1231w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Then open a Notebook and create a DQL query. This query counts the problem events that occurred on my EC2 host with ID <strong>HOST-480E4F2A782B08B6<\/strong>, grouped by 15-minute interval and status. 
With the visualization options, I can display the results as a bar chart.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"sql\" class=\"language-sql\">fetch events\n| filter event.kind == \"DAVIS_PROBLEM\" and in(affected_entity_ids, \"HOST-480E4F2A782B08B6\")\n| summarize count = count(), by: {bin(timestamp, 15min), event.status}<\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"559\" src=\"http:\/\/184.72.63.26\/wp-content\/uploads\/2024\/10\/notebook-1024x559.png\" alt=\"\" class=\"wp-image-984\" srcset=\"https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/notebook-1024x559.png 1024w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/notebook-300x164.png 300w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/notebook-768x419.png 768w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/notebook.png 1439w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>4. Scalable Log Analytics Without Indexes or Schemas<\/strong><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>AWS OpenSearch:<\/strong> <\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In a log management system like <strong>AWS OpenSearch<\/strong>, data ingestion requires upfront setup before you can query and analyze data efficiently. OpenSearch is based on Elasticsearch and uses <strong>indexing<\/strong> and <strong>schema definitions<\/strong> to store and organize data. While OpenSearch is a powerful, fully managed service, users must often manually configure <strong>index mappings<\/strong> and <strong>schemas<\/strong> to optimize search performance and ensure logs, metrics, and traces are stored in the correct format.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Indexing:<\/strong> OpenSearch requires building indexes to structure and store data. 
These indexes are created based on the schema you define, which must be updated as new data sources or formats are introduced.<\/li>\n\n\n\n<li><strong>Schema Definition:<\/strong> In OpenSearch, defining the correct schema upfront is important to ensure that fields are correctly indexed for efficient querying. If the schema isn\u2019t set properly, queries can become slow or return incomplete results.<\/li>\n\n\n\n<li><strong>Limitations:<\/strong> While AWS OpenSearch is fully managed, managing indexes and schemas still requires significant planning and ongoing maintenance. This can be cumbersome in dynamic, cloud-native environments where services scale frequently, and new data types constantly emerge.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Dynatrace Grail:<\/strong> <\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Dynatrace Grail eliminates the need to build <strong>indexes<\/strong> or define <strong>schemas<\/strong> manually. It offers <strong>schema-on-read<\/strong> functionality, allowing you to ingest logs, metrics, and traces and immediately query the data without the overhead of defining how that data should be stored. 
This makes Grail a more <strong>scalable<\/strong> and <strong>flexible<\/strong> solution, especially in environments where data structures frequently change.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"480\" src=\"http:\/\/184.72.63.26\/wp-content\/uploads\/2024\/10\/dynatrace-grail-1024x480.png\" alt=\"\" class=\"wp-image-977\" srcset=\"https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/dynatrace-grail-1024x480.png 1024w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/dynatrace-grail-300x141.png 300w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/dynatrace-grail-768x360.png 768w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/dynatrace-grail-1536x721.png 1536w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/dynatrace-grail.png 1592w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption class=\"wp-element-caption\"><em>Image source: https:\/\/www.dynatrace.com\/platform\/log-management-analytics<\/em><\/figcaption><\/figure>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>No Indexing or Schema Definition Required:<\/strong>\n<ul class=\"wp-block-list\">\n<li>Unlike OpenSearch, where users must predefine indexes and schemas, Grail dynamically understands and organizes your observability data (logs, metrics, and traces) as it ingests it. This makes it easier to handle new data types and instantly start querying without worrying about the underlying data structure.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Real-Time Data Ingestion:<\/strong>\n<ul class=\"wp-block-list\">\n<li>With Grail, data is <strong>immediately available<\/strong> for querying upon ingestion. 
You don\u2019t need to wait for indexing jobs to finish, which can be a bottleneck in OpenSearch when dealing with high data volumes or complex queries.<\/li>\n\n\n\n<li>Whether you&#8217;re ingesting logs from microservices or metrics from infrastructure, Grail ensures that you can access and query this data in <strong>real time<\/strong>, enabling faster troubleshooting and performance analysis.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Unified Data Ingestion:<\/strong>\n<ul class=\"wp-block-list\">\n<li>Grail unifies logs, metrics, and traces into one <strong>data lakehouse<\/strong>, allowing for cross-data querying without needing separate indexes for each data type. OpenSearch, on the other hand, requires separate index setups for each data type, with separate mapping and storage requirements for logs, metrics, and traces.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Seamless Scalability:<\/strong>\n<ul class=\"wp-block-list\">\n<li>Grail is built to scale automatically with your environment. As your data grows, the platform seamlessly adapts, providing <strong>fast query performance<\/strong> without manual reconfiguration.<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n\n\n\n<p class=\"wp-block-paragraph\">Let\u2019s use <strong>DQL<\/strong> to demonstrate how to query logs in <strong>JSON format<\/strong> and extract specific attributes, such as the user\u2019s question and the chatbot&#8217;s answer, without the need to build indexes or define schemas. In this example, I&#8217;ve already ingested the logs from my <strong>chatbot Lambda function<\/strong> into Dynatrace. 
These logs capture the interaction between users and the chatbot, specifically showing what users ask and how the chatbot responds.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"sql\" class=\"language-sql\">fetch logs\n| filter matchesValue(aws.log_group, \"\/aws\/lambda\/chatbot\")\n| sort timestamp desc<\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"577\" src=\"http:\/\/184.72.63.26\/wp-content\/uploads\/2024\/10\/log-all-1024x577-1.png\" alt=\"\" class=\"wp-image-1002\" srcset=\"https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/log-all-1024x577-1.png 1024w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/log-all-1024x577-1-300x169.png 300w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/log-all-1024x577-1-768x433.png 768w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"544\" src=\"http:\/\/184.72.63.26\/wp-content\/uploads\/2024\/10\/log-1024x544.png\" alt=\"\" class=\"wp-image-988\" style=\"width:883px;height:auto\" srcset=\"https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/log-1024x544.png 1024w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/log-300x159.png 300w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/log-768x408.png 768w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/log.png 1460w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Since the content is in JSON format, I can parse it with the pattern <code>\"json:chat\"<\/code> and extract only the attributes I am interested in, <strong>question<\/strong> and <strong>answer<\/strong>, by referencing them as <strong>chat[question]<\/strong> and <strong>chat[answer]<\/strong>. 
The <code>parse<\/code> command, using the <code>json<\/code> matcher, reads the structure of the log data at query time, so you don&#8217;t have to define a schema upfront.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code lang=\"sql\" class=\"language-sql\">fetch logs\n| filter matchesValue(aws.log_group, \"\/aws\/lambda\/chatbot\")\n| parse content, \"json:chat\"\n| fields timestamp, question=chat[question], answer=chat[answer]\n| filter isNotNull(question)\n| sort timestamp desc<\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"537\" src=\"http:\/\/184.72.63.26\/wp-content\/uploads\/2024\/10\/logsevents-1024x537.png\" alt=\"\" class=\"wp-image-993\" srcset=\"https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/logsevents-1024x537.png 1024w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/logsevents-300x157.png 300w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/logsevents-768x402.png 768w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/logsevents.png 1185w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"531\" src=\"http:\/\/184.72.63.26\/wp-content\/uploads\/2024\/10\/attrib2-1024x531.png\" alt=\"\" class=\"wp-image-994\" srcset=\"https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/attrib2-1024x531.png 1024w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/attrib2-300x156.png 300w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/attrib2-768x398.png 768w, https:\/\/www.wallacel.com\/wp-content\/uploads\/2024\/10\/attrib2.png 1203w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Final Thoughts<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Dynatrace Grail offers a powerful solution for modern cloud observability, enabling organizations to unlock deep insights 
from large volumes of unstructured log data. Its ability to ingest, store, and query logs in real time without predefined indexes or schemas, combined with AI-powered analytics, streamlines troubleshooting and enhances decision-making. By integrating seamlessly with existing cloud environments and providing actionable intelligence across applications, infrastructure, and user experiences, Grail helps teams reduce downtime, optimize performance, and gain full visibility into their systems. Its flexibility and performance make it a key tool for driving efficiency and innovation in dynamic, cloud-native environments.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Thank you for reading my blog and I hope you like it!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In my previous blog posts, I explored how to enhance observability in an AWS environment by ingesting logs and metrics into Dynatrace. Now I will dive deeper into one of Dynatrace\u2019s most revolutionary features: Dynatrace Grail. This cutting-edge observability data lakehouse not only unifies logs, metrics, and traces into a single platform but also leverages AI-driven analytics 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":1000,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1,70],"tags":[71,64,65,63,42,62],"class_list":["post-962","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-aws","category-dynatrace","tag-aws-cdk","tag-data-lakehouse","tag-davis-ai","tag-dql","tag-dynatrace","tag-grail"],"_links":{"self":[{"href":"https:\/\/www.wallacel.com\/index.php\/wp-json\/wp\/v2\/posts\/962","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.wallacel.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.wallacel.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.wallacel.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.wallacel.com\/index.php\/wp-json\/wp\/v2\/comments?post=962"}],"version-history":[{"count":32,"href":"https:\/\/www.wallacel.com\/index.php\/wp-json\/wp\/v2\/posts\/962\/revisions"}],"predecessor-version":[{"id":1033,"href":"https:\/\/www.wallacel.com\/index.php\/wp-json\/wp\/v2\/posts\/962\/revisions\/1033"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.wallacel.com\/index.php\/wp-json\/wp\/v2\/media\/1000"}],"wp:attachment":[{"href":"https:\/\/www.wallacel.com\/index.php\/wp-json\/wp\/v2\/media?parent=962"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.wallacel.com\/index.php\/wp-json\/wp\/v2\/categories?post=962"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.wallacel.com\/index.php\/wp-json\/wp\/v2\/tags?post=962"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}