Splunk Log Driver #454

Closed
smugcloud opened this issue Jul 14, 2016 · 14 comments

@smugcloud

smugcloud commented Jul 14, 2016

Hi Guys,

First, thanks for including the Splunk log driver in ECS. I am running agent v1.11.0 with Docker v1.11.2 on ami-241bd844.

If I try to run the task definition below, I get an error. Is there something else I need to specify before the Splunk log driver can be used? I am able to run the same docker run command with no issues on the same host.

Error:

service example-rest-splunk-dev was unable to place a task because no container instance met all of its requirements. The closest matching container-instance 4792b206-19a3-47e6-822b-70ef075cf3d2 is missing an attribute required by your task. For more information, see the Troubleshooting section.

Task Definition:

{
    "taskDefinition": {
        "status": "ACTIVE",
        "family": "splunk_log_driver_test-dev",
        "requiresAttributes": [
            {
                "name": "com.amazonaws.ecs.capability.ecr-auth"
            },
            {
                "name": "com.amazonaws.ecs.capability.logging-driver.splunk"
            },
            {
                "name": "com.amazonaws.ecs.capability.docker-remote-api.1.19"
            }
        ],
        "volumes": [],
        "taskDefinitionArn": "arn:aws:ecs:us-west-2:539783510382:task-definition/splunk_log_driver_test-dev:4",
        "containerDefinitions": [
            {
                "environment": [
                    {
                        "name": "spring_profiles_active",
                        "value": "dev"
                    }
                ],
                "name": "example-rest",
                "mountPoints": [],
                "image": <MY_IMAGE>,
                "cpu": 128,
                "portMappings": [
                    {
                        "protocol": "tcp",
                        "containerPort": 80,
                        "hostPort": 80
                    }
                ],
                "logConfiguration": {
                    "logDriver": "splunk",
                    "options": {
                        "splunk-token": <MY_TOKEN>,
                        "splunk-url": <MY_URL>,
                        "splunk-index": <MY_INDEX>
                    }
                },
                "memory": 4096,
                "essential": true,
                "volumesFrom": []
            },
            {
                "environment": [],
                "name": "nginx",
                "mountPoints": [],
                "image": "nginx",
                "cpu": 128,
                "portMappings": [
                    {
                        "protocol": "tcp",
                        "containerPort": 80,
                        "hostPort": 8000
                    }
                ],
                "memory": 128,
                "essential": true,
                "volumesFrom": []
            }
        ],
        "revision": 4
    }
}

@aaithal
Contributor

aaithal commented Jul 14, 2016

@smugcloud can you please paste the output of aws ecs describe-container-instances --cluster <your-cluster-name> --container-instances <container-instance-arn>? It looks like you're missing an attribute on your container instance.

Thanks,
Anirudh

@smugcloud
Author

I was wondering how to get all that data @aaithal :D It does indeed look like Splunk is missing from here. Is this supposed to be a part of the default agent?

{
    "failures": [],
    "containerInstances": [
        {
            "status": "ACTIVE",
            "registeredResources": [
                {
                    "integerValue": 2048,
                    "longValue": 0,
                    "type": "INTEGER",
                    "name": "CPU",
                    "doubleValue": 0.0
                },
                {
                    "integerValue": 7986,
                    "longValue": 0,
                    "type": "INTEGER",
                    "name": "MEMORY",
                    "doubleValue": 0.0
                },
                {
                    "name": "PORTS",
                    "longValue": 0,
                    "doubleValue": 0.0,
                    "stringSetValue": [
                        "22",
                        "2376",
                        "2375",
                        "51678",
                        "51679"
                    ],
                    "type": "STRINGSET",
                    "integerValue": 0
                },
                {
                    "name": "PORTS_UDP",
                    "longValue": 0,
                    "doubleValue": 0.0,
                    "stringSetValue": [],
                    "type": "STRINGSET",
                    "integerValue": 0
                }
            ],
            "ec2InstanceId": "i-dbbf2306",
            "agentConnected": true,
            "containerInstanceArn": "arn:aws:ecs:us-west-2:539783510382:container-instance/1b52f3d8-e038-49cb-8cfe-023a1b5c4c0c",
            "pendingTasksCount": 0,
            "remainingResources": [
                {
                    "integerValue": 2048,
                    "longValue": 0,
                    "type": "INTEGER",
                    "name": "CPU",
                    "doubleValue": 0.0
                },
                {
                    "integerValue": 7986,
                    "longValue": 0,
                    "type": "INTEGER",
                    "name": "MEMORY",
                    "doubleValue": 0.0
                },
                {
                    "name": "PORTS",
                    "longValue": 0,
                    "doubleValue": 0.0,
                    "stringSetValue": [
                        "22",
                        "2376",
                        "2375",
                        "51678",
                        "51679"
                    ],
                    "type": "STRINGSET",
                    "integerValue": 0
                },
                {
                    "name": "PORTS_UDP",
                    "longValue": 0,
                    "doubleValue": 0.0,
                    "stringSetValue": [],
                    "type": "STRINGSET",
                    "integerValue": 0
                }
            ],
            "runningTasksCount": 0,
            "attributes": [
                {
                    "name": "com.amazonaws.ecs.capability.privileged-container"
                },
                {
                    "name": "com.amazonaws.ecs.capability.docker-remote-api.1.17"
                },
                {
                    "name": "com.amazonaws.ecs.capability.docker-remote-api.1.18"
                },
                {
                    "name": "com.amazonaws.ecs.capability.docker-remote-api.1.19"
                },
                {
                    "name": "com.amazonaws.ecs.capability.docker-remote-api.1.20"
                },
                {
                    "name": "com.amazonaws.ecs.capability.docker-remote-api.1.21"
                },
                {
                    "name": "com.amazonaws.ecs.capability.docker-remote-api.1.22"
                },
                {
                    "name": "com.amazonaws.ecs.capability.logging-driver.json-file"
                },
                {
                    "name": "com.amazonaws.ecs.capability.logging-driver.syslog"
                },
                {
                    "name": "com.amazonaws.ecs.capability.logging-driver.awslogs"
                },
                {
                    "name": "com.amazonaws.ecs.capability.ecr-auth"
                },
                {
                    "name": "com.amazonaws.ecs.capability.task-iam-role"
                }
            ],
            "versionInfo": {
                "agentVersion": "1.11.0",
                "agentHash": "c9aefeb",
                "dockerVersion": "DockerVersion: 1.11.2"
            }
        }
    ]
}
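
The comparison between the two JSON documents can also be done mechanically. Here is a minimal sketch, assuming the task definition and the describe-container-instances output have been saved to local files. The function name missing_attributes and the file names are hypothetical, not part of any ECS tooling. It matches attribute names as plain text rather than parsing JSON, so it also works on a task definition that still contains placeholders.

```shell
# Print every com.amazonaws.ecs.capability.* attribute that appears in the
# task definition file but not in the container instance file.
# Usage: missing_attributes TASKDEF_JSON INSTANCE_JSON
# (hypothetical helper; text matching only, not a JSON parser)
missing_attributes() {
    taskdef_json=$1
    instance_json=$2
    required=$(mktemp)
    available=$(mktemp)
    pattern='com\.amazonaws\.ecs\.capability\.[A-Za-z0-9._-]*'

    # Extract the attribute names, one per line, sorted and de-duplicated.
    grep -o "$pattern" "$taskdef_json" | sort -u > "$required"
    grep -o "$pattern" "$instance_json" | sort -u > "$available"

    # comm -23 keeps the lines that occur only in the first file.
    comm -23 "$required" "$available"

    rm -f "$required" "$available"
}
```

Run as missing_attributes taskdef.json instance.json against the two documents in this thread, it should report only com.amazonaws.ecs.capability.logging-driver.splunk.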

@aaithal
Contributor

aaithal commented Jul 14, 2016

Yeah, that does seem to be the issue. You'd have to override the ECS Agent config to add the splunk logging driver using the ECS_AVAILABLE_LOGGING_DRIVERS environment variable. More details on doing that can be found here.

For example, the following user-data registers the instance with splunk and awslogs logging drivers:

#!/bin/bash
echo ECS_AVAILABLE_LOGGING_DRIVERS='["splunk","awslogs"]' >> /etc/ecs/ecs.config
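
Because the one-liner appends with >>, running it a second time leaves two copies of the setting in the file. Here is a hedged sketch of a variant that is safe to re-run. set_logging_drivers is a hypothetical helper name, and the config path and driver list simply default to the values used in this thread.

```shell
# Append ECS_AVAILABLE_LOGGING_DRIVERS to an ECS agent config file, but only
# if the file does not already define it.
# Usage: set_logging_drivers [CONFIG_FILE] [DRIVERS_JSON]
# (hypothetical helper; defaults are the values quoted in this thread)
set_logging_drivers() {
    default_drivers='["splunk","awslogs"]'
    config_file=${1:-/etc/ecs/ecs.config}
    drivers=${2:-$default_drivers}

    # Leave an existing setting alone instead of adding a duplicate line.
    if grep -q '^ECS_AVAILABLE_LOGGING_DRIVERS=' "$config_file" 2>/dev/null; then
        echo "ECS_AVAILABLE_LOGGING_DRIVERS already set in $config_file" >&2
        return 0
    fi

    echo "ECS_AVAILABLE_LOGGING_DRIVERS=$drivers" >> "$config_file"
}
```

Called with no arguments as root, it writes the same line as the user-data above. It does not edit an existing setting, so a file that already lists other drivers still needs a manual change.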

@smugcloud
Author

Thanks @aaithal. For any AWS employees reading this, it would be wonderful if this were included in the recommended AMIs so we don't need to account for this as well.

@aaithal
Contributor

aaithal commented Jul 20, 2016

@smugcloud Is there an easy way to test the functionality of the splunk logging driver? If we can write a functional test for the splunk log driver (for example, the awslogs driver is being tested here), we can look at modifying ecs-init to register the agent with the splunk logging driver on startup, by default. Please let us know.

Thanks,
Anirudh

@MaerF0x0

@aaithal can we just add that ECS_AVAILABLE_LOGGING_DRIVERS line to /etc/ecs/ecs.config after the fact? We don't want to have to rebuild the cluster.

@smugcloud
Author

@aaithal That test could be modified to validate the existence of the Splunk log driver in the base config. It shouldn't need to be added out of band, if that functionality is provided by default. Just like fluentd, gelf, etc. are exposed as available options in the UI, Splunk should be too.

@MaerF0x0 yes, you can update the ecs.config file on a running cluster.

@aaithal
Contributor

aaithal commented Jul 21, 2016

@MaerF0x0 as per @smugcloud's comment, you can just update the /etc/ecs/ecs.config file, then restart ECS Agent by running sudo stop ecs && sudo start ecs on an existing container instance.
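
The stop/start pair above is the upstart form. Here is a small sketch for picking the restart command per host. Treating sudo systemctl restart ecs as the equivalent on systemd-based ECS-optimized AMIs is an assumption to verify on your own instances, and both function names are hypothetical.

```shell
# Print the command that restarts the ECS agent for a given init system.
# "upstart" is the command quoted in this thread; "systemd" assumes the
# agent runs as a systemd unit named "ecs" (an assumption, verify locally).
ecs_restart_command() {
    case "$1" in
        upstart) echo "sudo stop ecs && sudo start ecs" ;;
        systemd) echo "sudo systemctl restart ecs" ;;
        *) echo "unknown init system: $1" >&2; return 1 ;;
    esac
}

# Guess the init system from the tools present on the host.
detect_init_system() {
    if command -v systemctl >/dev/null 2>&1; then
        echo systemd
    elif command -v initctl >/dev/null 2>&1; then
        echo upstart
    else
        echo unknown
    fi
}
```

On a container instance, ecs_restart_command "$(detect_init_system)" prints the command to run after editing /etc/ecs/ecs.config. Printing rather than executing keeps the final step a deliberate one.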

@smugcloud we are working on enabling that in the UI. My question was more about writing an end-to-end test in the ECS Agent for the splunk logging driver. If you look at the test that I mentioned earlier, we start a task (with a container using the awslogs driver) and validate that the container's logs can be read from the CloudWatch log stream. We were wondering if we could do something similar with splunk.

@smugcloud
Author

@aaithal I don't have a great suggestion for that. Validating that all of the available drivers are active for the agent is a valuable test: it would have saved me some time, and it would prevent regressions in the future.

Outside of that, I suppose you'd have to do something similar and ensure calls are being made to the HTTP endpoint.

@ericwush

@aaithal We have the same issue unless the environment variable is manually configured in the agent. Since we are able to specify the log driver in the task definition, should the agent take care of the environment variable without it being explicitly specified?

@aaithal
Contributor

aaithal commented Sep 20, 2016

@ericwush Logging drivers may stream logs to an external endpoint, which can require additional configuration on the instance. There isn't a straightforward, generic scheme for the ECS Agent to validate such configuration, and an unvalidated driver could lead to task start failures, so we chose not to enable all logging drivers by default on the ECS Agent.

@aaithal
Contributor

aaithal commented May 5, 2017

Hi All, we are closing this issue as the only missing item here is enabling the Splunk logging driver by default. Since it's hard for the ECS Agent to verify the driver configuration, we are not in favor of this being enabled by default. You can still override the environment variables and use the Splunk logging driver in your application. Please let us know if you have any additional comments.

Thanks,
Anirudh

aaithal closed this as completed May 5, 2017

@sahil-zymr

I am also facing the same issue with my setup. Is there a permanent fix provided by AWS, or do we have to run the command echo ECS_AVAILABLE_LOGGING_DRIVERS='["splunk","awslogs"]' >> /etc/ecs/ecs.config on all the ECS nodes?
Looking forward to a response.

@dmarginian

dmarginian commented Sep 6, 2024

It's frustrating that the ECS task definition provides an option to stream logs to Splunk, but doesn't enable the necessary configuration on the EC2 instance by default. I wasted hours troubleshooting why my tasks were stuck in the provisioning state, only to fail with the vague "TaskFailedToStart: Attribute" error. If additional setup is required (it shouldn't be), the UI should clearly indicate that. It would save a lot of time and frustration.

"we are closing this issue as the only missing item here is enabling the Splunk logging driver by default."
I don't think this is the correct decision. It doesn't look like the same step is required if you are using Fargate, per https://repost.aws/knowledge-center/ecs-task-fargate-splunk-log-driver. If it isn't required in Fargate, it shouldn't be required if you are using EC2.

For anyone encountering this issue, you can also adjust the Launch Template used by your Auto Scaling Group, instead of updating each EC2 instance. In the AWS Management Console:

  • Navigate to the Auto Scaling Group section.
  • Select the Launch Template.
  • Choose Actions -> Modify Template.
  • Scroll down to the Advanced Details section and add the following to the User Data section:
    echo ECS_AVAILABLE_LOGGING_DRIVERS='["splunk","awslogs"]' >> /etc/ecs/ecs.config
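  • After updating the template, replace your EC2 instances so that they launch from the new template version. User data normally runs only on an instance's first boot, so restarting an existing instance does not apply the change.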

This should resolve the issue and allow your tasks to start successfully.
