Lab Overview

In this lab, you will build a working web application, served from within your VPC, complete with a logging back end provided by Amazon Elasticsearch Service and real-time monitoring using Kibana. The application provides a movie search experience across 5,000 movies, powered by Amazon ES and served with Apache httpd and PHP. The logging infrastructure sends the httpd web logs to Amazon ES via Amazon ElastiCache for Redis, which buffers the log lines, and Logstash, which transforms and delivers records to Amazon ES.

All components of the solution reside in a VPC. In this lab, we explore how to use Amazon ES in a VPC for scalable log handling as well as for full-text search. In addition to the application and logging infrastructure, you will deploy an internet gateway to allow traffic to flow to your application via an Application Load Balancer, and a proxy/bastion instance to allow administrative and Kibana access.

For the logging infrastructure, we use Filebeat and Logstash on EC2, Amazon ElastiCache for Redis, and, of course, Amazon Elasticsearch Service. Filebeat is a host-based log shipper that remembers its position if interrupted. Logstash collects, transforms, and pushes your data to your desired store, which in this case is an Amazon Elasticsearch Service domain. Together, these components give you a flexible, configurable, privately networked option within your VPC that allows you to scale as your log volume increases.
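As a sketch of how these pieces fit together, a minimal Logstash pipeline for this pattern might look like the following. The Redis host, list key, and domain endpoint shown here are placeholders for illustration, not the values the lab's templates actually configure:

```
input {
  redis {
    host      => "my-redis-cluster.example.cache.amazonaws.com"  # placeholder ElastiCache endpoint
    data_type => "list"
    key       => "web-logs"   # placeholder list key that Filebeat pushes log lines onto
  }
}
filter {
  grok {
    # Parse Apache httpd combined-format log lines into structured fields.
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}
output {
  elasticsearch {
    hosts => ["https://vpc-my-domain.us-east-1.es.amazonaws.com:443"]  # placeholder Amazon ES VPC endpoint
  }
}
```

Redis acts as a buffer between Filebeat and Logstash, so log lines are not lost if the Logstash tier is scaled or restarted.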

Lab Goals

  • Deploy a secure end-to-end solution on private networking within a VPC
  • Host two indexes (movies and logs) with which the solution interacts
  • Leverage managed services from AWS and popular tools from the Elasticsearch ecosystem
  • Visualize the log interactions with Kibana

Lab Materials

The majority of this lab is driven by nested CloudFormation templates. The templates create the resources needed to achieve the goals of the lab without your having to worry about the details of setting up each component.

The templates are organized as follows:

  1. bootcamp-aes-moas – This template wraps the other templates below to provide a single template that you can execute to deliver all of the infrastructure.
  2. bootcamp-aes-network – builds the VPC, subnets, NAT gateway, and bastion used for the lab activities; the bastion hosts the SSH tunnel and proxy to the Amazon ES domain.
  3. bootcamp-aes-redis – builds the Amazon ElastiCache for Redis cluster.
  4. bootcamp-aes-domain – builds the Amazon Elasticsearch Service domain.
  5. bootcamp-aes-logstash – builds a Logstash deployment in an Auto Scaling group that pulls from Redis and pushes into the Amazon Elasticsearch Service domain.
  6. bootcamp-aes-servers – builds the final layer: the web application, an IMDb search engine. Each time a user interacts with the website, the request is logged.

Amazon Elasticsearch Service Feature Details

Placing an Amazon ES domain within a VPC enables secure communication between Amazon ES and other services without the need for an Internet gateway, NAT device, or VPN connection. All traffic remains securely within the AWS Cloud. Domains that reside within a VPC have an extra layer of security when compared to domains that use public endpoints: you can use security groups as well as IAM policies to control access to the domain.

To support VPCs, Amazon ES places an endpoint into either one or two subnets of your VPC. A subnet is a range of IP addresses in your VPC. If you enable zone awareness for your domain, Amazon ES places an endpoint into two subnets. The subnets must be in different Availability Zones in the same region. If you don't enable zone awareness, Amazon ES places an endpoint into only one subnet.

The following illustration shows the VPC architecture if zone awareness is not enabled.

The following illustration shows the VPC architecture if zone awareness is enabled.

Amazon ES also places elastic network interfaces (ENIs) in the VPC for each of your data nodes. Amazon ES assigns each ENI a private IP address from the IPv4 address range of your subnet and also assigns a public DNS hostname (which is the domain endpoint) for the IP addresses. You must use a public DNS service to resolve the endpoint (which is a DNS hostname) to the appropriate IP addresses for the data nodes:

  • If your VPC uses the Amazon-provided DNS server by setting the enableDnsSupport option to true (the default value), resolution for the Amazon ES endpoint will succeed.
  • If your VPC uses a private DNS server and the server can reach the public authoritative DNS servers to resolve DNS hostnames, resolution for the Amazon ES endpoint will also succeed.
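To see this resolution in action, you can query the endpoint's hostname from an instance inside the VPC. The hostname below is a placeholder, not your actual domain endpoint; substitute the endpoint shown in your domain's details:

```shell
# Resolve the (placeholder) VPC domain endpoint hostname.
# From inside the VPC you should see private IPs from your
# subnet's IPv4 range, one per data-node ENI.
dig +short vpc-labdomain-awsawsawsaws.us-east-2.es.amazonaws.com
```

If the command returns no addresses, check that DNS resolution is enabled for the VPC and that your DNS server can reach the public authoritative servers.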

Preparing Key Pairs

To access the EC2 instances deployed by the lab, you need an SSH key pair. You can register a key pair for use with EC2 by following the instructions below.

Using existing SSH Key

http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html#how-to-generate-your-own-key-and-import-it-to-aws

Creating a new SSH Key with EC2

http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html#having-ec2-create-your-key-pair
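If you prefer the command line, a key pair can also be created and saved locally along the following lines. The key name lab-key is just an example; pick any name and remember it for the KeyName stack parameter later:

```shell
# Create a new EC2 key pair and write the private key material to a file.
aws ec2 create-key-pair \
    --key-name lab-key \
    --query 'KeyMaterial' \
    --output text > lab-key.pem

# Restrict the file's permissions so ssh will accept the key.
chmod 400 lab-key.pem
```

Note that EC2 stores only the public half of the key; if you lose the .pem file, you must create a new key pair.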

Additional Instructions for Windows

Windows - http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/putty.html#putty-private-key

Building the solution

Create a service linked role for Amazon Elasticsearch Service

If you have never created an Amazon Elasticsearch Service domain in a VPC in your account, you will need to create a new role. Check whether it exists by navigating to IAM, clicking the Roles link, and searching for "Elastic". Look for AWSServiceRoleForAmazonElasticsearchService.

If the role exists, you can proceed to the main lab.
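You can also check for the role from the CLI; the command prints the role's details if it exists, and exits with a NoSuchEntity error if it does not:

```shell
# Look up the Amazon ES service-linked role by its well-known name.
aws iam get-role --role-name AWSServiceRoleForAmazonElasticsearchService
```

This requires CLI credentials with permission to call iam:GetRole, which the policy in the next section grants.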

Ensure you have a user with a secret / access key that can execute IAM commands

If you have not created an AWS IAM user in your account, please go to the following link to create one. You will need the user's access key ID and secret access key to use the CLI, unless you are running on an EC2 instance with an IAM role.

https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html

Give the user the following permissions (create a policy, then attach it to the user or to a role the user can assume):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "esclass",
            "Effect": "Allow",
            "Action": "iam:*",
            "Resource": "*"
        }
    ]
}

Create the access key ID and secret access key: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html
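As a sketch, assuming the policy JSON above is saved as esclass-policy.json and your user is named lab-user (both names are placeholders), the policy can be created and attached from the CLI like so; replace 123456789012 with your own account ID:

```shell
# Create a managed policy from the JSON document above.
aws iam create-policy \
    --policy-name esclass-policy \
    --policy-document file://esclass-policy.json

# Attach the policy to the (placeholder) user; the ARN embeds your account ID.
aws iam attach-user-policy \
    --user-name lab-user \
    --policy-arn arn:aws:iam::123456789012:policy/esclass-policy
```

The create-policy output includes the policy's full ARN, which you can copy into the attach-user-policy call instead of constructing it by hand.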

Ensure you have the latest version of the AWS CLI installed on your machine

Please navigate to the following link to install the CLI if it is not already installed: https://docs.aws.amazon.com/cli/latest/userguide/installing.html

If you already have the CLI, ensure it is the latest version. Use pip install --upgrade awscli, your package manager (for example, yum), or grab the latest .msi from the link above.

Once the CLI is up to date and configured via the aws configure command (enter your access key ID and secret access key; for the region, choose us-east-1, although any region works since IAM is region-agnostic; and choose json as the output format), execute the following command:

aws iam create-service-linked-role --aws-service-name es.amazonaws.com

Install the CloudFormation templates

Sign into your AWS account and navigate to the CloudFormation service

Click on the CloudFormation service to get into the service console.

Click the Create Stack button; you will be presented with the following set of options.

Select the "Specify an Amazon S3 template URL" and enter the following path:

https://search-sa-log-solutions.s3-us-east-2.amazonaws.com/logstash/templates/json/bootcamp-aes-moas

Click the Next button to navigate to the parameters needed by the CloudFormation template.

Populate the parameters needed to create the stack

Most of the parameters are pre-populated, and you will not need to change them. The stack name, your SSH key, and the email address are the only fields you need to change for this template if you are not using a shared account. If you are using a shared account, make sure you vary the stack name, domain name, and environment tag, since these differentiate your deployment in a shared account.

Let's review the inputs and their meaning:

  1. Stack Name – the name for this CloudFormation stack. You will find the details on the Amazon ES domain, the IP address of the bastion, and the URL for the web server in the Outputs section of this stack. Your initials will suffice.
  2. CIDRPrefix – (use default) – this Class B block is used as the seed to create a /21 VPC with two /24 public and two /24 private subnets across two AZs.
  3. ElasticsearchDomainName – (use default) – the name for your Amazon ES domain.
  4. EnvironmentTag – (use default) – used to tag your resources.
  5. KeyName – the key pair name you created.
  6. OperatorEMail – the email address to receive Auto Scaling notifications.

Click Next.

Leave the options blank on the Options screen and click Next.

Click the check box by I acknowledge that AWS CloudFormation might create IAM resources with custom names and then click Create.

CloudFormation will kick off the deployment of the other templates to their own stacks. It can take up to 30 minutes for the whole process to complete. You will see stacks marked NESTED, indicating child stacks created by the parent template. Click the name of your stack (aes in my case) to see the details of the creation.

When the aes stack is done, you will see it marked CREATE_COMPLETE.

Click the check box next to aes to reveal details.

Then click the Outputs tab.

Launch the Application

Go to the output value called ApplicationLoadBalancerURL.

Let's go ahead and hit the website. Using the URL from the Outputs section of the web server stack, navigate to the home page.

If you then click the "Search IMDb for Movies" button, you will be presented with the search page. In the "Search movies" box, enter something like "ship", "car", etc.

Type in a couple of words and click Search to get some search results.

Now that we have the solution up and running, let's start visualizing our data.

Please use the following instructions for creating a tunnel and a proxy to the Amazon Elasticsearch Domain.

Linux / Mac:

To load Kibana in your laptop's browser, you need to send the traffic to your Amazon ES domain. Since the domain is in your VPC and your Amazon ES cluster is in a private subnet, you must pass the traffic through the Linux management portal. You need the public IP address of the Linux management portal instance. Navigate to the CloudFormation console and click the parent stack to view the details.

Then click the Outputs tab.

Find the LinuxAndMacPortForwardingCommand output. Open the Terminal app and use the command found in its value to set up SSH tunnel forwarding. For example, my value looks like this:

ssh -i </path/to/your/key.pem> -N -L 9200:vpc-labdomain-awsawsawsaws.us-east-2.es.amazonaws.com:80 ec2-user@11.22.33.44

Be sure to use the location of your pem file (created in prior instructions for the lab) as the replacement </path/to/your/key.pem>.

Now you can access Kibana via http://localhost:9200/_plugin/kibana
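To verify the tunnel is working before opening the browser, you can query the cluster through the forwarded port; the tunnel command from the previous step must still be running in another terminal. A healthy tunnel returns a small JSON document with the cluster name and Elasticsearch version:

```shell
# Query the Amazon ES domain root through the local end of the SSH tunnel.
curl -s http://localhost:9200/
```

If curl hangs or fails to connect, confirm that the ssh process is still running and that you replaced the key path and endpoint with your own values.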

For Windows - See this document : https://search-sa-log-solutions.s3-us-east-2.amazonaws.com/logstash/docs/Kibana_Proxy_SSH_Tunneling_Windows.pdf