Technical Challenges for Large Enterprises using AWS

Large enterprises must exercise caution and make informed decisions before planning to retire their data centers and host entirely out of public cloud providers. Though the cloud offers numerous well-known benefits, it also exposes vulnerabilities within existing systems and requires significant investment to keep up the automation and operations.

For example, centralized functions of a large enterprise (such as Splunk administration) call for immediate decentralization, which requires training and duplicating the setup across accounts/VPCs. Similarly, services that existed without authentication within a hardened data center must now implement authentication, and their numerous clients must in turn update to the new authentication scheme.

Though AWS offers a very clean solution for a number of use cases, it is just not there yet as a full-blown, enterprise-wide solution. The AWS platform needs significant design iterations to meet the complex needs of an enterprise. In this post, I want to call out some of the technical challenges with AWS that enterprises have to deal with.

VPC and Subnets
The VPC CIDR range is limited to /16, which means only 65,536 (64K) IPv4 addresses are available within a given VPC. With increased adoption of cloud-native technologies, 64K IP addresses do not suffice for a large enterprise implementation, and the enterprise has to spin up multiple VPCs.
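
The arithmetic behind that limit can be checked with Python's standard `ipaddress` module. This is only a sketch: the `10.0.0.0` base address and the `/24` subnet size are illustrative assumptions, not values from any real deployment.

```python
import ipaddress

# A /16 is the largest block a VPC accepts, per the text above.
# The 10.0.0.0 base address is an arbitrary example.
vpc = ipaddress.ip_network("10.0.0.0/16")
total_addresses = vpc.num_addresses  # 2 ** (32 - 16)

# Carve the VPC into /24 subnets (the /24 size is an assumption for illustration).
subnets = list(vpc.subnets(new_prefix=24))
addresses_per_subnet = subnets[0].num_addresses

# AWS reserves 5 addresses in every subnet, so the usable count is lower still.
AWS_RESERVED_PER_SUBNET = 5
usable_addresses = len(subnets) * (addresses_per_subnet - AWS_RESERVED_PER_SUBNET)

print(total_addresses, len(subnets), usable_addresses)
```

With these assumed sizes, a fully subnetted /16 ends up with fewer usable addresses than the headline 64K figure suggests.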

In addition, neither VPC nor subnet CIDRs can be changed or expanded after the initial CIDR definition. It is quite possible not to get the sizing right in the first implementation, and any change to a VPC or subnet range requires a complete teardown and rebuild of the VPC. Some portions of VPC/subnet creation are also quite tedious to fully automate.

Inter VPC Connectivity
VPC peering has its limitations: the VPCs have to be in the same region and must not have overlapping IP space. Currently it is not possible to have private connectivity across two VPCs in different accounts. When an enterprise is within a data center, private connectivity is implied and readily available. As the enterprise starts its journey to the cloud, private connectivity options are limited and services must harden their security for cloud hosting.
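
Because overlapping IP space rules out peering, it can help to check planned VPC ranges against each other before anything is built. The sketch below uses only Python's standard `ipaddress` module; the function name `find_overlaps`, the VPC names and the CIDR values are all hypothetical.

```python
import ipaddress
from itertools import combinations


def find_overlaps(vpc_cidrs):
    """Return sorted pairs of VPC names whose CIDR blocks overlap."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in vpc_cidrs.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(networks), 2)
        if networks[a].overlaps(networks[b])
    ]


# Hypothetical address plan: vpc-b sits inside vpc-a's range, vpc-c does not.
planned = {
    "vpc-a": "10.0.0.0/16",
    "vpc-b": "10.0.128.0/17",
    "vpc-c": "10.1.0.0/16",
}
overlaps = find_overlaps(planned)
print(overlaps)
```

Any pair reported here could not be peered without renumbering one side, which, given the fixed CIDRs described earlier, means rebuilding a VPC.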

Segmentation
AWS does not offer clear segmentation across security zones (dev, lab and prod). This makes it quite challenging to support continuous deployment pipelines across security zones without an extensive automation framework.

IAM and AD
AWS IAM provides some powerful functionality through IAM roles, which eliminate the need for key exchange. However, there is no direct correlation between IAM and Active Directory (AD). Enterprises have a lot of functions, including service accounts, wired up through AD, and it is quite challenging to marry AD and IAM functionality.

IP Whitelisting
IP whitelisting is quite a common practice for outbound connectivity in large enterprises. However, AWS does not offer IP whitelisting on its Internet/NAT Gateways. Implementers must deploy Squid or an equivalent proxy, which brings its own challenges.
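
To make the requirement concrete, here is a minimal sketch of the decision an outbound proxy has to make for each connection. It is not Squid configuration and not an AWS feature. The name `is_allowed` and the whitelist entries (taken from documentation-reserved address ranges) are hypothetical.

```python
import ipaddress

# Hypothetical outbound whitelist: one partner network and one single host.
ALLOWED_DESTINATIONS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.10/32"),
]


def is_allowed(destination_ip):
    """Return True if the destination falls inside any whitelisted network."""
    address = ipaddress.ip_address(destination_ip)
    return any(address in network for network in ALLOWED_DESTINATIONS)


for candidate in ("203.0.113.7", "198.51.100.11"):
    print(candidate, "allowed" if is_allowed(candidate) else "blocked")
```

A real proxy also has to deal with DNS-based destinations, TLS and high availability, which is where much of the operational burden comes from.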

EC2 Container Service
ECS events do not offer useful information such as whether a given deployment succeeded, failed, or even completed. In addition, EC2 Container Service does not support canary or blue/green deployments out of the box.

AMIs
Rolling out an updated AMI is a basic requirement for enterprises. In an on-premises setup, companies have well-established processes for rolling out updates to keep systems current with security patches. However, it is quite challenging to apply an updated AMI across the EC2/ECS clusters within a VPC.

API Gateway
Using API Gateway in front of HTTP endpoints is untenable, as it requires those endpoints to be fully public.

Elasticsearch
AWS Elasticsearch does not support plugins or dynamic scripts, which limits how Elasticsearch can be used. In addition, access policy configuration has issues that can inadvertently expose sensitive data outside of the enterprise.

Lambda Limits
Lambda concurrency limits are set at the account level. When multiple teams share the same account, it is quite possible for one team to inadvertently choke the performance of other teams' services.
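
The effect of a shared limit can be shown with a toy model. This is plain Python, not the Lambda API. The function `admit`, the team names and the limit of 1000 are illustrative assumptions, and real throttling is more nuanced than first come, first served.

```python
def admit(requests, account_limit):
    """Grant concurrent executions in arrival order until the shared limit is hit."""
    running, throttled = {}, {}
    in_flight = 0
    for team, wanted in requests:
        granted = min(wanted, account_limit - in_flight)
        in_flight += granted
        running[team] = running.get(team, 0) + granted
        throttled[team] = throttled.get(team, 0) + (wanted - granted)
    return running, throttled


# Hypothetical burst: team-a arrives first and takes most of the shared pool.
running, throttled = admit([("team-a", 950), ("team-b", 200)], account_limit=1000)
print(running, throttled)
```

In this model team-b is throttled even though its own demand is modest, simply because it shares an account with a busier neighbor.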

Multi-Region Support
Most AWS services are region-bound, yet AWS does not offer any support for multi-region connectivity or configuration out of the box. Implementations need to go through several hassles to get a multi-region configuration working. With the recent S3 outage, reliance on a single-region, multi-AZ configuration is not tenable.

The list goes on for almost every service that AWS offers. Customers of the cloud platform provider either have to discover innovative ways and invest significantly to make the solution work, or wait indefinitely for feature requests. On the AWS forums, you can see some compelling features for different services sitting on the backlog for multiple years.

*** Commentary is based on AWS capabilities as of July 2017. Note that some of the gaps called out in this post may be closed by AWS in the future.
