CHALLENGE
BlueAlly’s online retail client had failed both an internal and an external PCI DSS audit and was paying fines. One more external audit failure would cost them the ability to accept credit cards on their highly profitable eCommerce portal. The project had the attention of the CIO and other members of the C-suite because the business was at significant risk if it failed the next audit.
What exactly is PCI DSS?
The Payment Card Industry (PCI) Data Security Standard (DSS) identifies Cardholder Data (CHD) and defines how to protect it. The standard defines three categories of systems for CHD:
- Category 1: any system that handles or stores CHD (1a) or a system so tightly linked with a 1a that it cannot be separated (1b).
- Category 2: systems that manage Category 1 systems or exchange data with them. This would include systems management, logging, NOC, and SOC access.
- Category 3: systems with no access to CHD and which cannot access Category 1 systems.
It also defines the communications between these categories:
Figure 1: PCI DSS Communications. Source https://www.pcisecuritystandards.org
STRATEGY
Summary: BlueAlly started with a review of the failed audits, which led to an assessment of the environment. The assessment uncovered that several hundred systems were involved and that much of the environment was still in transition from bare metal to VMware. The overall project involved network, security, server, and application teams who were operating with no clear, coordinated direction from management.
There were multiple paths to a solution, so BlueAlly presented a summary of the workload associated with each and proposed leading the effort on an integrated solution.
Audit Failure Analysis
The audit failures looked random at first, but analysis showed a pattern. After change windows associated with data center security, well-known PCI Category 1 systems were often reachable by Category 3 systems or unreachable from other Category 1 systems. Adding or changing any Category 1 or Category 2 system required work across up to a dozen firewall pairs, and every extra manual change was a chance to open a security hole.
Scope of Required Firewall Rules
We quickly estimated that they were using over 500 PCI Category 1 (a & b) application systems that communicated among themselves and with over 100 PCI Category 2 systems. Compliance required creating and testing well over 100,000 IP address-based firewall rules using their existing firewall systems. Further complicating the task, adding or changing any Category 1 or Category 2 system was difficult to automate since each subsystem was different (some had only three firewall pairs while others used as many as 12).
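To see why the rule count grows so quickly, here is a rough sketch of the arithmetic. It assumes, purely for illustration, that every pair of communicating systems needs its own IP address-based rule; the function name is hypothetical, and the actual count in the engagement depended on which systems really talked to each other and on how many firewall pairs each flow crossed.

```python
def estimate_pairwise_rules(cat1_systems: int, cat2_systems: int) -> int:
    """Estimate IP-based rules if every communicating pair needs one rule.

    Assumption (not from the audit data): each Category 1 system may talk to
    every other Category 1 system and to every Category 2 system.
    """
    # Unordered pairs of Category 1 systems communicating among themselves.
    cat1_pairs = cat1_systems * (cat1_systems - 1) // 2
    # Each Category 1 system paired with each Category 2 system.
    cat1_to_cat2_pairs = cat1_systems * cat2_systems
    return cat1_pairs + cat1_to_cat2_pairs


# Using the lower bounds quoted above: 500 Category 1 and 100 Category 2 systems.
print(estimate_pairwise_rules(500, 100))  # 174750
```

Under that assumption the estimate is driven by the square of the Category 1 system count, which is why both shrinking that count and replacing per-address rules with per-category rules matter.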
Competing Problems
There were additional complications surrounding the overall eCommerce environment.
- Applications – The Applications team was migrating from a waterfall development paradigm to agile, but this work had been very slow. The PCI DSS security issues were highly problematic for them, and they did not like having their process burdened by the requirements imposed on Category 1 systems.
- Systems – The Systems team dealt with servers and operating systems and worked independently from the Applications and Network teams. They had selected and partially implemented VMware and had purchased VMware NSX but had not yet been trained on it when this project started.
- Network – The data centers were very large, having been built over a period of 10 years and having three distinct generations of switching equipment. Applications were placed on systems based on available rack space and power instead of security and connectivity requirements, greatly complicating firewall configurations.
- Security – There was no cohesive strategy; security rules were added and deleted on request, which left unqualified application owners to request rules piecemeal.
A Path Forward
A consensus emerged to have a PCI team meeting with leads from each organization, and BlueAlly was able to facilitate cooperative work:
- Applications – After some simple discussion (and some pressure from their CIO), the team realized the systems could be modularized, and the number of Category 1 systems could be reduced dramatically without significantly impacting their schedules. The final count of Category 1 systems was reduced to under 100.
- Systems – BlueAlly engineers assisted with spinning up an NSX demonstration, and a decision was made to prioritize and migrate the eCommerce portal to VMware.
- Network – NSX required changes to the data center fabric, so steps were taken to reconfigure the data centers to support the VXLAN protocol it needed.
- Security – BlueAlly developed a security strategy to align with the compliance model.
SOLUTION
Summary: We believed that no single IT group could solve the issues. The solution was to engage all teams in a coordinated effort to meet the deadlines. This involved having the systems team accelerate the VMware conversion and bringing their network and security operations teams up to speed on the technology. In addition, BlueAlly worked with their compliance and applications teams on the importance of clearly identifying PCI-impacted systems.
Security Segmentation
The PCI standard documents the protected data and defines the communications permitted between categories (as shown in Figure 1 above).
To keep this simple, we created a new security strategy based on a network overlay, using the seven PCI DSS categories and subcategories (1a, 1b, 2a, 2b, 2c, 2x, and 3) as our guide. All of the rules associated with PCI DSS compliance could now be enforced with firewall rules governing communications between the segments. This let the solution move from over 100,000 IP address-based firewall rules to fewer than 1,000 rules, with the bulk of the security handled by just a few dozen rules within VMware NSX.
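The idea of writing rules against categories rather than IP addresses can be sketched as follows. This is a plain Python model, not the NSX API, and the host names and the `DENIED_CATEGORY_PAIRS` table are hypothetical. Only the Category 1 to Category 3 prohibition is taken from the text; treating every other category pair as permitted is a simplifying assumption.

```python
# Hypothetical inventory: each system is tagged with its PCI subcategory.
SEGMENT_OF = {
    "pay-db-01": "1a",   # stores CHD
    "pay-app-01": "1b",  # tightly coupled to a 1a system
    "log-01": "2a",      # logging system that serves Category 1
    "web-01": "3",       # no access to CHD
}

# Category pairs that must never communicate, in either direction.
# Only 1 <-> 3 comes from the text; all other pairs are assumed permitted.
DENIED_CATEGORY_PAIRS = {frozenset({"1", "3"})}


def base_category(subcategory: str) -> str:
    """Map a subcategory such as '1a' or '2x' to its category digit."""
    return subcategory[0]


def is_allowed(src: str, dst: str) -> bool:
    """Decide a flow from the systems' categories, never from their addresses."""
    pair = frozenset({base_category(SEGMENT_OF[src]), base_category(SEGMENT_OF[dst])})
    return pair not in DENIED_CATEGORY_PAIRS


print(is_allowed("web-01", "pay-db-01"))   # False
print(is_allowed("pay-db-01", "web-01"))   # False
print(is_allowed("log-01", "pay-db-01"))   # True
```

Adding a new system in this model means adding one tag to the inventory, not a new set of rules, which is the property that kept the rule count tied to the number of categories rather than the number of systems.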
Furthermore, the attestation and audit processes were also greatly simplified:
Figure 2: PCI DSS Requirements and Security Assessment Procedures. Source https://www.pcisecuritystandards.org
Micro Segmentation
There were no regulatory requirements regarding micro-segmentation, but as a bonus, the NSX Security Group features used for security segmentation also made it possible to limit lateral spread within segments.
Why do this? In modern eCommerce environments, multiple VMs exist for a given function, and they can even spin up (and down) resources in response to load. The lateral spread of malware between networks is well understood, but spread within a functional subnetwork is less so. Micro-segmentation can prevent systems that run in parallel from infecting each other.
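A small simulation can make the lateral-spread argument concrete. It is a sketch under stated assumptions: the VM names, the group names, and the single allowed flow from the web group to the app group are all hypothetical, and the model ignores ports and protocols.

```python
from collections import deque

# Hypothetical VMs in one segment: three parallel web VMs and one app VM.
GROUP_OF = {"web-01": "web", "web-02": "web", "web-03": "web", "app-01": "app"}

# Assumed micro-segmentation policy: web VMs may call the app tier, nothing else.
ALLOWED_GROUP_FLOWS = {("web", "app")}


def flat_segment(src: str, dst: str) -> bool:
    """Without micro-segmentation, every VM in the segment can reach every other."""
    return True


def micro_segmented(src: str, dst: str) -> bool:
    """With micro-segmentation, only explicitly allowed group-to-group flows pass."""
    return (GROUP_OF[src], GROUP_OF[dst]) in ALLOWED_GROUP_FLOWS


def reachable_from(start: str, can_talk) -> set:
    """Breadth-first search over permitted flows, starting at a compromised VM."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for host in GROUP_OF:
            if host not in seen and can_talk(current, host):
                seen.add(host)
                queue.append(host)
    return seen


print(sorted(reachable_from("web-01", flat_segment)))
print(sorted(reachable_from("web-01", micro_segmented)))
```

In the flat case a compromised `web-01` can reach all four VMs. With the assumed policy it can reach only `app-01`, the one flow the application needs, and its parallel peers `web-02` and `web-03` stay out of reach.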
RESULTS
The customer passed their PCI audit and created systems, procedures, and processes to maintain compliance.
The security segmentation strategy permitted easy attestation by the external auditor that no PCI Category 1 systems could be reached by any Category 3 systems (and vice versa). Furthermore, the varying rules associated with PCI Category 2 (a, b, c, and x) were also greatly simplified. While PCI DSS did not require micro-segmentation within each virtual network, the auditor noted that it provided additional protection against lateral spread.