Review: Compellent Storage Center – Part I
This is the first in a series of posts reviewing the Compellent Storage Center storage array.
Compellent Inc, founded in 2002, produces the Storage Center product, a SAN storage array built around commodity hardware. In addition to providing advanced features found on newer storage arrays (such as thin provisioning), the Compellent device has one unique (for now) feature that sets it apart from the competition: the ability to tier storage at the block level, a feature known as Dynamic Block Architecture. Where traditional arrays place an entire LUN onto a single tier of storage, Storage Center breaks the LUN down into smaller chunks, allowing finer granularity in the way data is written to disk. As we will see in this review of the hardware and software features, there's more to the tiering than initially appears.
Company Background
Before we get into the technical specifications, let's look at the company in more detail. As previously mentioned, Compellent Inc was founded in 2002 and is based in Minnesota, USA. The company is publicly traded, having filed for IPO in October 2007, and now claims over 1,000 customers in 25 countries with over 2,000 array deployments. Since the IPO, the company has moved into profitability and increased revenue and margin consistently. See the embedded graph for more details. In recent months, Compellent has been seen as an acquisition target, following the bidding war between HP and Dell for 3Par. It remains perhaps one of the few independent SAN storage array vendors targeting the tier-1 or Enterprise-class space.
In the remainder of this post, we’ll look at the hardware itself.
The Hardware
Compellent have provided one of their Storage Center Model 30 controllers (CT-SC030) with two disk shelves for the review. The disk shelves contain SSD, SAS and FC drives, enabling configurations of up to three tiers to be tested. We’ll look at those in a moment.
The controller itself is pretty straightforward: a standard PC chassis and motherboard with the following specifications:
- 3GB onboard memory
- Intel Xeon 5160 – 3GHz
- SuperMicro Motherboard
- Dual redundant power supplies
- Six on-board fans
- 4x PCI-Express expansion slots
- 1x PCI-X expansion slot
- 2x on-board GigE Ethernet ports
The expansion slots are used to support external connectivity to hosts and disk shelves. The review model was supplied with QLogic iSCSI HBAs in slots 1 and 2, a QLogic QLE2464 in slot 3 providing both front-end and back-end Fibre Channel connectivity, and a SAS controller in slot 6. The only non-commodity part of the hardware is a cache controller that sits in slot 5; this is manufactured directly by Compellent rather than sourced from a third-party supplier. Power supplies are hot-swappable; fans, cache cards and interface cards are not, so replacing them requires downtime unless the array is part of a dual-controller configuration and, in the case of interface cards, has been configured in a redundant design. This is clearly a consideration when choosing a storage system, as powering down for parts replacement is intrusive, both in terms of scheduled maintenance slots and outages caused by failed components.
Two disk shelves (termed enclosures) have been provided with the evaluation unit. One houses SSD and Fibre Channel drives and is FC-connected to the controller; the other is SAS-connected and holds large-capacity (1TB) SAS drives. Each enclosure contains dedicated power supplies, fans and I/O modules with redundancy built in, which increases the overall availability of a single Compellent array solution. Fibre Channel enclosures hold up to 16 drives in a horizontal 4×4 configuration occupying 3U; SAS enclosures hold up to 12 drives horizontally in 2U. All drives are hot-swappable. Drive capacities and types currently supported include (excluding EOL models):
- Fibre Channel – 15K 300GB & 15K 450GB
- SATA – 500GB & 1TB
- SSD – 140GB
- SAS – 15K 450GB & 7.2K 1TB
In the evaluation equipment, the SSD drives were supplied by STEC and the remaining drives were Seagate models, but presumably they could be sourced from multiple manufacturers, as the drives reported their standard model names in the Storage Center GUI.
Both the controller unit and enclosures look pretty nondescript (see the videos at the foot of this post showing the controller with the bezel removed). In my opinion, the look of hardware is much less important than the reliability and functionality it offers (HP storage products, for example, all look like servers and enclosures). All of the components of the Storage Center hardware (slots, power supplies, fans) are visibly monitored from the central management tool, providing consistent reporting on the hardware status at any time. This level of detail is much more important than the colour of the front bezel, in my view. As we will see in the next few posts, the "secret sauce" is achieved through software rather than bespoke hardware components. In the meantime, enjoy this brief video of the hardware as it was being installed.
[vimeo]http://vimeo.com/14623557[/vimeo]
Review: Compellent Storage Center – Part II
In this post we’ll discuss the logical configuration, connectivity and protocols available on the Compellent Storage Center array, including the way disks are grouped and LUNs are created from the underlying storage.
Where’s My LUN?
The first thing to note as we dive into the detail of how the Compellent array stores data is that it does not operate like traditional storage arrays, with disks in fixed RAID groups and LUN configurations; instead it uses the previously mentioned Dynamic Block Architecture. RAID-10, RAID-10 DM (Dual Mirror), RAID-5 and RAID-6 configurations are supported (including RAID-5 with 5 or 9 drives in a RAID set and RAID-6 with 6 or 10 drives), however that's where the similarity with traditional arrays ends. The underlying physical disks are simply grouped together to provide raw disk capacity, and LUNs are configured from that storage. RAID is applied to each individual block of a LUN and can change over the lifetime of that block of data. At the outset this may seem like a complicated design, but in reality it isn't. By breaking a LUN down to the block level and then applying protection and performance criteria, Compellent can achieve higher performance from a system using less cache and, crucially, fewer high-performance drives. Let's start at the basic disk level and work up to define how the Compellent system works.
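To make the idea concrete, here is a minimal sketch of block-level placement. The class and label names are my own and not Compellent's internal structures: the point is simply that each fixed-size page of a volume carries its own tier and RAID level, so a single LUN can span several tiers at once.

```python
# Hypothetical illustration of block-level placement versus traditional whole-LUN placement.
from dataclasses import dataclass


@dataclass
class Page:
    """A fixed-size chunk of a volume; each page carries its own tier and RAID level."""
    lba: int          # starting logical block address within the volume
    tier: str         # e.g. "tier1-ssd", "tier2-15k", "tier3-sata"
    raid: str         # e.g. "RAID-10", "RAID-5-5", "RAID-6-10"


class Volume:
    """A LUN presented to hosts; placement is decided per page, not per LUN."""
    def __init__(self, name: str, size_pages: int):
        self.name = name
        # New writes start life on the fastest tier with RAID-10 protection.
        self.pages = [Page(lba=i, tier="tier1-ssd", raid="RAID-10")
                      for i in range(size_pages)]

    def demote(self, lba: int, tier: str, raid: str):
        """Re-place a single page without touching the rest of the volume."""
        self.pages[lba].tier = tier
        self.pages[lba].raid = raid


vol = Volume("mail01", size_pages=4)
vol.demote(2, tier="tier3-sata", raid="RAID-5-9")   # one cold page moves down a tier
print([(p.lba, p.tier, p.raid) for p in vol.pages])
```

Contrast this with a traditional array, where tier and RAID level would be attributes of the whole LUN rather than of each page within it.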
Disk Pools
Physical disks are classified by their rotational speeds, which effectively groups them by performance. In the review hardware, drives were classified as SLCSSD (the STEC SSD drives), 15K (the Fibre Channel drives) and 7K (the SAS drives). By default all disks are added to a single group (or folder) called "Assigned". Disks that are not in use sit in a dummy folder called "Unassigned", from where they can be added to a new or existing disk folder. Compellent recommend keeping a single disk folder, as spares must be assigned within each group; having multiple folders would both waste spares from a capacity perspective and reduce performance, as I/O would be spread over fewer spindles. Of course, it is possible to create separate groups if you wish.
As part of the disk folder definition, either single or dual parity must be specified for each tier. Screenshots 2 and 3 show the setup of the default "Assigned" group and a second "New Disk Folder 1". There are two other things to say about disk folders. Firstly, disks can be removed from a folder; this requires "evacuating" the disk, which can be achieved by moving it to another dummy folder, or, if the disk contains no data, it can simply be removed from the folder. Secondly, as disks are added to a folder there is a risk of RAID imbalance, with all of the existing data sitting on the disks already in the folder. Therefore, as disks are added, the RAID configuration can be rebalanced to obtain optimum use of all spindles.
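As a rough illustration of the folder model described above, the following sketch groups a handful of drives into a single folder by performance class and reserves a spare per class. The dictionary fields, classification rules and spare-selection logic are my own simplification, not the array's internal data model or its actual spare policy.

```python
# Illustrative grouping of drives into one "Assigned" folder, split by performance class.
from collections import defaultdict

disks = [
    {"id": "01-01", "type": "SSD", "rpm": None,  "size_gb": 140},
    {"id": "01-02", "type": "FC",  "rpm": 15000, "size_gb": 450},
    {"id": "01-03", "type": "FC",  "rpm": 15000, "size_gb": 450},
    {"id": "02-01", "type": "SAS", "rpm": 7200,  "size_gb": 1000},
    {"id": "02-02", "type": "SAS", "rpm": 7200,  "size_gb": 1000},
]


def classify(disk):
    """Group drives by performance class, loosely mirroring the GUI's classifications."""
    if disk["type"] == "SSD":
        return "SSD"
    return f'{disk["rpm"] // 1000}K'          # 15000 -> "15K", 7200 -> "7K"


folder = defaultdict(list)                    # the single "Assigned" folder
for d in disks:
    folder[classify(d)].append(d)

# Reserve one spare per class within the folder (a simplification): a second folder
# would need its own spares, which is one reason a single folder wastes less capacity.
spares = {cls: members[-1]["id"] for cls, members in folder.items() if len(members) > 1}
print(sorted(folder), spares)
```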
Storage Profiles
The concept of Storage Profiles is where the Compellent Storage Center "secret sauce" is to be found. These determine how the system writes data to disk for each LUN (or volume, as it is known in Compellent terminology) and how data ages over time – a feature called Data Progression. Let's look first at the profiles.
For each volume/LUN in an array, the Storage Profile determines how data is written to disk. Storage Profiles have two components: where writable data should be placed, and where Replay data should be located. It's worth taking a moment to understand what Replays are, as I've yet to mention them. Replays are essentially snapshots, used to return a volume to a previous point in time. By their nature, Replay data blocks are only ever used for reads, as all writes made after a snapshot/Replay is taken are written to a new location in order to preserve the Replay for a potential future restore. Replay blocks are therefore not part of the active write set of a LUN and don't always need to reside on high-performance storage; if they are being read frequently, they will be served from cache. Storage Profiles allow the administrator to specify what should happen to both writable blocks and Replay blocks for a volume. A high-performance LUN could, for example, have its writable data on tier 1 storage in a RAID-10 configuration and its Replays on RAID-5 SAS. A medium-performance volume could have writable data on tier 2 15K Fibre Channel and Replay data on SATA.
The use of Storage Profiles provides some very important benefits in optimising the performance of the disks in a Storage Center array. They allow the exact performance criteria to be specified on a per-LUN basis. In addition, only active data is retained on the highest-performance storage, with inactive data moved off to lower-performing (and lower-cost) devices. As RAID is established at the block level, this means a write granularity of 2MB in a standard configuration; if required, volumes can be created using 512KB blocks where writes are small.
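A Storage Profile can be thought of as a pair of placement rules: one for actively writable pages and one for Replay pages. The sketch below is purely illustrative; the profile names and tier/RAID pairings are examples of the kind of combinations described above, not the product's shipped defaults.

```python
# Hypothetical expression of a Storage Profile: a placement rule for writable pages
# and another for Replay (snapshot) pages.
HIGH_PRIORITY = {
    "writable": {"tier": 1, "raid": "RAID-10"},   # active writes on tier 1, RAID-10
    "replay":   {"tier": 2, "raid": "RAID-5"},    # frozen Replay pages on RAID-5
}

MEDIUM_PRIORITY = {
    "writable": {"tier": 2, "raid": "RAID-10"},
    "replay":   {"tier": 3, "raid": "RAID-5"},
}


def placement(profile: dict, page_is_replay: bool) -> dict:
    """Decide where a page belongs based on whether it is frozen in a Replay."""
    return profile["replay"] if page_is_replay else profile["writable"]


print(placement(HIGH_PRIORITY, page_is_replay=False))   # -> tier 1, RAID-10
print(placement(HIGH_PRIORITY, page_is_replay=True))    # -> tier 2, RAID-5
```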
Replays and Data Progression
Although I've touched on the subject of classifying data into active writes and Replays, I haven't explained the actual mechanism by which data moves between these groups. There are two methods by which data is migrated between tiers of storage: via Data Instant Replay and through Data Progression. Replays, as we have discussed, are point-in-time snapshots of volumes. When a Replay is taken, all of the pages comprising a volume are frozen, and subsequent writes to the volume are made to new blocks on storage. This preserves the data at the point the Replay was taken and, quite helpfully, also allows the blocks that are being actively written to be distinguished from those which are inactive. The Replay blocks can then be moved to a lower tier of storage. Compellent recommend that every volume has a Replay taken on a daily basis.
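The behaviour described above can be modelled in a few lines. This is a conceptual sketch only, with invented names, and says nothing about the real on-disk format: taking a Replay freezes the pages written so far, and later writes land on new pages, leaving the frozen set available for restores and for demotion to a lower tier.

```python
# Conceptual model of Replay behaviour: freeze on snapshot, redirect subsequent writes.
class SimpleVolume:
    def __init__(self):
        self.active = {}      # lba -> data for pages still being actively written
        self.replays = []     # each Replay is a frozen lba -> data mapping

    def write(self, lba, data):
        self.active[lba] = data

    def take_replay(self):
        # Freeze everything written so far; these pages become read-only
        # and are now candidates for demotion to a lower tier.
        self.replays.append(dict(self.active))
        self.active = {}      # new writes land on new pages from here on

    def read(self, lba):
        # Reads fall through from active pages to the most recent Replay holding the block.
        if lba in self.active:
            return self.active[lba]
        for replay in reversed(self.replays):
            if lba in replay:
                return replay[lba]
        return None


v = SimpleVolume()
v.write(0, "a"); v.write(1, "b")
v.take_replay()               # e.g. the daily Replay Compellent recommend
v.write(1, "b2")              # the overwrite goes to a new page; the Replay still holds "b"
print(v.read(0), v.read(1), v.replays[0][1])
```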
Data Progression uses a similar technique to move blocks of data that are less frequently used down to lower tiers of storage over time. Initially all writes are made to the highest tier of storage and, over time, are migrated to lower tiers based on frequency of access. This occurs at the block level and means Storage Center arrays can be configured with the optimal mix of drive types. For instance, if more performance is required, SSD could be added; if more capacity is required, SATA can be added.
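To show what a progression pass might look like, here is a simple sketch that demotes any page left idle beyond a threshold by one tier. The tier names, the seven-day threshold and the single-step demotion are assumptions made for the example; they are not Compellent's published schedule or policy.

```python
# Illustrative Data Progression pass: idle pages drift down the tier list over time.
import time

TIERS = ["tier1-ssd", "tier2-15k", "tier3-sata"]


def progress(pages, now, idle_threshold_days=7):
    """Demote any page one tier if it has been idle longer than the threshold (assumed)."""
    for page in pages:
        idle_days = (now - page["last_access"]) / 86400
        current = TIERS.index(page["tier"])
        if idle_days > idle_threshold_days and current < len(TIERS) - 1:
            page["tier"] = TIERS[current + 1]
    return pages


now = time.time()
pages = [
    {"lba": 0, "tier": "tier1-ssd", "last_access": now},                # hot: stays put
    {"lba": 1, "tier": "tier1-ssd", "last_access": now - 30 * 86400},   # cold: moves down
]
print(progress(pages, now))
```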
In summary, the Compellent design enables data placement to be optimised at the block level, with less frequently accessed data moved to lower tiers of storage. In normal circumstances the default settings can be used, but specific high-performance volumes can be accommodated too.
Protocol Support
One final topic, as this is becoming a long post. Storage Center supports both Fibre Channel and iSCSI connectivity. Unusually for storage arrays, it allows both protocols to be used against a single volume at the same time, so I/O can be active over both Fibre Channel and iSCSI concurrently. With the right version of switch firmware, Fibre Channel also supports NPIV, which enables the creation of virtual ports on physical ports. I hope to cover this in a future post.
In the next post I’ll discuss performance and some of the other miscellaneous features such as replication and clustering.