We required a SAN that had the resilience, reliability and performance, at a reasonable price point, to hold a company's server estate of around 30 servers, including the usual Microsoft suite of domain controllers, SQL and Exchange, which would all be virtualised using VMware, along with approximately 1TB of file shares. I benchmarked the HP P4000 (LeftHand), NetApp's FAS 2040 and the new EMC VNXe 3300, and the combination of great functionality, performance and a really great price point led me to opt for the EMC VNXe 3300.
The requirement was for 99.99% availability of the hardware, so the connectivity between the SAN and the VMware estate was to be protected by a pair of Cisco 2960G switches powered via twin APC UPSs and a power transfer switch. This would protect the environment from power, network, server or cable failure.
As the EMC VNXe 3300 is so new, the documentation that came with it, while simple to read, didn't cover a more complex environment, only a simple installation into a single switch. So, after several test scenarios and a learning curve on how the VNXe fails over and fails back (which is the key to the solution described below), I have devised the following example of how to configure a VNXe 3300 for a production environment.
How does the VNXe fail over? The VNXe 3300 is very clever in how it detects and fails over services between its storage processors (even better than it was sold to me). You set up a service on a single network port or a team of ports, and should complete loss of connectivity occur on those ports, it will fail over the service associated with them to the other processor module. For example, if you set up iSCSI on port 2 of SPA and CIFS on port 3 of SPA, then pull the network cable out of port 2, it will fail over iSCSI to SPB but keep CIFS on SPA (very clever indeed). Once the cable is plugged back into SPA, the iSCSI service fails back over to SPA.
So the configuration I have used and tested is as follows. Two cables teamed together from SPA ports 2 and 3 go into your first switch, and two cables teamed together from SPB ports 2 and 3 go into your second switch. On each Cisco switch a LACP trunk is created for these connections. I then created a LACP trunk of two ports to connect the two switches together. The VMware hosts are connected to the switches via a single network cable to switch 1 and a single network cable to switch 2. This configuration allows any one item to fail while connectivity from the VMware hosts to the SAN remains operational.
The diagram below shows in more detail how this should be connected.
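For reference, here is a rough sketch of what the port-channel configuration could look like on switch 1 (the switch carrying SPA ports 2 and 3 plus one end of the inter-switch link). The interface numbers, channel-group numbers and VLAN are placeholders I've picked purely for illustration, so adjust them to your own environment; switch 2 would mirror this for SPB.

! Illustrative only - port, channel-group and VLAN numbers are examples
! LACP channel for the two cables from VNXe SPA ports 2 and 3
interface range GigabitEthernet0/1 - 2
 description VNXe-SPA
 switchport mode access
 switchport access vlan 100
 channel-group 1 mode active
!
! Two-port LACP link across to switch 2
interface range GigabitEthernet0/23 - 24
 description Link-to-Switch2
 switchport mode trunk
 channel-group 2 mode active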
A correction.
When you pull a network cable, the VNXe detects link down and routes the network connection across the VNXe backend to the corresponding port on the other storage processor. The "service" itself stays on the same storage processor.
If a storage processor fails, then everything fails over.
Hi,
First, let me say that your posts are great.
Can you give some more detail on why you didn't choose NetApp or LeftHand, and why you prefer the EMC solution that is limited to iSCSI only? (Excluding the price issue...)
Thanks,
Ariel
Hi Ariel,
And thank you for your comments and questions. Let me first say that there is nothing wrong with either the NetApp or the LeftHand; in fact they are two very good products, but it's a matter of finding the right product for the solution. The criteria for this particular solution were that it had to have VMware integration, support CIFS and NFS, be easy to manage and fit at a certain price point while still delivering the IOPS requirements.
The LeftHand is a very good SAN, especially if you have two different buildings (with good bandwidth and low latency), whereby you can locate half of the array in one building and half in the other, giving the end customer excellent DR continuity. The downside for this particular solution was that the LeftHand (at the time I did this project) only supported iSCSI. HP's recommendation was that we use a "bridge server", which in reality meant having a VM or physical server with an RDM to a LUN on the LeftHand. I would have preferred they were honest about this and just said they don't support CIFS. The VMware integration is there, but isn't (in my opinion) as good as the NetApp's or the EMC's, management of the device wasn't as intuitive as the EMC, and the dual location was a nice-to-have rather than a necessity.
The NetApp is a brilliant SAN and I don't think anyone who purchased one of these would be disappointed. The de-duplication features on the unit are brilliant, especially if you're rolling out VMware View (lots of identical VMs). The couple of negatives that counted against the NetApp for this implementation were that the user interface wasn't as "friendly" for the end user as the EMC's, and the main killer was the price. At the time of this project the EMC came with a 3-year warranty and the NetApp with only 1 year. Extending the warranty on the NetApp to match the EMC was in the region of £6,000 per year; coupled with the fact that the equivalent IOPS device was significantly more expensive than the EMC, the total solution would have been almost double the cost of the EMC VNXe.
The VNXe supports CIFS, NFS and iSCSI out of the box and delivered what the customer required at the right price point.
Thanks again for your comments and reading my blog.
Kind regards
Andy
Andy, I just ran into your blog. Very nice work, by the way. I have a quick question in regards to the communication between your VMware hosts and the stacked switches:
Did you create a LACP trunk between your VMware hosts and the switches?
-Luis
Hi Luis,
In that scenario the switches were non-stackable. The switches were LACP'd together for resilience, however. The VMware hosts would have a VM LAN connection to switch A and another one to switch B, then a SAN connection to switch A and another one to switch B. The VM LAN would have a vSwitch in VMware with just vmnic1 (going to switch A) and vmnic2 (going to switch B). So an LACP trunk from the VM host to the switches is not needed.
The only time you would need that is if you needed more than 1Gbps, and then you would trunk to each switch.
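Just to illustrate the layout (the vSwitch, port group and vmnic names here are only examples, and this assumes you're building it from ESXi's esxcli rather than the vSphere Client), the VM LAN vSwitch on each host could be put together along these lines:

# Example only - vSwitch, port group and vmnic names are placeholders
esxcli network vswitch standard add --vswitch-name=vSwitch1
# vmnic1 cabled to switch A, vmnic2 cabled to switch B
esxcli network vswitch standard uplink add --vswitch-name=vSwitch1 --uplink-name=vmnic1
esxcli network vswitch standard uplink add --vswitch-name=vSwitch1 --uplink-name=vmnic2
esxcli network vswitch standard portgroup add --vswitch-name=vSwitch1 --portgroup-name="VM LAN"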
If the switches are in a stack (so, for example, a set of Cisco 3650s) then you can create a LACP trunk across two switches (so Gigabit Ethernet 0/1 and Gigabit Ethernet 1/1), covering you for switch failure.
Hope that helps and thanks for reading.
Kind regards
Andy
Right. So in the case of a stacked environment, creating a LACP connection would be fine. Correct?
Now then, how about between the stacked switches and the SAN? LACP too? Or do all ports on switch A go to SPA and all ports on switch B go to SPB?
I appreciate your help and quick response.
In a stacked scenario you can create an LACP trunk over multiple switches. So in that case you can trunk all 4 ports out of SPA (if you have the 4-port card) and take 2 of them to switch A and 2 of them to switch B. Create a LACP port-channel containing the two ports from switch A and the two from switch B.
Then do exactly the same for SPB. This will give you resilience from switch, cable and SP failure.
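As a rough sketch (assuming a two-switch stack and the 4-port card; the interface, channel-group and VLAN numbers are only examples), the cross-stack port-channel for SPA could look something like this, with SPB repeated under its own channel-group:

! Illustrative only - two ports on stack member 1, two on stack member 2
interface range GigabitEthernet1/0/1 - 2, GigabitEthernet2/0/1 - 2
 description VNXe-SPA
 switchport mode access
 switchport access vlan 100
 channel-group 3 mode active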
Hope this helps.
Andy
Did you create 2 subnets, one for each SP's connections?
From extensive testing now, you'll find it works best with NFS as the LUN targets (it saves a lot of space on snapshots), and doing it that way a single subnet, with one NFS target on SPA and another NFS target on the other SP, works fine with the LUNs split across them.