Platform-specific documentation/Sun Fire X4500 and X4540

The X4500 and X4540 are non-clustered storage servers from Sun, each with 48 SATA disks. We use them for upload storage, and will be using them for external storage. The X4500 is the original model, with 2 dual-core CPUs, 16GB RAM, and marvell88sx(7d) SATA controllers. The X4540 has 2 quad-core CPUs, 32GB RAM, and LSI mpt(7d) SAS/SATA controllers. Neither controller supports hardware RAID; instead, we use Solaris with ZFS.

Both systems have hot-swappable disks, and the systems are racked to allow them to be fully extended out on their rails while remaining powered on with network connectivity.

Disk layout diagrams:

Sun X4540

---------------------SunFireX4540-------Rear----------------------------
 3:    7:   11:   15:   19:   23:   27:   31:   35:   39:   43:   47:   
c0t3  c0t7  c1t3  c1t7  c2t3  c2t7  c3t3  c3t7  c4t3  c4t7  c5t3  c5t7

 2:    6:   10:   14:   18:   22:   26:   30:   34:   38:   42:   46:   
c0t2  c0t6  c1t2  c1t6  c2t2  c2t6  c3t2  c3t6  c4t2  c4t6  c5t2  c5t6  

 1:    5:    9:   13:   17:   21:   25:   29:   33:   37:   41:   45:   
c0t1  c0t5  c1t1  c1t5  c2t1  c2t5  c3t1  c3t5  c4t1  c4t5  c5t1  c5t5  
 b           b   
 0:    4:    8:   12:   16:   20:   24:   28:   32:   36:   40:   44:   
c0t0  c0t4  c1t0  c1t4  c2t0  c2t4  c3t0  c3t4  c4t0  c4t4  c5t0  c5t4  
 b           b   
-------*-----------*-SunFireX4540---*---Front----*---------*--------
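
The X4540 diagram follows a regular pattern: the slot number equals controller * 8 + target. A minimal sketch of that conversion, useful when matching a device name from zpool status to a physical slot. The function names are made up for this page and are not part of any Sun tool; the formula is read off the diagram only.

```shell
#!/bin/bash
# Convert between X4540 physical slot numbers and cXtY device names,
# using the pattern visible in the diagram (slot = controller * 8 + target).
# Function names are illustrative.

x4540_slot_to_dev() {
    local slot=$1
    echo "c$((slot / 8))t$((slot % 8))"
}

x4540_dev_to_slot() {
    # Accepts names such as c1t3 or c1t3d0.
    local dev=$1 c t
    c=${dev#c}; c=${c%%t*}      # controller number
    t=${dev#*t}; t=${t%%d*}     # target number
    echo $((c * 8 + t))
}

x4540_slot_to_dev 11        # c1t3
x4540_dev_to_slot c5t7d0    # 47
```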

Sun X4500

Solaris

---------------------SunFireX4500------Rear----------------------------

36:   37:   38:   39:   40:   41:   42:   43:   44:   45:   46:   47:   
c5t3  c5t7  c4t3  c4t7  c7t3  c7t7  c6t3  c6t7  c1t3  c1t7  c0t3  c0t7  

24:   25:   26:   27:   28:   29:   30:   31:   32:   33:   34:   35:   
c5t2  c5t6  c4t2  c4t6  c7t2  c7t6  c6t2  c6t6  c1t2  c1t6  c0t2  c0t6  

12:   13:   14:   15:   16:   17:   18:   19:   20:   21:   22:   23:   
c5t1  c5t5  c4t1  c4t5  c7t1  c7t5  c6t1  c6t5  c1t1  c1t5  c0t1  c0t5  

 0:    1:    2:    3:    4:    5:    6:    7:    8:    9:   10:   11:   
c5t0  c5t4  c4t0  c4t4  c7t0  c7t4  c6t0  c6t4  c1t0  c1t4  c0t0  c0t4  
 b     b       
-------*-----------*-SunFireX4500--*---Front-----*-----------*----------

Linux

---------------------SunFireX4500------Rear----------------------------

36:   37:   38:   39:   40:   41:   42:   43:   44:   45:   46:   47:   
c3t3  c3t7  c2t3  c2t7  c5t3  c5t7  c4t3  c4t7  c1t3  c1t7  c0t3  c0t7  

24:   25:   26:   27:   28:   29:   30:   31:   32:   33:   34:   35:   
c3t2  c3t6  c2t2  c2t6  c5t2  c5t6  c4t2  c4t6  c1t2  c1t6  c0t2  c0t6  

12:   13:   14:   15:   16:   17:   18:   19:   20:   21:   22:   23:   
c3t1  c3t5  c2t1  c2t5  c5t1  c5t5  c4t1  c4t5  c1t1  c1t5  c0t1  c0t5  

 0:    1:    2:    3:    4:    5:    6:    7:    8:    9:   10:   11:   
c3t0  c3t4  c2t0  c2t4  c5t0  c5t4  c4t0  c4t4  c1t0  c1t4  c0t0  c0t4  
 b     b       
-------*-----------*-SunFireX4500--*---Front-----*-----------*----------
  • X4500: The OS drives are the two leftmost at the front (controller 5, disks 0 & 4 under Solaris naming; controller 3 under Linux), labelled 'b' in the diagram. No other drives show in the BIOS during POST.
  • X4540: There are four OS disks, labelled 'b' in the diagram.
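
The X4500 numbering is less regular than the X4540's, but it can still be computed. The sketch below maps a slot number from the diagrams to a device name under either naming scheme. The controller order per column pair is copied from the two X4500 diagrams above; the function name and the "solaris"/"linux" argument are hypothetical.

```shell
#!/bin/bash
# Map an X4500 slot number (0-47, as printed in the diagrams) to a device
# name. Controller order per column pair, read off the diagrams:
#   Solaris: 5 4 7 6 1 0     Linux: 3 2 5 4 1 0
# Function name and the naming argument are illustrative.

x4500_slot_to_dev() {
    local slot=$1 naming=${2:-solaris}
    local row=$((slot / 12)) col=$((slot % 12))
    local order
    case $naming in
        solaris) order="5 4 7 6 1 0" ;;
        linux)   order="3 2 5 4 1 0" ;;
        *) echo "unknown naming: $naming" >&2; return 1 ;;
    esac
    # Each controller spans two adjacent columns: targets 0-3 in the left
    # column and 4-7 in the right, counting upwards from the front row.
    set -- $order
    shift $((col / 2))
    echo "c${1}t$(( (col % 2) * 4 + row ))"
}

x4500_slot_to_dev 28            # c7t2
x4500_slot_to_dev 1 linux       # c3t4
```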

The disk serial numbers are readable when the top cover is off!


It helps to disable the IDE/PATA controllers in the BIOS to get more consistent SCSI device naming!


The disks are (currently) configured as a RAID-Z pool in 9 striped sets of 5; total space is ~8TB (250GB disks) or ~16TB (500GB disks). The pool is mounted at /export.
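
The capacity figures can be checked with some quick arithmetic. This is a rough sketch that assumes single-parity RAID-Z (one disk per set lost to parity), ignores ZFS metadata overhead, and assumes the ~8TB and ~16TB figures are in binary units; the function names are made up.

```shell
#!/bin/bash
# Rough usable capacity of a pool striped over single-parity RAID-Z sets.
# Assumes one disk per set goes to parity and ignores ZFS overhead.

raidz_usable_gb() {
    local sets=$1 disks_per_set=$2 disk_gb=$3
    echo $(( sets * (disks_per_set - 1) * disk_gb ))
}

gb_to_tib() {
    # Decimal gigabytes to binary tebibytes, rounded down.
    echo $(( $1 * 1000000000 / 1099511627776 ))
}

gb_to_tib "$(raidz_usable_gb 9 5 250)"    # 8
gb_to_tib "$(raidz_usable_gb 9 5 500)"    # 16
```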

Note: Under Linux we had some trouble determining which disks were which. Some speculation is available on the talk page.

Solaris disk names under Linux

http://lists.lustre.org/pipermail/linux_hpc_swstack/2008-June/000036.html describes a way to use Solaris-like disk naming under Linux, but it does not work as written.

Mark rewrote this using valid udev syntax. This is only valid for the X4500 (Thumper).

First, make sure the two IDE controllers are not interfering with SCSI ID naming. To disable the pata_amd driver, create the file /etc/modprobe.d/blacklist-pata containing:

blacklist pata_amd

Then, run:

# depmod -ae
# update-initramfs -u

Solaris device name symlinks can be made by udev. Put the following in /etc/udev/rules.d/99-thumper-disks.rules:

#
# /etc/udev/rules.d/99-thumper-disks.rules
# Written on 2009/08/15 by Mark Bergsma <mark@nedworks.org>
#

# Disk rules
SUBSYSTEM=="block", KERNEL=="sd*[a-z]", KERNELS=="*:0:0:0", PROGRAM="/etc/udev/scripts/solaris-name.sh %b", ENV{SOLARIS_NAME}="$result", SYMLINK+="disk/by-cntrl/$env{SOLARIS_NAME}"

# Partition rules
SUBSYSTEM=="block", KERNEL=="sd*[a-z][0-9]*", KERNELS=="*:0:0:0", PROGRAM="/etc/udev/scripts/solaris-name.sh %b %n", ENV{SOLARIS_NAME}="$result", SYMLINK+="disk/by-cntrl/$env{SOLARIS_NAME}"

And this (executable!) helper script in /etc/udev/scripts/solaris-name.sh:

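#!/bin/bash
# Usage: solaris-name.sh <scsi-address> [partition-number]
# Prints the Solaris-style name (cXtYd0 or cXtYd0sZ) for a Thumper disk.

# udev passes the SCSI address (host:channel:target:lun); keep the host number.
seq=$(echo "$1" | cut -d':' -f 1)
# Assumes 8 consecutive SCSI hosts per controller.
controller=$((seq / 8))
disk=$((seq % 8))

if [ -n "$2" ]; then
	# Linux partitions count from 1, Solaris slices from 0.
	echo "c${controller}t${disk}d0s$(($2 - 1))"
else
	echo "c${controller}t${disk}d0"
fi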
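
To check the naming logic by hand without going through udev, the same arithmetic can be wrapped in a function. This is only a sketch: the SCSI addresses below are made-up samples, and solaris_name is a hypothetical name, not something udev provides.

```shell
#!/bin/bash
# Same arithmetic as the helper script, wrapped in a function so it can be
# tried interactively. The sample SCSI addresses are made up.

solaris_name() {
    local host=${1%%:*}          # "12:0:0:0" -> 12
    local controller=$((host / 8)) disk=$((host % 8))
    if [ -n "$2" ]; then
        # Linux partitions count from 1, Solaris slices from 0.
        echo "c${controller}t${disk}d0s$(($2 - 1))"
    else
        echo "c${controller}t${disk}d0"
    fi
}

solaris_name 12:0:0:0      # c1t4d0
solaris_name 12:0:0:0 1    # c1t4d0s0
solaris_name 47:0:0:0      # c5t7d0
```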

Replace a disk

X4540

Look at zpool status to find the failed disk:

       NAME          STATE     READ WRITE CKSUM
       export        DEGRADED     0     0     0
         raidz1      ONLINE       0     0     0
           c5t0d0    ONLINE       0     0     0
           c5t1d0    ONLINE       0     0     0
           c5t2d0    ONLINE       0     0     0
           c5t3d0    ONLINE       0     0     0
           c5t4d0    ONLINE       0     0     0
         raidz1      DEGRADED     0     0     0
           spare     DEGRADED     0     0     0
             c5t5d0  REMOVED      0     0     0
             c1t7d0  ONLINE       0     0     0
           c5t6d0    ONLINE       0     0     0
           c5t7d0    ONLINE       0     0     0
           c2t0d0    ONLINE       0     0     0
           c2t1d0    ONLINE       0     0     0

Replace the physical disk, then run:

zpool replace export c5t5d0

This assumes c5t5d0 is the failed disk. The new disk will be resilvered, and the spare will then be removed from the pool.
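
With 48 disks, the failed one is easy to miss in the zpool status listing. A small filter, sketched here under the assumption that disk names always look like cXtYdZ, prints only the disks that are not healthy. The function name is made up; on Solaris 10, nawk or /usr/xpg4/bin/awk may be needed in place of awk.

```shell
#!/bin/bash
# List disks in "zpool status" output whose state is not ONLINE.
# Reads the status text on stdin. Only rows naming a cXtYdZ device are
# considered, so pool and vdev rows (export, raidz1, spare) are skipped.
# Hot spares listed as AVAIL or INUSE are not treated as failures.

failed_disks() {
    awk '$1 ~ /^c[0-9]+t[0-9]+d[0-9]+$/ &&
         $2 != "ONLINE" && $2 != "AVAIL" && $2 != "INUSE" { print $1, $2 }'
}

# On a live system: zpool status export | failed_disks
failed_disks <<'EOF'
          raidz1      DEGRADED     0     0     0
            spare     DEGRADED     0     0     0
              c5t5d0  REMOVED      0     0     0
              c1t7d0  ONLINE       0     0     0
            c5t6d0    ONLINE       0     0     0
EOF
# prints: c5t5d0 REMOVED
```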

X4500

To replace a disk, it must first be removed from the zpool. The easiest way to do this is to replace it with the hot spare. For example, if c7t6d0 is being removed, and c0t2d0 is the hot spare:

# zpool replace export c7t6d0 c0t2d0

Wait for the resilver to complete (it'll take a few hours; see 'zpool status' for progress). Then remove the disk from the pool:

# zpool detach export c7t6d0

Identify the disk in cfgadm and unconfigure it:

# cfgadm -l | grep c7t6d0
sata5/6::dsk/c7t6d0            disk         connected    configured   ok
# cfgadm -c unconfigure sata5/6
Unconfigure the device at: /devices/pci@2,0/pci1022,7458@8/pci11ab,11ab@1:6
This operation will suspend activity on the SATA device
Continue (yes/no)? yes

The blue LED on the disk should illuminate. Replace the disk, re-configure it, and add the new disk to the zpool as a hot spare:

# cfgadm -c configure sata5/6
# zpool add export spare c7t6d0
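
The X4500 procedure above can be sketched as a dry run that prints the commands in order for a given disk, so they can be reviewed before being pasted into a root shell. Nothing here runs zpool or cfgadm itself. Both function names are hypothetical, and the cfgadm line is the sample from this page.

```shell
#!/bin/bash
# Print the X4500 disk replacement steps in order, without running them.
# Function names and the dry-run approach are illustrative.

ap_id_for_disk() {
    # Reads "cfgadm -l" output on stdin and prints the attachment point
    # (e.g. sata5/6) of the given disk.
    awk -v disk="$1" -F'::dsk/' '$2 ~ ("^" disk "[ \t]") { print $1 }'
}

x4500_replace_plan() {
    local pool=$1 failed=$2 spare=$3 ap_id=$4
    cat <<EOF
zpool replace $pool $failed $spare
# wait for the resilver to finish (zpool status $pool)
zpool detach $pool $failed
cfgadm -c unconfigure $ap_id
# swap the physical disk once its blue LED is lit
cfgadm -c configure $ap_id
zpool add $pool spare $failed
EOF
}

# On a live system the first argument would come from: cfgadm -l
line='sata5/6::dsk/c7t6d0            disk         connected    configured   ok'
x4500_replace_plan export c7t6d0 c0t2d0 "$(echo "$line" | ap_id_for_disk c7t6d0)"
```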

OS setup

Press Ctrl-N during bootup for network boot.

Linux

Under Lucid, the install on the X4500s fails because GRUB doesn't understand the device name /dev/sdac. On /dev/sdy it works.

To work around this, open a shell, chroot to /target, and run:

# grub-install '(hd1)'

Solaris

This is how the OS (Solaris) is set up on the image servers (ms1/ms5).

Solaris 10 is installed. Docs are at http://docs.sun.com/. All systems except ms1 are using ZFS root.

Disks c5t0d0 and c5t4d0 (X4500) or c0t0d0 and c1t0d0 (X4540) are mirrored for the root ZFS pool and swap.

For software not included in Solaris, we use the Toolserver software repository. This is in SVN at /trunk/tools/ts-specs. Check it out with:

 svn co https://svn.toolserver.org/svnroot/toolserver/trunk/ts-specs

These are RPM spec files that can be built into Solaris packages with pkgtool.

$ pkgtool -v build-only --download TSwhatever.spec
$ su
# pkgadd -d $HOME/packages TSwhatever

Most software from this repository installs into /opt/ts. PHP is in /opt/php for builds before July 2009 (config file: /etc/opt/php/php.ini); later builds are in /opt/ts/php.

Service management is done with the Solaris Service Management Framework (SMF). Reference: Solaris Administration Guide.

List services:

# svcs -a

Find out why a service is in 'maintenance' state instead of 'online':

# svcs -vx <name>

Disable a service:

# svcadm disable <name>

Enable a service:

# svcadm enable <name>

Disable a service but leave it enabled on the next boot:

# svcadm disable -t <name>

Enable a service but leave it disabled on the next boot:

# svcadm enable -t <name>

Service names can be abbreviated, e.g. 'lighttpd' instead of 'svc:/network/lighttpd:lighttpd'.
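
As a quick triage aid, the svcs -a listing can be filtered for services stuck in maintenance. A minimal sketch, assuming the usual three-column layout (STATE, STIME, FMRI); the function name and the sample rows are made up.

```shell
#!/bin/bash
# Print the FMRI of every service in the 'maintenance' state.
# Reads "svcs -a" output on stdin; assumes columns STATE STIME FMRI.

services_in_maintenance() {
    awk '$1 == "maintenance" { print $3 }'
}

# On a live system: svcs -a | services_in_maintenance
services_in_maintenance <<'EOF'
STATE          STIME    FMRI
online         Jan_01   svc:/system/cron:default
maintenance    12:00:00 svc:/network/lighttpd:lighttpd
EOF
# prints: svc:/network/lighttpd:lighttpd
```

Each name printed can then be passed to svcs -vx to see why the service failed.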

Lighttpd is set up the same way as on amane. Config: /etc/lighttpd/lighttpd.conf. It's built from TSlighttpd in the Toolserver repo.

For OS upgrades and patching, Live Upgrade should be used. This creates copies of the running OS. List current LU boot environments:

root@ms1:~# lustatus
Boot Environment           Is       Active Active    Can    Copy      
Name                       Complete Now    On Reboot Delete Status    
-------------------------- -------- ------ --------- ------ ----------
s10a                       yes      yes    yes       no     -  

Create a new boot environment as a copy of the current one:

# lucreate -n s10b

Mount the new environment:

# lumount s10b

List patches:

# /opt/ts/bin/pca -R /.alt.s10b

Apply patches to s10b and reboot:

# /opt/ts/bin/pca -R /.alt.s10b -ai
# luumount s10b
# luactivate s10b
# init 6
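
The Live Upgrade steps above can be sketched as two small helpers: one that reads lustatus output to find the currently active boot environment, and one that prints the patch sequence for a new environment name. Both function names are hypothetical. The /.alt.<name> mount point is assumed from the s10b example above, and nothing is executed.

```shell
#!/bin/bash
# Helpers for planning a Live Upgrade patch run. Nothing is executed;
# the commands are only printed for review.

active_be() {
    # Reads "lustatus" output on stdin and prints the boot environment
    # whose "Active Now" column is yes. Rows after the dashed rule are data.
    awk 'rule && $3 == "yes" { print $1 } /^---/ { rule = 1 }'
}

lu_patch_plan() {
    # Assumes lumount mounts the environment at /.alt.<name>.
    local be=$1
    cat <<EOF
lucreate -n $be
lumount $be
/opt/ts/bin/pca -R /.alt.$be -ai
luumount $be
luactivate $be
init 6
EOF
}

# On a live system: lustatus | active_be
lu_patch_plan s10b
```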

Lights Out Management

The Sun X4500 and X4540 use iLOM.

The standard administrative account is root and the default password is changeme.

Common actions

Changing the root password

set /SP/users/root password=password

Serial console

start /SP/console

Power cycle

Reset:

reset /SYS

Power Off:

stop /SYS

Power On:

start /SYS


End a serial console session with Esc-(.

Setting an IP address for the network port of LOM

Log in to the LOM over serial and run the following:

cd /SP/network 
set pendingipaddress=ipaddress
set pendingipnetmask=255.255.0.0
set pendingipgateway=10.1.0.1
set pendingipdiscovery=static
set commitpending=true

Once that is done, you should be able to connect over the management IP if it is plugged into the management network.
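
When setting up several machines, the LOM network commands can be generated from the three values that change. A minimal sketch: the function name is made up, and the address 10.1.0.50 in the example is a placeholder, not a real assignment.

```shell
#!/bin/bash
# Print the LOM commands for a static management address, ready to paste
# into the serial session. The function name is illustrative.

ilom_network_cmds() {
    local ip=$1 netmask=$2 gateway=$3
    cat <<EOF
cd /SP/network
set pendingipaddress=$ip
set pendingipnetmask=$netmask
set pendingipgateway=$gateway
set pendingipdiscovery=static
set commitpending=true
EOF
}

# 10.1.0.50 is a placeholder address.
ilom_network_cmds 10.1.0.50 255.255.0.0 10.1.0.1
```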

External links