The problems that we've had in the past with these have centered around "reset storms", where a single reset expands into a great number of resets, and I/O throughput quickly diminishes to zero.
When a reset occurs on an expander, it aborts any in-flight operations, and they fail. Unfortunately, the *way* in which they fail is to generate a generic "hardware error". The sd(7d) driver's response to this is to ... issue another reset, in a futile effort to correct things.
Worse, this behavior is also performed, by default, for media errors, e.g. if you have a disk that has a bad sector on it. Of course, if your disk is mostly idle, it won't be a problem. But if you have a lot of I/O going on, it's mostly going to result in a meltdown.
There is good news though, because of the way LSI's drivers are designed.
The LSI mptsas driver at least (and I suspect mpt as well, though I don't have code to look at it) treats "bus-level" resets and "target-level" resets as the same. Both of them do a reset, which will of course reset the expander.
But we can disable the most pernicious reset in sd with the following line in sd.conf:
allow-bus-device-reset=0;
This will still allow bus-wide resets to occur, but it specifically disables the reset that sd issues in response to generic hardware and media errors. The relevant section of code in sd.c is this:
if ((un->un_reset_retry_count != 0) &&
    (xp->xb_retry_count == un->un_reset_retry_count)) {
    mutex_exit(SD_MUTEX(un));
    /* Do NOT do a RESET_ALL here: too intrusive. (4112858) */
    if (un->un_f_allow_bus_device_reset == TRUE) {
        boolean_t try_resetting_target = B_TRUE;
        /*
         * We need to be able to handle specific ASC when we are
         * handling a KEY_HARDWARE_ERROR. In particular
         * taking the default action of resetting the target may
         * not be the appropriate way to attempt recovery.
         * Resetting a target because of a single LUN failure
         * victimizes all LUNs on that target.
         *
         * This is true for the LSI arrays, if an LSI
         * array controller returns an ASC of 0x84 (LUN Dead) we
         * should trust it.
         */
        if (sense_key == KEY_HARDWARE_ERROR) {
            switch (asc) {
            case 0x84:
                if (SD_IS_LSI(un)) {
                    try_resetting_target = B_FALSE;
                }
                break;
            default:
                break;
            }
        }
        if (try_resetting_target == B_TRUE) {
            int reset_retval = 0;
            if (un->un_f_lun_reset_enabled == TRUE) {
                SD_TRACE(SD_LOG_IO_CORE, un,
                    "sd_sense_key_medium_or_hardware_"
                    "error: issuing RESET_LUN\n");
                reset_retval =
                    scsi_reset(SD_ADDRESS(un),
                    RESET_LUN);
            }
            if (reset_retval == 0) {
                SD_TRACE(SD_LOG_IO_CORE, un,
                    "sd_sense_key_medium_or_hardware_"
                    "error: issuing RESET_TARGET\n");
                (void) scsi_reset(SD_ADDRESS(un),
                    RESET_TARGET);
            }
        }
    }
The savvy folks here might notice that this is a system-wide setting, which is true. You can instead set it on a specific instance of sd, which requires more effort. There is also a better way to do this, by setting the reset_retry_count property to zero. However, setting the sd.conf property for that properly is considerably more complex, because of the byzantine syntax that sd uses to set up target-specific property values.
So, I still recommend avoiding these SATA expanders. But if you have no choice, then using this sd.conf tunable may be a reasonable workaround.
At the same time, I'm investigating the possibility of having this disabled by default for all of Nexenta's customers -- and possibly even in illumos. If you're a SCSI expert and have opinions on the matter, please let me know.
