718-888-8898 www.tcisystems.com Fax 718-888-8945
The following information is “Confidential Work Product” of TCI Systems, Inc. It is made available on this Web Site as a service to TCI’s customers. This information is not for distribution, in any form, to any third party without TCI Systems, Inc.’s written permission.
Although this information is believed to be accurate and has been tested in the TCI laboratory, all recipients of this information are advised to use their own judgment in utilizing this information. TCI Systems, Inc. will not be responsible for any damages that may result from the use of the procedures or information presented on this Web Site. By using the information contained herein, the customer agrees to hold TCI Systems, Inc. harmless.
ARCSERVE
Warning
Do Not Use the “Full Erase” command!
“Full Erase” is a "long SCSI command" and causes substantial delays until it completes.
TCI has evaluated the workarounds and has determined that the most reasonable course of action is to avoid “Full Erase” altogether.
Here is an excerpt from a CA Support Document:
During "long SCSI commands", you will experience some or all of the following symptoms:
What is a "long SCSI command"?
They are SCSI commands that take some time to complete, such as LOAD, Full ERASE operations, REWIND from the end of tape, and changer-specific commands like MOVE MEDIUM and INIT ELEMENT STAT. They are considered "long" because they can take minutes or hours to run, as opposed to fractions of a second for more regularly used commands like READ, WRITE, etc.
Why does this happen?
This occurs on a NetWare server due to the use of Real Mode interrupts. Because NetWare sits on top of a DOS kernel (a 16-bit OS), it must use Real Mode interrupts, instead of Protected Mode interrupts, to accomplish certain time slicing of the CPU. For instance, when a Read command is sent to the SCSI bus, the command is executed and an interrupt is returned to signal completion of that command, which in turn allows the execution of the next command queued for CPU access. Since any single Read command takes very little time to complete, the server and the user do not notice the wait. If, however, a Rewind command is sent to the SCSI bus while the tape is at the End Of Media mark, the command can take as long as 5 to 10 minutes, depending on the tape and drive used. While NetWare is waiting to receive that interrupt, the server hangs. After the command completes, the interrupt is returned, just as with a Read command, and the server begins to respond again. With Protected Mode interrupts, control of the CPU would be shared and the server would not hang.
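The effect described above can be illustrated with a toy model. The sketch below is not NetWare code. It assumes a single queue in which each command must signal completion before the next one starts, and the durations (a 2 ms READ, a 300 second REWIND) are illustrative values chosen to match the low end of the 5 to 10 minute figure quoted above.

```python
# Toy model of a serialized command queue: each command must complete
# before the next one starts. Not NetWare code; durations are illustrative.

def start_delays(durations):
    """Return how long each queued command waits before it starts (seconds)."""
    delays = []
    elapsed = 0.0
    for d in durations:
        delays.append(elapsed)  # this command waits for everything ahead of it
        elapsed += d
    return delays

READ = 0.002     # assumed: a short READ takes about 2 ms
REWIND = 300.0   # assumed: a REWIND from end of media takes 5 minutes

short_only = start_delays([READ, READ, READ])
with_rewind = start_delays([READ, REWIND, READ])

print(f"third command waits {short_only[2]:.3f} s behind two reads")
print(f"third command waits {with_rewind[2]:.3f} s behind a rewind")
```

In this model the third command starts after a few milliseconds when it is queued behind short reads, but waits the full length of the rewind when one long command is ahead of it, which is the stall that users see as a hung server.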
Here is an excerpt from a Novell Document that further explains some of the issues:
The problem occurs because the server switches to real mode during a long SCSI command. This results in high utilization on the server with the loss of all I/O to the workstations. This is done so that requests made in protected mode are preserved until the server switches back to protected mode. The call NPA_Squelch_All_IO in nwpa.nlm is what does this. Usually this does not constitute a problem, except when devices that use long SCSI commands, such as a tape library, are in use. Users will lose access to the server and appear to hang.
The fix is to cause the server to switch to real mode less often. The most effective means is the dosfat.nss contained in this file. This will mount the server's local drives as NetWare volumes. Since the local drives are no longer in the DOS environment, the server does not have to switch to real mode to access them.
If there is software on the server that causes the switch to real mode, then you will still have problems.
Here is an excerpt that is the basis for our recommendation:
Dosfat.nss recommends that Auto Restart After Abend = 0 to prevent corruption of the DOS partition. To clarify just what was meant by this: when Auto Restart After Abend is set to just suspend a thread and continue with normal server operations, you run the risk of introducing corruption on a mounted DOS volume. (We are talking about corruption of the DOS files here, not NetWare files.) Traditional NetWare volumes are protected against abends by TTS, or corrected using VRepair. NSS volumes are protected because they're journaled. DOS, however, does not have that same protection. That is why the recommendation is in place. (Also, NSS has its own cache apart from DOS. When the server is in the debugger to get a coredump, the dump uses the DOS cache and not NSS. If NSS happened to be in the middle of writing a file at the time the server abended, the coredump could overwrite the NSS file, causing corruption, because each cache system does not know about the other.)
Now that the C partition is mounted as a NetWare volume by DOSFAT.NSS, this chance of corruption now extends to the DOS partition. In NetWare 5 and 6, the use of the C partition to store information has increased. Not only does the server write the Abend.log file there (as in the past), but that is also where the server registry is located. Additionally, third-party vendors utilize the C partition for their files as well. These vendors include, but are not limited to, QLogic and Compaq. Based on this information, it was the engineers' recommendation to set Auto Restart After Abend = 0.
In the event of file corruption on NetWare / NSS volumes, utilities exist to help correct problems with files should any be found. However, there aren't utilities or checks in place to repair or prevent that corruption from affecting the files on the DOS partition.
Other precautions should be taken if you choose to set Auto Restart After Abend to 2. These precautions include making a periodic copy of the existing startup directory on the C partition (nwserver) to an alternate directory or CD, to ensure that a file or the entire directory can be restored quickly in the event of disk corruption. In addition, the ScanDisk utility can be added to the DOS partition if it is not already there, and it can be launched from the AUTOEXEC.BAT file at server boot. Since the amount of data on any server's DOS partition should be quite low, the utility can correct any issues that come up, and do so quickly.
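The periodic copy of the startup directory mentioned above could be scripted. The following is a minimal sketch and not a TCI or Novell procedure. It assumes the nwserver directory is reachable from a machine where Python is available, for example through a mapped drive. The function name backup_startup_dir and every path shown are hypothetical placeholders.

```python
# Sketch: copy a server startup directory to a dated backup folder.
# The function name and all paths are hypothetical placeholders.
import shutil
from datetime import date
from pathlib import Path

def backup_startup_dir(source, backup_root, today=None):
    """Copy `source` into backup_root/<name>-YYYYMMDD and return the new path."""
    source = Path(source)
    stamp = (today or date.today()).strftime("%Y%m%d")
    target = Path(backup_root) / f"{source.name}-{stamp}"
    # copytree raises FileExistsError if a copy with this date already exists,
    # so an earlier backup is never silently overwritten.
    shutil.copytree(source, target)
    return target

# Example call (placeholder paths, adjust to your environment):
# backup_startup_dir(r"X:\nwserver", r"D:\server-backups")
```

Keeping one dated copy per run makes it possible to restore either a single file or the whole directory from a known date.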