Alfred Perlstein [mailto:email@example.com] wrote:
> This would allow us to use writecaching for data, but force
> stable storage for meta-data. I think we'd also want to use
> this for forced data sync (fsync(2) and files opened with O_SYNC).

I do hope this gets implemented. I did something similar at Hitachi in our
OSF/1 port to the S/390 back around 1992. The disk controllers had a
substantial amount of volatile RAM cache, and a lesser amount of NV-RAM
cache. We directed the metadata writes to NV-RAM, and the data to volatile
cache (with a flush at partition close, of course). Since the hit rate on
metadata writes in UFS is very high, even with a small NV cache, we were
able to get substantial speedups in metadata-intensive operations such as a
recursive directory copy.
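
To make the split concrete, here is a minimal sketch in C of the kind of
routing decision described above. It is not the Hitachi code: the enum names
and choose_cache() are hypothetical, and treating synchronous data (fsync(2),
O_SYNC) like metadata is an assumption borrowed from the quoted proposal.

```c
#include <assert.h>

/* Hypothetical write classes, for illustration only. */
enum write_class { WC_DATA, WC_METADATA, WC_SYNC_DATA };

/* Hypothetical cache targets on the controller. */
enum cache_target { CACHE_VOLATILE, CACHE_NVRAM };

/*
 * Pick the cache a write should land in. Metadata and synchronous data
 * need stable storage, so they go to NV-RAM. Ordinary data may sit in
 * volatile cache until it is flushed.
 */
static enum cache_target
choose_cache(enum write_class wc)
{
    assert(wc == WC_DATA || wc == WC_METADATA || wc == WC_SYNC_DATA);

    switch (wc) {
    case WC_METADATA:
    case WC_SYNC_DATA:
        return CACHE_NVRAM;
    case WC_DATA:
        return CACHE_VOLATILE;
    }
    /* Unknown class: fall back to the conservative choice. */
    return CACHE_NVRAM;
}
```

The point of keeping the decision this small is that the file system only has
to tag each write with its class; the policy of where it lands stays in one
place.
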
We also implemented sequential I/O hinting, detected by the read-ahead
mechanism in the file system. Passing this hint down allowed the controller
to do a better job of cache management: sequential I/O recycled the buffers
after they had been read or written, rather than aging them through the LRU
list, so sequential reads and writes didn't trash the cache.
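
A toy model of that recycling policy, as a sketch only. The four-buffer
cache, struct cache, and the function names are all made up for
illustration, and a real controller would use linked lists rather than
shifting an array. The idea it shows is simply that a buffer released
with the sequential hint becomes the next victim instead of the most
recently used entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NBUF 4  /* toy cache size, chosen arbitrarily */

/* order[0] is the next buffer to be recycled; order[NBUF-1] is the MRU. */
struct cache {
    int order[NBUF];
};

static void
cache_init(struct cache *c)
{
    assert(c != NULL);
    for (size_t i = 0; i < NBUF; i++)
        c->order[i] = (int)i;
}

/* Release buffer `id` after I/O, honoring the sequential hint. */
static void
cache_release(struct cache *c, int id, bool sequential)
{
    size_t idx = 0;

    assert(c != NULL);
    while (idx < NBUF && c->order[idx] != id)
        idx++;
    if (idx == NBUF)
        return;                 /* not a cached buffer; nothing to do */

    if (sequential) {
        /* Recycle immediately: move to the front of the victim order. */
        for (; idx > 0; idx--)
            c->order[idx] = c->order[idx - 1];
        c->order[0] = id;
    } else {
        /* Normal LRU aging: move to the most-recently-used end. */
        for (; idx + 1 < NBUF; idx++)
            c->order[idx] = c->order[idx + 1];
        c->order[NBUF - 1] = id;
    }
}

/* The buffer that would be reused next. */
static int
cache_victim(const struct cache *c)
{
    assert(c != NULL);
    return c->order[0];
}
```

With this policy a long sequential scan keeps reusing the same buffer or two,
and the randomly accessed buffers at the MRU end are left alone.
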
For NFS v2, where the server must commit each write to stable storage before
replying, it's also helpful to be able to mark the write I/Os as
non-cacheable, thus hinting them toward NV-RAM.

In Windows NT, NTFS sets the SCSI Force Unit Access (FUA) bit on its
metadata writes. It does so for log writes for sure. It should also use FUA
for lazy writes of metadata, since otherwise there would seem to be a race
condition when recycling the log entry, but I don't know whether it actually
does. NT doesn't hint the SCSI CDBs with sequential access information, but
such info is available in the IRP presented to the SCSI class driver, and a
filter driver could do some magic...
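
For reference, a sketch of what setting FUA looks like at the CDB level. The
opcode (0x2A for WRITE(10)) and the FUA position (byte 1, bit 3) are from the
SCSI block command set; the helper name build_write10() and its signature are
hypothetical and not taken from the NT or FreeBSD sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SCSI_WRITE_10     0x2A  /* WRITE(10) operation code */
#define SCSI_WRITE10_FUA  0x08  /* Force Unit Access: byte 1, bit 3 */

/*
 * Fill in a 10-byte WRITE(10) CDB. When `fua` is set, the target must
 * not report completion until the data is on the medium (or other
 * non-volatile storage), bypassing any volatile write cache.
 */
static void
build_write10(uint8_t cdb[10], uint32_t lba, uint16_t nblocks, bool fua)
{
    assert(cdb != NULL);

    memset(cdb, 0, 10);
    cdb[0] = SCSI_WRITE_10;
    cdb[1] = fua ? SCSI_WRITE10_FUA : 0;
    cdb[2] = (uint8_t)(lba >> 24);      /* logical block address, big-endian */
    cdb[3] = (uint8_t)(lba >> 16);
    cdb[4] = (uint8_t)(lba >> 8);
    cdb[5] = (uint8_t)lba;
    cdb[7] = (uint8_t)(nblocks >> 8);   /* transfer length, big-endian */
    cdb[8] = (uint8_t)nblocks;
    /* byte 6 (group number) and byte 9 (control) are left as zero */
}
```

A metadata write would pass fua = true and an ordinary data write fua = false,
so the hint costs the host exactly one bit per command.
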
If NT and FreeBSD both support hinting of metadata writes, it's only a
matter of time before the hardware support appears.