Talk:Clarifying Direct IO's Semantics
Solaris's behavior
The Solaris mount_ufs man page suggests:
If forcedirectio is specified [...] data is transferred directly between user address space and the disk.
forcedirectio is a performance option that is of benefit only in large sequential data transfers. The default behavior is noforcedirectio.
[That was a quote: a paraphrase follows at the end of this page]
Note the mention of large sequential I/O: in a recent project we were pleased (and the customer was a little surprised) to find that Solaris UFS was coalescing many contiguous logical writes into a substantially smaller number of large physical writes. This improved their performance when doing full-table scans and large updates.
There is more discussion in the directio man page, where they note that buffered I/O is still used if the request is misaligned or the file is mmap'd, and:
Large sequential I/O generally performs best with DIRECTIO_ON, except when a file is sparse or is being extended and is opened with O_SYNC or O_DSYNC
[Another quote]
Again, they recommend direct I/O for large reads or writes.
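To make this concrete, here is a minimal sketch (not from the original discussion) of how an application might request direct I/O on a per-file basis with directio(3C) and then issue large sequential writes, which is the case the man page says benefits most. The file path and transfer sizes are made up for illustration; mounting the UFS file system with the forcedirectio option would have the same effect for every file on it.

/* Sketch: per-file direct I/O on Solaris UFS via directio(3C).
 * The path and sizes below are placeholders for illustration. */
#include <sys/types.h>
#include <sys/fcntl.h>     /* directio(), DIRECTIO_ON */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/data/bigfile", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }

    /* Advise the file system to bypass the buffer cache for this file.
     * This is advisory: UFS falls back to buffered I/O for misaligned
     * requests or mmap'd files, as the man page notes. */
    if (directio(fd, DIRECTIO_ON) != 0)
        perror("directio");

    /* Large, contiguous, sequential writes are the recommended case. */
    size_t chunk = 1024 * 1024;            /* 1 MiB per write(2) */
    char *buf = malloc(chunk);
    if (buf == NULL) { perror("malloc"); close(fd); return 1; }
    memset(buf, 'x', chunk);

    for (int i = 0; i < 64; i++) {         /* 64 MiB total */
        if (write(fd, buf, chunk) != (ssize_t)chunk) {
            perror("write");
            break;
        }
    }

    free(buf);
    close(fd);
    return 0;
}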
To paraphrase for copyright purposes, one might say:
Solaris provides "forcedirectio" as a mount option, and when it is applied, data is transferred without being copied to the buffer cache. It is recommended as a performance optimization when large amounts of data are transferred sequentially, which contrasts with most other discussions of direct I/O. In practice, forcedirectio does indeed appear to coalesce multiple contiguous logical writes into a substantially smaller number of larger physical writes. This improves performance when doing full-table scans or other large I/O operations.
See man mount_ufs(1M), directio(3C)
--dave
- Dave, if Solaris is coalescing multiple write requests into a smaller number of physical writes, that implies that the actual write to disk has not been completed at the time when the write(2) system call returns. Otherwise, it would not be possible to coalesce the write request with other write requests. But that raises a major problem; how does the application know when it is safe to reuse the buffer passed to the write(2) system call? Are you sure Solaris really does write coalescing when directio is enabled? I see no documentation of that on the Solaris man pages; just your claim here. How did you test for it? Tytso 19:16, 27 August 2009 (UTC)
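To illustrate the buffer-reuse question above, here is a minimal sketch (my own, not from either comment) of the usual pattern: an application refills the same buffer immediately after each write(2). Under the traditional direct I/O semantics, write(2) does not return until the data has left the user buffer, so the reuse is safe; if the kernel were instead deferring the transfer in order to coalesce it with later writes, the next refill could race with the I/O unless the kernel had copied the data first. The file path is a placeholder.

/* Sketch of the buffer-reuse pattern at issue (path is illustrative).
 * Safe only if the kernel is finished with 'buf' when write(2) returns,
 * which is what synchronous direct I/O semantics guarantee. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/data/outfile", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }

    size_t chunk = 64 * 1024;
    char *buf = malloc(chunk);
    if (buf == NULL) { perror("malloc"); close(fd); return 1; }

    for (int i = 0; i < 16; i++) {
        memset(buf, 'a' + i, chunk);                    /* fill the buffer */
        if (write(fd, buf, chunk) != (ssize_t)chunk) {  /* write it out    */
            perror("write");
            break;
        }
        /* The buffer is reused on the very next iteration.  If the kernel
         * were still holding a reference to this memory so it could
         * coalesce the write with later ones, that reuse would corrupt
         * the data unless the kernel had already copied it. */
    }

    free(buf);
    close(fd);
    return 0;
}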