http://www.linux-nfs.org/wiki/index.php?title=Nfsd4_server_recovery&feed=atom&action=history
Nfsd4 server recovery - Revision history
2024-03-28T22:25:10Z
Revision history for this page on the wiki
MediaWiki 1.16.5
http://www.linux-nfs.org/wiki/index.php?title=Nfsd4_server_recovery&diff=4579&oldid=prev
Bfields: /* Requirements summary */
2011-08-03T20:14:38Z
<p><span class="autocomment">Requirements summary</span></p>
<table style="background-color: white; color:black;">
<col class='diff-marker' />
<col class='diff-content' />
<col class='diff-marker' />
<col class='diff-content' />
<tr valign='top'>
<td colspan='2' style="background-color: white; color:black;">← Older revision</td>
<td colspan='2' style="background-color: white; color:black;">Revision as of 20:14, 3 August 2011</td>
</tr><tr><td colspan="2" class="diff-lineno">Line 56:</td>
<td colspan="2" class="diff-lineno">Line 56:</td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>* When a new client becomes active, a record for that client must be created in stable storage before responding to the rpc in question (OPEN, OPEN_CONFIRM, or RECLAIM_COMPLETE).</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>* When a new client becomes active, a record for that client must be created in stable storage before responding to the rpc in question (OPEN, OPEN_CONFIRM, or RECLAIM_COMPLETE).</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>* When a client expires, the record must be removed (or otherwise marked expired) before responding to any requests for locks or other state which would conflict with state held by the expiring client.</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>* When a client expires, the record must be removed (or otherwise marked expired) before responding to any requests for locks or other state which would conflict with state held by the expiring client.</div></td></tr>
<tr><td class='diff-marker'>-</td><td style="background: #ffa; color:black; font-size: smaller;"><div>* Updates must be made by upcalls to userspace; the kernel will not be</div></td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div>* Updates must be made by upcalls to userspace; the kernel will not be directly involved in managing stable storage. The upcall interface should be extensible.</div></td></tr>
<tr><td class='diff-marker'>-</td><td style="background: #ffa; color:black; font-size: smaller;"><div>directly involved in managing stable storage. The upcall interface</div></td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div>* The records must include the client owner name, to allow identifying clients on restart. The protocol allows client owner names to consist of up to 1024 bytes of binary data. (This is the client-supplied long form, not the server-generated shorthand clientid; co_ownerid for 4.1).</div></td></tr>
<tr><td class='diff-marker'>-</td><td style="background: #ffa; color:black; font-size: smaller;"><div>should be extensible.</div></td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div></div></td></tr>
<tr><td class='diff-marker'>-</td><td style="background: #ffa; color:black; font-size: smaller;"><div>* The records must include the client owner name, to allow identifying</div></td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div></div></td></tr>
<tr><td class='diff-marker'>-</td><td style="background: #ffa; color:black; font-size: smaller;"><div>clients on restart. The protocol allows client owner names to consist</div></td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div></div></td></tr>
<tr><td class='diff-marker'>-</td><td style="background: #ffa; color:black; font-size: smaller;"><div>of up to 1024 bytes of binary data. (This is the client-supplied</div></td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div></div></td></tr>
<tr><td class='diff-marker'>-</td><td style="background: #ffa; color:black; font-size: smaller;"><div>long form, not the server-generated shorthand clientid; co_ownerid for</div></td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div></div></td></tr>
<tr><td class='diff-marker'>-</td><td style="background: #ffa; color:black; font-size: smaller;"><div>4.1).</div></td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div></div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>== Nice to have ==</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>== Nice to have ==</div></td></tr>
</table>
Bfields
http://www.linux-nfs.org/wiki/index.php?title=Nfsd4_server_recovery&diff=4578&oldid=prev
Bfields: minor reorganization
2011-08-03T20:13:57Z
<p>minor reorganization</p>
<table style="background-color: white; color:black;">
<col class='diff-marker' />
<col class='diff-content' />
<col class='diff-marker' />
<col class='diff-content' />
<tr valign='top'>
<td colspan='2' style="background-color: white; color:black;">← Older revision</td>
<td colspan='2' style="background-color: white; color:black;">Revision as of 20:13, 3 August 2011</td>
</tr><tr><td colspan="2" class="diff-lineno">Line 8:</td>
<td colspan="2" class="diff-lineno">Line 8:</td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>= Requirements =</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>= Requirements =</div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins style="color: red; font-weight: bold; text-decoration: none;"></ins></div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins style="color: red; font-weight: bold; text-decoration: none;">== Discussion ==</ins></div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>Requirements, as compared to current code:</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>Requirements, as compared to current code:</div></td></tr>
<tr><td colspan="2" class="diff-lineno">Line 47:</td>
<td colspan="2" class="diff-lineno">Line 49:</td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>inactive, stable storage must be updated, and until the update has</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>inactive, stable storage must be updated, and until the update has</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>completed the server must do nothing that acknowledges the new state.</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>completed the server must do nothing that acknowledges the new state.</div></td></tr>
<tr><td class='diff-marker'>-</td><td style="background: #ffa; color:black; font-size: smaller;"><div><del style="color: red; font-weight: bold; text-decoration: none;">So:</del></div></td><td colspan="2"> </td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td></tr>
<tr><td class='diff-marker'>-</td><td style="background: #ffa; color:black; font-size: smaller;"><div><del class="diffchange diffchange-inline">* When a new client becomes active</del>, <del class="diffchange diffchange-inline">a record for that client must be created in stable storage before responding to </del>the <del class="diffchange diffchange-inline">rpc in question (OPEN, OPEN_CONFIRM, or RECLAIM_COMPLETE).</del></div></td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins class="diffchange diffchange-inline">So</ins>, the <ins class="diffchange diffchange-inline">fundamental requirements are:</ins></div></td></tr>
<tr><td class='diff-marker'>-</td><td style="background: #ffa; color:black; font-size: smaller;"><div><del class="diffchange diffchange-inline">* When a client expires, the record must be removed (or otherwise marked expired) before responding to any requests for locks or other state which would conflict with state held by the expiring client.</del></div></td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div></div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins style="color: red; font-weight: bold; text-decoration: none;">== Requirements summary ==</ins></div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td></tr>
<tr><td class='diff-marker'>-</td><td style="background: #ffa; color:black; font-size: smaller;"><div>Updates must be made by upcalls to userspace; the kernel will not be</div></td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins class="diffchange diffchange-inline">* When a new client becomes active, a record for that client must be created in stable storage before responding to the rpc in question (OPEN, OPEN_CONFIRM, or RECLAIM_COMPLETE).</ins></div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins class="diffchange diffchange-inline">* When a client expires, the record must be removed (or otherwise marked expired) before responding to any requests for locks or other state which would conflict with state held by the expiring client.</ins></div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins class="diffchange diffchange-inline">* </ins>Updates must be made by upcalls to userspace; the kernel will not be</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>directly involved in managing stable storage. The upcall interface</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>directly involved in managing stable storage. The upcall interface</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>should be extensible.</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>should be extensible.</div></td></tr>
<tr><td class='diff-marker'>-</td><td style="background: #ffa; color:black; font-size: smaller;"><div> </div></td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins class="diffchange diffchange-inline">* </ins>The records must include the client owner name, to allow identifying</div></td></tr>
<tr><td class='diff-marker'>-</td><td style="background: #ffa; color:black; font-size: smaller;"><div>The records must include the client owner name, to allow identifying</div></td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div></div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>clients on restart. The protocol allows client owner names to consist</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>clients on restart. The protocol allows client owner names to consist</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>of up to 1024 bytes of binary data. (This is the client-supplied</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>of up to 1024 bytes of binary data. (This is the client-supplied</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>long form, not the server-generated shorthand clientid; co_ownerid for</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>long form, not the server-generated shorthand clientid; co_ownerid for</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>4.1).</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>4.1).</div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins style="color: red; font-weight: bold; text-decoration: none;"></ins></div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins style="color: red; font-weight: bold; text-decoration: none;">== Nice to have ==</ins></div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>Also desirable, but not absolutely required in the first</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>Also desirable, but not absolutely required in the first</div></td></tr>
</table>
Bfields
http://www.linux-nfs.org/wiki/index.php?title=Nfsd4_server_recovery&diff=4577&oldid=prev
Jlayton: /* Another Draft Design */
2011-08-03T14:36:18Z
<p><span class="autocomment">Another Draft Design</span></p>
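<p>A minimal sketch of the set_client and confirm_client behaviour described in this revision, assuming an in-memory dictionary in place of stable storage. The class name ClientDB, the owner argument (standing in for the "information that can be used to generate clientid4"), the in_grace flag, and the use of LookupError for the error reply are all hypothetical; the page does not define a concrete interface.</p>

```python
import itertools


class ClientDB:
    """Hypothetical in-memory stand-in for the daemon's stable storage."""

    def __init__(self):
        # (server_ip, owner) -> {"clientid4": int, "state": str}
        self.records = {}
        self._next_id = itertools.count(1)

    def set_client(self, server_ip, owner, in_grace):
        key = (server_ip, owner)
        rec = self.records.get(key)
        if in_grace:
            # During grace the client must already be in the db;
            # the existing clientid4 is returned, otherwise an error.
            if rec is None:
                raise LookupError("NFS4ERR_GRACE")
            return rec["clientid4"]
        # Outside grace: generate a unique clientid4, store it, and
        # mark the stored record "incomplete" until it is confirmed.
        clientid4 = next(self._next_id)
        self.records[key] = {"clientid4": clientid4, "state": "incomplete"}
        return clientid4

    def confirm_client(self, server_ip, clientid4):
        # Returns only once the matching record is marked "confirmed".
        for (ip, _owner), rec in self.records.items():
            if ip == server_ip and rec["clientid4"] == clientid4:
                rec["state"] = "confirmed"
                return
        raise LookupError("unknown clientid4")
```

<p>A real daemon would flush each record to disk before replying to the kernel; the dictionary update here only marks where that write would happen.</p>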
<table style="background-color: white; color:black;">
<col class='diff-marker' />
<col class='diff-content' />
<col class='diff-marker' />
<col class='diff-content' />
<tr valign='top'>
<td colspan='2' style="background-color: white; color:black;">← Older revision</td>
<td colspan='2' style="background-color: white; color:black;">Revision as of 14:36, 3 August 2011</td>
</tr><tr><td colspan="2" class="diff-lineno">Line 147:</td>
<td colspan="2" class="diff-lineno">Line 147:</td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>Most of the files would be replaced with a single rpc_pipefs pipe in a nfsd subdirectory in rpc_pipefs. The daemon would open this pipe and listen on it. The upcall format would contain a command field that the daemon would use to determine what sort of message this was. The commands are as follows:</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>Most of the files would be replaced with a single rpc_pipefs pipe in a nfsd subdirectory in rpc_pipefs. The daemon would open this pipe and listen on it. The upcall format would contain a command field that the daemon would use to determine what sort of message this was. The commands are as follows:</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td></tr>
<tr><td class='diff-marker'>-</td><td style="background: #ffa; color:black; font-size: smaller;"><div>* set_client: given a server IP address, and information that can be used to generate clientid4, returns clientid4 (and a verifier?). Called during SETCLIENTID/EXCHANGE_ID phase<del class="diffchange diffchange-inline">. This ensures uniqueness of the generated clientid4 by delegating it to userspace</del>. If set_client is called during the grace period, then the client must already be in the db. If not, an error will be returned (NFS4ERR_GRACE?). If set_client is called outside the grace period, then a <del class="diffchange diffchange-inline">new record </del>is <del class="diffchange diffchange-inline">created</del>, <del class="diffchange diffchange-inline">or something like EEXIST is </del>returned <del class="diffchange diffchange-inline">if </del>the <del class="diffchange diffchange-inline">client already exists in the db</del>. <del class="diffchange diffchange-inline">Does not return until a new record has safely been recorded on disk, but client </del>is marked "<del class="diffchange diffchange-inline">unconfirmed</del>".</div></td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div>* set_client: given a server IP address, and information that can be used to generate clientid4, returns clientid4 (and a verifier?). Called during SETCLIENTID/EXCHANGE_ID phase. If set_client is called during the grace period, then the client must already be in the db <ins class="diffchange diffchange-inline">and the existing clientid4 is returned</ins>. If not, an error will be returned (NFS4ERR_GRACE?). 
If set_client is called outside the grace period, then a <ins class="diffchange diffchange-inline">unique clientid4 </ins>is <ins class="diffchange diffchange-inline">generated</ins>, <ins class="diffchange diffchange-inline">stored on disk and </ins>returned <ins class="diffchange diffchange-inline">to </ins>the <ins class="diffchange diffchange-inline">kernel</ins>. <ins class="diffchange diffchange-inline">The stored clientid </ins>is marked "<ins class="diffchange diffchange-inline">incomplete</ins>".</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>* confirm_client: given a clientid4 and a server-side IP address, returns an error. The kernel will call this on the first reclaim OPEN or OPEN_CONFIRM (for v4.0 clients) or on RECLAIM_COMPLETE (for 4.1 clients). Does not return until client is marked "confirmed".</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>* confirm_client: given a clientid4 and a server-side IP address, returns an error. The kernel will call this on the first reclaim OPEN or OPEN_CONFIRM (for v4.0 clients) or on RECLAIM_COMPLETE (for 4.1 clients). Does not return until client is marked "confirmed".</div></td></tr>
</table>
Jlayton
http://www.linux-nfs.org/wiki/index.php?title=Nfsd4_server_recovery&diff=4576&oldid=prev
Jlayton: /* Another Draft Design */
2011-08-03T14:30:11Z
<p><span class="autocomment">Another Draft Design</span></p>
<table style="background-color: white; color:black;">
<col class='diff-marker' />
<col class='diff-content' />
<col class='diff-marker' />
<col class='diff-content' />
<tr valign='top'>
<td colspan='2' style="background-color: white; color:black;">← Older revision</td>
<td colspan='2' style="background-color: white; color:black;">Revision as of 14:30, 3 August 2011</td>
</tr><tr><td colspan="2" class="diff-lineno">Line 147:</td>
<td colspan="2" class="diff-lineno">Line 147:</td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>Most of the files would be replaced with a single rpc_pipefs pipe in a nfsd subdirectory in rpc_pipefs. The daemon would open this pipe and listen on it. The upcall format would contain a command field that the daemon would use to determine what sort of message this was. The commands are as follows:</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>Most of the files would be replaced with a single rpc_pipefs pipe in a nfsd subdirectory in rpc_pipefs. The daemon would open this pipe and listen on it. The upcall format would contain a command field that the daemon would use to determine what sort of message this was. The commands are as follows:</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td></tr>
<tr><td class='diff-marker'>-</td><td style="background: #ffa; color:black; font-size: smaller;"><div>* set_client: given <del class="diffchange diffchange-inline">a client owner and </del>a server<del class="diffchange diffchange-inline">-side </del>IP address, returns <del class="diffchange diffchange-inline">an error. The kernel will call this on the first reclaim OPEN or OPEN_CONFIRM </del>(<del class="diffchange diffchange-inline">for v4.0 clients</del>) <del class="diffchange diffchange-inline">or on RECLAIM_COMPLETE (for 4</del>.<del class="diffchange diffchange-inline">1 clients)</del>. If set_client is called during the grace period, then the client must already be in the db. If not, an error will be returned (NFS4ERR_GRACE?). If <del class="diffchange diffchange-inline">get_client </del>is called outside the grace period, then a new record is created, or something like EEXIST is returned if the client already exists in the db. Does not return until a new record has safely been recorded on disk. </div></td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div>* set_client: given a server IP address<ins class="diffchange diffchange-inline">, and information that can be used to generate clientid4</ins>, returns <ins class="diffchange diffchange-inline">clientid4 </ins>(<ins class="diffchange diffchange-inline">and a verifier?</ins>). <ins class="diffchange diffchange-inline">Called during SETCLIENTID/EXCHANGE_ID phase. This ensures uniqueness of the generated clientid4 by delegating it to userspace</ins>. If set_client is called during the grace period, then the client must already be in the db. If not, an error will be returned (NFS4ERR_GRACE?). If <ins class="diffchange diffchange-inline">set_client </ins>is called outside the grace period, then a new record is created, or something like EEXIST is returned if the client already exists in the db. 
Does not return until a new record has safely been recorded on disk<ins class="diffchange diffchange-inline">, but client is marked "unconfirmed"</ins>.</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td></tr>
<tr><td class='diff-marker'>-</td><td style="background: #ffa; color:black; font-size: smaller;"><div>* <del class="diffchange diffchange-inline">expire_client</del>: given a client <del class="diffchange diffchange-inline">owner </del>and server IP address, replies with an empty reply. Replies only after it has recorded to disk the fact that the client has expired. The kernel will call this when a client loses its lease, before removing its locks and opens (and allowing potentially conflicting operations).</div></td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div>* <ins class="diffchange diffchange-inline">confirm_client</ins>: given a <ins class="diffchange diffchange-inline">clientid4 and a server-side IP address, returns an error. The kernel will call this on the first reclaim OPEN or OPEN_CONFIRM (for v4.0 clients) or on RECLAIM_COMPLETE (for 4.1 clients). Does not return until </ins>client <ins class="diffchange diffchange-inline">is marked "confirmed".</ins></div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div> </div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins class="diffchange diffchange-inline">* expire_client: given a clientid4 </ins>and server IP address, replies with an empty reply. Replies only after it has recorded to disk the fact that the client has expired. The kernel will call this when a client loses its lease, before removing its locks and opens (and allowing potentially conflicting operations).</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>* begin_grace: kernel will send a server IP address and number of seconds. Daemon responds with a count of clients that need to be reclaimed. If that number is 0, then the kernel will know that the grace period can be immediately lifted. The daemon need not do anything else.</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>* begin_grace: kernel will send a server IP address and number of seconds. Daemon responds with a count of clients that need to be reclaimed. If that number is 0, then the kernel will know that the grace period can be immediately lifted. The daemon need not do anything else.</div></td></tr>
</table>
Jlayton
http://www.linux-nfs.org/wiki/index.php?title=Nfsd4_server_recovery&diff=4575&oldid=prev
Jlayton: /* Another Draft Design */
2011-08-02T14:03:18Z
<p><span class="autocomment">Another Draft Design</span></p>
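<p>A rough sketch of how a daemon might dispatch on the command field of an upcall, covering the begin_grace, expire_client and grace_complete commands from this draft. The message layout (a dictionary with cmd, server_ip, client and seconds keys), the function name handle_upcall, and the per-record "reclaimed" flag are assumptions made for illustration; the draft does not specify a wire format.</p>

```python
def handle_upcall(db, msg):
    """Dispatch one upcall on its command field.

    db maps (server_ip, client) -> reclaimed flag; it is a hypothetical
    in-memory stand-in for the daemon's on-disk records.
    """
    cmd, ip = msg["cmd"], msg["server_ip"]
    if cmd == "begin_grace":
        # Reply with the number of clients that may reclaim on this
        # address; msg["seconds"] is received but unused in this sketch.
        return sum(1 for (rec_ip, _client) in db if rec_ip == ip)
    if cmd == "expire_client":
        # Record the expiry before replying (empty reply).
        db.pop((ip, msg["client"]), None)
        return None
    if cmd == "grace_complete":
        # Purge records for this address that were never reclaimed.
        stale = [k for k, reclaimed in db.items() if k[0] == ip and not reclaimed]
        for key in stale:
            del db[key]
        return None
    raise ValueError(f"unknown command: {cmd}")
```

<p>A begin_grace reply of 0 corresponds to the case where the kernel can lift the grace period immediately.</p>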
<table style="background-color: white; color:black;">
<col class='diff-marker' />
<col class='diff-content' />
<col class='diff-marker' />
<col class='diff-content' />
<tr valign='top'>
<td colspan='2' style="background-color: white; color:black;">← Older revision</td>
<td colspan='2' style="background-color: white; color:black;">Revision as of 14:03, 2 August 2011</td>
</tr><tr><td colspan="2" class="diff-lineno">Line 145:</td>
<td colspan="2" class="diff-lineno">Line 145:</td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>This draft design will use rpc_pipefs to handle most of the communications. The reason for that is that I'm not convinced that polling on /proc/fs/nfsd files will work well. How would the daemon know that the kernel has data ready to read off the socket? That may be fixable, but rpc_pipefs should already be suitable for this.</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>This draft design will use rpc_pipefs to handle most of the communications. The reason for that is that I'm not convinced that polling on /proc/fs/nfsd files will work well. How would the daemon know that the kernel has data ready to read off the socket? That may be fixable, but rpc_pipefs should already be suitable for this.</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td></tr>
<tr><td class='diff-marker'>-</td><td style="background: #ffa; color:black; font-size: smaller;"><div>Most of the files would be replaced with a single rpc_pipefs pipe in <del class="diffchange diffchange-inline">the top level of </del>rpc_pipefs. The daemon would open this pipe and listen on it. The upcall format would contain a command field that the daemon would use to determine what sort of message this was. The commands are as follows:</div></td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div>Most of the files would be replaced with a single rpc_pipefs pipe in <ins class="diffchange diffchange-inline">a nfsd subdirectory in </ins>rpc_pipefs. The daemon would open this pipe and listen on it. The upcall format would contain a command field that the daemon would use to determine what sort of message this was. The commands are as follows:</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td></tr>
<tr><td class='diff-marker'>-</td><td style="background: #ffa; color:black; font-size: smaller;"><div>* <del class="diffchange diffchange-inline">create_client</del>: given a client owner and a server-side IP address, returns an error<del class="diffchange diffchange-inline">. Does not return until a new record has safely been recorded on disk</del>. The kernel will call this on the first reclaim OPEN or OPEN_CONFIRM (for v4.0 clients) or on RECLAIM_COMPLETE (for 4.1 clients). <del class="diffchange diffchange-inline">(FIXME: do we need a separate recover_client command</del>, <del class="diffchange diffchange-inline">or </del>will <del class="diffchange diffchange-inline">this do double-duty</del>?)</div></td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div>* <ins class="diffchange diffchange-inline">set_client</ins>: given a client owner and a server-side IP address, returns an error. The kernel will call this on the first reclaim OPEN or OPEN_CONFIRM (for v4.0 clients) or on RECLAIM_COMPLETE (for 4.1 clients). <ins class="diffchange diffchange-inline">If set_client is called during the grace period</ins>, <ins class="diffchange diffchange-inline">then the client must already be in the db. If not, an error </ins>will <ins class="diffchange diffchange-inline">be returned (NFS4ERR_GRACE</ins>?)<ins class="diffchange diffchange-inline">. If get_client is called outside the grace period, then a new record is created, or something like EEXIST is returned if the client already exists in the db. Does not return until a new record has safely been recorded on disk. </ins></div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>* expire_client: given a client owner and server IP address, replies with an empty reply. Replies only after it has recorded to disk the fact that the client has expired. The kernel will call this when a client loses its lease, before removing its locks and opens (and allowing potentially conflicting operations).</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>* expire_client: given a client owner and server IP address, replies with an empty reply. Replies only after it has recorded to disk the fact that the client has expired. The kernel will call this when a client loses its lease, before removing its locks and opens (and allowing potentially conflicting operations).</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td></tr>
<tr><td class='diff-marker'>-</td><td style="background: #ffa; color:black; font-size: smaller;"><div>* begin_grace: kernel will send a server IP address<del class="diffchange diffchange-inline">, daemon </del>responds with a count of clients that need to be reclaimed<del class="diffchange diffchange-inline">? </del>If that number is 0, then the kernel will know that the grace period can be immediately lifted.</div></td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div>* begin_grace: kernel will send a server IP address <ins class="diffchange diffchange-inline">and number of seconds. Daemon </ins>responds with a count of clients that need to be reclaimed<ins class="diffchange diffchange-inline">. </ins>If that number is 0, then the kernel will know that the grace period can be immediately lifted<ins class="diffchange diffchange-inline">. The daemon need not do anything else</ins>.</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>* grace_complete: given a server IP address, returns with an empty reply. The client will upcall with this command after the grace period expires. The daemon will use that to purge any unreclaimed client records for the given server address.</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>* grace_complete: given a server IP address, returns with an empty reply. The client will upcall with this command after the grace period expires. The daemon will use that to purge any unreclaimed client records for the given server address.</div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins style="color: red; font-weight: bold; text-decoration: none;"></ins></div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins style="color: red; font-weight: bold; text-decoration: none;">One concern is that this data is not per-client like most of the stuff under rpc_pipefs, but hopefully that "weirdness" won't be a show stopper.</ins></div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>The important thing to note is that all of the above commands are kernel-initiated. We also want to allow the daemon to initiate a grace_complete as well when all of the client records for an address have been reclaimed. That requires a different interface, possibly a simple file in /proc/fs/nfsd. When the last state record for an IP address has been recovered, it would write that IP addr into the file. The kernel would then know that the grace period for that IP address is now complete.</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>The important thing to note is that all of the above commands are kernel-initiated. We also want to allow the daemon to initiate a grace_complete as well when all of the client records for an address have been reclaimed. That requires a different interface, possibly a simple file in /proc/fs/nfsd. When the last state record for an IP address has been recovered, it would write that IP addr into the file. The kernel would then know that the grace period for that IP address is now complete.</div></td></tr>
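<tr><td colspan="4" style="background-color: white; color:black;">
<p>The command set above can be sketched as a small dispatcher, in Python. This is only an illustration under stated assumptions: the wiki does not define a message format, so the dict-shaped upcall with "cmd", "server_ip", "owner" and "seconds" keys, the RecoveryDB class, and the handle_upcall function are all hypothetical. An in-memory dict stands in for stable storage, and errno.ENOENT stands in for the error the text tentatively names as NFS4ERR_GRACE.</p>

```python
import errno


class RecoveryDB:
    """Hypothetical in-memory stand-in for the daemon's on-disk client db."""

    def __init__(self):
        # (server_ip, owner) -> True if created or reclaimed since the last
        # begin_grace, False if still waiting to be reclaimed.
        self.records = {}
        self.in_grace = set()

    def begin_grace(self, server_ip, seconds):
        # 'seconds' is accepted but unused: per the text, the daemon need
        # not do anything beyond returning the count.
        self.in_grace.add(server_ip)
        keys = [k for k in self.records if k[0] == server_ip]
        for key in keys:
            self.records[key] = False
        return len(keys)  # 0 means the kernel may lift grace immediately

    def set_client(self, server_ip, owner):
        key = (server_ip, owner)
        if server_ip in self.in_grace:
            if key not in self.records:
                return errno.ENOENT  # assumed stand-in for NFS4ERR_GRACE
            self.records[key] = True  # client has reclaimed
            return 0
        if key in self.records:
            return errno.EEXIST
        self.records[key] = True  # a real daemon would sync to disk here
        return 0

    def expire_client(self, server_ip, owner):
        # Empty reply; a real daemon replies only after the change is on disk.
        self.records.pop((server_ip, owner), None)

    def grace_complete(self, server_ip):
        # Purge every record for this address that was never reclaimed.
        self.in_grace.discard(server_ip)
        stale = [k for k, reclaimed in self.records.items()
                 if k[0] == server_ip and not reclaimed]
        for key in stale:
            del self.records[key]


def handle_upcall(db, msg):
    """Dispatch one upcall on its command field and return the reply."""
    cmd = msg["cmd"]
    ip = msg["server_ip"]
    if cmd == "set_client":
        return db.set_client(ip, msg["owner"])
    if cmd == "expire_client":
        return db.expire_client(ip, msg["owner"])
    if cmd == "begin_grace":
        return db.begin_grace(ip, msg["seconds"])
    if cmd == "grace_complete":
        return db.grace_complete(ip)
    return errno.EINVAL  # unknown command
```

<p>Keying records on the (server IP, client owner) pair keeps each floating address's grace period independent, which is what the per-address begin_grace and grace_complete commands appear to require.</p>
</td></tr>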
</table>Jlaytonhttp://www.linux-nfs.org/wiki/index.php?title=Nfsd4_server_recovery&diff=4571&oldid=prevJlayton: /* Another Draft Design */2011-08-01T14:46:00Z<p><span class="autocomment">Another Draft Design</span></p>
<table style="background-color: white; color:black;">
<col class='diff-marker' />
<col class='diff-content' />
<col class='diff-marker' />
<col class='diff-content' />
<tr valign='top'>
<td colspan='2' style="background-color: white; color:black;">← Older revision</td>
<td colspan='2' style="background-color: white; color:black;">Revision as of 14:46, 1 August 2011</td>
</tr><tr><td colspan="2" class="diff-lineno">Line 143:</td>
<td colspan="2" class="diff-lineno">Line 143:</td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>= Another Draft Design =</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>= Another Draft Design =</div></td></tr>
<tr><td class='diff-marker'>-</td><td style="background: #ffa; color:black; font-size: smaller;"><div>This draft design will use rpc_pipefs to handle most of the communications. The reason for that is that I'm not convinced that polling on /proc/fs/nfsd files will work well. How would the daemon know that the kernel has data ready to read off the socket? That may be fixable, but rpc_pipefs should already be suitable for this<del class="diffchange diffchange-inline">. In the top level of rpc_pipefs, we add a new subdirectory called "v4recover"</del>.</div></td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div>This draft design will use rpc_pipefs to handle most of the communications. The reason for that is that I'm not convinced that polling on /proc/fs/nfsd files will work well. How would the daemon know that the kernel has data ready to read off the socket? That may be fixable, but rpc_pipefs should already be suitable for this.</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td></tr>
<tr><td class='diff-marker'>-</td><td style="background: #ffa; color:black; font-size: smaller;"><div><del class="diffchange diffchange-inline">create_client and expire_client </del>would be replaced with a single rpc_pipefs pipe in <del class="diffchange diffchange-inline">that dir (maybe called "upcall")</del>. The upcall format would contain a command field that the daemon would use to determine what sort of message this was.</div></td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins class="diffchange diffchange-inline">Most of the files </ins>would be replaced with a single rpc_pipefs pipe in <ins class="diffchange diffchange-inline">the top level of rpc_pipefs. The daemon would open this pipe and listen on it</ins>. The upcall format would contain a command field that the daemon would use to determine what sort of message this was. <ins class="diffchange diffchange-inline">The commands are as follows:</ins></div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td></tr>
<tr><td class='diff-marker'>-</td><td style="background: #ffa; color:black; font-size: smaller;"><div><del class="diffchange diffchange-inline">The allow_client </del>and <del class="diffchange diffchange-inline">grace_done interfaces would use another rpc_pipefs pipe </del>(<del class="diffchange diffchange-inline">"downcall"?</del>). <del class="diffchange diffchange-inline">That would also contain </del>a command <del class="diffchange diffchange-inline">field </del>that <del class="diffchange diffchange-inline">would tell </del>the kernel <del class="diffchange diffchange-inline">what sort </del>of <del class="diffchange diffchange-inline">message was being sent</del>. <del class="diffchange diffchange-inline">Alternately</del>, the <del class="diffchange diffchange-inline">grace_done </del>interface <del class="diffchange diffchange-inline">could be </del>in /proc/fs/nfsd <del class="diffchange diffchange-inline">instead</del>.</div></td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins class="diffchange diffchange-inline">* create_client: given a client owner </ins>and <ins class="diffchange diffchange-inline">a server-side IP address, returns an error. Does not return until a new record has safely been recorded on disk. The kernel will call this on the first reclaim OPEN or OPEN_CONFIRM </ins>(<ins class="diffchange diffchange-inline">for v4.0 clients</ins>) <ins class="diffchange diffchange-inline">or on RECLAIM_COMPLETE (for 4</ins>.<ins class="diffchange diffchange-inline">1 clients). (FIXME: do we need </ins>a <ins class="diffchange diffchange-inline">separate recover_client </ins>command<ins class="diffchange diffchange-inline">, or will this do double-duty?)</ins></div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div> </div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins class="diffchange diffchange-inline">* expire_client: given a client owner and server IP address, replies with an empty reply. Replies only after it has recorded to disk the fact </ins>that the <ins class="diffchange diffchange-inline">client has expired. The </ins>kernel <ins class="diffchange diffchange-inline">will call this when a client loses its lease, before removing its locks and opens (and allowing potentially conflicting operations).</ins></div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div> </div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins class="diffchange diffchange-inline">* begin_grace: kernel will send a server IP address, daemon responds with a count </ins>of <ins class="diffchange diffchange-inline">clients that need to be reclaimed? If that number is 0, then the kernel will know that the grace period can be immediately lifted</ins>.</div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div> </div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins class="diffchange diffchange-inline">* grace_complete: given a server IP address</ins>, <ins class="diffchange diffchange-inline">returns with an empty reply. The client will upcall with this command after </ins>the <ins class="diffchange diffchange-inline">grace period expires. The daemon will use that to purge any unreclaimed client records for the given server address.</ins></div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div> </div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins class="diffchange diffchange-inline">The important thing to note is that all of the above commands are kernel-initiated. We also want to allow the daemon to initiate a grace_complete as well when all of the client records for an address have been reclaimed. That requires a different </ins>interface<ins class="diffchange diffchange-inline">, possibly a simple file </ins>in /proc/fs/nfsd<ins class="diffchange diffchange-inline">. When the last state record for an IP address has been recovered, it would write that IP addr into the file. The kernel would then know that the grace period for that IP address is now complete</ins>.</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>= Accommodating active/active NFSv4 clusters =</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>= Accommodating active/active NFSv4 clusters =</div></td></tr>
</table>Jlaytonhttp://www.linux-nfs.org/wiki/index.php?title=Nfsd4_server_recovery&diff=4570&oldid=prevJlayton: /* Another Draft Design */2011-08-01T14:10:39Z<p><span class="autocomment">Another Draft Design</span></p>
<table style="background-color: white; color:black;">
<col class='diff-marker' />
<col class='diff-content' />
<col class='diff-marker' />
<col class='diff-content' />
<tr valign='top'>
<td colspan='2' style="background-color: white; color:black;">← Older revision</td>
<td colspan='2' style="background-color: white; color:black;">Revision as of 14:10, 1 August 2011</td>
</tr><tr><td colspan="2" class="diff-lineno">Line 145:</td>
<td colspan="2" class="diff-lineno">Line 145:</td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>This draft design will use rpc_pipefs to handle most of the communications. The reason for that is that I'm not convinced that polling on /proc/fs/nfsd files will work well. How would the daemon know that the kernel has data ready to read off the socket? That may be fixable, but rpc_pipefs should already be suitable for this. In the top level of rpc_pipefs, we add a new subdirectory called "v4recover".</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>This draft design will use rpc_pipefs to handle most of the communications. The reason for that is that I'm not convinced that polling on /proc/fs/nfsd files will work well. How would the daemon know that the kernel has data ready to read off the socket? That may be fixable, but rpc_pipefs should already be suitable for this. In the top level of rpc_pipefs, we add a new subdirectory called "v4recover".</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td></tr>
<tr><td class='diff-marker'>-</td><td style="background: #ffa; color:black; font-size: smaller;"><div>create_client and expire_client would be replaced with a single rpc_pipefs pipe in that dir (maybe called "upcall"). The upcall format would contain a <del class="diffchange diffchange-inline">"</del>command<del class="diffchange diffchange-inline">" </del>field that the daemon would use to determine what sort of message this was.</div></td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div>create_client and expire_client would be replaced with a single rpc_pipefs pipe in that dir (maybe called "upcall"). The upcall format would contain a command field that the daemon would use to determine what sort of message this was.</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>The allow_client and grace_done interfaces would use another rpc_pipefs pipe ("downcall"?). That would also contain a command field that would tell the kernel what sort of message was being sent. Alternately, the grace_done interface could be in /proc/fs/nfsd instead.</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>The allow_client and grace_done interfaces would use another rpc_pipefs pipe ("downcall"?). That would also contain a command field that would tell the kernel what sort of message was being sent. Alternately, the grace_done interface could be in /proc/fs/nfsd instead.</div></td></tr>
</table>Jlaytonhttp://www.linux-nfs.org/wiki/index.php?title=Nfsd4_server_recovery&diff=4569&oldid=prevJlayton: /* Accomodating active/active NFSv4 clusters */2011-08-01T14:07:05Z<p><span class="autocomment">Accomodating active/active NFSv4 clusters</span></p>
<table style="background-color: white; color:black;">
<col class='diff-marker' />
<col class='diff-content' />
<col class='diff-marker' />
<col class='diff-content' />
<tr valign='top'>
<td colspan='2' style="background-color: white; color:black;">← Older revision</td>
<td colspan='2' style="background-color: white; color:black;">Revision as of 14:07, 1 August 2011</td>
</tr><tr><td colspan="2" class="diff-lineno">Line 141:</td>
<td colspan="2" class="diff-lineno">Line 141:</td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>The daemon will then wait for create_client, expire_client, and</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>The daemon will then wait for create_client, expire_client, and</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>grace_done calls. On grace_done, it will rename new_boot_time to boot_time.</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>grace_done calls. On grace_done, it will rename new_boot_time to boot_time.</div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins style="color: red; font-weight: bold; text-decoration: none;"></ins></div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins style="color: red; font-weight: bold; text-decoration: none;">= Another Draft Design =</ins></div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins style="color: red; font-weight: bold; text-decoration: none;">This draft design will use rpc_pipefs to handle most of the communications. The reason for that is that I'm not convinced that polling on /proc/fs/nfsd files will work well. How would the daemon know that the kernel has data ready to read off the socket? That may be fixable, but rpc_pipefs should already be suitable for this. In the top level of rpc_pipefs, we add a new subdirectory called "v4recover".</ins></div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins style="color: red; font-weight: bold; text-decoration: none;"></ins></div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins style="color: red; font-weight: bold; text-decoration: none;">create_client and expire_client would be replaced with a single rpc_pipefs pipe in that dir (maybe called "upcall"). The upcall format would contain a "command" field that the daemon would use to determine what sort of message this was.</ins></div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins style="color: red; font-weight: bold; text-decoration: none;"></ins></div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins style="color: red; font-weight: bold; text-decoration: none;">The allow_client and grace_done interfaces would use another rpc_pipefs pipe ("downcall"?). That would also contain a command field that would tell the kernel what sort of message was being sent. Alternately, the grace_done interface could be in /proc/fs/nfsd instead.</ins></div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>= Accommodating active/active NFSv4 clusters =</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>= Accommodating active/active NFSv4 clusters =</div></td></tr>
</table>Jlaytonhttp://www.linux-nfs.org/wiki/index.php?title=Nfsd4_server_recovery&diff=4535&oldid=prevJlayton at 14:53, 29 July 20112011-07-29T14:53:23Z<p></p>
<table style="background-color: white; color:black;">
<col class='diff-marker' />
<col class='diff-content' />
<col class='diff-marker' />
<col class='diff-content' />
<tr valign='top'>
<td colspan='2' style="background-color: white; color:black;">← Older revision</td>
<td colspan='2' style="background-color: white; color:black;">Revision as of 14:53, 29 July 2011</td>
</tr><tr><td colspan="2" class="diff-lineno">Line 141:</td>
<td colspan="2" class="diff-lineno">Line 141:</td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>The daemon will then wait for create_client, expire_client, and</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>The daemon will then wait for create_client, expire_client, and</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>grace_done calls. On grace_done, it will rename new_boot_time to boot_time.</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>grace_done calls. On grace_done, it will rename new_boot_time to boot_time.</div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins style="color: red; font-weight: bold; text-decoration: none;"></ins></div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins style="color: red; font-weight: bold; text-decoration: none;">= Accommodating active/active NFSv4 clusters =</ins></div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins style="color: red; font-weight: bold; text-decoration: none;">To handle the situation where we have an active/active NFSv4 cluster with IP addresses that float between machines, we'll need to further tie each of the client records to one of the server's IP addresses. The create_client and expire_client interfaces will need to carry an encoded server IP address, and the daemon will need to store that information. We'll also need to have some way to tell the daemon to only feed records for a certain IP address to the server, so that when the server picks up a new address it can get the new records.</ins></div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins style="color: red; font-weight: bold; text-decoration: none;"></ins></div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins style="color: red; font-weight: bold; text-decoration: none;">What may be best is to have a "simple" state recovery daemon for single server configurations, and a "clustered" one for clustered configurations. The upcall/downcall formats should be the same for them, however.</ins></div></td></tr>
</table>Jlaytonhttp://www.linux-nfs.org/wiki/index.php?title=Nfsd4_server_recovery&diff=4524&oldid=prevBfields: /* Draft design */2011-07-26T19:47:34Z<p><span class="autocomment">Draft design</span></p>
<table style="background-color: white; color:black;">
<col class='diff-marker' />
<col class='diff-content' />
<col class='diff-marker' />
<col class='diff-content' />
<tr valign='top'>
<td colspan='2' style="background-color: white; color:black;">← Older revision</td>
<td colspan='2' style="background-color: white; color:black;">Revision as of 19:47, 26 July 2011</td>
</tr><tr><td colspan="2" class="diff-lineno">Line 132:</td>
<td colspan="2" class="diff-lineno">Line 132:</td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>manage boot times and old clients using files in /var/lib/nfs:</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>manage boot times and old clients using files in /var/lib/nfs:</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td></tr>
<tr><td class='diff-marker'>-</td><td style="background: #ffa; color:black; font-size: smaller;"><div>* If boot_time exists:</div></td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div><ins class="diffchange diffchange-inline">* Record the current time in new_boot_time (replacing any existing such file).</ins></div></td></tr>
<tr><td colspan="2"> </td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div>* If <ins class="diffchange diffchange-inline">the file </ins>boot_time exists:</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>** It will be read, and the contents interpreted as an ascii-encoded unix time in seconds.</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>** It will be read, and the contents interpreted as an ascii-encoded unix time in seconds.</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>** All client records older than that time will be removed.</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>** All client records older than that time will be removed.</div></td></tr>
<tr><td class='diff-marker'>-</td><td style="background: #ffa; color:black; font-size: smaller;"><div><del style="color: red; font-weight: bold; text-decoration: none;">** The current boot_time will be recorded to new_boot_time (replacing any existing such file).</del></div></td><td colspan="2"> </td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>** All remaining clients will be written to allow_client.</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>** All remaining clients will be written to allow_client.</div></td></tr>
<tr><td class='diff-marker'>-</td><td style="background: #ffa; color:black; font-size: smaller;"><div>* If boot_time does not exist, an empty /var/lib/nfs/v4clients/ is created if necessary<del class="diffchange diffchange-inline">, but nothing else is done</del>.</div></td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div>* If boot_time does not exist, an empty /var/lib/nfs/v4clients/ is created if necessary.</div></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"></td></tr>
<tr><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>The daemon will then wait for create_client, expire_client, and</div></td><td class='diff-marker'> </td><td style="background: #eee; color:black; font-size: smaller;"><div>The daemon will then wait for create_client, expire_client, and</div></td></tr>
<tr><td class='diff-marker'>-</td><td style="background: #ffa; color:black; font-size: smaller;"><div>grace_done calls. On grace_done, it will rename <del class="diffchange diffchange-inline">boot_time to</del></div></td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div>grace_done calls. On grace_done, it will rename new_boot_time to boot_time.</div></td></tr>
<tr><td class='diff-marker'>-</td><td style="background: #ffa; color:black; font-size: smaller;"><div><del class="diffchange diffchange-inline">old_boot_time, and </del>new_boot_time to boot_time.</div></td><td class='diff-marker'>+</td><td style="background: #cfc; color:black; font-size: smaller;"><div></div></td></tr>
</table>Bfields
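<p>The startup steps in the draft design above can be sketched as follows, in Python. This is a minimal illustration, not the daemon's actual implementation. The wiki text specifies only the boot_time and new_boot_time files, the v4clients/ directory, and the order of the steps; the function names, the one-file-per-client layout, the use of each record file's mtime as its age, and the allow_client callback (standing in for the write to the kernel's allow_client interface) are assumptions.</p>

```python
import os
import time
from pathlib import Path


def daemon_startup(state_dir, allow_client, now=None):
    """Run the startup steps; return the clients that may reclaim.

    state_dir plays the role of /var/lib/nfs. allow_client is a
    hypothetical callback invoked once per surviving client record.
    """
    state_dir = Path(state_dir)
    clients_dir = state_dir / "v4clients"
    clients_dir.mkdir(parents=True, exist_ok=True)  # created if necessary

    now = int(time.time()) if now is None else int(now)

    # Record the current time in new_boot_time, replacing any existing
    # file. Write to a temporary name and rename so a crash cannot leave
    # a half-written file behind.
    tmp = state_dir / "new_boot_time.tmp"
    with open(tmp, "w") as f:
        f.write(str(now))
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, state_dir / "new_boot_time")

    allowed = []
    boot_time_file = state_dir / "boot_time"
    if boot_time_file.exists():
        # Contents are an ASCII-encoded Unix time in seconds.
        boot_time = int(boot_time_file.read_text().strip())
        for record in sorted(clients_dir.iterdir()):
            if record.stat().st_mtime < boot_time:
                record.unlink()  # older than the previous boot: remove
            else:
                allow_client(record.name)
                allowed.append(record.name)
    return allowed


def grace_done(state_dir):
    """On grace_done, rename new_boot_time to boot_time."""
    state_dir = Path(state_dir)
    os.replace(state_dir / "new_boot_time", state_dir / "boot_time")
```

<p>Because boot_time is only replaced on grace_done, a crash during the grace period leaves the previous boot_time in place, so clients that had not yet reclaimed remain eligible on the next start.</p>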