mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00
Commit Graph

1234 Commits

Author SHA1 Message Date
Trond Myklebust
536ff0f809 NFSv4: Ensure we don't corrupt fl->fl_flags in nfs4_proc_unlck
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-04-19 16:53:33 -04:00
Trond Myklebust
c1d519312d NFSv4: Only increment the sequence id if the server saw it
It is quite possible that the OPEN, CLOSE, LOCK, LOCKU,... compounds fail
before the actual stateful operation has been executed (for instance in the
PUTFH call). There is no way to tell from the overall status result which
operations were executed from the COMPOUND.

The fix is to move incrementing of the sequence id into the XDR layer,
so that we do it as we process the results from the stateful operation.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-04-19 16:53:15 -04:00
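The sequence-id rule described in c1d519312d can be sketched roughly as follows. This is a minimal illustration, not the kernel's XDR code: `struct op_result`, `struct seqid_counter` and `decode_compound()` are hypothetical names, and the sketch ignores the question of which individual error codes advance the sequence id.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result of one operation inside a COMPOUND reply. */
struct op_result {
    bool stateful; /* true for OPEN, CLOSE, LOCK, LOCKU, ... */
    int  status;   /* 0 on success, nonzero error otherwise */
};

/* Hypothetical stand-in for the client's per-owner sequence counter. */
struct seqid_counter {
    uint32_t value;
};

/*
 * Walk the results in the order the server returned them. Processing
 * stops at the first failing operation, so a failure in an earlier op
 * (such as PUTFH) means the stateful op is never reached and the
 * counter stays untouched.
 */
static void decode_compound(const struct op_result *res, size_t nres,
                            struct seqid_counter *seq)
{
    for (size_t i = 0; i < nres; i++) {
        if (res[i].stateful)
            seq->value++;   /* the server saw this sequence id */
        if (res[i].status != 0)
            break;          /* later ops were not executed */
    }
}
```

The point of doing the increment while decoding, rather than from the overall status, is that the decoder knows exactly which operations appear in the reply.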
Trond Myklebust
35d05778e2 NFSv4: Remove bogus call to nfs4_drop_state_owner() in _nfs4_open_expired()
There should be no need to invalidate a perfectly good state owner just
because of a stale filehandle. Doing so can cause the state recovery code
to break, since nfs4_get_renew_cred() and nfs4_get_setclientid_cred() rely
on finding active state owners.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-04-19 16:53:12 -04:00
Trond Myklebust
98a8e32394 SUNRPC: Add a helper rpcauth_lookup_generic_cred()
The NFSv4 protocol allows clients to negotiate security protocols on the
fly in the case where an administrator on the server changes the export
settings and/or in the case where we may have a filesystem migration event.

Instead of having the NFS client code cache credentials that are tied to a
particular AUTH method it is therefore preferable to have a generic credential
that can be converted into whatever AUTH is in use by the RPC client when
the read/write/sillyrename/... is put on the wire.

We do this by means of the new "generic" credential, which basically just
caches the minimal information that is needed to look up an RPCSEC_GSS,
AUTH_SYS, or AUTH_NULL credential.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-14 13:42:49 -04:00
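The idea behind the "generic" credential in 98a8e32394 might be pictured like this. All names here (`enum sk_flavor`, `struct sk_generic_cred`, `struct sk_wire_cred`, `sk_bind_cred()`) are hypothetical and do not correspond to the SUNRPC API; the sketch only shows that the cached identity is independent of the AUTH flavor chosen at transmit time.

```c
#include <sys/types.h>

/* Hypothetical flavors standing in for AUTH_NULL, AUTH_SYS, RPCSEC_GSS. */
enum sk_flavor { SK_AUTH_NULL, SK_AUTH_SYS, SK_RPCSEC_GSS };

/* The generic credential caches only what is needed for a later lookup. */
struct sk_generic_cred {
    uid_t uid;
    gid_t gid;
};

/* What would be attached to a request once the flavor is known. */
struct sk_wire_cred {
    enum sk_flavor flavor;
    uid_t uid;
    gid_t gid;
};

/*
 * Convert the generic credential into a flavor-specific one at the
 * moment the request is put on the wire. If the flavor is renegotiated
 * later, the same generic credential is simply bound again.
 */
static struct sk_wire_cred sk_bind_cred(const struct sk_generic_cred *gc,
                                        enum sk_flavor flavor)
{
    struct sk_wire_cred wc = { flavor, gc->uid, gc->gid };

    if (flavor == SK_AUTH_NULL) {
        /* The null flavor carries no identity; zero the unused fields. */
        wc.uid = 0;
        wc.gid = 0;
    }
    return wc;
}
```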
Trond Myklebust
5d00837b90 SUNRPC: Run rpc timeout functions as callbacks instead of in softirqs
An audit of the current RPC timeout functions shows that they don't really
ever need to run in the softirq context. As long as the softirq is
able to signal that the wakeup is due to a timeout (which it can do by
setting task->tk_status to -ETIMEDOUT) then the callback functions can just
run as standard task->tk_callback functions (in the rpciod/process
context).

The only possible border-line case would be xprt_timer() for the case of
UDP, when the callback is used to reduce the size of the transport
congestion window. In testing, however, the effect of moving that update
to a callback would appear to be minor.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-25 21:40:44 -08:00
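The split described in 5d00837b90 could look roughly like the sketch below. `tk_status` and `tk_callback` are the field names used in the commit message; `struct sk_task`, the `retries` field and the three functions are hypothetical and only model the control flow in user space.

```c
#include <errno.h>
#include <stddef.h>

struct sk_task {
    int tk_status;                         /* reason for the wakeup */
    void (*tk_callback)(struct sk_task *); /* run later, in task context */
    int retries;                           /* hypothetical bookkeeping */
};

/* Timer (softirq-like) side: only record that the wakeup is a timeout. */
static void sk_timer_fired(struct sk_task *t)
{
    t->tk_status = -ETIMEDOUT;
}

/* Example callback: reacts to the timeout outside of the timer context. */
static void sk_on_wakeup(struct sk_task *t)
{
    if (t->tk_status == -ETIMEDOUT)
        t->retries++;
}

/* Task (rpciod/process-like) side: run the pending callback once. */
static void sk_run_callback(struct sk_task *t)
{
    void (*cb)(struct sk_task *) = t->tk_callback;

    if (cb != NULL) {
        t->tk_callback = NULL;
        cb(t);
    }
}
```

Nothing beyond setting the status happens in the timer path, which is the property the commit relies on.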
Trond Myklebust
fda1393938 SUNRPC: Convert users of rpc_wake_up_task to use rpc_wake_up_queued_task
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-25 21:40:42 -08:00
Trond Myklebust
101070ca2f NFS: Ensure that the asynchronous RPC calls complete on nfsiod.
We want to ensure that rpc_call_ops that involve mntput() are run on nfsiod
rather than on rpciod, so that they don't deadlock when the resulting
umount calls rpc_shutdown_client(). Hence we specify that read, write and
commit calls must complete on nfsiod.
Ditto for NFSv4 open, lock, locku and close asynchronous calls.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-25 21:40:37 -08:00
Jan Blunck
4ac9137858 Embed a struct path into struct nameidata instead of nd->{dentry,mnt}
This is the central patch of a cleanup series. In most cases there is no good
reason why someone would want to use a dentry for itself. This series reflects
that fact and embeds a struct path into nameidata.

Together with the other patches of this series
- it enforces the correct order of getting/releasing the reference count on
  <dentry,vfsmount> pairs
- it prepares the VFS for stacking support since it is essential to have a
  struct path in every place where the stack can be traversed
- it reduces the overall code size:

without patch series:
   text    data     bss     dec     hex filename
5321639  858418  715768 6895825  6938d1 vmlinux

with patch series:
   text    data     bss     dec     hex filename
5320026  858418  715768 6894212  693284 vmlinux

This patch:

Switch from nd->{dentry,mnt} to nd->path.{dentry,mnt} everywhere.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix cifs]
[akpm@linux-foundation.org: fix smack]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 21:13:33 -08:00
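The access pattern change in 4ac9137858, from nd->{dentry,mnt} to nd->path.{dentry,mnt}, can be illustrated with simplified user-space stand-ins. The struct bodies and `sk_path_name()` below are assumptions for the sake of the example and are far smaller than the kernel's definitions.

```c
#include <stddef.h>

/* Simplified stand-ins; the real kernel structures hold much more. */
struct dentry   { const char *name; };
struct vfsmount { const char *devname; };

/* A <dentry,vfsmount> pair travels together as one object. */
struct path {
    struct vfsmount *mnt;
    struct dentry   *dentry;
};

/* After the patch, nameidata embeds the pair instead of two pointers. */
struct nameidata {
    struct path path;
};

/* Hypothetical helper: code can now take the pair as a single argument. */
static const char *sk_path_name(const struct path *p)
{
    return p->dentry != NULL ? p->dentry->name : NULL;
}
```

Callers that used to write `nd->dentry` write `nd->path.dentry`, and helpers that need both halves can accept `&nd->path`.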
Linus Torvalds
75659ca0c1 Merge branch 'task_killable' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc
* 'task_killable' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc: (22 commits)
  Remove commented-out code copied from NFS
  NFS: Switch from intr mount option to TASK_KILLABLE
  Add wait_for_completion_killable
  Add wait_event_killable
  Add schedule_timeout_killable
  Use mutex_lock_killable in vfs_readdir
  Add mutex_lock_killable
  Use lock_page_killable
  Add lock_page_killable
  Add fatal_signal_pending
  Add TASK_WAKEKILL
  exit: Use task_is_*
  signal: Use task_is_*
  sched: Use task_contributes_to_load, TASK_ALL and TASK_NORMAL
  ptrace: Use task_is_*
  power: Use task_is_*
  wait: Use TASK_NORMAL
  proc/base.c: Use task_is_*
  proc/array.c: Use TASK_REPORT
  perfmon: Use task_is_*
  ...

Fixed up conflicts in NFS/sunrpc manually..
2008-02-01 11:45:47 +11:00
Trond Myklebust
e6f8107595 NFS: Add an asynchronous delegreturn operation for use in nfs_clear_inode
Otherwise, there is a potential deadlock if the last dput() from an NFSv4
close() or other asynchronous operation leads to nfs_clear_inode calling
the synchronous delegreturn.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:12 -05:00
J. Bruce Fields
3d1c550874 nfs4: allow nfsv4 acls on non-regular-files
The rfc doesn't give any reason it shouldn't be possible to set an
attribute on a non-regular file.  And if the server supports it, then it
shouldn't be up to us to prevent it.

Thanks to Erez for the report and Trond for further analysis.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Tested-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:10 -05:00
Trond Myklebust
69dd716c5f NFSv4: Add socket proto argument to setclientid
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:58 -05:00
Chuck Lever
d4d3c50749 NFS: Enable NFS client to generate CLIENTID strings with IPv6 addresses
We recently added methods to RPC transports that provide string versions of
the remote peer address information.  Convert the NFSv4 SETCLIENTID
procedure to use those methods instead of building the client ID out of
whole cloth.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:51 -05:00
Trond Myklebust
bfc69a4566 NFS: define a function to update nfsi->cache_change_attribute
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:47 -05:00
Trond Myklebust
5138fde011 NFS/SUNRPC: Convert all users of rpc_call_setup()
Replace use of rpc_call_setup() with rpc_init_task(), and in cases where we
need to initialise task->tk_action, with rpc_call_start().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:32 -05:00
Trond Myklebust
bdc7f021f3 NFS: Clean up the (commit|read|write)_setup() callback routines
Move the common code for setting up the nfs_write_data and nfs_read_data
structures into fs/nfs/read.c, fs/nfs/write.c and fs/nfs/direct.c.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:32 -05:00
Trond Myklebust
c970aa85e7 SUNRPC: Clean up rpc_run_task
Make it use the new task initialiser structure instead of acting as a
wrapper.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:30 -05:00
Trond Myklebust
2f74c0a056 NFSv4: Clean up the OPEN/CLOSE serialisation code
Reduce the time spent locking the rpc_sequence structure by queuing the
nfs_seqid only when we are ready to take the lock (when calling
nfs_wait_on_sequence).

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:24 -05:00
Trond Myklebust
e6e21970ba NFSv4: Fix open_to_lock_owner sequenceid allocation...
NFSv4 file locking is currently completely broken since it doesn't respect
the OPEN sequencing when it is given an unconfirmed lock_owner and needs to
do an open_to_lock_owner. Worse: it breaks the sunrpc rules by doing a
GFP_KERNEL allocation inside an rpciod callback.

Fix is to preallocate the open seqid structure in nfs4_alloc_lockdata if we
see that the lock_owner is unconfirmed.
Then, in nfs4_lock_prepare() we wait for either the open_seqid, if
the lock_owner is still unconfirmed, or else fall back to waiting on the
standard lock_seqid.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-03 09:37:17 -05:00
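The choice made in nfs4_lock_prepare() according to e6e21970ba can be sketched as below. The types and `sk_seqid_to_wait_on()` are hypothetical; the sketch assumes the open seqid was preallocated earlier (outside the callback) only when the lock_owner looked unconfirmed.

```c
#include <stdbool.h>
#include <stddef.h>

struct sk_seqid { int id; };

struct sk_lock_owner {
    bool confirmed;
};

struct sk_lockdata {
    struct sk_seqid *open_seqid; /* preallocated, or NULL if not needed */
    struct sk_seqid *lock_seqid; /* always present */
};

/*
 * Wait on the open seqid while the lock_owner is still unconfirmed;
 * otherwise fall back to the standard lock seqid. No allocation happens
 * here, so nothing in this path needs to sleep for memory.
 */
static struct sk_seqid *sk_seqid_to_wait_on(const struct sk_lockdata *d,
                                            const struct sk_lock_owner *lo)
{
    if (d->open_seqid != NULL && !lo->confirmed)
        return d->open_seqid;
    return d->lock_seqid;
}
```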
Trond Myklebust
bb22629ee8 NFSv4: nfs4_open_confirm must not set the open_owner as confirmed on error
RFC3530 states that the open_owner is confirmed if and only if the client
sends an OPEN_CONFIRM request with the appropriate sequence id and stateid
within the lease period.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-03 09:37:17 -05:00
Matthew Wilcox
150030b78a NFS: Switch from intr mount option to TASK_KILLABLE
By using the TASK_KILLABLE infrastructure, we can get rid of the 'intr'
mount option.  We have to use _killable everywhere instead of _interruptible
as we get rid of rpc_clnt_sigmask/sigunmask.

Signed-off-by: Liam R. Howlett <howlett@gmail.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2007-12-06 17:40:25 -05:00
Trond Myklebust
a49c3c7736 NFSv4: Ensure that we wait for the CLOSE request to complete
Otherwise, we do end up breaking close-to-open semantics. We also end up
breaking some of the silly-rename tests in Connectathon on some setups.

Please refer to the bug-report at
	http://bugzilla.linux-nfs.org/show_bug.cgi?id=150

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-19 17:19:25 -04:00
Trond Myklebust
565277f63c NFS: Fix a race in sillyrename
lookup() and sillyrename() can race one another because the sillyrename()
completion cannot take the parent directory's inode->i_mutex since the
latter may be held by whoever is calling dput().

We therefore have little option but to add extra locking to ensure that
nfs_lookup() and nfs_atomic_open() do not race with the sillyrename
completion.
If somebody has looked up the sillyrenamed file in the meantime, we just
transfer the sillydelete information to the new dentry.

Please refer to the bug-report at
	http://bugzilla.linux-nfs.org/show_bug.cgi?id=150

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-19 17:19:16 -04:00
Trond Myklebust
40d2470409 NFS: Fix a connectathon regression in NFSv3 and NFSv4
We're failing basic test6 against Linux servers because they lack a correct
change attribute. The fix is to assume that we always want to invalidate
the readdir caches when we call update_changeattr and/or
nfs_post_op_update_inode on a directory.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:20:47 -04:00
Trond Myklebust
9e08a3c5ae NFS: Use nfs_refresh_inode() in ops that aren't expected to change the inode
nfs_post_op_update_inode() is really only meant to be used if we expect the
inode and its attributes to have changed in some way.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:20:45 -04:00
Trond Myklebust
d75340cc4d NFSv4: Fix nfs_atomic_open() to set the verifier on negative dentries too
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:20:06 -04:00
Trond Myklebust
d4d9cdcb47 NFS: Don't hash the negative dentry when optimising for an O_EXCL open
We don't want to leave an unverified hashed negative dentry if the
exclusive create fails to complete.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:19:27 -04:00
Trond Myklebust
70ca88521f NFS: Fake up 'wcc' attributes to prevent cache invalidation after write
NFSv2 and v4 don't offer weak cache consistency attributes on WRITE calls.
In NFSv3, returning wcc data is optional. In all cases, we want to prevent
the client from invalidating our cached data whenever ->write_done()
attempts to update the inode attributes.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:19:15 -04:00
Trond Myklebust
8850df999c NFS: Fix atime revalidation in read()
NFSv3 will correctly update atime on a read() call, so there is no need to
set the NFS_INO_INVALID_ATIME flag unless the call to nfs_refresh_inode()
fails.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:19:06 -04:00
Trond Myklebust
c481299839 NFS: Fix atime revalidation in readdir()
NFSv3 will correctly update atime on a readdir call, so there is no need to
set the NFS_INO_INVALID_ATIME flag unless the call to nfs_refresh_inode()
fails.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:19:03 -04:00
Trond Myklebust
17cadc9537 NFS: Don't force a dcache revalidation if nfs_wcc_update_inode succeeds
The reason is that if the weak cache consistency update was successful,
then we know that our client must be the only one that changed the
directory, and we've already updated the dcache to reflect the change.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:18:55 -04:00
Trond Myklebust
76b32999df NFSv4: Make NFSv4 ACCESS calls return attributes too...
It doesn't really make sense to cache an access call without also
revalidating the attributes.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:18:38 -04:00
Trond Myklebust
af22f94ae0 NFSv4: Simplify _nfs4_do_access()
Currently, _nfs4_do_access() is just a copy of nfs_do_access() with added
conversion of the open flags into an access mask. This patch merges the
duplicate functionality.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:18:34 -04:00
Trond Myklebust
cd3758e37d NFS: Replace file->private_data with calls to nfs_file_open_context()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:18:31 -04:00
Peter Staubach
4e769b934e 64 bit ino support for NFS client
Hi.

Attached is a patch to modify the NFS client code to support
64 bit ino's, as appropriate for the system and the NFS
protocol version.

The code basically just expands the NFS interfaces for routines
which handle ino's from using ino_t to u64 and then uses the
fileid in the nfs_inode instead of i_ino in the inode.  The
code paths that were updated are in the getattr method and
the readdir methods.

This should be no real change on 64 bit platforms.  Since
the ino_t is an unsigned long, it would already be 64 bits
wide.

    Thanx...

           ps

Signed-off-by: Peter Staubach <staubach@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:15:29 -04:00
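Why 4e769b934e reports the 64-bit fileid rather than i_ino can be seen in a small sketch. `struct sk_inode` models a platform where the inode number field is only 32 bits wide; the struct and both functions are hypothetical and do not show how the kernel derives i_ino.

```c
#include <stdint.h>

/* Hypothetical inode on a platform with a 32-bit inode number field. */
struct sk_inode {
    uint32_t i_ino;  /* narrow, may lose the upper half */
    uint64_t fileid; /* full value as received from the server */
};

/* Store the server's file id; the narrow field keeps the low 32 bits. */
static void sk_set_fileid(struct sk_inode *inode, uint64_t fileid)
{
    inode->fileid = fileid;
    inode->i_ino = (uint32_t)fileid;
}

/* getattr/readdir-style paths report the full 64-bit value. */
static uint64_t sk_getattr_ino(const struct sk_inode *inode)
{
    return inode->fileid;
}
```

Two files whose ids differ only in the upper 32 bits collide in `i_ino` but stay distinct through `sk_getattr_ino()`.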
Trond Myklebust
deee9369b9 NFSv4: Ensure that we pass the correct dentry to nfs4_intent_set_file
This patch fixes an Oops that was reported by Gabriel Barazer.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-09-01 10:14:38 -04:00
Trond Myklebust
65bbf6bdbb NFSv4: Fix a typo in _nfs4_do_open_reclaim
This should fix the following Oops reported by Jeff Garzik:

kernel BUG at fs/nfs/nfs4xdr.c:1040!
invalid opcode: 0000 [1] SMP 
CPU 0 
Modules linked in: nfs lockd sunrpc af_packet
ipv6 cpufreq_ondemand acpi_cpufreq battery floppy nvram sg snd_hda_intel
ata_generic snd_pcm_oss snd_mixer_oss snd_pcm i2c_i801 snd_page_alloc e1000
firewire_ohci ata_piix i2c_core sr_mod cdrom sata_sil ahci libata sd_mod
scsi_mod ext3 jbd ehci_hcd uhci_hcd
Pid: 16353, comm: 10.10.10.1-recl Not tainted 2.6.23-rc3 #1
RIP: 0010:[<ffffffff88240980>] [<ffffffff88240980>] :nfs:encode_open+0x1c0/0x330
RSP: 0018:ffff8100467c5c60  EFLAGS: 00010202
RAX: ffff81000f89b8b8 RBX: 00000000697a6f6d RCX: ffff81000f89b8b8
RDX: 0000000000000004 RSI: 0000000000000004 RDI: ffff8100467c5c80
RBP: ffff8100467c5c80 R08: ffff81000f89bc30 R09: ffff81000f89b83f
R10: 0000000000000001 R11: ffffffff881e79e0 R12: ffff81003cbd1808
R13: ffff81000f89b860 R14: ffff81005fc984e0 R15: ffffffff88240af0
FS:  0000000000000000(0000) GS:ffffffff8052a000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002adb9e51a030 CR3: 000000007ea7e000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process 10.10.10.1-recl (pid: 16353, threadinfo ffff8100467c4000, task ffff8100038ce780)
Stack:  ffff81004aeb6a40 ffff81003cbd1808 ffff81003cbd1808 ffffffff88240b5d
 ffff81000f89b8bc ffff81005fc984e8 ffff81000f89bc30 ffff81005fc984e8
 0000000300000000 0000000000000000 0000000000000000 ffff81003cbd1800
Call Trace:
 [<ffffffff88240b5d>] :nfs:nfs4_xdr_enc_open_noattr+0x6d/0x90
 [<ffffffff881e74b7>] :sunrpc:rpcauth_wrap_req+0x97/0xf0
 [<ffffffff88240af0>] :nfs:nfs4_xdr_enc_open_noattr+0x0/0x90
 [<ffffffff881df57a>] :sunrpc:call_transmit+0x18a/0x290
 [<ffffffff881e5e7b>] :sunrpc:__rpc_execute+0x6b/0x290
 [<ffffffff881dff76>] :sunrpc:rpc_do_run_task+0x76/0xd0
 [<ffffffff882373f6>] :nfs:_nfs4_proc_open+0x76/0x230
 [<ffffffff88237a2e>] :nfs:nfs4_open_recover_helper+0x5e/0xc0
 [<ffffffff88237b74>] :nfs:nfs4_open_recover+0xe4/0x120
 [<ffffffff88238e14>] :nfs:nfs4_open_reclaim+0xa4/0xf0
 [<ffffffff882413c5>] :nfs:nfs4_reclaim_open_state+0x55/0x1b0
 [<ffffffff882417ea>] :nfs:reclaimer+0x2ca/0x390
 [<ffffffff88241520>] :nfs:reclaimer+0x0/0x390
 [<ffffffff8024e59b>] kthread+0x4b/0x80
 [<ffffffff8020cad8>] child_rip+0xa/0x12
 [<ffffffff8024e550>] kthread+0x0/0x80
 [<ffffffff8020cace>] child_rip+0x0/0x12


Code: 0f 0b eb fe 48 89 ef c7 00 00 00 00 02 be 08 00 00 00 e8 79 
RIP  [<ffffffff88240980>] :nfs:encode_open+0x1c0/0x330
 RSP <ffff8100467c5c60>

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-09-01 10:14:37 -04:00
Trond Myklebust
45328c354e NFS: Fix NFSv4 open stateid regressions
Do not allow cached open for O_RDONLY or O_WRONLY unless the file has been
previously opened in these modes.

Also fix the calculation of the mode in nfs4_close_prepare. We should only
issue an OPEN_DOWNGRADE if we're sure that we will still be holding the
correct open modes. This may not be the case if we've been doing delegated
opens.

Finally, there is no need to adjust the open mode bit flags in
nfs4_close_done(): that has already been done in nfs4_close_prepare().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-08-07 15:13:19 -04:00
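The first rule in 45328c354e, that a cached open is only allowed for a mode the file has already been opened in, can be sketched with per-mode counters. The counter names, the `SK_FMODE_*` constants and `sk_can_open_cached()` are assumptions made for this example.

```c
#include <stdbool.h>

#define SK_FMODE_READ  1u
#define SK_FMODE_WRITE 2u

/* Hypothetical per-file open state: one counter per open mode. */
struct sk_state {
    unsigned n_rdonly;
    unsigned n_wronly;
    unsigned n_rdwr;
};

/*
 * A cached open may be used only if this exact mode has been opened
 * before. A read/write open does not stand in for a read-only or
 * write-only request.
 */
static bool sk_can_open_cached(const struct sk_state *st, unsigned mode)
{
    switch (mode) {
    case SK_FMODE_READ:
        return st->n_rdonly != 0;
    case SK_FMODE_WRITE:
        return st->n_wronly != 0;
    case SK_FMODE_READ | SK_FMODE_WRITE:
        return st->n_rdwr != 0;
    default:
        return false;
    }
}
```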
Trond Myklebust
e4eff1a622 SUNRPC: Clean up the sillyrename code
Fix a couple of bugs:
 - Don't rely on the parent dentry still being valid when the call completes.
   Fixes a race with shrink_dcache_for_umount_subtree()

 - Don't remove the file if the filehandle has been labelled as stale.

Fix a couple of inefficiencies:
 - Remove the global list of sillyrenamed files. Instead we can cache the
   sillyrename information in the dentry->d_fsdata
 - Move common code from unlink_setup/unlink_done into fs/nfs/unlink.c

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-19 15:21:39 -04:00
Trond Myklebust
4fdc17b2a7 NFS: Introduce struct nfs_removeargs+nfs_removeres
We need a common structure for setting up an unlink() rpc call in order to
fix the asynchronous unlink code.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-19 15:21:39 -04:00
Trond Myklebust
9936781d01 NFSv4: Try to recover from getfh failures in nfs4_xdr_dec_open
Try harder to recover the open state if the server failed to return a
filehandle.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-19 15:09:03 -04:00
Trond Myklebust
56659e9926 NFSv4: 'constify' lookup arguments.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-19 15:09:03 -04:00
Trond Myklebust
6f220ed5a8 NFSv4: Fix open state recovery
Ensure that opendata->state is always initialised when we do state
recovery.

Ensure that we set the filehandle in the case where we're doing an
"OPEN_CLAIM_PREVIOUS" call due to a server reboot.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-19 15:09:03 -04:00
Frank Filz
137d6acaa6 NFSv4: Make sure unlock is really an unlock when cancelling a lock
I ran into a curious issue when a lock is being canceled. The
cancellation results in a lock request to the vfs layer instead of an
unlock request. This is particularly insidious when the process that
owns the lock is exiting. In that case, sometimes the erroneous lock is
applied AFTER the process has entered zombie state, preventing the lock
from ever being released. Eventually other processes block on the lock
causing a slow degradation of the system. In the 2.6.16 kernel on which this
was investigated, the problem is compounded by the fact that the cl_sem
is held while blocking on the vfs lock, which results in most processes
accessing the nfs file system in question hanging.

In more detail, here is how the situation occurs:

first _nfs4_do_setlk():

static int _nfs4_do_setlk(struct nfs4_state *state, int cmd, struct file_lock *fl, int reclaim)
...
        ret = nfs4_wait_for_completion_rpc_task(task);
        if (ret == 0) {
...
        } else
                data->cancelled = 1;

then nfs4_lock_release():

static void nfs4_lock_release(void *calldata)
...
        if (data->cancelled != 0) {
                struct rpc_task *task;
                task = nfs4_do_unlck(&data->fl, data->ctx, data->lsp,
                                data->arg.lock_seqid);

The problem is the same file_lock that was passed in to _nfs4_do_setlk()
gets passed to nfs4_do_unlck() from nfs4_lock_release(). So the type is
still F_RDLCK or F_WRLCK, not F_UNLCK. At some point, when cancelling the
lock, the type needs to be changed to F_UNLCK. It seemed easiest to do
that in nfs4_do_unlck(), but it could be done in nfs4_lock_release().
The concern I had with doing it there was if something still needed the
original file_lock, though it turns out the original file_lock still
needs to be modified by nfs4_do_unlck() because nfs4_do_unlck() uses the
original file_lock to pass to the vfs layer, and a copy of the original
file_lock for the RPC request.

It seems like the simplest solution is to force all situations where
nfs4_do_unlck() is being used to result in an unlock, so with that in
mind, I made the following change:

Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:49 -04:00
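The change described in 137d6acaa6 is elided from the message above, so the following is only a sketch of the stated idea: force the lock type to F_UNLCK inside the unlock helper so that every caller really performs an unlock. `struct sk_file_lock`, `struct sk_lock_data` and both functions are hypothetical simplifications of the code quoted in the message.

```c
#include <fcntl.h>

/* Hypothetical, heavily reduced versions of the structures involved. */
struct sk_file_lock {
    short fl_type; /* F_RDLCK, F_WRLCK or F_UNLCK */
};

struct sk_lock_data {
    struct sk_file_lock fl;
    int cancelled;
};

/*
 * Whatever type the caller passed in, an unlock helper must produce an
 * unlock. Forcing the type here covers every call site at once.
 */
static void sk_do_unlck(struct sk_file_lock *fl)
{
    fl->fl_type = F_UNLCK;
    /* ... the unlock request would be built and sent from here ... */
}

/* Release path: a cancelled lock request is undone with an unlock. */
static void sk_lock_release(struct sk_lock_data *data)
{
    if (data->cancelled != 0)
        sk_do_unlck(&data->fl);
}
```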
Trond Myklebust
8bda4e4c98 NFSv4: Fix up stateid locking...
We really don't need to grab both the state->so_owner and the
inode->i_lock.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:43 -04:00
Trond Myklebust
1ac7e2fd35 NFSv4: Clean up the callers of nfs4_open_recover_helper()
Rely on nfs4_try_open_cached() when appropriate.

Also fix an RCU violation in _nfs4_do_open_reclaim()

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:43 -04:00
Trond Myklebust
6ee4126890 NFSv4: Don't call OPEN if we already have an open stateid for a file
If we already have a stateid with the correct open mode for a given file,
then we can reuse that stateid instead of re-issuing an OPEN call without
violating the close-to-open caching semantics.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:43 -04:00
Trond Myklebust
aac00a8d0a NFSv4: Check for the existence of a delegation in nfs4_open_prepare()
We should not be calling open() on an inode that has a delegation unless
we're doing a reclaim.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:43 -04:00
Trond Myklebust
3e309914a1 NFSv4: Clean up _nfs4_proc_open()
Use a flag instead of the 'data->rpc_status = -ENOMEM' hack.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:42 -04:00
Trond Myklebust
1b370bc28f NFSv4: Allow nfs4_opendata_to_nfs4_state to return errors.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:42 -04:00
Trond Myklebust
6f43ddccb3 NFSv4: Improve the debugging of bad sequence id errors...
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:42 -04:00
Trond Myklebust
003707c722 NFSv4: Always use the delegation if we have one
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:41 -04:00
Trond Myklebust
0f9f95e0ad NFSv4: Clean up confirmation of sequence ids...
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:41 -04:00
Trond Myklebust
13437e12fb NFSv4: Support recalling delegations by stateid part 2
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:41 -04:00
Trond Myklebust
2ced46c270 NFSv4: Fix up a bug in nfs4_open_recover()
Don't clobber the delegation info...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:40 -04:00
Trond Myklebust
549d6ed5e8 NFSv4: set the delegation in nfs4_opendata_to_nfs4_state
This ensures that nfs4_open_release() and nfs4_open_confirm_release()
can now handle an eventual delegation that was returned with our open.
As such, it fixes a delegation "leak" when the user breaks out of an open
call.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:40 -04:00
Trond Myklebust
1b45c46cf7 NFSv4: Fix atomic open for execute...
Currently we do not check for the FMODE_EXEC flag as we should. For that
particular case, we need to perform an ACCESS call to the server in order
to check that the file is executable.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:40 -04:00
Trond Myklebust
9f958ab885 NFSv4: Reduce the chances of an open_owner identifier collision
Currently we just use a 32-bit counter.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:39 -04:00
Trond Myklebust
4e56e082dd NFSv4: Clean up _nfs4_proc_lookup() vs _nfs4_proc_lookupfh()
They differ only slightly in the arguments they take. Why have they not
been merged?

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:38 -04:00
Trond Myklebust
c6d00e639b NFSv4: Convert struct nfs4_opendata to use struct kref
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:28 -04:00
Jeff Layton
aa53ed541a NFS4: on a O_EXCL OPEN make sure SETATTR sets the fields holding the verifier
The Linux NFS4 client simply skips over the bitmask in an O_EXCL open
call and so it doesn't bother to reset any fields that may be holding
the verifier. This patch has us save the first two words of the bitmask
(which is all the current client has #defines for). The client then
later checks this bitmask and turns on the appropriate flags in the
sattr->ia_verify field for the following SETATTR call.

This patch only currently checks to see if the server used the atime
and mtime slots for the verifier (which is what the Linux server uses
for this). I'm not sure of what other fields the server could
reasonably use, but adding checks for others should be trivial.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:25 -04:00
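The check described in aa53ed541a might be sketched as follows. The bit positions, the `SK_*` flag names and `sk_exclusive_fixup()` are all assumptions for illustration and should not be read as the kernel's definitions; only the shape of the logic (saved two-word bitmask in, extra SETATTR flags out) comes from the message.

```c
#include <stdint.h>

/* Hypothetical bit positions in the second word of the saved bitmask. */
#define SK_WORD1_TIME_ACCESS (1u << 15)
#define SK_WORD1_TIME_MODIFY (1u << 21)

/* Hypothetical flags for the attributes the following SETATTR must send. */
#define SK_ATTR_ATIME (1u << 0)
#define SK_ATTR_MTIME (1u << 1)

/*
 * attrset holds the first two words of the bitmask returned by the
 * exclusive OPEN. If the server used the atime or mtime slot to store
 * the verifier, make sure the SETATTR resets that field.
 */
static unsigned sk_exclusive_fixup(const uint32_t attrset[2], unsigned valid)
{
    if (attrset[1] & SK_WORD1_TIME_ACCESS)
        valid |= SK_ATTR_ATIME;
    if (attrset[1] & SK_WORD1_TIME_MODIFY)
        valid |= SK_ATTR_MTIME;
    return valid;
}
```

Checks for other fields a server might use would be added as further `if` branches of the same form.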
Trond Myklebust
b39e625b6e NFSv4: Clean up nfs4_call_async()
Use rpc_run_task() instead of doing it ourselves.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:24 -04:00
Trond Myklebust
4a35bd41af NFSv4: Ensure that nfs4_do_close() doesn't race with umount
nfs4_do_close() does not currently have any way to ensure that the user
won't attempt to unmount the partition while the asynchronous RPC call
is completing. This again may cause Oopses in nfs_update_inode().

Add a vfsmount argument to nfs4_close_state to ensure that the partition
remains mounted while we're closing the file.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:24 -04:00
Trond Myklebust
ad389da79f NFSv4: Ensure asynchronous open() calls always pin the mountpoint
A number of race conditions may currently ensue if the user presses ^C
and then unmounts the partition while an asynchronous open() is in
progress.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:24 -04:00
Trond Myklebust
539cd03a57 NFSv4: Cleanup: pass the nfs_open_context to open recovery code
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:24 -04:00
Trond Myklebust
88be9f990f NFS: Replace vfsmount and dentry in nfs_open_context with struct path
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:23 -04:00
Trond Myklebust
10afec9081 NFS: Fix some 'sparse' warnings...
 - fs/nfs/dir.c:610:8: warning: symbol 'nfs_llseek_dir' was not declared.
   Should it be static?
 - fs/nfs/dir.c:636:5: warning: symbol 'nfs_fsync_dir' was not declared.
   Should it be static?
 - fs/nfs/write.c:925:19: warning: symbol 'req' shadows an earlier one
 - fs/nfs/write.c:61:6: warning: symbol 'nfs_commit_rcu_free' was not
   declared. Should it be static?
 - fs/nfs/nfs4proc.c:793:5: warning: symbol 'nfs4_recover_expired_lease'
   was not declared. Should it be static?

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-05-14 19:33:46 -04:00
Linus Torvalds
2d56d3c43c Merge branch 'server-cluster-locking-api' of git://linux-nfs.org/~bfields/linux
* 'server-cluster-locking-api' of git://linux-nfs.org/~bfields/linux:
  gfs2: nfs lock support for gfs2
  lockd: add code to handle deferred lock requests
  lockd: always preallocate block in nlmsvc_lock()
  lockd: handle test_lock deferrals
  lockd: pass cookie in nlmsvc_testlock
  lockd: handle fl_grant callbacks
  lockd: save lock state on deferral
  locks: add fl_grant callback for asynchronous lock return
  nfsd4: Convert NFSv4 to new lock interface
  locks: add lock cancel command
  locks: allow {vfs,posix}_lock_file to return conflicting lock
  locks: factor out generic/filesystem switch from setlock code
  locks: factor out generic/filesystem switch from test_lock
  locks: give posix_test_lock same interface as ->lock
  locks: make ->lock release private data before returning in GETLK case
  locks: create posix-to-flock helper functions
  locks: trivial removal of unnecessary parentheses
2007-05-07 12:34:24 -07:00
J. Bruce Fields
70cc6487a4 locks: make ->lock release private data before returning in GETLK case
The file_lock argument to ->lock is used to return the conflicting lock
when found.  There's no reason for the filesystem to return any private
information with this conflicting lock, but nfsv4 is.

Fix nfsv4 client, and modify locks.c to stop calling fl_release_private
for it in this case.

Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Cc: "Trond Myklebust" <Trond.Myklebust@netapp.com>
2007-05-06 17:38:19 -04:00
J. Bruce Fields
08efa202eb NFS4: invalidate cached acl on setacl
The ACL that the server sets may not be exactly the one we set--for
example, it may silently turn off bits that it does not support.  So we
should remove any cached ACL so that any subsequent request for the ACL
will go to the server.

Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-05-02 07:36:09 -07:00
Trond Myklebust
d9bc125caf Merge branch 'master' of /home/trondmy/kernel/linux-2.6/
Conflicts:

	net/sunrpc/auth_gss/gss_krb5_crypto.c
	net/sunrpc/auth_gss/gss_spkm3_token.c
	net/sunrpc/clnt.c

Merge with mainline and fix conflicts.
2007-02-12 22:43:25 -08:00
Arjan van de Ven
92e1d5be91 [PATCH] mark struct inode_operations const 2
Many struct inode_operations in the kernel can be "const".  Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data.  In addition it'll catch accidental writes at compile time to
these shared resources.
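As a rough userspace illustration of the idea (not the kernel code itself), a const-qualified operations table might look like the following. The names `demo_inode_operations`, `demo_getattr`, `demo_iops` and `demo_call_getattr` are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel operations table. */
struct demo_inode_operations {
	int (*getattr)(int ino);
};

/* Trivial operation used only to show a call through the table. */
static int demo_getattr(int ino)
{
	return ino * 2;
}

/*
 * Because the object is const, the toolchain may place it in read-only
 * data, and any assignment to one of its members is a compile-time error.
 */
const struct demo_inode_operations demo_iops = {
	.getattr = demo_getattr,
};

/* Callers only ever need a pointer-to-const. */
int demo_call_getattr(const struct demo_inode_operations *ops, int ino)
{
	assert(ops != NULL && ops->getattr != NULL);
	return ops->getattr(ino);
}
```

With this declaration, a statement such as `demo_iops.getattr = NULL;` is rejected by the compiler instead of silently modifying shared data at run time.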

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:46 -08:00
Trond Myklebust
e148582e10 NFSv4: Add lockdep checks to nfs4_wait_clnt_recover()
Attempt to detect deadlocks due to caller holding locks on clp->cl_sem

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-02-03 15:35:08 -08:00
Trond Myklebust
a6a352e93d NFSv4: Don't start state recovery in nfs4_close_done()
We might not even have any open files at this point...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-02-03 15:35:08 -08:00
Trond Myklebust
8e0969f045 NFS: Remove nfs_readpage_sync()
It makes no sense to maintain 2 parallel systems for reading in pages.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-02-03 15:35:06 -08:00
Trond Myklebust
c228fd3aee NFSv4: Cleanups for fs_locations code.
Start long arduous project...  What the hell is

	struct dentry = {};

all about?

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-02-03 15:35:06 -08:00
Robert P. J. Day
5cbded585d [PATCH] getting rid of all casts of k[cmz]alloc() calls
Run this:

	#!/bin/sh
	for f in $(grep -Erl "\([^\)]*\) *k[cmz]alloc" *) ; do
	  echo "De-casting $f..."
	  perl -pi -e "s/ ?= ?\([^\)]*\) *(k[cmz]alloc) *\(/ = \1\(/" $f
	done

And then go through and reinstate those cases where code is casting pointers
to non-pointers.

And then drop a few hunks which conflicted with outstanding work.

Cc: Russell King <rmk@arm.linux.org.uk>, Ian Molton <spyro@f2s.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Greg KH <greg@kroah.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Paul Fulghum <paulkf@microgate.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Karsten Keil <kkeil@suse.de>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Ian Kent <raven@themaw.net>
Cc: Steven French <sfrench@us.ibm.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Cc: Jaroslav Kysela <perex@suse.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-13 09:05:58 -08:00
Trond Myklebust
200baa2112 NFS: Remove nfs_writepage_sync()
Maintaining two parallel ways of doing synchronous writes is rather
pointless. This patch gets rid of the legacy nfs_writepage_sync(), and
replaces it with the faster asynchronous writes.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:38 -05:00
Frank Filz
cae823c4c0 NFS: Remove use of the Big Kernel Lock around calls to rpc_call_sync
Remove use of the Big Kernel Lock around calls to rpc_call_sync.

Signed-off-by: Frank Filz <ffilz@us.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:31 -05:00
Trond Myklebust
e6b3c4db6f Fix a second potential rpc_wakeup race...
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:25 -05:00
Al Viro
bc4785cd47 [PATCH] nfs: verifier is network-endian
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20 10:26:40 -07:00
Al Viro
0dbb4c6799 [PATCH] xdr annotations: NFS readdir entries
on-the-wire data is big-endian
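For illustration only, a host-independent decode of big-endian wire data could be sketched as below. The helper names `demo_xdr_decode_u32` and `demo_xdr_decode_u64` are hypothetical and are not the kernel's XDR routines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one 32-bit big-endian word, regardless of host byte order. */
uint32_t demo_xdr_decode_u32(const unsigned char *p)
{
	assert(p != NULL);
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* A 64-bit value is two consecutive words, most significant first. */
uint64_t demo_xdr_decode_u64(const unsigned char *p)
{
	return ((uint64_t)demo_xdr_decode_u32(p) << 32) |
	       demo_xdr_decode_u32(p + 4);
}
```

Assembling the value byte by byte avoids any dependence on the host's endianness, which is the property the annotations are meant to let a checker enforce.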

[in large part pulled from Alexey's patch]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20 10:26:40 -07:00
Chuck Lever
b87c0adfea [PATCH] NFS: remove unused check in nfs4_open_revalidate
Coverity spotted a superfluous error check in nfs4_open_revalidate().  Remove
it.

Coverity: #cid 847

Test plan:
Code inspection; another pass through Coverity.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20 10:26:39 -07:00
Trond Myklebust
2066fe89b4 NFSv4: Poll more aggressively when handling NFS4ERR_DELAY
Change the initial retry delay from 1s to 0.1s (and then back off
exponentially).
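A minimal sketch of this kind of exponential backoff, in userspace C, is shown below. The 0.1s starting value comes from the text above; the 15s cap and all `demo_*` names are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_RETRY_MIN_MS 100	/* 0.1s initial delay, per the commit text */
#define DEMO_RETRY_MAX_MS 15000	/* assumed upper bound, not from the text */

/*
 * Return the delay to wait now, and double *timeout_ms so that the
 * next call waits twice as long (up to the cap).
 */
long demo_next_delay(long *timeout_ms)
{
	long delay;

	assert(timeout_ms != NULL);
	if (*timeout_ms <= 0)
		*timeout_ms = DEMO_RETRY_MIN_MS;
	if (*timeout_ms > DEMO_RETRY_MAX_MS)
		*timeout_ms = DEMO_RETRY_MAX_MS;
	delay = *timeout_ms;
	*timeout_ms <<= 1;
	return delay;
}

/* Delay used for the attempt that follows `retries` earlier attempts. */
long demo_delay_after(unsigned int retries)
{
	long timeout = 0, delay = 0;
	unsigned int i;

	for (i = 0; i <= retries; i++)
		delay = demo_next_delay(&timeout);
	return delay;
}
```

Starting small means a briefly busy server is retried quickly, while the doubling keeps a persistently busy server from being flooded.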

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:25:04 -04:00
Trond Myklebust
c514983d8d NFSv4: Handle the condition NFS4ERR_FILE_OPEN
Retry a few times before we give up: the error is usually due to ordering
issues with asynchronous RPC calls.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:25:03 -04:00
Trond Myklebust
6b30954ebb NFSv4: Retry lease recovery if it failed during a synchronous operation.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:25:03 -04:00
Chuck Lever
94a6d75320 NFS: Use cached page as buffer for NFS symlink requests
Now that we have a copy of the symlink path in the page cache, we can pass
a struct page down to the XDR routines instead of a string buffer.

Test plan:
Connectathon, all NFS versions.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:53 -04:00
Chuck Lever
4f390c152b NFS: Fix double d_drop in nfs_instantiate() error path
If the LOOKUP or GETATTR in nfs_instantiate fail, nfs_instantiate will do a
d_drop before returning.  But some callers already do a d_drop in the case
of an error return.  Make certain we do only one d_drop in all error paths.

This issue was introduced because over time, the symlink proc API diverged
slightly from the create/mkdir/mknod proc API.  To prevent other coding
mistakes of this type, change the symlink proc API to be more like
create/mkdir/mknod and move the nfs_instantiate call into the symlink proc
routines so it is used in exactly the same way for create, mkdir, mknod,
and symlink.

Test plan:
Connectathon, all versions of NFS.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:52 -04:00
David Howells
54ceac4515 NFS: Share NFS superblocks per-protocol per-server per-FSID
The attached patch makes NFS share superblocks between mounts from the same
server and FSID over the same protocol.

It does this by creating each superblock with a false root and returning the
real root dentry in the vfsmount presented by get_sb(). The root dentry set
starts off as an anonymous dentry if we don't already have the dentry for its
inode, otherwise it simply returns the dentry we already have.

We may thus end up with several trees of dentries in the superblock, and if at
some later point one of the anonymous tree roots is discovered by normal filesystem
activity to be located in another tree within the superblock, the anonymous
root is named and materialises attached to the second tree at the appropriate
point.

Why do it this way? Why not pass an extra argument to the mount() syscall to
indicate the subpath and then pathwalk from the server root to the desired
directory? You can't guarantee this will work for two reasons:

 (1) The root and intervening nodes may not be accessible to the client.

     With NFS2 and NFS3, for instance, mountd is called on the server to get
     the filehandle for the tip of a path. mountd won't give us handles for
     anything we don't have permission to access, and so we can't set up NFS
     inodes for such nodes, and so can't easily set up dentries (we'd have to
     have ghost inodes or something).

     With this patch we don't actually create dentries until we get handles
     from the server that we can use to set up their inodes, and we don't
     actually bind them into the tree until we know for sure where they go.

 (2) Inaccessible symbolic links.

     If we're asked to mount two exports from the server, eg:

	mount warthog:/warthog/aaa/xxx /mmm
	mount warthog:/warthog/bbb/yyy /nnn

     We may not be able to access anything nearer the root than xxx and yyy,
     but we may find out later that /mmm/www/yyy, say, is actually the same
     directory as the one mounted on /nnn. What we might then find out, for
     example, is that /warthog/bbb was actually a symbolic link to
     /warthog/aaa/xxx/www, but we can't actually determine that by talking to
     the server until /warthog is made available by NFS.

     This would lead to having constructed an erroneous dentry tree which we
     can't easily fix. We can end up with a dentry marked as a directory when
     it should actually be a symlink, or we could end up with an apparently
     hardlinked directory.

     With this patch we need not make assumptions about the type of a dentry
     for which we can't retrieve information, nor need we assume we know its
     place in the grand scheme of things until we actually see that place.

This patch reduces the possibility of aliasing in the inode and page caches for
inodes that may be accessed by more than one NFS export. It also reduces the
number of superblocks required for NFS where there are many NFS exports being
used from a server (home directory server + autofs for example).

This in turn makes it simpler to do local caching of network filesystems, as it
can then be guaranteed that there won't be links from multiple inodes in
separate superblocks to the same cache file.

Obviously, cache aliasing between different levels of NFS protocol could still
be a problem, but at least that gives us another key to use when indexing the
cache.

This patch makes the following changes:

 (1) The server record construction/destruction has been abstracted out into
     its own set of functions to make things easier to get right.  These have
     been moved into fs/nfs/client.c.

     All the code in fs/nfs/client.c has to do with the management of
     connections to servers, and doesn't touch superblocks in any way; the
     remaining code in fs/nfs/super.c has to do with VFS superblock management.

 (2) The sequence of events undertaken by NFS mount is now reordered:

     (a) A volume representation (struct nfs_server) is allocated.

     (b) A server representation (struct nfs_client) is acquired.  This may be
     	 allocated or shared, and is keyed on server address, port and NFS
     	 version.

     (c) If allocated, the client representation is initialised.  The state
     	 member variable of nfs_client is used to prevent a race during
     	 initialisation from two mounts.

     (d) For NFS4 a simple pathwalk is performed, walking from FH to FH to find
     	 the root filehandle for the mount (fs/nfs/getroot.c).  For NFS2/3 we
     	 are given the root FH in advance.

     (e) The volume FSID is probed for on the root FH.

     (f) The volume representation is initialised from the FSINFO record
     	 retrieved on the root FH.

     (g) sget() is called to acquire a superblock.  This may be allocated or
     	 shared, keyed on client pointer and FSID.

     (h) If allocated, the superblock is initialised.

     (i) If the superblock is shared, then the new nfs_server record is
     	 discarded.

     (j) The root dentry for this mount is looked up from the root FH.

     (k) The root dentry for this mount is assigned to the vfsmount.

 (3) nfs_readdir_lookup() creates dentries for each of the entries readdir()
     returns; this function now attaches disconnected trees from alternate
     roots that happen to be discovered attached to a directory being read (in
     the same way nfs_lookup() is made to do for lookup ops).

     The new d_materialise_unique() function is now used to do this, thus
     permitting the whole thing to be done under one set of locks, and thus
     avoiding any race between mount and lookup operations on the same
     directory.

 (4) The client management code uses a new debug facility: NFSDBG_CLIENT which
     is set by echoing 1024 to /proc/net/sunrpc/nfs_debug.

 (5) Clone mounts are now called xdev mounts.

 (6) Use the dentry passed to the statfs() op as the handle for retrieving fs
     statistics rather than the root dentry of the superblock (which is now a
     dummy).
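Step (2)(b) above, acquiring a server representation that is either allocated or shared and keyed on server address, port and NFS version, can be sketched roughly as follows. This is a simplified userspace model with a fixed-size table and no locking; `demo_client` and `demo_get_client` are hypothetical names, not the functions in fs/nfs/client.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_CLIENTS 8

/* Hypothetical, much-reduced stand-in for a per-server client record. */
struct demo_client {
	uint32_t addr;		/* IPv4 address, host order for simplicity */
	uint16_t port;
	unsigned int version;	/* NFS protocol version: 2, 3 or 4 */
	int refcount;
};

static struct demo_client demo_clients[DEMO_MAX_CLIENTS];
static size_t demo_nr_clients;

/* Find a record matching the (address, port, version) key, or allocate one. */
struct demo_client *demo_get_client(uint32_t addr, uint16_t port,
				    unsigned int version)
{
	struct demo_client *c;
	size_t i;

	assert(version >= 2 && version <= 4);
	for (i = 0; i < demo_nr_clients; i++) {
		c = &demo_clients[i];
		if (c->addr == addr && c->port == port &&
		    c->version == version) {
			c->refcount++;	/* shared: just take a reference */
			return c;
		}
	}
	if (demo_nr_clients == DEMO_MAX_CLIENTS)
		return NULL;		/* table full */
	c = &demo_clients[demo_nr_clients++];
	c->addr = addr;
	c->port = port;
	c->version = version;
	c->refcount = 1;
	return c;
}
```

Two mounts from the same server over the same protocol version get the same record, while a different version yields a separate one.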

Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:37 -04:00
David Howells
8fa5c000d7 NFS: Move rpc_ops from nfs_server to nfs_client
Move the rpc_ops from the nfs_server struct to the nfs_client struct as they're
common to all server records of a particular NFS protocol version.

Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:35 -04:00
David Howells
1f163415dc NFS: Make better use of inode* dereferencing macros
Make better use of inode* dereferencing macros to hide dereferencing chains
(including NFS_PROTO and NFS_CLIENT).

Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:35 -04:00
David Howells
509de81116 NFS: Add extra const qualifiers
Add some extra const qualifiers into NFS.

Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:34 -04:00
David Howells
24c8dbbb5f NFS: Generalise the nfs_client structure
Generalise the nfs_client structure by:

 (1) Moving nfs_client to a more general place (nfs_fs_sb.h).

 (2) Renaming its maintenance routines to be non-NFS4 specific.

 (3) Move those maintenance routines to a new non-NFS4 specific file (client.c)
     and move the declarations to internal.h.

 (4) Make nfs_find/get_client() take a full sockaddr_in to include the port
     number (will be required for NFS2/3).

 (5) Make nfs_find/get_client() take the NFS protocol version (again will be
     required to differentiate NFS2, 3 & 4 client records).

Also:

 (6) Make nfs_client construction proceed akin to inodes, marking them as under
     construction and providing a function to indicate completion.

 (7) Make nfs_get_client() wait interruptibly if it finds a client that it can
     share, but that client is currently being constructed.

 (8) Make nfs4_create_client() use (6) and (7) instead of locking cl_sem.

Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:33 -04:00
David Howells
e9326dcab4 NFS: Add a server capabilities NFS RPC op
Add a set_capabilities NFS RPC op so that the server capabilities can be set.

Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:33 -04:00
David Howells
2b3de4411b NFS: Add a lookupfh NFS RPC op
Add a lookup filehandle NFS RPC op so that a file handle can be looked up
without requiring dentries and inodes and other VFS stuff when doing an NFS4
pathwalk during mounting.

Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:32 -04:00
David Howells
7539bbab80 NFS: Rename nfs_server::nfs4_state
Rename nfs_server::nfs4_state to nfs_client as it will be used to represent the
client state for NFS2 and NFS3 also.

Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:32 -04:00
David Howells
adfa6f980b NFS: Rename struct nfs4_client to struct nfs_client
Rename struct nfs4_client to struct nfs_client so that it can become the basis
for a general client record for NFS2 and NFS3 in addition to NFS4.

Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:31 -04:00
Trond Myklebust
76723de0cf NFSv4: Fix incorrect semaphore release in _nfs4_do_open()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-19 11:54:53 -04:00
Trond Myklebust
16b4289c74 NFSv4: Add v4 exception handling for the ACL functions.
This is needed in order to handle any NFS4ERR_DELAY errors that might be
returned by the server. It also ensures that we map the NFSv4 errors before
they are returned to userland.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
(cherry picked from 71c12b3f0abc7501f6ed231a6d17bc9c05a238dc commit)
2006-08-24 15:54:13 -04:00
Trond Myklebust
01c3b861cd NLM,NFSv4: Wait on local locks before we put RPC calls on the wire
Use FL_ACCESS flag to test and/or wait for local locks before we try
requesting a lock from the server

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-07-05 13:13:18 -04:00
Trond Myklebust
42a2d13eee NFSv4: Ensure nfs4_lock_expired() caches delegated locks
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-07-05 13:13:18 -04:00
Trond Myklebust
9b07357490 NLM,NFSv4: Don't put UNLOCK requests on the wire unless we hold a lock
Use the new behaviour of {flock,posix}_file_lock(F_UNLCK) to determine if
we held a lock, and only send the RPC request to the server if this was the
case.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-07-05 13:13:17 -04:00
David Howells
f7b422b17e NFS: Split fs/nfs/inode.c
As fs/nfs/inode.c is rather large, heterogeneous and unwieldy, the attached
patch splits it up into a number of files:

 (*) fs/nfs/inode.c

     Strictly inode specific functions.

 (*) fs/nfs/super.c

     Superblock management functions for NFS and NFS4, normal access, clones
     and referrals.  The NFS4 superblock functions _could_ move out into a
     separate conditionally compiled file, but it's probably not worth it as
     there're so many common bits.

 (*) fs/nfs/namespace.c

     Some namespace-specific functions have been moved here.

 (*) fs/nfs/nfs4namespace.c

     NFS4-specific namespace functions (this could be merged into the previous
     file).  This file is conditionally compiled.

 (*) fs/nfs/internal.h

     Inter-file declarations, plus a few simple utility functions moved from
     fs/nfs/inode.c.

     Additionally, all the in-.c-file externs have been moved here, and those
     files they were moved from now include this file.

For the most part, the functions have not been changed; only some multiplexor
functions have changed significantly.

I've also:

 (*) Added some extra banner comments above some functions.

 (*) Rearranged the function order within the files to be more logical and
     better grouped (IMO), though someone may prefer a different order.

 (*) Reduced the number of #ifdefs in .c files.

 (*) Added missing __init and __exit directives.

Signed-Off-By: David Howells <dhowells@redhat.com>
2006-06-09 09:34:33 -04:00
Manoj Naik
6b97fd3da1 NFSv4: Follow a referral
Respond to a moved error on NFS lookup by setting up the referral.
Note: We don't actually follow the referral during lookup/getattr, but
later when we detect fsid mismatch in inode revalidation (similar to the
processing done for cloning submounts). Referrals will have fake attributes
until they are actually followed or traversed.

Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:29 -04:00
Manoj Naik
830b8e33fe NFSv4: Define an fs_locations bitmap
This is similar to the getattr bitmap, but includes the fs_locations and
mounted_on_fileid attributes. Use this bitmap for encoding in fs_locations
requests.
Note: We can probably do better by requesting locations as part of fsinfo
itself.

Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:25 -04:00
Manoj Naik
361e624f6d NFSv4: GETATTR attributes on referral
Per referral draft, only fs_locations, fsid, and mounted_on_fileid can be
requested in a GETATTR on referrals.

Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:24 -04:00
Manoj Naik
7aaa0b3bd4 NFSv4: convert fs-locations-components to conform to RFC3530
Use component4-style formats for decoding list of servers and pathnames in
fs_locations.

Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:23 -04:00
Trond Myklebust
683b57b435 NFSv4: Implement the fs_locations function call
NFSv4 allows for the fact that filesystems may be replicated across
several servers or that they may be migrated to a backup server in case of
failure of the primary server.
fs_locations is an NFSv4 operation for retrieving information about the
location of migrated and/or replicated filesystems.

Based on an initial implementation by Jiaying Zhang <jiayingz@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:22 -04:00
Trond Myklebust
55a975937d NFS: Ensure the client submounts, when it crosses a server mountpoint.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:19 -04:00
Trond Myklebust
38478b24e3 NFS: More page cache revalidation fixups
Whenever the directory changes, we want to make sure that we always
invalidate its page cache. Fix up update_changeattr() and
nfs_mark_for_revalidate() so that they do so.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:09 -04:00
Trond Myklebust
73a3d07c10 NFS: Clean up inode metadata updates
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:04 -04:00
Trond Myklebust
95cf959b24 VFS: Fix another open intent Oops
If the call to nfs_intent_set_file() fails to open a file in
nfs4_proc_create(), we should return an error.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-04-19 12:43:46 -04:00
J. Bruce Fields
096455a22a NFSv4: Dont list system.nfs4_acl for filesystems that don't support it.
Thanks to Frank Filz for pointing out that we list system.nfs4_acl extended
attribute even on filesystems where we don't actually support nfs4_acl.
This is inconsistent with the e.g. ext3 POSIX ACL behaviour, and seems to
annoy cp.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 23:23:42 -05:00
Trond Myklebust
7a1218a277 SUNRPC: Ensure rpc_call_async() always calls tk_ops->rpc_release()
Currently this will not happen if we exit before rpc_new_task() was called.
Also fix up rpc_run_task() to do the same (for consistency).

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 18:11:10 -05:00
Trond Myklebust
03f28e3a20 NFS: Make nfs_fhget() return appropriate error values
Currently it returns NULL, which usually gets interpreted as ENOMEM. In
fact it can mean a host of issues.
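One common way to return a specific error through a pointer-valued function is to encode a negative errno in the pointer itself, in the style of the kernel's ERR_PTR helpers. The sketch below is a userspace approximation; the `demo_*` helpers and `demo_fhget` are hypothetical, and that the patch uses exactly this scheme is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *demo_err_ptr(long err)
{
	assert(err < 0 && err >= -DEMO_MAX_ERRNO);
	return (void *)(intptr_t)err;
}

/* Recover the errno value from an error pointer. */
static inline long demo_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* Error pointers occupy the top DEMO_MAX_ERRNO addresses. */
static inline int demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

struct demo_inode {
	int ino;
};

static struct demo_inode demo_root = { 1 };

/* Report why the lookup failed instead of returning a bare NULL. */
struct demo_inode *demo_fhget(int fh_valid, int have_memory)
{
	if (!fh_valid)
		return demo_err_ptr(-ESTALE);
	if (!have_memory)
		return demo_err_ptr(-ENOMEM);
	return &demo_root;
}
```

A caller can then distinguish a stale filehandle from memory exhaustion rather than treating every failure as ENOMEM.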

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:48 -05:00
Trond Myklebust
51581f3bf9 NFSv4: SETCLIENTID_CONFIRM should handle NFS4ERR_DELAY/NFS4ERR_RESOURCE
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:47 -05:00
Trond Myklebust
3e4f6290ca NFSv4: Send the delegation stateid for SETATTR calls
In the case where we hold a delegation stateid, use that inside
SETATTR calls.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:46 -05:00
Trond Myklebust
ec06c096ed NFS: Cleanup of NFS read code
Same callback hierarchy inversion as for the NFS write calls. This patch is
not strictly speaking needed by the O_DIRECT code, but avoids confusing
differences between the asynchronous read and write code.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:27 -05:00
Trond Myklebust
788e7a89a0 NFS: Cleanup of NFS write code in preparation for asynchronous o_direct
This patch inverts the callback hierarchy for NFS write calls.

Instead of having the NFSv2/v3/v4-specific code set up the RPC callback
ops, we allow the original caller to do so. This allows for more
flexibility w.r.t. how to set up and tear down the nfs_write_data
structure while still allowing the NFSv3/v4 code to perform error
handling.

The greater flexibility is needed by the asynchronous O_DIRECT code, which
wants to be able to hold on to the original nfs_write_data structures after
the WRITE RPC call has completed in order to be able to replay them if the
COMMIT call determines that the server has rebooted.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:27 -05:00
Chuck Lever
006ea73e5f NFS: add hooks to account for NFSERR_JUKEBOX errors
Make an inode or an nfs_server struct available in the logic that handles
JUKEBOX/DELAY type errors so the NFS client can account for them.

This patch is split out from the main nfs iostat patch to highlight minor
architectural changes required to support this statistic.

Test plan:
None.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:14 -05:00
Trond Myklebust
a162a6b804 NFSv4: Kill braindead gcc warnings
nfs4_open_revalidate: 'res' may be used uninitialized
nfs4_callback_compound: 'hdr_res.nops' may be used uninitialized
			'op_nr' may be used uninitialized
encode_getattr_res: 'savep' may be used uninitialized

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:10 -05:00
Jesper Juhl
c8d149f3db NFS: "const static" vs "static const" in nfs4
My previous "const static" vs "static const" cleanup missed a single case,
patch below takes care of it.

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:07 -05:00
Trond Myklebust
c12e87f465 [PATCH] NFSv4: fix mount segfault on errors returned that are < -1000
It turns out that nfs4_proc_get_root() may return raw NFSv4 errors instead of
mapping them to kernel errors.  Problem spotted by Neil Horman
<nhorman@tuxdriver.com>
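A sketch of what such an error-mapping step might look like is given below. The function name `demo_map_errors` is hypothetical, and collapsing every protocol-specific value to -EIO is an assumption chosen for the example, not necessarily what the patch does.

```c
#include <assert.h>
#include <errno.h>

/*
 * Values below -1000 are treated as raw, negated NFSv4 status codes
 * that generic kernel code and userland would not understand, so they
 * are replaced by an ordinary errno. Everything else passes through.
 */
int demo_map_errors(int err)
{
	assert(err <= 0);
	if (err < -1000)
		return -EIO;
	return err;
}
```

For instance, a raw value of -10008 would be turned into -EIO, while -ENOENT and 0 are returned unchanged.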

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-14 07:57:18 -08:00
Trond Myklebust
fa178f29c0 NFSv4: Ensure DELEGRETURN returns attributes
Upon return of a write delegation, the server will almost always bump the
 change attribute. Ensure that we pick up that change so that we don't
 invalidate our data cache unnecessarily.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:51 -05:00
Trond Myklebust
286d7d6a0c NFSv4: Remove requirement for machine creds for the "setclientid" operation
Use a cred from the nfs4_client->cl_state_owners list.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:47 -05:00
Trond Myklebust
b4454fe1a7 NFSv4: Remove requirement for machine creds for the "renew" operation
In RFC3530, the RENEW operation is allowed to use either

 the same principal, RPC security flavour and (if RPCSEC_GSS), the same
  mechanism and service that was used for SETCLIENTID_CONFIRM

 OR

 Any principal, RPC security flavour and service combination that
 currently has an OPEN file on the server.

 Choose the latter since that doesn't require us to keep credentials for
 the same principal for the entire duration of the mount.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:47 -05:00
Trond Myklebust
58d9714a44 NFSv4: Send RENEW requests to the server only when we're holding state
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:46 -05:00
Trond Myklebust
433fbe4c88 NFSv4: State recovery cleanup
Use wait_on_bit() when waiting for state recovery to complete.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:45 -05:00
Trond Myklebust
26e976a884 NFSv4: OPEN/LOCK/LOCKU/CLOSE will automatically renew the NFSv4 lease
Cut down on the number of unnecessary RENEW requests on the wire.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:45 -05:00
Trond Myklebust
fe650407a8 NFSv4: Make DELEGRETURN an interruptible operation.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:44 -05:00
Trond Myklebust
a5d16a4d09 NFSv4: Convert LOCK rpc call into an asynchronous RPC call
In order to allow users to interrupt/cancel it.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:44 -05:00
Trond Myklebust
911d1aaf26 NFSv4: locking XDR cleanup
Get rid of some unnecessary intermediate structures

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:44 -05:00
Trond Myklebust
864472e9b8 NFSv4: Make open recovery track O_RDWR, O_RDONLY and O_WRONLY correctly
When recovering from a delegation recall or a network partition, we need
 to replay open(O_RDWR), open(O_RDONLY) and open(O_WRONLY) separately.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:43 -05:00
Trond Myklebust
e761692381 NFSv4: Make nfs4_state track O_RDWR, O_RDONLY and O_WRONLY separately
A closer reading of RFC3530 reveals that OPEN_DOWNGRADE must always
 specify access modes that have been the argument of a previous OPEN
 operation.
 IOW: doing OPEN(O_RDWR) and then OPEN_DOWNGRADE(O_WRONLY) is forbidden
 unless the user called OPEN(O_WRONLY)

 In order to fix that, we really need to track the three possible open
 states separately.
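 The rule can be modelled with one counter per access mode, as in this userspace sketch. The structure and function names are hypothetical and are not the actual nfs4_state fields.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>

/* Hypothetical per-file open state: one counter per access mode. */
struct demo_open_state {
	unsigned int n_rdonly;
	unsigned int n_wronly;
	unsigned int n_rdwr;
};

/* Record a successful OPEN with the given access mode. */
void demo_record_open(struct demo_open_state *s, int mode)
{
	assert(s != NULL);
	switch (mode & O_ACCMODE) {
	case O_RDONLY:
		s->n_rdonly++;
		break;
	case O_WRONLY:
		s->n_wronly++;
		break;
	case O_RDWR:
		s->n_rdwr++;
		break;
	}
}

/* A downgrade target is only legal if that exact mode was opened before. */
int demo_can_downgrade(const struct demo_open_state *s, int mode)
{
	assert(s != NULL);
	switch (mode & O_ACCMODE) {
	case O_RDONLY:
		return s->n_rdonly > 0;
	case O_WRONLY:
		return s->n_wronly > 0;
	case O_RDWR:
		return s->n_rdwr > 0;
	default:
		return 0;
	}
}
```

 With only an O_RDWR open recorded, a downgrade to O_WRONLY is refused; once an O_WRONLY open has also been recorded, it is allowed.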

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:43 -05:00
Trond Myklebust
cdd4e68b5f NFSv4: Make open_confirm() asynchronous too
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:42 -05:00
Trond Myklebust
24ac23ab88 NFSv4: Convert open() into an asynchronous RPC call
OPEN is a stateful operation, so we must ensure that it always
 completes. In order to allow users to interrupt the operation,
 we need to make the RPC call asynchronous, and then wait on
 completion (or cancel).

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:42 -05:00
Trond Myklebust
e56e0b78eb NFSv4: Allocate OPEN call RPC arguments using kmalloc()
Cleanup in preparation for making OPEN calls interruptible by the user.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:41 -05:00
Trond Myklebust
06f814a3ad NFSv4: Make locku use the new RPC "wait on completion" interface.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:40 -05:00
Trond Myklebust
4ce70ada1f SUNRPC: Further cleanups
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:40 -05:00
Trond Myklebust
963d8fe533 RPC: Clean up RPC task structure
Shrink the RPC task structure. Instead of storing separate pointers
 for task->tk_exit and task->tk_release, put them in a structure.

 Also pass the user data pointer as a parameter instead of passing it via
 task->tk_calldata. This enables us to nest callbacks.
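 The shape of such a change can be sketched in userspace as follows: the two callback pointers live in one shared, constant structure, and the user data is handed to each callback as an argument. All `demo_*` names are hypothetical and the real RPC task structure is considerably larger.

```c
#include <assert.h>
#include <stddef.h>

struct demo_task;

/* Callbacks grouped in one table, so a task stores a single pointer. */
struct demo_call_ops {
	void (*done)(struct demo_task *task, void *calldata);
	void (*release)(void *calldata);
};

struct demo_task {
	const struct demo_call_ops *ops;
	void *calldata;
};

/* Run the completion callback, then the release callback. */
void demo_task_complete(struct demo_task *task)
{
	assert(task != NULL && task->ops != NULL);
	if (task->ops->done)
		task->ops->done(task, task->calldata);
	if (task->ops->release)
		task->ops->release(task->calldata);
}

/* Example user data and callbacks that simply count invocations. */
struct demo_counter {
	int done;
	int released;
};

static void demo_count_done(struct demo_task *task, void *calldata)
{
	(void)task;
	((struct demo_counter *)calldata)->done++;
}

static void demo_count_release(void *calldata)
{
	((struct demo_counter *)calldata)->released++;
}

static const struct demo_call_ops demo_count_ops = {
	.done = demo_count_done,
	.release = demo_count_release,
};

/* Complete one task and report how often each callback ran. */
int demo_run_once(void)
{
	struct demo_counter c = { 0, 0 };
	struct demo_task task = { &demo_count_ops, &c };

	demo_task_complete(&task);
	return c.done * 10 + c.released;
}
```

 Because each callback receives its data explicitly, a wrapper callback can invoke an inner callback with the inner callback's own data, which is one way to read the nesting mentioned above.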

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:39 -05:00
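The restructuring described in 963d8fe533 can be sketched roughly as follows. This is a simplified userspace illustration, not the kernel's actual definitions: `task`, `task_ops`, `wrapper` and the callback names are all hypothetical. It shows the two callbacks grouped in one structure and the user data passed as a parameter, which is what lets an outer callback forward to an inner one.

```c
#include <stddef.h>

struct task;

/* Hypothetical ops table: both callbacks live in one shared structure. */
struct task_ops {
    void (*done)(struct task *task, void *calldata);
    void (*release)(void *calldata);
};

/* Hypothetical task: one ops pointer instead of two function pointers. */
struct task {
    const struct task_ops *ops;
    void *calldata;
    int status;
};

/* Run the completion callbacks, handing the user data in as a parameter. */
void task_complete(struct task *task, int status)
{
    task->status = status;
    if (task->ops->done)
        task->ops->done(task, task->calldata);
    if (task->ops->release)
        task->ops->release(task->calldata);
}

/* Inner callback: records the task status into an int supplied by the caller. */
static void record_status(struct task *task, void *calldata)
{
    *(int *)calldata = task->status;
}

const struct task_ops record_ops = { .done = record_status, .release = NULL };

/* Outer callback data: wraps another ops table and its own user data. */
struct wrapper {
    const struct task_ops *inner_ops;
    void *inner_data;
    int seen;
};

/* Nesting: do the outer work, then forward with the inner data pointer. */
static void wrapper_done(struct task *task, void *calldata)
{
    struct wrapper *w = calldata;

    w->seen++;
    if (w->inner_ops->done)
        w->inner_ops->done(task, w->inner_data);
}

const struct task_ops wrapper_ops = { .done = wrapper_done, .release = NULL };
```

Since each callback receives its data as an argument rather than reading a field of the task, the wrapper can call the inner callback with a different pointer without touching the task itself.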
Trond Myklebust
3b6efee923 NFSv4: Fix an Oops in the synchronous write path
- Missing initialisation of attribute bitmask in _nfs4_proc_write()
 - On success, _nfs4_proc_write() must return number of bytes written.
 - Missing post_op_update_inode() in _nfs4_proc_write()
 - Missing initialisation of attribute bitmask in _nfs4_proc_commit()
 - Missing post_op_update_inode() in _nfs4_proc_commit()

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-12-03 15:20:21 -05:00
Trond Myklebust
ff6040667a NFSv4: Fix typo in lock caching
When caching locks due to holding a file delegation, we must always
 check against local locks before sending anything to the server.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-11-25 17:11:29 -05:00
Trond Myklebust
6bfc93ef98 NFSv4: Teach NFSv4 to cache locks when we hold a delegation
Now that we have a method of dealing with delegation recalls, actually
 enable the caching of posix and BSD locks.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-11-04 15:39:36 -05:00
Trond Myklebust
888e694c16 NFSv4: Recover locks too when returning a delegation
Delegations allow us to cache posix and BSD locks, however when the
 delegation is recalled, we need to "flush the cache" and send
 the cached LOCK requests to the server.

 This patch sets up the mechanism for doing so.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-11-04 15:38:11 -05:00
Trond Myklebust
2c56617d76 NFSv4: Fix the handling of the error NFS4ERR_OLD_STATEID
Ensure that we retry the failed operation...

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-11-04 15:33:50 -05:00
Trond Myklebust
d530838bfa NFSv4: Fix problem with OPEN_DOWNGRADE
RFC 3530 states that for OPEN_DOWNGRADE "The share_access and share_deny
 bits specified must be exactly equal to the union of the share_access and
 share_deny bits specified for some subset of the OPENs in effect for
 current openowner on the current file."

 Setattr is currently violating the NFSv4 rules for OPEN_DOWNGRADE in that
 it may cause a downgrade from OPEN4_SHARE_ACCESS_BOTH to
 OPEN4_SHARE_ACCESS_WRITE despite the fact that there exists no open file
 with O_WRONLY access mode.

 Fix the problem by replacing nfs4_find_state() with a modified version of
 nfs_find_open_context().

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-11-04 15:33:38 -05:00
Trond Myklebust
4cecb76ff8 NFSv4: Fix a race between open() and close()
We must not remove the nfs4_state structure from the inode open lists
 before we are in sequence lock.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-11-04 15:32:58 -05:00
Trond Myklebust
4f9838c7ec NFSv4: Add post-op attributes to NFSv4 write and commit callbacks.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:44 -04:00
Trond Myklebust
16e429596d NFSv4: Add post-op attributes to nfs4_proc_remove()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:44 -04:00
Trond Myklebust
6caf2c8276 NFSv4: Add post-op attributes to nfs4_proc_rename()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:43 -04:00
Trond Myklebust
91ba2eeec5 NFSv4: Add post-op attributes to nfs4_proc_link()
Optimise attribute revalidation when hardlinking. Add post-op attributes
 for the directory and the original inode.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:42 -04:00
Trond Myklebust
516a6af641 NFS: Add optional post-op getattr instruction to the NFSv4 file close.
"Optional" means that the close call will not fail if the getattr
 at the end of the compound fails.
 If it does succeed, try to refresh inode attributes.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:41 -04:00
Trond Myklebust
56ae19f38f NFSv4: Add directory post-op attributes to the CREATE operations.
Since the directory attributes change every time we CREATE a file,
 we might as well pick up the new directory attributes in the same
 compound.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:40 -04:00
Trond Myklebust
decf491f30 NFS: Don't let nfs_end_data_update() clobber attribute update information
Since we almost always call nfs_end_data_update() after we called
 nfs_refresh_inode(), we now end up marking the inode metadata
 as needing revalidation immediately after having updated it.

 This patch rearranges things so that we mark the inode as needing
 revalidation _before_ we call nfs_refresh_inode() on those operations
 that need it.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:39 -04:00
Trond Myklebust
0e574af1be NFS: Cleanup initialisation of struct nfs_fattr
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:38 -04:00
Trond Myklebust
ec07342828 NFSv4: Fix up locking for nfs4_state_owner
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-20 14:22:47 -07:00
J. Bruce Fields
1d95db8e16 NFSv4: Fix acl buffer size
resp_len is passed in as buffer size to decode routine; make sure it's
 set right in case where userspace provides less than a page's worth of
 buffer.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 23:19:41 -07:00
Trond Myklebust
550f57470c NFSv4: Ensure that we recover from the OPEN + OPEN_CONFIRM BAD_STATEID race
If the server is in the unconfirmed OPEN state for a given open owner
 and receives a second OPEN for the same open owner, it will cancel the
 state of the first request and set up an OPEN_CONFIRM for the second.

 This can cause a race that is discussed in rfc3530 on page 181.

 The following patch allows the client to recover by retrying the
 original open request.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 14:20:21 -07:00
Trond Myklebust
b8e5c4c297 NFSv4: If a delegated open fails, ensure that we return the delegation
Unless of course the open fails due to permission issues.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 14:20:20 -07:00
Trond Myklebust
642ac54923 NFSv4: Return delegations in case we're changing ACLs
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 14:20:19 -07:00
Trond Myklebust
6f926b5ba7 [NFS]: Check that the server returns a valid regular file to our OPEN request
Since it appears that some servers don't...

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 14:20:18 -07:00
Trond Myklebust
02a913a73b NFSv4: Eliminate nfsv4 open race...
Make NFSv4 return the fully initialized file pointer with the
 stateid that it created in the lookup w/intent.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 14:20:17 -07:00
Trond Myklebust
06735b3454 NFSv4: Fix up handling of open_to_lock sequence ids
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 14:20:15 -07:00
Trond Myklebust
faf5f49c2d NFSv4: Make NFS clean up byte range locks asynchronously
Currently we fail to do so if the process was signalled.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 14:20:15 -07:00
Trond Myklebust
0a8838f972 NFSv4: Add missing handling of OPEN_CONFIRM requests on CLAIM_DELEGATE_CUR.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 14:20:14 -07:00
Trond Myklebust
83c9d41e45 NFSv4: Remove nfs4_client->cl_sem from close() path
We no longer need to worry about collisions between close() and the state
 recovery code, since the new close will automatically recheck the
 file state once it is done waiting on its sequence slot.

 Ditto for the nfs4_proc_locku() procedure.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 14:20:13 -07:00
Trond Myklebust
e6dfa553cf NFSv4: Remove obsolete state_owner and lock_owner semaphores
OPEN, CLOSE, etc no longer need these semaphores to ensure ordering of
 requests.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 14:20:13 -07:00
Trond Myklebust
9512135df1 NFSv4: Fix a potential CLOSE race
Once the state_owner and lock_owner semaphores get removed, it will be
 possible for other OPEN requests to reopen the same file if they have
 lower sequence ids than our CLOSE call.
 This patch ensures that we recheck the file state once
 nfs_wait_on_sequence() has completed waiting.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 14:20:12 -07:00
Trond Myklebust
cee54fc944 NFSv4: Add functions to order RPC calls
NFSv4 file state-changing functions such as OPEN, CLOSE, LOCK,... are all
 labelled with "sequence identifiers" in order to prevent the server from
 reordering RPC requests, as this could cause its file state to
 become out of sync with the client.

 Currently the NFS client code enforces this ordering locally using
 semaphores to restrict access to structures until the RPC call is done.
 This, of course, only works with synchronous RPC calls, since the
 user process must first grab the semaphore.
 By dropping semaphores, and instead teaching the RPC engine to hold
 the RPC calls until they are ready to be sent, we can extend this
 process to work nicely with asynchronous RPC calls too.

 This patch adds a new list called "rpc_sequence" that defines the order
 of the RPC calls to be sent. We add one such list for each state_owner.
 When an RPC call is ready to be sent, it checks whether it is at the top of the
 rpc_sequence list. If so, it proceeds. If not, it goes back to sleep,
 and loops until it reaches the top of the list.
 Once the RPC call has completed, it can then bump the sequence id counter,
 and remove itself from the rpc_sequence list, and then wake up the next
 sleeper.

 Note that the state_owner sequence ids and lock_owner sequence ids are
 all indexed to the same rpc_sequence list, so OPEN, LOCK,... requests
 are all ordered w.r.t. each other.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 14:20:12 -07:00
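The ordering scheme described in cee54fc944 can be modelled, very loosely, as a FIFO of waiters in which only the head may send. The sketch below is single-threaded userspace C and leaves out sleeping, wake-ups and locking entirely; `seq_list`, `seq_waiter` and the functions are hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical queue entry for one pending RPC call. */
struct seq_waiter {
    struct seq_waiter *next;
};

/* Hypothetical stand-in for the per-state_owner rpc_sequence list. */
struct seq_list {
    struct seq_waiter *head;
    struct seq_waiter **tail;
    unsigned int counter;          /* sequence id counter */
};

void seq_init(struct seq_list *l)
{
    l->head = NULL;
    l->tail = &l->head;
    l->counter = 0;
}

/* Queue a call in the order in which it must reach the server. */
void seq_enqueue(struct seq_list *l, struct seq_waiter *w)
{
    w->next = NULL;
    *l->tail = w;
    l->tail = &w->next;
}

/* A call may go on the wire only when it is at the top of the list. */
int seq_may_send(const struct seq_list *l, const struct seq_waiter *w)
{
    return l->head == w;
}

/*
 * On completion: bump the sequence id, drop the call from the list and
 * return the next waiter, which a real implementation would wake up.
 */
struct seq_waiter *seq_complete(struct seq_list *l, struct seq_waiter *w)
{
    assert(l->head == w);          /* only the head can have been sent */
    l->head = w->next;
    if (l->head == NULL)
        l->tail = &l->head;
    l->counter++;
    return l->head;
}
```

Queuing OPEN and LOCK requests from the same owner on one such list is what orders them with respect to each other, as the commit message notes.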
Nishanth Aravamudan
041e0e3b19 [PATCH] fs: fix-up schedule_timeout() usage
Use schedule_timeout_{,un}interruptible() instead of
set_current_state()/schedule_timeout() to reduce kernel size.  Also use helper
functions to convert between human time units and jiffies rather than constant
HZ division to avoid rounding errors.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 10:06:36 -07:00
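The rounding problem mentioned in 041e0e3b19 can be shown with plain integer arithmetic. The sketch below is a userspace illustration only: `SKETCH_HZ` is an assumed tick rate of 100, and `ticks_from_ms` is a hypothetical stand-in for a conversion helper, not the kernel's implementation.

```c
/* Assumed tick rate, for illustration only; the kernel's HZ is a build-time setting. */
#define SKETCH_HZ 100UL

/* "Constant HZ division": a delay written as HZ / divisor, truncated. */
unsigned long ticks_hz_div(unsigned long divisor)
{
    return SKETCH_HZ / divisor;
}

/* Convert milliseconds to ticks, rounding up so a nonzero delay never becomes 0. */
unsigned long ticks_from_ms(unsigned long ms)
{
    return (ms * SKETCH_HZ + 999UL) / 1000UL;
}
```

With a tick rate of 100, a one-millisecond delay written as `SKETCH_HZ / 1000` truncates to zero ticks, while the rounding conversion yields one tick; for values that divide evenly, such as 250 ms and `SKETCH_HZ / 4`, the two agree.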
Trond Myklebust
65e4308d25 [PATCH] NFS: Ensure we always update inode->i_mode when doing O_EXCL creates
When the client performs an exclusive create and opens the file for writing,
a Netapp filer will first create the file using the mode 01777. It does this
since an NFSv3/v4 exclusive create cannot immediately set the mode bits.
The 01777 mode then gets put into the inode->i_mode. After the file creation
is successful, we then do a setattr to change the mode to the correct value
(as per the NFS spec).

The problem is that nfs_refresh_inode() no longer updates inode->i_mode, so
the latter retains the 01777 mode. A bit later, the VFS notices this, and calls
remove_suid(). This of course now resets the file mode to inode->i_mode & 0777.
Hey presto, the file mode on the server is now magically changed to 0777. Duh...

Fixes http://bugzilla.linux-nfs.org/show_bug.cgi?id=32

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-16 09:30:58 -07:00
Trond Myklebust
eadf4598e7 [PATCH] NFS: Add debugging code to NFSv4 readdir
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:44 -04:00
Trond Myklebust
8d0a8a9d0e [PATCH] NFSv4: Clean up nfs4 lock state accounting
Ensure that lock owner structures are not released prematurely.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:42 -04:00
Trond Myklebust
08e9eac42e [PATCH] NFSv4: Fix up races in nfs4_proc_setattr()
If we do not hold a valid stateid that is open for writes, there is little
 point in doing an extra open of the file, as the RFC does not appear to
 mandate this...

 Make setattr use the correct stateid if we're holding mandatory byte
 range locks.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:35 -04:00
Trond Myklebust
202b50dc12 [PATCH] NFSv4: Ensure that we propagate NFSv4 state errors to the reclaim code
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:34 -04:00
Andrew Morton
3e9d41543b [PATCH] NFSv4: empty array fix
Older gcc's don't like this.

 fs/nfs/nfs4proc.c:2194: field `data' has incomplete type

 Signed-off-by: Andrew Morton <akpm@osdl.org>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:28 -04:00
Adrian Bunk
b7ef19560f [PATCH] NFSv4: fs/nfs/nfs4proc.c: small simplification
The Coverity checker noticed that such a simplification was possible.

 Signed-off-by: Adrian Bunk <bunk@stusta.de>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:27 -04:00
J. Bruce Fields
e50a1c2e1f [PATCH] NFSv4: client-side caching of NFSv4 ACLs
Add nfs4_acl field to the nfs_inode, and use it to cache acls.  Only cache
 acls of size up to a page.  Also prepare for up to a page of acl data even
 when the user doesn't pass in a buffer, as when they want to get the acl
 length to decide what size buffer to allocate.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:15 -04:00
J. Bruce Fields
4b580ee3dc [PATCH] NFSv4: ACL support for the NFSv4 client: write
Client-side write support for NFSv4 ACLs.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:14 -04:00
J. Bruce Fields
aa1870af92 [PATCH] NFSv4: ACL support for the NFSv4 client: read
Client-side support for NFSv4 ACLs.  Exports the raw xdr code via the
 system.nfs4_acl extended attribute.  It is up to userspace to decode the acl
 (and to provide correctly xdr'd acls on setxattr), and to convert to/from
 POSIX ACLs if desired.

 This patch provides only the read support.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:13 -04:00
J. Bruce Fields
6b3b5496d7 [PATCH] NFSv4: Add {get,set,list}xattr methods for nfs4
Add {get,set,list}xattr methods for nfs4.  The new methods are no-ops, to be
 used by subsequent ACL patch.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:10 -04:00
J. Bruce Fields
92cfc62cb8 [PATCH] NFS: Allow NFS versions to support different sets of inode operations.
ACL support will require supporting additional inode operations in v4
 (getxattr, setxattr, listxattr).  This patch allows different protocol versions
 to support different inode operations by adding a file_inode_ops to the
 nfs_rpc_ops (to match the existing dir_inode_ops).

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:09 -04:00
Trond Myklebust
4ce79717ce [PATCH] NFS: Header file cleanup...
- Move NFSv4 state definitions into a private header file.
 - Clean up gunk in nfs_fs.h

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:06 -04:00
Linus Torvalds
1da177e4c3 Linux-2.6.12-rc2
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!
2005-04-16 15:20:36 -07:00