2
0
mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-09-04 20:19:47 +08:00
Commit Graph

1533 Commits

Author SHA1 Message Date
Andy Adamson
97431204ea NFSv4.1 Use clientid management rpc_clnt for secinfo_no_name
As per RFC 5661 Security Considerations

Commit 4edaa308 "NFS: Use "krb5i" to establish NFSv4 state whenever possible"
uses the nfs_client cl_rpcclient for all clientid management operations.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-08-08 11:46:25 -04:00
Andy Adamson
5ec16a8500 NFSv4.1 Use clientid management rpc_clnt for secinfo
As per RFC 3530 and RFC 5661 Security Considerations

Commit 4edaa308 "NFS: Use "krb5i" to establish NFSv4 state whenever possible"
uses the nfs_client cl_rpcclient for all clientid management operations.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-08-08 11:46:25 -04:00
Trond Myklebust
b72888cb0b NFSv4: Fix up nfs4_proc_lookup_mountpoint
Currently, we do not check the return value of client = rpc_clone_client(),
nor do we shut down the resulting cloned rpc_clnt in the case where a
NFS4ERR_WRONGSEC has caused nfs4_proc_lookup_common() to replace the
original value of 'client' (causing a memory leak).

Fix both issues and simplify the code by moving the call to
rpc_clone_client() until after nfs4_proc_lookup_common() has
done its business.

Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-08-07 20:47:26 -04:00
Chuck Lever
73d8bde5e4 NFS: Never use user credentials for lease renewal
Never try to use a non-UID 0 user credential for lease management,
as that credential can change out from under us.  The server will
block NFSv4 lease recovery with NFS4ERR_CLID_INUSE.

Since the mechanism to acquire a credential for lease management
is now the same for all minor versions, replace the minor version-
specific callout with a single function.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-08-07 13:06:08 -04:00
Trond Myklebust
55b592933b NFSv4: Fix nfs4_init_uniform_client_string for net namespaces
Commit 6f2ea7f2a (NFS: Add nfs4_unique_id boot parameter) introduces a
boot parameter that allows client administrators to set a string
identifier for use by the EXCHANGE_ID and SETCLIENTID arguments in order
to make them more globally unique.

Unfortunately, that uniquifier is no longer globally unique in the presence
of net namespaces, since each container expects to be able to set up their
own lease when mounting a new NFSv4/4.1 partition.
The fix is to add back in the container-specific hostname in addition to
the unique id.

Cc: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-07-24 16:47:04 -04:00
Andy Adamson
1771c5774b NFSv4.1 Use the mount point rpc_clnt for layoutreturn
Should not use the clientid maintenance rpc_clnt.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-07-23 18:18:54 -04:00
Nadav Shemer
cc7936f9ad nfs: fix open(O_RDONLY|O_TRUNC) in NFS4.0
nfs4_proc_setattr removes ATTR_OPEN from sattr->ia_valid, but later
nfs4_do_setattr checks for it

Signed-off-by: Nadav Shemer <nadav@tonian.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-07-23 18:18:53 -04:00
Linus Torvalds
be0c5d8c0b NFS client updates for Linux 3.11
Feature highlights include:
 - Add basic client support for NFSv4.2
 - Add basic client support for Labeled NFS (selinux for NFSv4.2)
 - Fix the use of credentials in NFSv4.1 stateful operations, and
   add support for NFSv4.1 state protection.
 
 Bugfix highlights:
 - Fix another NFSv4 open state recovery race
 - Fix an NFSv4.1 back channel session regression
 - Various rpc_pipefs races
 - Fix another issue with NFSv3 auth negotiation
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.13 (GNU/Linux)
 
 iQIcBAABAgAGBQJR2vsSAAoJEGcL54qWCgDyWBIP/AqlpBBAblxbNQ1Bl/0m1Pdb
 iKH961qgM4U1BzK0svGtHTZqkovpm4o/VbkbKBT5mQ4g6SbbsJ/AsS1plCyfnIZi
 bdnKNJyj6zg0NsAkJ3vKWqd4BTaP+icdSfEIlRKQxAPESewN7b5B3OWgY4KdYmnk
 q5BP25anC1ryxVycSY67ux8S2IKXVSRZeCZv+RO21rvZ2G0bV5y7t8Om28ztxEnU
 RKrHgQHgaaktR7i8QVO0sbiWq3iqLa3GPkUvFLwWGr8PQJtTkYY0QwYSrsV3N4rY
 hYpMRUZFHpZ8UG5YvBT6xyOy/XaGwMGKSfZjB9/YG4QVju+tTy50U1JbTil5PEWY
 GHWYF68aurIeUkXrhSv8AVnOnhir0mISx5ou/SV7p0QoAZ92V6kq+LkPrW520qlc
 z8ILh3j28pN3ZUCIEArcaZhYCt48uO2hwBi5TqevQyyGRsXFGbN1moD5jvHkllft
 Fi0XGuCBdvhrzFRZcsEl+PDq7fT8lXUK2BHe8oR5jz9PhUp+jpEl9m/eg3RsjJjN
 DuxsHye2U4chScdnRtLBQvpFtdINvWX/Gy8Bi7kdE5tsQySvOa+rdwuBc7h88PHC
 +4xI2iX3z4O1+GpsAe/T9+pjW689jEilS+eVDRVEGl6yHGn9q8PYOayjPjwbJHxS
 R2mLTRhKu1DKguTzO13f
 =wGjn
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client updates from Trond Myklebust:
 "Feature highlights include:
   - Add basic client support for NFSv4.2
   - Add basic client support for Labeled NFS (selinux for NFSv4.2)
   - Fix the use of credentials in NFSv4.1 stateful operations, and add
     support for NFSv4.1 state protection.

  Bugfix highlights:
   - Fix another NFSv4 open state recovery race
   - Fix an NFSv4.1 back channel session regression
   - Various rpc_pipefs races
   - Fix another issue with NFSv3 auth negotiation

  Please note that Labeled NFS does require some additional support from
  the security subsystem.  The relevant changesets have all been
  reviewed and acked by James Morris."

* tag 'nfs-for-3.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (54 commits)
  NFS: Set NFS_CS_MIGRATION for NFSv4 mounts
  NFSv4.1 Refactor nfs4_init_session and nfs4_init_channel_attrs
  nfs: have NFSv3 try server-specified auth flavors in turn
  nfs: have nfs_mount fake up a auth_flavs list when the server didn't provide it
  nfs: move server_authlist into nfs_try_mount_request
  nfs: refactor "need_mount" code out of nfs_try_mount
  SUNRPC: PipeFS MOUNT notification optimization for dying clients
  SUNRPC: split client creation routine into setup and registration
  SUNRPC: fix races on PipeFS UMOUNT notifications
  SUNRPC: fix races on PipeFS MOUNT notifications
  NFSv4.1 use pnfs_device maxcount for the objectlayout gdia_maxcount
  NFSv4.1 use pnfs_device maxcount for the blocklayout gdia_maxcount
  NFSv4.1 Fix gdia_maxcount calculation to fit in ca_maxresponsesize
  NFS: Improve legacy idmapping fallback
  NFSv4.1 end back channel session draining
  NFS: Apply v4.1 capabilities to v4.2
  NFSv4.1: Clean up layout segment comparison helper names
  NFSv4.1: layout segment comparison helpers should take 'const' parameters
  NFSv4: Move the DNS resolver into the NFSv4 module
  rpc_pipefs: only set rpc_dentry_ops if d_op isn't already set
  ...
2013-07-09 12:09:43 -07:00
Trond Myklebust
959d921f5e Merge branch 'labeled-nfs' into linux-next
* labeled-nfs:
  NFS: Apply v4.1 capabilities to v4.2
  NFS: Add in v4.2 callback operation
  NFS: Make callbacks minor version generic
  Kconfig: Add Kconfig entry for Labeled NFS V4 client
  NFS: Extend NFS xattr handlers to accept the security namespace
  NFS: Client implementation of Labeled-NFS
  NFS: Add label lifecycle management
  NFS:Add labels to client function prototypes
  NFSv4: Extend fattr bitmaps to support all 3 words
  NFSv4: Introduce new label structure
  NFSv4: Add label recommended attribute and NFSv4 flags
  NFSv4.2: Added NFS v4.2 support to the NFS client
  SELinux: Add new labeling type native labels
  LSM: Add flags field to security_sb_set_mnt_opts for in kernel mount data.
  Security: Add Hook to test if the particular xattr is part of a MAC model.
  Security: Add hook to calculate context based on a negative dentry.
  NFS: Add NFSv4.2 protocol constants

Conflicts:
	fs/nfs/nfs4proc.c
2013-06-28 16:29:51 -04:00
Andy Adamson
18aad3d552 NFSv4.1 Refactor nfs4_init_session and nfs4_init_channel_attrs
nfs4_init_session was originally written to be called prior to
nfs4_init_channel_attrs, setting the session target_max response and request
sizes that nfs4_init_channel_attrs would pay attention to.

In the current code flow, nfs4_init_session, just like nfs4_init_ds_session
for the data server case, is called after the session is all negotiated, and
is actually used in a RECLAIM COMPLETE call to the server.

Remove the un-needed fc_target_max response and request fields from
nfs4_session and just set the max_resp_sz and max_rqst_sz in
nfs4_init_channel_attrs.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-28 15:55:19 -04:00
Rafael J. Wysocki
207bc1181b Merge branch 'freezer'
* freezer:
  af_unix: use freezable blocking calls in read
  sigtimedwait: use freezable blocking call
  nanosleep: use freezable blocking call
  futex: use freezable blocking call
  select: use freezable blocking call
  epoll: use freezable blocking call
  binder: use freezable blocking calls
  freezer: add new freezable helpers using freezer_do_not_count()
  freezer: convert freezable helpers to static inline where possible
  freezer: convert freezable helpers to freezer_do_not_count()
  freezer: skip waking up tasks with PF_FREEZER_SKIP set
  freezer: shorten freezer sleep time using exponential backoff
  lockdep: check that no locks held at freeze time
  lockdep: remove task argument from debug_check_no_locks_held
  freezer: add unsafe versions of freezable helpers for CIFS
  freezer: add unsafe versions of freezable helpers for NFS
2013-06-28 13:00:53 +02:00
Bryan Schumaker
7017310ad7 NFS: Apply v4.1 capabilities to v4.2
This fixes POSIX locks and possibly a few other v4.2 features, like
readdir plus.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-19 13:55:43 -04:00
Djalal Harouni
fe2d5395c4 NFSv4: SETCLIENTID add the format string for the NETID
Make sure that NFSv4 SETCLIENTID does not parse the NETID as a
format string.

Signed-off-by: Djalal Harouni <tixxdz@opendz.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-18 13:45:01 -04:00
David Quigley
c9bccef6b9 NFS: Extend NFS xattr handlers to accept the security namespace
The existing NFSv4 xattr handlers do not accept xattr calls to the security
namespace. This patch extends these handlers to accept xattrs from the security
namespace in addition to the default NFSv4 ACL namespace.

Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Matthew N. Dodd <Matthew.Dodd@sparta.com>
Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg>
Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg>
Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-08 16:20:17 -04:00
David Quigley
aa9c266962 NFS: Client implementation of Labeled-NFS
This patch implements the client transport and handling support for labeled
NFS. The patch adds two functions to encode and decode the security label
recommended attribute which makes use of the LSM hooks added earlier. It also
adds code to grab the label from the file attribute structures and encode the
label to be sent back to the server.

Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Matthew N. Dodd <Matthew.Dodd@sparta.com>
Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg>
Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg>
Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg>
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-08 16:20:16 -04:00
David Quigley
14c43f7678 NFS: Add label lifecycle management
This patch adds the lifecycle management for the security label structure
introduced in an earlier patch. The label is not used yet but allocations and
freeing of the structure is handled.

Signed-off-by: Matthew N. Dodd <Matthew.Dodd@sparta.com>
Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg>
Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg>
Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg>
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-08 16:20:15 -04:00
David Quigley
1775fd3e80 NFS:Add labels to client function prototypes
After looking at all of the nfsv4 operations the label structure has been added
to the prototypes of the functions which can transmit label data.

Signed-off-by: Matthew N. Dodd <Matthew.Dodd@sparta.com>
Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg>
Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg>
Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg>
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-08 16:20:15 -04:00
David Quigley
a09df2ca23 NFSv4: Extend fattr bitmaps to support all 3 words
The fattr handling bitmap code only uses the first two fattr words sofar. This
patch adds the 3rd word to being sent but doesn't populate it yet.

Signed-off-by: Miguel Rodel Felipe <Rodel_FM@dsi.a-star.edu.sg>
Signed-off-by: Phua Eu Gene <PHUA_Eu_Gene@dsi.a-star.edu.sg>
Signed-off-by: Khin Mi Mi Aung <Mi_Mi_AUNG@dsi.a-star.edu.sg>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-08 16:20:14 -04:00
Steve Dickson
42c2c4249c NFSv4.2: Added NFS v4.2 support to the NFS client
This enable NFSv4.2 support. To enable this code the
CONFIG_NFS_V4_2 Kconfig define needs to be set and
the -o v4.2 mount option need to be used.

Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-08 16:20:13 -04:00
Trond Myklebust
c45ffdd269 NFSv4: Close another NFSv4 recovery race
State recovery currently relies on being able to find a valid
nfs_open_context in the inode->open_files list.
We therefore need to put the nfs_open_context on the list while
we're still protected by the sp->so_reclaim_seqcount in order
to avoid reboot races.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-06 16:24:44 -04:00
Trond Myklebust
275bb30786 NFSv4: Move dentry instantiation into the NFSv4-specific atomic open code
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-06 16:24:43 -04:00
Trond Myklebust
3efb972247 NFSv4: Refactor _nfs4_open_and_get_state to set ctx->state
Instead of having the callers set ctx->state, do it inside
_nfs4_open_and_get_state.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-06 16:24:42 -04:00
Trond Myklebust
4197a055eb NFSv4: Cleanup: pass the nfs_open_context to nfs4_do_open
All the callers have an open_context at this point, and since we always
need one in order to do state recovery, it makes sense to use it as the
basis for the nfs4_do_open() call.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-06 16:24:42 -04:00
Trond Myklebust
4f0b429df1 NFSv4.1: Enable state protection
Use the EXCHGID4_FLAG_BIND_PRINC_STATEID exchange_id flag to enable
stateid protection. This means that if we create a stateid using a
particular principal, then we must use the same principal if we
want to change that state.
IOW: if we OPEN a file using a particular credential, then we have
to use the same credential in subsequent OPEN_DOWNGRADE, CLOSE,
or DELEGRETURN operations that use that stateid.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-06 16:24:37 -04:00
Trond Myklebust
cd5875fefe NFSv4.1: Use layout credentials for get_deviceinfo calls
This is not strictly needed, since get_deviceinfo is not allowed to
return NFS4ERR_ACCESS or NFS4ERR_WRONG_CRED, but lets do it anyway
for consistency with other pNFS operations.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-06 16:24:37 -04:00
Trond Myklebust
ab7cb0dfab NFSv4.1: Ensure that test_stateid and free_stateid use correct credentials
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-06 16:24:36 -04:00
Trond Myklebust
965e9c23de NFSv4.1: Ensure that reclaim_complete uses the right credential
We want to use the same credential for reclaim_complete as we used
for the exchange_id call.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-06 16:24:35 -04:00
Trond Myklebust
9556000d8c NFSv4.1: Ensure that layoutreturn uses the correct credential
We need to use the same credential as was used for the layoutget
and/or layoutcommit operations.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-06 16:24:35 -04:00
Trond Myklebust
6ab59344d9 NFSv4.1: Ensure that layoutget is called using the layout credential
Ensure that we use the same credential for layoutget, layoutcommit and
layoutreturn.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-06-06 16:24:34 -04:00
Trond Myklebust
f448badd34 NFSv4: Fix a thinko in nfs4_try_open_cached
We need to pass the full open mode flags to nfs_may_open() when doing
a delegated open.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2013-05-29 16:03:23 -04:00
Andy Adamson
774d5f14ee NFSv4.1 Fix a pNFS session draining deadlock
On a CB_RECALL the callback service thread flushes the inode using
filemap_flush prior to scheduling the state manager thread to return the
delegation. When pNFS is used and I/O has not yet gone to the data server
servicing the inode, a LAYOUTGET can preceed the I/O. Unlike the async
filemap_flush call, the LAYOUTGET must proceed to completion.

If the state manager starts to recover data while the inode flush is sending
the LAYOUTGET, a deadlock occurs as the callback service thread holds the
single callback session slot until the flushing is done which blocks the state
manager thread, and the state manager thread has set the session draining bit
which puts the inode flush LAYOUTGET RPC to sleep on the forechannel slot
table waitq.

Separate the draining of the back channel from the draining of the fore channel
by moving the NFS4_SESSION_DRAINING bit from session scope into the fore
and back slot tables.  Drain the back channel first allowing the LAYOUTGET
call to proceed (and fail) so the callback service thread frees the callback
slot. Then proceed with draining the forechannel.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-05-20 14:20:14 -04:00
Colin Cross
416ad3c9c0 freezer: add unsafe versions of freezable helpers for NFS
NFS calls the freezable helpers with locks held, which is unsafe
and will cause lockdep warnings when 6aa9707 "lockdep: check
that no locks held at freeze time" is reapplied (it was reverted
in dbf520a).  NFS shouldn't be doing this, but it has
long-running syscalls that must hold a lock but also shouldn't
block suspend.  Until NFS freeze handling is rewritten to use a
signal to exit out of the critical section, add new *_unsafe
versions of the helpers that will not run the lockdep test when
6aa9707 is reapplied, and call them from NFS.

In practice the likley result of holding the lock while freezing
is that a second task blocked on the lock will never freeze,
aborting suspend, but it is possible to manufacture a case using
the cgroup freezer, the lock, and the suspend freezer to create
a deadlock.  Silencing the lockdep warning here will allow
problems to be found in other drivers that may have a more
serious deadlock risk, and prevent new problems from being added.

Signed-off-by: Colin Cross <ccross@android.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-05-12 14:16:21 +02:00
Trond Myklebust
c8b2d0bfd3 NFSv4.1: Ensure that we free the lock stateid on the server
This ensures that the server doesn't need to keep huge numbers of
lock stateids waiting around for the final CLOSE.
See section 8.2.4 in RFC5661.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-05-06 17:24:27 -04:00
Trond Myklebust
7c1d5fae4a NFSv4: Convert nfs41_free_stateid to use an asynchronous RPC call
The main reason for doing this is will be to allow for an asynchronous
RPC mode that we can use for freeing lock stateids as per section
8.2.4 of RFC5661.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-05-06 17:24:22 -04:00
Trond Myklebust
c5a2a15f81 NFSv4.x: Fix handling of partially delegated locks
If a NFS client receives a delegation for a file after it has taken
a lock on that file, we can currently end up in a situation where
we mistakenly skip unlocking that file.

The following patch swaps an erroneous check in nfs4_proc_unlck for
whether or not the file has a delegation to one which checks whether
or not we hold a lock stateid for that file.

Reported-by: Chuck Lever <Chuck.Lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org [>=3.7]
Tested-by: Chuck Lever <Chuck.Lever@oracle.com>
2013-05-03 12:18:47 -04:00
Trond Myklebust
721ccfb79b NFSv4: Warn once about servers that incorrectly apply open mode to setattr
Debugging aid to help identify servers that incorrectly apply open mode
checks to setattr requests that are not changing the file size.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-29 11:11:58 -04:00
Trond Myklebust
ee3ae84ef4 NFSv4: Servers should only check SETATTR stateid open mode on size change
The NFSv4 and NFSv4.1 specs are both clear that the server should only check
stateid open mode if a SETATTR specifies the size attribute. If the
open mode is not one that allows writing, then it returns NFS4ERR_OPENMODE.

In the case where the SETATTR is not changing the size, the client will
still pass it the delegation stateid to ensure that the server does not
recall that delegation. In that case, the server should _ignore_ the
delegation open mode, and simply apply standard permission checks.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-29 11:11:39 -04:00
Trond Myklebust
b0212b84fb Merge branch 'bugfixes' into linux-next
Fix up a conflict between the linux-next branch and mainline.
Conflicts:
	fs/nfs/nfs4proc.c
2013-04-23 15:52:14 -04:00
Trond Myklebust
bd1d421abc Merge branch 'rpcsec_gss-from_cel' into linux-next
* rpcsec_gss-from_cel: (21 commits)
  NFS: Retry SETCLIENTID with AUTH_SYS instead of AUTH_NONE
  NFSv4: Don't clear the machine cred when client establish returns EACCES
  NFSv4: Fix issues in nfs4_discover_server_trunking
  NFSv4: Fix the fallback to AUTH_NULL if krb5i is not available
  NFS: Use server-recommended security flavor by default (NFSv3)
  SUNRPC: Don't recognize RPC_AUTH_MAXFLAVOR
  NFS: Use "krb5i" to establish NFSv4 state whenever possible
  NFS: Try AUTH_UNIX when PUTROOTFH gets NFS4ERR_WRONGSEC
  NFS: Use static list of security flavors during root FH lookup recovery
  NFS: Avoid PUTROOTFH when managing leases
  NFS: Clean up nfs4_proc_get_rootfh
  NFS: Handle missing rpc.gssd when looking up root FH
  SUNRPC: Remove EXPORT_SYMBOL_GPL() from GSS mech switch
  SUNRPC: Make gss_mech_get() static
  SUNRPC: Refactor nfsd4_do_encode_secinfo()
  SUNRPC: Consider qop when looking up pseudoflavors
  SUNRPC: Load GSS kernel module by OID
  SUNRPC: Introduce rpcauth_get_pseudoflavor()
  SUNRPC: Define rpcsec_gss_info structure
  NFS: Remove unneeded forward declaration
  ...
2013-04-23 15:40:40 -04:00
Trond Myklebust
bdeca1b76c NFSv4: Don't recheck permissions on open in case of recovery cached open
If we already checked the user access permissions on the original open,
then don't bother checking again on recovery. Doing so can cause a
deadlock with NFSv4.1, since the may_open() operation is not privileged.
Furthermore, we can't report an access permission failure here anyway.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-23 14:52:44 -04:00
Trond Myklebust
cd4c9be2c6 NFSv4.1: Don't do a delegated open for NFS4_OPEN_CLAIM_DELEG_CUR_FH modes
If we're in a delegation recall situation, we can't do a delegated open.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-23 14:46:25 -04:00
Trond Myklebust
8188df1733 NFSv4.1: Use the more efficient open_noattr call for open-by-filehandle
When we're doing open-by-filehandle in NFSv4.1, we shouldn't need to
do the cache consistency revalidation on the directory. It is
therefore more efficient to just use open_noattr, which returns the
file attributes, but not the directory attributes.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-23 14:31:19 -04:00
Trond Myklebust
fd068b200f NFSv4: Ensure that we clear the NFS_OPEN_STATE flag when appropriate
We should always clear it before initiating file recovery.
Also ensure that we clear it after a CLOSE and/or after TEST_STATEID fails.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-22 11:29:51 -04:00
Trond Myklebust
8e472f33b5 NFSv4: Ensure the LOCK call cannot use the delegation stateid
Defensive patch to ensure that we copy the state->open_stateid, which
can never be set to the delegation stateid.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-20 01:39:54 -04:00
Trond Myklebust
92b40e9384 NFSv4: Use the open stateid if the delegation has the wrong mode
Fix nfs4_select_rw_stateid() so that it chooses the open stateid
(or an all-zero stateid) if the delegation does not match the selected
read/write mode.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-20 01:39:42 -04:00
Trond Myklebust
549b19cc9f NFSv4: Record the OPEN create mode used in the nfs4_opendata structure
If we're doing NFSv4.1 against a server that has persistent sessions,
then we should not need to call SETATTR in order to reset the file
attributes immediately after doing an exclusive create.

Note that since the create mode depends on the type of session that
has been negotiated with the server, we should not choose the
mode until after we've got a session slot.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-16 18:58:26 -04:00
Trond Myklebust
b570a975ed NFSv4: Fix handling of revoked delegations by setattr
Currently, _nfs4_do_setattr() will use the delegation stateid if no
writeable open file stateid is available.
If the server revokes that delegation stateid, then the call to
nfs4_handle_exception() will fail to handle the error due to the
lack of a struct nfs4_state, and will just convert the error into
an EIO.

This patch just removes the requirement that we must have a
struct nfs4_state in order to invalidate the delegation and
retry.

Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-12 15:21:15 -04:00
Andy Adamson
b9536ad521 NFSv4 release the sequence id in the return on close case
Otherwise we deadlock if state recovery is initiated while we
sleep.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-11 09:39:53 -04:00
Trond Myklebust
fa332941c0 NFSv4: Fix another potential state manager deadlock
Don't hold the NFSv4 sequence id while we check for open permission.
The call to ACCESS may block due to reboot recovery.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-09 13:19:35 -04:00
Trond Myklebust
bc7a05ca51 NFSv4: Handle timeouts correctly when probing for lease validity
When we send a RENEW or SEQUENCE operation in order to probe if the
lease is still valid, we want it to be able to time out since the
lease we are probing is likely to time out too. Currently, because
we use soft mount semantics for these RPC calls, the return value
is EIO, which causes the state manager to exit with an "unhandled
error" message.
This patch changes the call semantics, so that the RPC layer returns
ETIMEDOUT instead of EIO. We then have the state manager default to
a simple retry instead of exiting.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-08 18:01:59 -04:00
Trond Myklebust
db4f2e637f NFSv4: Clean up delegation recall error handling
Unify the error handling in nfs4_open_delegation_recall and
nfs4_lock_delegation_recall.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-05 17:03:55 -04:00
Trond Myklebust
be76b5b68d NFSv4: Clean up nfs4_open_delegation_recall
Make it symmetric with nfs4_lock_delegation_recall

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-05 17:03:54 -04:00
Trond Myklebust
4a706fa09f NFSv4: Clean up nfs4_lock_delegation_recall
All error cases are handled by the switch() statement, meaning that the
call to nfs4_handle_exception() is unreachable.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-04-05 17:03:54 -04:00
Trond Myklebust
8b6cc4d6f8 NFSv4: Handle NFS4ERR_DELAY and NFS4ERR_GRACE in nfs4_open_delegation_recall
A server shouldn't normally return NFS4ERR_GRACE if the client holds a
delegation, since no conflicting lock reclaims can be granted, however
the spec does not require the server to grant the open in this
instance

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2013-04-05 17:03:53 -04:00
Trond Myklebust
dbb21c25a3 NFSv4: Handle NFS4ERR_DELAY and NFS4ERR_GRACE in nfs4_lock_delegation_recall
A server shouldn't normally return NFS4ERR_GRACE if the client holds a
delegation, since no conflicting lock reclaims can be granted, however
the spec does not require the server to grant the lock in this
instance.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2013-04-05 17:03:53 -04:00
Chuck Lever
c4eafe1135 NFS: Try AUTH_UNIX when PUTROOTFH gets NFS4ERR_WRONGSEC
Most NFSv4 servers implement AUTH_UNIX, and administrators will
prefer this over AUTH_NULL.  It is harmless for our client to try
this flavor in addition to the flavors mandated by RFC 3530/5661.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-29 15:45:09 -04:00
Chuck Lever
9a744ba398 NFS: Use static list of security flavors during root FH lookup recovery
If the Linux NFS client receives an NFS4ERR_WRONGSEC error while
trying to look up an NFS server's root file handle, it retries the
lookup operation with various security flavors to see what flavor
the NFS server will accept for pseudo-fs access.

The list of flavors the client uses during retry consists only of
flavors that are currently registered in the kernel RPC client.
This list may not include any GSS pseudoflavors if auth_rpcgss.ko
has not yet been loaded.

Let's instead use a static list of security flavors that the NFS
standard requires the server to implement (RFC 3530bis, section
3.2.1).  The RPC client should now be able to load support for
these dynamically; if not, they are skipped.

Recovery behavior here is prescribed by RFC 3530bis, section
15.33.5:

> For LOOKUPP, PUTROOTFH and PUTPUBFH, the client will be unable to
> use the SECINFO operation since SECINFO requires a current
> filehandle and none exist for these two [sic] operations.  Therefore,
> the client must iterate through the security triples available at
> the client and reattempt the PUTROOTFH or PUTPUBFH operation.  In
> the unfortunate event none of the MANDATORY security triples are
> supported by the client and server, the client SHOULD try using
> others that support integrity.  Failing that, the client can try
> using AUTH_NONE, but because such forms lack integrity checks,
> this puts the client at risk.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-29 15:44:58 -04:00
Chuck Lever
83ca7f5ab3 NFS: Avoid PUTROOTFH when managing leases
Currently, the compound operation the Linux NFS client sends to the
server to confirm a client ID looks like this:

	{ SETCLIENTID_CONFIRM; PUTROOTFH; GETATTR(lease_time) }

Once the lease is confirmed, it makes sense to know how long before
the client will have to renew it.  And, performing these operations
in the same compound saves a round trip.

Unfortunately, this arrangement assumes that the security flavor
used for establishing a client ID can also be used to access the
server's pseudo-fs.

If the server requires a different security flavor to access its
pseudo-fs than it allowed for the client's SETCLIENTID operation,
the PUTROOTFH in this compound fails with NFS4ERR_WRONGSEC.  Even
though the SETCLIENTID_CONFIRM succeeded, our client's trunking
detection logic interprets the failure of the compound as a failure
by the server to confirm the client ID.

As part of server trunking detection, the client then begins another
SETCLIENTID pass with the same nfs4_client_id.  This fails with
NFS4ERR_CLID_INUSE because the first SETCLIENTID/SETCLIENTID_CONFIRM
already succeeded in confirming that client ID -- it was the
PUTROOTFH operation that caused the SETCLIENTID_CONFIRM compound to
fail.

To address this issue, separate the "establish client ID" step from
the "accessing the server's pseudo-fs root" step.  The first access
of the server's pseudo-fs may require retrying the PUTROOTFH
operation with different security flavors.  This access is done in
nfs4_proc_get_rootfh().

That leaves the matter of how to retrieve the server's lease time.
nfs4_proc_fsinfo() already retrieves the lease time value, though
none of its callers do anything with the retrieved value (nor do
they mark the lease as "renewed").

Note that NFSv4.1 state recovery invokes nfs4_proc_get_lease_time()
using the lease management security flavor.  This may cause some
heartburn if that security flavor isn't the same as the security
flavor the server requires for accessing the pseudo-fs.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-29 15:44:49 -04:00
Chuck Lever
2ed4b95b7e NFS: Clean up nfs4_proc_get_rootfh
The long lines with no vertical white space make this function
difficult for humans to read.  Add a proper documenting comment
while we're here.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-29 15:44:12 -04:00
Chuck Lever
75bc8821bd NFS: Handle missing rpc.gssd when looking up root FH
When rpc.gssd is not running, any NFS operation that needs to use a
GSS security flavor of course does not work.

If looking up a server's root file handle results in an
NFS4ERR_WRONGSEC, nfs4_find_root_sec() is called to try a bunch of
security flavors until one works or all reasonable flavors have
been tried.  When rpc.gssd isn't running, this loop seems to fail
immediately after rpcauth_create() craps out on the first GSS
flavor.

When the rpcauth_create() call in nfs4_lookup_root_sec() fails
because rpc.gssd is not available, nfs4_lookup_root_sec()
unconditionally returns -EIO.  This prevents nfs4_find_root_sec()
from retrying any other flavors; it drops out of its loop and fails
immediately.

Having nfs4_lookup_root_sec() return -EACCES instead allows
nfs4_find_root_sec() to try all flavors in its list.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-29 15:43:55 -04:00
Trond Myklebust
91876b13b8 NFSv4: Fix another reboot recovery race
If the open_context for the file is not yet fully initialised,
then open recovery cannot succeed, and since nfs4_state_find_open_context
returns an ENOENT, we end up treating the file as being irrecoverable.

What we really want to do, is just defer the recovery until later.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-28 16:22:16 -04:00
Trond Myklebust
6e3cf24152 NFSv4: Add a mapping for NFS4ERR_FILE_OPEN in nfs4_map_errors
With unlink is an asynchronous operation in the sillyrename case, it
expects nfs4_async_handle_error() to map the error correctly.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-27 12:44:40 -04:00
Trond Myklebust
ccb46e2063 NFSv4.1: Use CLAIM_DELEG_CUR_FH opens when available
Now that we do CLAIM_FH opens, we may run into situations where we
get a delegation but don't have perfect knowledge of the file path.
When returning the delegation, we might therefore not be able to
us CLAIM_DELEGATE_CUR opens to convert the delegation into OPEN
stateids and locks.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-25 12:04:11 -04:00
Trond Myklebust
49f9a0fafd NFSv4.1: Enable open-by-filehandle
Sometimes, we actually _want_ to do open-by-filehandle, for instance
when recovering opens after a network partition, or when called
from nfs4_file_open.
Enable that functionality using a new capability NFS_CAP_ATOMIC_OPEN_V1,
and which is only enabled for NFSv4.1 servers that support it.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-25 12:04:11 -04:00
Trond Myklebust
4a1c089345 NFSv4: Clean up nfs4_opendata_alloc in preparation for NFSv4.1 open modes
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-25 12:04:11 -04:00
Trond Myklebust
3b66486c4c NFSv4.1: Select the "most recent locking state" for read/write/setattr stateids
Follow the practice described in section 8.2.2 of RFC5661: When sending a
read/write or setattr stateid, set the seqid field to zero in order to
signal that the NFS server should apply the most recent locking state.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-25 12:04:11 -04:00
Trond Myklebust
39c6daae70 NFSv4: Prepare for minorversion-specific nfs_server capabilities
Clean up the setting of the nfs_server->caps, by shoving it all
into nfs4_server_common_setup().
Then add an 'initial capabilities' field into struct nfs4_minor_version_ops.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-25 12:04:11 -04:00
Trond Myklebust
5521abfdcf NFSv4: Resend the READ/WRITE RPC call if a stateid change causes an error
Adds logic to ensure that if the server returns a BAD_STATEID,
or other state related error, then we check if the stateid has
already changed. If it has, then rather than start state recovery,
we should just resend the failed RPC call with the new stateid.

Allow nfs4_select_rw_stateid to notify that the stateid is unstable by
having it return -EWOULDBLOCK if an RPC is underway that might change the
stateid.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-25 12:04:10 -04:00
Trond Myklebust
9b20614988 NFSv4: The stateid must remain the same for replayed RPC calls
If we replay a READ or WRITE call, we should not be changing the
stateid. Currently, we may end up doing so, because the stateid
is only selected at xdr encode time.

This patch ensures that we select the stateid after we get an NFSv4.1
session slot, and that we keep that same stateid across retries.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-25 12:04:10 -04:00
Trond Myklebust
5d422301f9 NFSv4: Fail I/O if the state recovery fails irrevocably
If state recovery fails with an ESTALE or a ENOENT, then we shouldn't
keep retrying. Instead, mark the stateid as being invalid and
fail the I/O with an EIO error.
For other operations such as POSIX and BSD file locking, truncate
etc, fail with an EBADF to indicate that this file descriptor is no
longer valid.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-03-25 12:04:10 -04:00
Trond Myklebust
240286725d NFSv4.1: Add a helper pnfs_commit_and_return_layout
In order to be able to safely return the layout in nfs4_proc_setattr,
we need to block new uses of the layout, wait for all outstanding
users of the layout to complete, commit the layout and then return it.

This patch adds a helper in order to do all this safely.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Boaz Harrosh <bharrosh@panasas.com>
2013-03-21 10:31:21 -04:00
Trond Myklebust
a073dbff35 NFSv4.1: Fix a race in pNFS layoutcommit
We need to clear the NFS_LSEG_LAYOUTCOMMIT bits atomically with the
NFS_INO_LAYOUTCOMMIT bit, otherwise we may end up with situations
where the two are out of sync.
The first half of the problem is to ensure that pnfs_layoutcommit_inode
clears the NFS_LSEG_LAYOUTCOMMIT bit through pnfs_list_write_lseg.
We still need to keep the reference to those segments until the RPC call
is finished, so in order to make it clear _where_ those references come
from, we add a helper pnfs_list_write_lseg_done() that cleans up after
pnfs_list_write_lseg.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Benny Halevy <bhalevy@tonian.com>
Cc: stable@vger.kernel.org
2013-03-21 10:31:19 -04:00
Weston Andros Adamson
3000512137 NFSv4.1: LAYOUTGET EDELAY loops timeout to the MDS
The client will currently try LAYOUTGETs forever if a server is returning
NFS4ERR_LAYOUTTRYLATER or NFS4ERR_RECALLCONFLICT - even if the client no
longer needs the layout (ie process killed, unmounted).

This patch uses the DS timeout value (module parameter 'dataserver_timeo'
via rpc layer) to set an upper limit of how long the client tries LATOUTGETs
in this situation.  Once the timeout is reached, IO is redirected to the MDS.

This also changes how the client checks if a layout is on the clp list
to avoid a double list_add.

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-02-28 17:41:35 -08:00
Trond Myklebust
7aa262b522 NFSv4: Fix another open/open_recovery deadlock
If we don't release the open seqid before we wait for state recovery,
then we may end up deadlocking the state recovery thread.
This patch addresses a new deadlock that was introduced by
commit c21443c2c7 (NFSv4: Fix a reboot
recovery race when opening a file)

Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-02-28 16:19:59 -08:00
Weston Andros Adamson
a47970ff78 NFSv4.1: Hold reference to layout hdr in layoutget
This fixes an oops where a LAYOUTGET is in still in the rpciod queue,
but the requesting processes has been killed.  Without this, killing
the process does the final pnfs_put_layout_hdr() and sets NFS_I(inode)->layout
to NULL while the LAYOUTGET rpc task still references it.

Example oops:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000080
IP: [<ffffffffa01bd586>] pnfs_choose_layoutget_stateid+0x37/0xef [nfsv4]
PGD 7365b067 PUD 7365d067 PMD 0
Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
Modules linked in: nfs_layout_nfsv41_files nfsv4 auth_rpcgss nfs lockd sunrpc ipt_MASQUERADE ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle ip6table_filter ip6_tables ppdev e1000 i2c_piix4 i2c_core shpchp parport_pc parport crc32c_intel aesni_intel xts aes_x86_64 lrw gf128mul ablk_helper cryptd mptspi scsi_transport_spi mptscsih mptbase floppy autofs4
CPU 0
Pid: 27, comm: kworker/0:1 Not tainted 3.8.0-dros_cthon2013+ #4 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
RIP: 0010:[<ffffffffa01bd586>]  [<ffffffffa01bd586>] pnfs_choose_layoutget_stateid+0x37/0xef [nfsv4]
RSP: 0018:ffff88007b0c1c88  EFLAGS: 00010246
RAX: ffff88006ed36678 RBX: 0000000000000000 RCX: 0000000ea877e3bc
RDX: ffff88007a729da8 RSI: 0000000000000000 RDI: ffff88007a72b958
RBP: ffff88007b0c1ca8 R08: 0000000000000002 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88007a72b958
R13: ffff88007a729da8 R14: 0000000000000000 R15: ffffffffa011077e
FS:  0000000000000000(0000) GS:ffff88007f600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000080 CR3: 00000000735f8000 CR4: 00000000001407f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kworker/0:1 (pid: 27, threadinfo ffff88007b0c0000, task ffff88007c2fa0c0)
Stack:
 ffff88006fc05388 ffff88007a72b908 ffff88007b240900 ffff88006fc05388
 ffff88007b0c1cd8 ffffffffa01a2170 ffff88007b240900 ffff88007b240900
 ffff88007b240970 ffffffffa011077e ffff88007b0c1ce8 ffffffffa0110791
Call Trace:
 [<ffffffffa01a2170>] nfs4_layoutget_prepare+0x7b/0x92 [nfsv4]
 [<ffffffffa011077e>] ? __rpc_atrun+0x15/0x15 [sunrpc]
 [<ffffffffa0110791>] rpc_prepare_task+0x13/0x15 [sunrpc]

Reported-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Cc: stable@kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-02-25 18:32:59 -08:00
Weston Andros Adamson
085b7a45c6 NFSv4.1: Don't decode skipped layoutgets
layoutget's prepare hook can call rpc_exit with status = NFS4_OK (0).
Because of this, nfs4_proc_layoutget can't depend on a 0 status to mean
that the RPC was successfully sent, received and parsed.

To fix this, use the result's len member to see if parsing took place.

This fixes the following OOPS -- calling xdr_init_decode() with a buffer length
0 doesn't set the stream's 'p' member and ends up using uninitialized memory
in filelayout_decode_layout.

BUG: unable to handle kernel paging request at 0000000000008050
IP: [<ffffffff81282e78>] memcpy+0x18/0x120
PGD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:11.0/0000:02:01.0/irq
CPU 1
Modules linked in: nfs_layout_nfsv41_files nfs lockd fscache auth_rpcgss nfs_acl autofs4 sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log dm_mod ppdev parport_pc parport snd_ens1371 snd_rawmidi snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc e1000 microcode vmware_balloon i2c_piix4 i2c_core sg shpchp ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif pata_acpi ata_generic ata_piix mptspi mptscsih mptbase scsi_transport_spi [last unloaded: speedstep_lib]

Pid: 1665, comm: flush-0:22 Not tainted 2.6.32-356-test-2 #2 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
RIP: 0010:[<ffffffff81282e78>]  [<ffffffff81282e78>] memcpy+0x18/0x120
RSP: 0018:ffff88003dfab588  EFLAGS: 00010206
RAX: ffff88003dc42000 RBX: ffff88003dfab610 RCX: 0000000000000009
RDX: 000000003f807ff0 RSI: 0000000000008050 RDI: ffff88003dc42000
RBP: ffff88003dfab5b0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000080 R12: 0000000000000024
R13: ffff88003dc42000 R14: ffff88003f808030 R15: ffff88003dfab6a0
FS:  0000000000000000(0000) GS:ffff880003420000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000008050 CR3: 000000003bc92000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process flush-0:22 (pid: 1665, threadinfo ffff88003dfaa000, task ffff880037f77540)
Stack:
ffffffffa0398ac1 ffff8800397c5940 ffff88003dfab610 ffff88003dfab6a0
<d> ffff88003dfab5d0 ffff88003dfab680 ffffffffa01c150b ffffea0000d82e70
<d> 000000508116713b 0000000000000000 0000000000000000 0000000000000000
Call Trace:
[<ffffffffa0398ac1>] ? xdr_inline_decode+0xb1/0x120 [sunrpc]
[<ffffffffa01c150b>] filelayout_decode_layout+0xeb/0x350 [nfs_layout_nfsv41_files]
[<ffffffffa01c17fc>] filelayout_alloc_lseg+0x8c/0x3c0 [nfs_layout_nfsv41_files]
[<ffffffff8150e6ce>] ? __wait_on_bit+0x7e/0x90

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2013-02-17 15:24:16 -05:00
Trond Myklebust
c8da19b986 NFSv4.1: Fix an ABBA locking issue with session and state serialisation
Ensure that if nfs_wait_on_sequence() causes our rpc task to wait for
an NFSv4 state serialisation lock, then we also drop the session slot.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2013-02-11 19:04:25 -05:00
Trond Myklebust
c21443c2c7 NFSv4: Fix a reboot recovery race when opening a file
If the server reboots after it has replied to our OPEN, but before we
call nfs4_opendata_to_nfs4_state(), then the reboot recovery thread
will not see a stateid for this open, and so will fail to recover it.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-02-11 15:33:14 -05:00
Trond Myklebust
65b62a29f7 NFSv4: Ensure delegation recall and byte range lock removal don't conflict
Add a mutex to the struct nfs4_state_owner to ensure that delegation
recall doesn't conflict with byte range lock removal.

Note that we nest the new mutex _outside_ the state manager reclaim
protection (nfsi->rwsem) in order to avoid deadlocks.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-02-11 15:33:13 -05:00
Trond Myklebust
37380e4264 NFSv4: Fix up the return values of nfs4_open_delegation_recall
Adjust the return values so that they return EAGAIN to the caller in
cases where we might want to retry the delegation recall after
the state recovery has run.
Note that we can't wait and retry in this routine, because the caller
may be the state manager thread.

If delegation recall fails due to a session or reboot related issue,
also ensure that we mark the stateid as delegated so that
nfs_delegation_claim_opens can find it again later.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-02-11 15:33:13 -05:00
Trond Myklebust
d25be546a8 NFSv4.1: Don't lose locks when a server reboots during delegation return
If the server reboots while we are converting a delegation into
OPEN/LOCK stateids as part of a delegation return, the current code
will simply exit with an error. This causes us to lose both
delegation state and locking state (i.e. locking atomicity).

Deal with this by exposing the delegation stateid during delegation
return, so that we can recover the delegation, and then resume
open/lock recovery.

Note that not having to hold the nfs_inode->rwsem across the
calls to nfs_delegation_claim_opens() also fixes a deadlock against
the NFSv4.1 reboot recovery code.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-02-11 15:33:12 -05:00
Trond Myklebust
9a99af494b NFSv4.1: Prevent deadlocks between state recovery and file locking
We currently have a deadlock in which the state recovery thread
ends up blocking due to one of the locks which it is trying to
recover holding the nfs_inode->rwsem.
The situation is as follows: the state recovery thread is
scheduled in order to recover from a reboot. It immediately
drains the session, forcing all ordinary NFSv4.1 calls to
nfs41_setup_sequence() to be put to sleep.  This includes the
file locking process that holds the nfs_inode->rwsem.
When the thread gets to nfs4_reclaim_locks(), it tries to
grab a write lock on nfs_inode->rwsem, and boom...

Fix is to have the lock drop the nfs_inode->rwsem while it is
doing RPC calls. We use a sequence lock in order to signal to
the locking process whether or not a state recovery thread has
run on that inode, in which case it should retry the lock.

Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-02-11 15:33:12 -05:00
Trond Myklebust
322b2b9032 Revert "NFS: add nfs_sb_deactive_async to avoid deadlock"
This reverts commit 324d003b0c.

The deadlock turned out to be caused by a workqueue limitation that has
now been worked around in the RPC code (see comment in rpc_free_task).

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-02-01 10:13:48 -05:00
Weston Andros Adamson
f8d9a897d4 NFS: Fix access to suid/sgid executables
nfs_open_permission_mask() should only check MAY_EXEC for files that
are opened with __FMODE_EXEC.

Also fix NFSv4 access-in-open path in a similar way -- openflags must be
used because fmode will not always have FMODE_EXEC set.

This patch fixes https://bugzilla.kernel.org/show_bug.cgi?id=49101

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2013-01-03 17:06:27 -05:00
David Howells
de242c0b8b NFS: Use FS-Cache invalidation
Use the new FS-Cache invalidation facility from NFS to deal with foreign
changes being detected on the server rather than attempting to retire the old
cookie and get a new one.

The problem with the old method was that NFS did not wait for all outstanding
storage and retrieval ops on the cache to complete.  There was no automatic
wait between the calls to ->readpages() and calls to invalidate_inode_pages2()
as the latter can only wait on locked pages that have been added to the
pagecache (which they haven't yet on entry to ->readpages()).

This was leading to oopses like the one below when an outstanding read got cut
off from its cookie by a premature release.

BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8
IP: [<ffffffffa0075118>] __fscache_read_or_alloc_pages+0x1dd/0x315 [fscache]
PGD 15889067 PUD 15890067 PMD 0
Oops: 0000 [#1] SMP
CPU 0
Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc

Pid: 4544, comm: tar Not tainted 3.1.0-rc4-fsdevel+ #1064                  /DG965RY
RIP: 0010:[<ffffffffa0075118>]  [<ffffffffa0075118>] __fscache_read_or_alloc_pages+0x1dd/0x315 [fscache]
RSP: 0018:ffff8800158799e8  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8800070d41e0 RCX: ffff8800083dc1b0
RDX: 0000000000000000 RSI: ffff880015879960 RDI: ffff88003e627b90
RBP: ffff880015879a28 R08: 0000000000000002 R09: 0000000000000002
R10: 0000000000000001 R11: ffff880015879950 R12: ffff880015879aa4
R13: 0000000000000000 R14: ffff8800083dc158 R15: ffff880015879be8
FS:  00007f671e9d87c0(0000) GS:ffff88003bc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000000000a8 CR3: 000000001587f000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process tar (pid: 4544, threadinfo ffff880015878000, task ffff880015875040)
Stack:
 ffffffffa00b1759 ffff8800070dc158 ffff8800000213da ffff88002a286508
 ffff880015879aa4 ffff880015879be8 0000000000000001 ffff88002a2866e8
 ffff880015879a88 ffffffffa00b20be 00000000000200da ffff880015875040
Call Trace:
 [<ffffffffa00b1759>] ? nfs_fscache_wait_bit+0xd/0xd [nfs]
 [<ffffffffa00b20be>] __nfs_readpages_from_fscache+0x7e/0x13f [nfs]
 [<ffffffff81095fe7>] ? __alloc_pages_nodemask+0x156/0x662
 [<ffffffffa0098763>] nfs_readpages+0xee/0x187 [nfs]
 [<ffffffff81098a5e>] __do_page_cache_readahead+0x1be/0x267
 [<ffffffff81098942>] ? __do_page_cache_readahead+0xa2/0x267
 [<ffffffff81098d7b>] ra_submit+0x1c/0x20
 [<ffffffff8109900a>] ondemand_readahead+0x28b/0x29a
 [<ffffffff810990ce>] page_cache_sync_readahead+0x38/0x3a
 [<ffffffff81091d8a>] generic_file_aio_read+0x2ab/0x67e
 [<ffffffffa008cfbe>] nfs_file_read+0xa4/0xc9 [nfs]
 [<ffffffff810c22c4>] do_sync_read+0xba/0xfa
 [<ffffffff810a62c9>] ? might_fault+0x4e/0x9e
 [<ffffffff81177a47>] ? security_file_permission+0x7b/0x84
 [<ffffffff810c25dd>] ? rw_verify_area+0xab/0xc8
 [<ffffffff810c29a4>] vfs_read+0xaa/0x13a
 [<ffffffff810c2a79>] sys_read+0x45/0x6c
 [<ffffffff813ac37b>] system_call_fastpath+0x16/0x1b

Reported-by: Mark Moseley <moseleymark@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2012-12-20 22:06:33 +00:00
Trond Myklebust
ac20d163fc NFSv4.1: Deal effectively with interrupted RPC calls.
If an RPC call is interrupted, assume that the server hasn't processed
the RPC call so that the next time we use the slot, we know that if we
get a NFS4ERR_SEQ_MISORDERED or NFS4ERR_SEQ_FALSE_RETRY, we just have
to bump the sequence number.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-15 15:39:59 -05:00
Trond Myklebust
8e63b6a8ad NFSv4.1: Move the RPC timestamp out of the slot.
Shave a few bytes off the slot table size by moving the RPC timestamp
into the sequence results.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-15 15:21:52 -05:00
Trond Myklebust
e879444084 NFSv4.1: Try to deal with NFS4ERR_SEQ_MISORDERED.
If the server returns NFS4ERR_SEQ_MISORDERED, it could be a sign
that the slot was retired at some point. Retry the attempt after
reinitialising the slot sequence number to 1.

Also add a handler for NFS4ERR_SEQ_FALSE_RETRY. Just bump the slot
sequence number and retry...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-15 14:49:09 -05:00
Andy Adamson
eb96d5c97b SUNRPC handle EKEYEXPIRED in call_refreshresult
Currently, when an RPCSEC_GSS context has expired or is non-existent
and the users (Kerberos) credentials have also expired or are non-existent,
the client receives the -EKEYEXPIRED error and tries to refresh the context
forever.  If an application is performing I/O, or other work against the share,
the application hangs, and the user is not prompted to refresh/establish their
credentials. This can result in a denial of service for other users.

Users are expected to manage their Kerberos credential lifetimes to mitigate
this issue.

Move the -EKEYEXPIRED handling into the RPC layer. Try tk_cred_retry number
of times to refresh the gss_context, and then return -EACCES to the application.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-12 15:36:02 -05:00
Trond Myklebust
8556307374 NFSv4.1: Handle NFS4ERR_BADSLOT errors correctly
Most (all) NFS4ERR_BADSLOT errors are due to the client failing to
respect the server's sr_highest_slotid limit. This mainly happens
due to reordered RPC requests.
The way to handle it is simply to drop the slot that we're using,
and retry using the new highest_slotid limits.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-11 10:31:12 -05:00
Trond Myklebust
7ce0171d4f Merge branch 'bugfixes' into nfs-for-next 2012-12-11 09:16:26 -05:00
Sven Wegener
7d3e91a89b NFSv4: Check for buffer length in __nfs4_get_acl_uncached
Commit 1f1ea6c "NFSv4: Fix buffer overflow checking in
__nfs4_get_acl_uncached" accidently dropped the checking for too small
result buffer length.

If someone uses getxattr on "system.nfs4_acl" on an NFSv4 mount
supporting ACLs, the ACL has not been cached and the buffer suplied is
too short, we still copy the complete ACL, resulting in kernel and user
space memory corruption.

Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Cc: stable@kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-11 09:14:50 -05:00
Trond Myklebust
b75ad4cda5 NFSv4.1: Ensure smooth handover of slots from one task to the next waiting
Currently, we see a lot of bouncing for the value of highest_used_slotid
due to the fact that slots are getting freed, instead of getting instantly
transmitted to the next waiting task.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:52 +01:00
Trond Myklebust
1e1093c7fd NFSv4.1: Don't mess with task priorities in nfs41_setup_sequence
We want to preserve the rpc_task priority for things like writebacks,
that may have differing levels of urgency.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:51 +01:00
Bryan Schumaker
104287cd4e NFS: Remove _nfs_call_sync_session
All it does is pass its arguments through to another function.  Let's
cut out the middleman...

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:51 +01:00
Trond Myklebust
8fe72bac8d NFSv4: Clean up handling of privileged operations
Privileged rpc calls are those that are run by the state recovery thread,
in cases where we're trying to recover the system after a server reboot
or a network partition. In those cases, we want to fence off all other
rpc calls (see nfs4_begin_drain_session()) so that they don't end up
using stateids or clientids that are in the process of being recovered.

Prior to this patch, we had to set up special callback functions in
order to declare an rpc call as being privileged.
By adding a new field to the sequence arguments, this patch simplifies
things considerably, and allows us to declare the rpc call as privileged
before it is run.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:50 +01:00
Trond Myklebust
275e7e20aa NFSv4.1: Remove the 'FIFO' behaviour for nfs41_setup_sequence
It is more important to preserve the task priority behaviour, which ensures
that things like reclaim writes take precedence over background and kupdate
writes.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:50 +01:00
Trond Myklebust
7b939a3f44 NFSv4.1: Clean up nfs41_setup_sequence
Move all the sleep-and-exit cases into a single section of code.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:49 +01:00
Trond Myklebust
fd0c09537a NFSv4: Simplify the NFSv4/v4.1 synchronous call switch
We shouldn't need to pass the 'cache_reply' parameter if we
initialise the sequence_args/sequence_res in the caller.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:49 +01:00
Trond Myklebust
d9afbd1b08 NFSv4.1: Simplify the sequence setup
Nobody calls nfs4_setup_sequence or nfs41_setup_sequence without
also calling rpc_call_start() on success. This commit therefore
folds the rpc_call_start call into nfs41_setup_sequence().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:48 +01:00
Trond Myklebust
6ba7db3420 NFSv4.1: Use nfs41_setup_sequence where appropriate
There is no point in using nfs4_setup_sequence or nfs4_sequence_done
in pure NFSv4.1 functions. We already know that those have sessions...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:48 +01:00
Trond Myklebust
c10e449827 NFSv4.1: Ping server when our session table limits are too high
If the server requests a lower target_highest_slotid, then ensure
that we ping it with at least one RPC call containing an
appropriate SEQUENCE op. This ensures that the server won't need to
send a recall callback in order to shrink the slot table.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:47 +01:00
Trond Myklebust
73e39aaa83 NFSv4.1: Cleanup move session slot management to fs/nfs/nfs4session.c
NFSv4.1 session management is getting complex enough to deserve
a separate file.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:45 +01:00
Trond Myklebust
3302127967 NFSv4: Move nfs4_wait_clnt_recover and nfs4_client_recover_expired_lease
nfs4_wait_clnt_recover and nfs4_client_recover_expired_lease are both
generic state related functions. As such, they belong in nfs4state.c,
and not nfs4proc.c

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:45 +01:00
Trond Myklebust
5d63360dd8 NFSv4.1: Clean up session draining
Coalesce nfs4_check_drain_bc_complete and nfs4_check_drain_fc_complete
into a single function that can be called when the slot table is known
to be empty, then change nfs4_callback_free_slot() and nfs4_free_slot()
to use it.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:44 +01:00
Trond Myklebust
69d206b5b3 NFSv4.1: If slot allocation fails due to OOM, retry more quickly
If the NFSv4.1 session slot allocation fails due to an ENOMEM condition,
then set the task->tk_timeout to 1/4 second to ensure that we do retry
the slot allocation more quickly.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:44 +01:00
Trond Myklebust
afa296103e NFSv4.1: Remove the state manager code to resize the slot table
The state manager no longer needs any special machinery to stop the
session flow and resize the slot table. It is all done on the fly by
the SEQUENCE op code now.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:43 +01:00
Trond Myklebust
87dda67e73 NFSv4.1: Allow SEQUENCE to resize the slot table on the fly
Instead of an array of slots, use a singly linked list of slots that
can be dynamically appended to or shrunk.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:42 +01:00
Trond Myklebust
97e548a93d NFSv4.1: Support dynamic resizing of the session slot table
Allow the server to control the size of the session slot table
by adjusting the value of sr_target_max_slots in the reply to the
SEQUENCE operation.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:42 +01:00
Trond Myklebust
ce008c4bb9 NFSv4.1: Fix nfs4_callback_recallslot to work with dynamic slot allocation
Ensure that the NFSv4.1 CB_RECALL_SLOT callback updates the slot table
target max slotid safely.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:37 +01:00
Trond Myklebust
da0507b7c9 NFSv4.1: Reset the sequence number for slots that have been deallocated
When the server tells us that it is dynamically resizing the session
replay cache, we should reset the sequence number for those slots
that have been deallocated.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:30:17 +01:00
Trond Myklebust
464ee9f966 NFSv4.1: Ensure that the client tracks the server target_highest_slotid
Dynamic slot allocation in NFSv4.1 depends on the client being able to
track the server's target value for the highest slotid in the
slot table.  See the reference in Section 2.10.6.1 of RFC5661.

To avoid ordering problems in the case where 2 SEQUENCE replies contain
conflicting updates to this target value, we also introduce a generation
counter, to track whether or not an RPC containing a SEQUENCE operation
was launched before or after the last update.

Also rename the nfs4_slot_table target_max_slots field to
'target_highest_slotid' to avoid confusion with a slot
table size or number of slots.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-12-06 00:29:47 +01:00
Trond Myklebust
f4af6e2abc NFSv4.1: Clean up nfs4_free_slot
Change the argument to take the pointer to the slot, instead of
just the slotid.

We know that the new value of highest_used_slot must be less than
the current value. No need to scan the whole table.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-11-26 17:49:53 -05:00
Trond Myklebust
2dc03b7f00 NFSv4.1: Simplify slot allocation
Clean up the NFSv4.1 slot allocation by replacing nfs_find_slot() with
a function nfs_alloc_slot() that returns a pointer to the nfs4_slot
instead of an offset into the slot table.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-11-26 17:49:52 -05:00
Trond Myklebust
2b2fa71723 NFSv4.1: Simplify struct nfs4_sequence_args too
Replace the session pointer + slotid with a pointer to the
allocated slot.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-11-26 17:49:52 -05:00
Trond Myklebust
df2fabffba NFSv4.1: Label each entry in the session slot tables with its slot number
Instead of doing slot table pointer gymnastics every time we want to
know which slot we're using.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-11-26 17:49:51 -05:00
Trond Myklebust
e3725ec015 NFSv4.1: Shrink struct nfs4_sequence_res by moving the session pointer
Move the session pointer into the slot table, then have struct nfs4_slot
point to that slot table.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-11-26 17:49:04 -05:00
Trond Myklebust
933602e368 NFSv4.1: Shrink struct nfs4_sequence_res by moving sr_renewal_time
Store the renewal time inside the session slot instead.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-11-21 09:29:53 -05:00
Trond Myklebust
9216106a84 NFSv4.1: clean up nfs4_recall_slot to use nfs4_alloc_slots
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-11-21 09:29:53 -05:00
Trond Myklebust
2d473d378e NFSv4.1: nfs4_alloc_slots doesn't need zeroing
All that memory is going to be initialised to non-zero by
nfs4_add_and_init_slots anyway.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-11-21 09:29:52 -05:00
Trond Myklebust
43095d3972 NFSv4.1: We must bump the clientid sequence number after CREATE_SESSION
We must always bump the clientid sequence number after a successful
call to CREATE_SESSION on the server. The result of
nfs4_verify_channel_attrs() is irrelevant to that requirement.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-11-21 09:29:52 -05:00
Trond Myklebust
688a9024e2 NFSv4.1: Adjust CREATE_SESSION arguments when mounting a new filesystem
If we're mounting a new filesystem, ensure that the session has negotiated
large enough request and reply sizes to match the wsize and rsize mount
arguments.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-11-21 09:29:51 -05:00
Trond Myklebust
ae72ae6760 NFSv4.1: Don't confuse CREATE_SESSION arguments and results
Don't store the target request and response sizes in the same
variables used to store the server's replies to those targets.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-11-21 09:29:51 -05:00
Bryan Schumaker
6bdb5f213c NFS: Add sequence_priviliged_ops for nfs4_proc_sequence()
If I mount an NFS v4.1 server to a single client multiple times and then
run xfstests over each mountpoint I usually get the client into a state
where recovery deadlocks.  The server informs the client of a
cb_path_down sequence error, the client then does a
bind_connection_to_session and checks the status of the lease.

I found that bind_connection_to_session sets the NFS4_SESSION_DRAINING
flag on the client, but this flag is never unset before
nfs4_check_lease() reaches nfs4_proc_sequence().  This causes the client
to deadlock, halting all NFS activity to the server.  nfs4_proc_sequence()
is only called by the state manager, so I can change it to run in privileged
mode to bypass the NFS4_SESSION_DRAINING check and avoid the deadlock.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2012-11-20 23:34:54 -05:00
Trond Myklebust
4ea8fed593 NFSv4: Get rid of unnecessary BUG_ON()s
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-11-04 14:43:39 -05:00
Trond Myklebust
d3edcf9614 NFSv4: Remove the BUG_ON() from nfs4_get_lease_time_prepare()...
An EAGAIN return value would be unexpected, but there is no reason to
BUG...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-11-04 14:43:38 -05:00
Weston Andros Adamson
998f40b550 NFS4: nfs4_opendata_access should return errno
Return errno - not an NFS4ERR_. This worked because NFS4ERR_ACCESS == EACCES.

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-11-02 18:51:54 -04:00
Trond Myklebust
f9b1ef5f06 NFSv4: Initialise the NFSv4.1 slot table highest_used_slotid correctly
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-11-01 12:02:03 -04:00
Weston Andros Adamson
324d003b0c NFS: add nfs_sb_deactive_async to avoid deadlock
Use nfs_sb_deactive_async instead of nfs_sb_deactive when in a workqueue
context.  This avoids a deadlock where rpc_shutdown_client loops forever
in a workqueue kworker context, trying to kill all RPC tasks associated with
the client, while one or more of these tasks have already been assigned to the
same kworker (and will never run rpc_exit_task).

This approach is needed because RPC tasks that have already been assigned
to a kworker by queue_work cannot be canceled, as explained in the comment
for workqueue.c:insert_wq_barrier.

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
[Trond: add module_get/put.]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-10-31 16:26:26 -04:00
Trond Myklebust
2b1bc308f4 NFSv4: nfs4_locku_done must release the sequence id
If the state recovery machinery is triggered by the call to
nfs4_async_handle_error() then we can deadlock.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2012-10-31 15:10:04 -04:00
Trond Myklebust
2240a9e2d0 NFSv4.1: We must release the sequence id when we fail to get a session slot
If we do not release the sequence id in cases where we fail to get a
session slot, then we can deadlock if we hit a recovery scenario.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2012-10-31 15:08:18 -04:00
Bryan Schumaker
399f11c3d8 NFS: Wait for session recovery to finish before returning
Currently, we will schedule session recovery and then return to the
caller of nfs4_handle_exception.  This works for most cases, but causes
a hang on the following test case:

	Client				Server
	------				------
	Open file over NFS v4.1
	Write to file
					Expire client
	Try to lock file

The server will return NFS4ERR_BADSESSION, prompting the client to
schedule recovery.  However, the client will continue placing lock
attempts and the open recovery never seems to be scheduled.  The
simplest solution is to wait for session recovery to run before retrying
the lock.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2012-10-31 13:13:28 -04:00
Andy Adamson
5f65753033 NFSv4 set open access operation call flag in nfs4_init_opendata_res
nfs4_open_recover_helper zeros the nfs4_opendata result structures, removing
the result access_request information which leads to an XDR decode error.

Move the setting of the result access_request field to nfs4_init_opendata_res
which sets all the other required nfs4_opendata result fields and is shared
between the open and recover open paths.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-10-03 17:10:28 -07:00
Andy Adamson
e23008ec81 NFSv4 reduce attribute requests for open reclaim
We currently make no distinction in attribute requests between normal OPENs
and OPEN with CLAIM_PREVIOUS.  This offers more possibility of failures in
the GETATTR response which foils OPEN reclaim attempts.

Reduce the requested attributes to the bare minimum needed to update the
reclaim open stateid and split nfs4_opendata_to_nfs4_state processing
accordingly.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-10-02 18:12:25 -07:00
Trond Myklebust
807d66d802 NFSv4: nfs4_open_done first must check that GETATTR decoded a file type
...before it can check the validity of that file type.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-10-02 17:09:00 -07:00
Weston Andros Adamson
ae2bb03236 NFSv4: don't put ACCESS in OPEN compound if O_EXCL
Don't put an ACCESS op in OPEN compound if O_EXCL, because ACCESS
will return permission denied for all bits until close.

Fixes a regression due to commit 6168f62c (NFSv4: Add ACCESS operation to
OPEN compound)

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-10-02 14:56:19 -07:00
Weston Andros Adamson
bbd3a8eee8 NFSv4: don't check MAY_WRITE access bit in OPEN
Don't check MAY_WRITE as a newly created file may not have write mode bits,
but POSIX allows the creating process to write regardless.
This is ok because NFSv4 OPEN ops handle write permissions correctly -
the ACCESS in the OPEN compound is to differentiate READ v EXEC permissions.

Fixes a regression due to commit 6168f62c (NFSv4: Add ACCESS operation to
OPEN compound)

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-10-02 14:55:41 -07:00
Trond Myklebust
ee314c2a35 NFSv4.1: Handle BAD_STATEID and EXPIRED errors in layoutget
If the layoutget call returns a stateid error, we want to invalidate the
layout stateid, and/or recover the open stateid.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-10-02 08:34:29 -07:00
Peng Tao
dc182549d4 NFS41: fix error of setting blocklayoutdriver
After commit e38eb650 (NFS: set_pnfs_layoutdriver() from
nfs4_proc_fsinfo()), set_pnfs_layoutdriver() is called inside
nfs4_proc_fsinfo(), but pnfs_blksize is not set. It causes setting
blocklayoutdriver failure and pnfsblock mount failure.

Cc: stable <stable@vger.kernel.org> [since v3.5]
Signed-off-by: Peng Tao <tao.peng@emc.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-10-01 15:37:39 -07:00
Chuck Lever
6f2ea7f2a3 NFS: Add nfs4_unique_id boot parameter
An optional boot parameter is introduced to allow client
administrators to specify a string that the Linux NFS client can
insert into its nfs_client_id4 id string, to make it both more
globally unique, and to ensure that it doesn't change even if the
client's nodename changes.

If this boot parameter is not specified, the client's nodename is
used, as before.

Client installation procedures can create a unique string (typically,
a UUID) which remains unchanged during the lifetime of that client
instance.  This works just like creating a UUID for the label of the
system's root and boot volumes.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-10-01 15:33:33 -07:00
Chuck Lever
05f4c350ee NFS: Discover NFSv4 server trunking when mounting
"Server trunking" is a fancy named for a multi-homed NFS server.
Trunking might occur if a client sends NFS requests for a single
workload to multiple network interfaces on the same server.  There
are some implications for NFSv4 state management that make it useful
for a client to know if a single NFSv4 server instance is
multi-homed.  (Note this is only a consideration for NFSv4, not for
legacy versions of NFS, which are stateless).

If a client cares about server trunking, no NFSv4 operations can
proceed until that client determines who it is talking to.  Thus
server IP trunking discovery must be done when the client first
encounters an unfamiliar server IP address.

The nfs_get_client() function walks the nfs_client_list and matches
on server IP address.  The outcome of that walk tells us immediately
if we have an unfamiliar server IP address.  It invokes
nfs_init_client() in this case.  Thus, nfs4_init_client() is a good
spot to perform trunking discovery.

Discovery requires a client to establish a fresh client ID, so our
client will now send SETCLIENTID or EXCHANGE_ID as the first NFS
operation after a successful ping, rather than waiting for an
application to perform an operation that requires NFSv4 state.

The exact process for detecting trunking is different for NFSv4.0 and
NFSv4.1, so a minorversion-specific init_client callout method is
introduced.

CLID_INUSE recovery is important for the trunking discovery process.
CLID_INUSE is a sign the server recognizes the client's nfs_client_id4
id string, but the client is using the wrong principal this time for
the SETCLIENTID operation.  The SETCLIENTID must be retried with a
series of different principals until one works, and then the rest of
trunking discovery can proceed.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-10-01 15:33:33 -07:00
Chuck Lever
e984a55a74 NFS: Use the same nfs_client_id4 for every server
Currently, when identifying itself to NFS servers, the Linux NFS
client uses a unique nfs_client_id4.id string for each server IP
address it talks with.  For example, when client A talks to server X,
the client identifies itself using a string like "AX".  The
requirements for these strings are specified in detail by RFC 3530
(and bis).

This form of client identification presents a problem for Transparent
State Migration.  When client A's state on server X is migrated to
server Y, it continues to be associated with string "AX."  But,
according to the rules of client string construction above, client
A will present string "AY" when communicating with server Y.

Server Y thus has no way to know that client A should be associated
with the state migrated from server X.  "AX" is all but abandoned,
interfering with establishing fresh state for client A on server Y.

To support transparent state migration, then, NFSv4.0 clients must
instead use the same nfs_client_id4.id string to identify themselves
to every NFS server; something like "A".

Now a client identifies itself as "A" to server X.  When a file
system on server X transitions to server Y, and client A identifies
itself as "A" to server Y, Y will know immediately that the state
associated with "A," whether it is native or migrated, is owned by
the client, and can merge both into a single lease.

As a pre-requisite to adding support for NFSv4 migration to the Linux
NFS client, this patch changes the way Linux identifies itself to NFS
servers via the SETCLIENTID (NFSv4 minor version 0) and EXCHANGE_ID
(NFSv4 minor version 1) operations.

In addition to removing the server's IP address from nfs_client_id4,
the Linux NFS client will also no longer use its own source IP address
as part of the nfs_client_id4 string.  On multi-homed clients, the
value of this address depends on the address family and network
routing used to contact the server, thus it can be different for each
server.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-10-01 15:33:33 -07:00
Weston Andros Adamson
6168f62cbd NFSv4: Add ACCESS operation to OPEN compound
The OPEN operation has no way to differentiate an open for read and an
open for execution - both look like read to the server. This allowed
users to read files that didn't have READ access but did have EXEC access,
which is obviously wrong.

This patch adds an ACCESS call to the OPEN compound to handle the
difference between OPENs for reading and execution.  Since we're going
through the trouble of calling ACCESS, we check all possible access bits
and cache the results hopefully avoiding an ACCESS call in the future.

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-10-01 15:20:11 -07:00
Bryan Schumaker
6938867edb NFS: Remove bad delegations during open recovery
I put the client into an open recovery loop by:
	Client: Open file
		read half
	Server: Expire client (echo 0 > /sys/kernel/debug/nfsd/forget_clients)
	Client: Drop vm cache (echo 3 > /proc/sys/vm/drop_caches)
		finish reading file

This causes a loop because the client never updates the nfs4_state after
discovering that the delegation is invalid.  This means it will keep
trying to read using the bad delegation rather than attempting to re-open
the file.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
CC: stable@vger.kernel.org [3.4+]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-10-01 15:17:25 -07:00
Bryan Schumaker
fcb6d9c6b7 NFS: Always use the open stateid when checking for expired opens
If we are reading through a delegation, and the delegation is OK then
state->stateid will still point to a delegation stateid and not an open
stateid.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-10-01 15:17:17 -07:00
Trond Myklebust
849b286fd0 NFSv4.1: nfs4_proc_layoutreturn must always drop the plh_block_lgets count
Currently it does not do so if the RPC call failed to start. Fix is to
move the decrement of plh_block_lgets into nfs4_layoutreturn_release.

Also remove a redundant test of task->tk_status in nfs4_layoutreturn_done:
if lrp->res.lrs_present is set, then obviously the RPC call succeeded.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:18 -04:00
Trond Myklebust
1f7977c136 NFSv4.1: Simplify the pNFS return-on-close code
Confine it to the nfs4_do_close() code.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:12 -04:00
Trond Myklebust
7fdab069b7 NFSv4.1: Fix a race in the pNFS return-on-close code
If we sleep after dropping the inode->i_lock, then we are no longer
atomic with respect to the rpc_wake_up() call in pnfs_layout_remove_lseg().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:11 -04:00
Trond Myklebust
9369a431bc NFSv4.1: Cleanup; add "pnfs_" prefix to put_lseg() and get_lseg()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:07 -04:00
Trond Myklebust
70c3bd2bdf NFSv4.1: Cleanup; add "pnfs_" prefix to get_layout_hdr() and put_layout_hdr()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:07 -04:00
Trond Myklebust
a0b0a6e39b NFS: Clean up the pNFS layoutget interface
Ensure that we do return errors from nfs4_proc_layoutget() and that we
don't mark the layout as having failed if the error was due to a
signal or resource problem on the client side.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:06 -04:00
Trond Myklebust
795a88c968 NFSv4: Convert the nfs4_lock_state->ls_flags to a bit field
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:04 -04:00
Trond Myklebust
2a369153c8 NFS: Clean up helper function nfs4_select_rw_stateid()
We want to be able to pass on the information that the page was not
dirtied under a lock. Instead of adding a flag parameter, do this
by passing a pointer to a 'struct nfs_lock_owner' that may be NULL.

Also reuse this structure in struct nfs_lock_context to carry the
fl_owner_t and pid_t.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-28 16:03:04 -04:00
Wei Yongjun
e8d920c58d NFS: fix the return value check by using IS_ERR
In case of error, the function rpcauth_create() returns ERR_PTR()
and never returns NULL pointer. The NULL test in the return value
check should be replaced with IS_ERR().

dpatch engine is used to auto generated this patch.
(https://github.com/weiyj/dpatch)

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-25 10:36:37 -04:00
Trond Myklebust
1f1ea6c2d9 NFSv4: Fix buffer overflow checking in __nfs4_get_acl_uncached
Pass the checks made by decode_getacl back to __nfs4_get_acl_uncached
so that it knows if the acl has been truncated.

The current overflow checking is broken, resulting in Oopses on
user-triggered nfs4_getfacl calls, and is opaque to the point
where several attempts at fixing it have failed.
This patch tries to clean up the code in addition to fixing the
Oopses by ensuring that the overflow checks are performed in
a single place (decode_getacl). If the overflow check failed,
we will still be able to report the acl length, but at least
we will no longer attempt to cache the acl or copy the
truncated contents to user space.

Reported-by: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Tested-by: Sachin Prabhu <sprabhu@redhat.com>
2012-09-06 11:11:53 -04:00
Trond Myklebust
21f498c2f7 NFSv4: Fix range checking in __nfs4_get_acl_uncached and __nfs4_proc_set_acl
Ensure that the user supplied buffer size doesn't cause us to overflow
the 'pages' array.

Also fix up some confusion between the use of PAGE_SIZE and
PAGE_CACHE_SIZE when calculating buffer sizes. We're not using
the page cache for anything here.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-09-04 14:52:43 -04:00
Trond Myklebust
c3f52af3e0 NFS: Fix the initialisation of the readdir 'cookieverf' array
When the NFS_COOKIEVERF helper macro was converted into a static
inline function in commit 99fadcd764 (nfs: convert NFS_*(inode)
helpers to static inline), we broke the initialisation of the
readdir cookies, since that depended on doing a memset with an
argument of 'sizeof(NFS_COOKIEVERF(inode))' which therefore
changed from sizeof(be32 cookieverf[2]) to sizeof(be32 *).

At this point, NFS_COOKIEVERF seems to be more of an obfuscation
than a helper, so the best thing would be to just get rid of it.

Also see: https://bugzilla.kernel.org/show_bug.cgi?id=46881

Reported-by: Andi Kleen <andi@firstfloor.org>
Reported-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2012-09-04 14:52:42 -04:00
Trond Myklebust
b291f1b1c8 NFSv4: Fix the acl cache size calculation
Currently, we do not take into account the size of the 16 byte
struct nfs4_cached_acl header, when deciding whether or not we should
cache the acl data.  Consequently, we will end up allocating an
8k buffer in order to fit a maximum size 4k acl.

This patch adjusts the calculation so that we limit the cache size
to 4k for the acl header+data.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-08-16 16:15:50 -04:00
Trond Myklebust
519d3959e3 NFSv4: Fix pointer arithmetic in decode_getacl
Resetting the cursor xdr->p to a previous value is not a safe
practice: if the xdr_stream has crossed out of the initial iovec,
then a bunch of other fields would need to be reset too.

Fix this issue by using xdr_enter_page() so that the buffer gets
page aligned at the bitmap _before_ we decode it.

Also fix the confusion of the ACL length with the page buffer length
by not adding the base offset to the ACL length...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2012-08-16 16:15:50 -04:00
Trond Myklebust
47fbf7976e NFSv4.1: Remove a bogus BUG_ON() in nfs4_layoutreturn_done
Ever since commit 0a57cdac3f (NFSv4.1 send layoutreturn to fence
disconnected data server) we've been sending layoutreturn calls
while there is potentially still outstanding I/O to the data
servers. The reason we do this is to avoid races between replayed
writes to the MDS and the original writes to the DS.

When this happens, the BUG_ON() in nfs4_layoutreturn_done can
be triggered because it assumes that we would never call
layoutreturn without knowing that all I/O to the DS is
finished. The fix is to remove the BUG_ON() now that the
assumptions behind the test are obsolete.

Reported-by: Boaz Harrosh <bharrosh@panasas.com>
Reported-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org [>=3.5]
2012-08-08 16:03:13 -04:00
Idan Kedar
21d1f58aed pnfs: nfs4_proc_layoutget returns void
since the only user of nfs4_proc_layoutget is send_layoutget, which
ignores its return value, there is no reason to return any value.

Signed-off-by: Idan Kedar <idank@tonian.com>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-08-02 17:39:06 -04:00
Idan Kedar
8554116e17 pnfs: defer release of pages in layoutget
we have encountered a bug whereby reading a lot of files (copying
fedora's /bin) from a pNFS mount and hitting Ctrl+C in the middle caused
a general protection fault in xdr_shrink_bufhead. this function is
called when decoding the response from LAYOUTGET. the decoding is done
by a worker thread, and the caller of LAYOUTGET waits for the worker
thread to complete.

hitting Ctrl+C caused the synchronous wait to end and the next thing the
caller does is to free the pages, so when the worker thread calls
xdr_shrink_bufhead, the pages are gone. therefore, the cleanup of these
pages has been moved to nfs4_layoutget_release.

Signed-off-by: Idan Kedar <idank@tonian.com>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-08-02 17:38:54 -04:00
Bryan Schumaker
fac1e8e4ef NFS: Keep module parameters in the generic NFS client
Otherwise we break backwards compatibility when v4 becomes a modules.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-07-30 19:06:31 -04:00
Bryan Schumaker
1179acc6a3 NFS: Only initialize the ACL client in the v3 case
v2 and v4 don't use it, so I create two new nfs_rpc_ops functions to
initialize the ACL client only when we are using v3.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-07-30 19:05:54 -04:00
Bryan Schumaker
ff9099f266 NFS: Create a try_mount rpc op
I'm already looking up the nfs subversion in nfs_fs_mount(), so I have
easy access to rpc_ops that used to be difficult to reach.  This allows
me to set up a different mount path for NFS v2/3 and NFS v4.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-07-30 19:04:53 -04:00
Jeff Layton
f44106e217 nfs: fix fl_type tests in NFSv4 code
fl_type is not a bitmap.

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-07-30 18:09:13 -04:00
Bryan Schumaker
73a79706d7 NFS: Split out NFS v4 inode operations
The NFS v4 file inode operations are already already in nfs4proc.c, so
this patch just needs to move the directory operations to the same file.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-07-17 13:33:05 -04:00
Chuck Lever
6bbb4ae8ff NFS: Clean up nfs4_proc_setclientid() and friends
Add documenting comments and appropriate debugging messages.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-07-16 15:12:16 -04:00
Chuck Lever
de73483122 NFS: Treat NFS4ERR_CLID_INUSE as a fatal error
For NFSv4 minor version 0, currently the cl_id_uniquifier allows the
Linux client to generate a unique nfs_client_id4 string whenever a
server replies with NFS4ERR_CLID_INUSE.

This implementation seems to be based on a flawed reading of RFC
3530.  NFS4ERR_CLID_INUSE actually means that the client has presented
this nfs_client_id4 string with a different principal at some time in
the past, and that lease is still in use on the server.

For a Linux client this might be rather difficult to achieve: the
authentication flavor is named right in the nfs_client_id4.id
string.  If we change flavors, we change strings automatically.

So, practically speaking, NFS4ERR_CLID_INUSE means there is some other
client using our string.  There is not much that can be done to
recover automatically.  Let's make it a permanent error.

Remove the recovery logic in nfs4_proc_setclientid(), and remove the
cl_id_uniquifier field from the nfs_client data structure.  And,
remove the authentication flavor from the nfs_client_id4 string.

Keeping the authentication flavor in the nfs_client_id4.id string
means that we could have a separate lease for each authentication
flavor used by mounts on the client.  But we want just one lease for
all the mounts on this client.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-07-16 15:12:16 -04:00
Chuck Lever
46a87b8a7b NFS: When state recovery fails, waiting tasks should exit
NFSv4 state recovery is not always successful.  Failure is signalled
by setting the nfs_client.cl_cons_state to a negative (errno) value,
then waking waiters.

Currently this can happen only during mount processing.  I'm about to
add an explicit case where state recovery failure during normal
operation should force all NFS requests waiting on that state recovery
to exit.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-07-16 15:12:15 -04:00
Chuck Lever
6a1a1e34dc SUNRPC: Add rpcauth_list_flavors()
The gss_mech_list_pseudoflavors() function provides a list of
currently registered GSS pseudoflavors.  This list does not include
any non-GSS flavors that have been registered with the RPC client.
nfs4_find_root_sec() currently adds these extra flavors by hand.

Instead, nfs4_find_root_sec() should be looking at the set of flavors
that have been explicitly registered via rpcauth_register().  And,
other areas of code will soon need the same kind of list that
contains all flavors the kernel currently knows about (see below).

Rather than cloning the open-coded logic in nfs4_find_root_sec() to
those new places, introduce a generic RPC function that generates a
full list of registered auth flavors and pseudoflavors.

A new rpc_authops method is added that lists a flavor's
pseudoflavors, if it has any.  I encountered an interesting module
loader loop when I tried to get the RPC client to invoke
gss_mech_list_pseudoflavors() by name.

This patch is a pre-requisite for server trunking discovery, and a
pre-requisite for fixing up the in-kernel mount client to do better
automatic security flavor selection.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-07-16 15:12:15 -04:00
Chuck Lever
56d08fef23 NFS: nfs_getaclargs.acl_len is a size_t
Squelch compiler warnings:

fs/nfs/nfs4proc.c: In function ‘__nfs4_get_acl_uncached’:
fs/nfs/nfs4proc.c:3811:14: warning: comparison between signed and
	unsigned integer expressions [-Wsign-compare]
fs/nfs/nfs4proc.c:3818:15: warning: comparison between signed and
	unsigned integer expressions [-Wsign-compare]

Introduced by commit bf118a34 "NFSv4: include bitmap in nfsv4 get
acl data", Dec 7, 2011.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-07-16 14:53:43 -04:00
Chuck Lever
38527b153a NFS: Clean up TEST_STATEID and FREE_STATEID error reporting
As a finishing touch, add appropriate documenting comments and some
debugging printk's.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-07-16 14:53:34 -04:00
Chuck Lever
3e60ffdd36 NFS: Clean up nfs41_check_expired_stateid()
Clean up: Instead of open-coded flag manipulation, use test_bit() and
clear_bit() just like all other accessors of the state->flag field.
This also eliminates several unnecessary implicit integer type
conversions.

To make it absolutely clear what is going on, a number of comments
are introduced.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-07-16 14:49:40 -04:00
Chuck Lever
eb64cf964d NFS: State reclaim clears OPEN and LOCK state
The "state->flags & flags" test in nfs41_check_expired_stateid()
allows the state manager to squelch a TEST_STATEID operation when
it is known for sure that a state ID is no longer valid.  If the
lease was purged, for example, the client already knows that state
ID is now defunct.

But open recovery is still needed for that inode.

To force a call to nfs4_open_expired(), change the default return
value for nfs41_check_expired_stateid() to force open recovery, and
the default return value for nfs41_check_locks() to force lock
recovery, if the requested flags are clear.  Fix suggested by Bryan
Schumaker.

Also, the presence of a delegation state ID must not prevent normal
open recovery.  The delegation state ID must be cleared if it was
revoked, but once cleared I don't think it's presence or absence has
any bearing on whether open recovery is still needed.  So the logic
is adjusted to ignore the TEST_STATEID result for the delegation
state ID.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-07-16 14:48:53 -04:00
Chuck Lever
89af273958 NFS: Don't free a state ID the server does not recognize
The result of a TEST_STATEID operation can indicate a few different
things:

  o If NFS_OK is returned, then the client can continue using the
    state ID under test, and skip recovery.

  o RFC 5661 says that if the state ID was revoked, then the client
    must perform an explicit FREE_STATEID before trying to re-open.

  o If the server doesn't recognize the state ID at all, then no
    FREE_STATEID is needed, and the client can immediately continue
    with open recovery.

Let's err on the side of caution: if the server clearly tells us the
state ID is unknown, we skip the FREE_STATEID.  For any other error,
we issue a FREE_STATEID.  Sometimes that FREE_STATEID will be
unnecessary, but leaving unused state IDs on the server needlessly
ties up resources.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-07-16 14:48:10 -04:00
Chuck Lever
377e507d15 NFS: Fix up TEST_STATEID and FREE_STATEID return code handling
The TEST_STATEID and FREE_STATEID operations can return
-NFS4ERR_BAD_STATEID, -NFS4ERR_OLD_STATEID, or -NFS4ERR_DEADSESSION.

nfs41_{test,free}_stateid() should not pass these errors to
nfs4_handle_exception() during state recovery, since that will
recursively kick off state recovery again, resulting in a deadlock.

In particular, when the TEST_STATEID operation returns NFS4_OK,
res.status can contain one of these errors.  _nfs41_test_stateid()
replaces NFS4_OK with the value in res.status, which is then returned
to callers.

But res.status is not passed through nfs4_stat_to_errno(), and thus is
a positive NFS4ERR value.  Currently callers are only interested in
!NFS4_OK, and nfs4_handle_exception() ignores positive values.

Thus the res.status values are currently ignored by
nfs4_handle_exception() and won't cause the deadlock above.  Thanks to
this missing negative, it is only when these operations fail (which
is very rare) that a deadlock can occur.

Bryan agrees the original intent was to return res.status as a
negative NFS4ERR value to callers of nfs41_test_stateid().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-07-16 14:47:52 -04:00
Trond Myklebust
8626e4a426 Merge commit '9249e17fe094d853d1ef7475dd559a2cc7e23d42' into nfs-for-3.6
Resolve conflicts with the VFS atomic open and sget changes.

Conflicts:
	fs/nfs/nfs4proc.c
2012-07-16 12:01:42 -04:00
Miklos Szeredi
8867fe5899 nfs: clean up ->create in nfs_rpc_ops
Don't pass nfs_open_context() to ->create().  Only the NFS4 implementation
needed that and only because it wanted to return an open file using open
intents.  That task has been replaced by ->atomic_open so it is not necessary
anymore to pass the context to the create rpc operation.

Despite nfs4_proc_create apparently being okay with a NULL context it Oopses
somewhere down the call chain.  So allocate a context here.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-07-14 16:33:08 +04:00
Bryan Schumaker
57208fa7e5 NFS: Create an write_pageio_init() function
pNFS needs to select a write function based on the layout driver
currently in use, so I let each NFS version decide how to best handle
initializing writes.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-06-29 11:46:46 -04:00
Bryan Schumaker
1abb50886a NFS: Create an read_pageio_init() function
pNFS needs to select a read function based on the layout driver
currently in use, so I let each NFS version decide how to best handle
initializing reads.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-06-29 11:46:46 -04:00
Bryan Schumaker
6663ee7f81 NFS: Create an alloc_client rpc_op
This gives NFS v4 a way to set up callbacks and sessions without v2 or
v3 having to do them as well.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-06-29 11:46:46 -04:00
Bryan Schumaker
cdb7ecedec NFS: Create a free_client rpc_op
NFS v4 needs a way to shut down callbacks and sessions, but v2 and v3
don't.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-06-29 11:46:45 -04:00
Bryan Schumaker
57ec14c55d NFS: Create a return_delegation rpc op
Delegations are a v4 feature, so push return_delegation out of the
generic client by creating a new rpc_op and renaming the old function to
be in the nfs v4 "namespace"

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-06-29 11:46:45 -04:00
Bryan Schumaker
011e2a7fd5 NFS: Create a have_delegation rpc_op
Delegations are a v4 feature, so push them out of the generic code.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-06-29 11:46:44 -04:00
Bryan Schumaker
e38eb6506f NFS: set_pnfs_layoutdriver() from nfs4_proc_fsinfo()
The generic client doesn't need to know about pnfs layout drivers, so
this should be done in the v4 code.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-06-29 11:46:43 -04:00
Andy Adamson
6e5b587d2f NFSv4.1 handle OPEN O_CREATE mdsthreshold
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-06-29 11:33:43 -04:00
Trond Myklebust
140150dbb1 SUNRPC: Remove unused function xdr_encode_pages
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-06-28 17:20:49 -04:00
Andy Adamson
2669940db8 NFSv4 do not send an empty SETATTR compound
Commit 536e43d12b ATTR_OPEN check can result in
an ia_valid with only ATTR_FILE set, and no NFS_VALID_ATTRS attributes to
request from the server.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-06-11 17:25:53 -04:00
Trond Myklebust
2d0dbc6ae8 NFSv4: Fix unnecessary delegation returns in nfs4_do_open
While nfs4_do_open() expects the fmode argument to be restricted to
combinations of FMODE_READ and FMODE_WRITE, both nfs4_atomic_open()
and nfs4_proc_create will pass the nfs_open_context->mode,
which contains the full fmode_t.

This patch ensures that nfs4_do_open strips the other fmode_t bits,
fixing a problem in which the nfs4_do_open call would result in an
unnecessary delegation return.

Reported-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2012-06-08 11:08:42 -04:00
Trond Myklebust
02c67525cf NFSv4.1: Convert another trivial printk into a dprintk
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-06-07 13:45:53 -04:00
Steve Dickson
f25efd851c NFS: Map minor mismatch error to protocol not support error.
Sservers that only have NFSv4.1 support the
NFS4ERR_MINOR_VERS_MISMATCH error is return on
v4.0 mounts. Mapping that error to EPROTONOSUPPORT
will cause the mount to back off to v3 instead of
failing.

Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-06-06 14:32:40 -04:00
Trond Myklebust
08106ac7c8 NFSv4.1: Convert a trivial printk into a dprintk
There is no need to bug the user about the server returning an error
on destroy_session. The error will be handled by the state manager,
without any need for further input from anyone else.
So convert that printk into a debugging dprintk.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-06-05 10:08:24 -04:00
Trond Myklebust
1549210fcc NFSv4: Fix an Oops in the open recovery code
The open recovery code does not need to request a new value for the
mdsthreshold, and so does not allocate a struct nfs4_threshold.
The problem is that encode_getfattr_open() will still request an
mdsthreshold, and so we end up Oopsing in decode_attr_mdsthreshold.

This patch fixes encode_getfattr_open so that it doesn't request an
mdsthreshold when the caller isn't asking for one. It also fixes
decode_attr_mdsthreshold so that it errors if the server returns
an mdsthreshold that we didn't ask for (instead of Oopsing).

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Andy Adamson <andros@netapp.com>
2012-06-05 10:00:14 -04:00
Linus Torvalds
53f2c4a8fd NFS client updates for Linux 3.5
New features include:
 - Rewrite the O_DIRECT code so that it can share the same coalescing and
   pNFS functionality as the page cache code.
 - Allow the server to provide hints as to when we should use pNFS, and
   when it is more efficient to read and write through the metadata
   server.
 - NFS cache consistency updates:
   - Use the ctime to emulate a change attribute for NFSv2/v3 so that
     all NFS versions can share the same cache management code.
   - New cache management code will only look at the change attribute
     and size attribute when deciding whether or not our cached data
     is still valid or not.
   - Don't request NFSv4 post-op attributes on writes in cases such as
     O_DIRECT, where we don't care about data cache consistency, or
     when we have a write delegation, and know that our cache is
     still consistent.
   - Don't request NFSv4 post-op attributes on operations such as
     COMMIT, where there are no expected metadata updates.
   - Don't request NFSv4 directory post-op attributes in cases where
     the operations themselves already return change attribute updates:
     i.e.  operations such as OPEN, CREATE, REMOVE, LINK and RENAME.
 - Speed up 'ls' and friends by using READDIR rather than READDIRPLUS
   if we detect no attempts to lookup filenames.
 - Improve the code sharing between NFSv2/v3 and v4 mounts
 - NFSv4.1 state management efficiency improvements
 - More patches in preparation for NFSv4/v4.1 migration functionality.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJPw/MNAAoJEGcL54qWCgDyxU8P/2kKqhAlhoLEArBqo9FT3/OK
 YrNs5uO/erTgnCG8L0XQvTKjHB9F7TAeFXqTmBZuPlb1afRpHHt2vzPqzIvUCeOC
 ZXm8vzZf4nxWZgEFoTDdUBvqQi9lLdIzCRhSaVCKcRnNwiuaKDd/iwykbWGcHqmv
 jtR4lzXPllJdKCUL3yb3juVrpq6Vvn254ID2pqdnYcEtIJIHgaRZpwdp4Iz9+8b5
 Moishiw2rgCBJIhf+VCYd8B2oYfMgSDPxG1o3etkwY46qo+4s+CIls9Vu/6YzGXK
 3+NdLatRDqKhQpLm0/R+dI3rntnTZ8x6LgWnTGxUsiqb6pAaHZPK284rf2eh/s7M
 Q4G4203r0uw539kIt6eKOGqC9c8kZAPCHlQSPCaImZyCJsz+6OMShNlGB5bZpFPr
 tbdxaxudrhCF7UVKXicJCWgv2nIHtek6fNwey1jqFoYgZP5ipiBKymvXQC5WAMBw
 7RHJor/JEC+UJkVg/7Mkpg0UNw3E36CTYLeRJKlNCS6YO9NJQseCDxhhMNAy/ab7
 RGO8DVMkUsOUH20S+a19LyeFQtveWFIE0DiDqRn0KnNGhGwHrv2t4xFukjlrf4Sw
 8FQUBRdtFxfmspfA1IdoTY49XZQda5eagvTy1MyaWEh+jPSJ4G5j3sSjFiaKAJqw
 79iQKFGkxPOSHx2yCdAF
 =suVW
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.5-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client updates from Trond Myklebust:
 "New features include:
   - Rewrite the O_DIRECT code so that it can share the same coalescing
     and pNFS functionality as the page cache code.
   - Allow the server to provide hints as to when we should use pNFS,
     and when it is more efficient to read and write through the
     metadata server.
   - NFS cache consistency updates:
     * Use the ctime to emulate a change attribute for NFSv2/v3 so that
       all NFS versions can share the same cache management code.
     * New cache management code will only look at the change attribute
       and size attribute when deciding whether or not our cached data
       is still valid or not.
     * Don't request NFSv4 post-op attributes on writes in cases such as
       O_DIRECT, where we don't care about data cache consistency, or
       when we have a write delegation, and know that our cache is still
       consistent.
     * Don't request NFSv4 post-op attributes on operations such as
       COMMIT, where there are no expected metadata updates.
     * Don't request NFSv4 directory post-op attributes in cases where
       the operations themselves already return change attribute
       updates: i.e. operations such as OPEN, CREATE, REMOVE, LINK and
       RENAME.
   - Speed up 'ls' and friends by using READDIR rather than READDIRPLUS
     if we detect no attempts to lookup filenames.
   - Improve the code sharing between NFSv2/v3 and v4 mounts
   - NFSv4.1 state management efficiency improvements
   - More patches in preparation for NFSv4/v4.1 migration functionality."

Fix trivial conflict in fs/nfs/nfs4proc.c that was due to the dcache
qstr name initialization changes (that made the length/hash a 64-bit
union)

* tag 'nfs-for-3.5-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (146 commits)
  NFSv4: Add debugging printks to state manager
  NFSv4: Map NFS4ERR_SHARE_DENIED into an EACCES error instead of EIO
  NFSv4: update_changeattr does not need to set NFS_INO_REVAL_PAGECACHE
  NFSv4.1: nfs4_reset_session should use nfs4_handle_reclaim_lease_error
  NFSv4.1: Handle other occurrences of NFS4ERR_CONN_NOT_BOUND_TO_SESSION
  NFSv4.1: Handle NFS4ERR_CONN_NOT_BOUND_TO_SESSION in the state manager
  NFSv4.1: Handle errors in nfs4_bind_conn_to_session
  NFSv4.1: nfs4_bind_conn_to_session should drain the session
  NFSv4.1: Don't clobber the seqid if exchange_id returns a confirmed clientid
  NFSv4.1: Add DESTROY_CLIENTID
  NFSv4.1: Ensure we use the correct credentials for bind_conn_to_session
  NFSv4.1: Ensure we use the correct credentials for session create/destroy
  NFSv4.1: Move NFSPROC4_CLNT_BIND_CONN_TO_SESSION to the end of the operations
  NFSv4.1: Handle NFS4ERR_SEQ_MISORDERED when confirming the lease
  NFSv4: When purging the lease, we must clear NFS4CLNT_LEASE_CONFIRM
  NFSv4: Clean up the error handling for nfs4_reclaim_lease
  NFSv4.1: Exchange ID must use GFP_NOFS allocation mode
  nfs41: Use BIND_CONN_TO_SESSION for CB_PATH_DOWN*
  nfs4.1: add BIND_CONN_TO_SESSION operation
  NFSv4.1 test the mdsthreshold hint parameters
  ...
2012-05-29 10:43:51 -07:00
Trond Myklebust
fb13bfa7e1 NFSv4: Map NFS4ERR_SHARE_DENIED into an EACCES error instead of EIO
If a file OPEN is denied due to a share lock, the resulting
NFS4ERR_SHARE_DENIED is currently mapped to the default EIO.
This patch adds a more appropriate mapping, and brings Linux
into line with what Solaris 10 does.

See https://bugzilla.kernel.org/show_bug.cgi?id=43286

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2012-05-28 17:21:48 -04:00
Trond Myklebust
359d7d1c97 NFSv4: update_changeattr does not need to set NFS_INO_REVAL_PAGECACHE
We're already invalidating the data cache, and setting the new change
attribute. Since directories don't care about the i_size field, there
is no need to be forcing any extra revalidation of the page cache.

We do keep the NFS_INO_INVALID_ATTR flag, in order to force an
attribute cache revalidation on stat() calls since we do not
update the mtime and ctime fields.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-28 10:05:47 -04:00
Trond Myklebust
9f594791dd NFSv4.1: Handle other occurrences of NFS4ERR_CONN_NOT_BOUND_TO_SESSION
Let nfs4_schedule_session_recovery() handle the details of choosing
between resetting the session, and other session related recovery.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-27 14:33:07 -04:00
Trond Myklebust
32b0131069 NFSv4.1: Don't clobber the seqid if exchange_id returns a confirmed clientid
If the EXCHGID4_FLAG_CONFIRMED_R flag is set, the client is in theory
supposed to already know the correct value of the seqid, in which case
RFC5661 states that it should ignore the value returned.

Also ensure that if the sanity check in nfs4_check_cl_exchange_flags
fails, then we must not change the nfs_client fields.

Finally, clean up the code: we don't need to retest the value of
'status' unless it can change.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-26 14:17:31 -04:00
Trond Myklebust
6624553910 NFSv4.1: Add DESTROY_CLIENTID
Ensure that we destroy our lease on last unmount

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-26 14:17:30 -04:00
Trond Myklebust
2cf047c994 NFSv4.1: Ensure we use the correct credentials for bind_conn_to_session
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Weston Andros Adamson <dros@netapp.com>
2012-05-25 18:02:10 -04:00
Trond Myklebust
848f5bda54 NFSv4.1: Ensure we use the correct credentials for session create/destroy
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-25 18:02:09 -04:00
Trond Myklebust
bbafffd293 NFSv4.1: Exchange ID must use GFP_NOFS allocation mode
Exchange ID can be called in a lease reclaim situation, so it
will deadlock if it then tries to write out dirty NFS pages.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-24 16:31:39 -04:00
Weston Andros Adamson
7c44f1ae4a nfs4.1: add BIND_CONN_TO_SESSION operation
This patch adds the BIND_CONN_TO_SESSION operation which is needed for
upcoming SP4_MACH_CRED work and useful for recovering from broken connections
without destroying the session.

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-24 16:22:19 -04:00
Andy Adamson
82be417aa3 NFSv4.1 cache mdsthreshold values on OPEN
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-24 16:15:48 -04:00
Trond Myklebust
54ac471c83 NFS: Add memory barriers to the nfs_client->cl_cons_state initialisation
Ensure that a process that uses the nfs_client->cl_cons_state test
for whether the initialisation process is finished does not read
stale data.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-23 15:24:59 -04:00
Trond Myklebust
7b38c3682c NFSv4.1: Fix session initialisation races
Session initialisation is not complete until the lease manager
has run. We need to ensure that both nfs4_init_session and
nfs4_init_ds_session do so, and that they check for any resulting
errors in clp->cl_cons_state.

Only after this is done, can nfs4_ds_connect check the contents
of clp->cl_exchange_flags.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Andy Adamson <andros@netapp.com>
2012-05-23 15:20:57 -04:00
Chuck Lever
acdeb69d9c NFS: EXCHANGE_ID should save the server major and minor ID
Save the server major and minor ID results from EXCHANGE_ID, as they
are needed for detecting server trunking.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-22 16:45:48 -04:00
Chuck Lever
f092075dd3 NFS: Always use the same SETCLIENTID boot verifier
Currently our NFS client assigns a unique SETCLIENTID boot verifier
for each server IP address it knows about.  It's set to CURRENT_TIME
when the struct nfs_client for that server IP is created.

During the SETCLIENTID operation, our client also presents an
nfs_client_id4 string to servers, as an identifier on which the server
can hang all of this client's NFSv4 state.  Our client's
nfs_client_id4 string is unique for each server IP address.

An NFSv4 server is obligated to wipe all NFSv4 state associated with
an nfs_client_id4 string when the client presents the same
nfs_client_id4 string along with a changed SETCLIENTID boot verifier.

When our client unmounts the last of a server's shares, it destroys
that server's struct nfs_client.  The next time the client mounts that
NFS server, it creates a fresh struct nfs_client with a fresh boot
verifier.  On seeing the fresh verifer, the server wipes any previous
NFSv4 state associated with that nfs_client_id4.

However, NFSv4.1 clients are supposed to present the same
nfs_client_id4 string to all servers.  And, to support Transparent
State Migration, the same nfs_client_id4 string should be presented
to all NFSv4.0 servers so they recognize that migrated state for this
client belongs with state a server may already have for this client.
(This is known as the Uniform Client String model).

If the nfs_client_id4 string is the same but the boot verifier changes
for each server IP address, SETCLIENTID and EXCHANGE_ID operations
from such a client could unintentionally result in a server wiping a
client's previously obtained lease.

Thus, if our NFS client is going to use a fixed nfs_client_id4 string,
either for NFSv4.0 or NFSv4.1 mounts, our NFS client should use a
boot verifier that does not change depending on server IP address.
Replace our current per-nfs_client boot verifier with a per-nfs_net
boot verifier.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-22 16:45:46 -04:00
Chuck Lever
2c820d9a97 NFS: Force server to drop NFSv4 state
nfs4_reset_all_state() refreshes the boot verifier a server sees to
trigger that server to wipe this client's state.  This function is
invoked when an NFSv4.1 server reports that it has revoked some or
all of a client's NFSv4 state.

To facilitate server trunking discovery, we will eventually want to
move the cl_boot_time field to a more global structure.  The Uniform
Client String model (and specifically, server trunking detection)
requires that all servers see the same boot verifier until the client
actually does reboot, and not a fresh verifier every time the client
unmounts and remounts the server.

Without the cl_boot_time field, however, nfs4_reset_all_state() will
have to find some other way to force the server to purge the client's
NFSv4 state.

Because these verifiers are opaque (ie, the server doesn't know or
care that they happen to be timestamps), we can force the server
to wipe NFSv4 state by updating the boot verifier as we do now, then
immediately afterwards establish a fresh client ID using the old boot
verifier again.

Hopefully there are no extra paranoid server implementations that keep
track of the client's boot verifiers and prevent clients from reusing
a previous one.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-22 16:45:45 -04:00
Chuck Lever
177313f149 NFS: Clean up return code checking in nfs4_proc_exchange_id()
Clean up: update to use matching types in "if" expressions.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-22 16:45:44 -04:00
Chuck Lever
591555465e NFS: Use proper naming conventions for nfs_client.impl_id field
Clean up:  When naming fields and data types, follow established
conventions to facilitate accurate grep/cscope searches.

Additionally, for consistency, move the impl_id field into the NFSv4-
specific part of the nfs_client, and free that memory in the logic
that shuts down NFSv4 nfs_clients.

Introduced by commit 7d2ed9ac "NFSv4: parse and display server
implementation ids," Fri Feb 17, 2012.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-22 16:45:43 -04:00
Chuck Lever
79d4e1f0d8 NFS: Use proper naming conventions for NFSv4.1 server scope fields
Clean up:  When naming fields and data types, follow established
conventions to facilitate accurate grep/cscope searches.

Additionally, for consistency, move the scope field into the NFSv4-
specific part of the nfs_client, and free that memory in the logic
that shuts down NFSv4 nfs_clients.

Introduced by commit 99fe60d0 "nfs41: exchange_id operation", April
1 2009.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-22 16:45:43 -04:00
Chuck Lever
c3607282b4 NFS: Don't swap bytes in nfs4_construct_boot_verifier()
The SETCLIENTID boot verifier is opaque to NFSv4 servers, thus there
is no requirement for byte swapping before the client puts the
verifier on the wire.

This treatment is similar to other timestamp-based verifiers.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-22 16:45:42 -04:00
Trond Myklebust
b3f87b98aa Merge branch 'bugfixes' into nfs-for-next 2012-05-21 10:12:39 -04:00
Andy Adamson
a033a09189 NFSv4.1 remove nfs4_reset_write and nfs4_reset_read
Replaced by filelayout_reset_write and filelayout_reset_read

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-19 17:54:59 -04:00
Bryan Schumaker
bae36241be NFS: Create a single nfs_get_root()
This patch splits out the NFS v4 specific functionality of
nfs4_get_root() into its own rpc_op called by the generic client, and
leaves nfs4_proc_get_rootfh() as its own stand alone function.  This
also allows me to change nfs4_remote_mount(), nfs4_xdev_mount() and
nfs4_remote_referral_mount() to use the generic client's nfs_get_root()
function.  Later patches in this series will collapse these functions
into one common function, so using the same get_root() function
everywhere simplifies future changes.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-14 17:30:26 -07:00
Bryan Schumaker
3028eb2b32 NFS: Rename nfs4_proc_get_root()
This function is really getting the root filehandle and not the root
dentry of the filesystem.  I also removed the rpc_ops lookup from
nfs4_get_rootfh() under the assumption that if we reach this function
then we already know we are using NFS v4.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-14 17:30:25 -07:00
Linus Torvalds
26fe575028 vfs: make it possible to access the dentry hash/len as one 64-bit entry
This allows comparing hash and len in one operation on 64-bit
architectures.  Right now only __d_lookup_rcu() takes advantage of this,
since that is the case we care most about.

The use of anonymous struct/unions hides the alternate 64-bit approach
from most users, the exception being a few cases where we initialize a
'struct qstr' with a static initializer.  This makes the problematic
cases use a new QSTR_INIT() helper function for that (but initializing
just the name pointer with a "{ .name = xyzzy }" initializer remains
valid, as does just copying another qstr structure).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-10 19:54:35 -07:00
Trond Myklebust
8582715e73 NFSv4: COMMIT does not need post-op attributes
No attributes are supposed to change during a COMMIT call, so there
is no need to request post-op attributes.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-01 15:42:45 -04:00
Trond Myklebust
5a37f85131 NFSv4: Don't request cache consistency attributes on some writes
We don't need cache consistency information when we're doing O_DIRECT
writes. Ditto for the case of delegated writes.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-01 15:42:45 -04:00
Trond Myklebust
778d28172f NFSv4: Simplify the NFSv4 REMOVE, LINK and RENAME compounds
Get rid of the post-op GETATTR on the directory in order to reduce
the amount of processing done on the server.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-01 15:42:44 -04:00
Trond Myklebust
7c317fcfba NFSv4: Simplify the NFSv4 CREATE compound
Get rid of the post-op GETATTR on the directory in order to reduce
the amount of processing done on the server.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-01 15:42:44 -04:00
Trond Myklebust
90ff0c548d NFSv4: Simplify the NFSv4 OPEN compound
Get rid of the post-op GETATTR on the directory in order to reduce
the amount of processing done on the server.

The cost is that if we later need to stat() the directory, then we
know that the ctime and mtime are likely to be invalid.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-01 15:42:43 -04:00
Trond Myklebust
e144cbcc25 NFSv4: Retrieve attributes _before_ calling delegreturn
In order to retrieve cache consistency attributes before
anyone else has a chance to change the inode, we need to
put the GETATTR op _before_ the DELEGRETURN op.

We can then use that as part of a 'nfs_post_op_update_inode_force_wcc()'
call, to ensure that we update the attributes without clearing our
cached data.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-01 15:42:40 -04:00
Trond Myklebust
9e907fec6e NFSv4: Delegreturn only needs the cache consistency bitmask
In order to do close-to-open cache consistency checking after
a delegreturn, we don't need to retrieve the full set of
attributes.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-05-01 15:42:40 -04:00
Trond Myklebust
3617e5031b NFSv4.1: Use the correct hostname in the client identifier string
We need to use the hostname of the process that created the nfs_client.
That hostname is now stored in the rpc_client->cl_nodename.

Also remove the utsname()->domainname component. There is no reason
to include the NIS/YP domainname in a client identifier string.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-04-30 12:04:58 -04:00
Bryan Schumaker
80a16b21a8 NFS: Remove extra rpc_clnt argument to proc_lookup
Now that I'm doing secinfo automatically in the v4 code this extra
argument isn't needed.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-04-27 14:10:39 -04:00
Bryan Schumaker
281cad46b3 NFS: Create a submount rpc_op
This simplifies the code for v2 and v3 and gives v4 a chance to decide
on referrals without needing to modify the generic client.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-04-27 14:10:39 -04:00
Bryan Schumaker
2671bfc3be NFS: Remove secinfo knowledge out of the generic client
And also remove the unneeded rpc_op.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-04-27 14:10:39 -04:00
Fred Isaman
6c75dc0d49 NFS: merge _full and _partial write rpc_ops
Decouple nfs_pgio_header and nfs_write_data, and have (possibly
multiple) nfs_write_datas each take a refcount on nfs_pgio_header.

For the moment keeps nfs_write_header as a way to preallocate a single
nfs_write_data with the nfs_pgio_header.  The code doesn't need this,
and would be prettier without, but given the amount of churn I am
already introducing I didn't want to play with tuning new mempools.

This also fixes bug in pnfs_ld_handle_write_error.  In the case of
desc->pg_bsize < PAGE_CACHE_SIZE, the pages list was empty, causing
replay attempt to do nothing.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-04-27 14:10:37 -04:00
Fred Isaman
4db6e0b74c NFS: merge _full and _partial read rpc_ops
Decouple nfs_pgio_header and nfs_read_data, and have (possibly
multiple) nfs_read_datas each take a refcount on nfs_pgio_header.

For the moment keeps nfs_read_header as a way to preallocate a single
nfs_read_data with the nfs_pgio_header.  The code doesn't need this,
and would be prettier without, but given the amount of churn I am
already introducing I didn't want to play with tuning new mempools.

This also fixes bug in pnfs_ld_handle_read_error.  In the case of
desc->pg_bsize < PAGE_CACHE_SIZE, the pages list was empty, causing
replay attempt to do nothing.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-04-27 14:10:37 -04:00
Fred Isaman
cd841605f7 NFS: create common nfs_pgio_header for both read and write
In order to avoid duplicating all the data in nfs_read_data whenever we
split it up into multiple RPC calls (either due to a short read result
or due to rsize < PAGE_SIZE), we split out the bits that are the same
per RPC call into a separate "header" structure.

The goal this patch moves towards is to have a single header
refcounted by several rpc_data structures.  Thus, want to always refer
from rpc_data to the header, and not the other way.  This patch comes
close to that ideal, but the directio code currently needs some
special casing, isolated in the nfs_direct_[read_write]hdr_release()
functions.  This will be dealt with in a future patch.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-04-27 14:10:37 -04:00
Fred Isaman
0b7c01533a NFS: add a struct nfs_commit_data to replace nfs_write_data in commits
Commits don't need the vectors of pages, etc. that writes do. Split out
a separate structure for the commit operation.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-04-27 14:10:37 -04:00
Bryan Schumaker
f05d147f7e NFS: Fix following referral mount points with different security
I create a new proc_lookup_mountpoint() to use when submounting an NFS
v4 share.  This function returns an rpc_clnt to use for performing an
fs_locations() call on a referral's mountpoint.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-04-27 14:10:02 -04:00
Bryan Schumaker
72de53ec4b NFS: Do secinfo as part of lookup
Whenever lookup sees wrongsec do a secinfo and retry the lookup to find
attributes of the file or directory, such as "is this a referral
mountpoint?".  This also allows me to remove handling -NFS4ERR_WRONSEC
as part of getattr xdr decoding.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-04-27 14:10:02 -04:00
Bryan Schumaker
db0a9593d5 NFS: Handle exceptions coming out of nfs4_proc_fs_locations()
We don't want to return -NFS4ERR_WRONGSEC to the VFS because it could
cause the kernel to oops.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-04-27 14:10:01 -04:00
Sachin Prabhu
5794d21ef4 Avoid beyond bounds copy while caching ACL
When attempting to cache ACLs returned from the server, if the bitmap
size + the ACL size is greater than a PAGE_SIZE but the ACL size itself
is smaller than a PAGE_SIZE, we can read past the buffer page boundary.

Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Reported-by: Jian Li <jiali@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-04-27 14:09:53 -04:00
Sachin Prabhu
5a00689930 Avoid reading past buffer when calling GETACL
Bug noticed in commit
bf118a342f

When calling GETACL, if the size of the bitmap array, the length
attribute and the acl returned by the server is greater than the
allocated buffer(args.acl_len), we can Oops with a General Protection
fault at _copy_from_pages() when we attempt to read past the pages
allocated.

This patch allocates an extra PAGE for the bitmap and checks to see that
the bitmap + attribute_length + ACLs don't exceed the buffer space
allocated to it.

Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Reported-by: Jian Li <jiali@redhat.com>
[Trond: Fixed a size_t vs unsigned int printk() warning]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-04-27 13:15:07 -04:00
Trond Myklebust
95b72eb0bd NFSv4: Ensure we do not reuse open owner names
The NFSv4 spec is ambiguous about whether or not it is permissible
to reuse open owner names, so play it safe. This patch adds a timestamp
to the state_owner structure, and combines that with the IDA based
uniquifier.
Fixes a regression whereby the Linux server returns NFS4ERR_BAD_SEQID.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-04-20 23:14:28 -04:00
Trond Myklebust
451146be93 NFSv4: Fix open(O_TRUNC) and ftruncate() error handling
If the file wasn't opened for writing, then truncate and ftruncate
need to report the appropriate errors.

Reported-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2012-04-19 13:23:09 -04:00
Trond Myklebust
55725513b5 NFSv4: Ensure that we check lock exclusive/shared type against open modes
Since we may be simulating flock() locks using NFS byte range locks,
we can't rely on the VFS having checked the file open mode for us.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2012-04-19 13:23:08 -04:00
Trond Myklebust
05ffe24f52 NFSv4: Ensure that the LOCK code sets exception->inode
All callers of nfs4_handle_exception() that need to handle
NFS4ERR_OPENMODE correctly should set exception->inode

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2012-04-19 13:23:00 -04:00
Trond Myklebust
14977489ff NFSv4: Minor cleanups for nfs4_handle_exception and nfs4_async_handle_error
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-27 21:53:14 -04:00
Trond Myklebust
e59d27e05a NFSv4.1: Fix layoutcommit error handling
Firstly, task->tk_status will always return negative error values,
so the current tests for 'NFS4ERR_DELEG_REVOKED' etc. are all being
ignored.
Secondly, clean up the code so that we only need to test
task->tk_status once!

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2012-03-27 21:53:14 -04:00
Trond Myklebust
05e9cfb408 NFSv4: Fix two infinite loops in the mount code
We can currently loop forever in nfs4_lookup_root() and in
nfs41_proc_secinfo_no_name(), if the first iteration returns a
NFS4ERR_DELAY or something else that causes exception.retry to get
set.

Reported-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2012-03-27 21:53:14 -04:00
Sachin Prabhu
20e0fa98b7 Fix length of buffer copied in __nfs4_get_acl_uncached
_copy_from_pages() used to copy data from the temporary buffer to the
user passed buffer is passed the wrong size parameter when copying
data. res.acl_len contains both the bitmap and acl lenghts while
acl_len contains the acl length after adjusting for the bitmap size.

Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-24 14:33:26 -04:00
Linus Torvalds
f63d395d47 NFS client updates for Linux 3.4
New features include:
 - Add NFS client support for containers.
   This should enable most of the necessary functionality, including
   lockd support, and support for rpc.statd, NFSv4 idmapper and
   RPCSEC_GSS upcalls into the correct network namespace from
   which the mount system call was issued.
 - NFSv4 idmapper scalability improvements
   Base the idmapper cache on the keyring interface to allow concurrent
   access to idmapper entries. Start the process of migrating users from
   the single-threaded daemon-based approach to the multi-threaded
   request-key based approach.
 - NFSv4.1 implementation id.
   Allows the NFSv4.1 client and server to mutually identify each other
   for logging and debugging purposes.
 - Support the 'vers=4.1' mount option for mounting NFSv4.1 instead of
   having to use the more counterintuitive 'vers=4,minorversion=1'.
 - SUNRPC tracepoints.
   Start the process of adding tracepoints in order to improve debugging
   of the RPC layer.
 - pNFS object layout support for autologin.
 
 Important bugfixes include:
 - Fix a bug in rpc_wake_up/rpc_wake_up_status that caused them to fail
   to wake up all tasks when applied to priority waitqueues.
 - Ensure that we handle read delegations correctly, when we try to
   truncate a file.
 - A number of fixes for NFSv4 state manager loops (mostly to do with
   delegation recovery).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJPalZbAAoJEGcL54qWCgDyCi4P+QHcmzQhJO7HWx3Pzjs67bFT
 xMSYaKHGWS4AJKUBVl5OKBxUExfrMHBNbElV3IKUIwBlDx8RVtnwfptKSe146iki
 dn4TrRO5es8nmI4hRDcGMlzJDZq4y0Qg//qiUFmojiNW/Avw0ljfMoVUejJJ09FV
 oeDk4EGtcxkEyH+g48ZjYbyspRnG8qtD3atf70Z3lYE0ELdG/B5Dyzw1RDrA5p73
 xJX3lqy8p/4ROzw/dmNoxdAXOrr3Q4/T58Bvp/lUglPy/EHyPmWzFoH0MU0C/PFu
 5VnAl6QDbNCTcIw9FvJlX/mIyErpNG9eKzUskUc9L9SA+B+J/i4rIap4KATRN3nH
 7QhE5qUacPuJnvxml7MPmlQTuft3fkAQ7NhKIWrbRi1QS9FmJC5NxctIb8loqlFn
 yIXdKeLfMshB+NyuFS9uzStX7SmV3eMgVd+5ZxRjYxm+PKJLw2KXeudArL6M5mHK
 3QeKZpqwaYQ3RfaTNpvAp0doiXHCO5UbWfI0Pe8xQs/QcMCNReffqV2G4IJKFAu6
 WpoN2UDQC9LCBifLw2nS7kku8+ZVXLQU8OC1NVl3TG15xD9cNLXuk3/y5llPGq4O
 odo52uLFpJohbDaHMj5RTKOfchTQCm2iyuVmxZEeAySypMSiAXmW7COSKHs/HxI1
 VBm+EI00Pvmm5+fUjIlp
 =LuHE
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.4-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client updates for Linux 3.4 from Trond Myklebust:
 "New features include:
   - Add NFS client support for containers.

     This should enable most of the necessary functionality, including
     lockd support, and support for rpc.statd, NFSv4 idmapper and
     RPCSEC_GSS upcalls into the correct network namespace from which
     the mount system call was issued.

   - NFSv4 idmapper scalability improvements

     Base the idmapper cache on the keyring interface to allow
     concurrent access to idmapper entries.  Start the process of
     migrating users from the single-threaded daemon-based approach to
     the multi-threaded request-key based approach.

   - NFSv4.1 implementation id.

     Allows the NFSv4.1 client and server to mutually identify each
     other for logging and debugging purposes.

   - Support the 'vers=4.1' mount option for mounting NFSv4.1 instead of
     having to use the more counterintuitive 'vers=4,minorversion=1'.

   - SUNRPC tracepoints.

     Start the process of adding tracepoints in order to improve
     debugging of the RPC layer.

   - pNFS object layout support for autologin.

  Important bugfixes include:

   - Fix a bug in rpc_wake_up/rpc_wake_up_status that caused them to
     fail to wake up all tasks when applied to priority waitqueues.

   - Ensure that we handle read delegations correctly, when we try to
     truncate a file.

   - A number of fixes for NFSv4 state manager loops (mostly to do with
     delegation recovery)."

* tag 'nfs-for-3.4-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (224 commits)
  NFS: fix sb->s_id in nfs debug prints
  xprtrdma: Remove assumption that each segment is <= PAGE_SIZE
  xprtrdma: The transport should not bug-check when a dup reply is received
  pnfs-obj: autologin: Add support for protocol autologin
  NFS: Remove nfs4_setup_sequence from generic rename code
  NFS: Remove nfs4_setup_sequence from generic unlink code
  NFS: Remove nfs4_setup_sequence from generic read code
  NFS: Remove nfs4_setup_sequence from generic write code
  NFS: Fix more NFS debug related build warnings
  SUNRPC/LOCKD: Fix build warnings when CONFIG_SUNRPC_DEBUG is undefined
  nfs: non void functions must return a value
  SUNRPC: Kill compiler warning when RPC_DEBUG is unset
  SUNRPC/NFS: Add Kbuild dependencies for NFS_DEBUG/RPC_DEBUG
  NFS: Use cond_resched_lock() to reduce latencies in the commit scans
  NFSv4: It is not safe to dereference lsp->ls_state in release_lockowner
  NFS: ncommit count is being double decremented
  SUNRPC: We must not use list_for_each_entry_safe() in rpc_wake_up()
  Try using machine credentials for RENEW calls
  NFSv4.1: Fix a few issues in filelayout_commit_pagelist
  NFSv4.1: Clean ups and bugfixes for the pNFS read/writeback/commit code
  ...
2012-03-23 08:53:47 -07:00
Bryan Schumaker
c6bfa1a163 NFS: Remove nfs4_setup_sequence from generic rename code
This is an NFS v4 specific operation, so it belongs in the NFS v4 code
and not the generic client.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-21 09:31:46 -04:00
Bryan Schumaker
34e137cc7e NFS: Remove nfs4_setup_sequence from generic unlink code
This is an NFS v4 specific operation, so it belongs in the NFS v4 code
and not the generic client.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-21 09:31:46 -04:00
Bryan Schumaker
ea7c330362 NFS: Remove nfs4_setup_sequence from generic read code
This is an NFS v4 specific operation, so it belongs in the NFS v4 code
and not the generic client.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-21 09:31:45 -04:00
Bryan Schumaker
c6cb80d00b NFS: Remove nfs4_setup_sequence from generic write code
This is an NFS v4 specific operation, so it belongs in the NFS v4 code
and not the generic client.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-21 09:31:45 -04:00
Trond Myklebust
5ae67c4fee NFSv4: It is not safe to dereference lsp->ls_state in release_lockowner
It is quite possible for the release_lockowner RPC call to race with the
close RPC call, in which case, we cannot dereference lsp->ls_state in
order to find the nfs_server.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-20 13:08:25 -04:00
Cong Wang
2b86ce2db3 nfs: remove the second argument of k[un]map_atomic()
Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20 21:48:24 +08:00
Trond Myklebust
9a3ba43233 NFSv4: Rate limit the state manager warning messages
Prevent the state manager from filling up system logs when recovery
fails on the server.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2012-03-12 18:15:22 -04:00
Trond Myklebust
17280175c5 NFS: Fix a number of sparse warnings
Fix a number of "warning: symbol 'foo' was not declared. Should it be
static?" conditions.

Fix 2 cases of "warning: Using plain integer as NULL pointer"

fs/nfs/delegation.c:263:31: warning: restricted fmode_t degrades to integer
  - We want to allow upgrades to a WRITE delegation, but should otherwise
    consider servers that hand out duplicate delegations to be borken.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-11 15:14:16 -04:00
Trond Myklebust
4fc8796d23 NFSv4: Clean up nfs4_select_rw_stateid()
Ensure that we select delegation stateids first, then
lock stateids and then open stateids.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-08 22:38:55 -05:00
Trond Myklebust
0032a7a749 NFS: Don't copy read delegation stateids in setattr
The server will just return an NFS4ERR_OPENMODE anyway.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-08 22:37:12 -05:00
Trond Myklebust
3114ea7a24 NFSv4: Return the delegation if the server returns NFS4ERR_OPENMODE
If a setattr() fails because of an NFS4ERR_OPENMODE error, it is
probably due to us holding a read delegation. Ensure that the
recovery routines return that delegation in this case.

Reported-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2012-03-07 17:11:19 -05:00
Trond Myklebust
cf470c3e00 NFSv4: Don't free the nfs4_lock_state until after the release_lockowner
Otherwise we can end up with sequence id problems if the client reuses
the owner_id before the server has processed the release_lockowner

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-07 13:49:12 -05:00
Chuck Lever
cd93710e8d NFS: Fix nfs4_verifier memory alignment
Clean up due to code review.

The nfs4_verifier's data field is not guaranteed to be u32-aligned.
Casting an array of chars to a u32 * is considered generally
hazardous.

Fix this by using a __be32 array to generate a verifier's contents,
and then byte-copy the contents into the verifier field.  The contents
of a verifier, for all intents and purposes, are opaque bytes.  Only
local code that generates a verifier need know the actual content and
format.  Everyone else compares the full byte array for exact
equality.

Also, sizeof(nfs4_verifer) is the size of the in-core verifier data
structure, but NFS4_VERIFIER_SIZE is the number of octets in an XDR'd
verifier.  The two are not interchangeable, even if they happen to
have the same value.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-06 10:32:48 -05:00
Trond Myklebust
2d2f24add1 NFSv4: Simplify the struct nfs4_stateid
Replace the union with the common struct stateid4 as defined in both
RFC3530 and RFC5661. This makes it easier to access the sequence id,
which will again make implementing support for parallel OPEN calls
easier.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-06 10:32:47 -05:00
Trond Myklebust
f597c53790 NFSv4: Add helpers for basic copying of stateids
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-06 10:32:46 -05:00
Trond Myklebust
1e3987c305 NFSv4: Rename nfs4_copy_stateid()
It is really a function for selecting the correct stateid to use in a
read or write situation.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-06 10:32:46 -05:00
Trond Myklebust
36281caa83 NFSv4: Further clean-ups of delegation stateid validation
Change the name to reflect what we're really doing: testing two
stateids for whether or not they match according the the rules in
RFC3530 and RFC5661.
Move the code from callback_proc.c to nfs4proc.c

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-06 10:32:44 -05:00
Trond Myklebust
a1d0b5eebc NFS: Properly handle the case where the delegation is revoked
If we know that the delegation stateid is bad or revoked, we need to
remove that delegation as soon as possible, and then mark all the
stateids that relied on that delegation for recovery. We cannot use
the delegation as part of the recovery process.

Also note that NFSv4.1 uses a different error code (NFS4ERR_DELEG_REVOKED)
to indicate that the delegation was revoked.

Finally, ensure that setlk() and setattr() can both recover safely from
a revoked delegation.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
2012-03-06 10:32:43 -05:00
Trond Myklebust
8aa0a410af Merge commit 'nfs-for-3.3-4' into nfs-for-next
Conflicts:
	fs/nfs/nfs4proc.c

Back-merge of the upstream kernel in order to fix a conflict with the
slotid type conversion and implementation id patches...
2012-03-03 15:05:56 -05:00
Chuck Lever
264e6351c5 NFS: Request fh_expire_type attribute in "server caps" operation
The fh_expire_type file attribute is a filesystem wide attribute that
consists of flags that indicate what characteristics file handles
on this FSID have.

Our client doesn't support volatile file handles.  It should find
out early (say, at mount time) whether the server is going to play
shenanighans with file handles during a migration.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-02 17:18:10 -05:00
Chuck Lever
81934ddb8e NFS: Introduce NFS_ATTR_FATTR_V4_LOCATIONS
The Linux NFS client must distinguish between referral events (which
it currently supports) and migration events (which it does not yet
support).

In both types of events, an fs_locations array is returned.  But upper
layers, not the XDR layer, should make the distinction between a
referral and a migration.  There really isn't a way for an XDR decoder
function to distinguish the two, in general.

Slightly adjust the FATTR flags returned by decode_fs_locations()
to set NFS_ATTR_FATTR_V4_LOCATIONS only if a non-empty locations
array was returned from the server.  Then have logic in nfs4proc.c
distinguish whether the locations array is for a referral or
something else.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-02 17:18:09 -05:00
Trond Myklebust
4e0038b6b2 SUNRPC: Move clnt->cl_server into struct rpc_xprt
When the cl_xprt field is updated, the cl_server field will also have
to change.  Since the contents of cl_server follow the remote endpoint
of cl_xprt, just move that field to the rpc_xprt.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
[ cel: simplify check_gss_callback_principal(), whitespace changes ]
[ cel: forward ported to 3.4 ]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-02 15:36:41 -05:00
Trond Myklebust
2446ab6070 SUNRPC: Use RCU to dereference the rpc_clnt.cl_xprt field
A migration event will replace the rpc_xprt used by an rpc_clnt.  To
ensure this can be done safely, all references to cl_xprt must now use
a form of rcu_dereference().

Special care is taken with rpc_peeraddr2str(), which returns a pointer
to memory whose lifetime is the same as the rpc_xprt.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
[ cel: fix lockdep splats and layering violations ]
[ cel: forward ported to 3.4 ]
[ cel: remove rpc_max_reqs(), add rpc_net_ns() ]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-02 15:36:38 -05:00
Chuck Lever
a3ca5651cb NFS: Add debugging messages to NFSv4's CLOSE procedure
CLOSE is new with NFSv4.  Sometimes it's important to know the timing
of this operation compared to things like lease renewal.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-02 15:36:34 -05:00
Weston Andros Adamson
7d2ed9ac22 NFSv4: parse and display server implementation ids
Shows the implementation ids in /proc/self/mountstats.  This doesn't break
the nfs-utils mountstats tool.

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-01 17:10:22 -05:00
Weston Andros Adamson
9edbd953f8 NFSv4: fix server_scope memory leak
server_scope would never be freed if nfs4_check_cl_exchange_flags() returned
non-zero

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-03-01 17:10:21 -05:00
Weston Andros Adamson
abe9a6d57b NFSv4: fix server_scope memory leak
server_scope would never be freed if nfs4_check_cl_exchange_flags() returned
non-zero

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-02-17 17:34:03 -05:00
Trond Myklebust
f86f36a6ae NFSv4.1: Fix a NFSv4.1 session initialisation regression
Commit aacd553 (NFSv4.1: cleanup init and reset of session slot tables)
introduces a regression in the session initialisation code. New tables
now find their sequence ids initialised to 0, rather than the mandated
value of 1 (see RFC5661).

Fix the problem by merging nfs4_reset_slot_table() and nfs4_init_slot_table().
Since the tbl->max_slots is initialised to 0, the test in
nfs4_reset_slot_table for max_reqs != tbl->max_slots will automatically
pass for an empty table.

Reported-by: Vitaliy Gusev <gusev.vitaliy@nexenta.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-02-17 17:33:39 -05:00
Vitaliy Gusev
b4b9a0c1c8 nfs41: Verify channel's attributes accordingly to RFC v2
ca_maxoperations:

      For the backchannel, the server MUST
      NOT change the value the client offers.  For the fore channel,
      the server MAY change the requested value.

  ca_maxrequests:

       For the backchannel, the server MUST NOT change the
       value the client offers.  For the fore channel, the server MAY
       change the requested value.

Signed-off-by: Vitaliy Gusev <gusev.vitaliy@nexenta.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-02-15 11:16:11 -05:00
Trond Myklebust
ef159e9177 NFSv4.1: Add a module parameter to set the number of session slots
Add the module parameter 'max_session_slots' to set the initial number
of slots that the NFSv4.1 client will attempt to negotiate with the
server.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-02-15 00:19:44 -05:00
Trond Myklebust
45d43c291e NFSv4.1: Convert slotid from u8 to u32
It is perfectly legal to negotiate up to 2^32-1 slots in the protocol,
and with 10GigE, we are already seeing that 255 slots is far too limiting.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-02-15 00:19:43 -05:00
Weston Andros Adamson
a030889a01 NFS: start printks w/ NFS: even if __func__ shown
This patch addresses printks that have some context to show that they are
from fs/nfs/, but for the sake of consistency now start with NFS:

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-02-06 18:48:00 -05:00
Weston Andros Adamson
f9fd2d9c1f NFS: printks in fs/nfs/ should start with NFS:
Messages like "Got error -10052 from the server on DESTROY_SESSION. Session
has been destroyed regardless" can be confusing to users who aren't very
familiar with NFS.

NOTE: This patch ignores any printks() that start by printing __func__ - that
will be in a separate patch.

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-02-06 18:47:59 -05:00
Bryan Schumaker
b01dd1d8fa NFS: Call test_stateid() and free_stateid() with correct stateids
I noticed that test_stateid() was always using the same stateid for open
and lock recovery.  After poking around a bit, I discovered that it was
always testing with a delegation stateid (even if there was no
delegation present).  I figured this wasn't correct, so now delegation
and open stateids are tested during open_expired() and lock stateids are
tested during lock_expired().

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-02-06 18:47:58 -05:00
Bryan Schumaker
1cab0652ba NFS: Pass a stateid to test_stateid() and free_stateid()
This takes the guesswork out of what stateid to use.  The caller is
expected to figure this out and pass in the correct one.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-02-06 18:47:58 -05:00
Trond Myklebust
331818f1c4 NFSv4: Fix an Oops in the NFSv4 getacl code
Commit bf118a342f (NFSv4: include bitmap
in nfsv4 get acl data) introduces the 'acl_scratch' page for the case
where we may need to decode multi-page data. However it fails to take
into account the fact that the variable may be NULL (for the case where
we're not doing multi-page decode), and it also attaches it to the
encoding xdr_stream rather than the decoding one.

The immediate result is an Oops in nfs4_xdr_enc_getacl due to the
call to page_address() with a NULL page pointer.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Andy Adamson <andros@netapp.com>
Cc: stable@vger.kernel.org
2012-02-03 18:50:34 -05:00
Trond Myklebust
a4980e7840 NFSv4: ACCESS validation doesn't require a full attribute refresh
We only really need to check the change attribute, so let's just use the
server->cache_consistency_bitmask.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-31 19:28:24 -05:00
Trond Myklebust
961a828df6 SUNRPC: Fix potential races in xprt_lock_write_next()
We have to ensure that the wake up from the waitqueue and the assignment
of xprt->snd_task are atomic. We can do this by assigning the snd_task
while under the waitqueue spinlock.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-31 19:28:08 -05:00
Trond Myklebust
536e43d12b NFS: Optimise away unnecessary setattrs for open(O_TRUNC);
Currently, we will correctly optimise away a truncate that doesn't
change the file size. However, in the case of open(O_TRUNC), we
also want to optimise away the time changes.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-31 19:28:07 -05:00
Trond Myklebust
48c22eb210 NFS: Move struct nfs_unique_id into struct nfs_seqid_counter
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-31 19:28:07 -05:00
Trond Myklebust
9d12b216aa NFSv41: Add a new helper nfs4_init_sequence()
Clean up

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-31 19:28:06 -05:00
Trond Myklebust
d2d7ce28a2 NFSv4: Replace lock_owner->ld_id with an ida based allocator
Again, We're unlikely to ever need more than 2^31 simultaneous lock
owners, so let's replace the custom allocator.

Now that there are no more users, we can also get rid of the custom
allocator code.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-31 18:20:28 -05:00
Trond Myklebust
9157c31dd6 NFSv4: Replace state_owner->so_owner_id with an ida based allocator
We're unlikely to ever need more than 2^31 simultaneous open owners,
so let's replace the custom allocator with the generic ida allocator.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-31 18:20:28 -05:00
Trond Myklebust
d1e284d50a NFSv4: Clean up nfs4_get_state_owner
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-31 18:20:28 -05:00
Peng Tao
de040beccd NFS4: fix compile warnings in nfs4proc.c
compile in nfs-for-3.3 branch shows following warnings. Fix it here.

fs/nfs/nfs4proc.c: In function ‘__nfs4_get_acl_uncached’:
fs/nfs/nfs4proc.c:3589: warning: format ‘%ld’ expects type ‘long int’, but argument 4 has type ‘size_t’
fs/nfs/nfs4proc.c:3589: warning: format ‘%ld’ expects type ‘long int’, but argument 6 has type ‘size_t’

Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-12 16:31:51 -05:00
Linus Torvalds
57eccf1c2a Merge branch 'nfs-for-3.3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
* 'nfs-for-3.3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFSv4: Change the default setting of the nfs4_disable_idmapping parameter
  NFSv4: Save the owner/group name string when doing open
  NFS: Remove pNFS bloat from the generic write path
  pnfs-obj: Must return layout on IO error
  pnfs-obj: pNFS errors are communicated on iodata->pnfs_error
  NFS: Cache state owners after files are closed
  NFS: Clean up nfs4_find_state_owners_locked()
  NFSv4: include bitmap in nfsv4 get acl data
  nfs: fix a minor do_div portability issue
  NFSv4.1: cleanup comment and debug printk
  NFSv4.1: change nfs4_free_slot parameters for dynamic slots
  NFSv4.1: cleanup init and reset of session slot tables
  NFSv4.1: fix backchannel slotid off-by-one bug
  nfs: fix regression in handling of context= option in NFSv4
  NFS - fix recent breakage to NFS error handling.
  NFS: Retry mounting NFSROOT
  SUNRPC: Clean up the RPCSEC_GSS service ticket requests
2012-01-10 14:57:40 -08:00
Trond Myklebust
6926afd192 NFSv4: Save the owner/group name string when doing open
...so that we can do the uid/gid mapping outside the asynchronous RPC
context.
This fixes a bug in the current NFSv4 atomic open code where the client
isn't able to determine what the true uid/gid fields of the file are,
(because the asynchronous nature of the OPEN call denies it the ability
to do an upcall) and so fills them with default values, marking the
inode as needing revalidation.
Unfortunately, in some cases, the VFS will do some additional sanity
checks on the file, and may override the server's decision to allow
the open because it sees the wrong owner/group fields.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-07 13:22:46 -05:00
Andy Adamson
bf118a342f NFSv4: include bitmap in nfsv4 get acl data
The NFSv4 bitmap size is unbounded: a server can return an arbitrary
sized bitmap in an FATTR4_WORD0_ACL request.  Replace using the
nfs4_fattr_bitmap_maxsz as a guess to the maximum bitmask returned by a server
with the inclusion of the bitmap (xdr length plus bitmasks) and the acl data
xdr length to the (cached) acl page data.

This is a general solution to commit e5012d1f "NFSv4.1: update
nfs4_fattr_bitmap_maxsz" and fixes hitting a BUG_ON in xdr_shrink_bufhead
when getting ACLs.

Fix a bug in decode_getacl that returned -EINVAL on ACLs > page when getxattr
was called with a NULL buffer, preventing ACL > PAGE_SIZE from being retrieved.

Cc: stable@kernel.org
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-05 10:42:42 -05:00
Andy Adamson
0b1c8fc43c NFSv4.1: cleanup comment and debug printk
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-05 10:42:41 -05:00
Andy Adamson
aabd0b40b3 NFSv4.1: change nfs4_free_slot parameters for dynamic slots
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-05 10:42:41 -05:00
Andy Adamson
aacd553727 NFSv4.1: cleanup init and reset of session slot tables
We are either initializing or resetting a session. Initialize or reset
the session slot tables accordingly.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-05 10:42:40 -05:00
Rafael J. Wysocki
b00f4dc5ff Merge branch 'master' into pm-sleep
* master: (848 commits)
  SELinux: Fix RCU deref check warning in sel_netport_insert()
  binary_sysctl(): fix memory leak
  mm/vmalloc.c: remove static declaration of va from __get_vm_area_node
  ipmi_watchdog: restore settings when BMC reset
  oom: fix integer overflow of points in oom_badness
  memcg: keep root group unchanged if creation fails
  nilfs2: potential integer overflow in nilfs_ioctl_clean_segments()
  nilfs2: unbreak compat ioctl
  cpusets: stall when updating mems_allowed for mempolicy or disjoint nodemask
  evm: prevent racing during tfm allocation
  evm: key must be set once during initialization
  mmc: vub300: fix type of firmware_rom_wait_states module parameter
  Revert "mmc: enable runtime PM by default"
  mmc: sdhci: remove "state" argument from sdhci_suspend_host
  x86, dumpstack: Fix code bytes breakage due to missing KERN_CONT
  IB/qib: Correct sense on freectxts increment and decrement
  RDMA/cma: Verify private data length
  cgroups: fix a css_set not found bug in cgroup_attach_proc
  oprofile: Fix uninitialized memory access when writing to writing to oprofilefs
  Revert "xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old kernel"
  ...

Conflicts:
	kernel/cgroup_freezer.c
2011-12-21 21:59:45 +01:00
Trond Myklebust
652f89f64f NFSv4: Do not accept delegated opens when a delegation recall is in effect
...and report the servers that try to return a delegation when the client
is using the CLAIM_DELEG_CUR open mode. That behaviour is explicitly
forbidden in RFC3530.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-12-09 19:05:58 -05:00
Jeff Layton
d310310cbf Freezer / sunrpc / NFS: don't allow TASK_KILLABLE sleeps to block the freezer
Allow the freezer to skip wait_on_bit_killable sleeps in the sunrpc
layer. This should allow suspend and hibernate events to proceed, even
when there are RPC's pending on the wire.

Also, wrap the TASK_KILLABLE sleeps in NFS layer in freezer_do_not_count
and freezer_count calls. This allows the freezer to skip tasks that are
sleeping while looping on EJUKEBOX or NFS4ERR_DELAY sorts of errors.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-12-06 22:12:27 +01:00
Trond Myklebust
a6f498a891 NFS: Fix a regression in the referral code
Fix a regression that was introduced by commit
0c2e53f11a (NFS: Remove the unused
"lookupfh()" version of nfs4_proc_lookup()).

In the case where the lookup gets an NFS4ERR_MOVED, we want to return
the result of nfs4_get_referral(). Instead, that value is getting
clobbered by the call to nfs4_handle_exception()...

Reported-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
Tested-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-11-07 16:20:00 -05:00
Jeff Layton
1788ea6e3b nfs: when attempting to open a directory, fall back on normal lookup (try #5)
commit d953126 changed how nfs_atomic_lookup handles an -EISDIR return
from an OPEN call. Prior to that patch, that caused the client to fall
back to doing a normal lookup. When that patch went in, the code began
returning that error to userspace. The d_revalidate codepath however
never had the corresponding change, so it was still possible to end up
with a NULL ctx->state pointer after that.

That patch caused a regression. When we attempt to open a directory that
does not have a cached dentry, that open now errors out with EISDIR. If
you attempt the same open with a cached dentry, it will succeed.

Fix this by reverting the change in nfs_atomic_lookup and allowing
attempts to open directories to fall back to a normal lookup

Also, add a NFSv4-specific f_ops->open routine that just returns
-ENOTDIR. This should never be called if things are working properly,
but if it ever is, then the dprintk may help in debugging.

To facilitate this, a new file_operations field is also added to the
nfs_rpc_ops struct.

Cc: stable@kernel.org
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-11-04 16:39:04 -04:00
Peng Tao
92407e75ce nfs4: serialize layoutcommit
Current pnfs_layoutcommit_inode can not handle parallel layoutcommit.
And as Trond suggested , there is no need for client to optimize for
parallel layoutcommit. So add NFS_INO_LAYOUTCOMMITTING flag to
mark inflight layoutcommit and serialize lalyoutcommit with it.
Also mark_inode_dirty_sync if pnfs_layoutcommit_inode fails to issue
layoutcommit.

Reported-by: Vitaliy Gusev <gusev.vitaliy@nexenta.com>
Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-31 11:51:28 -04:00
Trond Myklebust
d00c5d4386 NFS: Get rid of nfs_restart_rpc()
It can trivially be replaced with rpc_restart_call_prepare.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-19 13:58:30 -07:00
Trond Myklebust
08ef7bd3bc NFSv4: Translate NFS4ERR_BADNAME into ENOENT when applied to a lookup
Both LOOKUP and OPEN operations may return NFS4ERR_BADNAME if we send a
an invalid name as a filename argument. As far as the application is
concerned, it just has to know that the file doesn't exist, and so
ENOENT would be the appropriate reply. We should only return EINVAL
if the filename is being used to _create_ a new object on the
remote filesystem.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-18 16:13:51 -07:00
Trond Myklebust
0c2e53f11a NFS: Remove the unused "lookupfh()" version of nfs4_proc_lookup()
...and also remove the associated nfs_v4_clientops entry.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-18 16:13:51 -07:00
Trond Myklebust
a9a4a87a59 NFS: Use the inode->i_version to cache NFSv4 change attribute information
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-10-18 09:14:34 -07:00
Trond Myklebust
042b60beb4 NFSv4: renewd needs to be able to handle the NFS4ERR_CB_PATH_DOWN error
The NFSv4 spec does not specify that the server must repeat that error,
so in order to avoid having the delegations revoked, we should handle
it immediately.

Also note that NFS4ERR_CB_PATH_DOWN does in fact renew the lease...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-08-24 15:07:37 -04:00
Trond Myklebust
2f60ea6b8c NFSv4: The NFSv4.0 client must send RENEW calls if it holds a delegation
RFC3530 states that if the client holds a delegation, then it is obliged
to continue to send RENEW calls once every lease period in order to allow
the server to return NFS4ERR_CB_PATH_DOWN if the callback path is
unreachable.

This is not required for NFSv4.1, since the server can at any time set
the SEQ4_STATUS_CB_PATH_DOWN_SESSION in any SEQUENCE operation.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-08-24 15:07:37 -04:00
Trond Myklebust
8534d4ec05 NFSv4: nfs4_proc_renew should be declared static
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-08-24 15:07:37 -04:00
Trond Myklebust
b569ad3492 NFSv4: nfs4_proc_async_renew should use a GFP_NOFS allocation
We shouldn't allow the renew daemon to do direct reclaim on the NFS
partition.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-08-24 15:07:35 -04:00
Andy Adamson
db29c08909 pnfs: cleanup_layoutcommit
This gives layout driver a chance to cleanup structures they put in at
encode_layoutcommit.

Signed-off-by: Andy Adamson <andros@netapp.com>
[fixup layout header pointer for layoutcommit]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
[rm inode and pnfs_layout_hdr args from cleanup_layoutcommit()]
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-31 12:18:15 -04:00
Fred Isaman
dae100c2b1 pnfs: ask for layout_blksize and save it in nfs_server
Block layout needs it to determine IO size.

Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Tao Guo <glorioustao@gmail.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-31 12:18:15 -04:00
Andy Adamson
7f11d8d38d pnfs: GETDEVICELIST
The block driver uses GETDEVICELIST

Signed-off-by: Andy Adamson <andros@netapp.com>
[pass struct nfs_server * to getdevicelist]
[get machince creds for getdevicelist]
[fix getdevicelist decode sizing]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-31 12:18:15 -04:00
Peng Tao
a9bae5666d pnfs: let layoutcommit handle a list of lseg
There can be multiple lseg per file, so layoutcommit should be
able to handle it.

[Needed in v3.0]
CC: Stable Tree <stable@kernel.org>
Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Jim Rees <rees@umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-31 12:18:15 -04:00
Stephen Rothwell
5f00bcb38e Merge branch 'master' into devel and apply fixup from Stephen Rothwell:
vfs/nfs: fixup for nfs_open_context change

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-25 14:53:52 -04:00
Al Viro
3d4ff43d89 nfs_open_context doesn't need struct path either
just dentry, please...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 01:43:44 -04:00
Al Viro
82a2c1b77a nfs4_opendata doesn't need struct path either
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 01:43:42 -04:00
Al Viro
643168c2dc nfs4_closedata doesn't need to mess with struct path
instead of path_get()/path_put(), we can just use nfs_sb_{,de}active()
to pin the superblock down.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-07-20 01:43:41 -04:00
Trond Myklebust
a56aaa02b1 NFSv4.1: Clean up layoutreturn
Since we take a reference to it, we really ought to pass the a pointer to
the layout header in the arguments instead of assuming that
NFS_I(inode)->layout will forever point to the correct object.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-12 13:40:29 -04:00
Bryan Schumaker
f062eb6ced NFS: test and free stateids during recovery
When recovering open files and locks, the stateid should be tested
against the server and freed if it is invalid.  This patch adds new
recovery functions for NFS v4.1.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-12 13:40:28 -04:00
Bryan Schumaker
9aeda35fd6 NFS: added FREE_STATEID call
FREE_STATEID is used to tell the server that we want to free a stateid
that no longer has any locks associated with it.  This allows the client
to reclaim locks without encountering edge conditions documented in
section 8.4.3 of RFC 5661.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-12 13:40:28 -04:00
Bryan Schumaker
7d9747947a NFS: Added TEST_STATEID call
This patch adds in the xdr for doing a TEST_STATEID call with a single
stateid. RFC 5661 allows multiple stateids to be tested in a single
call, but only testing one keeps things simpler for now.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-12 13:40:27 -04:00
Bryan Schumaker
fca78d6d2c NFS: Add SECINFO_NO_NAME procedure
If the client is using NFS v4.1, then we can use SECINFO_NO_NAME to find
the secflavor for the initial mount.  If the server doesn't support
SECINFO_NO_NAME then I fall back on the "guess and check" method used
for v4.0 mounts.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-12 13:40:27 -04:00
Weston Andros Adamson
78fe0f41d9 NFS: use scope from exchange_id to skip reclaim
can be skipped if the "eir_server_scope" from the exchange_id proc differs from
previous calls.

Also, in the future server_scope will be useful for determining whether client
trunking is available

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-07-12 13:40:27 -04:00
Fred Isaman
a2e1d4f2e5 nfs4.1: fix several problems with _pnfs_return_layout
_pnfs_return_layout had the following problems:

- it did not call pnfs_free_lseg_list on all paths
- it unintentionally did a forgetful return when there was no outstanding io
- it raced with concurrent LAYOUTGETS

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-06-15 11:24:30 -04:00
Andy Adamson
533eb4611c NFSv4.1: allow nfs_fhget to succeed with mounted on fileid
Commit 28331a46d8 "Ensure we request the
ordinary fileid when doing readdirplus"
changed the meaning of NFS_ATTR_FATTR_FILEID which used to be set when
FATTR4_WORD1_MOUNTED_ON_FILED was requested.

Allow nfs_fhget to succeed with only a mounted on fileid when crossing
a mountpoint or a referral.

Ask for the fileid of the absent file system if mounted_on_fileid is not
supported.

Signed-off-by: Andy Adamson <andros@netapp.com>
cc:stable@kernel.org [2.6.39]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-06-15 11:24:29 -04:00
Benny Halevy
c9c30dd5f7 NFSv4.1: deprecate headerpadsz in CREATE_SESSION
We don't support header padding yet so better off ditching it

Reported-by: Sid Moore <learnmost@gmail.com>
Signed-off-by: Benny Halevy <benny@tonian.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-06-15 11:24:28 -04:00
Linus Torvalds
cd1acdf172 Merge branch 'pnfs-submit' of git://git.open-osd.org/linux-open-osd
* 'pnfs-submit' of git://git.open-osd.org/linux-open-osd: (32 commits)
  pnfs-obj: pg_test check for max_io_size
  NFSv4.1: define nfs_generic_pg_test
  NFSv4.1: use pnfs_generic_pg_test directly by layout driver
  NFSv4.1: change pg_test return type to bool
  NFSv4.1: unify pnfs_pageio_init functions
  pnfs-obj: objlayout_encode_layoutcommit implementation
  pnfs: encode_layoutcommit
  pnfs-obj: report errors and .encode_layoutreturn Implementation.
  pnfs: encode_layoutreturn
  pnfs: layoutret_on_setattr
  pnfs: layoutreturn
  pnfs-obj: osd raid engine read/write implementation
  pnfs: support for non-rpc layout drivers
  pnfs-obj: define per-inode private structure
  pnfs: alloc and free layout_hdr layoutdriver methods
  pnfs-obj: objio_osd device information retrieval and caching
  pnfs-obj: decode layout, alloc/free lseg
  pnfs-obj: pnfs_osd XDR client implementation
  pnfs-obj: pnfs_osd XDR definitions
  pnfs-obj: objlayoutdriver module skeleton
  ...
2011-05-29 14:10:13 -07:00
Benny Halevy
8a1636c459 pnfs: layoutret_on_setattr
With the objects layout security model, we have object capabilities
that are associated with the layout and we anticipate that the server
will issue a cb_layoutrecall for any setattr that changes security
related attributes (user/group/mode/acl) or truncates the file.

Therefore, the layout is returned before issuing the setattr to avoid
the anticipated cb_layoutrecall.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2011-05-29 20:54:36 +03:00
Benny Halevy
cbe8260369 pnfs: layoutreturn
NFSv4.1 LAYOUTRETURN implementation

Currently, does not support layout-type payload encoding.

Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com>
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: Zhang Jingwang <zhangjingwang@nrchpc.ac.cn>
[call pnfs_return_layout right before pnfs_destroy_layout]
[remove assert_spin_locked from pnfs_clear_lseg_list]
[remove wait parameter from the layoutreturn path.]
[remove return_type field from nfs4_layoutreturn_args]
[remove range from nfs4_layoutreturn_args]
[no need to send layoutcommit from _pnfs_return_layout]
[don't wait on sync layoutreturn]
[fix layout stateid in layoutreturn args]
[fixed NULL deref in _pnfs_return_layout]
[removed recaim member of nfs4_layoutreturn_args]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2011-05-29 20:54:36 +03:00
Benny Halevy
d20581aa4b pnfs: support for non-rpc layout drivers
Non-rpc layout driver such as for objects and blocks
implement their own I/O path and error handling logic.
Therefore bypass NFS-based error handling for these layout drivers.

[fix lseg ref-count bugs, and null de-refs]
[Fall out from: non-rpc layout drivers]
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
[get rid of PNFS_USE_RPC_CODE]
[get rid of __nfs4_write_done_cb]
[revert useless change in nfs4_write_done_cb]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2011-05-29 20:53:51 +03:00
Trond Myklebust
0ced63d1a2 NFSv4: Handle expired stateids when the lease is still valid
Currently, if the server returns NFS4ERR_EXPIRED in reply to a READ or
WRITE, but the RENEW test determines that the lease is still active, we
fail to recover and end up looping forever in a READ/WRITE + RENEW death
spiral.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
2011-05-27 17:42:01 -04:00
Andy Adamson
a8a4ae3a89 NFSv41: Resend on NFS4ERR_RETRY_UNCACHED_REP
Free the slot and resend the RPC with new session <slot#,seq#>.

For nfs4_async_handle_error, return -EAGAIN and set the task->tk_status to 0
to restart the async rpc in the rpc_restart_call_prepare state which resets
the slot.

For nfs4_handle_exception, retrying a call that uses nfs4_call_sync will
reset the slot via nfs41_call_sync_prepare.

For open/close/lock/locku/delegreturn/layoutcommit/unlink/rename/write
cachethis is true, so these operations will not trigger an
NFS4ERR_RETRY_UNCACHED_REP.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-05-11 14:01:33 -04:00
Trond Myklebust
1bd714f2a1 NFSv4: Ensure that clientid and session establishment can time out
The following patch ensures that we do not get permanently trapped in
the RPC layer when trying to establish a new client id or session.
This again ensures that the state manager can finish in a timely
fashion when the last filesystem to reference the nfs_client exits.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-04-24 14:29:33 -04:00
Trond Myklebust
fd954ae124 NFSv4.1: Don't loop forever in nfs4_proc_create_session
If a server for some reason keeps sending NFS4ERR_DELAY errors, we can end
up looping forever inside nfs4_proc_create_session, and so the usual
mechanisms for detecting if the nfs_client is dead don't work.

Fix this by ensuring that we loop inside the nfs4_state_manager thread
instead.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-04-24 14:28:18 -04:00
Bryan Schumaker
fb8a5ba811 NFSv4: Handle NFS4ERR_WRONGSEC outside of nfs4_handle_exception()
I only want to try other secflavors during an initial mount if
NFS4ERR_WRONGSEC is returned.  nfs4_handle_exception() could
potentially map other errors to EPERM, so we should handle this
error specially for correctness.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-04-18 17:06:00 -04:00
Bryan Schumaker
468f86134e NFSv4.1: Don't update sequence number if rpc_task is not sent
If we fail to contact the gss upcall program, then no message will
be sent to the server.  The client still updated the sequence number,
however, and this lead to NFS4ERR_SEQ_MISMATCH for the next several
RPC calls.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-04-18 17:05:48 -04:00
Bryan Schumaker
9b7160c55a NFS: don't negotiate when user specifies sec flavor
We were always attempting sec flavor negotiation, even if the user
told us a specific sec flavor to use.  If that sec flavor fails,
we should return an error rather than continuing with sec flavor
negotiation.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-04-13 15:12:23 -04:00
Bryan Schumaker
801a16dc7b NFS: Attempt mount with default sec flavor first
nfs4_lookup_root() is already configured to use either RPC_AUTH_UNIX
or a user specified flavor (through -o sec=<whatever>).  We should
use this flavor first, and only attempt negotiation if it fails
with -EPERM.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-04-13 15:12:23 -04:00
Bryan Schumaker
0fabee243a NFS: flav_array honors NFS_MAX_SECFLAVORS
NFS_MAX_SECFLAVORS should already take into account RPC_AUTH_UNIX
and RPC_AUTH_NULL, so we don't need to set aside extra slots
for them.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-04-13 15:12:22 -04:00
Bryan Schumaker
d1a8016a2d NFS: Fix infinite loop in gss_create_upcall()
There can be an infinite loop if gss_create_upcall() is called without
the userspace program running.  To prevent this, we return -EACCES if
we notice that pipe_version hasn't changed (indicating that the pipe
has not been opened).

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-04-13 15:12:22 -04:00
Bryan Schumaker
37adb89fad NFS: Change initial mount authflavor only when server returns NFS4ERR_WRONGSEC
When attempting an initial mount, we should only attempt other
authflavors if AUTH_UNIX receives a NFS4ERR_WRONGSEC error.
This allows other errors to be passed back to userspace programs.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-04-07 13:19:40 -07:00
Trond Myklebust
0acd220192 Merge branch 'nfs-for-2.6.39' into nfs-for-next 2011-03-24 17:03:14 -04:00
Weston Andros Adamson
35124a0994 Cleanup XDR parsing for LAYOUTGET, GETDEVICEINFO
changes LAYOUTGET and GETDEVICEINFO XDR parsing to:
 - not use vmap, which doesn't work on incoherent archs
 - use xdr_stream parsing for all xdr

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-24 17:01:41 -04:00
Andy Adamson
ef31153786 NFSv4.1 convert layoutcommit sync to boolean
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-24 15:49:48 -04:00
Bryan Schumaker
8f70e95f9f NFS: Determine initial mount security
When sec=<something> is not presented as a mount option,
we should attempt to determine what security flavor the
server is using.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-24 13:52:42 -04:00
Bryan Schumaker
7ebb931598 NFS: use secinfo when crossing mountpoints
A submount may use different security than the parent
mount does.  We should figure out what sec flavor the
submount uses at mount time.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-24 13:52:42 -04:00
Bryan Schumaker
5a5ea0d485 NFS: Add secinfo procedure
This patch adds the nfs4 operation secinfo as a
valid nfs rpc operation.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-24 13:52:41 -04:00
Bryan Schumaker
7c5130588d NFS: lookup supports alternate client
A later patch will need to perform a lookup using an
alternate client with a different security flavor.
This patch adds support for doing that on NFS v4.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-24 13:52:41 -04:00
Bryan Schumaker
e73b83f270 NFS: convert call_sync() to a function
This patch changes nfs4_call_sync() from a macro into a
static inline function.  As a macro, the call_sync()
function will not do any type checking and depends
on the sequence arguments always having the same name.
As a function, we get to have type checking and can
rename the arguments if we so choose.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-24 13:52:41 -04:00
Andy Adamson
863a3c6c68 NFSv4.1: layoutcommit
The filelayout driver sends LAYOUTCOMMIT only when COMMIT goes to
the data server (as opposed to the MDS) and the data server WRITE
is not NFS_FILE_SYNC.

Only whole file layout support means that there is only one IOMODE_RW layout
segment.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com>
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Mingyang Guo <guomingyang@nrchpc.ac.cn>
Signed-off-by: Tao Guo <guotao@nrchpc.ac.cn>
Signed-off-by: Zhang Jingwang <zhangjingwang@nrchpc.ac.cn>
Tested-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-23 15:29:04 -04:00
Fred Isaman
988b6dceb0 NFSv4.1: remove GETATTR from ds commits
Any COMMIT compound directed to a data server needs to have the
GETATTR calls suppressed.  We here, make sure the field we are testing
(data->lseg) is set and refcounted correctly.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-23 15:29:03 -04:00
Fred Isaman
5f452431e2 NFSv4.1: add callback to nfs4_commit_done
Add a callback that the pnfs layout driver can use to do its own
error handling of the data server's COMMIT response.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-23 15:29:03 -04:00
Trond Myklebust
b064eca2cf NFSv4: Send unmapped uid/gids to the server when using auth_sys
The new behaviour is enabled using the new module parameter
'nfs4_disable_idmapping'.

Note that if the server rejects an unmapped uid or gid, then
the client will automatically switch back to using the idmapper.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:39:27 -05:00
Trond Myklebust
3ddeb7c5c6 NFSv4: Propagate the error NFS4ERR_BADOWNER to nfs4_do_setattr
This will be required in order to switch uid/gid mapping back on if the
admin has tried to disable it.

Note that we also propagate NFS4ERR_BADNAME at the same time, in order to
work around a Linux server bug.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:39:27 -05:00
Fred Isaman
a69aef1496 NFSv4.1: pnfs filelayout driver write
Allows the pnfs filelayout driver to write to the data servers.

Note that COMMIT to data servers will be implemented in a future
patch.  To avoid improper behavior, for the moment any WRITE to a data
server that would also require a COMMIT to the data server is sent
NFS_FILE_SYNC.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com>
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Mingyang Guo <guomingyang@nrchpc.ac.cn>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:38:44 -05:00
Fred Isaman
7ffd10640d NFSv4.1: remove GETATTR from ds writes
Any WRITE compound directed to a data server needs to have the
GETATTR calls suppressed.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:38:44 -05:00
Fred Isaman
b029bc9b08 NFSv4.1: add callback to nfs4_write_done
Add callback that pnfs layout driver can use to do its own handling
of data server WRITE response.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:38:43 -05:00
Andy Adamson
cbdabc7f8b NFSv4.1: filelayout async error handler
Use our own async error handler.
Mark the layout as failed and retry i/o through the MDS on specified errors.

Update the mds_offset in nfs_readpage_retry so that a failed short-read retry
to a DS gets correctly resent through the MDS.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:38:43 -05:00
Andy Adamson
dc70d7b318 NFSv4.1: filelayout read
Attempt a pNFS file layout read by setting up the nfs_read_data struct and
calling nfs_initiate_read with the data server rpc client and the
filelayout rpc call ops.

Error handling is implemented in a subsequent patch.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com>
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Mingyang Guo <guomingyang@nrchpc.ac.cn>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Tested-by: Guo Mingyang <guomingyang@nrchpc.ac.cn>
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:38:43 -05:00
Andy Adamson
d83217c135 NFSv4.1: data server connection
Introduce a data server set_client and init session following the
nfs4_set_client and  nfs4_init_session convention.

Once a new nfs_client is on the nfs_client_list, the nfs_client cl_cons_state
serializes access to creating an nfs_client struct with matching properties.

Use the new nfs_get_client() that initializes new clients.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:38:42 -05:00
Andy Adamson
45a52a0207 NFS move nfs_client initialization into nfs_get_client
Now nfs_get_client returns an nfs_client ready to be used no matter if it was
found or created.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:38:41 -05:00
Huang Weiyi
57df216bd8 nfs4: remove duplicated #include
Remove duplicated #include('s) in
  fs/nfs/nfs4proc.c

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:18:37 -05:00
Trond Myklebust
ecac799a5e NFSv4: Fix the setlk error handler
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:18:36 -05:00
Trond Myklebust
b4410c2f7f NFSv4.1: Fix the handling of the SEQUENCE status bits
We want SEQUENCE status bits to be handled by the state manager in order
to avoid threading issues.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:18:35 -05:00
Trond Myklebust
0400a6b0cb NFSv4/4.1: Fix nfs4_schedule_state_recovery abuses
nfs4_schedule_state_recovery() should only be used when we need to force
the state manager to check the lease. If we just want to start the
state manager in order to handle a state recovery situation, we should be
using nfs4_schedule_state_manager().

This patch fixes the abuses of nfs4_schedule_state_recovery() by replacing
its use with a set of helper functions that do the right thing.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-11 15:18:22 -05:00
Andy Adamson
c34c32ea97 NFSv4.1 reclaim complete must wait for completion
Signed-off-by: Andy Adamson <andros@netapp.com>
[Trond: fix whitespace errors]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-10 15:05:01 -05:00
Ricardo Labiaga
7d6d63d642 NFSv4.1: Retry CREATE_SESSION on NFS4ERR_DELAY
Fix bug where we currently retry the EXCHANGEID call again, eventhough
we already have a valid clientid.  Instead, delay and retry the CREATE_SESSION
call.

Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-10 15:04:59 -05:00
Jovi Zhang
43b7c3f051 nfs: fix compilation warning
this commit fix compilation warning as following:
linux-2.6/fs/nfs/nfs4proc.c:3265: warning: comparison of distinct pointer types lacks a cast

Signed-off-by: Jovi Zhang <bookjovi@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-10 15:04:56 -05:00
Trond Myklebust
bf294b41ce SUNRPC: Close a race in __rpc_wait_for_completion_task()
Although they run as rpciod background tasks, under normal operation
(i.e. no SIGKILL), functions like nfs_sillyrename(), nfs4_proc_unlck()
and nfs4_do_close() want to be fully synchronous. This means that when we
exit, we want all references to the rpc_task to be gone, and we want
any dentry references etc. held by that task to be released.

For this reason these functions call __rpc_wait_for_completion_task(),
followed by rpc_put_task() in the expectation that the latter will be
releasing the last reference to the rpc_task, and thus ensuring that the
callback_ops->rpc_release() has been called synchronously.

This patch fixes a race which exists due to the fact that
rpciod calls rpc_complete_task() (in order to wake up the callers of
__rpc_wait_for_completion_task()) and then subsequently calls
rpc_put_task() without ensuring that these two steps are done atomically.

In order to avoid adding new spin locks, the patch uses the existing
waitqueue spin lock to order the rpc_task reference count releases between
the waiting process and rpciod.
The common case where nobody is waiting for completion is optimised for by
checking if the RPC_TASK_ASYNC flag is cleared and/or if the rpc_task
reference count is 1: in those cases we drop trying to grab the spin lock,
and immediately free up the rpc_task.

Those few processes that need to put the rpc_task from inside an
asynchronous context and that do not care about ordering are given a new
helper: rpc_put_task_async().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-10 15:04:52 -05:00
Neil Horman
e9e3d724e2 nfs4: Ensure that ACL pages sent over NFS were not allocated from the slab (v3)
The "bad_page()" page allocator sanity check was reported recently (call
chain as follows):

  bad_page+0x69/0x91
  free_hot_cold_page+0x81/0x144
  skb_release_data+0x5f/0x98
  __kfree_skb+0x11/0x1a
  tcp_ack+0x6a3/0x1868
  tcp_rcv_established+0x7a6/0x8b9
  tcp_v4_do_rcv+0x2a/0x2fa
  tcp_v4_rcv+0x9a2/0x9f6
  do_timer+0x2df/0x52c
  ip_local_deliver+0x19d/0x263
  ip_rcv+0x539/0x57c
  netif_receive_skb+0x470/0x49f
  :virtio_net:virtnet_poll+0x46b/0x5c5
  net_rx_action+0xac/0x1b3
  __do_softirq+0x89/0x133
  call_softirq+0x1c/0x28
  do_softirq+0x2c/0x7d
  do_IRQ+0xec/0xf5
  default_idle+0x0/0x50
  ret_from_intr+0x0/0xa
  default_idle+0x29/0x50
  cpu_idle+0x95/0xb8
  start_kernel+0x220/0x225
  _sinittext+0x22f/0x236

It occurs because an skb with a fraglist was freed from the tcp
retransmit queue when it was acked, but a page on that fraglist had
PG_Slab set (indicating it was allocated from the Slab allocator (which
means the free path above can't safely free it via put_page.

We tracked this back to an nfsv4 setacl operation, in which the nfs code
attempted to fill convert the passed in buffer to an array of pages in
__nfs4_proc_set_acl, which gets used by the skb->frags list in
xs_sendpages.  __nfs4_proc_set_acl just converts each page in the buffer
to a page struct via virt_to_page, but the vfs allocates the buffer via
kmalloc, meaning the PG_slab bit is set.  We can't create a buffer with
kmalloc and free it later in the tcp ack path with put_page, so we need
to either:

1) ensure that when we create the list of pages, no page struct has
   PG_Slab set

 or

2) not use a page list to send this data

Given that these buffers can be multiple pages and arbitrarily sized, I
think (1) is the right way to go.  I've written the below patch to
allocate a page from the buddy allocator directly and copy the data over
to it.  This ensures that we have a put_page free-able page for every
entry that winds up on an skb frag list, so it can be safely freed when
the frame is acked.  We do a put page on each entry after the
rpc_call_sync call so as to drop our own reference count to the page,
leaving only the ref count taken by tcp_sendpages.  This way the data
will be properly freed when the ack comes in

Successfully tested by myself to solve the above oops.

Note, as this is the result of a setacl operation that exceeded a page
of data, I think this amounts to a local DOS triggerable by an
uprivlidged user, so I'm CCing security on this as well.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Trond Myklebust <Trond.Myklebust@netapp.com>
CC: security@kernel.org
CC: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-04 17:28:52 -08:00
Andy Adamson
c7a360b05b NFS construct consistent co_ownerid for v4.1
As stated in section 2.4 of RFC 5661, subsequent instances of the client need
to present the same co_ownerid. Concatinate the client's IP dot address,
host name, and the rpc_auth pseudoflavor to form the co_ownerid.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-25 22:49:14 -05:00
Andy Adamson
357f54d6b3 NFS fix the setting of exchange id flag
Indicate support for referrals. Do not set any PNFS roles. Check the flags
returned by the server for validity. Do not use exchange flags from an old
client ID instance when recovering a client ID.

Update the EXCHID4_FLAG_XXX set to RFC 5661.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-11 14:17:09 -05:00
Trond Myklebust
d035c36c58 NFSv4: Ensure continued open and lockowner name uniqueness
In order to enable migration support, we will want to move some of the
structures that are subject to migration into the struct nfs_server.
In particular, if we are to move the state_owner and state_owner_id to
being a per-filesystem structure, then we should label the resulting
open/lock owners with a per-filesytem label to ensure global uniqueness.

This patch does so by adding the super block s_dev to the open/lock owner
name.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-06 16:03:13 -05:00
Fred Isaman
f7e8917a67 pnfs: layout roc code
A layout can request return-on-close.  How this interacts with the
forgetful model of never sending LAYOUTRETURNS is a bit ambiguous.
We forget any layouts marked roc, and wait for them to be completely
forgotten before continuing with the close.  In addition, to compensate
for races with any inflight LAYOUTGETs, and the fact that we do not get
any layout stateid back from the server, we set the barrier to the worst
case scenario of current_seqid + number of outstanding LAYOUTGETS.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-06 14:46:32 -05:00
Fred Isaman
cf7d63f1f9 pnfs: serialize LAYOUTGET(openstateid)
We shouldn't send a LAYOUTGET(openstateid) unless all outstanding RPCs
using the previous stateid are completed.  This requires choosing the
stateid to encode earlier, so we can abort if one is not available (we
want to use the open stateid, but a LAYOUTGET is already out using
it), and adding a count of the number of outstanding rpc calls using
layout state (which for now consist solely of LAYOUTGETs).

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-06 14:46:31 -05:00
Fred Isaman
c31663d4a1 pnfs: layoutget rpc code cleanup
No functional changes, just some code minor code rearrangement and
comments.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-06 14:46:31 -05:00
Fred Isaman
daaa82d1c7 pnfs: remove unnecessary field lgp->status
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-06 14:46:30 -05:00
Andy Adamson
42acd02182 NFS add session back channel draining
Currently session draining only drains the fore channel.
The back channel processing must also be drained.

Use the back channel highest_slot_used to indicate that a callback is being
processed by the callback thread.  Move the session complete to be per channel.

When the session is draininig, wait for any current back channel processing
to complete and stop all new back channel processing by returning NFS4ERR_DELAY
to the back channel client.

Drain the back channel, then the fore channel.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-06 14:46:25 -05:00
Andy Adamson
f4eecd5da3 NFS implement v4.0 callback_ident
Use the small id to pointer translator service to provide a unique callback
identifier per SETCLIENTID call used to identify the v4.0 callback service
associated with the clientid.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-06 14:46:24 -05:00
Aneesh Kumar K.V
64c2ce8b72 nfsv4: Switch to generic xattr handling code
This patch make nfsv4 use the generic xattr handling code
to get the nfsv4 acl. This will help us to add richacl
support to nfsv4 in later patches

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-04 13:10:41 -05:00
Aneesh Kumar K.V
a8a5da996d nfs: Set MS_POSIXACL always
We want to skip VFS applying mode for NFS. So set MS_POSIXACL always
and selectively use umask. Ideally we would want to use umask only
when we don't have inheritable ACEs set. But NFS currently don't
allow to send umask to the server. So this is best what we can do
and this is consistent with NFSv3

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-01-04 13:10:40 -05:00
Trond Myklebust
1174dd1f89 NFSv4: Convert a few commas into semicolons...
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-12-21 11:51:27 -05:00
J. Bruce Fields
611c96c8f7 nfs4: fix units bug causing hang on recovery
Note that cl_lease_time is in jiffies.  This can cause a very long wait
in the NFS4ERR_CLID_INUSE case.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-12-21 11:51:24 -05:00
Aneesh Kumar K.V
08a22b392a nfs: Discard ACL cache on mode update
An update of mode bits can result in ACL value being changed. We need
to mark the acl cache invalid when we update mode. Similarly we need
to update file attribute when we change ACL value

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-12-07 19:30:42 -05:00
Trond Myklebust
ac39612824 NFS: readdir shouldn't read beyond the reply returned by the server
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-11-15 20:44:29 -05:00
Geert Uytterhoeven
12364a4f05 nfs4: The difference of 2 pointers is ptrdiff_t
On m68k, which is 32-bit:

fs/nfs/nfs4proc.c: In function ‘nfs41_sequence_done’:
fs/nfs/nfs4proc.c:432: warning: format ‘%ld’ expects type ‘long int’, but argument 3 has type ‘int’
fs/nfs/nfs4proc.c: In function ‘nfs4_setup_sequence’:
fs/nfs/nfs4proc.c:576: warning: format ‘%ld’ expects type ‘long int’, but argument 5 has type ‘int’

On 32-bit, ptrdiff_t is int; on 64-bit, ptrdiff_t is long.

Introduced by commit dfb4f30983 ("NFSv4.1: keep
seq_res.sr_slot as pointer rather than an index")

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-10-28 15:49:29 -04:00
J. Bruce Fields
43c2e885be nfs4: fix channel attribute sanity-checks
The sanity checks here are incorrect; in the worst case they allow
values that crash the client.

They're also over-reliant on the preprocessor.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-10-25 22:19:41 -04:00
Andy Adamson
b1f69b754e NFSv4.1: pnfs: add LAYOUTGET and GETDEVICEINFO infrastructure
Add the ability to actually send LAYOUTGET and GETDEVICEINFO.  This also adds
in the machinery to handle layout state and the deviceid cache.  Note that
GETDEVICEINFO is not called directly by the generic layer.  Instead it
is called by the drivers while parsing the LAYOUTGET opaque data in response
to an unknown device id embedded therein.  RFC 5661 only encodes
device ids within the driver-specific opaque data.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Dean Hildebrand <dhildebz@umich.edu>
Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: Mike Sager <sager@netapp.com>
Signed-off-by: Ricardo Labiaga <ricardo.labiaga@netapp.com>
Signed-off-by: Tao Guo <guotao@nrchpc.ac.cn>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-10-24 18:07:10 -04:00
Andy Adamson
504913fbc8 NFS: ask for layouttypes during v4 fsinfo call
This information will be used to determine which layout driver,
if any, to use for subsequent IO on this filesystem.  Each driver
is assigned an integer id, with 0 reserved to indicate no driver.

The server can in theory return multiple ids.  However, our current
client implementation only notes the first entry and ignores the
rest.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-10-24 18:07:09 -04:00
Ricardo Labiaga
55b6e7742d Ask for time_delta during fsinfo probe
Used by the client to determine if the server has a granular enough
time stamp.

Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-10-24 18:00:04 -04:00
Bryan Schumaker
82f2e5472e NFS: Readdir plus in v4
By requsting more attributes during a readdir, we can mimic the readdir plus
operation that was in NFSv3.

To test, I ran the command `ls -lU --color=none` on directories with various
numbers of files.  Without readdir plus, I see this:

n files |    100    |   1,000   |  10,000   |  100,000  | 1,000,000
--------+-----------+-----------+-----------+-----------+----------
real    | 0m00.153s | 0m00.589s | 0m05.601s | 0m56.691s | 9m59.128s
user    | 0m00.007s | 0m00.007s | 0m00.077s | 0m00.703s | 0m06.800s
sys     | 0m00.010s | 0m00.070s | 0m00.633s | 0m06.423s | 1m10.005s
access  | 3         | 1         | 1         | 4         | 31
getattr | 2         | 1         | 1         | 1         | 1
lookup  | 104       | 1,003     | 10,003    | 100,003   | 1,000,003
readdir | 2         | 16        | 158       | 1,575     | 15,749
total   | 111       | 1,021     | 10,163    | 101,583   | 1,015,784

With readdir plus enabled, I see this:

n files |    100    |   1,000   |  10,000   |  100,000  | 1,000,000
--------+-----------+-----------+-----------+-----------+----------
real    | 0m00.115s | 0m00.206s | 0m01.079s | 0m12.521s | 2m07.528s
user    | 0m00.003s | 0m00.003s | 0m00.040s | 0m00.290s | 0m03.296s
sys     | 0m00.007s | 0m00.020s | 0m00.120s | 0m01.357s | 0m17.556s
access  | 3         | 1         | 1         | 1         | 7
getattr | 2         | 1         | 1         | 1         | 1
lookup  | 4         | 3         | 3         | 3         | 3
readdir | 6         | 62        | 630       | 6,300     | 62,993
total   | 15        | 67        | 635       | 6,305     | 63,004

Readdir plus disabled has about a 16x increase in the number of rpc calls and
is 4 - 5 times slower on large directories.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-10-23 15:27:37 -04:00
Bryan Schumaker
56e4ebf877 NFS: readdir with vmapped pages
We can use vmapped pages to read more information from the network at once.
This will reduce the number of calls needed to complete a readdir.

Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
[trondmy: Added #include for linux/vmalloc.h> in fs/nfs/dir.c]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-10-23 15:27:35 -04:00
Trond Myklebust
168667c43b NFSv4: The state manager must ignore EKEYEXPIRED.
Otherwise, we cannot recover state correctly.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-10-23 15:27:28 -04:00
Trond Myklebust
ae1007d37e NFSv4: Don't call nfs4_state_mark_reclaim_reboot() from error handlers
In the case of a server reboot, the state recovery thread starts by calling
nfs4_state_end_reclaim_reboot() in order to avoid edge conditions when
the server reboots while the client is in the middle of recovery.

However, if the client has already marked the nfs4_state as requiring
reboot recovery, then the above behaviour will cause the recovery thread to
treat the open as if it was part of such an edge condition: the open will
be recovered as if it was part of a lease expiration (and all the locks
will be lost).
Fix is to remove the call to nfs4_state_mark_reclaim_reboot from
nfs4_async_handle_error(), and nfs4_handle_exception(). Instead we leave it
to the recovery thread to do this for us.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
2010-10-19 19:42:33 -04:00
Trond Myklebust
b0ed9dbc24 NFSv4: Fix open recovery
NFSv4 open recovery is currently broken: since we do not clear the
state->flags states before attempting recovery, we end up with the
'can_open_cached()' function triggering. This again leads to no OPEN call
being put on the wire.

Reported-by: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
2010-10-19 19:41:55 -04:00
Benny Halevy
dfb4f30983 NFSv4.1: keep seq_res.sr_slot as pointer rather than an index
Having to explicitly initialize sr_slotid to NFS4_MAX_SLOT_TABLE
resulted in numerous bugs.  Keeping the current slot as a pointer
to the slot table is more straight forward and robust as it's
implicitly set up to NULL wherever the seq_res member is initialized
to zeroes.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-09-24 09:17:01 -04:00
Trond Myklebust
d688e11007 NFSv4.1: Fix the slotid initialisation in nfs_async_rename()
This fixes an Oopsable condition that was introduced by commit
d3d4152a5d (nfs: make sillyrename an async
operation)

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-09-21 16:52:40 -04:00
Jeff Layton
d3d4152a5d nfs: make sillyrename an async operation
A synchronous rename can be interrupted by a SIGKILL. If that happens
during a sillyrename operation, it's possible for the rename call to
be sent to the server, but the task exits before processing the
reply. If this happens, the sillyrenamed file won't get cleaned up
during nfs_dentry_iput and the server is left with a dangling .nfs* file
hanging around.

Fix this problem by turning sillyrename into an asynchronous operation
and have the task doing the sillyrename just wait on the reply. If the
task is killed before the sillyrename completes, it'll still proceed
to completion.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-09-17 17:31:57 -04:00
Jeff Layton
e8582a8b96 nfs: standardize the rename response container
Right now, v3 and v4 have their own variants. Create a standard struct
that will work for v3 and v4. v2 doesn't get anything but a simple error
and so isn't affected by this.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-09-17 17:31:06 -04:00
Jeff Layton
920769f031 nfs: standardize the rename args container
Each NFS version has its own version of the rename args container.
Standardize them on a common one that's identical to the one NFSv4
uses.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-09-17 17:30:25 -04:00
Trond Myklebust
2b484297e4 NFS: Add an 'open_context' element to struct nfs_rpc_ops
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-09-17 10:56:51 -04:00
Trond Myklebust
c0204fd2b8 NFS: Clean up nfs4_proc_create()
Remove all remaining references to the struct nameidata from the low level
NFS layers. Again pass down a partially initialised struct nfs_open_context
when we want to do atomic open+create.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-09-17 10:56:51 -04:00
Trond Myklebust
535918f141 NFSv4: Further cleanups for nfs4_open_revalidate()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-09-17 10:56:51 -04:00
Trond Myklebust
b8d4caddd8 NFSv4: Clean up nfs4_open_revalidate
Remove references to 'struct nameidata' from the low-level open_revalidate
code, and replace them with a struct nfs_open_context which will be
correctly initialised upon success.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-09-17 10:56:51 -04:00
Trond Myklebust
f46e0bd34e NFSv4: Further minor cleanups for nfs4_atomic_open()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-09-17 10:56:50 -04:00
Trond Myklebust
cd9a1c0e5a NFSv4: Clean up nfs4_atomic_open
Start moving the 'struct nameidata' dependent code out of the lower level
NFS code in preparation for the removal of open intents.

Instead of the struct nameidata, we pass down a partially initialised
struct nfs_open_context that will be fully initialised by the atomic open
upon success.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-09-17 10:56:50 -04:00
Trond Myklebust
0a377cff94 NFS: Fix an Oops in the NFSv4 atomic open code
Adam Lackorzynski reports:

with 2.6.35.2 I'm getting this reproducible Oops:

[  110.825396] BUG: unable to handle kernel NULL pointer dereference at
(null)
[  110.828638] IP: [<ffffffff811247b7>] encode_attrs+0x1a/0x2a4
[  110.828638] PGD be89f067 PUD bf18f067 PMD 0
[  110.828638] Oops: 0000 [#1] SMP
[  110.828638] last sysfs file: /sys/class/net/lo/operstate
[  110.828638] CPU 2
[  110.828638] Modules linked in: rtc_cmos rtc_core rtc_lib amd64_edac_mod
i2c_amd756 edac_core i2c_core dm_mirror dm_region_hash dm_log dm_snapshot
sg sr_mod usb_storage ohci_hcd mptspi tg3 mptscsih mptbase usbcore nls_base
[last unloaded: scsi_wait_scan]
[  110.828638]
[  110.828638] Pid: 11264, comm: setchecksum Not tainted 2.6.35.2 #1
[  110.828638] RIP: 0010:[<ffffffff811247b7>]  [<ffffffff811247b7>]
encode_attrs+0x1a/0x2a4
[  110.828638] RSP: 0000:ffff88003bf5b878  EFLAGS: 00010296
[  110.828638] RAX: ffff8800bddb48a8 RBX: ffff88003bf5bb18 RCX:
0000000000000000
[  110.828638] RDX: ffff8800be258800 RSI: 0000000000000000 RDI:
ffff88003bf5b9f8
[  110.828638] RBP: 0000000000000000 R08: ffff8800bddb48a8 R09:
0000000000000004
[  110.828638] R10: 0000000000000003 R11: ffff8800be779000 R12:
ffff8800be258800
[  110.828638] R13: ffff88003bf5b9f8 R14: ffff88003bf5bb20 R15:
ffff8800be258800
[  110.828638] FS:  0000000000000000(0000) GS:ffff880041e00000(0063)
knlGS:00000000556bd6b0
[  110.828638] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
[  110.828638] CR2: 0000000000000000 CR3: 00000000be8ef000 CR4:
00000000000006e0
[  110.828638] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[  110.828638] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[  110.828638] Process setchecksum (pid: 11264, threadinfo
ffff88003bf5a000, task ffff88003f232210)
[  110.828638] Stack:
[  110.828638]  0000000000000000 ffff8800bfbcf920 0000000000000000
0000000000000ffe
[  110.828638] <0> 0000000000000000 0000000000000000 0000000000000000
0000000000000000
[  110.828638] <0> 0000000000000000 0000000000000000 0000000000000000
0000000000000000
[  110.828638] Call Trace:
[  110.828638]  [<ffffffff81124c1f>] ? nfs4_xdr_enc_setattr+0x90/0xb4
[  110.828638]  [<ffffffff81371161>] ? call_transmit+0x1c3/0x24a
[  110.828638]  [<ffffffff813774d9>] ? __rpc_execute+0x78/0x22a
[  110.828638]  [<ffffffff81371a91>] ? rpc_run_task+0x21/0x2b
[  110.828638]  [<ffffffff81371b7e>] ? rpc_call_sync+0x3d/0x5d
[  110.828638]  [<ffffffff8111e284>] ? _nfs4_do_setattr+0x11b/0x147
[  110.828638]  [<ffffffff81109466>] ? nfs_init_locked+0x0/0x32
[  110.828638]  [<ffffffff810ac521>] ? ifind+0x4e/0x90
[  110.828638]  [<ffffffff8111e2fb>] ? nfs4_do_setattr+0x4b/0x6e
[  110.828638]  [<ffffffff8111e634>] ? nfs4_do_open+0x291/0x3a6
[  110.828638]  [<ffffffff8111ed81>] ? nfs4_open_revalidate+0x63/0x14a
[  110.828638]  [<ffffffff811056c4>] ? nfs_open_revalidate+0xd7/0x161
[  110.828638]  [<ffffffff810a2de4>] ? do_lookup+0x1a4/0x201
[  110.828638]  [<ffffffff810a4733>] ? link_path_walk+0x6a/0x9d5
[  110.828638]  [<ffffffff810a42b6>] ? do_last+0x17b/0x58e
[  110.828638]  [<ffffffff810a5fbe>] ? do_filp_open+0x1bd/0x56e
[  110.828638]  [<ffffffff811cd5e0>] ? _atomic_dec_and_lock+0x30/0x48
[  110.828638]  [<ffffffff810a9b1b>] ? dput+0x37/0x152
[  110.828638]  [<ffffffff810ae063>] ? alloc_fd+0x69/0x10a
[  110.828638]  [<ffffffff81099f39>] ? do_sys_open+0x56/0x100
[  110.828638]  [<ffffffff81027a22>] ? ia32_sysret+0x0/0x5
[  110.828638] Code: 83 f1 01 e8 f5 ca ff ff 48 83 c4 50 5b 5d 41 5c c3 41
57 41 56 41 55 49 89 fd 41 54 49 89 d4 55 48 89 f5 53 48 81 ec 18 01 00 00
<8b> 06 89 c2 83 e2 08 83 fa 01 19 db 83 e3 f8 83 c3 18 a8 01 8d
[  110.828638] RIP  [<ffffffff811247b7>] encode_attrs+0x1a/0x2a4
[  110.828638]  RSP <ffff88003bf5b878>
[  110.828638] CR2: 0000000000000000
[  112.840396] ---[ end trace 95282e83fd77358f ]---

We need to ensure that the O_EXCL flag is turned off if the user doesn't
set O_CREAT.

Cc: stable@kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-08-18 09:25:42 -04:00
Davidlohr Bueso
5d7ca35a18 nfs: Remove redundant NULL check upon kfree()
Signed-off-by: Davidlohr Bueso <dave@gnu.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-08-11 12:42:15 -04:00
Trond Myklebust
d05dd4e98f NFS: Fix the NFS users of rpc_restart_call()
Fix up those functions that depend on knowing whether or not
rpc_restart_call is successful or not.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-08-03 22:06:44 -04:00
Trond Myklebust
a6f03393ec NFSv4: Get rid of the bogus RPC_ASSASSINATED(task) checks
There is no real reason to have RPC_ASSASSINATED() checks in the NFS code.
As far as it is concerned, this is just an RPC error...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-08-03 22:06:43 -04:00
Trond Myklebust
452e93523d NFSv4: Clean up the process of renewing the NFSv4 lease
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-08-03 22:06:42 -04:00
Trond Myklebust
14516c3a30 NFSv4.1: Handle NFS4ERR_DELAY on SEQUENCE correctly
In RFC5661, an NFS4ERR_DELAY error on a SEQUENCE operation has the special
meaning that the server is not finished processing the request. In this
case we want to just retry the request without touching the slot.

Also fix a bug whereby we would fail to update the sequence id if the
server returned any error other than NFS_OK/NFS4ERR_DELAY.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-08-03 22:06:42 -04:00
Trond Myklebust
77041ed9b4 NFSv4: Ensure the lockowners are labelled using the fl_owner and/or fl_pid
flock locks want to be labelled using the process pid, while posix locks
want to be labelled using the fl_owner.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-07-30 14:46:10 -04:00
Trond Myklebust
d3c7b7ccc1 NFSv4: Add support for the RELEASE_LOCKOWNER operation
This is needed by NFSv4.0 servers in order to keep the number of locking
stateids at a manageable level.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-07-30 14:46:10 -04:00
Trond Myklebust
1f0e890dba NFSv4: Clean up struct nfs4_state_owner
The 'so_delegations' list appears to be unused.

Also eliminate so_client. If we already have so_server, we can get to the
nfs_client structure.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-24 15:11:43 -04:00
Trond Myklebust
1055d76d91 NFSv4.1: There is no need to init the session more than once...
Set up a flag to ensure that is indeed the case.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-22 13:24:03 -04:00
Trond Myklebust
fe74ba3a8d NFSv41: Cleanup for nfs4_alloc_session.
There is no reason to change the nfs_client state every time we allocate a
new session. Move that line into nfs4_init_client_minor_version.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-22 13:24:03 -04:00
Trond Myklebust
d77d76ffb6 NFSv41: Clean up exclusive create
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-22 13:24:03 -04:00
Trond Myklebust
e047a10c12 NFSv41: Fix nfs_async_inode_return_delegation() ugliness
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-22 13:24:02 -04:00
Trond Myklebust
c48f4f3541 NFSv41: Convert the various reboot recovery ops etc to minor version ops
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-22 13:24:02 -04:00
Trond Myklebust
97dc135947 NFSv41: Clean up the NFSv4.1 minor version specific operations
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-22 13:24:02 -04:00
Trond Myklebust
a2118c33aa NFSv41: Don't store session state in the nfs_client->cl_state
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-22 13:24:02 -04:00
Trond Myklebust
df8964554a NFSv41: Further cleanup for nfs4_sequence_done
Instead of testing if the nfs_client has a session, we should be testing if
the struct nfs4_sequence_res was set up with one.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-22 13:24:02 -04:00
Trond Myklebust
035168ab39 NFSv4.1: Make nfs4_setup_sequence take a nfs_server argument
In anticipation of the day when we have per-filesystem sessions, and also
in order to allow the session to change in the event of a filesystem
migration event.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-22 13:24:02 -04:00
Trond Myklebust
71ac6da994 NFSv4.1: Merge the nfs41_proc_async_sequence() and nfs4_proc_sequence()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-22 13:24:01 -04:00
Trond Myklebust
aa5190d0ed NFSv4: Kill nfs4_async_handle_error() abuses by NFSv4.1
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-22 13:24:01 -04:00
Trond Myklebust
d185a334c7 NFSv4.1: Simplify nfs41_sequence_done()
Nobody uses the rpc_status parameter.

It is not obvious why we need the struct nfs_client argument either, when
we already have that information in the session.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-22 13:24:01 -04:00
Trond Myklebust
2a6e26cdb8 NFSv4.1: Clean up nfs4_setup_sequence
Firstly, there is little point in first zeroing out the entire struct
nfs4_sequence_res, and then initialising all fields save one. Just
initialise the last field to zero...

Secondly, nfs41_setup_sequence() has only 2 possible return values: 0, or
-EAGAIN, so there is no 'terminate rpc task' case.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-22 13:24:01 -04:00
Trond Myklebust
d5f8d3fe72 NFSv41: Fix a memory leak in nfs41_proc_async_sequence()
If the call to rpc_call_async() fails, then the arguments will not be
freed, since there will be no call to nfs41_sequence_call_done

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-06-22 13:24:01 -04:00
Trond Myklebust
8535b2be51 NFSv4: Don't use GFP_KERNEL allocations in state recovery
We do not want to have the state recovery thread kick off and wait for a
memory reclaim, since that may deadlock when the writebacks end up
waiting for the state recovery thread to complete.

The safe thing is therefore to use GFP_NOFS in all open, close,
delegation return, lock, etc. operations that may be called by the
state recovery thread.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:33 -04:00
Chuck Lever
9bc4e3ca46 NFS: Calldata for nfs4_renew_done()
I'm about to change task->tk_start from a jiffies value to a ktime_t
value in order to make RPC RTT reporting more precise.

Recently (commit dc96aef9) nfs4_renew_done() started to reference
task->tk_start so that a jiffies value no longer had to be passed
from nfs4_proc_async_renew().  This allowed the calldata to point to
an nfs_client instead.

Changing task->tk_start to a ktime_t value makes it effectively
useless for renew timestamps, so we need to restore the pre-dc96aef9
logic that provided a jiffies "start" timestamp to nfs4_renew_done().

Both an nfs_client pointer and a timestamp need to be passed to
nfs4_renew_done(), so create a new nfs_renewdata structure that
contains both, resembling what is already done for delegreturn,
lock, and unlock.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:32 -04:00
Trond Myklebust
bb8b27e504 NFSv4: Clean up the NFSv4 setclientid operation
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:30 -04:00
Trond Myklebust
0ab64e0e14 NFS: Reduce stack footprint of nfs4_proc_create()
Move the O_EXCL open handling into _nfs4_do_open() where it belongs. Doing
so also allows us to reuse the struct fattr from the opendata.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:27 -04:00
Trond Myklebust
d346890bea NFS: Reduce stack footprint of nfs_proc_remove()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:26 -04:00
Trond Myklebust
136f2627c9 NFS: Reduce the stack footprint of nfs_link()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:25 -04:00
Trond Myklebust
011fff7239 NFS: Reduce stack footprint of nfs3_proc_rename() and nfs4_proc_rename()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:25 -04:00
Trond Myklebust
c407d41a16 NFSv4: Reduce stack footprint of nfs4_proc_access() and nfs3_proc_access()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-05-14 15:09:24 -04:00
Dan Carpenter
acf82b85a7 nfs: fix some issues in nfs41_proc_reclaim_complete()
The original code passed an ERR_PTR() to rpc_put_task() and instead of
returning zero on success it returned -ENOMEM.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-04-28 13:45:12 -04:00
Trond Myklebust
0df5dd4aae NFSv4: fix delegated locking
Arnaud Giersch reports that NFSv4 locking is broken when we hold a
delegation since commit 8e469ebd6d (NFSv4:
Don't allow posix locking against servers that don't support it).

According to Arnaud, the lock succeeds the first time he opens the file
(since we cannot do a delegated open) but then fails after we start using
delegated opens.

The following patch fixes it by ensuring that locking behaviour is
governed by a per-filesystem capability flag that is initially set, but
gets cleared if the server ever returns an OPEN without the
NFS4_OPEN_RESULT_LOCKTYPE_POSIX flag being set.

Reported-by: Arnaud Giersch <arnaud.giersch@iut-bm.univ-fcomte.fr>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
2010-04-12 07:55:15 -04:00
Al Viro
04287f975e Have nfs ->d_revalidate() report errors properly
If nfs atomic open implementation ends up doing open request from
->d_revalidate() codepath and gets an error from server, return that error
to caller explicitly and don't bother with lookup_instantiate_filp() at all.
->d_revalidate() can return an error itself just fine...

See
	http://bugzilla.kernel.org/show_bug.cgi?id=15674
	http://marc.info/?l=linux-kernel&m=126988782722711&w=2

for original report.

Reported-by: Daniel J Blueman <daniel.blueman@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-07 16:10:16 -07:00
Tejun Heo
5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
Dan Carpenter
7dd08a570d nfs: fix unlikely memory leak
I'll admit that it's unlikely for the first allocation to fail and
the second one to succeed.  I won't be offended if you ignore this
patch.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-03-08 14:10:00 -05:00
Trond Myklebust
3fa04ecd72 Merge branch 'writeback-for-2.6.34' into nfs-for-2.6.34 2010-03-05 15:46:18 -05:00
Al Viro
f694869709 a couple of mntget+dget -> path_get in nfs4proc
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2010-03-03 14:07:56 -05:00
Andy Adamson
180b62a3d8 nfs41 fix NFS4ERR_CLID_INUSE for exchange id
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-03-02 13:45:33 -05:00
Alexandros Batsakis
0851de0617 nfs4: renewd renew operations should take/put a client reference
renewd sends RENEW requests to the NFS server in order to renew state.
As the request is asynchronous, renewd should take a reference to the
nfs_client to prevent concurrent umounts from freeing the client

Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-03-02 13:00:03 -05:00
Alexandros Batsakis
7135840fc7 nfs41: renewd sequence operations should take/put client reference
renewd sends SEQUENCE requests to the NFS server in order to renew state.
As the request is asynchronous, renewd should take a reference to the
nfs_client to prevent concurrent umounts from freeing the session/client

Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-03-02 12:54:30 -05:00
Alexandros Batsakis
dc96aef96a nfs: prevent backlogging of renewd requests
If the renewd send queue gets backlogged (e.g., if the server goes down),
we will keep filling the queue with periodic RENEW/SEQUENCE requests.

This patch schedules a new renewd request if and only if the previous one
returns (either success or failure)

Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
[Trond.Myklebust@netapp.com: moved nfs4_schedule_state_renewal() into
separate nfs4_renew_release() and nfs41_sequence_release() callbacks
to ensure correct behaviour on call setup failure]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-03-02 12:44:07 -05:00
Andy Adamson
104aeba484 nfs41: resize slot table in reset
When session is reset, client can renegotiate slot table size.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-02-10 08:30:59 -05:00
Mike Sager
8e0d46e138 nfs41: Adjust max cache response size value
For the CREATE_SESSION attribute ca_maxresponsesize_cached, calculate
the value based on the rpc reply header size plus the maximum nfs compound
reply size.

Signed-off-by: Mike Sager <sager@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-02-10 08:30:54 -05:00
Jeff Layton
2c6434888c nfs4: handle -EKEYEXPIRED errors from RPC layer
If a KRB5 TGT ticket expires, we don't want to return an error
immediatel. If someone has a long running job and just forgets to run
"kinit" in time then this will make it fail.

Instead, we want to treat this situation as we would NFS4ERR_DELAY and
retry the upcall after delaying a bit with an exponential backoff.

This patch just makes any place that would handle NFS4ERR_DELAY also
handle -EKEYEXPIRED the same way. In the future, we may want to be more
sophisticated however and handle hard vs. soft mounts differently, or
specify some upper limit on how long we'll wait for a new TGT to be
acquired.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-02-10 08:30:50 -05:00
Trond Myklebust
a2c0b9e291 NFS: Ensure that we handle NFS4ERR_STALE_STATEID correctly
Even if the server is crazy, we should be able to mark the stateid as being
bad, to ensure it gets recovered.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
2010-01-26 15:42:47 -05:00
Trond Myklebust
03391693a9 NFSv4.1: Don't call nfs4_schedule_state_recovery() unnecessarily
Currently, nfs4_handle_exception() will call it twice if called with an
error of -NFS4ERR_STALE_CLIENTID, -NFS4ERR_STALE_STATEID or
-NFS4ERR_EXPIRED.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
2010-01-26 15:42:38 -05:00
Trond Myklebust
8e469ebd6d NFSv4: Don't allow posix locking against servers that don't support it
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
2010-01-26 15:42:30 -05:00
Trond Myklebust
2bee72a6aa NFSv4: Ensure that the NFSv4 locking can recover from stateid errors
In most cases, we just want to mark the lock_stateid sequence id as being
uninitialised.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
2010-01-26 15:42:21 -05:00
Trond Myklebust
72211dbe72 NFSv4: Release the sequence id before restarting a CLOSE rpc call
If the CLOSE or OPEN_DOWNGRADE call triggers a state recovery, and has
to be resent, then we must release the seqid. Otherwise the open
recovery will wait for the close to finish, which causes a deadlock.

This is mainly a NFSv4.1 problem, although it can theoretically happen
with NFSv4.0 too, in a OPEN_DOWNGRADE situation.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-15 14:47:36 -05:00
Andy Adamson
68bf05efb7 nfs41: fix session fore channel negotiation
If the rsize or wsize is not set on the mount command, negotiate the highest
supported rsize and wsize in session creation.

Fixes a bug where the client negotiated nfs41_maxwrite_overhead as
ca_maxrequestsize and nfs41_maxread_overhead as ca_maxresponsesize resulting
in NFS4ERR_REQ_TOO_BIG errors on writes.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-15 13:58:42 -05:00
Andy Adamson
a5523b84c4 nfs41: do not zero seqid portion of stateid on close
Remove code left over from a previous minorversion draft.
which specified zeroing seqid portions of stateid's.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-15 13:58:36 -05:00
Alexandros Batsakis
5601a00d67 nfs: run state manager in privileged mode
Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-15 13:58:23 -05:00
Alexandros Batsakis
b257957e50 nfs: make recovery state manager operations privileged
Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-15 13:58:07 -05:00
Alexandros Batsakis
689cf5c15b nfs: enforce FIFO ordering of operations trying to acquire slot
Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-15 13:55:18 -05:00
Alexandros Batsakis
40ead580ae nfs: remove rpc_task argument from nfs4_find_slot
Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-15 13:51:22 -05:00
Alexandros Batsakis
afe6c27ccb nfs: change nfs4_do_setlk params to identify recovery type
Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-15 13:50:32 -05:00
Alexandros Batsakis
0f7e720694 nfs: do not do a LOOKUP after open
Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-15 13:50:21 -05:00
Alexandros Batsakis
3bfb0fc591 nfs: minor cleanup of session draining
Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-15 13:50:01 -05:00
Trond Myklebust
88069f77e1 NFSv41: Fix a potential state leakage when restarting nfs4_close_prepare
Currently, if the call to nfs4_setup_sequence() in nfs4_close_prepare
fails, any later retries will fail to launch an RPC call, due to the fact
that the &state->flags will have been cleared.
Ditto if nfs4_close_done() triggers a call to the NFSv4.1 version of
nfs_restart_rpc().

We therefore move the actual clearing of the state->flags to
nfs4_close_done(), when we know that the RPC call was successful.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-08 08:33:16 -05:00
Ricardo Labiaga
74e7bb73a3 nfs41: Handle NFSv4.1 session errors in the delegation recall code
Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-07 09:48:30 -05:00
Ricardo Labiaga
7970886118 nfs41: Retry delegation return if it failed with session error
Update nfs4_delegreturn_done() to retry the operation after setting the
NFS4CLNT_SESSION_SETUP bit to indicate the need to reset the session.

Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-07 09:23:21 -05:00
Ricardo Labiaga
bcfa49f6f9 nfs41: Handle session errors during delegation return
Add session error handling to nfs4_open_delegation_recall()

Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-07 09:22:29 -05:00
Trond Myklebust
0110ee152b NFS: Fix up the declaration of nfs4_restart_rpc when NFSv4 not configured
Also rename it: it is used in generic code, and so should not have a 'nfs4'
prefix.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-07 09:00:24 -05:00
Ricardo Labiaga
9dfdf404c9 nfs41: Don't clear DRAINING flag on NFS4ERR_STALE_CLIENTID
If CREATE_SESSION fails with NFS4ERR_STALE_CLIENTID, don't clear the
NFS4CLNT_SESSION_DRAINING flag and don't wake RPCs waiting for the
session to be reestablished.  We don't have a session yet, so there
is no reason to wake other RPCs.

This avoids sending spurious compounds with bogus sequenceID during
session and state recovery.

Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
[Trond.Myklebust@netapp.com: cleaned up patch by adding the
                             nfs41_begin/end_drain_session() helpers]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-06 12:57:34 -05:00
Ricardo Labiaga
9430fb6b53 nfs41: nfs41_setup_state_renewal
Move call to get the lease time and the setup of the state
renewal out of nfs4_create_session so that it can be called
after clearing the DRAINING flag.  We use the getattr RPC
to obtain the lease time, which requires a sequence slot.

Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-06 12:23:46 -05:00
Trond Myklebust
bcb56164ce NFSv41: More cleanups
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-05 19:32:19 -05:00
Trond Myklebust
35dc1d74a8 NFSv41: Fix up some bugs in the NFS4CLNT_SESSION_DRAINING code
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-05 19:32:19 -05:00
Trond Myklebust
d61e612a72 NFSv41: Clean up slot table management
We no longer need to maintain a distinction between nfs41_sequence_done and
nfs41_sequence_free_slot.

This fixes a number of slot table leakages in the NFSv4.1 code.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-05 19:32:19 -05:00
Trond Myklebust
f26468fb93 NFSv41: Fix nfs4_proc_create_session
We should not assume that nfs41_init_clientid() will always want to
initialise the session. If it is being called due to a server reboot, then
we just want to reset the session after re-establishing the clientid.

Fix this by getting rid of the 'reset' parameter in
nfs4_proc_create_session(), and instead relying on whether or not the
session slot table pointer is non-NULL.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-05 19:32:11 -05:00
Ricardo Labiaga
fce5c838e1 nfs41: RECLAIM_COMPLETE functionality
Implements RECLAIM_COMPLETE as an asynchronous RPC.
NFS4ERR_DELAY is retried, NFS4ERR_DEADSESSION invokes the error handling
but does not result in a retry, since we don't want to have a lingering
RECLAIM_COMPLETE call sent in the middle of a possible new state recovery
cycle.  If a session reset occurs, a new wave of reclaim operations will
follow, containing their own RECLAIM_COMPLETE call.  We don't want a
retry to get on the way of recovery by incorrectly indicating to the
server that we're done reclaiming state.

A subsequent patch invokes the functionality.

Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-05 16:08:41 -05:00
Alexandros Batsakis
0629e370dd nfs41: check SEQUENCE status flag
the server can indicate a number of error conditions by setting the
appropriate bits in the SEQUENCE operation. The client re-establishes
state with the server when it receives one of those, with the action
depending on the specific case.

Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-05 13:46:14 -05:00
Alexandros Batsakis
2449ea2e19 nfs41: V2 adjust max_rqst_sz, max_resp_sz w.r.t to rsize, wsize
The v4.1 client should take into account the desired rsize, wsize when
negotiating the max size in CREATE_SESSION. Accordingly, it should use
rsize, wsize that are smaller than the session negotiated values.

Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-05 13:36:55 -05:00
Alexandros Batsakis
7b183d0d43 nfs41: remove server-only EXCHGID4_FLAG_CONFIRMED_R flag from exchange_id
Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-05 13:33:25 -05:00
Alexandros Batsakis
4882ef72cd nfs41: add support for the exclusive create flags
In v4.1 the client MUST/SHOULD use the EXCLUSIVE4_1 flag instead of
EXCLUSIVE4, and GUARDED when the server supports persistent sessions.
For now (and until we support suppattr_exclcreat), we don't send any
attributes with EXCLUSIVE4_1 relying in the subsequent SETATTR as in v4.0

Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-05 13:30:21 -05:00
Andy Adamson
0b9e2d41f1 nfs41: only state manager sets NFS4CLNT_SESSION_SETUP
Replace sync and async handlers setting of the NFS4CLNT_SESSION_SETUP bit with
setting NFS4CLNT_CHECK_LEASE, and let the state manager decide to reset the session.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-04 16:02:14 -05:00
Andy Adamson
691daf3b0c nfs41: drain session cleanup
Do not wake up the next slot_tbl_waitq task in nfs4_free_slot because we
may be draining the slot. Either signal the state manager that the session
is drained (the state manager wakes up tasks) OR wake up the next task.

In nfs41_sequence_done, the slot dereference is only needed in the sequence
operation success case.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-04 15:55:39 -05:00
Andy Adamson
ea028ac925 nfs41: nfs41: fix state manager deadlock in session reset
If the session is reset during state recovery, the state manager thread can
sleep on the slot_tbl_waitq causing a deadlock.

Add a completion framework to the session.  Have the state manager thread set
a new session state (NFS4CLNT_SESSION_DRAINING) and wait for the session slot
table to drain.

Signal the state manager thread in nfs41_sequence_free_slot when the
NFS4CLNT_SESSION_DRAINING bit is set and the session is drained.

Reported-by: Trond Myklebust <trond@netapp.com>
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-04 15:55:38 -05:00
Andy Adamson
05f0d23647 nfs41: remove nfs4_recover_session
nfs4_recover_session can put rpciod to sleep. Just use nfs4_schedule_recovery.

Reported-by: Trond Myklebust <trond.myklebust@netapp.com>
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-04 15:55:37 -05:00
Andy Adamson
2628eddff1 nfs41: don't clear tk_action on success
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-04 15:55:35 -05:00
Andy Adamson
b9179237e2 nfs41: fix switch in nfs4_handle_exception
Do not fall through and call nfs4_delay on session error handling.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-04 15:55:32 -05:00
Andy Adamson
36bbe34239 nfs41: free the slot on unhandled read errors
nfs4_read_done returns zero on unhandled errors. nfs_readpage_result will
return on a negative tk_status without freeing the slot.
Call nfs4_sequence_free_slot on unhandled errors in nfs4_read_done.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-04 15:55:30 -05:00
Andy Adamson
e608e79f1b nfs41: call free slot from nfs4_restart_rpc
nfs41_sequence_free_slot can be called multiple times on SEQUENCE operation
errors.
No reason to inline nfs4_restart_rpc

Reported-by: Trond Myklebust <trond.myklebust@netapp.com>

nfs_writeback_done and nfs_readpage_retry call nfs4_restart_rpc outside the
error handler, and the slot is not freed prior to restarting in the rpc_prepare
state during session reset.

Fix this by moving the call to nfs41_sequence_free_slot from the error
path of nfs41_sequence_done into nfs4_restart_rpc, and by removing the test
for NFS4CLNT_SESSION_SETUP.
Always free slot and goto the rpc prepare state on async errors.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-04 15:55:29 -05:00
Andy Adamson
1d9ddde94a nfs41: nfs4_get_lease_time will never session reset
Make this clear by calling rpc_restart-call.
Prepare for nfs4_restart_rpc() to free slots.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-04 15:55:27 -05:00
Andy Adamson
6df08189ff nfs41: rename cl_state session SETUP bit to RESET
The bit is no longer used for session setup, only for session reset.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-04 15:55:05 -05:00
Andy Adamson
4d643d1dfa nfs41: add create session into establish_clid
Reported-by: Trond Myklebust <trond.myklebust@netapp.com>

Resetting the clientid from the state manager could result in not confirming
the clientid due to create session not being called.

Move the create session call from the NFS4CLNT_SESSION_SETUP state manager
initialize session case into the NFS4CLNT_LEASE_EXPIRED case establish_clid
call.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-04 15:52:24 -05:00
Trond Myklebust
7285f2d2ff Merge branch 'devel' into linux-next 2009-12-03 21:27:36 -05:00
NeilBrown
44ed3556ba NFS4ERR_FILE_OPEN handling in Linux/NFS
NFS4ERR_FILE_OPEN is return by the server when an operation cannot be
performed because the file is currently open and local (to the server)
semantics prohibit the operation while the file is open.
A typical case is a RENAME operation on an MS-Windows platform, which
prevents rename while the file is open.

While it is possible that such a condition is transitory, it is also
very possible that the file will be held open for an extended period
of time thus preventing the operation.

The current behaviour of Linux/NFS is to retry the operation
indefinitely.  This is not appropriate - we do not expect a rename to
take an arbitrary amount of time to complete.

Rather, and error should be returned.  The most obvious error code
would be EBUSY, which is a legal at least for 'rename' and 'unlink',
and accurately captures the reason for the error.

This patch allows a few retries until about 2 seconds have elapsed,
then returns EBUSY.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-03 21:26:36 -05:00
Trond Myklebust
1185a552e3 NFSv4: Ensure nfs4_close_context() is declared as static
Fix another 'sparse' warning in fs/nfs/nfs4proc.c

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-03 15:54:02 -05:00
Trond Myklebust
a9ed2e2583 NFSv4: Handle NFS4ERR_GRACE when recovering an expired lease.
If our lease expires, and the server reboots while we're recovering, we
need to be able to wait until the grace period is over.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-12-03 15:53:21 -05:00
Trond Myklebust
96d25e5322 NFSv4: Fix a cache validation bug which causes getcwd() to return ENOENT
Changeset a65318bf3a (NFSv4: Simplify some
cache consistency post-op GETATTRs) incorrectly changed the getattr
bitmap for readdir().
This causes the readdir() function to fail to return a
fileid/inode number, which again exposed a bug in the NFS readdir code that
causes spurious ENOENT errors to appear in applications (see
http://bugzilla.kernel.org/show_bug.cgi?id=14541).

The immediate band aid is to revert the incorrect bitmap change, but more
long term, we should change the NFS readdir code to cope with the
fact that NFSv4 servers are not required to support fileids/inode numbers.

Reported-by: Daniel J Blueman <daniel.blueman@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-11-11 16:15:42 +09:00
Trond Myklebust
141aeb9f26 NFSv4: Fix two unbalanced put_rpccred() issues.
Commits 29fba38b (nfs41: lease renewal) and fc01cea9 (nfs41: sequence
operation) introduce a couple of put_rpccred() calls on credentials for
which there is no corresponding get_rpccred().

See http://bugzilla.kernel.org/show_bug.cgi?id=14249

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-10-26 08:09:46 -04:00
Trond Myklebust
52567b03ca NFSv4: Fix a bug when the server returns NFS4ERR_RESOURCE
RFC 3530 states that when we recieve the error NFS4ERR_RESOURCE, we are not
supposed to bump the sequence number on OPEN, LOCK, LOCKU, CLOSE, etc
operations. The problem is that we map that error into EREMOTEIO in the XDR
layer, and so the NFSv4 middle-layer routines like seqid_mutating_err(),
and nfs_increment_seqid() don't recognise it.

The fix is to defer the mapping until after the middle layers have
processed the error.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-10-23 14:46:42 -04:00
Alexey Dobriyan
2bcd57ab61 headers: utsname.h redux
* remove asm/atomic.h inclusion from linux/utsname.h --
   not needed after kref conversion
 * remove linux/utsname.h inclusion from files which do not need it

NOTE: it looks like fs/binfmt_elf.c do not need utsname.h, however
due to some personality stuff it _is_ needed -- cowardly leave ELF-related
headers and files alone.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-23 18:13:10 -07:00
Trond Myklebust
62ab460cf5 NFSv4: Add 'server capability' flags for NFSv4 recommended attributes
If the NFSv4 server doesn't support a POSIX attribute, the generic NFS code
needs to know that, so that it don't keep trying to poll for it.

However, by the same count, if the NFSv4 server does support that
attribute, then we should ensure that the inode metadata is appropriately
labelled as being untrusted. For instance, if we don't know the correct
value of the file's uid, we should certainly not be caching ACLs or ACCESS
results.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-09 15:06:19 -04:00
Trond Myklebust
a78cb57a10 NFSv4: Don't loop forever on state recovery failure...
If the server is broken, then retrying forever won't fix it. We
should just give up after a while, and return an error to the user.
We set the number of retries to 10 for now...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-08-09 15:06:19 -04:00
Trond Myklebust
d953126a28 NFSv4: Fix a problem whereby a buggy server can oops the kernel
We just had a case in which a buggy server occasionally returns the wrong
attributes during an OPEN call. While the client does catch this sort of
condition in nfs4_open_done(), and causes the nfs4_atomic_open() to return
-EISDIR, the logic in nfs_atomic_lookup() is broken, since it causes a
fallback to an ordinary lookup instead of just returning the error.

When the buggy server then returns a regular file for the fallback lookup,
the VFS allows the open, and bad things start to happen, since the open
file doesn't have any associated NFSv4 state.

The fix is firstly to return the EISDIR/ENOTDIR errors immediately, and
secondly to ensure that we are always careful when dereferencing the
nfs_open_context state pointer.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-07-21 19:22:38 -04:00
Trond Myklebust
fccba80455 NFSv4: Fix an NFSv4 mount regression
Commit 008f55d0e0 (nfs41: recover lease in
_nfs4_lookup_root) forces the state manager to always run on mount. This is
a bug in the case of NFSv4.0, which doesn't require us to send a
setclientid until we want to grab file state.

In any case, this is completely the wrong place to be doing state
management. Moving that code into nfs4_init_session...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-07-21 16:48:07 -04:00
Alexey Dobriyan
405f55712d headers: smp_lock.h redux
* Remove smp_lock.h from files which don't need it (including some headers!)
* Add smp_lock.h to files which do need it
* Make smp_lock.h include conditional in hardirq.h
  It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT

  This will make hardirq.h inclusion cheaper for every PREEMPT=n config
  (which includes allmodconfig/allyesconfig, BTW)

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-12 12:22:34 -07:00
Benny Halevy
578e458568 nfs41: Move initialization of nfs4_opendata seq_res to nfs4_init_opendata_res
nfs4_open_recover_helper clears opendata->o_res
before calling nfs4_init_opendata_res, thus causing
NFSv4.0 OPEN operations to be sent rather than nfsv4.1.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-20 14:55:12 -04:00
Trond Myklebust
1f84603c09 Merge branch 'devel-for-2.6.31' into for-2.6.31
Conflicts:
	fs/nfs/client.c
	fs/nfs/super.c
2009-06-18 18:13:44 -07:00
James Morris
4bf259e3ae nfs: remove unnecessary NFS_INO_INVALID_ACL checks
Unless I'm mistaken, NFS_INO_INVALID_ACL is being checked twice during
getacl calls (i.e. first via nfs_revalidate_inode() and then by each all
site).

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 18:02:14 -07:00
Ricardo Labiaga
f8625a6a4b nfs41: Backchannel: Add a backchannel slot table to the session
Defines a new 'struct nfs4_slot_table' in the 'struct nfs4_session'
for use by the backchannel.  Initializes, resets, and destroys the backchannel
slot table in the same manner the forechannel slot table is initialized,
reset, and destroyed.

The sequenceid for each slot in the backchannel slot table is initialized
to 0, whereas the forechannel slotid's sequenceid is set to 1.

Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2009-06-17 14:11:42 -07:00
Ricardo Labiaga
050047ce71 nfs41: Backchannel: Refactor nfs4_init_slot_table()
Generalize nfs4_init_slot_table() so it can be used to initialize the
backchannel slot table in addition to the forechannel slot table.

Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2009-06-17 14:11:42 -07:00
Ricardo Labiaga
b73dafa7ac nfs41: Backchannel: Refactor nfs4_reset_slot_table()
Generalize nfs4_reset_slot_table() so it can be used to reset the
backchannel slot table in addition to the forechannel slot table.

Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2009-06-17 14:11:41 -07:00
Andy Adamson
5a0ffe544c nfs41: Release backchannel resources associated with session
Frees the preallocated backchannel resources that are associated with
this session when the session is destroyed.

A backchannel is currently created once per session. Destroy the backchannel
only when the session is destroyed.

Signed-off-by: Ricardo Labiaga <ricardo.labiaga@netapp.com>
Signed-off-by: Andy Adamson<andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2009-06-17 14:11:34 -07:00
Andy Adamson
0f91421e8e nfs41: Client indicates presence of NFSv4.1 callback channel.
Set the SESSION4_BACK_CHAN flag to indicate the client supports a backchannel.

Signed-off-by: Ricardo Labiaga <ricardo.labiaga@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2009-06-17 14:11:33 -07:00
Trond Myklebust
965b5d6791 NFSv4: Handle more errors when recovering open file and locking state
It is possible for servers to return NFS4ERR_BAD_STATEID when
the state management code is recovering locks or is reclaiming state when
returning a delegation. Ensure that we handle that case.
While we're at it, add in handlers for NFS4ERR_STALE,
NFS4ERR_ADMIN_REVOKED, NFS4ERR_OPENMODE, NFS4ERR_DENIED and
NFS4ERR_STALE_STATEID, since the protocol appears to allow for them too.

Also handle ENOMEM...

Finally, rather than add new NFSv4.0-specific errors and error handling into
the generic delegation code, move that open file and locking state error
handling into the NFSv4 layer.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 13:22:59 -07:00
Trond Myklebust
d5122201a7 NFSv4: Move error handling out of the delegation generic code
The NFSv4 delegation recovery code is required by the protocol to handle
more errors. Rather than add NFSv4.0 specific errors into 'generic'
delegation code, we should move the error handling into the NFSv4 layer.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 13:22:58 -07:00
Benny Halevy
34dc1ad752 nfs41: increment_{open,lock}_seqid
Unlike minorversion0, in nfsv4.1 the open and lock seqids need
not be incremented by the client and should always be set to zero.

This is implemented using a new nfs_rpc_ops methods -
increment_open_seqid and increment_lock_seqid

Signed-off-by: Rahul Iyer <iyer@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: check for session not minorversion]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 12:43:45 -07:00
Benny Halevy
008f55d0e0 nfs41: recover lease in _nfs4_lookup_root
This creates the nfsv4.1 session on mount.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 12:25:13 -07:00
Andy Adamson
b4b82607ff nfs41: get_clid_cred for EXCHANGE_ID
Unlike SETCLIENTID, EXCHANGE_ID requires a machine credential. Do not search
for credentials other than the machine credential.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 12:25:13 -07:00
Andy Adamson
90a16617ee nfs41: add a get_clid_cred function to nfs4_state_recovery_ops
EXCHANGE_ID has different credential requirements than SETCLIENTID.
Prepare for a separate credential function.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 12:25:12 -07:00
Andy Adamson
591d71cbde nfs41: establish sessions-based clientid
nfsv4.1 clientid is established via EXCHANGE_ID rather than
SETCLIENTID{,_CONFIRM}

This is implemented using a new establish_clid method in
nfs4_state_recovery_ops.

nfs41: establish clientid via exchange id only if cred != NULL

>From 2.6.26 reclaimer() uses machine cred for setting up the client id
therefore it is never expected to be NULL.

Signed-off-by: Rahul Iyer <iyer@netapp.com>
[removed dprintk]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: lease renewal]
[revamped patch for new nfs4_state_manager design]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 12:25:11 -07:00
Andy Adamson
a7b721037f nfs41: introduce get_state_renewal_cred
Use the machine cred for sending SEQUENCE to renew
the client's lease.

[revamp patch for new state management design starting 2.6.29]
[nfs41: support minorversion 1 for nfs4_check_lease]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: get cred in exchange_id when cred arg is NULL]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: use cl_machined_cred instead of cl_ex_cred]
    Since EXCHANGE_ID insists on using the machine credential, cl_ex_cred is
    not needed. nfs4_proc_exchange_id() is only called if the machine credential
    is available. Remove the credential logic from nfs4_proc_exchange_id.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 12:25:11 -07:00
Benny Halevy
8e69514f29 nfs41: support minorversion 1 for nfs4_check_lease
[moved nfs4_get_renew_cred related changes to
 "nfs41: introduce get_state_renewal_cred"]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 12:25:10 -07:00
Benny Halevy
29fba38b79 nfs41: lease renewal
Send a NFSv4.1 SEQUENCE op rather than RENEW that was deprecated in
minorversion 1.
Use the nfs_client minorversion to select reboot_recover/
network_partition_recovery/state_renewal ops.

Note: we use reclaimer to create the nfs41 session before there are any
cl_superblocks for the nfs_client.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: check for session not minorversion]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[revamped patch for new nfs4_state_manager design]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: obliterate nfs4_state_recovery_ops.renew_lease method]
    moved to nfs4_state_maintenance_ops
[also undid per-minorversion nfs4_state_recovery_ops here]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 12:25:09 -07:00
Andy Adamson
b069d94af7 nfs41: schedule async session reset
Define a new session reset state which is set upon a sequence operation error
in both the sync and async error handlers.

Place all new requests and all but the last outstanding rpc on the
slot_tbl_waitq. Spawn the recovery thread when the last slot is free.
Call nfs4_proc_destroy_session, reinitialize the session, call
nfs4_proc_create_session, clear the session reset state, and wake up the next
task on the slot_tbl_waitq.

Return the nfs4_proc_destroy_session status to the session reclaimer and
check for NFS4ERR_BADSESSION and NFS4ERR_DEADSESSION. Other destroy session
errors should be handled in nfs4_proc_destroy_session where the call can
be retried with adjusted arguments.

Signed-off-by: Andy Adamson<andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
nfs41: make nfs4_wait_bit_killable public]
    nfs4_wait_bit_killable to be used by NFSv4.1 session recover logic.
Signed-off-by: Rahul Iyer <iyer@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: have create_session work on nfs_client]
Signed-off-by: Andy Adamson <andros@netapp.com>
[nfs41: trigger the state manager for session reset]
    Replace the session reset state with the NFS4CLNT_SESSION_SETUP cl_state.
    Place all rpc tasks to sleep on the slot table waitqueue until the slot
    table is drained, then schedule state recovery and wait for it to complete.
Signed-off-by: Andy Adamson <andros@netapp.com>
[nfs41: remove nfs41_session_recovery [ch]
Replaced by using the nfs4_state_manager.
Signed-off-by: Andy Adamson <andros@netapp.com>
[nfs41: nfs4_wait_bit_killable only used locally]
[nfs41: keep nfs4_wait_bit_killable static]
[nfs41: keep const nfs_server in nfs4_handle_exception]
[nfs41: remove session parameter from nfs4_find_slot]
Signed-off-by: Andy Adamson <andros@netapp.com
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: resset the session from nfs41_setup_sequence]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 12:25:09 -07:00
Andy Adamson
4745e3154b nfs41: kick start nfs41 session recovery when handling errors
Remove checking for any errors that the SEQUENCE operation does not return.
-NFS4ERR_STALE_CLIENTID, NFS4ERR_EXPIRED, NFS4ERR_CB_PATH_DOWN, NFS4ERR_BACK_CHAN_BUSY, NFS4ERR_OP_NOT_IN_SESSION.

SEQUENCE operation error recovery is very primative, we only reset the session.

Remove checking for any errors that are returned by the SEQUENCE operation, but
that resetting the session won't address.
NFS4ERR_RETRY_UNCACHED_REP, NFS4ERR_SEQUENCE_POS,NFS4ERR_TOO_MANY_OPS.

Add error checking for missing SEQUENCE errors that a session reset will
address.
NFS4ERR_BAD_HIGH_SLOT, NFS4ERR_DEADSESSION, NFS4ERR_SEQ_FALSE_RETRY.

A reset of the session is currently our only response to a SEQUENCE operation
error. Don't reset the session on errors where a new session won't help.

Don't reset the session on errors where a new session won't help.

[nfs41: nfs4_async_handle_error update error checking]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: trigger the state manager for session reset]
    Replace session state bit with nfs_client state bit.  Set the
    NFS4CLNT_SESSION_SETUP bit upon a session related error in the sync/async
    error handlers.
[nfs41: _nfs4_async_handle_error fix session reset error list]
Sequence operation errors that session reset could help.
NFS4ERR_BADSESSION
NFS4ERR_BADSLOT
NFS4ERR_BAD_HIGH_SLOT
NFS4ERR_DEADSESSION
NFS4ERR_CONN_NOT_BOUND_TO_SESSION
NFS4ERR_SEQ_FALSE_RETRY
NFS4ERR_SEQ_MISORDERED

Sequence operation errors that a session reset would not help

NFS4ERR_BADXDR
NFS4ERR_DELAY
NFS4ERR_REP_TOO_BIG
NFS4ERR_REP_TOO_BIG_TO_CACHE
NFS4ERR_REQ_TOO_BIG
NFS4ERR_RETRY_UNCACHED_REP
NFS4ERR_SEQUENCE_POS
NFS4ERR_TOO_MANY_OPS

Signed-off-by: Andy Adamson <andros@netapp.com>
[nfs41 nfs4_handle_exception fix session reset error list]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[moved nfs41_sequece_call_done code to nfs41: sequence operation]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 12:25:08 -07:00
Andy Adamson
eedc020e71 nfs41: use rpc prepare call state for session reset
[nfs41: change nfs4_restart_rpc argument]
[nfs41: check for session not minorversion]
[nfs41: trigger the state manager for session reset]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[always define nfs4_restart_rpc]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 12:25:07 -07:00
Andy Adamson
76db6d9500 nfs41: add session setup to the state manager
At mount, nfs_alloc_client sets the cl_state NFS4CLNT_LEASE_EXPIRED bit
and nfs4_alloc_session sets the NFS4CLNT_SESSION_SETUP bit, so both bits are
set when nfs4_lookup_root calls nfs4_recover_expired_lease which schedules
the nfs4_state_manager and waits for it to complete.

Place the session setup after the clientid establishment in nfs4_state_manager
so that the session is setup right after the clientid has been established
without rescheduling the state manager.

Unlike nfsv4.0, the nfs_client struct is not ready to use until the session
has been established.  Postpone marking the nfs_client struct to NFS_CS_READY
until after a successful CREATE_SESSION call so that other threads cannot use
the client until the session is established.

If the EXCHANGE_ID call fails and the session has not been setup (the
NFS4CLNT_SESSION_SETUP bit is set), mark the client with the error and return.

If the session setup CREATE_SESSION call fails with NFS4ERR_STALE_CLIENTID
which could occur due to server reboot or network partition inbetween the
EXCHANGE_ID and CREATE_SESSION call, reset the NFS4CLNT_LEASE_EXPIRED and
NFS4CLNT_SESSION_SETUP bits and try again.

If the CREATE_SESSION call fails with other errors, mark the client with
the error and return.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>

[nfs41: NFS_CS_SESSION_SETUP cl_cons_state for back channel setup]
  On session setup, the CREATE_SESSION reply races with the server back channel
  probe which needs to succeed to setup the back channel. Set a new
  cl_cons_state NFS_CS_SESSION_SETUP just prior to the CREATE_SESSION call
  and add it as a valid state to nfs_find_client so that the client back channel
  can find the nfs_client struct and won't drop the server backchannel probe.
  Use a new cl_cons_state so that NFSv4.0 back channel behaviour which only
  sets NFS_CS_READY is unchanged.
  Adjust waiting on the nfs_client_active_wq accordingly.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>

[nfs41: rename NFS_CS_SESSION_SETUP to NFS_CS_SESSION_INITING]
Signed-off-by: Andy Adamson <andros@netapp.com>
[nfs41: set NFS_CL_SESSION_INITING in alloc_session]
Signed-off-by: Andy Adamson <andros@netapp.com>
[nfs41: move session setup into a function]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[moved nfs4_proc_create_session declaration here]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 12:25:06 -07:00
Andy Adamson
ac72b7b3b3 nfs41: reset the session slot table
Separated from nfs41: schedule async session reset

Do not kfree the session slot table upon session reset, just re-initialize it.
Add a boolean to nfs4_proc_create_session to inidicate if this is a
session reset or a session initialization.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 12:25:05 -07:00
Andy Adamson
fc01cea963 nfs41: sequence operation
Implement the sequence operation conforming to
http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-26

Check returned sessionid, slotid and slot sequenceid in decode_sequence.

If the server returns different values for sessionID, slotID or slot sequence
number than what was sent, the server is looney tunes.

Pass the sequence operation status to nfs41_sequence_done in order to
determine when to increment the slot sequence ID.

Free slot is separated from sequence done.

Signed-off-by: Rahul Iyer <iyer@netapp.com>
Signed-off-by: Ricardo Labiaga <ricardo.labiaga@netapp.com>
Signed-off-by: Andy Adamson<andros@umich.edu>
[nfs41: sequence res use slotid]
Signed-off-by: Andy Adamson <andros@netapp.com>
[nfs41: deref slot table in decode_sequence only for minorversion!=0]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: nfs4_call_sync]
[nfs41: remove SEQ4_STATUS_USE_TK_STATUS]
[nfs41: return ESERVERFAULT in decode_sequence]
[no sr_session, no sr_flags]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: use nfs4_call_sync_sequence to renew session lease]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: remove nfs4_call_sync_sequence forward definition]
Signed-off-by: Andy Adamson <andros@netapp.com>
[nfs41: use struct nfs_client for nfs41_proc_async_sequence]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: pass *session in seq_args and seq_res]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41 nfs41_sequence_call_done update error checking]
[nfs41 nfs41_sequence_done update error checking]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: remove switch on error from nfs41_sequence_call_done]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 12:25:04 -07:00
Andy Adamson
8328d59f38 nfs41: enable nfs_client only nfs4_async_handle_error
The session is per struct nfs_client, not per nfs_server. Allow the handler
to be called with no nfs_server which simplifies the nfs4_proc_async_sequence session renewal call and will let it be used by pnfs file layout data servers.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 12:25:04 -07:00
Andy Adamson
0f3e66c6a6 nfs41: destroy_session operation
Implement the destroy_session operation conforming to
http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-26

Signed-off-by: Ricardo Labiaga <ricardo.labiaga@netapp.com>
Signed-off-by: Andy Adamson<andros@umich.edu>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: remove extraneous rpc_clnt pointer]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41; NFS_CS_READY required for DESTROY_SESSION]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: pass *session in seq_args and seq_res]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
[nfs41: fix encode_destroy_session's xdr Xcoding pointer type]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2009-06-17 12:24:55 -07:00
Andy Adamson
8d35301d7d nfs41: verify session channel attribues
Invalidate the session if the server returns invalid fore or back channel
attributes.

Use a KERN_WARNING to report the fatal session estabishment error.

Signed-off-by: Andy Adamson <andros@netapp.com>
[refactor nfs4_verify_channel_attrs]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 12:24:52 -07:00
Andy Adamson
fc931582c2 nfs41: create_session operation
Implement the create_session operation conforming to
http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-26

Set the real fore channel max operations to preserve server resources.
Note: If the server returns < NFS4_MAX_OPS, the client will very soon
get an NFS4ERR_TOO_MANY_OPS. A later patch will handle this.

Set the max_rqst_sz and max_resp_sz to PAGE_SIZE - we preallocate the buffers.

Set the back channel max_resp_sz_cached to zero to force the client to
always set csa_cachethis to FALSE because the current implementation
of the back channel DRC only supports caching the CB_SEQUENCE operation.

The client back channel server supports one slot, and desires 2 operations
per compound.

Signed-off-by: Ricardo Labiaga <ricardo.labiaga@netapp.com>
Signed-off-by: Andy Adamson<andros@umich.edu>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: remove extraneous rpc_clnt pointer]
Use the struct nfs_client cl_rpcclient.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: nfs4_init_channel_attrs, just use nfs41_create_session_args]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: use rsize and wsize for session channel attributes]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: set channel max operations]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: set back channel attributes]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: obliterate nfs4_adjust_channel_attrs]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: have create_session work on nfs_client]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: move CONFIG_NFS_V4_1 endif]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: pass *session in seq_args and seq_res]
[moved nfs4_init_slot_table definition here]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: use kcalloc to allocate slot table]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
[nfs41: fix Xcode_create_session's xdr Xcoding pointer type]
[nfs41: refactor decoding of channel attributes]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2009-06-17 12:24:34 -07:00
Andy Adamson
2050f0cc07 nfs41: get_lease_time
get_lease_time uses the FSINFO rpc operation to
get the lease time attribute.

nfs4_get_lease_time() is only called from the state manager on session setup
so don't recover from clientid or sequence level errors.

We do need to recover from NFS4ERR_DELAY or NFS4ERR_GRACE.
Use NFS4_POLL_RETRY_MIN - the Linux server returns NFS4ERR_DELAY when an
upcall is needed to resolve an uncached export referenced by a file handle.

[nfs41: sequence res use slotid]
Signed-off-by: Andy Adamson<andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: remove extraneous rpc_clnt pointer]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: have get_lease_time work on nfs_client]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: get_lease_time recover from NFS4ERR_DELAY]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: pass *session in seq_args and seq_res]
[define nfs4_get_lease_time_{args,res}]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 12:24:32 -07:00
Benny Halevy
99fe60d062 nfs41: exchange_id operation
Implement the exchange_id operation conforming to
http://tools.ietf.org/html/draft-ietf-nfsv4-minorversion1-26

Unlike NFSv4.0, NFSv4.1 requires machine credentials. RPC_AUTH_GSS machine
credentials will be passed into the kernel at mount time to be available for
the exchange_id operation.

RPC_AUTH_UNIX root mounts can use the UNIX root credential. Store the root
credential in the nfs_client struct.

Without a credential, NFSv4.1 state renewal fails.

[nfs41: establish clientid via exchange id only if cred != NULL]
Signed-off-by: Andy Adamson<andros@umich.edu>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfsd41: move nfstime4 from under CONFIG_NFS_V4_1]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: do not wait a lease time in exchange id]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: pass *session in seq_args and seq_res]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
[nfs41: Ignoring impid in decode_exchange_id is missing a READ_BUF]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: fix Xcode_exchange_id's xdr Xcoding pointer type]
[nfs41: get rid of unused struct nfs41_exchange_id_res members]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2009-06-17 12:23:57 -07:00
Andy Adamson
938e101091 nfs41 delegreturn sequence setup done support
Separate delegreturn calls from nfs41: sequence setup/done support

Implement the delegreturn rpc_call_prepare method for
asynchronuos nfs rpcs, call nfs41_setup_sequence from
respective rpc_call_validate_args methods.

Call nfs4_sequence_done from respective rpc_call_done methods.

Note that we need to pass a pointer to the nfs_server in calls data
for passing on to nfs4_sequence_done.

Signed-off-by: Andy Adamson<andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[pnfs: client data server write validate and release]
Signed-off-by: Andy Adamson<andros@umich.edu>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 10:46:51 -07:00
Andy Adamson
21d9a851aa nfs41 commit sequence setup done support
Separate commit calls from nfs41: sequence setup/done support

Implement the commit rpc_call_prepare method for
asynchronuos nfs rpcs, call nfs41_setup_sequence from
respective rpc_call_validate_args methods.

Call nfs4_sequence_done from respective rpc_call_done methods.

Note that we need to pass a pointer to the nfs_server in calls data
for passing on to nfs4_sequence_done.

Signed-off-by: Andy Adamson<andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[pnfs: client data server write validate and release]
Signed-off-by: Andy Adamson<andros@umich.edu>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: Support sessions with O_DIRECT.]
Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: separate free slot from sequence done]
[nfs41: nfs4_sequence_free_slot use nfs_client for data server]
Signed-off-by: Andy Adamson<andros@umich.edu>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 10:46:50 -07:00
Andy Adamson
def6ed7ef4 nfs41 write sequence setup done support
Separate write calls from nfs41: sequence setup/done support

Implement the write rpc_call_prepare method for
asynchronuos nfs rpcs, call nfs41_setup_sequence from
respective rpc_call_validate_args methods.

Call nfs4_sequence_done from respective rpc_call_done methods.

Note that we need to pass a pointer to the nfs_server in calls data
for passing on to nfs4_sequence_done.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[pnfs: client data server write validate and release]
Signed-off-by: Andy Adamson <andros@umich.edu>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[move the nfs4_sequence_free_slot call in nfs_readpage_retry from]
[nfs41: separate free slot from sequence done
Signed-off-by: Andy Adamson <andros@umich.edu>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: Support sessions with O_DIRECT.]
Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: nfs4_sequence_free_slot use nfs_client for data server]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 10:46:49 -07:00
Andy Adamson
f11c88af26 nfs41: read sequence setup/done support
Implement the read rpc_call_prepare method for
asynchronuos nfs rpcs, call nfs41_setup_sequence from
respective rpc_call_validate_args methods.

Call nfs4_sequence_done from respective rpc_call_done methods.

Note that we need to pass a pointer to the nfs_server in calls data
for passing on to nfs4_sequence_done.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[pnfs: client data server write validate and release]
Signed-off-by: Andy Adamson <andros@umich.edu>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[move the nfs4_sequence_free_slot call in nfs_readpage_retry from]
[nfs41: separate free slot from sequence done]
[remove nfs_readargs.nfs_server, use calldata->inode instead]
Signed-off-by: Andy Adamson <andros@umich.edu>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: Support sessions with O_DIRECT]
Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: nfs4_sequence_free_slot use nfs_client for data server]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 10:46:48 -07:00
Andy Adamson
472cfbd9b9 nfs41: unlink sequence setup/done support
Implement the rpc_call_prepare methods for
asynchronuos nfs rpcs, call nfs41_setup_sequence from
respective rpc_call_validate_args methods.

Call nfs4_sequence_done from respective rpc_call_done methods.

Note that we need to pass a pointer to the nfs_server in calls data
for passing on to nfs4_sequence_done.

Signed-off-by: Andy Adamson<andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[pnfs: client data server write validate and release]
Signed-off-by: Andy Adamson<andros@umich.edu>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: separate free slot from sequence done]
[nfs41: sequence res use slotid]
[nfs41: remove SEQ4_STATUS_USE_TK_STATUS]
[nfs41: nfs4_sequence_free_slot use nfs_client for data server]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 10:46:47 -07:00
Andy Adamson
a893693c15 nfs41: locku sequence setup/done support
Separate nfs4_locku calls from nfs41: sequence setup/done support
Call nfs4_sequence_done from respective rpc_call_done methods.

Note that we need to pass a pointer to the nfs_server in calls data
for passing on to nfs4_sequence_done.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[pnfs: client data server write validate and release]
Signed-off-by: Andy Adamson <andros@umich.edu>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: nfs4_sequence_free_slot use nfs_client for data server]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 10:46:46 -07:00
Andy Adamson
66179efee3 nfs41: lock sequence setup/done support
Separate nfs4_lock calls from nfs41: sequence setup/done support
Call nfs4_sequence_done from respective rpc_call_done methods.

Note that we need to pass a pointer to the nfs_server in calls data
for passing on to nfs4_sequence_done.

Signed-off-by: Andy Adamson<andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[pnfs: client data server write validate and release]
[use nfs4_sequence_done_free_slot]
Signed-off-by: Andy Adamson<andros@umich.edu>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 10:46:45 -07:00
Andy Adamson
d898528cdb nfs41: open sequence setup/done support
Separate nfs4_open calls from nfs41: sequence setup/done support
Call nfs4_sequence_done from respective rpc_call_done methods.

Note that we need to pass a pointer to the nfs_server in calls data
for passing on to nfs4_sequence_done.

Signed-off-by: Andy Adamson<andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[pnfs: client data server write validate and release]
[use nfs4_sequence_done_free_slot]
Signed-off-by: Andy Adamson<andros@umich.edu>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 10:46:44 -07:00
Andy Adamson
19ddab06ed nfs41: close sequence setup/done support
Separate nfs4_close calls from nfs41: sequence setup/done support
Call nfs4_sequence_done from respective rpc_call_done methods.

Note that we need to pass a pointer to the nfs_server in calls data
for passing on to nfs4_sequence_done.

Signed-off-by: Andy Adamson<andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[pnfs: client data server write validate and release]
Signed-off-by: Andy Adamson<andros@umich.edu>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: separate free slot from sequence done]
[nfs41: sequence res use slotid]
[nfs41: remove SEQ4_STATUS_USE_TK_STATUS]
[nfs41: nfs4_sequence_free_slot use nfs_client for data server]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 10:46:43 -07:00
Andy Adamson
69ab40c4c3 nfs41: nfs41_call_sync_done
Implement nfs4.1 synchronous rpc_call_done method
that essentially just calls nfs4_sequence_done, that turns
around and calls nfs41_sequence_done for minorversion1 rpcs.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: check for session not minorversion]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[move adding nfs4_sequence_free_slot from nfs41-separate-free-slot-from-sequence-done]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: nfs41_call_sync_data use nfs_client not nfs_server]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 10:46:42 -07:00
Andy Adamson
b0df806c0f nfs41: nfs41_sequence_done
Handle session level errors, update slot sequence id and
sessions bookeeping, free slot.

[nfs41: sequence res use slotid]
Signed-off-by: Andy Adamson<andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: remove SEQ4_STATUS_USE_TK_STATUS]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: check for session not minorversion]
Signed-off-by: Andy Adamson <andros@netapp.com>
[nfs41: bail out early out of nfs41_sequence_done if !res->sr_session]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[move nfs4_sequence_done from nfs41: nfs41_call_sync_done]
Signed-off-by: Andy Adamson <andros@netapp.com>
[move nfs4_sequence_free_slot from nfs41: separate free slot from sequence done]
    Don't free the slot until after all rpc_restart_calls have completed.
    Session reset will require more work.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[moved reset sr_slotid to nfs41_sequence_free_slot]
[free slot also on unexpectecd error]
[remove seq_res.sr_session member, use nfs_client's instead]
[ditch seq_res.sr_flags until used]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[look at sr_slotid for bailing out early from nfs41_sequence_done]
[nfs41: rpc_wake_up_next if sessions slot was not consumed.]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: nfs4_sequence_free_slot use nfs_client for data server]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: remove unused error checking in nfs41_sequence_done]
Signed-off-by: Andy Adamson <andros@netapp.com>
[nfs41: remove nfs4_has_session check in nfs41_sequence_done]
Signed-off-by: Andy Adamson <andros@netapp.com>
[nfs41: remove nfs_client pointer check]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 10:46:41 -07:00
Andy Adamson
13615871cd nfs41: nfs41_sequence_free_slot
[from nfs41: separate free slot from sequence done]

Don't free the slot until after all rpc_restart_calls have completed.
Session reset will require more work.

As noted by Trond, since we're using rpc_wake_up_next rather than
rpc_wake_up() we must always wake up the next task in the queue
either by going through nfs4_free_slot, or just calling
rpc_wake_up_next if no slot is to be freed.

[nfs41: sequence res use slotid]
[nfs41: remove SEQ4_STATUS_USE_TK_STATUS]
[got rid of nfs4_sequence_res.sr_session, use nfs_client.cl_session instead]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: rpc_wake_up_next if sessions slot was not consumed.]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: nfs4_sequence_free_slot use nfs_client for data server]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 10:46:41 -07:00
Andy Adamson
e2c4ab3ce2 nfs41: free slot
Free a slot in the slot table.

Mark the slot as free in the bitmap-based allocation table
by clearing a bit corresponding to the slotid.

Update lowest_free_slotid if freed slotid is lower than that.
Update highest_used_slotid.  In the case the freed slotid
equals the highest_used_slotid, scan downwards for the next
highest used slotid using the optimized fls* functions.

Finally, wake up thread waiting on slot_tbl_waitq for a free slot
to become available.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: free slot use slotid]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: use find_first_zero_bit for nfs4_find_slot]
    While at it, obliterate lowest_free_slotid and fix-up related comments.
    As per review comment 21/85.
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: use __clear_bit for nfs4_free_slot]
    While at it, fix-up function comment.
    Part of review comment 22/85.
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: use find_last_bit in nfs4_free_slot to determine highest used slot.]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: rpc_sleep_on slot_tbl_waitq must be called under slot_tbl_lock]
    Otherwise there's a race (we've hit) with nfs4_free_slot where
    nfs41_setup_sequence sees a full slot table, unlocks slot_tbl_lock,
    nfs4_free_slots happen concurrently and call rpc_wake_up_next
    where there's nobody to wake up yet, context goes back to
    nfs41_setup_sequence which goes to sleep when the slot table
    is actually empty now and there's no-one to wake it up anymore.
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 10:46:39 -07:00
Andy Adamson
fbcd4abcb3 nfs41: setup_sequence method
Allocate a slot in the session slot table and set the sequence op arguments.

Called at the rpc prepare stage.

Add a status to nfs41_sequence_res, initialize it to one so that we catch
rpc level failures which do not go through decode_sequence which sets
the new status field.

Note that upon an rpc level failure, we don't know if the server processed the
sequence operation or not. Proceed as if the server did process the sequence
operation.

Signed-off-by: Rahul Iyer <iyer@netapp.com>
[nfs41: sequence args use slotid]
[nfs41: find slot return slotid]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: remove SEQ4_STATUS_USE_TK_STATUS]
As per 11-14-08 review
[move extern declaration from nfs41: sequence setup/done support]
[removed sa_session definition, changed sa_cache_this into a u8 to reduce footprint]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: rpc_sleep_on slot_tbl_waitq must be called under slot_tbl_lock]
    Otherwise there's a race (we've hit) with nfs4_free_slot where
    nfs41_setup_sequence sees a full slot table, unlocks slot_tbl_lock,
    nfs4_free_slots happen concurrently and call rpc_wake_up_next
    where there's nobody to wake up yet, context goes back to
    nfs41_setup_sequence which goes to sleep when the slot table
    is actually empty now and there's no-one to wake it up anymore.
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 10:46:39 -07:00
Benny Halevy
510b81756f nfs41: find slot
Find a free slot using bitmap-based allocation.
Use the optimized ffz function to find a zero bit
in the bitmap that indicates a free slot, starting
the search from the 'lowest_free_slotid' position.

If found, mark the slot as used in the bitmap, get
the slot's slotid and seqid, and update max_slotid
to be used by the SEQUENCE operation.

Also, update lowest_free_slotid for next search.

If no free slot was found the caller has to wait
for a free slot (outside the scope of this function)

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: find slot return slotid]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[use find_first_zero_bit for nfs4_find_slot as per review comment 21/85.]
[use NFS4_MAX_SLOT_TABLE rather than NFS4_NO_SLOT]
[nfs41: rpc_sleep_on slot_tbl_waitq must be called under slot_tbl_lock]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 10:46:37 -07:00
Andy Adamson
ce5039c1be nfs41: nfs4_setup_sequence
Perform the nfs4_setup_sequence in the rpc_call_prepare state.

If a session slot is not available, we will rpc_sleep_on the
slot wait queue leaving the tk_action as rpc_call_prepare.

Once we have a session slot, hang on to it even through rpc_restart_calls.
Ensure the nfs41_sequence_res sr_slot pointer is NULL before rpc_run_task is
called as nfs41_setup_sequence will only find a new slot if it is NULL.

A future patch will call free slot after any rpc_restart_calls, and handle the
rpc restart that result from a sequence operation error.

Signed-off-by: Rahul Iyer <iyer@netapp.com>
[nfs41: sequence res use slotid]
Signed-off-by: Andy Adamson<andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: simplify nfs4_call_sync]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: nfs4_call_sync]
[nfs41: check for session not minorversion]
[nfs41: remove rpc_message from nfs41_call_sync_args]
[moved NFS4_MAX_SLOT_TABLE logic into nfs41_setup_sequence]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: nfs41_call_sync_data use nfs_client not nfs_server]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: expose nfs4_call_sync_session for lease renewal]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: remove unnecessary return check]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 10:46:36 -07:00
Andy Adamson
5f7dbd5c75 nfs41: set up seq_res.sr_slotid
Initialize nfs4_sequence_res sr_slotid to NFS4_MAX_SLOT_TABLE.

[was nfs41: sequence res use slotid]
Signed-off-by: Andy Adamson <andros@netapp.com>
[pulled definition of struct nfs4_sequence_res.sr_slotid to here]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 10:46:30 -07:00
Benny Halevy
f3752975ca nfs41: nfs41: pass *session in seq_args and seq_res
To be used for getting the rpc's minorversion and for nfs41 xdr
{en,de}coding of the sequence operation.
Reset the seq session ptrs for minorversion=0 rpc calls.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 10:46:29 -07:00
Andy Adamson
cccef3b96a nfs41: introduce nfs4_call_sync
Use nfs4_call_sync rather than rpc_call_sync to provide
for a nfs41 sessions-enabled interface for sessions manipulation.

The nfs41 rpc logic uses the rpc_call_prepare method to
recover and create the session, as well as selecting a free slot id
and the rpc_call_done to free the slot and update slot table
related metadata.

In the coming patches we'll add rpc prepare and done routines
for setting up the sequence op and processing the sequence result.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: nfs4_call_sync]
As per 11-14-08 review.
Squash into "nfs41: introduce nfs4_call_sync" and "nfs41: nfs4_setup_sequence"
Define two functions one for v4 and one for v41
add a pointer to struct nfs4_client to the correct one.
Signed-off-by: Andy Adamson <andros@netapp.com>
[added BUG() in _nfs4_call_sync_session if !CONFIG_NFS_V4_1]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: check for session not minorversion]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[group minorversion specific stuff together]
Signed-off-by: Alexandros Batsakis <Alexandros.Batsakis@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Andy Adamson <andros@netapp.com>
[nfs41: fixup nfs4_clear_client_minor_version]
[introduce nfs4_init_client_minor_version() in this patch]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[cleaned-up patch: got rid of nfs_call_sync_t, dprintks, cosmetics, extra server defs]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 10:46:28 -07:00
Benny Halevy
22958463d5 nfs41: use nfs4_fs_locations_res
In preparation for nfs41 sequence processing.

Signed-off-by: Andy Admason <andros@netapp.com>
[find nfs4_fs_locations_res]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 10:46:27 -07:00
Benny Halevy
73c403a9a9 nfs41: use nfs4_setaclres
In preparation for nfs41 sequence processing.

Signed-off-by: Andy Admason <andros@netapp.com>
[define nfs_setaclres]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 10:46:26 -07:00
Benny Halevy
663c79b3cd nfs41: use nfs4_getaclres
In preparation for nfs41 sequence processing.

Signed-off-by: Andy Admason <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: embed resp_len in nfs_getaclres]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 10:46:25 -07:00
Benny Halevy
d45b2989a7 nfs41: use nfs4_pathconf_res
In preparation for nfs41 sequence processing.

Signed-off-by: Andy Admason <andros@netapp.com>
[define nfs4_pathconf_res]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 10:46:24 -07:00
Benny Halevy
3dda5e4347 nfs41: use nfs4_fsinfo_res
In preparation for nfs41 sequence processing.

Signed-off-by: Andy Admason <andros@netapp.com>
[define nfs4_fsinfo_res]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 10:46:23 -07:00
Benny Halevy
24ad148a0f nfs41: use nfs4_statfs_res
In preparation for nfs41 sequence processing.

Signed-off-by: Andy Admason <andros@netapp.com>
[define nfs4_statfs_res]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 10:46:22 -07:00
Benny Halevy
f50c700081 nfs41: use nfs4_readlink_res
In preparation for nfs41 sequence processing.

Signed-off-by: Andy Admason <andros@netapp.com>
[define nfs4_readlink_res]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 10:46:21 -07:00
Benny Halevy
43652ad553 nfs41: use nfs4_server_caps_arg
In preparation for nfs41 sequence processing.

Signed-off-by: Andy Admason <andros@netapp.com>
[define nfs4_server_caps_arg]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 10:46:20 -07:00
Andy Adamson
557134a39c nfs41: sessions client infrastructure
NFSv4.1 Sessions basic data types, initialization, and destruction.

The session is always associated with a struct nfs_client that holds
the exchange_id results.

Signed-off-by: Rahul Iyer <iyer@netapp.com>
Signed-off-by: Andy Adamson<andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[remove extraneous rpc_clnt pointer, use the struct nfs_client cl_rpcclient.
remove the rpc_clnt parameter from nfs4 nfs4_init_session]
Signed-off-by: Andy Adamson<andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[Use the presence of a session to determine behaviour instead of the
minorversion number.]
Signed-off-by: Andy Adamson <andros@netapp.com>
[constified nfs4_has_session's struct nfs_client parameter]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[Rename nfs4_put_session() to nfs4_destroy_session() and call it from nfs4_free_client() not nfs4_free_server().
Also get rid of nfs4_get_session() and the ref_count in nfs4_session struct as keeping track of nfs_client should be sufficient]
Signed-off-by: Alexandros Batsakis <Alexandros.Batsakis@netapp.com>
[nfs41: pass rsize and wsize into nfs4_init_session]
Signed-off-by: Andy Adamson <andros@netapp.com>
[separated out removal of rpc_clnt parameter from nfs4_init_session ot a
 patch of its own]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[Pass the nfs_client pointer into nfs4_alloc_session]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: don't assign to session->clp->cl_session in nfs4_destroy_session]
[nfs41: fixup nfs4_clear_client_minor_version]
[introduce nfs4_clear_client_minor_version() in this patch]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[Refactor nfs4_init_session]
    Moved session allocation into nfs4_init_client_minor_version, called from
    nfs4_init_client.
    Leave rwise and wsize initialization in nfs4_init_session, called from
    nfs4_init_server.
    Reverted moving of nfs_fsid definition to nfs_fs_sb.h
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: Move NFS4_MAX_SLOT_TABLE define from under CONFIG_NFS_V4_1]
[Fix comile error when CONFIG_NFS_V4_1 is not set.]
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[moved nfs4_init_slot_table definition to "create_session operation"]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: alloc session with GFP_KERNEL]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-06-17 10:46:19 -07:00
Trond Myklebust
95baa25c73 NFSv4: Fix the case where NFSv4 renewal fails
If the asynchronous lease renewal fails (usually due to a soft timeout),
then we _must_ schedule state recovery in order to ensure that we don't
lose the lease unnecessarily or, if the lease is already lost, that we
recover the locking state promptly...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-05-26 14:51:00 -04:00
Linus Torvalds
8fe74cf053 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  Remove two unneeded exports and make two symbols static in fs/mpage.c
  Cleanup after commit 585d3bc06f
  Trim includes of fdtable.h
  Don't crap into descriptor table in binfmt_som
  Trim includes in binfmt_elf
  Don't mess with descriptor table in load_elf_binary()
  Get rid of indirect include of fs_struct.h
  New helper - current_umask()
  check_unsafe_exec() doesn't care about signal handlers sharing
  New locking/refcounting for fs_struct
  Take fs_struct handling to new file (fs/fs_struct.c)
  Get rid of bumping fs_struct refcount in pivot_root(2)
  Kill unsharing fs_struct in __set_personality()
2009-04-02 21:09:10 -07:00
Al Viro
ce3b0f8d5c New helper - current_umask()
current->fs->umask is what most of fs_struct users are doing.
Put that into a helper function.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2009-03-31 23:00:26 -04:00
Trond Myklebust
7fe5c398fc NFS: Optimise NFS close()
Close-to-open cache consistency rules really only require us to flush out
writes on calls to close(), and require us to revalidate attributes on the
very last close of the file.

Currently we appear to be doing a lot of extra attribute revalidation
and cache flushes.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-03-19 15:35:50 -04:00
Trond Myklebust
72cb77f4a5 NFS: Throttle page dirtying while we're flushing to disk
The following patch is a combination of a patch by myself and Peter
Staubach.

Trond: If we allow other processes to dirty pages while a process is doing
a consistency sync to disk, we can end up never making progress.

Peter: Attached is a patch which addresses a continuing problem with
the NFS client generating out of order WRITE requests.  While
this is compliant with all of the current protocol
specifications, there are servers in the market which can not
handle out of order WRITE requests very well.  Also, this may
lead to sub-optimal block allocations in the underlying file
system on the server.  This may cause the read throughputs to
be reduced when reading the file from the server.

Peter: There has been a lot of work recently done to address out of
order issues on a systemic level.  However, the NFS client is
still susceptible to the problem.  Out of order WRITE
requests can occur when pdflush is in the middle of writing
out pages while the process dirtying the pages calls
generic_file_buffered_write which calls
generic_perform_write which calls
balance_dirty_pages_rate_limited which ends up calling
writeback_inodes which ends up calling back into the NFS
client to writes out dirty pages for the same file that
pdflush happens to be working with.

Signed-off-by: Peter Staubach <staubach@redhat.com>
[modification by Trond to merge the two similar patches]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-03-11 14:10:30 -04:00
Trond Myklebust
a65318bf3a NFSv4: Simplify some cache consistency post-op GETATTRs
Certain asynchronous operations such as write() do not expect
(or care) that other metadata such as the file owner, mode, acls, ...
change. All they want to do is update and/or check the change attribute,
ctime, and mtime.
By skipping the file owner and group update, we also avoid having to do a
potential idmapper upcall for these asynchronous RPC calls.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-03-11 14:10:28 -04:00
Trond Myklebust
69aaaae18f NFSv4: A referral is assumed to always point to a directory.
Fix a bug whereby we would fail to create a mount point for a referral.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-03-11 14:10:28 -04:00
WANG Cong
46f72f57d2 fs/nfs/nfs4proc.c: make nfs4_map_errors() static
nfs4_map_errors() can become static.

Signed-off-by: WANG Cong <wangcong@zeuux.org>
Cc: J. Bruce Fields <bfields@fieldses.org>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-12-30 16:35:55 -05:00
Trond Myklebust
027b6ca021 NFSv4: Fix an infinite loop in the NFS state recovery code
Marten Gajda <marten.gajda@fernuni-hagen.de> states:

I tracked the problem down to the function nfs4_do_open_expired.
Within this function _nfs4_open_expired is called and may return
-NFS4ERR_DELAY. When a further call to _nfs4_open_expired is
executed and does not return -NFS4ERR_DELAY the "exception.retry"
variable is not reset to 0, causing the loop to iterate again
(and as long as err != -NFS4ERR_DELAY, probably forever)

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-12-23 16:04:13 -05:00
Trond Myklebust
dc0b027dfa NFSv4: Convert the open and close ops to use fmode
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-12-23 15:21:56 -05:00
Trond Myklebust
b7391f44f2 NFSv4: Return unreferenced delegations more promptly
If the client is not using a delegation, the right thing to do is to return
it as soon as possible. This helps reduce the amount of state the server
has to track, as well as reducing the potential for conflicts with other
clients.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-12-23 15:21:52 -05:00
Trond Myklebust
e005e8041c NFSv4: Rename the state reclaimer thread
It is really a more general purpose state management thread at this point.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-12-23 15:21:48 -05:00
Trond Myklebust
9e33bed552 NFSv4: Add recovery for individual stateids
NFSv4 defines a number of state errors which the client does not currently
handle. Among those we should worry about are:
  NFS4ERR_ADMIN_REVOKED - the server's administrator revoked our locks
  			  and/or delegations.
  NFS4ERR_BAD_STATEID - the client and server are out of sync, possibly
                        due to a delegation return racing with an OPEN
			request.
  NFS4ERR_OPENMODE - the client attempted to do something not sanctioned
  		     by the open mode of the stateid. Should normally just
		     occur as a result of a delegation return race.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-12-23 15:21:46 -05:00
Trond Myklebust
95d35cb4c4 NFSv4: Remove nfs_client->cl_sem
Now that we're using the flags to indicate state that needs to be
recovered, as well as having implemented proper refcounting and spinlocking
on the state and open_owners, we can get rid of nfs_client->cl_sem. The
only remaining case that was dubious was the file locking, and that case is
now covered by the nfsi->rwsem.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-12-23 15:21:45 -05:00
Trond Myklebust
19e03c570e NFSv4: Ensure that file unlock requests don't conflict with state recovery
The unlock path is currently failing to take the nfs_client->cl_sem read
lock, and hence the recovery path may see locks disappear from underneath
it.
Also ensure that it takes the nfs_inode->rwsem read lock so that it there
is no conflict with delegation recalls.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-12-23 15:21:44 -05:00
Trond Myklebust
65de872ed6 NFS: Remove the unnecessary argument to nfs4_wait_clnt_recover()
...and move some code around in order to clear out an unnecessary
forward declaration.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-12-23 15:21:44 -05:00
Trond Myklebust
7eff03aec9 NFSv4: Add a recovery marking scheme for state owners
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-12-23 15:21:43 -05:00
Trond Myklebust
e598d843c0 NFSv4: Remove redundant RENEW calls if we know the lease has expired
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-12-23 15:21:42 -05:00
Trond Myklebust
b79a4a1b45 NFSv4: Fix state recovery when the client runs over the grace period
If the client for some reason is not able to recover all its state within
the time allotted for the grace period, and the server reboots again, the
client is not allowed to recover the state that was 'lost' using reboot
recovery.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-12-23 15:21:41 -05:00
Trond Myklebust
15c831bf1a NFS: Use atomic bitops when changing struct nfs_delegation->flags
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-12-23 15:21:39 -05:00
Trond Myklebust
343104308a NFSv4: Fix up another delegation related race
When we can update_open_stateid(), we need to be certain that we don't
race with a delegation return. While we could do this by grabbing the
nfs_client->cl_lock, a dedicated spin lock in the delegation structure
will scale better.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-12-23 15:21:38 -05:00
Neil Brown
504e518953 Make nfs_file_cred more robust.
As not all files have an associated open_context (e.g. device special
files), it is safest to test for the existence of the open context
before de-referencing it.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-10-17 13:06:45 -04:00
Trond Myklebust
bba67e0e3f NFS: Remove BKL usage from open()
All the NFSv4 stateful operations are already protected by other locks (in
particular by the rpc_sequence locks.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-07-15 18:10:53 -04:00
Benny Halevy
77e03677ac nfs: initialize timeout variable in nfs4_proc_setclientid_confirm
gcc (4.3.0) rightfully warns about this:
/usr0/export/dev/bhalevy/git/linux-pnfs-bh-nfs41/fs/nfs/nfs4proc.c: In function nfs4_proc_setclientid_confirm:
/usr0/export/dev/bhalevy/git/linux-pnfs-bh-nfs41/fs/nfs/nfs4proc.c:2936: warning: timeout may be used uninitialized in this function

nfs4_delay that's passed a pointer to 'timeout' is looking at its value
and sets it up to some value in the range: NFS4_POLL_RETRY_MIN..NFS4_POLL_RETRY_MAX
	if (*timeout <= 0)
		*timeout = NFS4_POLL_RETRY_MIN;
	if (*timeout > NFS4_POLL_RETRY_MAX)
		*timeout = NFS4_POLL_RETRY_MAX;

Therefore it will end up set to some sane, though rather indeterministic, value.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-07-09 12:09:30 -04:00
Trond Myklebust
f41f741838 NFS: Ensure we zap only the access and acl caches when setting new acls
...and ensure that we obey the NFS_INO_INVALID_ACL flag when retrieving the
acls.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-07-09 12:09:19 -04:00
Trond Myklebust
2e96d28672 NFS: Fix a warning in nfs4_async_handle_error
We're not modifying the nfs_server when we call nfs_inc_server_stats and
friends, so allow the compiler to pass 'const' pointers too.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-07-09 12:09:18 -04:00
Trond Myklebust
46cb650c22 NFS: Remove the redundant file_open entry from struct nfs_rpc_ops
All instances are set to nfs_open(), so we should just remove the redundant
indirection. Ditto for the file_release op

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-07-09 12:09:16 -04:00
Trond Myklebust
659bfcd6dd NFS: Fix the ftruncate() credential problem
ftruncate() access checking is supposed to be performed at open() time,
just like reads and writes.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-07-09 12:09:14 -04:00
Trond Myklebust
57dc9a5747 NFS: Reduce the stack usage in NFSv4 create operations
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-07-09 12:08:37 -04:00
Jan Blunck
31f31db1a1 nfs: path_{get,put}() cleanups
Here are some more places where path_{get,put}() can be used instead of
dput()/mntput() pair.

Signed-off-by: Jan Blunck <jblunck@suse.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-05-16 09:43:30 -07:00
Harvey Harrison
3110ff8048 nfs: replace remaining __FUNCTION__ occurrences
__FUNCTION__ is gcc-specific, use __func__

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-05-16 09:43:29 -07:00
Trond Myklebust
78ea323be6 NFSv4: Don't use cred->cr_ops->cr_name in nfs4_proc_setclientid()
With the recent change to generic creds, we can no longer use
cred->cr_ops->cr_name to distinguish between RPCSEC_GSS principals and
AUTH_SYS/AUTH_NULL identities. Replace it with the rpc_authops->au_name
instead...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-04-19 16:54:53 -04:00
Trond Myklebust
536ff0f809 NFSv4: Ensure we don't corrupt fl->fl_flags in nfs4_proc_unlck
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-04-19 16:53:33 -04:00
Trond Myklebust
c1d519312d NFSv4: Only increment the sequence id if the server saw it
It is quite possible that the OPEN, CLOSE, LOCK, LOCKU,... compounds fail
before the actual stateful operation has been executed (for instance in the
PUTFH call). There is no way to tell from the overall status result which
operations were executed from the COMPOUND.

The fix is to move incrementing of the sequence id into the XDR layer,
so that we do it as we process the results from the stateful operation.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-04-19 16:53:15 -04:00
Trond Myklebust
35d05778e2 NFSv4: Remove bogus call to nfs4_drop_state_owner() in _nfs4_open_expired()
There should be no need to invalidate a perfectly good state owner just
because of a stale filehandle. Doing so can cause the state recovery code
to break, since nfs4_get_renew_cred() and nfs4_get_setclientid_cred() rely
on finding active state owners.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-04-19 16:53:12 -04:00
Trond Myklebust
98a8e32394 SUNRPC: Add a helper rpcauth_lookup_generic_cred()
The NFSv4 protocol allows clients to negotiate security protocols on the
fly in the case where an administrator on the server changes the export
settings and/or in the case where we may have a filesystem migration event.

Instead of having the NFS client code cache credentials that are tied to a
particular AUTH method it is therefore preferable to have a generic credential
that can be converted into whatever AUTH is in use by the RPC client when
the read/write/sillyrename/... is put on the wire.

We do this by means of the new "generic" credential, which basically just
caches the minimal information that is needed to look up an RPCSEC_GSS,
AUTH_SYS, or AUTH_NULL credential.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-03-14 13:42:49 -04:00
Trond Myklebust
5d00837b90 SUNRPC: Run rpc timeout functions as callbacks instead of in softirqs
An audit of the current RPC timeout functions shows that they don't really
ever need to run in the softirq context. As long as the softirq is
able to signal that the wakeup is due to a timeout (which it can do by
setting task->tk_status to -ETIMEDOUT) then the callback functions can just
run as standard task->tk_callback functions (in the rpciod/process
context).

The only possible border-line case would be xprt_timer() for the case of
UDP, when the callback is used to reduce the size of the transport
congestion window. In testing, however, the effect of moving that update
to a callback would appear to be minor.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-25 21:40:44 -08:00
Trond Myklebust
fda1393938 SUNRPC: Convert users of rpc_wake_up_task to use rpc_wake_up_queued_task
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-25 21:40:42 -08:00
Trond Myklebust
101070ca2f NFS: Ensure that the asynchronous RPC calls complete on nfsiod.
We want to ensure that rpc_call_ops that involve mntput() are run on nfsiod
rather than on rpciod, so that they don't deadlock when the resulting
umount calls rpc_shutdown_client(). Hence we specify that read, write and
commit calls must complete on nfsiod.
Ditto for NFSv4 open, lock, locku and close asynchronous calls.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-02-25 21:40:37 -08:00
Jan Blunck
4ac9137858 Embed a struct path into struct nameidata instead of nd->{dentry,mnt}
This is the central patch of a cleanup series. In most cases there is no good
reason why someone would want to use a dentry for itself. This series reflects
that fact and embeds a struct path into nameidata.

Together with the other patches of this series
- it enforced the correct order of getting/releasing the reference count on
  <dentry,vfsmount> pairs
- it prepares the VFS for stacking support since it is essential to have a
  struct path in every place where the stack can be traversed
- it reduces the overall code size:

without patch series:
   text    data     bss     dec     hex filename
5321639  858418  715768 6895825  6938d1 vmlinux

with patch series:
   text    data     bss     dec     hex filename
5320026  858418  715768 6894212  693284 vmlinux

This patch:

Switch from nd->{dentry,mnt} to nd->path.{dentry,mnt} everywhere.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix cifs]
[akpm@linux-foundation.org: fix smack]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-14 21:13:33 -08:00
Linus Torvalds
75659ca0c1 Merge branch 'task_killable' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc
* 'task_killable' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc: (22 commits)
  Remove commented-out code copied from NFS
  NFS: Switch from intr mount option to TASK_KILLABLE
  Add wait_for_completion_killable
  Add wait_event_killable
  Add schedule_timeout_killable
  Use mutex_lock_killable in vfs_readdir
  Add mutex_lock_killable
  Use lock_page_killable
  Add lock_page_killable
  Add fatal_signal_pending
  Add TASK_WAKEKILL
  exit: Use task_is_*
  signal: Use task_is_*
  sched: Use task_contributes_to_load, TASK_ALL and TASK_NORMAL
  ptrace: Use task_is_*
  power: Use task_is_*
  wait: Use TASK_NORMAL
  proc/base.c: Use task_is_*
  proc/array.c: Use TASK_REPORT
  perfmon: Use task_is_*
  ...

Fixed up conflicts in NFS/sunrpc manually..
2008-02-01 11:45:47 +11:00
Trond Myklebust
e6f8107595 NFS: Add an asynchronous delegreturn operation for use in nfs_clear_inode
Otherwise, there is a potential deadlock if the last dput() from an NFSv4
close() or other asynchronous operation leads to nfs_clear_inode calling
the synchronous delegreturn.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:12 -05:00
J. Bruce Fields
3d1c550874 nfs4: allow nfsv4 acls on non-regular-files
The rfc doesn't give any reason it shouldn't be possible to set an
attribute on a non-regular file.  And if the server supports it, then it
shouldn't be up to us to prevent it.

Thanks to Erez for the report and Trond for further analysis.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Tested-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:06:10 -05:00
Trond Myklebust
69dd716c5f NFSv4: Add socket proto argument to setclientid
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:58 -05:00
Chuck Lever
d4d3c50749 NFS: Enable NFS client to generate CLIENTID strings with IPv6 addresses
We recently added methods to RPC transports that provide string versions of
the remote peer address information.  Convert the NFSv4 SETCLIENTID
procedure to use those methods instead of building the client ID out of
whole cloth.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:51 -05:00
Trond Myklebust
bfc69a4566 NFS: define a function to update nfsi->cache_change_attribute
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:47 -05:00
Trond Myklebust
5138fde011 NFS/SUNRPC: Convert all users of rpc_call_setup()
Replace use of rpc_call_setup() with rpc_init_task(), and in cases where we
need to initialise task->tk_action, with rpc_call_start().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:32 -05:00
Trond Myklebust
bdc7f021f3 NFS: Clean up the (commit|read|write)_setup() callback routines
Move the common code for setting up the nfs_write_data and nfs_read_data
structures into fs/nfs/read.c, fs/nfs/write.c and fs/nfs/direct.c.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:32 -05:00
Trond Myklebust
c970aa85e7 SUNRPC: Clean up rpc_run_task
Make it use the new task initialiser structure instead of acting as a
wrapper.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:30 -05:00
Trond Myklebust
2f74c0a056 NFSv4: Clean up the OPEN/CLOSE serialisation code
Reduce the time spent locking the rpc_sequence structure by queuing the
nfs_seqid only when we are ready to take the lock (when calling
nfs_wait_on_sequence).

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-30 02:05:24 -05:00
Trond Myklebust
e6e21970ba NFSv4: Fix open_to_lock_owner sequenceid allocation...
NFSv4 file locking is currently completely broken since it doesn't respect
the OPEN sequencing when it is given an unconfirmed lock_owner and needs to
do an open_to_lock_owner. Worse: it breaks the sunrpc rules by doing a
GFP_KERNEL allocation inside an rpciod callback.

Fix is to preallocate the open seqid structure in nfs4_alloc_lockdata if we
see that the lock_owner is unconfirmed.
Then, in nfs4_lock_prepare() we wait for either the open_seqid, if
the lock_owner is still unconfirmed, or else fall back to waiting on the
standard lock_seqid.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-03 09:37:17 -05:00
Trond Myklebust
bb22629ee8 NFSv4: nfs4_open_confirm must not set the open_owner as confirmed on error
RFC3530 states that the open_owner is confirmed if and only if the client
sends an OPEN_CONFIRM request with the appropriate sequence id and stateid
within the lease period.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-01-03 09:37:17 -05:00
Matthew Wilcox
150030b78a NFS: Switch from intr mount option to TASK_KILLABLE
By using the TASK_KILLABLE infrastructure, we can get rid of the 'intr'
mount option.  We have to use _killable everywhere instead of _interruptible
as we get rid of rpc_clnt_sigmask/sigunmask.

Signed-off-by: Liam R. Howlett <howlett@gmail.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2007-12-06 17:40:25 -05:00
Trond Myklebust
a49c3c7736 NFSv4: Ensure that we wait for the CLOSE request to complete
Otherwise, we do end up breaking close-to-open semantics. We also end up
breaking some of the silly-rename tests in Connectathon on some setups.

Please refer to the bug-report at
	http://bugzilla.linux-nfs.org/show_bug.cgi?id=150

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-19 17:19:25 -04:00
Trond Myklebust
565277f63c NFS: Fix a race in sillyrename
lookup() and sillyrename() can race one another because the sillyrename()
completion cannot take the parent directory's inode->i_mutex since the
latter may be held by whoever is calling dput().

We therefore have little option but to add extra locking to ensure that
nfs_lookup() and nfs_atomic_open() do not race with the sillyrename
completion.
If somebody has looked up the sillyrenamed file in the meantime, we just
transfer the sillydelete information to the new dentry.

Please refer to the bug-report at
	http://bugzilla.linux-nfs.org/show_bug.cgi?id=150

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-19 17:19:16 -04:00
Trond Myklebust
40d2470409 NFS: Fix a connectathon regression in NFSv3 and NFSv4
We're failing basic test6 against Linux servers because they lack a correct
change attribute. The fix is to assume that we always want to invalidate
the readdir caches when we call update_changeattr and/or
nfs_post_op_update_inode on a directory.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:20:47 -04:00
Trond Myklebust
9e08a3c5ae NFS: Use nfs_refresh_inode() in ops that aren't expected to change the inode
nfs_post_op_update_inode() is really only meant to be used if we expect the
inode and its attributes to have changed in some way.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:20:45 -04:00
Trond Myklebust
d75340cc4d NFSv4: Fix nfs_atomic_open() to set the verifier on negative dentries too
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:20:06 -04:00
Trond Myklebust
d4d9cdcb47 NFS: Don't hash the negative dentry when optimising for an O_EXCL open
We don't want to leave an unverified hashed negative dentry if the
exclusive create fails to complete.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:19:27 -04:00
Trond Myklebust
70ca88521f NFS: Fake up 'wcc' attributes to prevent cache invalidation after write
NFSv2 and v4 don't offer weak cache consistency attributes on WRITE calls.
In NFSv3, returning wcc data is optional. In all cases, we want to prevent
the client from invalidating our cached data whenever ->write_done()
attempts to update the inode attributes.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:19:15 -04:00
Trond Myklebust
8850df999c NFS: Fix atime revalidation in read()
NFSv3 will correctly update atime on a read() call, so there is no need to
set the NFS_INO_INVALID_ATIME flag unless the call to nfs_refresh_inode()
fails.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:19:06 -04:00
Trond Myklebust
c481299839 NFS: Fix atime revalidation in readdir()
NFSv3 will correctly update atime on a readdir call, so there is no need to
set the NFS_INO_INVALID_ATIME flag unless the call to nfs_refresh_inode()
fails.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:19:03 -04:00
Trond Myklebust
17cadc9537 NFS: Don't force a dcache revalidation if nfs_wcc_update_inode succeeds
The reason is that if the weak cache consistency update was successful,
then we know that our client must be the only one that changed the
directory, and we've already updated the dcache to reflect the change.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:18:55 -04:00
Trond Myklebust
76b32999df NFSv4: Make NFSv4 ACCESS calls return attributes too...
It doesn't really make sense to cache an access call without also
revalidating the attributes.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:18:38 -04:00
Trond Myklebust
af22f94ae0 NFSv4: Simplify _nfs4_do_access()
Currently, _nfs4_do_access() is just a copy of nfs_do_access() with added
conversion of the open flags into an access mask. This patch merges the
duplicate functionality.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:18:34 -04:00
Trond Myklebust
cd3758e37d NFS: Replace file->private_data with calls to nfs_file_open_context()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:18:31 -04:00
Peter Staubach
4e769b934e 64 bit ino support for NFS client
Hi.

Attached is a patch to modify the NFS client code to support
64 bit ino's, as appropriate for the system and the NFS
protocol version.

The code basically just expand the NFS interfaces for routines
which handle ino's from using ino_t to u64 and then uses the
fileid in the nfs_inode instead of i_ino in the inode.  The
code paths that were updated are in the getattr method and
the readdir methods.

This should be no real change on 64 bit platforms.  Since
the ino_t is an unsigned long, it would already be 64 bits
wide.

    Thanx...

           ps

Signed-off-by: Peter Staubach <staubach@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-10-09 17:15:29 -04:00
Trond Myklebust
deee9369b9 NFSv4: Ensure that we pass the correct dentry to nfs4_intent_set_file
This patch fixes an Oops that was reported by Gabriel Barazer.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-09-01 10:14:38 -04:00
Trond Myklebust
65bbf6bdbb NFSv4: Fix a typo in _nfs4_do_open_reclaim
This should fix the following Oops reported by Jeff Garzik:

kernel BUG at fs/nfs/nfs4xdr.c:1040!
invalid opcode: 0000 [1] SMP 
CPU 0 
Modules linked in: nfs lockd sunrpc af_packet
ipv6 cpufreq_ondemand acpi_cpufreq battery floppy nvram sg snd_hda_intel
ata_generic snd_pcm_oss snd_mixer_oss snd_pcm i2c_i801 snd_page_alloc e1000
firewire_ohci ata_piix i2c_core sr_mod cdrom sata_sil ahci libata sd_mod
scsi_mod ext3 jbd ehci_hcd uhci_hcd
Pid: 16353, comm: 10.10.10.1-recl Not tainted 2.6.23-rc3 #1
RIP: 0010:[<ffffffff88240980>] [<ffffffff88240980>] :nfs:encode_open+0x1c0/0x330
RSP: 0018:ffff8100467c5c60  EFLAGS: 00010202
RAX: ffff81000f89b8b8 RBX: 00000000697a6f6d RCX: ffff81000f89b8b8
RDX: 0000000000000004 RSI: 0000000000000004 RDI: ffff8100467c5c80
RBP: ffff8100467c5c80 R08: ffff81000f89bc30 R09: ffff81000f89b83f
R10: 0000000000000001 R11: ffffffff881e79e0 R12: ffff81003cbd1808
R13: ffff81000f89b860 R14: ffff81005fc984e0 R15: ffffffff88240af0
FS:  0000000000000000(0000) GS:ffffffff8052a000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002adb9e51a030 CR3: 000000007ea7e000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process 10.10.10.1-recl (pid: 16353, threadinfo ffff8100467c4000, task ffff8100038ce780)
Stack:  ffff81004aeb6a40 ffff81003cbd1808 ffff81003cbd1808 ffffffff88240b5d
 ffff81000f89b8bc ffff81005fc984e8 ffff81000f89bc30 ffff81005fc984e8
 0000000300000000 0000000000000000 0000000000000000 ffff81003cbd1800
Call Trace:
 [<ffffffff88240b5d>] :nfs:nfs4_xdr_enc_open_noattr+0x6d/0x90
 [<ffffffff881e74b7>] :sunrpc:rpcauth_wrap_req+0x97/0xf0
 [<ffffffff88240af0>] :nfs:nfs4_xdr_enc_open_noattr+0x0/0x90
 [<ffffffff881df57a>] :sunrpc:call_transmit+0x18a/0x290
 [<ffffffff881e5e7b>] :sunrpc:__rpc_execute+0x6b/0x290
 [<ffffffff881dff76>] :sunrpc:rpc_do_run_task+0x76/0xd0
 [<ffffffff882373f6>] :nfs:_nfs4_proc_open+0x76/0x230
 [<ffffffff88237a2e>] :nfs:nfs4_open_recover_helper+0x5e/0xc0
 [<ffffffff88237b74>] :nfs:nfs4_open_recover+0xe4/0x120
 [<ffffffff88238e14>] :nfs:nfs4_open_reclaim+0xa4/0xf0
 [<ffffffff882413c5>] :nfs:nfs4_reclaim_open_state+0x55/0x1b0
 [<ffffffff882417ea>] :nfs:reclaimer+0x2ca/0x390
 [<ffffffff88241520>] :nfs:reclaimer+0x0/0x390
 [<ffffffff8024e59b>] kthread+0x4b/0x80
 [<ffffffff8020cad8>] child_rip+0xa/0x12
 [<ffffffff8024e550>] kthread+0x0/0x80
 [<ffffffff8020cace>] child_rip+0x0/0x12


Code: 0f 0b eb fe 48 89 ef c7 00 00 00 00 02 be 08 00 00 00 e8 79 
RIP  [<ffffffff88240980>] :nfs:encode_open+0x1c0/0x330
 RSP <ffff8100467c5c60>

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-09-01 10:14:37 -04:00
Trond Myklebust
45328c354e NFS: Fix NFSv4 open stateid regressions
Do not allow cached open for O_RDONLY or O_WRONLY unless the file has been
previously opened in these modes.

Also Fix the calculation of the mode in nfs4_close_prepare. We should only
issue an OPEN_DOWNGRADE if we're sure that we will still be holding the
correct open modes. This may not be the case if we've been doing delegated
opens.

Finally, there is no need to adjust the open mode bit flags in
nfs4_close_done(): that has already been done in nfs4_close_prepare().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-08-07 15:13:19 -04:00
Trond Myklebust
e4eff1a622 SUNRPC: Clean up the sillyrename code
Fix a couple of bugs:
 - Don't rely on the parent dentry still being valid when the call completes.
   Fixes a race with shrink_dcache_for_umount_subtree()

 - Don't remove the file if the filehandle has been labelled as stale.

Fix a couple of inefficiencies
 - Remove the global list of sillyrenamed files. Instead we can cache the
   sillyrename information in the dentry->d_fsdata
 - Move common code from unlink_setup/unlink_done into fs/nfs/unlink.c

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-19 15:21:39 -04:00
Trond Myklebust
4fdc17b2a7 NFS: Introduce struct nfs_removeargs+nfs_removeres
We need a common structure for setting up an unlink() rpc call in order to
fix the asynchronous unlink code.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-19 15:21:39 -04:00
Trond Myklebust
9936781d01 NFSv4: Try to recover from getfh failures in nfs4_xdr_dec_open
Try harder to recover the open state if the server failed to return a
filehandle.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-19 15:09:03 -04:00
Trond Myklebust
56659e9926 NFSv4: 'constify' lookup arguments.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-19 15:09:03 -04:00
Trond Myklebust
6f220ed5a8 NFSv4: Fix open state recovery
Ensure that opendata->state is always initialised when we do state
recovery.

Ensure that we set the filehandle in the case where we're doing an
"OPEN_CLAIM_PREVIOUS" call due to a server reboot.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-19 15:09:03 -04:00
Frank Filz
137d6acaa6 NFSv4: Make sure unlock is really an unlock when cancelling a lock
I ran into a curious issue when a lock is being canceled. The
cancellation results in a lock request to the vfs layer instead of an
unlock request. This is particularly insidious when the process that
owns the lock is exiting. In that case, sometimes the erroneous lock is
applied AFTER the process has entered zombie state, preventing the lock
from ever being released. Eventually other processes block on the lock
causing a slow degredation of the system. In the 2.6.16 kernel this was
investigated on, the problem is compounded by the fact that the cl_sem
is held while blocking on the vfs lock, which results in most processes
accessing the nfs file system in question hanging.

In more detail, here is how the situation occurs:

first _nfs4_do_setlk():

static int _nfs4_do_setlk(struct nfs4_state *state, int cmd, struct file_lock *fl, int reclaim)
...
        ret = nfs4_wait_for_completion_rpc_task(task);
        if (ret == 0) {
...
        } else
                data->cancelled = 1;

then nfs4_lock_release():

static void nfs4_lock_release(void *calldata)
...
        if (data->cancelled != 0) {
                struct rpc_task *task;
                task = nfs4_do_unlck(&data->fl, data->ctx, data->lsp,
                                data->arg.lock_seqid);

The problem is the same file_lock that was passed in to _nfs4_do_setlk()
gets passed to nfs4_do_unlck() from nfs4_lock_release(). So the type is
still F_RDLCK or FWRLCK, not F_UNLCK. At some point, when cancelling the
lock, the type needs to be changed to F_UNLCK. It seemed easiest to do
that in nfs4_do_unlck(), but it could be done in nfs4_lock_release().
The concern I had with doing it there was if something still needed the
original file_lock, though it turns out the original file_lock still
needs to be modified by nfs4_do_unlck() because nfs4_do_unlck() uses the
original file_lock to pass to the vfs layer, and a copy of the original
file_lock for the RPC request.

It seems like the simplest solution is to force all situations where
nfs4_do_unlck() is being used to result in an unlock, so with that in
mind, I made the following change:

Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:49 -04:00
Trond Myklebust
8bda4e4c98 NFSv4: Fix up stateid locking...
We really don't need to grab both the state->so_owner and the
inode->i_lock.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:43 -04:00
Trond Myklebust
1ac7e2fd35 NFSv4: Clean up the callers of nfs4_open_recover_helper()
Rely on nfs4_try_open_cached() when appropriate.

Also fix an RCU violation in _nfs4_do_open_reclaim()

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:43 -04:00
Trond Myklebust
6ee4126890 NFSv4: Don't call OPEN if we already have an open stateid for a file
If we already have a stateid with the correct open mode for a given file,
then we can reuse that stateid instead of re-issuing an OPEN call without
violating the close-to-open caching semantics.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:43 -04:00
Trond Myklebust
aac00a8d0a NFSv4: Check for the existence of a delegation in nfs4_open_prepare()
We should not be calling open() on an inode that has a delegation unless
we're doing a reclaim.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:43 -04:00
Trond Myklebust
3e309914a1 NFSv4: Clean up _nfs4_proc_open()
Use a flag instead of the 'data->rpc_status = -ENOMEM hack.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:42 -04:00
Trond Myklebust
1b370bc28f NFSv4: Allow nfs4_opendata_to_nfs4_state to return errors.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:42 -04:00
Trond Myklebust
6f43ddccb3 NFSv4: Improve the debugging of bad sequence id errors...
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:42 -04:00
Trond Myklebust
003707c722 NFSv4: Always use the delegation if we have one
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:41 -04:00
Trond Myklebust
0f9f95e0ad NFSv4: Clean up confirmation of sequence ids...
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:41 -04:00
Trond Myklebust
13437e12fb NFSv4: Support recalling delegations by stateid part 2
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:41 -04:00
Trond Myklebust
2ced46c270 NFSv4: Fix up a bug in nfs4_open_recover()
Don't clobber the delegation info...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:40 -04:00
Trond Myklebust
549d6ed5e8 NFSv4: set the delegation in nfs4_opendata_to_nfs4_state
This ensures that nfs4_open_release() and nfs4_open_confirm_release()
can now handle an eventual delegation that was returned with out open.
As such, it fixes a delegation "leak" when the user breaks out of an open
call.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:40 -04:00
Trond Myklebust
1b45c46cf7 NFSv4: Fix atomic open for execute...
Currently we do not check for the FMODE_EXEC flag as we should. For that
particular case, we need to perform an ACCESS call to the server in order
to check that the file is executable.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:40 -04:00
Trond Myklebust
9f958ab885 NFSv4: Reduce the chances of an open_owner identifier collision
Currently we just use a 32-bit counter.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:39 -04:00
Trond Myklebust
4e56e082dd NFSv4: Clean up _nfs4_proc_lookup() vs _nfs4_proc_lookupfh()
They differ only slightly in the arguments they take. Why have they not
been merged?

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:38 -04:00
Trond Myklebust
c6d00e639b NFSv4: Convert struct nfs4_opendata to use struct kref
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:28 -04:00
Jeff Layton
aa53ed541a NFS4: on a O_EXCL OPEN make sure SETATTR sets the fields holding the verifier
The Linux NFS4 client simply skips over the bitmask in an O_EXCL open
call and so it doesn't bother to reset any fields that may be holding
the verifier. This patch has us save the first two words of the bitmask
(which is all the current client has #defines for). The client then
later checks this bitmask and turns on the appropriate flags in the
sattr->ia_verify field for the following SETATTR call.

This patch only currently checks to see if the server used the atime
and mtime slots for the verifier (which is what the Linux server uses
for this). I'm not sure of what other fields the server could
reasonably use, but adding checks for others should be trivial.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:25 -04:00
Trond Myklebust
b39e625b6e NFSv4: Clean up nfs4_call_async()
Use rpc_run_task() instead of doing it ourselves.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:24 -04:00
Trond Myklebust
4a35bd41af NFSv4: Ensure that nfs4_do_close() doesn't race with umount
nfs4_do_close() does not currently have any way to ensure that the user
won't attempt to unmount the partition while the asynchronous RPC call
is completing. This again may cause Oopses in nfs_update_inode().

Add a vfsmount argument to nfs4_close_state to ensure that the partition
remains mounted while we're closing the file.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:24 -04:00
Trond Myklebust
ad389da79f NFSv4: Ensure asynchronous open() calls always pin the mountpoint
A number of race conditions may currently ensue if the user presses ^C
and then unmounts the partition while an asynchronous open() is in
progress.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:24 -04:00
Trond Myklebust
539cd03a57 NFSv4: Cleanup: pass the nfs_open_context to open recovery code
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:24 -04:00
Trond Myklebust
88be9f990f NFS: Replace vfsmount and dentry in nfs_open_context with struct path
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-07-10 23:40:23 -04:00
Trond Myklebust
10afec9081 NFS: Fix some 'sparse' warnings...
- fs/nfs/dir.c:610:8: warning: symbol 'nfs_llseek_dir' was not declared.
   Should it be static?
 - fs/nfs/dir.c:636:5: warning: symbol 'nfs_fsync_dir' was not declared.
   Should it be static?
 - fs/nfs/write.c:925:19: warning: symbol 'req' shadows an earlier one
 - fs/nfs/write.c:61:6: warning: symbol 'nfs_commit_rcu_free' was not
   declared. Should it be static?
 - fs/nfs/nfs4proc.c:793:5: warning: symbol 'nfs4_recover_expired_lease'
   was not declared. Should it be static?

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-05-14 19:33:46 -04:00
Linus Torvalds
2d56d3c43c Merge branch 'server-cluster-locking-api' of git://linux-nfs.org/~bfields/linux
* 'server-cluster-locking-api' of git://linux-nfs.org/~bfields/linux:
  gfs2: nfs lock support for gfs2
  lockd: add code to handle deferred lock requests
  lockd: always preallocate block in nlmsvc_lock()
  lockd: handle test_lock deferrals
  lockd: pass cookie in nlmsvc_testlock
  lockd: handle fl_grant callbacks
  lockd: save lock state on deferral
  locks: add fl_grant callback for asynchronous lock return
  nfsd4: Convert NFSv4 to new lock interface
  locks: add lock cancel command
  locks: allow {vfs,posix}_lock_file to return conflicting lock
  locks: factor out generic/filesystem switch from setlock code
  locks: factor out generic/filesystem switch from test_lock
  locks: give posix_test_lock same interface as ->lock
  locks: make ->lock release private data before returning in GETLK case
  locks: create posix-to-flock helper functions
  locks: trivial removal of unnecessary parentheses
2007-05-07 12:34:24 -07:00
J. Bruce Fields
70cc6487a4 locks: make ->lock release private data before returning in GETLK case
The file_lock argument to ->lock is used to return the conflicting lock
when found.  There's no reason for the filesystem to return any private
information with this conflicting lock, but nfsv4 is.

Fix nfsv4 client, and modify locks.c to stop calling fl_release_private
for it in this case.

Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Cc: "Trond Myklebust" <Trond.Myklebust@netapp.com>"
2007-05-06 17:38:19 -04:00
J. Bruce Fields
08efa202eb NFS4: invalidate cached acl on setacl
The ACL that the server sets may not be exactly the one we set--for
example, it may silently turn off bits that it does not support.  So we
should remove any cached ACL so that any subsequent request for the ACL
will go to the server.

Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-05-02 07:36:09 -07:00
Trond Myklebust
d9bc125caf Merge branch 'master' of /home/trondmy/kernel/linux-2.6/
Conflicts:

	net/sunrpc/auth_gss/gss_krb5_crypto.c
	net/sunrpc/auth_gss/gss_spkm3_token.c
	net/sunrpc/clnt.c

Merge with mainline and fix conflicts.
2007-02-12 22:43:25 -08:00
Arjan van de Ven
92e1d5be91 [PATCH] mark struct inode_operations const 2
Many struct inode_operations in the kernel can be "const".  Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data.  In addition it'll catch accidental writes at compile time to
these shared resources.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-12 09:48:46 -08:00
Trond Myklebust
e148582e10 NFSv4: Add lockdep checks to nfs4_wait_clnt_recover()
Attempt to detect deadlocks due to caller holding locks on clp->cl_sem

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-02-03 15:35:08 -08:00
Trond Myklebust
a6a352e93d NFSv4: Don't start state recovery in nfs4_close_done()
We might not even have any open files at this point...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-02-03 15:35:08 -08:00
Trond Myklebust
8e0969f045 NFS: Remove nfs_readpage_sync()
It makes no sense to maintain 2 parallel systems for reading in pages.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-02-03 15:35:06 -08:00
Trond Myklebust
c228fd3aee NFSv4: Cleanups for fs_locations code.
Start long arduous project...  What the hell is

	struct dentry = {};

all about?

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2007-02-03 15:35:06 -08:00
Robert P. J. Day
5cbded585d [PATCH] getting rid of all casts of k[cmz]alloc() calls
Run this:

	#!/bin/sh
	for f in $(grep -Erl "\([^\)]*\) *k[cmz]alloc" *) ; do
	  echo "De-casting $f..."
	  perl -pi -e "s/ ?= ?\([^\)]*\) *(k[cmz]alloc) *\(/ = \1\(/" $f
	done

And then go through and reinstate those cases where code is casting pointers
to non-pointers.

And then drop a few hunks which conflicted with outstanding work.

Cc: Russell King <rmk@arm.linux.org.uk>, Ian Molton <spyro@f2s.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Greg KH <greg@kroah.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Paul Fulghum <paulkf@microgate.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Karsten Keil <kkeil@suse.de>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Ian Kent <raven@themaw.net>
Cc: Steven French <sfrench@us.ibm.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Cc: Jaroslav Kysela <perex@suse.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-12-13 09:05:58 -08:00
Trond Myklebust
200baa2112 NFS: Remove nfs_writepage_sync()
Maintaining two parallel ways of doing synchronous writes is rather
pointless. This patch gets rid of the legacy nfs_writepage_sync(), and
replaces it with the faster asynchronous writes.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:38 -05:00
Frank Filz
cae823c4c0 NFS: Remove use of the Big Kernel Lock around calls to rpc_call_sync
Remove use of the Big Kernel Lock around calls to rpc_call_sync.

Signed-off-by: Frank Filz <ffilz@us.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:31 -05:00
Trond Myklebust
e6b3c4db6f Fix a second potential rpc_wakeup race...
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-12-06 10:46:25 -05:00
Al Viro
bc4785cd47 [PATCH] nfs: verifier is network-endian
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20 10:26:40 -07:00
Al Viro
0dbb4c6799 [PATCH] xdr annotations: NFS readdir entries
on-the-wire data is big-endian

[in large part pulled from Alexey's patch]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20 10:26:40 -07:00
Chuck Lever
b87c0adfea [PATCH] NFS: remove unused check in nfs4_open_revalidate
Coverity spotted a superfluous error check in nfs4_open_revalidate().  Remove
it.

Coverity: #cid 847

Test plan:
Code inspection; another pass through Coverity.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-10-20 10:26:39 -07:00
Trond Myklebust
2066fe89b4 NFSv4: Poll more aggressively when handling NFS4ERR_DELAY
Change the initial retry delay from 1s to 0.1s (and then back off
exponentially).

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:25:04 -04:00
Trond Myklebust
c514983d8d NFSv4: Handle the condition NFS4ERR_FILE_OPEN
Retry a few times before we give up: the error is usually due to ordering
issues with asynchronous RPC calls.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:25:03 -04:00
Trond Myklebust
6b30954ebb NFSv4: Retry lease recovery if it failed during a synchronous operation.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:25:03 -04:00
Chuck Lever
94a6d75320 NFS: Use cached page as buffer for NFS symlink requests
Now that we have a copy of the symlink path in the page cache, we can pass
a struct page down to the XDR routines instead of a string buffer.

Test plan:
Connectathon, all NFS versions.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:53 -04:00
Chuck Lever
4f390c152b NFS: Fix double d_drop in nfs_instantiate() error path
If the LOOKUP or GETATTR in nfs_instantiate fail, nfs_instantiate will do a
d_drop before returning.  But some callers already do a d_drop in the case
of an error return.  Make certain we do only one d_drop in all error paths.

This issue was introduced because over time, the symlink proc API diverged
slightly from the create/mkdir/mknod proc API.  To prevent other coding
mistakes of this type, change the symlink proc API to be more like
create/mkdir/mknod and move the nfs_instantiate call into the symlink proc
routines so it is used in exactly the same way for create, mkdir, mknod,
and symlink.

Test plan:
Connectathon, all versions of NFS.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:52 -04:00
David Howells
54ceac4515 NFS: Share NFS superblocks per-protocol per-server per-FSID
The attached patch makes NFS share superblocks between mounts from the same
server and FSID over the same protocol.

It does this by creating each superblock with a false root and returning the
real root dentry in the vfsmount presented by get_sb(). The root dentry set
starts off as an anonymous dentry if we don't already have the dentry for its
inode, otherwise it simply returns the dentry we already have.

We may thus end up with several trees of dentries in the superblock, and if at
some later point one of anonymous tree roots is discovered by normal filesystem
activity to be located in another tree within the superblock, the anonymous
root is named and materialises attached to the second tree at the appropriate
point.

Why do it this way? Why not pass an extra argument to the mount() syscall to
indicate the subpath and then pathwalk from the server root to the desired
directory? You can't guarantee this will work for two reasons:

 (1) The root and intervening nodes may not be accessible to the client.

     With NFS2 and NFS3, for instance, mountd is called on the server to get
     the filehandle for the tip of a path. mountd won't give us handles for
     anything we don't have permission to access, and so we can't set up NFS
     inodes for such nodes, and so can't easily set up dentries (we'd have to
     have ghost inodes or something).

     With this patch we don't actually create dentries until we get handles
     from the server that we can use to set up their inodes, and we don't
     actually bind them into the tree until we know for sure where they go.

 (2) Inaccessible symbolic links.

     If we're asked to mount two exports from the server, eg:

	mount warthog:/warthog/aaa/xxx /mmm
	mount warthog:/warthog/bbb/yyy /nnn

     We may not be able to access anything nearer the root than xxx and yyy,
     but we may find out later that /mmm/www/yyy, say, is actually the same
     directory as the one mounted on /nnn. What we might then find out, for
     example, is that /warthog/bbb was actually a symbolic link to
     /warthog/aaa/xxx/www, but we can't actually determine that by talking to
     the server until /warthog is made available by NFS.

     This would lead to having constructed an errneous dentry tree which we
     can't easily fix. We can end up with a dentry marked as a directory when
     it should actually be a symlink, or we could end up with an apparently
     hardlinked directory.

     With this patch we need not make assumptions about the type of a dentry
     for which we can't retrieve information, nor need we assume we know its
     place in the grand scheme of things until we actually see that place.

This patch reduces the possibility of aliasing in the inode and page caches for
inodes that may be accessed by more than one NFS export. It also reduces the
number of superblocks required for NFS where there are many NFS exports being
used from a server (home directory server + autofs for example).

This in turn makes it simpler to do local caching of network filesystems, as it
can then be guaranteed that there won't be links from multiple inodes in
separate superblocks to the same cache file.

Obviously, cache aliasing between different levels of NFS protocol could still
be a problem, but at least that gives us another key to use when indexing the
cache.

This patch makes the following changes:

 (1) The server record construction/destruction has been abstracted out into
     its own set of functions to make things easier to get right.  These have
     been moved into fs/nfs/client.c.

     All the code in fs/nfs/client.c has to do with the management of
     connections to servers, and doesn't touch superblocks in any way; the
     remaining code in fs/nfs/super.c has to do with VFS superblock management.

 (2) The sequence of events undertaken by NFS mount is now reordered:

     (a) A volume representation (struct nfs_server) is allocated.

     (b) A server representation (struct nfs_client) is acquired.  This may be
     	 allocated or shared, and is keyed on server address, port and NFS
     	 version.

     (c) If allocated, the client representation is initialised.  The state
     	 member variable of nfs_client is used to prevent a race during
     	 initialisation from two mounts.

     (d) For NFS4 a simple pathwalk is performed, walking from FH to FH to find
     	 the root filehandle for the mount (fs/nfs/getroot.c).  For NFS2/3 we
     	 are given the root FH in advance.

     (e) The volume FSID is probed for on the root FH.

     (f) The volume representation is initialised from the FSINFO record
     	 retrieved on the root FH.

     (g) sget() is called to acquire a superblock.  This may be allocated or
     	 shared, keyed on client pointer and FSID.

     (h) If allocated, the superblock is initialised.

     (i) If the superblock is shared, then the new nfs_server record is
     	 discarded.

     (j) The root dentry for this mount is looked up from the root FH.

     (k) The root dentry for this mount is assigned to the vfsmount.

 (3) nfs_readdir_lookup() creates dentries for each of the entries readdir()
     returns; this function now attaches disconnected trees from alternate
     roots that happen to be discovered attached to a directory being read (in
     the same way nfs_lookup() is made to do for lookup ops).

     The new d_materialise_unique() function is now used to do this, thus
     permitting the whole thing to be done under one set of locks, and thus
     avoiding any race between mount and lookup operations on the same
     directory.

 (4) The client management code uses a new debug facility: NFSDBG_CLIENT which
     is set by echoing 1024 to /proc/net/sunrpc/nfs_debug.

 (5) Clone mounts are now called xdev mounts.

 (6) Use the dentry passed to the statfs() op as the handle for retrieving fs
     statistics rather than the root dentry of the superblock (which is now a
     dummy).

Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:37 -04:00
David Howells
8fa5c000d7 NFS: Move rpc_ops from nfs_server to nfs_client
Move the rpc_ops from the nfs_server struct to the nfs_client struct as they're
common to all server records of a particular NFS protocol version.

Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:35 -04:00
David Howells
1f163415dc NFS: Make better use of inode* dereferencing macros
Make better use of inode* dereferencing macros to hide dereferencing chains
(including NFS_PROTO and NFS_CLIENT).

Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:35 -04:00
David Howells
509de81116 NFS: Add extra const qualifiers
Add some extra const qualifiers into NFS.

Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:34 -04:00
David Howells
24c8dbbb5f NFS: Generalise the nfs_client structure
Generalise the nfs_client structure by:

 (1) Moving nfs_client to a more general place (nfs_fs_sb.h).

 (2) Renaming its maintenance routines to be non-NFS4 specific.

 (3) Move those maintenance routines to a new non-NFS4 specific file (client.c)
     and move the declarations to internal.h.

 (4) Make nfs_find/get_client() take a full sockaddr_in to include the port
     number (will be required for NFS2/3).

 (5) Make nfs_find/get_client() take the NFS protocol version (again will be
     required to differentiate NFS2, 3 & 4 client records).

Also:

 (6) Make nfs_client construction proceed akin to inodes, marking them as under
     construction and providing a function to indicate completion.

 (7) Make nfs_get_client() wait interruptibly if it finds a client that it can
     share, but that client is currently being constructed.

 (8) Make nfs4_create_client() use (6) and (7) instead of locking cl_sem.

Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:33 -04:00
David Howells
e9326dcab4 NFS: Add a server capabilities NFS RPC op
Add a set_capabilities NFS RPC op so that the server capabilities can be set.

Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:33 -04:00
David Howells
2b3de4411b NFS: Add a lookupfh NFS RPC op
Add a lookup filehandle NFS RPC op so that a file handle can be looked up
without requiring dentries and inodes and other VFS stuff when doing an NFS4
pathwalk during mounting.

Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:32 -04:00
David Howells
7539bbab80 NFS: Rename nfs_server::nfs4_state
Rename nfs_server::nfs4_state to nfs_client as it will be used to represent the
client state for NFS2 and NFS3 also.

Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:32 -04:00
David Howells
adfa6f980b NFS: Rename struct nfs4_client to struct nfs_client
Rename struct nfs4_client to struct nfs_client so that it can become the basis
for a general client record for NFS2 and NFS3 in addition to NFS4.

Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:31 -04:00
Trond Myklebust
76723de0cf NFSv4: Fix incorrect semaphore release in _nfs4_do_open()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-19 11:54:53 -04:00
Trond Myklebust
16b4289c74 NFSv4: Add v4 exception handling for the ACL functions.
This is needed in order to handle any NFS4ERR_DELAY errors that might be
returned by the server. It also ensures that we map the NFSv4 errors before
they are returned to userland.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
(cherry picked from 71c12b3f0abc7501f6ed231a6d17bc9c05a238dc commit)
2006-08-24 15:54:13 -04:00
Trond Myklebust
01c3b861cd NLM,NFSv4: Wait on local locks before we put RPC calls on the wire
Use FL_ACCESS flag to test and/or wait for local locks before we try
requesting a lock from the server

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-07-05 13:13:18 -04:00
Trond Myklebust
42a2d13eee NFSv4: Ensure nfs4_lock_expired() caches delegated locks
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-07-05 13:13:18 -04:00
Trond Myklebust
9b07357490 NLM,NFSv4: Don't put UNLOCK requests on the wire unless we hold a lock
Use the new behaviour of {flock,posix}_file_lock(F_UNLCK) to determine if
we held a lock, and only send the RPC request to the server if this was the
case.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-07-05 13:13:17 -04:00
David Howells
f7b422b17e NFS: Split fs/nfs/inode.c
As fs/nfs/inode.c is rather large, heterogenous and unwieldy, the attached
patch splits it up into a number of files:

 (*) fs/nfs/inode.c

     Strictly inode specific functions.

 (*) fs/nfs/super.c

     Superblock management functions for NFS and NFS4, normal access, clones
     and referrals.  The NFS4 superblock functions _could_ move out into a
     separate conditionally compiled file, but it's probably not worth it as
     there're so many common bits.

 (*) fs/nfs/namespace.c

     Some namespace-specific functions have been moved here.

 (*) fs/nfs/nfs4namespace.c

     NFS4-specific namespace functions (this could be merged into the previous
     file).  This file is conditionally compiled.

 (*) fs/nfs/internal.h

     Inter-file declarations, plus a few simple utility functions moved from
     fs/nfs/inode.c.

     Additionally, all the in-.c-file externs have been moved here, and those
     files they were moved from now includes this file.

For the most part, the functions have not been changed, only some multiplexor
functions have changed significantly.

I've also:

 (*) Added some extra banner comments above some functions.

 (*) Rearranged the function order within the files to be more logical and
     better grouped (IMO), though someone may prefer a different order.

 (*) Reduced the number of #ifdefs in .c files.

 (*) Added missing __init and __exit directives.

Signed-Off-By: David Howells <dhowells@redhat.com>
2006-06-09 09:34:33 -04:00
Manoj Naik
6b97fd3da1 NFSv4: Follow a referral
Respond to a moved error on NFS lookup by setting up the referral.
Note: We don't actually follow the referral during lookup/getattr, but
later when we detect fsid mismatch in inode revalidation (similar to the
processing done for cloning submounts). Referrals will have fake attributes
until they are actually followed or traversed.

Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:29 -04:00
Manoj Naik
830b8e33fe NFSv4: Define an fs_locations bitmap
This is (similar to getattr bitmap) but includes fs_locations and
mounted_on_fileid attributes. Use this bitmap for encoding in fs_locations
requests.
Note: We can probably do better by requesting locations as part of fsinfo
itself.

Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:25 -04:00
Manoj Naik
361e624f6d NFSv4: GETATTR attributes on referral
Per referral draft, only fs_locations, fsid, and mounted_on_fileid can be
requested in a GETATTR on referrals.

Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:24 -04:00
Manoj Naik
7aaa0b3bd4 NFSv4: convert fs-locations-components to conform to RFC3530
Use component4-style formats for decoding list of servers and pathnames in
fs_locations.

Signed-off-by: Manoj Naik <manoj@almaden.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:23 -04:00
Trond Myklebust
683b57b435 NFSv4: Implement the fs_locations function call
NFSv4 allows for the fact that filesystems may be replicated across
several servers or that they may be migrated to a backup server in case of
failure of the primary server.
fs_locations is an NFSv4 operation for retrieving information about the
location of migrated and/or replicated filesystems.

Based on an initial implementation by Jiaying Zhang <jiayingz@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:22 -04:00
Trond Myklebust
55a975937d NFS: Ensure the client submounts, when it crosses a server mountpoint.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:19 -04:00
Trond Myklebust
38478b24e3 NFS: More page cache revalidation fixups
Whenever the directory changes, we want to make sure that we always
invalidate its page cache. Fix up update_changeattr() and
nfs_mark_for_revalidate() so that they do so.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:09 -04:00
Trond Myklebust
73a3d07c10 NFS: Clean up inode metadata updates
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-06-09 09:34:04 -04:00
Trond Myklebust
95cf959b24 VFS: Fix another open intent Oops
If the call to nfs_intent_set_file() fails to open a file in
nfs4_proc_create(), we should return an error.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-04-19 12:43:46 -04:00
J. Bruce Fields
096455a22a NFSv4: Dont list system.nfs4_acl for filesystems that don't support it.
Thanks to Frank Filz for pointing out that we list system.nfs4_acl extended
attribute even on filesystems where we don't actually support nfs4_acl.
This is inconsistent with the e.g. ext3 POSIX ACL behaviour, and seems to
annoy cp.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 23:23:42 -05:00
Trond Myklebust
7a1218a277 SUNRPC: Ensure rpc_call_async() always calls tk_ops->rpc_release()
Currently this will not happen if we exit before rpc_new_task() was called.
Also fix up rpc_run_task() to do the same (for consistency).

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 18:11:10 -05:00
Trond Myklebust
03f28e3a20 NFS: Make nfs_fhget() return appropriate error values
Currently it returns NULL, which usually gets interpreted as ENOMEM. In
fact it can mean a host of issues.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:48 -05:00
Trond Myklebust
51581f3bf9 NFSv4: SETCLIENTID_CONFIRM should handle NFS4ERR_DELAY/NFS4ERR_RESOURCE
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:47 -05:00
Trond Myklebust
3e4f6290ca NFSv4: Send the delegation stateid for SETATTR calls
In the case where we hold a delegation stateid, use that in for inside
SETATTR calls.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:46 -05:00
Trond Myklebust
ec06c096ed NFS: Cleanup of NFS read code
Same callback hierarchy inversion as for the NFS write calls. This patch is
not strictly speaking needed by the O_DIRECT code, but avoids confusing
differences between the asynchronous read and write code.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:27 -05:00
Trond Myklebust
788e7a89a0 NFS: Cleanup of NFS write code in preparation for asynchronous o_direct
This patch inverts the callback hierarchy for NFS write calls.

Instead of having the NFSv2/v3/v4-specific code set up the RPC callback
ops, we allow the original caller to do so. This allows for more
flexibility w.r.t. how to set up and tear down the nfs_write_data
structure while still allowing the NFSv3/v4 code to perform error
handling.

The greater flexibility is needed by the asynchronous O_DIRECT code, which
wants to be able to hold on to the original nfs_write_data structures after
the WRITE RPC call has completed in order to be able to replay them if the
COMMIT call determines that the server has rebooted.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:27 -05:00
Chuck Lever
006ea73e5f NFS: add hooks to account for NFSERR_JUKEBOX errors
Make an inode or an nfs_server struct available in the logic that handles
JUKEBOX/DELAY type errors so the NFS client can account for them.

This patch is split out from the main nfs iostat patch to highlight minor
architectural changes required to support this statistic.

Test plan:
None.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:14 -05:00
Trond Myklebust
a162a6b804 NFSv4: Kill braindead gcc warnings
nfs4_open_revalidate: 'res' may be used uninitialized
nfs4_callback_compound: ‘hdr_res.nops’ may be used uninitialized
			'op_nr’ may be used uninitialized
encode_getattr_res: ‘savep’ may be used uninitialized

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:10 -05:00
Jesper Juhl
c8d149f3db NFS: "const static" vs "static const" in nfs4
My previous "const static" vs "static const" cleanup missed a single case,
patch below takes care of it.

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:07 -05:00
Trond Myklebust
c12e87f465 [PATCH] NFSv4: fix mount segfault on errors returned that are < -1000
It turns out that nfs4_proc_get_root() may return raw NFSv4 errors instead of
mapping them to kernel errors.  Problem spotted by Neil Horman
<nhorman@tuxdriver.com>

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-14 07:57:18 -08:00
Trond Myklebust
fa178f29c0 NFSv4: Ensure DELEGRETURN returns attributes
Upon return of a write delegation, the server will almost always bump the
 change attribute. Ensure that we pick up that change so that we don't
 invalidate our data cache unnecessarily.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:51 -05:00
Trond Myklebust
286d7d6a0c NFSv4: Remove requirement for machine creds for the "setclientid" operation
Use a cred from the nfs4_client->cl_state_owners list.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:47 -05:00
Trond Myklebust
b4454fe1a7 NFSv4: Remove requirement for machine creds for the "renew" operation
In RFC3530, the RENEW operation is allowed to use either

 the same principal, RPC security flavour and (if RPCSEC_GSS), the same
  mechanism and service that was used for SETCLIENTID_CONFIRM

 OR

 Any principal, RPC security flavour and service combination that
 currently has an OPEN file on the server.

 Choose the latter since that doesn't require us to keep credentials for
 the same principal for the entire duration of the mount.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:47 -05:00
Trond Myklebust
58d9714a44 NFSv4: Send RENEW requests to the server only when we're holding state
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:46 -05:00
Trond Myklebust
433fbe4c88 NFSv4: State recovery cleanup
Use wait_on_bit() when waiting for state recovery to complete.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:45 -05:00
Trond Myklebust
26e976a884 NFSv4: OPEN/LOCK/LOCKU/CLOSE will automatically renew the NFSv4 lease
Cut down on the number of unnecessary RENEW requests on the wire.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:45 -05:00
Trond Myklebust
fe650407a8 NFSv4: Make DELEGRETURN an interruptible operation.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:44 -05:00
Trond Myklebust
a5d16a4d09 NFSv4: Convert LOCK rpc call into an asynchronous RPC call
In order to allow users to interrupt/cancel it.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:44 -05:00
Trond Myklebust
911d1aaf26 NFSv4: locking XDR cleanup
Get rid of some unnecessary intermediate structures

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:44 -05:00
Trond Myklebust
864472e9b8 NFSv4: Make open recovery track O_RDWR, O_RDONLY and O_WRONLY correctly
When recovering from a delegation recall or a network partition, we need
 to replay open(O_RDWR), open(O_RDONLY) and open(O_WRONLY) separately.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:43 -05:00
Trond Myklebust
e761692381 NFSv4: Make nfs4_state track O_RDWR, O_RDONLY and O_WRONLY separately
A closer reading of RFC3530 reveals that OPEN_DOWNGRADE must always
 specify a access modes that have been the argument of a previous OPEN
 operation.
 IOW: doing OPEN(O_RDWR) and then OPEN_DOWNGRADE(O_WRONLY) is forbidden
 unless the user called OPEN(O_WRONLY)

 In order to fix that, we really need to track the three possible open
 states separately.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:43 -05:00
Trond Myklebust
cdd4e68b5f NFSv4: Make open_confirm() asynchronous too
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:42 -05:00
Trond Myklebust
24ac23ab88 NFSv4: Convert open() into an asynchronous RPC call
OPEN is a stateful operation, so we must ensure that it always
 completes. In order to allow users to interrupt the operation,
 we need to make the RPC call asynchronous, and then wait on
 completion (or cancel).

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:42 -05:00
Trond Myklebust
e56e0b78eb NFSv4: Allocate OPEN call RPC arguments using kmalloc()
Cleanup in preparation for making OPEN calls interruptible by the user.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:41 -05:00
Trond Myklebust
06f814a3ad NFSv4: Make locku use the new RPC "wait on completion" interface.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:40 -05:00
Trond Myklebust
4ce70ada1f SUNRPC: Further cleanups
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:40 -05:00
Trond Myklebust
963d8fe533 RPC: Clean up RPC task structure
Shrink the RPC task structure. Instead of storing separate pointers
 for task->tk_exit and task->tk_release, put them in a structure.

 Also pass the user data pointer as a parameter instead of passing it via
 task->tk_calldata. This enables us to nest callbacks.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-01-06 14:58:39 -05:00
Trond Myklebust
3b6efee923 NFSv4: Fix an Oops in the synchronous write path
- Missing initialisation of attribute bitmask in _nfs4_proc_write()
 - On success, _nfs4_proc_write() must return number of bytes written.
 - Missing post_op_update_inode() in _nfs4_proc_write()
 - Missing initialisation of attribute bitmask in _nfs4_proc_commit()
 - Missing post_op_update_inode() in _nfs4_proc_commit()

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-12-03 15:20:21 -05:00
Trond Myklebust
ff6040667a NFSv4: Fix typo in lock caching
When caching locks due to holding a file delegation, we must always
 check against local locks before sending anything to the server.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-11-25 17:11:29 -05:00
Trond Myklebust
6bfc93ef98 NFSv4: Teach NFSv4 to cache locks when we hold a delegation
Now that we have a method of dealing with delegation recalls, actually
 enable the caching of posix and BSD locks.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-11-04 15:39:36 -05:00
Trond Myklebust
888e694c16 NFSv4: Recover locks too when returning a delegation
Delegations allow us to cache posix and BSD locks, however when the
 delegation is recalled, we need to "flush the cache" and send
 the cached LOCK requests to the server.

 This patch sets up the mechanism for doing so.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-11-04 15:38:11 -05:00
Trond Myklebust
2c56617d76 NFSv4: Fix the handling of the error NFS4ERR_OLD_STATEID
Ensure that we retry the failed operation...

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-11-04 15:33:50 -05:00
Trond Myklebust
d530838bfa NFSv4: Fix problem with OPEN_DOWNGRADE
RFC 3530 states that for OPEN_DOWNGRADE "The share_access and share_deny
 bits specified must be exactly equal to the union of the share_access and
 share_deny bits specified for some subset of the OPENs in effect for
 current openowner on the current file.

 Setattr is currently violating the NFSv4 rules for OPEN_DOWNGRADE in that
 it may cause a downgrade from OPEN4_SHARE_ACCESS_BOTH to
 OPEN4_SHARE_ACCESS_WRITE despite the fact that there exists no open file
 with O_WRONLY access mode.

 Fix the problem by replacing nfs4_find_state() with a modified version of
 nfs_find_open_context().

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-11-04 15:33:38 -05:00
Trond Myklebust
4cecb76ff8 NFSv4: Fix a race between open() and close()
We must not remove the nfs4_state structure from the inode open lists
 before we are in sequence lock.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-11-04 15:32:58 -05:00
Trond Myklebust
4f9838c7ec NFSv4: Add post-op attributes to NFSv4 write and commit callbacks.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:44 -04:00
Trond Myklebust
16e429596d NFSv4: Add post-op attributes to nfs4_proc_remove()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:44 -04:00
Trond Myklebust
6caf2c8276 NFSv4: Add post-op attributes to nfs4_proc_rename()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:43 -04:00
Trond Myklebust
91ba2eeec5 NFSv4: Add post-op attributes to nfs4_proc_link()
Optimise attribute revalidation when hardlinking. Add post-op attributes
 for the directory and the original inode.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:42 -04:00
Trond Myklebust
516a6af641 NFS: Add optional post-op getattr instruction to the NFSv4 file close.
"Optional" means that the close call will not fail if the getattr
 at the end of the compound fails.
 If it does succeed, try to refresh inode attributes.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:41 -04:00
Trond Myklebust
56ae19f38f NFSv4: Add directory post-op attributes to the CREATE operations.
Since the directory attributes change every time we CREATE a file,
 we might as well pick up the new directory attributes in the same
 compound.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:40 -04:00
Trond Myklebust
decf491f30 NFS: Don't let nfs_end_data_update() clobber attribute update information
Since we almost always call nfs_end_data_update() after we called
 nfs_refresh_inode(), we now end up marking the inode metadata
 as needing revalidation immediately after having updated it.

 This patch rearranges things so that we mark the inode as needing
 revalidation _before_ we call nfs_refresh_inode() on those operations
 that need it.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:39 -04:00
Trond Myklebust
0e574af1be NFS: Cleanup initialisation of struct nfs_fattr
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-27 22:12:38 -04:00
Trond Myklebust
ec07342828 NFSv4: Fix up locking for nfs4_state_owner
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-20 14:22:47 -07:00
J. Bruce Fields
1d95db8e16 NFSv4: Fix acl buffer size
resp_len is passed in as buffer size to decode routine; make sure it's
 set right in case where userspace provides less than a page's worth of
 buffer.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 23:19:41 -07:00
Trond Myklebust
550f57470c NFSv4: Ensure that we recover from the OPEN + OPEN_CONFIRM BAD_STATEID race
If the server is in the unconfirmed OPEN state for a given open owner
 and receives a second OPEN for the same open owner, it will cancel the
 state of the first request and set up an OPEN_CONFIRM for the second.

 This can cause a race that is discussed in rfc3530 on page 181.

 The following patch allows the client to recover by retrying the
 original open request.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 14:20:21 -07:00
Trond Myklebust
b8e5c4c297 NFSv4: If a delegated open fails, ensure that we return the delegation
Unless of course the open fails due to permission issues.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 14:20:20 -07:00
Trond Myklebust
642ac54923 NFSv4: Return delegations in case we're changing ACLs
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 14:20:19 -07:00
Trond Myklebust
6f926b5ba7 [NFS]: Check that the server returns a valid regular file to our OPEN request
Since it appears that some servers don't...

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 14:20:18 -07:00
Trond Myklebust
02a913a73b NFSv4: Eliminate nfsv4 open race...
Make NFSv4 return the fully initialized file pointer with the
 stateid that it created in the lookup w/intent.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 14:20:17 -07:00
Trond Myklebust
06735b3454 NFSv4: Fix up handling of open_to_lock sequence ids
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 14:20:15 -07:00
Trond Myklebust
faf5f49c2d NFSv4: Make NFS clean up byte range locks asynchronously
Currently we fail to do so if the process was signalled.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 14:20:15 -07:00
Trond Myklebust
0a8838f972 NFSv4: Add missing handling of OPEN_CONFIRM requests on CLAIM_DELEGATE_CUR.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 14:20:14 -07:00
Trond Myklebust
83c9d41e45 NFSv4: Remove nfs4_client->cl_sem from close() path
We no longer need to worry about collisions between close() and the state
 recovery code, since the new close will automatically recheck the
 file state once it is done waiting on its sequence slot.

 Ditto for the nfs4_proc_locku() procedure.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 14:20:13 -07:00
Trond Myklebust
e6dfa553cf NFSv4: Remove obsolete state_owner and lock_owner semaphores
OPEN, CLOSE, etc no longer need these semaphores to ensure ordering of
 requests.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 14:20:13 -07:00
Trond Myklebust
9512135df1 NFSv4: Fix a potential CLOSE race
Once the state_owner and lock_owner semaphores get removed, it will be
 possible for other OPEN requests to reopen the same file if they have
 lower sequence ids than our CLOSE call.
 This patch ensures that we recheck the file state once
 nfs_wait_on_sequence() has completed waiting.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 14:20:12 -07:00
Trond Myklebust
cee54fc944 NFSv4: Add functions to order RPC calls
NFSv4 file state-changing functions such as OPEN, CLOSE, LOCK,... are all
 labelled with "sequence identifiers" in order to prevent the server from
 reordering RPC requests, as this could cause its file state to
 become out of sync with the client.

 Currently the NFS client code enforces this ordering locally using
 semaphores to restrict access to structures until the RPC call is done.
 This, of course, only works with synchronous RPC calls, since the
 user process must first grab the semaphore.
 By dropping semaphores, and instead teaching the RPC engine to hold
 the RPC calls until they are ready to be sent, we can extend this
 process to work nicely with asynchronous RPC calls too.

 This patch adds a new list called "rpc_sequence" that defines the order
 of the RPC calls to be sent. We add one such list for each state_owner.
 When an RPC call is ready to be sent, it checks if it is top of the
 rpc_sequence list. If so, it proceeds. If not, it goes back to sleep,
 and loops until it hits top of the list.
 Once the RPC call has completed, it can then bump the sequence id counter,
 and remove itself from the rpc_sequence list, and then wake up the next
 sleeper.

 Note that the state_owner sequence ids and lock_owner sequence ids are
 all indexed to the same rpc_sequence list, so OPEN, LOCK,... requests
 are all ordered w.r.t. each other.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-10-18 14:20:12 -07:00
Nishanth Aravamudan
041e0e3b19 [PATCH] fs: fix-up schedule_timeout() usage
Use schedule_timeout_{,un}interruptible() instead of
set_current_state()/schedule_timeout() to reduce kernel size.  Also use helper
functions to convert between human time units and jiffies rather than constant
HZ division to avoid rounding errors.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-10 10:06:36 -07:00
Trond Myklebust
65e4308d25 [PATCH] NFS: Ensure we always update inode->i_mode when doing O_EXCL creates
When the client performs an exclusive create and opens the file for writing,
a Netapp filer will first create the file using the mode 01777. It does this
since an NFSv3/v4 exclusive create cannot immediately set the mode bits.
The 01777 mode then gets put into the inode->i_mode. After the file creation
is successful, we then do a setattr to change the mode to the correct value
(as per the NFS spec).

The problem is that nfs_refresh_inode() no longer updates inode->i_mode, so
the latter retains the 01777 mode. A bit later, the VFS notices this, and calls
remove_suid(). This of course now resets the file mode to inode->i_mode & 0777.
Hey presto, the file mode on the server is now magically changed to 0777. Duh...

Fixes http://bugzilla.linux-nfs.org/show_bug.cgi?id=32

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-16 09:30:58 -07:00
Trond Myklebust
eadf4598e7 [PATCH] NFS: Add debugging code to NFSv4 readdir
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:44 -04:00
Trond Myklebust
8d0a8a9d0e [PATCH] NFSv4: Clean up nfs4 lock state accounting
Ensure that lock owner structures are not released prematurely.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:42 -04:00
Trond Myklebust
08e9eac42e [PATCH] NFSv4: Fix up races in nfs4_proc_setattr()
If we do not hold a valid stateid that is open for writes, there is little
 point in doing an extra open of the file, as the RFC does not appear to
 mandate this...

 Make setattr use the correct stateid if we're holding mandatory byte
 range locks.

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:35 -04:00
Trond Myklebust
202b50dc12 [PATCH] NFSv4: Ensure that propagate NFSv4 state errors to the reclaim code
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:34 -04:00
Andrew Morton
3e9d41543b [PATCH] NFSv4: empty array fix
Older gcc's don't like this.

 fs/nfs/nfs4proc.c:2194: field `data' has incomplete type

 Signed-off-by: Andrew Morton <akpm@osdl.org>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:28 -04:00
Adrian Bunk
b7ef19560f [PATCH] NFSv4: fs/nfs/nfs4proc.c: small simplification
The Coverity checker noticed that such a simplification was possible.

 Signed-off-by: Adrian Bunk <bunk@stusta.de>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:27 -04:00
J. Bruce Fields
e50a1c2e1f [PATCH] NFSv4: client-side caching NFSv4 ACLs
Add nfs4_acl field to the nfs_inode, and use it to cache acls.  Only cache
 acls of size up to a page.  Also prepare for up to a page of acl data even
 when the user doesn't pass in a buffer, as when they want to get the acl
 length to decide what size buffer to allocate.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:15 -04:00
J. Bruce Fields
4b580ee3dc [PATCH] NFSv4: ACL support for the NFSv4 client: write
Client-side write support for NFSv4 ACLs.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:14 -04:00
J. Bruce Fields
aa1870af92 [PATCH] NFSv4: ACL support for the NFSv4 client: read
Client-side support for NFSv4 ACLs.  Exports the raw xdr code via the
 system.nfs4_acl extended attribute.  It is up to userspace to decode the acl
 (and to provide correctly xdr'd acls on setxattr), and to convert to/from
 POSIX ACLs if desired.

 This patch provides only the read support.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:13 -04:00
J. Bruce Fields
6b3b5496d7 [PATCH] NFSv4: Add {get,set,list}xattr methods for nfs4
Add {get,set,list}xattr methods for nfs4.  The new methods are no-ops, to be
 used by subsequent ACL patch.

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:10 -04:00
J. Bruce Fields
92cfc62cb8 [PATCH] NFS: Allow NFS versions to support different sets of inode operations.
ACL support will require supporting additional inode operations in v4
 (getxattr, setxattr, listxattr).  This patch allows different protocol versions
 to support different inode operations by adding a file_inode_ops to the
 nfs_rpc_ops (to match the existing dir_inode_ops).

 Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:09 -04:00
Trond Myklebust
4ce79717ce [PATCH] NFS: Header file cleanup...
- Move NFSv4 state definitions into a private header file.
 - Clean up gunk in nfs_fs.h

 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2005-06-22 16:07:06 -04:00
Linus Torvalds
1da177e4c3 Linux-2.6.12-rc2
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!
2005-04-16 15:20:36 -07:00