Re: [2/3] POHMELFS: Documentation.
- From: Sage Weil <sage@xxxxxxxxxxxx>
- Date: Sun, 15 Jun 2008 09:41:44 -0700 (PDT)
On Sun, 15 Jun 2008, Evgeniy Polyakov wrote:
Yes, not only writepage, but any request - if it sends a request and then
receives a reply (i.e. does a send/recv sequence without the ability to do
something else in between, or to allow other users to send or receive
on the same socket), then it is synchronous. If it only sends, and
someone else receives, it is possible to send multiple requests from
different users who do reads or writes or lookups or whatever, and to
receive the replies asynchronously in a different thread, in no particular
order; this approach I call asynchronous.
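
The send-only/receive-elsewhere split described above can be sketched roughly as follows. This is a toy model, not POHMELFS code: `AsyncChannel`, `send`, `wait` and the in-memory queue standing in for the socket are all hypothetical names chosen for illustration.

```python
import queue
import threading


class AsyncChannel:
    """Toy stand-in for one socket shared by many callers (hypothetical)."""

    def __init__(self, server):
        self._server = server          # callable: request payload -> reply payload
        self._wire = queue.Queue()     # stands in for the socket
        self._pending = {}             # request id -> slot awaiting a reply
        self._lock = threading.Lock()  # protects _pending and _next_id only
        self._next_id = 0
        threading.Thread(target=self._receiver, daemon=True).start()

    def send(self, payload):
        """Send only; the caller does not hold anything while waiting."""
        with self._lock:
            req_id = self._next_id
            self._next_id += 1
            slot = {"done": threading.Event(), "reply": None}
            self._pending[req_id] = slot
        self._wire.put((req_id, payload))
        return slot

    def _receiver(self):
        """Single receiving thread: match each reply to its request by id."""
        while True:
            req_id, payload = self._wire.get()
            reply = self._server(payload)
            with self._lock:
                slot = self._pending.pop(req_id)
            slot["reply"] = reply
            slot["done"].set()


def wait(slot, timeout=1.0):
    """Block until the receiver thread has filled in this slot."""
    if not slot["done"].wait(timeout):
        raise TimeoutError("no reply")
    return slot["reply"]


chan = AsyncChannel(lambda p: p.upper())
a = chan.send("read")
b = chan.send("lookup")            # second request goes out before any reply is read
replies = [wait(b), wait(a)]       # callers may collect replies in any order
```

The point of the sketch is only that no caller holds a lock across both the send and the receive; the lock covers the bookkeeping table, not the socket round trip.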
Oh, so you just mean that the caller doesn't, say, hold a mutex for the
socket for the duration of the send _and_ recv? I'm kind of shocked that
anyone does that, although I suppose in some cases the protocol
effectively demands it.
Yes, POHMELFS does writing that way.
Nice. I will definitely be taking a look at that.
Not exactly. A transaction in a nutshell is a wrapper on top of a command
(or multiple commands if needed, as in writing), which contains all the
information needed to perform the appropriate action. When the user calls
read() or 'ls' or write() or whatever, POHMELFS creates a transaction for
that operation and tries to perform it (unless the operation is served from
cache, in which case nothing actually happens). When a transaction is
submitted, it becomes part of the failover state machine, which will check
whether data has to be read from a different server, written to a new one,
or dropped. The original caller may not even know from which server its
data will be received. If sending a request fails in the middle, the whole
transaction will be redirected to a new server. It is also possible to redo
a transaction against a different server if the server sent us an error
(like "I'm busy"), but this functionality was dropped in the previous
release iirc; it can be resurrected though. With a generic transaction
tree, callers do not have to bother about how to store their requests, how
to wait for results and how to complete them - transactions do it for them.
It is not rocket science, but an extremely effective and simple way to help
rule out asynchronous machinery.
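
A minimal sketch of the transaction idea described above, under stated assumptions: `Transaction`, `TransactionTable`, `ServerDown` and the server callables are hypothetical names, a plain dict stands in for the transaction tree, and the failover policy (try servers in order) is invented for illustration rather than taken from POHMELFS.

```python
class ServerDown(Exception):
    """Raised by a (toy) server when sending fails part-way through."""


class Transaction:
    """Wraps one or more commands plus everything needed to redo them."""

    def __init__(self, tid, commands):
        self.tid = tid
        self.commands = list(commands)
        self.result = None


class TransactionTable:
    """Generic store of pending transactions (a dict instead of a tree)."""

    def __init__(self, servers):
        self.servers = list(servers)   # callables: list of commands -> result
        self.pending = {}              # tid -> Transaction
        self._next_tid = 0

    def submit(self, commands):
        t = Transaction(self._next_tid, commands)
        self._next_tid += 1
        self.pending[t.tid] = t        # tracked until some server completes it
        return t

    def run(self, t):
        # Assumed policy: on failure, resend the WHOLE transaction to the
        # next server; the caller never learns which server answered.
        for server in self.servers:
            try:
                t.result = server(t.commands)
            except ServerDown:
                continue
            del self.pending[t.tid]
            return t.result
        raise ServerDown("no server accepted transaction %d" % t.tid)


def dead(commands):
    raise ServerDown("connection reset")


def alive(commands):
    return ["ok:" + c for c in commands]


table = TransactionTable([dead, alive])
t = table.submit(["write page 0", "write page 1"])
result = table.run(t)
```

Because the transaction carries its own commands, the redirect needs no help from the original caller, which is the property the paragraph above is describing.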
Got it. Tracking pending requests in some generic way is definitely key
to making failure handling sane with multiple servers.
That was a somewhat old approach; currently inode numbers and things like
open-by-inode or NFS-style open-by-cookie are not used. I tried to
describe the caching bits in the documentation I sent, although it's a bit
rough and likely incomplete :) Feel free to ask if there are some white
areas there.
So what happens if the user creates a new file, and then does a stat() to
expose i_ino? Does that value change later? It's not just
open-by-inode/cookie that makes ino important.
It looks like the client/server protocol is primarily path-based. What
happens if you do something like

    hosta$ cd foo
    hosta$ touch foo.txt
    hostb$ mv foo bar
    hosta$ rm foo.txt

Will hosta realize it really needs to do "unlink /bar/foo.txt"?
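
To make the concern concrete, here is a toy model of the sequence above. It says nothing about what POHMELFS actually does; `Namespace` and its methods are hypothetical, and the model simply contrasts a client that remembers its working directory as a path string with one that remembers a stable directory id.

```python
class Namespace:
    """Toy server namespace: directory id -> {name: child dir id, or None for a file}."""

    def __init__(self):
        self.dirs = {0: {}}            # 0 is the root directory
        self._next = 1

    def mkdir(self, parent, name):
        did = self._next
        self._next += 1
        self.dirs[did] = {}
        self.dirs[parent][name] = did
        return did

    def resolve(self, path):
        """Walk a path from the root; KeyError if a component is missing."""
        cur = 0
        for part in path.strip("/").split("/"):
            cur = self.dirs[cur][part]
        return cur

    def create(self, dir_id, name):
        self.dirs[dir_id][name] = None

    def unlink(self, dir_id, name):
        del self.dirs[dir_id][name]

    def rename(self, parent, old, new):
        self.dirs[parent][new] = self.dirs[parent].pop(old)


ns = Namespace()
foo = ns.mkdir(0, "foo")
cwd_path, cwd_id = "/foo", foo         # hosta$ cd foo
ns.create(cwd_id, "foo.txt")           # hosta$ touch foo.txt
ns.rename(0, "foo", "bar")             # hostb$ mv foo bar

try:
    # Path-based client still sends "unlink /foo/foo.txt".
    ns.unlink(ns.resolve(cwd_path), "foo.txt")
    path_based_ok = True
except KeyError:
    path_based_ok = False

# A client holding a stable directory id is unaffected by the rename.
ns.unlink(cwd_id, "foo.txt")
```

In this model the path-based unlink fails after the rename unless the client is somehow told that /foo is now /bar, which is exactly what the question asks.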
sage