Re: Linux, the final decision

From: The Ghost In The Machine (ewill_at_sirius.athghost7038suus.net)
Date: 02/02/04


Date: Mon, 02 Feb 2004 01:00:23 GMT

In comp.os.linux.advocacy, Ed Murphy
<emurphy42@socal.rr.com>
 wrote
on Sun, 01 Feb 2004 05:46:31 GMT
<pan.2004.02.01.05.49.34.303969@socal.rr.com>:
> On Sun, 01 Feb 2004 01:00:14 +0000, The Ghost In The Machine wrote:
>
>>>> It is possible to NFS mount a nonexistent machine. More precisely,
>>>> the machine exists at the time of the mount but then goes down;
>>>> the NFS mount then simply sits there, anticipating a restart
>>>> of the down machine. Access attempts to files on that mount result
>>>> in process hangs if one uses a hardmount; the idea at the time
>>>> of the design was that the machine would eventually come back.
>>>>
>>>> Or one can NFS mount a machine, then the mounted machine decides
>>>> to change its IP address by not renewing a DHCP lease or by being
>>>> reconfigured.
>
>> As you might have noticed, there are (at least) two general
>> design possibilities.
>>
>> [1] The system pretends the crash didn't happen and waits for
>> the remote box to restart, then continues where it left off.
>>
>> [2] The system notifies the app that something went wrong; the
>> app then notifies the user, confusing him to some extent,
>> and terminates.
>>
>> With NFS, one can get either option. :-) (I'd have to look to see
>> how Windows deals with an analogous situation. One advantage
>> of course is that Windows drivers have access to the GUI, which
>> might be of some assistance for notification if a remote node
>> decides to go down.)
>
> I don't think I've ever tried NFS on Windows.

Chameleon had a product at one point. I can't say I've used it
myself, though.

> In general, though,
> Windows' response to an unavailable network share is to keep trying
> for a while and then give up with a failure notification, i.e. your
> [2]. Why NFS (or any other networking system) would do [1] baffles
> me; if a specific app *wants* to wait forever, then it can simply
> try again when the failure notification is received, right?
>

Depends. A naively written app that works on a local disk should
ideally work the same way over the network -- but under [2] it
often doesn't.
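
For what it's worth, the "simply try again" approach would look
roughly like the sketch below. The name read_retry is made up for
illustration, and I'm assuming the failure shows up as read()
returning -1 with errno set to EIO or ETIMEDOUT, which is what one
would expect from a soft mount but is not something to rely on
without checking the local man pages.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: emulate "wait for the server" on top of a mount
 * that reports failures instead of hanging. Tries read() at most
 * max_tries times, sleeping delay_sec seconds between attempts.
 * ASSUMPTION: a dead server surfaces as EIO or ETIMEDOUT. */
ssize_t read_retry(int fd, void *buf, size_t len,
                   int max_tries, unsigned delay_sec)
{
    if (max_tries <= 0) {
        errno = EINVAL;
        return -1;
    }
    for (int attempt = 1; attempt <= max_tries; attempt++) {
        ssize_t n = read(fd, buf, len);
        if (n >= 0)
            return n;           /* data read, or 0 at end of file */
        if (errno != EIO && errno != ETIMEDOUT)
            return -1;          /* some other error; retrying won't help */
        if (attempt < max_tries)
            sleep(delay_sec);   /* give the server time to come back */
    }
    return -1;                  /* errno is still set by the last read() */
}
```

The catch is that every read(), write(), open() and so on in the
app needs the same treatment, which is exactly what the naive app
doesn't have. Under [1] it gets the waiting for free.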

It's an interesting general philosophical question, and Unix/C
design habits that encourage code such as

   printf("ha ha, I assume it worked");

instead of explicitly checking return codes (or perhaps
throwing exceptions!) don't help.

Then again, I'm as bad as the rest in that department... :-)
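
For the record, the checked version isn't much longer. A minimal
sketch, with say_checked being a name I just made up: fprintf()
returns a negative value on an output error, and fflush() returns
EOF if the buffered data can't be written out, which is often where
a failure on a remote file first becomes visible.

```c
#include <stdio.h>

/* Hypothetical helper: write one line and report whether it actually
 * left the stdio buffer. Returns 0 on success, -1 on failure. */
int say_checked(FILE *out, const char *msg)
{
    if (fprintf(out, "%s\n", msg) < 0)
        return -1;              /* formatting or write error */
    if (fflush(out) == EOF)
        return -1;              /* buffered data could not be written */
    return 0;
}
```

A caller would then do something like
if (say_checked(stdout, "done") != 0) and decide for itself
whether to retry, complain, or quit.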

-- 
#191, ewill3@earthlink.net
It's still legal to go .sigless.