Locking
Advisory locking
Description
Processes have to agree on a semaphore.
When the semaphore is set, a process should not attempt to access the
shared resource.
Implementation
- Kernel semaphores
A process increments a variable in shared memory when it wants to lock the
resource. Other processes (or threads) must wait for the variable to return
to the 'unlocked' value before incrementing it themselves (obviously the
increment must be an atomic operation within the kernel).
Disadvantages
- Cannot be enforced.
- Don't usually have shared memory across machines.
- What happens when a process dies and doesn't release its lock ?
Advantages
- Fast.
- Kernel believes it has some control ;)
- Lockfile
When a process (or thread) wants to access the shared resource,
it creates the lockfile (whose name is agreed upon).
When it is finished, it unlinks the file.
Disadvantages
- Cannot be enforced.
- Processes on other machines must have some way to access the file,
e.g. NFS, which is as reliable as a bus timetable.
Advantages
- Works across a network, though unreliably.
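To make the lockfile scheme concrete, here is a minimal sketch in Python. It
assumes the lockfile is created with O_CREAT | O_EXCL, so that 'check whether
it exists' and 'create it' happen in one step. The notes above don't say how
the file is created, so that choice is an assumption, as are the function
names and the idea of writing the PID into the file. On older NFS versions
even O_EXCL is not dependable, which is part of why this is unreliable across
a network.

```python
import os
import tempfile


def acquire_lockfile(path):
    """Try to create the lockfile; return True if we now hold the lock."""
    try:
        # O_CREAT | O_EXCL fails if the file already exists, so testing for
        # the lock and taking it are a single step on a local filesystem.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        # Someone else holds the lock (or died while holding it).
        return False
    with os.fdopen(fd, "w") as f:
        # The PID is only a hint for whoever has to clean up a stale lock.
        f.write(str(os.getpid()))
    return True


def release_lockfile(path):
    """Give the lock back by unlinking the agreed-upon file."""
    os.unlink(path)


if __name__ == "__main__":
    # Hypothetical lockfile name in a scratch directory.
    lock = os.path.join(tempfile.mkdtemp(), "mailbox.lock")
    print(acquire_lockfile(lock))  # first attempt
    print(acquire_lockfile(lock))  # second attempt while the lock is held
    release_lockfile(lock)
```

Nothing stops a process that ignores the lockfile, and if the holder dies
without unlinking the file, every later acquire_lockfile() keeps failing until
someone removes it by hand.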
Mandatory locking
Description
Processes request locks from an arbitrator.
When locks are set, they are enforced by the arbitrator.
Locks may be shared or exclusive (put simply: read or write).
Implementation
- flock() (BSD)
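flock() provides whole-file locking, both shared and exclusive, arbitrated
by the kernel. (Strictly speaking, flock() locks are advisory on most
systems: the kernel enforces them only between processes that call flock().)
However, changing the lock type is not atomic: the existing lock is released
and then an attempt is made to acquire the new one, so another process can
take the lock in between.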
Disadvantages
- A no-op over NFS, yet it still returns success !
- Problem with up/downgrading mentioned above.
Advantages
- Apart from over NFS, and when programmers correctly understand the
behaviour when changing lock type, flock() is reliable and secure.
- The kernel will remove locks when a process exits. Note: what
happens with threads ? With Linux, a thread is really a process, so
should still work, but what about on e.g. Solaris ?
- fcntl() (SYSV/POSIX)
fcntl() allows locking not only of a whole file, but of a portion of a file.
You may ask for the lock status of a file, and lock or unlock part or all of
it.
Like flock(), locks may be shared or exclusive.
Disadvantages
- You may not acquire an exclusive lock unless the file is open for
writing. So if you have the file open read-only, you must re-open it
with write access.
- If you have the file open at another point in your process and close
that descriptor, you lose the lock: closing any descriptor for the file
releases all of the process's locks on it. So your app will silently
fail. Lovely.
- Over NFS, it's necessary to have server processes running that
serve up locks. If they're not running, fcntl() will block.
- Same might apply for local systems if there's no kernel support for
fcntl() (It's handled by daemons only).
- If the server daemons die... guess.
Advantages
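The 'lose the lock' drawback of fcntl() is easy to miss, so here is a small
Python sketch that tries to show it. The function names are invented for
illustration; fcntl.lockf() is Python's wrapper around fcntl() record locks.
The sketch locks the first 10 bytes of a scratch file, then opens and closes
the same file a second time, and uses a forked child to check whether the
lock is still held. On a POSIX system I'd expect the lock to be held before
the extra close and gone after it.

```python
import fcntl
import os
import tempfile


def child_can_lock(path):
    """Fork a child that tries a non-blocking exclusive lock on path.

    Returns True if the child got the lock, i.e. nobody else held it.
    """
    pid = os.fork()
    if pid == 0:
        # Child: fcntl locks belong to a process, so this is a real rival.
        code = 2
        try:
            fd = os.open(path, os.O_RDWR)
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            code = 0  # lock acquired
        except OSError:
            code = 1  # lock is held by another process
        finally:
            os._exit(code)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 0


def close_drops_lock_demo():
    """Return (held_before, held_after) around an unrelated open/close."""
    fd, path = tempfile.mkstemp()
    os.write(fd, b"x" * 100)
    # Exclusive lock on bytes 0..9 only (len=10, start=0).
    fcntl.lockf(fd, fcntl.LOCK_EX, 10, 0)
    held_before = not child_can_lock(path)

    # Some other part of the program opens and closes the same file.
    other = os.open(path, os.O_RDONLY)
    os.close(other)
    held_after = not child_can_lock(path)

    os.close(fd)
    os.unlink(path)
    return held_before, held_after


if __name__ == "__main__":
    print(close_drops_lock_demo())
```

The lock was never explicitly released, and no call returned an error; it
simply stopped existing when the second descriptor was closed.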
Locking in Empath
As the reader can see from the above list, there is no locking mechanism that
can be considered reliable. If you're screaming 'but flock() is reliable'
you're just plain wrong. Reliable on the local host but not over NFS is not
reliable. Sometimes NFS is your only option, or removing NFS from a system
would be extreme overkill just to get reliable mail.
I'm not going to introduce 'it'll probably work' into Empath. If
Empath loses or corrupts mail, I'll be held responsible. People do read their
mail over NFS. I've worked at places where that's how the system was set up
when I arrived. The only way to fix it was to move everyone to local mailboxes,
run all the mail clients on one system, or switch everyone to a
Maildir-compliant MUA.
Unfortunately, due to certain management difficulties, changing the mail
clients was impossible, so mailboxes had to be moved onto the host that the
users' MUA was running on (changing the delivery routines and managing the new
storage needed on each host). This was obviously impractical, as anyone who's
managed a huge homogeneous system will know.
The answer was to POP mail. Not ideal, I know, but at that point there were no
graphical mailers (no decent ones) that supported Maildir. POP is easy - the
user's mailbox is written wherever they want. I would have put everyone on
mutt, but corporate policy dictated that everyone would have a graphical
desktop. Pig ignorance and indifference to the technical difficulties involved
is difficult to argue with.
Ok, so POP wasn't the best idea. It was easy, and worked, and wasn't going to
be around for long (hahaha, where have I heard that before ?). IMAP would have
been a better solution, but again there weren't many IMAP-aware mailers about.
So, you say, why don't you use flock() for local mailboxes ?
Firstly, how do I know it's a local mailbox and not on NFS ?
Whatever, it's not worth it if I can't be sure.
Secondly, flock() might not exist on the target system, or might be
implemented by emulation on top of fcntl().
There is an answer.
Don't use locks.
Maildir doesn't require locks, and therefore works over NFS too. The
implementation must check writes for success - that's all. If you can't write
to a mail file, it's not a drastic problem. The message is just considered to
be undelivered, rather than lost.
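A rough sketch of lock-free delivery in the Maildir style, in Python. The
message is written under tmp/ using a name that should be unique, every write
is checked, and only then is it linked into new/. The unique-name format, the
deliver() name and the use of link() rather than rename() are assumptions for
illustration, not Empath's actual code. Any failure leaves the message
undelivered rather than half-written in the mailbox.

```python
import itertools
import os
import socket
import tempfile
import time

_counter = itertools.count()  # distinguishes deliveries within one process


def deliver(maildir, message):
    """Deliver message (bytes) into maildir.

    Returns the final path under new/, or None if delivery failed.
    """
    # Assumed naming scheme: time, pid, per-process counter, hostname.
    unique = "%d.%d_%d.%s" % (
        int(time.time()), os.getpid(), next(_counter), socket.gethostname())
    tmp_path = os.path.join(maildir, "tmp", unique)
    new_path = os.path.join(maildir, "new", unique)
    try:
        fd = os.open(tmp_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
        try:
            view = memoryview(message)
            while view:
                # os.write() may write less than asked; keep going.
                written = os.write(fd, view)
                view = view[written:]
            os.fsync(fd)  # make sure the data really reached the disk
        finally:
            os.close(fd)
        # The message appears in new/ only once it is complete.
        os.link(tmp_path, new_path)
    except OSError:
        # Undelivered, not lost: the caller still has the message.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        return None
    try:
        os.unlink(tmp_path)
    except OSError:
        pass  # a stray tmp/ file is harmless; the message is in new/
    return new_path


if __name__ == "__main__":
    md = tempfile.mkdtemp()  # scratch Maildir for the demo
    for sub in ("tmp", "new", "cur"):
        os.mkdir(os.path.join(md, sub))
    print(deliver(md, b"Subject: test\n\nhello\n") is not None)
```

A reader never sees a partial message, because nothing shows up in new/ until
the file is fully written, and no two deliveries share a file name, so there
is nothing to lock.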
So how do I handle people's mailboxes (mbox, MMDF et al) ?
Just treat them as read-only. I can't damage a mailbox if I don't write to
it.
Sorted.