| Task: | File-system maps |
| Group: | bill,gdmr,cms |
| Stage: | 1 |
Description
A system for naming and mounting remote filesystems and distributing
remote filesystem information will be required.
Issues
The existing DCS system uses the am-utils automounter, with filesystem
information being distributed by hesiod maps. am-utils has support
for LDAP.
Dependencies
Filesystem maps DICE task
Changes
- (for 02/4/02) Added section on decisions taken at meeting on
19/3/02.
- (for 02/4/02) Changed unresolved issues section.
- (For 19/3/02) Added section on unresolved issues at the top of document.
- (For 19/3/02) Minor changes to order of document to bring it into line with
Jeremy's guidelines.
Unresolved Issues
This task is being held up at the moment by the following :-
- We need to get the AMD LDAP interface working.
- Simon and Tim have done quite a lot of work on debugging this, but
haven't yet solved all the problems.
- George will look at this when he has time and a working 7.1
machine on his desk. His loaned Beowulf machine has been repossessed
and the replacement hasn't yet appeared.
- The Autofs/LDAP stuff seems to work out of the box.
- We need a DICE machine to be able to run tests on. This probably
needs to be a machine that we can break without affecting anyone else's
work.
- We need to decide on names for the filesystems to support rpms.
Decisions Made at 19/3/02 DICE meeting
- /yesterday will mirror at the partition level i.e. for each home
directory partition, there will be a mirror partition for yesterday.
- /yesterday home directory backups will appear as
/yesterday/home/user
- Personal web pages will appear as /public/web/user. This will
not require filesystem map support as /public/web will be one virtual
partition on the web server.
- Temporary space that is not backed up will appear as
/volatile. Directories will be added for different temporary areas
under /volatile as needed.
- nsu will exist in DICE stage 1. Group and netgroup information
needed by DICE will have to be decided/stored.
The Current Situation
DCS/LCFG Machines
Essentially file system maps are implemented using the am-utils
automounter (AMD) and Hesiod. There are methods for generating the maps
based on scripts and data files.
Main types of filesystem maps supported currently are :-
- Home directories
- Removable media (CDs, floppies etc.)
- User web space (/public)
- Local home directories on laptops
- /yesterday (Backup of the previous day's files)
- srpms (Source rpms live here)
- globalvar (needed for Amanda Backups)
Other filesystem types exist, but are largely obsolete. These are discussed further later in this document.
Buccleuch Place/South Bridge/Forrest Hill
All sites use Sun's automount. Buccleuch Place uses NIS+. Forrest Hill and South Bridge use NIS. There may be other types of filesystems that need to be supported for users on these sites. This is being investigated by the Filesystem Issues task, led by Bill Hewitt, but will be discussed briefly later in this document for completeness.
The DICE model
The key features of the DICE model that affect this task are :-
- Hesiod will not be supported, as it is now considered non-standard
- NIS and NIS+ will not be supported
- Remote filesystems will be NFS in stage 1
- There is a stage 2 task to adopt a more secure model
- Directory information services will be provided by LDAP
The DICE model for home filesystems
- Users will choose one of their existing home directories as their
DICE home directory.
- Home directories will continue to be served by legacy servers at
the moment. DICE servers will come later.
- All DICE home directories can be mounted on all DICE machines.
- Users will need to be educated.
- Home directories will be seen by all informatics users.
- Permissions may need to be tightened up.
- Users will have to be aware that data can be compromised by
internal hackers.
- ?Sensitive data may need to be stored on local disk?
- DICE home directories can be mounted on all systems, both
DICE and legacy.
- Legacy systems may choose not to mount home directories and
create usernames for all DICE users.
- All usernames in DICE will be unified University usernames.
- On legacy systems, some users MAY continue to use their legacy
username. It has yet to be decided if home directories for these users
will have legacy or unified names.
The DICE model for Legacy Filesystems
Once we move to unified home directories (and this can be done before the start of the DICE roll out), we will need to provide a mechanism for accessing data from secondary home directories.
The proposal is that these will live under /legacy/domain/home (i.e. /legacy/{cogsci,dai,dcs}/home). The /legacy directory will only exist for the transition period. As this data will be fairly static, the source can be kept in (a set of) flat file(s). The name of a user's home directory under /legacy will be the user's legacy name.
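As an illustration only, the flat-file source for /legacy could be turned into mount points along these lines. This is a minimal sketch: the file format (one "domain legacy-username" pair per line, with # comments), the sample usernames and the function names are assumptions, not part of the proposal, and the layout is taken to be /legacy/domain/home/name.

```python
from pathlib import PurePosixPath

# Hypothetical flat-file format: "domain legacy_username"; '#' starts a comment.
SAMPLE = """\
# domain  legacy username
dcs       timc
dai       tim
"""

LEGACY_DOMAINS = {"cogsci", "dai", "dcs"}


def legacy_home(domain: str, legacy_name: str) -> str:
    """Return the assumed /legacy path for a secondary home directory."""
    if domain not in LEGACY_DOMAINS:
        raise ValueError(f"unknown legacy domain: {domain}")
    return str(PurePosixPath("/legacy") / domain / "home" / legacy_name)


def parse_legacy_file(text: str) -> list:
    """Read the flat file and return one /legacy path per entry."""
    paths = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        domain, name = line.split()
        paths.append(legacy_home(domain, name))
    return paths
```

Keeping the path rule in one function means a later change to the /legacy layout would only need to be made in one place.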
Goals of this task
- Decide which automounter(s) to use
- Define the set of filesystem maps and mount points needed to support DICE
- Existing set of maps provided for LCFG machines is full of cruft
- Some filesystem types will no longer need to be supported
- Mount points for filesystems can be simplified. For example home directories will no longer need to be under /homes/remote/site. What structure do we need?
- There may be other filesystem types to be supported that come out of the filesystem issues task, that are required to support features of legacy systems not currently supported on LCFG machines.
- Support will be needed for removable media. Will there be other types in addition to cdrom and floppy?
- Support will be needed for local home files for laptops, exam machines etc. Laptops need to mount filesystems when connected to the network, but be able to work with local home directories when disconnected.
- Define the data to be stored by LDAP
- Determine how automounter uses data from LDAP
- Map generation
- Generate maps for DICE from legacy data
- Does account creation put the correct information in LDAP?
- How do maps get updated?
- Testing
- Do we provide the necessary maps and mount points?
- Are maps correct and complete?
- Are maps updated correctly?
- Are there performance issues?
- Provide documentation
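The "Are maps correct and complete?" question in the testing goal above could be checked mechanically. The following is a rough sketch under assumed inputs (a set of usernames, a user-to-partition mapping and a set of known partition names); none of these data structures are defined by this task yet.

```python
def check_maps(users, people, partitions):
    """Return a list of problems; an empty list means the maps look complete.

    users      -- set of usernames that should have a home directory
    people     -- dict mapping username -> partition name (assumed shape)
    partitions -- set of known partition names
    """
    problems = []
    # Completeness: every expected user must have a home directory entry.
    for user in sorted(users):
        if user not in people:
            problems.append(f"{user}: no home directory entry")
    # Correctness: every entry must point at a partition that exists.
    for user, partition in sorted(people.items()):
        if partition not in partitions:
            problems.append(f"{user}: unknown partition {partition}")
    return problems
```

A check like this only covers the data; whether the mounts actually work and perform well still needs testing on a real client.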
Automounter Issues
There was a choice between AMD and Autofs. Both have support for LDAP,
to varying degrees.
A decision was made to go with AMD because
- It probably has the best support for LDAP
- There is already experience of using AMD with LCFG systems
- Some work has already been done on using AMD with LDAP
- There is a DICE stage 2 task to look at replacements for
NFS. A new mechanism for (auto)mounting files will almost certainly be
needed at this point.
Autofs is currently used by LCFG machines to support removable media (cdrom,floppy). There is no reason why this shouldn't continue to be the case under DICE.
Although a decision was made not to investigate both AMD and autofs, if major problems are found with AMD, we might try to fall back to autofs. It is hoped that this will not be necessary.
Filesystem Issues
Unresolved Filesystem Issues
There are a number of unresolved issues that are preventing finalizing the list of filesystems needed by DICE. We would like to know how the following will be supported in DICE
- Personal Web space
- There is a task looking into provision of personal web space. However we won't know what filesystem support is needed until their recommendations are known.
- RPM Distribution
- Is this going to be done via NFS, or is there another mechanism? If it is via NFS, which directories are needed? Do we need directories for source rpms and build rpms?
- Removable Media
- These are currently supported via autofs and mounted under /mnt/floppy and /mnt/cdrom. Do we expect this to continue, or should we consider other methods for dealing with removable media? Are /mnt/floppy and /mnt/cdrom good locations for these to be mounted under DICE?
The other issue is whether we are going to allow secondary groups in DICE for data sharing. AIAI are likely to want to continue to use the shared writable area /project and would need group support for this to work. It would also be easier to manage the transition if other shared read/write areas were allowed to exist (under /legacy?) after the introduction of DICE.
Similarly, although we will be recommending downloading corpus data onto local disks via rpms for working on, are we going to allow other corpus data to be browsable on all machines?
It is likely that there will be a need for shared areas to exist, at least for the transition period.
Which filesystems were considered
We need to come up with a definitive set of filesystems that will be
needed in DICE. Once the set has been approved, changes should only be made according to pre-defined procedures.
We considered the following filesystems
Filesystems included in LCFG setup that may not be needed by DICE
- Under /export/local (/export/remote/...):
- mac, pc, micros, hppa, hp9000, mips, sun3, sun4, vax, 386bsd, share,
sun4-51, irix5
- These are the master copies of the shared and architecture-specific
binaries and things. Only share, sun4-51 and linux are currently in use.
The first two are for our legacy Suns
- linux
- Currently the way the Linux rpms are distributed. We don't know whether
the new mechanisms require it still to be around for DICE.
- /export/local/admin
- /obj/local
- For package-building we used /obj/local (/obj/remote/.../local) on our
legacy Suns, but that's now pretty well fallen into disuse and doesn't
exist in the DICE world.
- /public
- This is the way that people get at their exported web pages.
- How web data is dealt with will be considered elsewhere
- /useful
- gave access to external NFS sites (sunsite,
sunsite-extra, ctan). Now obsolete.
- penguin
- used to be a local Linux mirror. Now obsolete?
- srpms
- Where the source RPMs live. Could be a link to /nfs/p?
- oldsrpms
- oldnews
- risky
- A network-wide /tmp. Still needed ?
- vtkdata
- What's this for? Obsolete?
- edaliases
- A copy of the edaliases database. Probably a better way
to do this?
- globalvar
- A network-wide /var, still in use for Amanda backups. Needed under
DICE?
- /usr/remote/..
- linked through /usr/local: share, sun4-51, dump, tmp, bin, vlsi,
risky, spool, nis. Obsolete?
Filesystems in use at legacy sites that may need to be supported
under DICE
This is under consideration by the filesystem issues task. It is hoped
that these may be able to be supported using cvs, the web, and by
using rpms for software distribution. There may be some things that
will need maps. Candidates for investigation might be
- /corpora
- Cogsci/HCRC have a large area (~75Gb) for text corpora and
corpus tools. This is essentially read only (changes
infrequently). For actually using a corpus, probably the most
efficient thing to do is to allow users to download an RPM file to
their local disk. Users may also need to browse corpus data though.
- /usr/local
- Most of this is either part of the Red Hat distribution (GNU tools)
or could be provided as RPMs.
- Project
- DAI have an area used by AIAI for commercially sensitive
information. It's important to restrict access to this.
- Projects
- Cogsci/HCRC have an area for large projects. There are no quotas on
this area but it is backed up.
- Contrib
- Cogsci/HCRC have an area for a "trusted" set of users to install
software. This may be replaced by user generated RPMs or user requests
for RPMs to be made?
- Local (site) Web Pages
- DAI and Cogsci have web information served locally. Most of this
should eventually move to Informatics web space? These pages could
be maintained by CVS?
- Print and Mail Spool
- It is assumed that these won't be needed in the DICE world. Mail
will be via POP/IMAP and printing will copy data to the central spool?
Filesystems needed for laptop support
Laptops need to be able to work both connected to a DICE network
and disconnected. They need to provide home directory space for their
owner on local disk, but be able to mount filesystems when connected
to the network.
Which Filesystems will be needed by DICE
Functionally there are three kinds of filesystem maps that we need to provide in DICE
- Providing a way for an individual NFS filesystem to be mounted on a client in a specified place.
- Taking several filesystems and making them appear as a unified whole. The main example of this is the home partition, where all a user needs to know is that their directory is /home/username. The underlying structure, involving physical servers and partitions, is hidden from the user.
- Provide a way of mapping from /home to a combination of system-local home directories and the global (unified) home directory to allow for system-local filesystems on laptops
A minimal set of maps required by DICE is given below. These might be
added to later.
- /home
- /yesterday
- A mirror of the home filesystem frozen each night for
backup/restore.
- Removable Media
- floppy disk
- cdrom
- We may need to provide support for other removable media (e.g. zip, compact flash)
- rpms
- Do we need filesystem(s) to support these ?
- A directory to mount individual nfs directories at a system level
- George has called this /p in his tests (or perhaps this should be /nfs/p or /amd/p??)
- Can be used by COs to look through whole partitions
- Need to decide how to name partitions mounted under /p. Names will probably be logical, but need to be descriptive and unique.
- A directory to provide system level support for laptop home directories
- George has called this /h in his tests (perhaps /nfs/h or /amd/h)
- On laptops the remote mounts would appear in /h, with /home being
generated so that links went to the local disc where necessary, and /h
otherwise
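The laptop arrangement described above, where /home is generated as a set of links pointing either at the local disc or at /h, could be sketched as follows. The local directory root (/localhome) and the function name are placeholders chosen for illustration; the actual mechanism has not been decided.

```python
def home_link_targets(users, local_users, local_root="/localhome", remote_root="/h"):
    """Map each /home/<user> link to a local directory or to <remote_root>/<user>.

    users       -- all usernames that need an entry under /home
    local_users -- usernames whose home directory lives on the laptop's disc
    local_root  -- hypothetical location of local home directories
    """
    targets = {}
    for user in users:
        root = local_root if user in local_users else remote_root
        targets[f"/home/{user}"] = f"{root}/{user}"
    return targets
```

For a laptop owned by a user called alice, `home_link_targets(["alice", "bob"], {"alice"})` would point /home/alice at the local disc and /home/bob at /h/bob, so only alice's directory remains usable when disconnected.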
Filesystem Naming
This is not a technical issue, but nonetheless we have to decide where filesystems live in DICE and how they are named. It's fairly clear that we want :-
Some thought needs to go on names and locations of mounts for other things. We need to come up with a policy for this. Particular thought ought to go into using /nfs or /amd for system mounts to allow more than one transport to be used at the same time later on. Also thought needs to go in to how to generate names for partitions that are logical, preferably guessable and unique.
LDAP and AMD Issues
This section discusses issues related to implementing filesystem maps and providing LDAP structures to support them. Firstly some fundamentals
- We don't want AMD to do any work to generate maps
- We want to run an unmodified version of AMD
- For DICE filesystems, both source information and map will be held in LDAP
- For legacy filesystems, the map can be held in LDAP, but the source information should be held in flat files and edited by rfe. Hopefully these files will be relatively static.
- We will use one of the supported schemas to implement storing of final maps in LDAP
We discussed 2 possible models for storing source information
- Store partition information under each user. This presumes that the
location of the home directory is a property of the user
- Store lists of users under each partition. This assumes that the
location of the home directory is a property of the partition
We decided that the first option was more applicable, because the second had difficulties when a user's unified username differs from the legacy username used in the home directory name.
We discussed two kinds of structures that would be needed for storing source information in LDAP
- Partition
- partition (name server partition seealso yesterday)
- e.g. partition(tron12 tron /disk/home/tron12 seealso yesterday)
- The seealso is a LDAP "link" to another partition "record" and could be used if we want to replicate a directory for redundancy
- The yesterday link is similar to the above and is used to point to a partition "record" for the backup mirror.
- There would be one of these "records" for each directory we want to mount
- People
- people (uid partition subdirectory)
- e.g. (12345 tron12 staff/timc)
- The subdirectory bit is needed because DAI and COGSCI have subdirectories (phd,staff,msc,...) in some of the home directory filesystems
- The partition matches the partition in the partition "record"
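To make the two structures concrete, here is a minimal sketch of them as Python records, with a lookup that follows a people "record" to its partition "record". The field names follow the descriptions above; the mirror partition name and path (tron12-yesterday, /disk/yesterday/tron12) are invented for the example and are not agreed names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Partition:
    name: str                        # e.g. "tron12"
    server: str                      # e.g. "tron"
    path: str                        # e.g. "/disk/home/tron12"
    seealso: Optional[str] = None    # name of a replica partition, if any
    yesterday: Optional[str] = None  # name of the backup mirror partition


@dataclass(frozen=True)
class Person:
    uid: int
    partition: str     # must match Partition.name
    subdirectory: str  # e.g. "staff/timc"


def home_location(person, partitions):
    """Return server:path for a person's home directory."""
    part = partitions[person.partition]
    return f"{part.server}:{part.path}/{person.subdirectory}"


def yesterday_location(person, partitions):
    """Return server:path of the /yesterday copy, or None if there is no mirror."""
    part = partitions[person.partition]
    if part.yesterday is None:
        return None
    mirror = partitions[part.yesterday]
    return f"{mirror.server}:{mirror.path}/{person.subdirectory}"


# Example data; the mirror partition is hypothetical.
PARTITIONS = {
    "tron12": Partition("tron12", "tron", "/disk/home/tron12",
                        yesterday="tron12-yesterday"),
    "tron12-yesterday": Partition("tron12-yesterday", "tron",
                                  "/disk/yesterday/tron12"),
}
TIMC = Person(12345, "tron12", "staff/timc")
```

Because the yesterday link is stored on the partition, mirroring at the partition level needs no extra per-user data.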
This model matches up quite well with George's partition types mentioned earlier in the document.
- partition maps onto /p
- people maps on to /h
George has done some work on a test implementation of /h and /p for an amalgamation of home directories from dcs and dai. He has scripts to generate dbm files from amd maps.
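The scripts themselves are not reproduced here. As a rough idea of what generated map entries for /p and /h might look like, the sketch below formats one line per key. It assumes amd's usual location syntax with the type, rhost, rfs and sublink options; these should be checked against the am-utils documentation, and the choice of key for /h entries (shown here as a username) is also an assumption.

```python
def partition_entry(name, server, path):
    """Format an assumed /p map line: mount a whole partition by name."""
    return f"{name}\ttype:=nfs;rhost:={server};rfs:={path}"


def people_entry(key, server, path, subdirectory):
    """Format an assumed /h map line: mount the partition, then point into it."""
    return (f"{key}\ttype:=nfs;rhost:={server};rfs:={path}"
            f";sublink:={subdirectory}")


def build_map(lines):
    """Join entries into the text of a map file."""
    return "\n".join(lines) + "\n"
```

For the example partition used in this document, `partition_entry("tron12", "tron", "/disk/home/tron12")` gives a single tab-separated line that could be written to a flat map file or loaded into a dbm file.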
If there are other maps like /h (glueing several nfs partitions together into one) we would need to define another structure like "people" to support it.
We considered /project as another similar case, but decided that it might be best implemented as /legacy/dai/project, with, if needed, a set of links distributed by rpm to link /project/blah to /legacy/dai/project/blah. As it is in /legacy, no source would be stored in LDAP, so no new structure would be needed.
We considered how the information from LDAP would be updated by COs. Options are :-
- LDAP data is dumped into a pseudo file, which is edited and then uploaded into LDAP
- LDAP source information is generated from a file edited by rfe
We preferred the second case because the master file can have comments and structure that would make it easier to edit.
A pseudo file dumped from LDAP would be in a random order and difficult to navigate. Having said that, the same is true of NIS+, used at Buccleuch Place...
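The preferred option, generating LDAP source information from a commented master file, might look something like the sketch below. The file format ("uid partition subdirectory" per line with # comments), the base DN and the attribute names homePartition and homeSubdirectory are all hypothetical; the real schema is still to be defined, and a real entry would also need its objectClass lines.

```python
def parse_people(text):
    """Parse the master file into (uid, partition, subdirectory) tuples."""
    records = []
    for lineno, raw in enumerate(text.splitlines(), 1):
        line = raw.split("#", 1)[0].strip()  # comments and blank lines are ignored
        if not line:
            continue
        fields = line.split()
        if len(fields) != 3:
            raise ValueError(
                f"line {lineno}: expected 'uid partition subdirectory'")
        records.append(tuple(fields))
    return records


def to_ldif(records, base_dn="ou=people,dc=example,dc=org"):
    """Render records as LDIF-style text using hypothetical attribute names."""
    blocks = []
    for uid, partition, subdirectory in records:
        blocks.append("\n".join([
            f"dn: uid={uid},{base_dn}",
            f"homePartition: {partition}",
            f"homeSubdirectory: {subdirectory}",
        ]))
    return "\n\n".join(blocks) + "\n"
```

Since the comments and ordering live only in the master file, the LDAP contents can always be regenerated from it, which is the main attraction of this option.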
Proposal and timescales
Proposed work needed
- Generate a test home directory map in the format we need it
- Will be based on existing home directories.
- Should include users from all sites/servers/partitions
- Unified DICE home directory data isn't available yet
- Put the test home directory data in LDAP and test on DICE system.
- Simon will help put map data in LDAP
- Prototype DICE systems are not yet available for testing
- Can do some limited testing on LCFG systems?
- Testing is of functionality only
- Investigate provision of home directories on laptops.
- Generate /home map with links to /h except for local home directories. Mechanism?
- Test in connected and disconnected modes
- Load testing.
- Generate home maps for realistic numbers of users
- Use legacy home directories if unified DICE home directories not available
- Get home directory data into LDAP
- Testing method
- Generate complete list of filesystem maps required.
- Some of this will come out of the filesystem issues task?
- Most of this is defined earlier in this document
- Is the structure of mount points correct? Complete? New Functionality?
- Generate maps to support other filesystems
- Testing of mounts for other filesystems
- Again load testing as well as functionality testing
- Support for removable media
- Can be supported by autofs as in LCFG
- Should these be mounted under /mnt/{floppy,cdrom}
- Are other types of removable devices needed? Zip? Others?
Timescales
- George Ross has already done some work getting existing home directory data into the form we need it in. He has done some testing using rdbm instead of ldap. Distributing rdbm files as rpms is a possible fallback if LDAP and AMD interworking isn't ready?
- Maps are fairly easy to generate once data is available. Real DICE data may not be available for quite a while.
- Test Home directories are ready to be put into LDAP
- Getting test data into LDAP and testing functionality could be done in 1-2 weeks if there are no major problems
- ?Load Testing?
- A lot of the work has been done on defining the set of maps needed. How does this get approved ? File system issues task will not be in a position to report back for at least 1 month, as the survey of usage hasn't started.
Dependencies
- Filesystem Issues Task
- Are there new filesystem types to support
- LDAP
- Need to ensure the right data is stored in LDAP
- Account creation
- Home directory data needs to get into LDAP in the form we want it.
- Prototype DICE systems need to be available.
- Netgroup information needs to be available to allow legacy directories to be shared to DICE machines.
Actions
Current
- From above meeting. Tim agreed to look at schema for people and
partition information. Also Tim will enable support for AMD map
structures in the LDAP server.
- Complete debugging of LDAP interface for AMD maps.
Completed
- Organise a meeting to discuss LDAP support for Filesystem Maps,
for interested parties, mainly from Filesystem maps and LDAP tasks.