|
| Task: | Faults/support |
| Group: | cc, johnb, bill, lmb |
| Stage: | 1 |
A division-wide user support and problem-reporting tracking system, covering both procedures and technology. Existing procedures at all four sites will need to be studied. New procedures should be flexible enough to suit our diverse circumstances, and should encourage rather than impede the flow of user support information and experience between sites. The task's name comes from the email addresses used for user support at the KB site.
There's a home-grown reminder system that mails the duty officer about messages he hasn't replied to, although it doesn't do anything to check that the problems have ultimately been solved.
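To make the gap concrete, here is a minimal sketch of the two checks involved: the reminder about unanswered messages, and the missing follow-up on messages that were answered but never marked as solved. The record layout, the function names and the time limits are all hypothetical; this is not the code of the actual reminder system.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


@dataclass
class SupportMessage:
    """One incoming support message (hypothetical record layout)."""
    msg_id: str
    received: datetime
    replied: Optional[datetime] = None  # when the duty officer first replied
    resolved: bool = False              # whether the problem was actually solved


def needs_reminder(msg: SupportMessage, now: datetime, max_wait: timedelta) -> bool:
    """Existing behaviour: nag only about messages nobody has replied to."""
    return msg.replied is None and now - msg.received > max_wait


def needs_followup(msg: SupportMessage, now: datetime, max_open: timedelta) -> bool:
    """The missing check: replied to, but still unresolved after max_open."""
    return (
        msg.replied is not None
        and not msg.resolved
        and now - msg.replied > max_open
    )


def pending(
    messages: List[SupportMessage],
    now: datetime,
    max_wait: timedelta,
    max_open: timedelta,
) -> Tuple[List[str], List[str]]:
    """Return (ids needing a first reply, ids needing a resolution check)."""
    unanswered = [m.msg_id for m in messages if needs_reminder(m, now, max_wait)]
    unresolved = [m.msg_id for m in messages if needs_followup(m, now, max_open)]
    return unanswered, unresolved
```

The second list is the one the present system never produces; whether it should be mailed to the duty officer or to someone else is a procedural question rather than a technical one.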
Mail to and from help@cogsci is archived, but nothing snazzy is done with these archives to allow users to use them to solve problems. The systems staff aim to distill useful nuggets out of problem solving to put into user documentation (cheat sheets).
There's no fancy fault tracking system installed other than the reminder system.
Users are encouraged to look at the Support web pages for computing support information. In particular there's a Support FAQ there which contains answers to many of the common questions which the team is asked. These pages are maintained by the Support staff.
Most users are assumed to be capable of doing a lot to solve their own computing problems, once they're pointed in the right direction, either by the Support team or by the user community in general on the local newsgroups.
There are two mail addresses which can be used to contact the user support team.
We ask people to report broken things to faults@dcs. This mail goes into a GNATS report-tracking system, in which it can be categorised and assigned to a particular person for investigation. First though the Support Office CSO on duty makes an initial attempt to diagnose and solve the problem, asking for more information if that seems to be necessary. If the CSO needs help, he or she can go to the day's "Duty CO", who will help the CSO to investigate problems, explaining where necessary so that the CSO will be able to handle future instances of similar problems without the need for help. The point of this system is to try to ensure that the CSOs can handle much of the user support themselves, freeing up the COs to spend most of their time doing something other than "fire fighting" or solving immediate day to day problems, hopefully development of services and other work.
That's the theory, anyway. In practice the Duty CO system hasn't always worked quite as well as was initially hoped. Partly this has been due to understaffing. The Duty CO isn't always there when needed, and when present is sometimes reluctant to investigate problems in unfamiliar areas (many of the KB COs are specialists in particular areas and don't know that much about other COs' specialisms, and can be reluctant to interfere for fear of inadvertently breaking something). When the Duty CO does know enough to be able to solve a problem, it's sometimes solved without an adequate explanation being given back to the CSO in the Support Office.
GNATS is also not quite as useful as originally hoped; it gets in the way and obstructs quite as much as it helps. Something better suited to our needs would be welcome (see below).
Any problem not solved by the Support CSO or the Duty CO is referred back to whoever is responsible for the software or hardware in question. This person is identified both by broad category (for example George Ross is in charge of networks, Neil Brown of printing, and so on), and by examining the software: every piece of software is installed using an RPM file, which carries the name of whoever most recently created it, so it's easy to identify the person most recently responsible for any particular bit of software. If he or she can't be contacted, urgent problems are put to other COs more immediately to hand, and non-urgent problems are referred to the responsible person for their later attention.
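As an illustration of the RPM lookup, the sketch below asks the RPM database which package owns a given file and who packaged it. We assume here that the name in question is the one held in the package's Packager tag; the helper names are our own and purely illustrative.

```python
import subprocess
from typing import List, Tuple

# rpm expands the \t and \n escapes itself, hence the raw string.
QUERY_FORMAT = r"%{NAME}\t%{PACKAGER}\n"


def packager_query(path: str) -> List[str]:
    """Build the rpm command that reports the package owning `path`."""
    return ["rpm", "-qf", "--queryformat", QUERY_FORMAT, path]


def parse_packager_output(output: str) -> List[Tuple[str, str]]:
    """Turn rpm's output into (package name, packager) pairs."""
    pairs = []
    for line in output.splitlines():
        if "\t" not in line:
            continue  # e.g. "file ... is not owned by any package"
        name, packager = line.split("\t", 1)
        pairs.append((name, packager.strip()))
    return pairs


def responsible_for(path: str) -> List[Tuple[str, str]]:
    """Run the query; needs the rpm command to be installed."""
    result = subprocess.run(
        packager_query(path), capture_output=True, text=True, check=False
    )
    return parse_packager_output(result.stdout)
```

A CSO could then call something like `responsible_for("/usr/bin/lpr")` on the affected machine to find out whom to refer a printing problem to.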
The other email address provided is support@dcs. This is used for all support queries apart from reports of broken things. Mail to support@dcs is read in a normal mail agent in the "support" user account. No special tracking software is used. Inbound and outbound mail in this account is filed and kept, and can be searched by Support staff, but not by users in general.
Some users are confused about which mail address to use, mailing support@dcs when they should have mailed faults@dcs, and vice versa.
Support queries can also be put to local users using local newsgroups - eduni.dcs.questions is for putting questions to the user community in general, and year-based student groups such as eduni.dcs.cs3 are used by students in a particular year to ask computer usage questions of each other.
Users will need clear explanations of the user-visible differences between DICE and pre-DICE environments. Documents describing these differences and how to cope with them would greatly help a lot of users to make the transition.
2002-02-19 It's not clear who will write these. However we note that the Documentation Framework task says that it may be in a good position to do so.
However we worry that some users won't be able to make the necessary adjustments by themselves. Some don't seem to understand enough about the systems - or perhaps lack a grasp of the terminology? - to be able to cope very well when asked to solve their problems using documentation. These people are going to need some personal help in adapting to DICE. Some of their questions may be answerable by email or over the phone. Other computer queries may be far quicker and easier to answer in person, where a screen can be pointed at, commands typed and icons clicked, to demonstrate problems and solutions.
Arguments for keeping the current separate services:
Arguments for a unified service (whether based at one site or split between several):
It's worth examining the arguments for separate services though. We think that any change to user support arrangements will be easier to introduce with the introduction of DICE than at some other time. We think that with staff and student mobility around the division being encouraged and, we assume, increasing, and with the turnover of people over time, most users will find it easier to deal with the support service in the same way no matter where they are located. DICE systems cannot simply be integrated into a number of separate support services, at least not without losing significant communication between sites and between user support teams and DICE maintainers. Contrary to belief at other sites, KB users are not all computer experts; many of the staff are mathematicians or logicians and are not particularly interested in playing with complicated software or powerful hardware, and many of the students also don't have the time or the inclination to develop computer expertise beyond the level that's absolutely required for satisfactory completion of their course work.
However we do think that a personal presence at each site would be welcome, if it can be managed. There are always some support queries which require or greatly benefit from face to face contact; for example, it's sometimes far easier to understand or explain support queries with a personal demonstration than with a long complicated description.
2002-02-19 A previous DICE meeting pointed out that any move to make User Support correspondence public would have to be accompanied by a prominent warning at the point of user submission of correspondence (the user support web input form). There was general unease about making the user support records public. Perhaps user-to-user help/support systems (newsgroups, a user-maintained FAQ) would be more appropriate.
Currently support teams are contacted by email, by phone or in person.
Whilst we accept the advantages of a web form, we have a number of questions and observations:
2002-02-19 We spoke to a CEG member and he confirmed that the more "casual" interpretation was the correct one: it's envisaged that people will still be able to have face to face contact with user support staff, and will still have the opportunity to put questions to them in person and by phone. If the problem can't be dealt with immediately, the support person may then request the user to fill in the web input form so that the Support team has a proper record of the matter as a reminder. The support person may even fill in the form himself/herself. In other words, the web input form will merely replace email as the way to officially register a problem/query.
Many (but not all) KB-based users are used to using newsgroups for local mass communication, and read them regularly. However local usenet news seems to have more or less died out at other sites.
Alert messages are a familiar form of communication at some sites, but at other sites they're unknown. Any alert message system would have to cope with the wide variety of login environments (well, X window managers) likely to be in use on DICE machines - past alert message systems may have relied on all users using the same login environment, in which an alert message pops up on startup. Alert messages would also not be seen by people who stayed logged in for long periods of time, unless they made a special effort to check them from time to time.
Use of email for general announcements may be acceptable at some sites but certainly not at all of them - it's greatly resented by some KB users, for instance, who class it as junk mail and actively complain about it.
In our experience web-based bulletin board systems can suffer badly from rot, as their contents grow older and older and are never expired.
We wonder if we can use local newsgroups for announcements, but provide a web interface to them for the people for whom use of a conventional newsreader might be a daunting or tediously involved process?
2002-02-19 This was raised at a previous DICE meeting. It was noted there that Divisional policy is for members to be able to receive announcements in whichever one of several formats they prefer. Will this be implemented? How will we handle announcements concerning DICE machines before we have such a system? Mail all systems announcements widely to mailing lists? Set up a systems announcements web page?
2002-02-19 Do we have an answer to this question?
The pre-DICE computer systems will still have significant numbers of users for two years after the introduction of DICE. User support services for them will still be needed. How should people get at these services? The same ways that have been used before? Or should we offer just one route into user support for all systems, DICE or pre-DICE?
The latter would perhaps be simpler to operate, as it lessens two of the main worries which occurred to us as we started looking into this area:
Whatever software is used should have certain characteristics. Some of these reflect our long experience of the GNATS system which has been in use at KB, and to a lesser extent also at SB and FH, for several years.
In order to explain exactly what we need of the software, we need to explain the characteristics of the problems which the user support services will have to deal with. Most problems are either simple or complex:
We have noted that the GNATS software (even the newer versions) fails miserably against a large number of these criteria. It seems far more likely that we will go for other software which matches the criteria above more closely.
RT, on the other hand, meets many of the above criteria. A notable exception is out-of-the-box interoperability with most bug-tracking software (e.g. Bugzilla, which is likely to be the Software Management task's favoured bug-tracking system for DICE). However transfer of a report from one to the other ought in theory to be a simple matter of exporting from RT's SQL database and importing to Bugzilla's SQL database.
RT (Request Tracker) itself doesn't, in the strictest sense, have a front end for users to submit queries to, but there are a few configurable frame-based front ends out there. And it's all GPL. We have noted that we'll most likely be creating a web front end which submits mail to it anyway.
One recent offer of help has come informally from EUCS: the maintainer of their in-house call-tracking software, Call Management System (CMS), has offered us the chance to evaluate it for ourselves. More info when we've done that. So far our attempts to arrange a visit to see it have not met with success. However one can get a flavour of its facilities at the CMS web site.
| | GNATS | Call Management System | Bugzilla | Request Tracker |
|---|---|---|---|---|
| Supports email | Yes. | Yes. | Yes. | Yes. |
| Files incoming email replies properly | No. Email replies often go to the personal address of the user support person who responds to the call. Even if mail does go to GNATS, it's only filed properly if it has the correct subject line. | Yes. Mail goes from and to a dedicated CMS mail address for that particular call. | Yes, can be configured to do this. | Yes, can be configured so that replies go back into RT and are filed properly. |
| Deal with trivia quickly, simply | No. | Yes. "Quick call" facility lets the user log a trivial matter very quickly. | No special "quick" facilities that we can see. | No special "quick" facilities that we can see. |
| Flexibility and wide range of options when installing and configuring? | No - many changes seem to require code rewrites. | To some extent. We can customise some of the team-specific parts of the interface. However other parts of CMS are required to be the same across all teams, so we'd be stuck with some unsuitable things. | "Yes. However, modifying some fields, notably those related to bug progression states, also requires adjusting the program logic to compensate for the change." (Bugzilla FAQ) | Yes, the configuration seems extremely flexible. This is one of Request Tracker's strongest points. |
| Interoperable with Bugzilla? How? How easily? (Any application can send data to Bugzilla through its XML API or through the HTTP protocol.) | GNATS calls are more or less just plain text files; these can be massaged then injected into Bugzilla through the XML API or through HTTP. | Data can be extracted from the underlying database through an ODBC link; massaged into the correct format; then injected into Bugzilla through the XML API or through HTTP. | Data can be exported through a custom DTD in XML format. It can then be put into the other Bugzilla through the XML API or through HTTP. | Data can be extracted from RT's underlying SQL database; massaged into the correct format; then injected into Bugzilla through the XML API or through HTTP. |
| Builds a "knowledge base" or archive of problems, searchable by COs | Yes. Automatically builds record of past problems and has excellent search facilities. | No, CMS doesn't do this. However, data can be extracted through an ODBC link and fed elsewhere, such as into a separate SQL database, where it can be searched. | "You have no idea. Bugzilla's query interface, particularly with the advanced Boolean operators, is incredibly versatile." (Bugzilla FAQ) | Yes. Automatically builds record of past problems and has excellent search facilities. |
| Can it be made to exclude trivia when searching its knowledge base? | Yes. Searching is very flexible and you can exclude or include what you like, assuming it's all been categorised correctly in the first place. | Trivial matters or "quick calls" are easily identifiable by dint of having a customer called "QuickCall", so yes, they could be excluded when searching our separate "knowledge base" SQL database. | Yes - see above | Yes. Searching is very flexible and you can exclude or include what you like, assuming it's all been categorised correctly in the first place. |
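
The "files incoming email replies properly" criterion mostly comes down to whether a reply can be matched to its call. A minimal sketch of the subject-line approach follows; the `[support #42]` tag format, the `support` system name and the folder names are hypothetical conventions of ours, not the format used by any of the four packages.

```python
import re
from email.message import EmailMessage
from typing import Optional

# Hypothetical tag convention: "[<system> #<call number>]" anywhere in the subject.
TAG = re.compile(r"\[(?P<system>[\w.-]+) #(?P<call>\d+)\]")


def call_number(subject: str, system: str = "support") -> Optional[int]:
    """Return the call number carried in the subject line, if any."""
    for match in TAG.finditer(subject):
        if match.group("system") == system:
            return int(match.group("call"))
    return None


def tagged_subject(subject: str, call: int, system: str = "support") -> str:
    """Add the tag to outgoing mail so that replies can be filed later."""
    if call_number(subject, system) == call:
        return subject  # already tagged, e.g. on a "Re:" of earlier mail
    return f"[{system} #{call}] {subject}"


def file_under(msg: EmailMessage, system: str = "support") -> str:
    """Pick the folder for an incoming message; untagged mail opens a new call."""
    call = call_number(str(msg.get("Subject", "")), system)
    return f"call-{call}" if call is not None else "new-call"
```

The weakness is the one we have already met with GNATS: a reply whose subject line has been edited, or which was sent to a personal address, never reaches this code with a usable tag.
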
| | GNATS | Call Management System | Bugzilla | Request Tracker |
|---|---|---|---|---|
| Files outgoing email replies properly | Sort of. It doesn't file its outgoing mail at the point of sending, but CCs it to the GNATS mail address, where the mail is filed on receipt. But mail about a call is often dealt with outside of GNATS, so copies are often not filed as they should be. | Sort of. It doesn't file its outgoing mail at the point of sending, but CCs it to the CMS mail address for the call, where the mail is filed on receipt. However all mail is dealt with through CMS so replies will be kept and filed properly. | It seems to be possible to configure its email to behave in a wide variety of ways, so we'll assume a "yes" here. | Sort of. It doesn't file its outgoing mail at the point of sending, but it tweaks the From address so that replies go back into RT where they're properly filed. |
| Handles wrongly addressed incoming replies OK | KB's installation of GNATS does not handle this properly: it categorises and numbers a wrongly addressed follow-up message as a brand new problem. It's then the devil's own job to merge this information in to where it ought to be, the Audit Trail of the real problem it relates to. What about other GNATS installations or versions? | This isn't clear to us. | Behaviour is configurable. | |
| Allows manual control where needed; lets the user recover from mistakes. | Not good enough. Too often, once you've clicked the wrong button, you're stuffed. | CMS seems better than GNATS in that the wrong thing is less likely to happen in the first place. But if it does, it may be difficult to undo. | Yes. The RT interface remains fluid until the final "commit". | |
| Sends out reminders to fault-fixers | No. | Yes. Fully configurable reminder system built in. | Can send out mail in all sorts of circumstances. | Yes. Fully configurable reminder system built in. |
| Allows users to follow progress | Yes. | No; CMS access is only allowed for Support team members, not users. We'd have to export the data to a separate SQL database which we could then allow users to search. Or something. | Yes, RT has a decent granularity of access; for example it can be set up so as to allow a user access to all her support queries. | |
| Anti-spam facilities. (All can be protected to some extent using procmail and other mail filters.) | No. | Yes, there's a quick spam deletion facility. | No. | |
| | GNATS | Call Management System | Bugzilla | Request Tracker |
|---|---|---|---|---|
| Any maintenance/upkeep worries? Can we count on the software being actively supported for some years to come? And how easy will it be to move to other software when the time comes? | Open source, part of the GNU Project. Under (slow?) active development. | Written and maintained by a dedicated EUCS team. The future of the EUCS support teams that use CMS is up in the air at the moment. Enough people in the University use and like CMS that one would think it unlikely for development to be stopped and/or the service to be shut down by EUCS senior management; but this would always be a possibility. | Open source; all related software is free and open source. Seems to be very popular bug-tracker software in the open source community, and under active development, so no maintenance worries. | Open source; all related software is free and open source. However RT seems to have been developed by a very small team, so it's not obvious that RT development would continue if their personal circumstances changed. |
| Cost | Free. | Currently unknown. We'd have to pay EUCS, but we won't know how much until we come up with a detailed spec which they can cost. They say it wouldn't be high. | Free. | Free. |
| Documentation | Seems pretty comprehensive and helpful. | The Quick Start Guide is great. The more advanced documentation isn't there yet. | Seems pretty comprehensive and helpful. | Incomplete. The amount of blank space in the RT manual is scary! However there is an RT users' mailing list. |
GNATS has too many negatives in our "must have" table to be considered further.
CMS turned out to be a lot more interesting than we had expected. It seems like a good, useable system which does what it was designed to do very well. The CMS team have done a good job, and the EUCS user support teams have a great tool to use. (We'd like to thank the CMS team here for their wonderful helpfulness when showing CMS to us. Thanks!)
However - you could tell there was a "however" coming, couldn't you - the job which CMS was designed for isn't quite the job we want our software to do. CMS is designed around EUCS working methods and patterns of data flow; it would be awkward to bend it into the slightly different shape which we would want. There's also the discouraging need to shell out money for it (however small the cost turned out to be), and there's a slight question mark over adopting software which is so firmly under the control of one department's management. That it happens to be EUCS is irrelevant; the point is that the other packages are all released under public licences and have some degree of global user community backing them. CMS doesn't. Those are both relatively minor points, however; the main reason we haven't chosen it is simply that it promised in several ways to be more awkward to use than its competitors here.
So the choice comes down to Request Tracker ("RT") or Bugzilla. Both seem to perform well in most areas. Both lack a way of dealing very quickly with trivial calls, which is slightly worrying. Our testing will reveal how much this actually matters. Of the two, RT seems to have the smaller user community. But its glorious flexibility and its ease of configuration, combined with its having been designed from the ground up as a general purpose message tracking system (rather than specifically as a bug-tracking system), give RT the edge as far as we're concerned.
We suggest the need for a strong and loud information campaign. It should be aimed at the whole division, and it should be done sooner rather than later. We should say simply and clearly what DICE consists of and what it does not consist of. We could perhaps include a FAQ which tackles basic questions head on (such as "Is DICE a CS takeover?"; "Will backups be worse under DICE?"; "Shall I have to learn a whole new operating system?")
2002-02-15 We've now started a good information campaign - well done. It seems that we may have judged people's attitudes to DICE wrongly: actually, most people seem to be completely uninterested! Either that or the DICE web site is so excellent that it answers all of their questions, meaning that they had no need to turn up to any of the DICE publicity events.
2002-02-19 Efforts to give all DICE developers access to Linux machines now seem to be under way.
Unless explicitly stated otherwise, all material is copyright The University of Edinburgh.