IT Services

Department of Engineering

I/O, I/O, it’s off to work we go…

August 5, 2015 By Paul Taylor

…Or, balancing between infrastructure maintenance and project delivery, and hoping that you don’t fall off

It’s never a good sign when, whatever you do on a computer, all it will say is “I/O Error”. That’s what happened a couple of days ago with one of our older servers, and it soon became clear that things weren’t going to improve.

It was a Monday morning, and the first day of our system manager’s holiday. Leaving it until his return wasn’t an option: the server co-ordinated a number of important processes every night, so we had to fix things. Fortunately, some work had already been done to migrate the processes to a newer server, so a replacement was available. But that work had stalled under the pressure to deliver projects with direct and tangible benefits to users – and that’s where the difficulty arises.

Preventative maintenance work such as that migration is like an insurance policy: if it isn’t done, things might go wrong, and then we’ll be in a mess. On the other hand, things might not go wrong – in which case, time spent doing the maintenance work is time that could have been spent doing other things instead. And you can’t tell which it will be: a year from now, a server might have died, or it might not. It’s like Schrödinger’s IT.

Maintenance isn’t glamorous, and people don’t necessarily even know that it’s going on. But if you’re delivering new projects, the benefits are obvious, people like them, and you’re doing what they want. Our difficulty for a long time has been making a case to allocate time for maintenance in the face of ever-increasing requirements for new things: systems, networks, servers, databases and web sites. Maintenance gets squeezed in round the edges and into the evenings, but, even more, it gets squeezed out. If it starts to look difficult, it stalls; if it needs a lot of time, it stalls; and if it needs several people to work on it at once, it may never even start at all.

And that’s where we came in: an old server was still being used because we hadn’t been able to allocate the time for maintenance… and on Monday, it failed. Three or four people had to drop everything for a day-and-a-half to deal with it. We’ve managed it, of course, and almost no one will have been aware that there was a problem. Some might argue that this is a valid approach to all maintenance: after all, it forces the work to be done; it may break a logjam of debate over the best methods; and it probably gets things done more quickly than they otherwise would have been. But it does that by disrupting all other work, and – in some cases – by disrupting services, too. It causes corners to be cut so that the work can be done quickly, and it creates enormous stress for those involved – not to mention long hours. No, it’s not a valid approach.

Our challenge is to find a way to make a case for that maintenance work, and to make its benefits as clear – and quantifiable – as those of the shiny deliverables from new projects. The case has to stand up beside the cases for new projects; it has to be taken seriously; and it has to be given the time and resources that it needs.

Comments

  1. Tim Love says

    August 13, 2015 at 7:23 am

    There’s “knowledge maintenance” as well as “infrastructure maintenance” to worry about – less time-critical and even harder to justify. Patrick mentioned at yesterday’s meeting that he was struggling to keep up with understanding new users’ tech needs. This vac I’m trying to catch up with Python, CentOS, etc. in the hope that I can perform quick fixes if Teaching problems arise.

