1: Rectify fault and test
This unit covers rectifying faults, testing whether the solution has succeeded, and performing acceptance testing of the system to ensure the problem has been satisfactorily solved.
Outcomes for this unit
After completing this learning pack you will be able to:
.Rectify possible causes, testing for the success of the solution
.Test the system to ensure the problem has been solved
Activity 1.1: Action Plan
This activity will require you to prepare an action plan for a given fault. The fault is described below. You will need to formulate this plan in fairly generic terms since you would be working without having had exposure to the system described.
The fault
You have been assigned to troubleshoot a network server. The server has been operational for over 18 months and has recently started to experience some problems. The symptoms described are as follows:
.System hangs intermittently when accessing disk drives
.The Windows 2000 Event Log shows several entries relating to I/O and CRC errors
.The lights in front of the RAID enclosure sometimes blink continuously, even when disk activity is nonexistent
.You suspect that the RAID subsystem is failing
Q: How would you develop an action plan that will enable you to get to the bottom of this problem?
A: An appropriate action plan would incorporate the following characteristics:
.Identifies the systems or components affected or impacted
.Identifies the objectives of the plan (e.g. restore optimum functionality)
.Identifies the resources needed, including hardware, software, people and procedures
.Identifies severity and criticality, and hence priority
.Identifies a timeframe for implementation, according to priority
.Identifies any support contracts that might exist and be applicable to the system in question
.Indicates the actual remedial steps to be taken. These might include system reconfiguration, re-installation, software patches, component replacement, and consultation with vendors as needed
.Indicates risks, including the expected disruption as a result of remedial action
.Identifies a workaround solution in case the previous steps fail to rectify the fault
Note: not every item in the list above has to be included, but each should at least be considered. An appropriate way of developing this action plan would be to use a pre-existing form, available as an organisational document.
Note: fully featured help desk software quite often includes all of the above items as part of its standard description of faults and their management.
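The checklist above can also be captured as a simple record so that unconsidered items stand out. The following is only a minimal sketch in Python: the class name, the field names and the sample values are assumptions made for illustration and do not come from any particular help desk product or organisational form.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ActionPlan:
    """Hypothetical record mirroring the checklist; every field name is an assumption."""
    affected_components: list = field(default_factory=list)
    objective: str = ""
    resources: list = field(default_factory=list)
    priority: str = ""              # derived from severity and criticality
    timeframe: str = ""
    support_contracts: list = field(default_factory=list)
    remedial_steps: list = field(default_factory=list)
    risks: list = field(default_factory=list)
    workaround: str = ""

    def unconsidered_items(self):
        """Return the names of checklist items that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Sample values loosely based on the RAID scenario (illustrative only).
plan = ActionPlan(
    affected_components=["network server", "RAID subsystem"],
    objective="restore optimum functionality",
    remedial_steps=["back up data", "replace suspect RAID component"],
)
print(plan.unconsidered_items())
```

An empty result from unconsidered_items() would suggest that every item has at least been looked at, which is all the first note asks for.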
Activity 1.2: Rollback strategy
This activity will require you to devise a rollback strategy based on the scenario from the previous activity. The fault is described below.
The fault
You have been assigned to troubleshoot a network server. The server has been operational for over 18 months and has recently started to experience some problems.
The symptoms described are as follows:
.System hangs intermittently when accessing disk drives
.The Windows 2000 Event Log shows several entries relating to I/O and CRC errors
.The lights in front of the RAID enclosure sometimes blink continuously, even when disk activity is nonexistent
You suspect that the RAID subsystem is failing.
Q: How would you develop a rollback strategy for this situation?
A: A rollback strategy is a series of steps or measures that would enable you to restore the system to the state it was in before troubleshooting began.
In this particular case, your rollback strategy would have considered the following:
.Steps from action plan may be reversed or equivalent system status can be achieved with alternative steps
.No data loss will be incurred. Full system and data backups are to be made before enacting the action plan
.Spare components are available, if needed
.Expertise is available for system reconfiguration. This might include internal personnel and external parties (vendors or contractors)
.An alternative solution is available, e.g. a backup server
.The consequences and impact of the rollback are understood
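The requirement that no data loss will be incurred only holds if the backup can actually be trusted. As one possible way of checking this before the action plan is enacted, the sketch below compares SHA-256 digests of source files against their backup copies. The function names, directory layout and file name are hypothetical, and the demonstration runs on throwaway temporary directories rather than a real server.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_backup(source_dir, backup_dir):
    """Return relative paths that are missing from, or differ in, the backup."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        dst = backup_dir / rel
        if not dst.is_file() or file_digest(src) != file_digest(dst):
            problems.append(str(rel))
    return problems


# Demonstration on temporary directories (hypothetical data).
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "data"
    backup = Path(tmp) / "backup"
    source.mkdir()
    (source / "orders.db").write_bytes(b"example records")
    shutil.copytree(source, backup)
    clean_result = verify_backup(source, backup)      # backup matches
    (backup / "orders.db").write_bytes(b"corrupted")  # simulate a bad copy
    damaged_result = verify_backup(source, backup)

print(clean_result, damaged_result)
```

A check like this only confirms that the copies match at the time it is run. It does not replace a test restore, which remains the stronger evidence that a rollback is possible.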
Activity 1.3: Acceptance Testing
This activity will require you to devise an acceptance test procedure based on the scenario from the previous activity. The fault is described below.
The fault
You have been assigned to troubleshoot a network server. The server has been operational for over 18 months and has recently started to experience some problems.
The symptoms described are as follows:
.System hangs intermittently when accessing disk drives
.The Windows 2000 Event Log shows several entries relating to I/O and CRC errors
.The lights in front of the RAID enclosure sometimes blink continuously, even when disk activity is nonexistent
.You suspect that the RAID subsystem is failing.
Q: How would you develop an acceptance test procedure?
A: The development of an Acceptance Test involves a number of iterative steps:
1.Assess the type of testing required
2.Develop the procedures and instructions for testing
3.Develop the necessary test scripts
4.Execute the test scripts
5.Report any defects
6.Retest any fixes
Your acceptance test procedure might have included some of the following items:
1.Test type to be carried out, e.g. simple, iterative, sequential
2.Instructions to be carried out, e.g. any necessary preparations such as installation of monitoring software, auditing, load testing, benchmarking
3.The sequence (order) of tests to be done
4.Resulting data that will be analysed following the execution of tests, e.g. reports, charts, benchmarking results, system log events
5.Definitions of what constitutes failure. Criteria or metrics are to be stipulated here, e.g. repetition of original symptoms, new symptoms
6.Repetition of testing after new fixes are actioned
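To make the idea of a test script with an explicit failure definition more concrete, here is a minimal sketch of an iterative write and read-back test in Python. It is an illustration under stated assumptions, not a vendor diagnostic: the function name, file names, block size and iteration count are all hypothetical, and failure is defined here as a CRC mismatch or an I/O error, which corresponds to a repetition of the original symptoms.

```python
import os
import tempfile
import zlib


def read_back_test(directory, iterations=5, block_size=64 * 1024):
    """Write random blocks, read them back, and compare CRC32 values.

    Returns a list of (iteration, reason) tuples; an empty list means a pass.
    """
    failures = []
    for i in range(iterations):
        data = os.urandom(block_size)
        expected = zlib.crc32(data)
        path = os.path.join(directory, f"acceptance_{i}.bin")
        try:
            with open(path, "wb") as fh:
                fh.write(data)
                fh.flush()
                os.fsync(fh.fileno())   # ask the OS to push data to the device
            with open(path, "rb") as fh:
                actual = zlib.crc32(fh.read())
            if actual != expected:
                failures.append((i, "CRC mismatch"))
        except OSError as exc:
            failures.append((i, f"I/O error: {exc}"))
        finally:
            if os.path.exists(path):
                os.remove(path)
    return failures


# Run against a scratch directory; on the real system this would be a
# directory on the repaired volume (an assumption of this sketch).
with tempfile.TemporaryDirectory() as scratch:
    failures = read_back_test(scratch, iterations=3)

print("PASS" if not failures else f"FAIL: {failures}")
```

Because the operating system may serve the read from its cache, a pass here is weak evidence on its own. In practice it would be combined with the other items in the list, such as reviewing the system log for new I/O or CRC entries after the test run and repeating the test after each fix.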
2: Obtain appropriate fault-finding tools
Fault-finding is a crucial skill in the life of the IT professional, no matter what area of IT you are in. Fault-finding can be very challenging indeed, yet being able to solve a difficult problem can bring enormous satisfaction and recognition. The good news is that fault-finding skills can be developed. Fault-finding is a skill that will accompany you throughout your professional career.
The aim of this unit is to allow you to develop an understanding of fault-finding tools and methods. You will have an opportunity to practise using fault-finding tools and methods to solve real problems.
In this topic, you will have an opportunity to learn about tools that are used for fault-finding and troubleshooting purposes. You will also learn about generic cyclic fault-finding methods. Additionally, you will have an opportunity to practise fault-finding using commonly available tools for a range of computer systems, both standalone and networked.
Outcomes for this unit
After completing this learning pack you will be able to:
.Analyse and document the system that requires troubleshooting
.Research specifically designed troubleshooting tools for the system
.Investigate generic cyclic fault-finding tools
.Obtain required specialist tools
This activity will require you to use the Internet to search for fault-finding tools that might be appropriate for an IT environment.
Use the following as search criteria:
.One software-based tool that performs standalone PC diagnostics. This tool must be freeware/open source/GNU GPL.
.One software-based tool that performs standalone PC diagnostics. This tool must be commercial.
.One software-based tool that performs network diagnostics, for example, network discovery, packet capture and analysis. This tool must be freeware/open source/GNU GPL.
.One software-based tool that performs network diagnostics, for example, network discovery, packet capture and analysis. This tool must be commercial.
Q: What fault-finding tools did you find that might be appropriate for an IT environment?
A: There are literally hundreds of software-based tools available. The real challenge is to be able to sort through them all and find the ones that will enhance your ability to find problems and fix them. Some possible answers are listed below:
.Sandra,
.Systemworks,
.Ethereal,
.Fluke Network Inspector, and
.Protocol Inspector.
Activity 2.2: Hardware tools
The aim of this activity is for you to find out about hardware-based tools that can assist you in the troubleshooting process. You will use the Internet, trade magazines and books to find out about hardware tools. Use the following criteria to narrow down your search.
You need to find: