Use a troubleshooting method for more efficient IT support (2024)

Troubleshooting is a crucial skill for IT professionals. There's no way around it: a lot of our time is spent trying to figure out why something that should work doesn't work. Much of our ability to diagnose and solve computer and network problems comes from experience. However, there is also a framework that guides us in finding the answers we need.

While none of this content is exclusive to CompTIA, I would like to point out that virtually all CompTIA certifications include a troubleshooting methodology. This process has been refined through years of experience and serves as a guide for newer members of the IT community.

Here is the CompTIA troubleshooting method:

Identify the problem
Develop a theory about the probable cause
Test the theory to determine the cause
Create an action plan to address the problem and identify possible consequences
Implement the solution or escalate if necessary
Check complete system functionality and implement preventive measures, if applicable
Document findings, actions, outcomes and lessons learned

Let's see what these steps entail.

1. Identify the problem

This step is often the easiest. The problem may come to your attention through an incoming phone call from a user, a help desk ticket, an email message, a log file, or any number of other sources. It is not at all unusual for users to be the ones who alert you to the problem or glitch.

It is important to recognize that the cause of specific problems is not always clear. For example, a failed login attempt seems to indicate a problem with the username or password, when the real problem may instead be a lack of network connectivity, preventing authentication credentials from being checked on a remote server.
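The login example can be sketched as a quick check that separates "the authentication server is unreachable" from "the credentials are wrong". This is a minimal sketch, not a definitive diagnostic: the function names, the `probe` parameter, and the host name `auth.example.internal` are all hypothetical, and a successful TCP connection only shows that the port answers, not that the authentication service is healthy.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def triage_login_failure(auth_host: str, auth_port: int, probe=can_reach) -> str:
    """Suggest where to look first after a failed login.

    probe is any callable (host, port) -> bool; it defaults to a real
    TCP check but can be swapped out for testing.
    """
    if not probe(auth_host, auth_port):
        return "network: authentication server unreachable"
    return "credentials: server reachable, check username and password"


if __name__ == "__main__":
    # Hypothetical host and port (636 is commonly used for LDAP over TLS).
    print(triage_login_failure("auth.example.internal", 636))
```

The point of the design is the order of the questions: rule out connectivity before asking the user to retype a password.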

As troubleshooters, we make every effort to ensure that we have identified the cause of the error, misconfiguration, or service outage before making any changes.

Specific steps here may include:

Gathering information from log files and error messages
Questioning users
Identifying symptoms
Determining recent changes
Duplicating the problem
Approaching multiple problems one at a time
Narrowing the scope of the problem
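As an illustration of gathering information from log files, the sketch below counts repeated error messages so the most frequent symptoms stand out. The log format (`<date> <time> <LEVEL> <message>`) and the sample lines are assumptions made for the example; real logs will need their own pattern.

```python
import re
from collections import Counter

# Assumed log format: "<date> <time> <LEVEL> <message>"
LINE_RE = re.compile(r"^(\S+ \S+) (ERROR|WARN|INFO) (.*)$")


def summarize_errors(lines):
    """Return (message, count) pairs for ERROR lines, most frequent first."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group(2) == "ERROR":
            counts[match.group(3)] += 1
    return counts.most_common()


# Made-up sample data for illustration only.
sample_log = [
    "2024-01-10 09:00:01 INFO service started",
    "2024-01-10 09:02:13 ERROR auth server timeout",
    "2024-01-10 09:02:40 WARN retrying connection",
    "2024-01-10 09:03:05 ERROR auth server timeout",
    "2024-01-10 09:04:51 ERROR disk quota exceeded",
]

if __name__ == "__main__":
    for message, count in summarize_errors(sample_log):
        print(f"{count}x {message}")
```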

2. Develop a theory of probable cause

I want to start by pointing out the imprecise wording of this step. Words like theory and probable indicate a guess on your part, even if it is a guess supported by data. The way this step is written recognizes that the root cause (step one) may not have been accurately identified. However, the cause is specific enough to start troubleshooting.

This stage may require significant research on your part. Vendor documentation, your organization's own documentation, and an old-fashioned Google search may all be necessary to form the basis of your theory. It is a process of elimination.

3. Test the theory to determine the cause

The most interesting thing about steps one and two is that you don't have to make any configuration changes. They are about gathering information. You should not make any changes until you are reasonably confident that you have a solution that you can implement.

This step is also part of the "information gathering phase".

It is not unusual for experienced administrators to go through steps one, two and three very quickly and informally. Many issues stem from a small set of common problems, which makes it easy to guess the likely cause of an error message or faulty device.

At this stage you may find yourself going all the way back to step one: Identify the problem. If you test your theory to find the likely cause and discover you were wrong, you may have to start your investigation all over again. You can check in with users, dig deeper into logs, use Google, etc.

Some workstation troubleshooting involves hardware components such as the CPU, memory (RAM), and storage (solid-state and hard drives). You may need to swap these parts for known good ones. Other problems may be software-related, either with the operating system (Windows, Linux, or macOS) or with applications (Word, Excel, Chrome or other programs).

Troubleshooting a network is different from troubleshooting standalone workstations or servers. An effective network troubleshooting methodology starts with an understanding of the Open Systems Interconnection (OSI) model. This seven-layer model defines the networking process and is considered a fundamental concept.
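One common way to apply the OSI model is to work from the bottom layer up and stop at the lowest layer that fails. The sketch below assumes you have already run your own checks and recorded a pass or fail per layer; the `first_failing_layer` helper and the example results are hypothetical, and bottom-up is only one of several reasonable orders.

```python
from enum import IntEnum


class OsiLayer(IntEnum):
    """The seven OSI layers, numbered from the bottom up."""
    PHYSICAL = 1
    DATA_LINK = 2
    NETWORK = 3
    TRANSPORT = 4
    SESSION = 5
    PRESENTATION = 6
    APPLICATION = 7


def first_failing_layer(results):
    """Return the lowest layer whose check failed, or None if none failed.

    results maps OsiLayer -> bool. Layers with no recorded result are skipped.
    """
    for layer in OsiLayer:  # members iterate in definition order, 1 through 7
        if results.get(layer) is False:
            return layer
    return None


if __name__ == "__main__":
    # Made-up results: link is up, but the host has no usable IP route,
    # so everything above the network layer fails as well.
    results = {
        OsiLayer.PHYSICAL: True,
        OsiLayer.DATA_LINK: True,
        OsiLayer.NETWORK: False,
        OsiLayer.APPLICATION: False,
    }
    print(first_failing_layer(results).name)
```

Reporting the lowest failing layer matters because failures at upper layers are often just side effects of a problem further down.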

4. Create an action plan and implement the solution

Once you think you know the cause of the problem, you can plan how to fix it. Here are some reasons to plan ahead before blindly jumping into a course of action:

Some solutions require a restart or other more significant downtime
You may need to download software, patches, drivers, or entire operating system files before continuing
Your change management procedures may require you to test changes to a system's configuration in a test environment before deploying the solution to production
You may need to document a series of complex steps, commands, and scripts
You may need to back up data that may be at risk during recovery
You may need approval from other IT personnel before making changes

After this step you can start changing the system configuration.

You are now ready to do what you think you need to do to solve the problem. These steps may include:

Running your scripts
Updating your systems or software
Editing configuration files
Changing firewall settings

Make sure you have a recovery plan in place if the solution you try doesn't solve the problem. You should be able to change your settings to at least get back to where you started.
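For a configuration file, a recovery plan can be as simple as keeping a copy of the original and restoring it if the change does not pass a check. The following is a minimal sketch under that assumption; `apply_with_rollback`, the `.bak` naming, and the `verify` callback are hypothetical, and a real change may also require restarting a service or coordinating with change management.

```python
import shutil
from pathlib import Path


def apply_with_rollback(config_path, new_text, verify):
    """Write new_text to config_path; restore the backup if verify fails.

    verify is a callable that receives the Path and returns True when the
    new configuration is acceptable. Returns True if the change was kept,
    False if it was rolled back.
    """
    config = Path(config_path)
    backup = config.with_name(config.name + ".bak")
    shutil.copy2(config, backup)      # keep a copy of the starting state
    config.write_text(new_text)
    try:
        ok = bool(verify(config))
    except Exception:
        ok = False                    # a crashing check counts as a failure
    if not ok:
        shutil.copy2(backup, config)  # get back to where we started
    return ok
```

The backup file is left in place either way, so there is still a record of the starting state after a successful change.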

In some cases, the implementation of the proposed solution can be faster than the research phases that preceded it. However, these investigation phases are important to ensure that you solve the real problem and minimize downtime.

5. Check complete system functionality and implement preventive measures

I once observed a failure at this stage of troubleshooting. A user called the support technician about a printer that was not working. When the technician arrived, he noticed that the printer's power cord was unplugged. He plugged it back in, complained that users didn't understand computers, and walked away. What he didn't realize, however, was that the printer was jammed and the users had unplugged it while trying to fix the problem. The tech walked away without verifying functionality.

Where possible, have the users who rely on the system test the functionality for you. They are the ones who really know how the system should work and can confirm that it meets their specific requirements.

Depending on the problem, you may need to apply the solution to multiple servers or network devices. For example, if you discovered a problem with a device driver on one server, you may need to update the drivers on multiple servers that depend on the same device.
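The driver example can be sketched as a lookup over an inventory: given the version that contains the fix, list every server that uses the same device with an older driver. The inventory structure, host names, and device names below are made up for illustration, and the version comparison assumes purely numeric dotted versions such as `2.10.0`.

```python
def servers_needing_update(inventory, device, fixed_version):
    """Return hostnames using device with a driver older than fixed_version.

    inventory maps hostname -> {device name: driver version string}.
    """
    def as_tuple(version):
        # Compare numerically so that 2.10.0 is newer than 2.9.1.
        return tuple(int(part) for part in version.split("."))

    return sorted(
        host
        for host, drivers in inventory.items()
        if device in drivers and as_tuple(drivers[device]) < as_tuple(fixed_version)
    )


# Hypothetical inventory for illustration only.
inventory = {
    "srv-a": {"raid-ctl": "2.9.1"},
    "srv-b": {"raid-ctl": "2.10.0"},
    "srv-c": {"nic": "1.0.0"},
    "srv-d": {"raid-ctl": "2.4.7"},
}

if __name__ == "__main__":
    print(servers_needing_update(inventory, "raid-ctl", "2.10.0"))
```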

6. Document findings

Documentation is my pet peeve. It comes from working as a network administrator for an organization without documentation. I was the sixth administrator the company had hired in five years, and no one before me wrote anything down. It was a nightmare.

Documenting your troubleshooting steps, changes, updates, theories, and research can all be useful in the future when a similar problem arises (or when it turns out that the same problem wasn't solved after all).

Another reason to keep good documentation as you go through the entire method is to communicate to others what you have tried so far. I once had Microsoft tech support on the phone for a broken Exchange server. The first thing the technician said was, "What have you tried so far?" I had a three-page list of things we didn't need to try again. This systematic approach has saved us a lot of time.

Such documentation is also useful if your changes have unintended consequences. You can more easily undo your changes or change configurations if you have good documentation of exactly what you did.
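If you do not already have a ticketing system that captures this, even a small structured record helps. The sketch below is one possible shape for such a record, not a standard; the `TroubleshootingRecord` class and its field names are hypothetical, and the sample entries are invented.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class TroubleshootingRecord:
    """A running log of one troubleshooting effort."""
    problem: str
    theories: list = field(default_factory=list)
    actions: list = field(default_factory=list)  # dicts of action and outcome
    resolved: bool = False

    def add_theory(self, theory):
        self.theories.append(theory)

    def add_action(self, action, outcome):
        """Record what was tried and what happened, in order."""
        self.actions.append({"action": action, "outcome": outcome})

    def to_json(self):
        """Serialize the record so it can be shared or attached to a ticket."""
        return json.dumps(asdict(self), indent=2)


if __name__ == "__main__":
    record = TroubleshootingRecord(problem="Users cannot send mail")
    record.add_theory("Mail service stopped after update")
    record.add_action("Restarted mail service", "No change")
    record.add_action("Rolled back last update", "Mail flow restored")
    record.resolved = True
    print(record.to_json())
```

A list like this answers "What have you tried so far?" directly, and the ordered actions double as a guide for undoing changes.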

7. Keep it simple

This troubleshooting method is only a guideline. Every network environment is unique, and as you gain experience in that environment, you will be better able to predict the likely causes of problems and apply appropriate troubleshooting techniques.

If I could pass on one piece of wisdom to future support workers, it would be to start simple when identifying probable causes. In my courses, one of the most important troubleshooting checklists I suggested was this:

Is it connected?
Is it on?
Have you restarted it?

It may seem facetious and overly simple, but these steps are actually worth it (it's even helpful to double-check them). The real lesson, however, is not in these three steps, but in the spirit behind them, which is to start simple and work toward the more complex.

Finally, there's one thing the above troubleshooting method doesn't address: time. In many cases you work within the constraints of service level agreements (SLAs), legal restrictions or security requirements. In these situations, you need to be able to perform the above steps efficiently.
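To make the time pressure concrete, here is a small sketch that works out how much time is left before an SLA deadline. The four-hour resolution target and the timestamps are invented for the example; actual SLA terms (business hours, pause rules, severity tiers) vary by contract and are not modeled here.

```python
from datetime import datetime, timedelta


def time_remaining(opened_at, sla, now):
    """Return the time left before the SLA deadline (negative if breached)."""
    return (opened_at + sla) - now


if __name__ == "__main__":
    # Hypothetical ticket opened at 09:00 with a four-hour resolution target.
    opened = datetime(2024, 1, 10, 9, 0)
    sla = timedelta(hours=4)
    now = datetime(2024, 1, 10, 11, 30)
    print(time_remaining(opened, sla, now))
```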

By consciously following a troubleshooting method, you can diagnose and resolve system and network problems much more consistently and efficiently. I strongly encourage you to formalize such a method for your support staff.
