The Complete Guide to Finding and Fixing Broken Links with Selenium

Stumbling upon non-working links on websites is all too common, contributing to millions of hours of wasted time and frustration. The good news is Selenium provides an easy way to automatically catch these UX-destroying broken links before users encounter them.

In this actionable guide, we'll walk through a straightforward 5-step approach to traverse and validate links using Selenium WebDriver – equipping you to ship higher quality, navigationally-sound web apps.

By the end, you'll have working sample code and expert-level insight into a scalable link checking automation framework. Let's get started!

Why Broken Links Ruin the User Experience

Before jumping into the automation, understanding why broken links cause problems can motivate the investment required to eliminate them.

According to various studies, the average website has a broken link rate between 3-5%. For sites with heavier reliance on links like research portals or news sites, that rate jumps above 10%.

When a user clicks a broken link, they waste on average 4.5 seconds waiting for a page to load before getting an error. This may seem small, but across thousands of visitors it adds up to thousands of hours of wasted time – not to mention frustration.

Other stats showing the detrimental impact of broken links:

  • 89% of users do not return to a site after hitting multiple broken links
  • Pages with broken links get 50% less traffic on average from search engines
  • Fixing broken links tends to increase conversion rates by 7% or more

Clearly, keeping links functional pays dividends in site traffic, user experience, and revenue. Selenium lets us automate what would otherwise be an extremely tedious task: manually verifying website links.

Step #1 – Collect All Links on the Page

The first scripting step is to gather all the <a> anchor tags on the page representing hyperlinks.

In Java, we first initialize a WebDriver instance and navigate to the target page:

WebDriver driver = new ChromeDriver();
driver.get("http://website.com");

Next, call the findElements method with a tag-name locator to collect all anchors into a list:

List<WebElement> allLinks = driver.findElements(By.tagName("a")); 

That gives us a variable containing every link DOM element to iterate through.

Expected Result: A list of WebElement objects representing each <a> tag on the page

Step #2 – Extract the HREF Link Address

With the links collected, we need to loop through and extract the href attribute from each anchor element which stores the URL destination:

for(WebElement link : allLinks) {

  String linkURL = link.getAttribute("href");

  System.out.println(linkURL); 

}

Printing out all the captured URLs lets us check that they match expectations.

Expected Result: Text output displaying URLs of links extracted from the page
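
Not every href is worth checking. Some anchors have no href at all, and others use schemes such as mailto: or javascript: that an HTTP request cannot validate. The sketch below shows one possible way to filter and de-duplicate the extracted strings before Step #3. The class LinkFilter and its method names are hypothetical helpers, not part of Selenium; they work on plain strings, so they can be fed with the values returned by link.getAttribute("href").

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper (not part of Selenium): decides which href values are worth checking.
class LinkFilter {

    // True only for non-empty values that start with http:// or https://.
    static boolean isCheckable(String href) {
        if (href == null) {
            return false; // anchor without an href attribute
        }
        String lower = href.trim().toLowerCase(Locale.ROOT);
        return lower.startsWith("http://") || lower.startsWith("https://");
    }

    // Keeps checkable URLs, drops duplicates, and preserves the order found on the page.
    static List<String> checkableUrls(List<String> hrefs) {
        Set<String> unique = new LinkedHashSet<>();
        for (String href : hrefs) {
            if (isCheckable(href)) {
                unique.add(href.trim());
            }
        }
        return new ArrayList<>(unique);
    }
}
```

Dropping duplicates up front also means a URL that appears in both the header and the footer is only requested once.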

Step #3 – Send HTTP Request to Each Link

Next we'll traverse each link by sending an HTTP HEAD request and analyzing the response:

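// Note: new URL(...) and the connection calls throw IOException,
// so the enclosing method must declare or catch it.
for(WebElement link : allLinks) {

  // Read the URL inside the loop so linkURL is defined for this link
  String linkURL = link.getAttribute("href");

  HttpURLConnection connection = (HttpURLConnection) new URL(linkURL).openConnection();

  // HEAD asks for the status line and headers only, not the response body
  connection.setRequestMethod("HEAD");

  connection.connect();

  int responseCode = connection.getResponseCode();

  // 4xx and 5xx status codes indicate a client or server error
  if(responseCode >= 400) {
    System.out.println("Broken link found: " + linkURL);
  } else {
    System.out.println("Valid link: " + linkURL);
  }

}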

This uses the HttpURLConnection API to validate each link. A HEAD request returns only the status line and headers, so the check is faster than downloading the full response body. Any response code of 400 or above is treated as a broken link.

Expected Result: Console output labeling links as either valid or broken based on response status codes
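
A single unreachable host or malformed URL would otherwise stop the whole run with an exception. As a rough sketch, the HEAD check can be wrapped in a helper that sets timeouts and turns failures into a sentinel value. The class LinkChecker, the constant NO_RESPONSE, and the timeout parameter are assumptions made for this illustration; only the HttpURLConnection calls come from the standard Java library.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper: the HEAD check from this step, plus timeouts and error handling.
class LinkChecker {

    // Returned when no HTTP status could be obtained (bad URL, DNS failure, timeout).
    static final int NO_RESPONSE = -1;

    // Sends a HEAD request and returns the HTTP status code, or NO_RESPONSE on failure.
    static int statusOf(String linkURL, int timeoutMillis) {
        HttpURLConnection connection = null;
        try {
            connection = (HttpURLConnection) new URL(linkURL).openConnection();
            connection.setRequestMethod("HEAD");
            connection.setConnectTimeout(timeoutMillis);
            connection.setReadTimeout(timeoutMillis);
            return connection.getResponseCode();
        } catch (IOException | ClassCastException e) {
            // IOException: malformed URL or network failure.
            // ClassCastException: the URL is valid but not an HTTP(S) URL.
            return NO_RESPONSE;
        } finally {
            if (connection != null) {
                connection.disconnect();
            }
        }
    }

    // Same rule as above (400 or higher is broken), with "no response" also counted as broken.
    static boolean isBroken(int statusCode) {
        return statusCode == NO_RESPONSE || statusCode >= 400;
    }
}
```

Inside the loop this could be used as LinkChecker.isBroken(LinkChecker.statusOf(linkURL, 5000)), where the 5000 ms timeout is an arbitrary example value. Some servers reject HEAD requests (for example with 405 Method Not Allowed), so retrying those links with GET may be worth considering before reporting them.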

Step #4 – Log Broken Links to File

To help the dev team efficiently resolve issues, we'll log broken links to an external file for diagnosis:

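// The second argument (true) opens the file in append mode, so earlier entries are kept
FileWriter brokenLinkLog = new FileWriter("broken_links.txt", true);

// Call this for each linkURL that Step #3 flagged as broken
brokenLinkLog.write(linkURL + "\n");

// Close the writer once all links are checked so buffered entries reach the disk
brokenLinkLog.close();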

Now each broken link is appended to broken_links.txt, giving the team a list of issues to work through.

Expected Result: Text file created containing a list of non-working links
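
If the broken URLs are collected into a list first, the whole batch can be written in one call, and the file is closed automatically even when an error occurs. The following is a minimal sketch of that variation using java.nio.file.Files; the class name BrokenLinkLog and its append method are hypothetical, and it assumes UTF-8 is an acceptable encoding for the log.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Hypothetical helper: appends a batch of broken URLs to a log file.
class BrokenLinkLog {

    // Writes one URL per line, creating the file if it does not exist yet
    // and keeping any entries that are already there.
    static void append(Path logFile, List<String> brokenUrls) throws IOException {
        Files.write(logFile, brokenUrls, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }
}
```

A call such as BrokenLinkLog.append(Paths.get("broken_links.txt"), brokenUrls) would then replace the three FileWriter statements, where brokenUrls is whatever list the checking loop filled in.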

Step #5 – Rerun Tests and Track Progress

Finally, we want to integrate these checks into regression test suites that run on a schedule or trigger:

[Figure: Integration workflow]

Rerunning the scripts periodically and observing the shrinking broken link log serves to validate fixes and prevent new regressions.

For enhanced reporting, we can interface with test management solutions like QMetry to centralize the logging of defects related to identified broken links. This allows product owners to better understand quality trends.

Over time, consistently executing Selenium link checks provides confidence around site navigation for a smooth user experience.
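
Tracking progress between runs comes down to comparing two sets of broken URLs. The sketch below illustrates one way to do that, assuming the broken links from each run are kept as separate collections (for example, one log file per run). The class LinkRunDiff and its method names are hypothetical.

```java
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: compares the broken links found in two runs.
class LinkRunDiff {

    // Links that were broken in the previous run but not in the current one.
    static Set<String> fixed(Collection<String> previous, Collection<String> current) {
        Set<String> result = new TreeSet<>(previous);
        result.removeAll(current);
        return result;
    }

    // Links that are broken in the current run but were not broken before.
    static Set<String> regressions(Collection<String> previous, Collection<String> current) {
        Set<String> result = new TreeSet<>(current);
        result.removeAll(previous);
        return result;
    }
}
```

A scheduled job could fail the build when regressions(...) is non-empty, while fixed(...) gives a simple measure of progress to report.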

And that's it – from start to finish, we walked through a straightforward approach to finding broken links leveraging Selenium in just 5 steps. Let's recap the key concepts we covered…

Key Takeaways and Next Steps

  • Broken links severely diminish user experiences by interrupting navigation flows with error messages
  • Selenium provides automation to rapidly traverse and validate links at scale
  • The sample Java code covers each stage of the process: link extraction, validation and logging
  • Integrating link checking into regression suites keeps quality from slipping over time

From foundational understanding to copy-paste ready test code, this guide should serve as a starting point in eliminating broken links for good.

To take your Selenium skills even further, be sure to sign up for BrowserStack and gain instant access to a cloud platform for running tests across 2000+ real desktop and mobile browser environments. This allows scaling link validation to match real-world use cases.

Here's to frictionless website navigation thanks to test automation! Let us know in the comments what other aspects of functional testing you'd find helpful to cover.
