Web Accessibility

Data Processing and Exploration

Pie chart describing average error type percentage per webpage. Structure errors make up 16.6%, ARIA errors make up 51.5%, Feature errors make up 13.7%, Alert errors make up 8.64%, Contrast errors make up 5.88%, and general errors make up 3.7%.

Goal:

The purpose of this project was to analyze the web accessibility errors present on the most commonly visited websites and to learn more about web accessibility overall.

Methods:

To complete this project, a list of the top 1M most visited websites was collected from Tranco, and the top 100 websites (and eventually 36 more) were submitted to the WAVE API. The API output, a dataset including the webpage name, the types of errors found on each page, the count of each error type, and so on, was then cleaned into usable data. Once the data was cleaned, visualizations were made to better understand it. This data exploration project was completed with skills learned from Dr. Yu's Udemy course, '100 Days of Code: The Complete Python Pro Bootcamp', and the software used was PyCharm and Google Colab.
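The aggregation behind the pie chart could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the per-page record shape (a page name mapped to category counts), the category keys, and the function names are all assumptions, and the sample counts are made up for demonstration.

```python
from collections import defaultdict

# Hypothetical category keys, one per slice of the pie chart.
CATEGORIES = ["error", "contrast", "alert", "feature", "structure", "aria"]


def category_shares(counts):
    """Return each category's percentage of one page's total item count."""
    total = sum(counts.get(cat, 0) for cat in CATEGORIES)
    if total == 0:
        # A page with no flagged items has no meaningful breakdown.
        return None
    return {cat: 100.0 * counts.get(cat, 0) / total for cat in CATEGORIES}


def average_shares(pages):
    """Average the per-page percentages across all pages that have items."""
    sums = defaultdict(float)
    used = 0
    for counts in pages.values():
        shares = category_shares(counts)
        if shares is None:
            continue
        used += 1
        for cat, pct in shares.items():
            sums[cat] += pct
    if used == 0:
        return {}
    return {cat: sums[cat] / used for cat in CATEGORIES}


# Made-up sample data, for illustration only.
sample_pages = {
    "example.com": {"error": 1, "contrast": 1, "alert": 2,
                    "feature": 2, "structure": 4, "aria": 10},
    "example.org": {"error": 0, "contrast": 2, "alert": 1,
                    "feature": 3, "structure": 4, "aria": 10},
}

averages = average_shares(sample_pages)
```

Averaging per-page percentages, rather than pooling raw counts, keeps a single page with thousands of flagged items from dominating the result. The resulting dictionary could then be handed to a plotting library such as matplotlib to draw the chart.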

Challenges:

This project's greatest challenge was collecting enough data to gain meaningful insights. The WAVE API gave 100 credits upon creation of an account, so I used only the top 136 websites. Because 36 of the top 100 websites blocked the API and therefore did not use up credits, I fed the API the URLs ranked 101, 102, ..., 136 until I had used all 100 credits. With such a limited data pool, it was difficult to pull meaningful information out of the dataset.
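The credit-limited collection described above can be illustrated with a short sketch. It assumes a hypothetical `analyze` callable that returns a result dictionary on success or `None` when a site blocks the request; the real WAVE API call is not shown, and the stub below only simulates blocked sites.

```python
def collect_reports(ranked_urls, analyze, credits):
    """Walk the ranked list, spending one credit per successful analysis."""
    reports = {}
    for url in ranked_urls:
        if credits == 0:
            break
        result = analyze(url)
        if result is None:
            # Blocked request: no credit is spent, move on to the next site.
            continue
        reports[url] = result
        credits -= 1
    return reports


# Stub for illustration: pretend every third site blocks the request.
def fake_analyze(url):
    rank = int(url.split("-")[1])
    return None if rank % 3 == 0 else {"url": url}


urls = [f"site-{rank}" for rank in range(1, 21)]
reports = collect_reports(urls, fake_analyze, credits=5)
```

In this toy run, sites 3 and 6 are skipped without cost, so the five credits are spent on sites 1, 2, 4, 5, and 7.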

Tools:

  • Languages: Python
  • IDEs: PyCharm and Google Colab
  • Git repository: GitHub

Sources:

Tranco and WAVE API

Improvements:

If I were to continue this project, I would buy more credits to analyze more webpages with the WAVE API, and I would learn more about web accessibility in general.

GitHub:

See this project on GitHub