One of the key reasons for automating tests is to avoid spending time on repetitive tasks that tools can complete without human intervention. Automation can be one of the most effective tools in your toolbox, but it is not a silver bullet that will solve every problem and improve quality on its own. Automation tools are obedient servants; as testers, we need to become their masters and use them properly to realize their full potential. Automation tools are only as good as the way we use them. Converting manual test cases into automated ones is not the best use of these tools; they can be used in far more effective ways.
Creating a robust and useful test automation framework is a very difficult task. In the web world it becomes even more difficult, because things might change overnight. So-called best practices of automation taken from stable desktop applications are often unsuitable for a web environment, and following them will probably have a negative impact on the project’s quality.
Many problems in the web world are identical from one application to the next. For example, irrespective of the web application, we always need to validate things such as the presence of a title on every page. Depending on your context, this may also include the presence of metadata on every page, the presence of tracking code, the presence of ad code, the size and number of advertising units, and so on.
The solution presented in this article can be used to validate all, or any, of the rules mentioned above across all the pages of any domain or website. We were given a mandate to ensure that a specific tracking code is present on all the pages of a big website. In true agile fashion, once this problem was solved, the solution was extended and refactored to check many rules on all the pages.
This solution was developed using Selenium Remote Control (Selenium RC), with Python as the scripting language. One of the main reasons for using a tool such as Selenium RC is that it lets us write tests in a mainstream programming language of our choice, which gives us the full power of that language and its libraries. For this solution, a Python library called Beautiful Soup was used to parse HTML pages. The solution was later ported to another tool, Twill, to make it faster. Since the initial code was already written in Python, converting it to Twill was a piece of cake.
Essentially, this solution is a small web crawler that visits all the pages of a website and validates certain rules on each of them. As mentioned earlier, the problem statement is very simple: “Validate certain rules on every webpage for any given website”. To achieve this, the following steps were followed:
1. Open the start page of the website
2. Get all the links on the page
3. Take the first link; if it is not external and the crawler has not visited it yet, open it
4. Get the page source
5. Validate all the rules you want to check on this page
6. Repeat steps 1 to 5 for all the pages
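The steps above can be sketched as a short crawl loop. This is only an illustrative outline, not the original script: it uses Python's standard library instead of Selenium RC, Beautiful Soup, or Twill, and the names `LinkCollector`, `fetch`, `crawl`, and `max_pages` are assumptions made for this example. It also validates each page before collecting its links, which visits the same set of pages as the steps listed above.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def fetch(url):
    """Hypothetical default fetcher: returns the page source as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, validate, fetch_page=fetch, max_pages=500):
    """Visit every internal page reachable from start_url.

    validate(url, source) should return a list of rule failures.
    Returns a dict mapping each visited URL to that list.
    """
    domain = urlparse(start_url).netloc
    queue = deque([start_url])
    visited = set()
    results = {}

    while queue and len(visited) < max_pages:
        url = queue.popleft()
        if url in visited:
            continue  # the crawler has already been here
        visited.add(url)

        source = fetch_page(url)              # get the page source
        results[url] = validate(url, source)  # check the rules on this page

        collector = LinkCollector()
        collector.feed(source)                # get all the links
        for href in collector.links:
            absolute, _fragment = urldefrag(urljoin(url, href))
            parsed = urlparse(absolute)
            is_web_link = parsed.scheme in ("http", "https")
            is_internal = parsed.netloc == domain
            if is_web_link and is_internal and absolute not in visited:
                queue.append(absolute)

    return results
```

Passing the fetcher in as `fetch_page` is a design choice for this sketch: it lets the same loop run against a real site, a different HTTP tool, or a dictionary of canned pages in a unit test.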
It is worth mentioning that the rules this framework can validate are those that can be checked by looking at the source code of the page. Some of the rules that can be validated using this script are:
- Make sure that a title is present on all the pages and is not generic
- Check for the presence of meta tags such as keywords and description on all the pages
- Ensure that instrumentation code is present on all the pages
- Ensure that every image has alternative text associated with it
- Ensure that ad code is coming from the right server and has all the relevant information we need
- Ensure that the banners and skyscrapers used for advertisements are the correct size
- Ensure that every page except the home page contains at least two and no more than four advertisements
- Ensure that the master CSS is applied on all the pages for a given domain
- Make sure that all the styles come from the CSS files and that no inline styles are set on any element of a web page
The list above should give you some idea of what can be achieved using this approach. It can be extended very easily and is limited only by your imagination.
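To make a few of these rules concrete, here is a rough sketch of how source-level checks might look in Python. It is not the original script: it uses the standard library's `html.parser` rather than Beautiful Soup, and `PageFacts`, `validate`, `GENERIC_TITLES`, and `TRACKING_SNIPPET` are hypothetical names and values chosen for illustration. It covers only the title, meta tag, image text, inline style, and tracking code rules.

```python
from html.parser import HTMLParser

# Assumed example values; replace them with whatever your site uses.
GENERIC_TITLES = {"", "untitled", "untitled document", "home"}
TRACKING_SNIPPET = "tracker.example.test/track.js"


class PageFacts(HTMLParser):
    """Gathers the facts about a page that the rules below inspect."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta_names = set()
        self.images_without_alt = 0
        self.inline_styles = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name"):
            self.meta_names.add(attrs["name"].lower())
        elif tag == "img" and not (attrs.get("alt") or "").strip():
            self.images_without_alt += 1
        if "style" in attrs:
            self.inline_styles += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def validate(url, source):
    """Return a list of human-readable rule failures for one page."""
    facts = PageFacts()
    facts.feed(source)
    failures = []

    if facts.title.strip().lower() in GENERIC_TITLES:
        failures.append("missing or generic title")
    for name in ("keywords", "description"):
        if name not in facts.meta_names:
            failures.append(f"missing meta {name}")
    if facts.images_without_alt:
        failures.append(f"{facts.images_without_alt} image(s) without alt text")
    if facts.inline_styles:
        failures.append(f"{facts.inline_styles} element(s) with inline style")
    if TRACKING_SNIPPET not in source:
        failures.append("tracking code not found")

    return failures
```

Each rule is a small, independent check that appends a message to the list, so adding a rule such as counting ad units means adding one more fact to collect and one more `if` statement.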
In the next article, we will look at the code snippets from the script itself and explain how easily these rules can be customized and validated across all the pages of any given domain.