Google Test Automation Conference 2016

GTAC 2016 finished today. I must say it was one of the best test-oriented conferences I have been to.

Some highlights:

  • Single track, which eliminated regrets that the other session might have been more interesting.
  • Schedule loose enough to leave ample time for networking, talking to presenters, catching up on work, or just wandering around Google’s beautiful Tech Corners campus.
  • Mixing long (1-hour) sessions and short (15-minute) lightning talks kept the audience engaged. The Quirkier Side of Testing on day 1 and Code Coverage on day 2 were perfect after-lunch warm-ups to make sure we didn’t drift away.
  • Wide range of topics covered: from formal test analysis and fuzzing, through elaborate test frameworks, to diversity and the democratization of development.
  • Overall high level of presentations.
  • Top-notch organization, starting with the host Matt Lowrie (I sense Billy Crystal’s Academy Awards hosting was the inspiration here), through the AV team, the transcribers, the location, and so on.
  • Using sli.do for handling questions. I need to try it myself one day, too.

My favorite sessions:

  • Day 1:
    • Tanya Jenkins: “Automating Telepresence Robot Driving” – using LIDAR to test robots is a nerd’s dream come true. And Tanya genuinely enjoyed that experience.
    • Nikolai Abalov: “Selenium-based test automation for Windows and Windows Phone” – a sweet implementation of a WebDriver wrapper for Microsoft’s products. I recommend reviewing 2GIS’s other open-source projects beyond Winium.
    • Alexander Brauckman and Dan Hislop: ““Can you hear me?” – Surviving Audio Quality Testing” – who wouldn’t want to build a framework to test audio quality!
  • Day 2:
    • Emanuil Slavov: “Need for Speed – Accelerate Automation Tests From 3 Hours to 3 Minutes” – a systematic approach to lowering test execution time. It looked easy in Emanuil’s presentation, but one can only imagine the amount of brainpower needed to achieve a 150x reduction in test execution time – in reality they went from 180 minutes down to nearly one minute.
    • Kostya Serebryany: “Finding bugs in C++ libraries using LibFuzzer” – let the computers find bugs in the code! Also very nice of Google to offer to scan open-source projects.
    • Jonathan Abrahams: “How I learned to crash test a server” – a nice presentation of how MongoDB survives system crashes, and a great lesson on the interesting issues you can catch when deliberately breaking the system.

I highly recommend viewing these and other presentations that Google will post on the Google Tech Talks YouTube channel (stay tuned for an announcement on the Google Testing Blog).

 

Things that keep me awake at night – DevOps pipeline Quality Assurance

I am gathering my thoughts as I organize and explore this topic. Suggestions on where to learn more are welcome.

I’ll be simplifying things initially and plan to start by focusing on the pipeline and tools used in the DevOps culture. I expect that the deeper I dive into this topic, the more precise the next posts will become.

For simplicity, let’s assume we are dealing with the development of a cloud-based web application, with a DevOps toolchain that includes tools like Chef, Puppet, Jenkins, Docker, Packer, AWS, New Relic, Splunk… how do you test a deployment pipeline built on top of these?

I have to start somewhere. I know this: you can approach testing software by dividing the problem into separate areas, researching them, and executing any necessary actions, including finding and resolving issues. The result should hopefully be a high quality, or at least acceptable, product.

Let me try applying these areas to a DevOps toolchain, and list the questions and topics that emerge.

Functional
  • Is it working as expected? What does working as expected mean to you? To your stakeholders?
  • Do you have unit tests? Integration? End-to-end? How many are enough?
  • Do you need to do any manual testing after a pipeline step is executed? (a minimal smoke-check sketch follows this list)
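As an illustration of what an automated check after a deployment step could look like, here is a minimal smoke-check sketch in Python. The health URL, the timeout, and the expected “ok” marker are my own placeholder assumptions, not part of any particular toolchain:

import sys
import urllib.request

HEALTH_URL = "https://staging.example.com/healthz"  # hypothetical endpoint

def main() -> int:
    # a pipeline step could run this right after deploying a build
    try:
        with urllib.request.urlopen(HEALTH_URL, timeout=10) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            if resp.status == 200 and "ok" in body.lower():
                print("smoke check passed")
                return 0
            print("unexpected response:", resp.status, body[:100])
            return 1
    except Exception as exc:  # timeouts, DNS failures, TLS errors...
        print("smoke check failed:", exc)
        return 1

if __name__ == "__main__":
    sys.exit(main())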
Automation / Automatability / Testability
  • Are you going to automate the testing? Why yes? Why not? How much?
  • If yes – which tools will you use? Are they free? What alternatives do you have?
  • Is the toolchain automation-friendly? Was it created with automation in mind?
  • Is it testing-friendly in general? Do you have hooks / breakpoints to make it easy to test?
UI/UX
  • Is there a certain User Experience your DevOps tools should deliver?
  • Is the pipeline error-prone? Can somebody deploy a test build to production by mistake? Can they destroy your current production stack by clicking on a badly described button?
  • Do you need to support keyboard shortcuts? Arrow keys / tabs to navigate?
  • Does the UI support long/short inputs for build names, or components? High build numbers?
User Acceptance
  • Who is your customer? What acceptance do you need from them?
  • Would you do A/B testing for your pipeline?
Installation / Integration / System
  • What do you need to integrate with? For example – would you file JIRA tickets automatically if something goes wrong?
  • Do you need a database? Which version?
  • What operating system will your toolchain run on? What OS will you support for developing it?
  • When depending on a 3rd party – do you accept relying on their uptime? What if a critical cloud-based tool goes down when you urgently need to deploy a hotfix?
  • Will the 3rd party let you know of planned downtime? Is the downtime in a timezone suitable for you?
  • Do you have backups?
Compatibility
  • What platforms should you be compatible with? AWS? OpenStack? Azure? Are you going to test all of them?
  • If your pipeline is web-based – which browsers will you support? Can a bad rendering on Safari cause an error? What about strict Firefox security? What if the users are running Chrome with a JS-blocking extension?
  • Any potential compatibility issues between your tools? Should you test every new version with others?
Globalization
  • Do you have any dates or numbers showing up in the pipeline? 1.000 and 1,000 are not the same… same goes for 6/12/2016… (see the example after this list)
  • Monday is not the first day of the week for everybody. Do you care?
  • If you have user input – does it support non-ASCII characters? Does it have to?
  • Any of your users need a localized UI?
  • If some of your resources are outside your country – will you support them? What if part of the deployment needs a phone number, but it comes in an unfamiliar format from another country?
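To make the date ambiguity concrete, here is a tiny Python illustration (my own example) of how the same string means two different days depending on the format you assume:

from datetime import datetime

raw = "6/12/2016"
month_first = datetime.strptime(raw, "%m/%d/%Y")  # June 12, 2016
day_first = datetime.strptime(raw, "%d/%m/%Y")    # December 6, 2016
print(month_first.date(), day_first.date())       # 2016-06-12 2016-12-06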
Compliance
  • Are you required to comply with regulations like SOX or HIPAA? Can your DevOps toolchain and code ensure at least part of that compliance?
  • Are there any export regulations you might be violating with your DevOps code? What if a certain country requires that data be stored locally, but your tools deploy a server on a different continent?
Stress / Load / Performance
  • Can you deploy 10 servers simultaneously? What about 10000?
  • How long does it take to deploy the infrastructure? Is 1 hour acceptable? What if 10 minutes is too long?
  • Did anyone even define these requirements?
  • Do you track any of the performance metrics?
Security
  • Do you take any user input? Can a malicious user infect other users? Steal their passwords? Admin password?
  • Do you store sensitive data in your Jenkins jobs? Where do you store it securely?
  • How will you prevent users from committing their AWS credentials to public repositories? (a rough pre-commit sketch follows this list)
  • Do you remove all access when terminating employees?
  • Do you use access control? Do you audit user actions? Should you?
  • Who is really implementing security? Can a single engineer misconfigure the firewall on all your production servers?
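For the credentials question above, a rough sketch of a pre-commit check follows. It is only an illustration: the AKIA… pattern is the well-known shape of AWS access key IDs, while the hook wiring and file handling are simplified assumptions:

import re
import subprocess
import sys

AWS_KEY_ID = re.compile(r"AKIA[0-9A-Z]{16}")  # classic AWS access key ID shape

def staged_files():
    # list files added/copied/modified in the index, as a pre-commit hook would see them
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    )
    return [line for line in out.stdout.splitlines() if line]

def main() -> int:
    offenders = []
    for path in staged_files():
        try:
            with open(path, "r", errors="ignore") as handle:
                if AWS_KEY_ID.search(handle.read()):
                    offenders.append(path)
        except OSError:
            continue  # skip files that cannot be read
    if offenders:
        print("Possible AWS credentials found in:", ", ".join(offenders))
        return 1  # non-zero exit blocks the commit when used as a pre-commit hook
    return 0

if __name__ == "__main__":
    sys.exit(main())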
Supportability
  • Do you have enough logging to know why something went wrong? Do all 3rd party tools have enough logging?
  • Where are your logs?
  • Do you have alerts / notifications in place?
Configuration
  • What are the configuration options for your jobs?
Documentation
  • What documentation do you need? Do you have enough if somebody decides to leave abruptly or falls under a bus?
  • Any public-facing documentation you want to / have to share?
Adoption / Metrics and Instrumentation
  • Any metrics you want to track?
  • Do you need to add instrumentation to the jobs to know where the bottlenecks are?
Upgrade / Rollback
  • How will you test new versions of the tools? Are you ready to roll them back? Will they work after rollback?
Rollout strategy
  • What is your must-have vs nice-to-have? What tools depend on each other?
  • Can you define phases of your DevOps toolchain deployment?
Resources
  • Have you identified all the resources you need for testing?
  • What environmental resources, like hardware and software, do you need?
  • What about licenses? Any legal review of these needed?
  • Are you well staffed? Any training your engineers need?
Deliverables
  • Documentation, artifacts… what else do you need to deliver?
Vendor / 3rd party
  • When working with a vendor on your DevOps implementation: how much would you want them to test vs. you? What is their testing strategy? How much testing overlap should happen? What do they need to deliver?
Definition of done
  • When can you tell you are happy with the testing of the DevOps toolchain?
  • Do you need to sign off? Who else signs off?

 

These are just some initial thoughts. What do you think of these? What’s missing?

The Volcano of Software Testing

What happens when you combine the Test Automation Pyramid with Exploratory Testing concepts? You get the Volcano of Software Testing:

[Diagram: the Volcano of Software Testing]

The volcano represents the fact that both checking and exploration are part of the activity we call software testing.

Note:

  • The shape of the Test Automation Pyramid refers to healthy test ratios and relative importance. The “volcano ash” cloud follows that pattern:
    • It is at the top, as these tests are typically relevant to the end user, and can be relatively complex and costly.
    • In my opinion they are key to achieving customer satisfaction, so the cloud is large.
  • Using tools is not limited to automation. Manual testing should be supported by tools when necessary.
  • “Checking” can be both manual and automated.

Send me your thoughts.

Testing as risk-reducing activity

What is software testing? There are many ways to answer this question:

  • ISTQB: “Process of executing a program or application with the intent of finding the software bugs.”
  • Wikipedia (Cem Kaner would probably agree): “Investigation conducted to provide stakeholders with information about the quality of the product or service under test.”
  • James Bach: “Lighting the way of the project by evaluating the product.”

Here I offer another way of looking at what software testing can be: a risk-reducing activity.

With this approach, any unknown in the project shows up as a potential risk, and testing-related activities reduce that risk.

This approach can help you answer the “hard” questions of testing: what to test, and when have you tested enough?

Risk classification

Let’s start with classifying risk. The classic approach is to estimate the impact and likelihood of a given risk. For testing purposes it may be enough to assess both on a 3-point scale: low / medium / high. It is up to you to define what these mean for your organization. The loss of a single customer is a low impact in some cases, or a high one if that is your organization’s only client. Same for likelihood: you can try giving a specific percentage value, or just do a rough estimate.

This classification allows you to map risks to a heat map like this one (after OWASP):

Overall risk severity:

Impact \ Likelihood | LOW    | MEDIUM | HIGH
HIGH                | Medium | High   | Critical
MEDIUM              | Low    | Medium | High
LOW                 | Note   | Low    | Medium
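A minimal sketch in Python of the same mapping – the labels follow the heat map above, while the code itself is just my own illustration:

LEVELS = {"low": 0, "medium": 1, "high": 2}
SEVERITY = [
    ["Note", "Low", "Medium"],       # impact: low
    ["Low", "Medium", "High"],       # impact: medium
    ["Medium", "High", "Critical"],  # impact: high
]

def overall_severity(likelihood: str, impact: str) -> str:
    # index by impact (rows) and likelihood (columns), as in the heat map
    return SEVERITY[LEVELS[impact]][LEVELS[likelihood]]

print(overall_severity("high", "high"))   # Critical
print(overall_severity("low", "medium"))  # Low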
What to test?

Risk identification, a process of calling out risks, will help you with the “what to test” question. It does not have to be hard.

Based on my experience you may want to:

  • Learn from the past: look through past work and see what others have called out as risks in their projects. Also, were any issues missed in similar work before? These are worth calling out as risks you need to deal with in this project.
  • Cast a wide net: people across your organization (developers, product owners, fellow testers…) are great at coming up with potential risks.
  • Search for the unknowns: the Socratic “I know that I know nothing” is a good starting point for other risk-identifying tasks. Assumptions may have to be broken…

After you have identified the risks, you can assess them. I would encourage you to do it as a group exercise: in person, offline, using the Delphi method, etc.

The assessment will allow you to map the risks to a heat map and then prioritize the testing. You are most likely to start with items identified as critical risks.

A note: you will never be able to identify all the risks. But you will get better over time. Testing and test planning are risk-reducing on a meta-level, too.
When have I tested enough?

Risk management helps to answer this question. There are generally five things you can do with a risk:

  1. Mitigate
    Reduce it. For a team this typically means “test it”: an unknown becomes known, and the risk is lowered.
  2. Avoid
    Eliminate it. It could be through change of scope, requirements change, etc.
  3. Transfer
    Share it. This may not be obvious, but maybe you can have a third party certify your product and take responsibility for issues found later? Transfer can also be achieved through insurance (yes, there appears to be insurance that covers bugs).
  4. Accept
    Even if you have identified and assessed a risk, it does not always mean you need to do anything about it. Low risks can sometimes simply be ignored.
  5. Exploit (after DBP Management)
    Turn it into an advantage. Maybe you have identified a risk that your website will have to handle a huge load? But that is something you actually want, so you exploit the risk by making your website even more attractive, useful, and easy to reach.

So when do you know you have tested enough? When you have lowered the risk to an acceptable level. And what is an acceptable level? The answer to that question may be a risk you can transfer to your stakeholders :).

Let me know what your take on risk-reducing in software testing is.


 

Additional resources

Risk Assessment in Practice – easy to follow guide

Risk/Impact Probability Chart – risk heatmap example

Perceptions of Risk – Bruce Schneier on how people are bad at estimating risk; this should not discourage you from trying to reduce risks, though

10 Ways We Get the Odds Wrong – Psychology Today on how people are bad at estimating risk; read it to learn more about what your brain does with risks

Tester toolbox 101 – Developers

I like the developer-tester tension as much as the next person. I think it can be healthy, fun, and lead to improved quality: a prudent tester would make sure to have developers “in their toolbox”.

As with any other tool, it is important to know:

  1. How is it used? What benefits do I get from getting developers’ assistance in this context? Why would I want to ask for it?
  2. When to use it? What is the right time to ask?

Why would you ask for help?

  • To learn

Developers wrote the software you are testing, so you can at least hope they know how it works. And you probably want to know how it works, too. A developer friend would be happy to share their knowledge – not only about how they coded this or that module, but also about development in general, engineering, and other subjects (that’s how I learned about the best food in Hawaiʻi).

A great developer would also be happy to hear your feedback on their software, and in return give you hints on how to test it better.

  • To build tools

You can try writing all your tools yourself. Or you could get help: have your code reviewed, or get fresh ideas. If you are lucky, you may even have tools built for you.

  • To enjoy

It is great to make new friends, and celebrate success together. If you do not agree, I hope you at least like to learn (two bullet points up).

What is the right time to reach out?

  • When you are stuck

Asking for another person’s opinion is one of the best ways I know to get unstuck. They may come up with a fresh idea, or at least listen so that you can organize your thoughts.

  • When you need validation of your work

Unsure whether what you are doing is the right thing? Or maybe you did something awesome and want to share? Both are great opportunities to talk to your developers.

  • Choose your timing

Engineers tend to work best when they are “in the zone”. You do not want to break their focus; nobody likes it. If you see a busy developer – do not interrupt with your questions. The same goes for when they are staring into the distance (looking for inspiration), reading, sleeping… Unless you know exactly when the right time to approach this specific person is, it is best not to interrupt them at all.

Managing interruptions is probably part of your work environment culture. Some allow asynchronous requests (calendar invite, email, IM). Others use the headphones rule or some other sign that people are open to interaction.

  • Do your homework first

If you have not googled the solution to your problem, or have not checked the internal issue database – do it before approaching others to ask for help. Otherwise, the answer you get might be an lmgtfy.com link, and you will earn a lazy person’s reputation.

My rule of thumb: I set a time limit for myself to solve a problem. If I still have not solved it by then, I spend a little extra time pushing myself before reaching out for help. This approach may not work in every case, and estimating the right amount of time is hard. But when I do it, I use this extra effort to try to learn something new. Even if it does not lead to the solution, I don’t count it as wasted.

How do you get along with developers, then?

I know what does not work:

  • “Us” vs “Them” blame game – developing vs testing software is not a zero-sum game. Start thinking about the development team as the unit that delivers value, and you are more likely to help each other. You are all in the same boat, after all. So don’t waste time trying to blame this or that person for not catching the bug, or for introducing it. That does not mean you should not do Root Cause Analysis – you absolutely should, and use it to learn from your mistakes. But not to point fingers.
  • “Quality Police” approach – if a development team ships a buggy product, you all lose. But if you don’t ship at all, you lose, too. The customer is unhappy in both cases – either they got a broken product, or they got nothing at all. Focus on delivering value and try finding ways to move things forward.

What works:

  • Cover the basics – do your job. It may sound trivial, but I have seen many cases where a poorly written bug report led to unnecessary back-and-forth between developers and testers. So if your bug reports have to contain certain information (environment, reproducibility rate, etc.) – make sure you always provide it.
  • Find great bugs – everybody loves these. Interesting bugs are a signal that you care, that you spend time and energy testing others’ hard work.
  • Learn and respond to change – be flexible, be agile, adapt.
  • Small gestures – bake a cake, bring cookies, get some cool stickers… whatever makes your developers happy.

I hope you find this post useful. I would love to hear your opinions on this topic. Do leave a comment, whether you agree or not.

TL;DR

I am a nice tester, not a mindless bug reporting machine. If I am to change this image, I must first change myself. Developers are friends, not defect producers.

Bruce, the shark tester

Kanban is not for everyone

This post was inspired by the “Kanban: Successful Evolutionary Change for Your Technology Business” book by David J. Anderson.

My thoughts on Kanban and the book:

  • It is a great book, you should read it. I highly recommend it as introduction to Kanban.
  • Kanban is oriented toward consistent throughput. Scrum, on the other hand, focuses on feedback/improvement loops.
  • You can apply Kanban to complex projects, but it seems better suited to smaller applications.
  • The layered approach to larger jobs, described in the second part of the book, is less intuitive than Scrum, in my opinion. It is not that it would not work; handling such work just seems less complex with Scrum.
  • Kanban is a perfect fit for teams that focus on request-based work, where the scope is well understood.
  • Groups that it might fit best:
    • Sustaining Engineering/Maintenance Engineering
    • IT
    • Release Engineering
  • Limiting Work In Progress is crucial. Capping WIP is important to all agile methodologies.
  • Scrum and Kanban should borrow from each other. Standups, WIP limits, reviews, automation… are practices worth using regardless of the approach chosen.

Tester toolbox 101 – Fiddler


Fiddler is my favorite web debugging proxy. It is Windows-only, but in my opinion that justifies keeping a Windows VM just to be able to use it. I run VMware Fusion or Oracle VM VirtualBox on my Mac, with one of the Windows VMs dedicated to running Fiddler.

See below for some typical use cases of this tool.

Disclaimer: some steps described below may affect your computer’s or network’s security, so be sure you know what you are doing.

Monitor local applications

This works right out of the box. You may want to enable HTTPS inspection as one of the first things after starting the tool:

[Screenshot: Fiddler HTTPS decryption options]

Fiddler may prompt you to trust the certificate it generated. It is required for HTTPS inspection.

Fiddler2 requires additional steps to monitor Metro-style apps. But with Fiddler4 everything should just work automagically.

Monitor remote applications

To monitor traffic from other computers (like a Mac) you need to allow remote computers to connect in Fiddler’s options:

[Screenshot: Fiddler connections options]

Take note of the port Fiddler listens on, shown on the same options page (8888 in my case). The only other information you need is the IP address of the system running the proxy. Then set the other computer to use this IP and port as its proxy. Here’s how to do it on a Mac. At this point you should see all the remote traffic going through Fiddler.

For HTTPS inspection from remote computers, remember to export Fiddler’s root certificate and import it as a trusted root CA on the remote computer. Otherwise you will get security prompts, or your applications may refuse to contact their servers.

[Screenshot: exporting Fiddler’s root certificate]

iOS and Android

The setup here is similar to the remote applications case – you need to allow remote computers to connect. Next, install the CertMaker for iOS and Android add-on; after that you have to restart Fiddler. Once it comes back online, change your mobile device settings to use the Fiddler machine as a proxy. Last, visit the http://<Fiddler.machine.ipv4.address>:8888/ page from your mobile device and install the root certificate from this website.

Modify the traffic on the fly

FiddlerScript is very powerful. One useful case might be simulating server errors.

When I wanted to test whether my application handles server outages gracefully, I would add a rule to the OnBeforeResponse function that fakes a service issue:
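// inside FiddlerScript's OnBeforeResponse: force the response status code to 503 Service Unavailable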
oSession.oResponse.headers.HTTPResponseCode = 503;

Troubleshooting

One very common issue I have seen with Fiddler is that it may leave the proxy enabled on the local system even when it is not running. This may not break some applications (Firefox maintains its own proxy settings, for example) but will affect others (IE, Chrome, etc.). So if you see network-related issues when Fiddler is not running, check Control Panel > Internet Options and disable the proxy if needed.

[Screenshot: Internet Options proxy settings]

The add-ons

The list goes on and on. My favorites are:

  • CertMaker for iOS and Android – makes capturing mobile traffic easy
  • Syntax-Highlighting Add-Ons
  • Watcher – a Passive Security Auditor – generates a security report as you click around your web application

The book

Yes, there is a book. I have not read “Debugging with Fiddler” yet, but since The Man himself wrote it, it looks like a recommended read. The official documentation is also great.

By the way…

Other similar tools I have used that may suit your needs better:

  • ZAP – by OWASP, and has a Jenkins plugin
  • Burp – has awesome website mapping, and powerful security scanning built-in
  • Charles Proxy – works on Mac (Yay!), but is not free (Nay)

Tester toolbox 101 – Wireshark


Wireshark – Go Deep

Indeed, Wireshark allows you to inspect your network traffic in depth, and that may be very useful in testing. It may not be as easy to use as Fiddler, for example (more on that tool in the future), but at the same time it helps you learn the basics of networking. This is why I would start with Wireshark before moving on to other tools.

Installation

Easy on Windows systems: just get the 32-bit or 64-bit installer and follow the wizard (do remember to install WinPcap if you plan to capture on the local system).

OSX: I suggest using Homebrew rather than the official installers. If you want the UI – do remember to use the --with-qt option, like this: brew install wireshark --with-qt

Linux: use your favorite method.

WinPcap, and tcpdump

These go along with Wireshark like peanut butter and jelly. Both are based on the libpcap packet capture library and are used with Wireshark to capture network traffic. WinPcap is installed with the Windows version of Wireshark by default; tcpdump may require additional installation on OSX or Linux.

Capturing traffic with WinPcap is pretty much automatic when you are on Windows – all you have to do is use the Capture menu. But let me share my favorite, universal way of running tcpdump to capture traffic on Linux or OSX systems:

tcpdump -s0 -i eth0 -w dump.pcap

The options are:

  • -s0: dump the entire packet, important if you want to inspect the full payload
  • -i eth0: listen on the eth0 interface only, which will work in 80% of the cases
  • -w dump.pcap: dump the output to the dump.pcap file, you will open this file in Wireshark later

I would typically run tcpdump only for a brief period of time, to capture the traffic I am interested in. But if you cannot trigger the specific network event at will, you may have to filter tcpdump further by specifying a port, limiting the amount of payload captured, etc.

After I have captured the data, I copy it over to my local system with something like scp for further analysis.

Use cases

The typical Wireshark use case for a tester is better understanding the application under test by seeing what actually gets sent on the wire. This increases your knowledge of the application, may enable you to find interesting bugs, or simply helps you become an expert overall.

Another case is any sort of security testing: for example, verifying that all traffic is sent over a secure channel, or looking for ways to exploit the system.
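Here is a rough sketch of such a check (my own illustration in Python, assuming the scapy library is installed and a dump.pcap capture is at hand). It only looks at destination ports, so it is a heuristic rather than a full TLS verification:

from scapy.all import TCP, rdpcap

packets = rdpcap("dump.pcap")
plaintext = [p for p in packets if p.haslayer(TCP) and p[TCP].dport == 80]
print(len(plaintext), "packets sent to port 80 (expect 0 for HTTPS-only traffic)")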

Some cases where it helped me: investigating Windows Phone 8 certificate-based authentication or debugging an iOS application.


Assumptions testing

A failed browser extension release we did recently reminded me of an interesting presentation a job applicant once gave. A short talk on any topic was part of the hiring process for our team back then, and this candidate decided to talk about assumptions testing. I don’t remember the exact slides or words anymore, but I do remember the spirit of the message, and I will try to describe it here.

This person defined assumptions testing as a technique in which you deliberately try to identify the expectations you have about the object of your testing, and then try breaking those assumptions in the hope of identifying issues with the item under test.

There are two types of assumptions you can make in this context:

  • Explicit assumptions – known and declared. Defined product limitations, for example. They may appear as statements about what a user would never do. A tester performing assumptions testing would naturally not only ask “why do we believe a user would never do this?”, but would actually perform that very action. The idea behind such a test is to try to catch serious product malfunctions before they can do real harm. An example that comes to mind is plane doors: one may assume that no passenger would ever attempt to open them in flight, but it happens, many times per year. So a prudent assumptions tester would come up with a scenario to test exactly that.
  • Implicit assumptions – unknown and undeclared. They could be related to the engineering culture at your company, or to your individual experience. These may be harder to identify than the explicit ones, but there are methods that can help:
    • Peer reviews of test plans and test cases: especially valuable when the reviewer is not directly involved in the project, as they are less likely to make assumptions about the solution and more likely to ask questions that break them.
    • Templates: I am not a big fan of documentation, but in some cases you can prepare templates that trigger questions similar to the ones a peer reviewer would ask.

Not breaking either type of assumption may be equally catastrophic. In our case of the failed browser extension release, the assumption was an implicit one: this browser extension had never had any issues with upgrading, so testing the upgrade scenario never came to our minds. The fresh installation we tested worked well, but we broke the extension for every user who upgraded it. Lesson learned.
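If I were to encode that lesson as an automated check, it might look something like this sketch in Python (pytest-style); install_version, upgrade_to, and extension_works are hypothetical helpers standing in for whatever your product actually needs:

from extension_harness import install_version, upgrade_to, extension_works  # hypothetical helpers

PREVIOUS = "1.4.0"   # placeholder version numbers
CANDIDATE = "1.5.0"

def test_fresh_install():
    install_version(CANDIDATE)
    assert extension_works()

def test_upgrade_from_previous_release():
    # the scenario we had implicitly assumed would "just work"
    install_version(PREVIOUS)
    upgrade_to(CANDIDATE)
    assert extension_works()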
