Friday, 29 October 2010

5 mins or 30 mins, that's the question...

Last week I commuted to work by train, and that inspired this blog post.
So what's the story?

Over the last couple of weeks our national railway company has apparently had a lot of trouble keeping to its timetable (and in my opinion still does, but that is another story). Every single day for the past few weeks my train has been delayed by 5 minutes, and that made me tweet the following:

"I find it more annoying to have a 5 min delay every day than to have 30 mins delay once in a while. "

and after that:

"I guess my last tweet about train travel, could very well apply to software testing too??" and "What do you think: Less annoying to have a huge bug once in a while, than numerous ones almost constantly"

I got the following responses:

@santhoshst: "True - I related it to Performance Quality criteria :) #softwaretesting"

@santhoshst: "Depends on the context :)"

@jahoving: "but even more annoying to have no bugs at all ;-)"

@eddybruin: "1 huge bug is better manageable than 100's of inconvenient little bugs"

There were not many replies, but each of them made me think about the statement some more. I have written down the thoughts and questions I had, and I would love you to respond to them!

- No bugs at all can mean a couple of things (amongst others): 1. the utopian perfect programmer has arrived to implement the utopian perfect analyst's and designer's work; 2. you have not tested the right part of the software, are not performing the right tests, or have automated your tests so that they only check things, leaving the really important bugs unfound.

- It's annoying to get bug reports constantly. It's better to report at set intervals than to come running to the programmer, manager, etc. with every single bug you find. The 'huge bug rule' can apply here too: a huge bug can be reported immediately (it probably has to be fixed with high priority as well), but all those little ones? Report them once every couple of days; otherwise you interrupt the programmer's work (and attention) unnecessarily. It will also cloud a manager's view: when you come to him constantly with every bug, the really important ones will not seem as important as they really are (the famous fable of 'The Boy Who Cried Wolf').

- Now look at performance issues. A 7-second delay on every request, for example, can be regarded as a minor issue (within the margin), while a 1-minute delay is seen as a problem. To a user, 7 seconds can seem like an eternity when you're on a deadline, especially when you have hundreds of requests to work through every day. A one-minute delay once in a while can give you some coffee time; 7 seconds on each request is simply a pain in the ...

- Of course the context has to be taken into consideration, as mentioned by santhoshst. Numerous bugs, even small ones, can also be very 'dangerous' when they are in a critical part of a system.

- I wondered whether this one is true: 1 bug is better manageable than 100's of little ones. I can think of a bug with a very complex background: getting it solved took a lot of time, management, politics and redesign. In that same period at least 20 other bugs with minor priority were fixed and closed. Keeping an overview of a lot of smaller bugs can be complex, but it's the really big ones that can push the limits of your skills. This also depends, mind you, on the role and tasks you have within your organization. When your task is only to report the bug and retest it once it's fixed, then the statement could well be true. When you also have to 'guide' the defect through the rework process, it will probably be more like what I described above.

- If the timetable at station X were adjusted to a departure 5 minutes later, there wouldn't be a defect/bug. I wonder why they don't do this, because at the next station the train waits almost ten minutes before it leaves. So there is margin to set the departure time at station X a bit later. And here comes a statement that is, I know, a bit unconventional: specifications can be changed to 'solve' a bug (depending on the context, severity, etc.).
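
To put rough numbers on the performance point above (7 seconds on every request versus a 1-minute delay once in a while), here is a small back-of-the-envelope sketch. The 300 requests per day and the single long delay per day are my own assumptions for illustration; only the 7 seconds and the 1 minute come from the text.

```python
# Back-of-the-envelope comparison of total waiting time per working day.
# REQUESTS_PER_DAY and LONG_DELAYS_PER_DAY are assumed values for illustration;
# the post only says "hundreds of requests" and "once in a while".
SMALL_DELAY_S = 7          # delay on every request (from the post)
LONG_DELAY_S = 60          # occasional long delay (from the post)
REQUESTS_PER_DAY = 300     # assumption
LONG_DELAYS_PER_DAY = 1    # assumption


def total_delay_minutes(delay_seconds: float, occurrences: int) -> float:
    """Total time lost in minutes for a delay that occurs `occurrences` times."""
    return delay_seconds * occurrences / 60


small_total = total_delay_minutes(SMALL_DELAY_S, REQUESTS_PER_DAY)
long_total = total_delay_minutes(LONG_DELAY_S, LONG_DELAYS_PER_DAY)
print(f"7 s on every request: {small_total:.0f} min per day")
print(f"1 min once a day:     {long_total:.0f} min per day")
```

Under these assumptions the 'small' delay adds up to 35 minutes a day, against 1 minute for the occasional long one.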

I'm very eager to see your replies. I so love discussions! And, needless to say, please reply in one big one instead of lots of smaller ones :-)

Friday, 8 October 2010

Continuous Quality Process Software- and Systems (CQPSS)

[Blog entry from CappingITOff]

I was not always into testing. My first choice of studies, before I became completely intrigued by IT, was Food Engineering. When I got into testing, I was surprised to find that in IT, testing is mostly done during development or when changes are made to the software or system. In the food industry there is a continuous quality process during development and during operations. It is called HACCP, the abbreviation of Hazard Analysis and Critical Control Points. HACCP - and I quote Wikipedia here - is a systematic preventive approach to food safety and pharmaceutical safety that addresses physical, chemical, and biological hazards as a means of prevention rather than finished product inspection. HACCP is used in the food industry to identify potential food safety hazards, so that key actions, known as Critical Control Points (CCPs) can be taken to reduce or eliminate the risk of the hazards being realized. The system is used at all stages of food production and preparation processes including packaging, distribution, etc.

I started to wonder why no such process is implemented in IT (testing); our business is risk mitigation, isn't it? IT is becoming more and more essential (or already is) in our business processes and daily lives. Failure has such a huge impact that I find it scary not to have constant monitoring of IT solutions. It is well known that testing during development cannot cover 100% of a system or its software, so some flaws will remain that could mean disaster for your business...

So I came up with the Continuous Quality Process Software- and Systems, or CQPSS for short (because IT likes its abbreviations). Of course I used the seven basic principles of HACCP as my baseline, so let's look at those principles, which I will map to CQPSS.

Principle 1: Conduct a hazard analysis.
Plans determine the food safety hazards and identify the preventive measures the plan can apply to control these hazards. A food safety hazard is any biological, chemical, or physical property that may cause a food to be unsafe for human consumption.

In CQPSS it's almost the same. Every business has critical processes, and the CQPSS plan should describe them. When testing during development is done properly, a risk analysis should already exist and can be reused. Of course it should be updated when changes occur.

Principle 2: Identify critical control points.
A Critical Control Point (CCP) is a point, step, or procedure in a food manufacturing process at which control can be applied and, as a result, a food safety hazard can be prevented, eliminated, or reduced to an acceptable level.

In CQPSS one should look at the process that has been automated and determine the points, steps or procedures where a check can be performed. Preferably these checks are designed so that they can be automated. Checks should be done at various points in the process and not only on the outcome. For example: in a data warehouse chain one should not only check the reporting, but also staging, the calculation outcomes and the data warehouse itself.
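
A minimal sketch of what 'checks at various points in the process' could look like. The stage names follow the data warehouse example, but the transformations and checks themselves are hypothetical placeholders that I made up for illustration, not part of any real tool.

```python
from typing import Any, Callable, List, Optional, Tuple

# A stage is (name, transform, check). All transforms and checks below are
# hypothetical placeholders, not taken from a real data warehouse.
Stage = Tuple[str, Callable[[Any], Any], Callable[[Any], bool]]


def run_chain(data: Any, stages: List[Stage]) -> Optional[str]:
    """Run the stages in order and return the name of the first stage whose
    checkpoint fails, or None when every checkpoint passes."""
    for name, transform, check in stages:
        data = transform(data)
        if not check(data):
            return name
    return None


STAGES: List[Stage] = [
    # Staging: drop empty rows; checkpoint: something must be left to load.
    ("staging", lambda rows: [r for r in rows if r is not None],
     lambda rows: len(rows) > 0),
    # Calculation: sum the amounts; checkpoint: total may not be negative.
    ("calculation", lambda rows: sum(rows), lambda total: total >= 0),
    # Reporting: render the report line; checkpoint: expected format.
    ("reporting", lambda total: f"Total: {total}",
     lambda text: text.startswith("Total: ")),
]

print(run_chain([10, None, 20], STAGES))  # every checkpoint passes
print(run_chain([10, -50], STAGES))       # stops at the calculation checkpoint
```

Because every stage has its own checkpoint, a problem is caught where it arises instead of only showing up (or not) in the final report.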

Principle 3: Establish critical limits for each critical control point.
A critical limit is the maximum or minimum value to which a physical, biological, or chemical hazard must be controlled at a critical control point to prevent, eliminate, or reduce to an acceptable level.

Every checkpoint from principle 2 must have critical limits assigned. Suppose, for instance, that the source systems normally report an incoming cash flow of 60K and that 130K is highly unlikely. Then the system should have a critical boundary at 110K or 120K at the point of staging. When that boundary is reached, processing should be stopped, or at least paused, and the monitoring system should issue a warning so that a business expert can check whether the cash flow is due to a frantic hype or whether some system has perhaps delivered a batch of data twice.
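
As a sketch of such a critical limit, the check at the staging point could look something like this. The limit of 110K is one of the two values from the example; the function and status names are hypothetical.

```python
from enum import Enum
from typing import Tuple


class Status(Enum):
    OK = "ok"
    PAUSED = "paused"


# Critical boundary from the example (110K); 120K would work the same way.
CRITICAL_LIMIT = 110_000


def check_cashflow(amount: float, limit: float = CRITICAL_LIMIT) -> Tuple[Status, str]:
    """Pause processing and produce a warning when the critical limit is reached."""
    if amount >= limit:
        warning = f"WARNING: cash flow {amount} reached critical limit {limit}"
        return Status.PAUSED, warning
    return Status.OK, ""


print(check_cashflow(60_000))   # normal cash flow: processing continues
print(check_cashflow(130_000))  # highly unlikely value: pause and warn
```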

Principle 4: Establish critical control point monitoring requirements.
Monitoring activities are necessary to ensure that the process is under control at each critical control point. In the United States, the FSIS is requiring that each monitoring procedure and its frequency be listed in the HACCP plan.

The output of principles 2 and 3 is used here: this step establishes HOW and HOW OFTEN these points are validated. For instance, it can be specified that the cash flow calculation check from principle 3 is run at end-of-day, every day, at the batch load point, with a certain formula and a certain tool.
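
The HOW and HOW OFTEN could be written down as data, so that a scheduler can act on it. This is only a sketch: the field names, the 23:00 end-of-day time and the tool name are assumptions of mine.

```python
from dataclasses import dataclass
from datetime import date, datetime, time
from typing import Optional


@dataclass(frozen=True)
class MonitoringRule:
    checkpoint: str   # WHERE the check is done
    method: str       # HOW: the formula that is applied
    tool: str         # HOW: the tool that runs it
    run_at: time      # HOW OFTEN: once a day, at or after this time


def is_due(rule: MonitoringRule, last_run: Optional[date], now: datetime) -> bool:
    """A daily rule is due once the run time has passed and it has not run today."""
    not_run_today = last_run is None or last_run < now.date()
    return now.time() >= rule.run_at and not_run_today


# Hypothetical rule for the cash flow example; 23:00 stands in for "end-of-day".
cashflow_rule = MonitoringRule(
    checkpoint="batch load point",
    method="sum of staged cash flow compared with the critical limit",
    tool="hypothetical-check-tool",
    run_at=time(23, 0),
)

print(is_due(cashflow_rule, date(2010, 10, 7), datetime(2010, 10, 8, 23, 30)))
```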

Principle 5: Establish corrective actions.

These are actions to be taken when monitoring indicates a deviation from an established critical limit. The final rule requires a plant's HACCP plan to identify the corrective actions to be taken if a critical limit is not met. Corrective actions are intended to ensure that no product injurious to health or otherwise adulterated as a result of the deviation enters commerce.

Again a reference to principle 3, where I stated that, for example, the system should issue a warning when the critical boundary is reached, and that a business expert should then check the cause. Principle 5 states explicitly what the corrective actions on this warning should be. In this case they could be: stop the load process and have a business expert check the cause; in case of a frantic hype -> continue batch processing; in case of a duplicate batch -> delete the batch from the flow and issue a warning to the delivering system.
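
The corrective actions from this example can be sketched as a simple lookup from the cause (as established by the business expert) to the steps that follow. The enum and function names are hypothetical; the actions themselves are the ones listed above.

```python
from enum import Enum
from typing import List


class Cause(Enum):
    FRANTIC_HYPE = "frantic hype"        # the high cash flow is genuine
    DUPLICATE_BATCH = "duplicate batch"  # a system delivered its data twice


def corrective_actions(cause: Cause) -> List[str]:
    """Return the ordered corrective actions for a critical-limit warning."""
    # These two steps always come first, whatever the cause turns out to be.
    actions = ["stop load process", "business expert check"]
    if cause is Cause.FRANTIC_HYPE:
        actions.append("continue batch processing")
    elif cause is Cause.DUPLICATE_BATCH:
        actions.append("delete batch from flow")
        actions.append("issue warning to delivering system")
    return actions


for cause in Cause:
    print(cause.value, "->", corrective_actions(cause))
```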

Principle 6: Establish record keeping procedures.
The HACCP regulation requires that all plants maintain certain documents, including its hazard analysis and written HACCP plan, and records documenting the monitoring of critical control points, critical limits, verification activities, and the handling of processing deviations.

In CQPSS this means that the written CQPSS plan is published within the organisation and known to all stakeholders of the process it describes, and that everybody is familiar with the actions to be taken. The results of the monitoring process should be archived as well; how to do this and how long the data is to be kept should also be described in the CQPSS plan. I would like to mention that it is especially important to highlight the deviations/exceptions from the process (or to record them separately in a specific overview). This way any changes in frequency or other anomalies in the process can be specifically monitored and acted upon; perhaps even a prediction can be made and adjustments (change requests) can be issued.
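
A small sketch of such a separate deviation overview: every monitoring result is recorded, and the deviations are counted per month so that a change in frequency stands out. The record layout and the sample values are made up for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Tuple


@dataclass(frozen=True)
class MonitoringRecord:
    day: date
    checkpoint: str
    value: float
    deviation: bool  # True when a critical limit was not met


def deviations_per_month(records: Iterable[MonitoringRecord]) -> Counter:
    """Count deviations per (year, month) to make frequency changes visible."""
    return Counter((r.day.year, r.day.month) for r in records if r.deviation)


# Made-up sample archive for the staging checkpoint.
archive = [
    MonitoringRecord(date(2010, 9, 3), "staging", 61_000, False),
    MonitoringRecord(date(2010, 9, 17), "staging", 125_000, True),
    MonitoringRecord(date(2010, 10, 1), "staging", 118_000, True),
    MonitoringRecord(date(2010, 10, 5), "staging", 131_000, True),
]

for month, count in sorted(deviations_per_month(archive).items()):
    print(month, count)
```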

Principle 7: Establish procedures for ensuring the HACCP system is working as intended.
Validation ensures that the plants do what they were designed to do; that is, they are successful in ensuring the production of safe product. Plants will be required to validate their own HACCP plans. FSIS will not approve HACCP plans in advance, but will review them for conformance with the final rule.

There are various monitoring and test tools that can also be installed on a production environment for monitoring purposes. These tools can be outfitted with specific test cases and checkpoints. When a fully automated process is in place, control of the process is somewhat 'out of sight'. When this process is not working correctly, it will never be noticed unless regular checks on the process are in place. The CQPSS plan and its implemented process should be checked at regular intervals, preferably by an independent party, to see whether the quality process is still working as intended. The relevance of the process should also be validated: are the checks being performed still covering all the risks that ought to be covered?
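
One possible way to check the checks, sketched here under my own assumptions: feed a monitoring check one sample that is known to be bad and one that is known to be good, and verify that it flags only the bad one. The function names and sample values are hypothetical.

```python
from typing import Callable


def exceeds_limit(amount: float, limit: float = 110_000) -> bool:
    """The production check under test: True means 'raise a warning'."""
    return amount >= limit


def monitoring_self_test(check: Callable[[float], bool],
                         known_bad: float, known_good: float) -> bool:
    """Return True only if the check flags the bad sample and passes the good one."""
    return check(known_bad) and not check(known_good)


def silent_check(amount: float) -> bool:
    """A check that has silently stopped working: it never warns."""
    return False


print(monitoring_self_test(exceeds_limit, known_bad=130_000, known_good=60_000))
print(monitoring_self_test(silent_check, known_bad=130_000, known_good=60_000))
```

A self-test like this only shows that a check still fires; whether the checks still cover the right risks remains a question for the periodic review.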

By implementing a process like CQPSS, the quality of the system is measured not only during the development phase (traditional testing) but throughout the complete lifecycle of the product/process. It paves the way to a safer, more reliable and more trustworthy business process in which the monitored IT component is implemented and, last but not least, makes it easier to apply for (inter)national certification of your business.