Over the years, the spreadsheets circulating in a testing or manufacturing company can become a real zoo. This presents a regulatory and client-relevant problem. It is important that you take a risk-based approach and validate your spreadsheets according to their use.
A spreadsheet used for dosing, data reporting, or dilutions may be a “legacy document”: a digital file developed years earlier by someone who may no longer even work for you. It may have been developed in a previous version of Excel. Its workings may include macros that the current users only vaguely understand. Or it may have functions and hidden rows that have not been examined in years.
Although configurable off-the-shelf spreadsheet software such as Excel has many good security and data integrity features, you must validate spreadsheets according to a defined procedure. Regulators and clients increasingly expect this. This article will show you how to get started.
Define the three responsibilities involved: developer, verifier, and quality assurance
Developer
The developer is the person who creates the spreadsheet or who initiates the validation of an existing spreadsheet. This person will be familiar with the use of the spreadsheet and with Excel (or whatever spreadsheet software you use).
The developer is responsible for understanding the requirements in the SOP that defines the development, use and control of spreadsheets. He or she will complete a form, ideally a checklist, that refers to each of these requirements. He or she will also document the spreadsheet’s inputs, processing, and outputs for verification (see below).
This is the lengthiest and most involved step of the three. Making it easier is your job. You can help by making sure the checklist is user-friendly and flows nicely with the SOP!
Verifier
Like the developer, the verifier will be someone who is familiar with Excel and with the context the spreadsheet will be used in. For example, this person may be a technician who uses the spreadsheet every day, a peer of the developer, or an Excel expert within the company.
To save space on the form and to show the verification clearly, you can simply include an additional column where the verifier adds a check mark indicating each requirement was met.
Another term for verifier is simply technical reviewer.
QA
When this verification step is done, QA can serve as the final sign-off. This may include spot-checks, format review, a wiping of personal identifying information and comment history, and administering a final document password that will only be distributed to authorized users.
If problems come up during the spot check, the QA reviewer will take a closer look and possibly reject the document, kicking it back to the developer, who must address it and get the spreadsheet verified again. The QA reviewer should not sign until all issues have been resolved.
When these three people have done their work, you have a validated spreadsheet. The overall process is called spreadsheet validation. (You may use different terms from the ones I have described, of course.)
Break it down into three steps: inputs, processing, and outputs
I recommend this approach because it reminds you to visualize the data flow in three parts. On the left is raw data such as a column of animal weights in grams.
On the right are the outputs, such as the required dose of a test article extract in milliliters.
In the middle is the processing. In our example, the individual animal’s weight is multiplied by a specific factor (according to the protocol or SOP) and then presented in the correct unit. The spreadsheet may also provide the group’s average weight, highlight any animals whose weight is outside a defined range, and present the highest and lowest weights for inclusion in the final report.
What goes on during the processing is like a black box. The spreadsheet validation explains, verifies and documents this processing so that independent reviewers can understand the spreadsheet’s purpose and be assured of the quality of the data that comes out. It brings the black box out into the light!
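One way a verifier can open the black box is to recalculate the outputs independently of the spreadsheet and compare the two. The sketch below does this for the animal weight example. The dose factor and the accepted weight range are hypothetical values chosen only for illustration, and the function and variable names are my own. Take the real figures from your protocol or SOP.

```python
from statistics import mean

# Hypothetical values for illustration only; use the ones in your protocol or SOP.
DOSE_FACTOR_ML_PER_G = 0.02              # assumed dose volume per gram of body weight
ACCEPTED_WEIGHT_RANGE_G = (18.0, 25.0)   # assumed acceptance range for animal weights


def recalculate(weights_g):
    """Recompute the spreadsheet's outputs from the raw animal weights (grams)."""
    low, high = ACCEPTED_WEIGHT_RANGE_G
    return {
        # Processing: weight multiplied by the factor, presented in milliliters
        "doses_ml": [w * DOSE_FACTOR_ML_PER_G for w in weights_g],
        # Group summary values that would go into the final report
        "mean_weight_g": mean(weights_g),
        "lowest_weight_g": min(weights_g),
        "highest_weight_g": max(weights_g),
        # Animals the spreadsheet should highlight
        "out_of_range_g": [w for w in weights_g if not (low <= w <= high)],
    }


# Inputs: the same raw data that was typed into the spreadsheet
expected = recalculate([20.0, 22.5, 26.0, 17.5])
for name, value in expected.items():
    print(name, value)
```

The printed values can then be compared cell by cell with what the spreadsheet shows for the same inputs, and the comparison can be attached to the validation paperwork.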
Define any additional requirements
The above are very general requirements that apply to all spreadsheets. Depending on use, some more specific requirements might include:
- All cells except those needed to enter data or sample information are locked down. This one is very important!
- Data validation is enabled. In other words, nonsensical entries are not permitted or will result in an error message.
- Conditional formatting is appropriate and user-friendly. It may aid study interpretation or flag validity issues defined in the protocol, so it is important to get it right. Note that conditional formatting rules are not displayed by Excel’s Show Formulas function.
- Drop-down lists are in the desired order.
- Rounding, significant figures, and display of digits are appropriate. Only the final value is rounded (there is no averaging of averages, for example).
- Input cells display the value exactly as entered. This helps reviewers who must check transcription.
- There are places to enter sample information. There is a place to enter the user’s name and date (or, better, a function that does this automatically).
- The spreadsheet has been checked for robustness. Expected and extreme values were tried. You attempted to “break” it.
- Personal identifying information has been wiped using the “inspect document” function in Office 2016.
- The print preview was inspected. Saving and copying and pasting data works fine. The font matches the final report table where the results will be copied and pasted, if applicable.
- Pagination, logos and the form identifier are included and correct.
- Printouts are included in the validation paperwork with all the formulas and the row and column headings displayed.
- This is optional: an additional sheet has been included that helps users understand and use the spreadsheet. (e.g. What raw data do I enter and when? When do I print it? What secure network folder do I save it in?) We like to think every user will read and understand the SOP. But in reality this sheet may be what they rely on!
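The rounding requirement in the list above is easier to appreciate with numbers. The following sketch uses made-up readings and group weights to show how rounding too early, or averaging averages, changes the result. It uses Python's decimal module with half-up rounding, which for positive values matches the way Excel's ROUND function treats ties. All values and names are assumptions for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP
from statistics import mean

ONE_DECIMAL = Decimal("0.1")  # report results to one decimal place

# Made-up readings for illustration
readings = [Decimal("1.24")] * 4

# Correct: carry full precision, round only the final value
total_then_round = sum(readings).quantize(ONE_DECIMAL, rounding=ROUND_HALF_UP)

# Incorrect: round every reading first, then add them up
round_then_total = sum(
    r.quantize(ONE_DECIMAL, rounding=ROUND_HALF_UP) for r in readings
)

print(total_then_round, round_then_total)  # 5.0 4.8

# Averaging averages: groups of unequal size give a different answer
group_a = [20.0, 21.0, 22.0]
group_b = [30.0]

overall_mean = mean(group_a + group_b)                # every animal counts once
mean_of_means = mean([mean(group_a), mean(group_b)])  # the single animal in group_b is over-weighted

print(overall_mean, mean_of_means)  # 23.25 25.5
```

A verifier can look for the same patterns in the spreadsheet itself, for example a ROUND wrapped around an intermediate cell, or an AVERAGE that points at other AVERAGE cells.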
Create a form where the three responsible groups document their work
There should be a space for free text where the developer can describe the inputs, processing, and outputs. There should be a checklist for any additional requirements, and space to address the ones that are not applicable. There should be a clear separation of the developing from the checking.
The form should be comprehensive in case the developer and verifier do not actually read the spreadsheet validation SOP (we all know this happens). Finally, in exchange for asking busy people to do this validation, you should make sure the form is easy to use!
Some further reading
Surprisingly, the FDA provides very little guidance on spreadsheet validation:
(https://www.fda.gov/ScienceResearch/FieldScience/ucm174286.htm)
However, the FDA has issued warning letters that cite spreadsheet deficiencies, including spreadsheets whose functions are not locked down:
(https://www.fda.gov/iceci/enforcementactions/warningletters/2013/ucm369409.htm)
Two more good resources:
(http://www.cbinet.com/sites/default/files/files/Workshop%20B_Soto_pres.pdf)
(https://learnaboutgmp.com/validation-of-ms-excel-spreadsheets/)