LIMS: Laboratory information management system
When running large genotyping projects in a multi-platform environment, information management becomes very important. Lab technicians may need to compare results from two different genotyping instruments for thousands of samples, management staff may need to find out for which samples a certain SNP has been analyzed, and the person responsible for DNA samples may need to find all samples from a project finished several years ago.
To deal with such scenarios and many more, often with very specific requirements, we have developed our own database system to support our service genotyping. The system, which we call Chiasma, is based on Microsoft’s SQL Server database engine running on a dedicated database server machine. Client software on the office computers connects to the database so that lab personnel, bioinformaticians and management staff can store and view information.

Figure 1. Client computers in the offices and labs communicate with the database server in the server room.
The Chiasma system can be divided into the following key areas:
- Handling of SNP markers and assays
- Management of stock samples and individuals
- Lab work assistance
- Genotype data quality control
- Creation of project reports
- Security and data history logs
- Genotyping overview
Handling of SNP markers and assays
Before genotyping begins in a project, information such as flanking sequence data, chromosome names and gene names for the SNPs to be analyzed is downloaded from public databases and stored in our system. For the low-to-medium throughput instruments we also store assay information (e.g. PCR primers). The system contains built-in functions to keep track of the DNA strand on which each SNP has been genotyped.
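Why strand tracking matters can be illustrated with a small sketch. This is not Chiasma's implementation; the function names and the "A/G" allele notation are assumptions made for this example only. It shows how alleles read on the opposite strand are complemented, and why A/T and C/G SNPs cannot be resolved from the alleles alone.

```python
# Illustrative sketch only; names and allele notation are hypothetical,
# not taken from the Chiasma system.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def flip_strand(alleles: str) -> str:
    """Return alleles such as 'A/G' as they read on the complementary strand."""
    return "/".join(COMPLEMENT[a] for a in alleles.upper().split("/"))


def is_strand_ambiguous(alleles: str) -> bool:
    """True for A/T and C/G SNPs, whose allele pair is identical on both strands."""
    original = set(alleles.upper().split("/"))
    flipped = set(flip_strand(alleles).split("/"))
    return original == flipped
```

For an A/G SNP the opposite strand reads T/C, so a mismatch in strand is easy to spot. For an A/T or C/G SNP both strands show the same pair of alleles, which is why the strand has to be recorded explicitly.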

Figure 2. Lab technicians and bioinformaticians can search for assays and see which primers were used in a particular assay.
Management of stock samples and individuals
All samples and individuals are identified by a unique name in our database. These names are created by combining a customer identification code with the IDs provided by the customer. Aliquots of the same sample submitted to us by the customer are always stored under different names (version numbers) to facilitate quality control and error tracing. The system supports having several samples connected to the same individual, and during quality control the results can be viewed separately for each sample or grouped together for the same individual.
If family and/or gender information is available for the individuals from whom the samples are taken, this information is loaded into our database, since it can be used for quality control of the genotype results.
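As an illustration of how family information can support quality control, the sketch below checks whether a child's genotype is compatible with the genotypes of both parents. The function name and the two-letter genotype encoding (e.g. "AG") are assumptions for this example, not a description of how Chiasma stores or tests genotypes.

```python
# Illustrative sketch only; the genotype encoding is a hypothetical choice.
def mendel_consistent(child: str, mother: str, father: str) -> bool:
    """True if the child's genotype can be formed from one maternal
    and one paternal allele. Genotypes are two-letter strings like 'AG'."""
    return any(
        sorted(m + f) == sorted(child)
        for m in mother
        for f in father
    )
```

With parents "AA" and "GG" the only possible child genotype is "AG", so a child typed as "GG" would be flagged as an inheritance error.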
Lab work assistance
When the SNPs, assays, individuals and samples have been properly stored in the Chiasma system, the lab part of the genotyping can begin. DNA concentrations are measured and stored in the system. Chiasma then aids in the process of diluting the samples to achieve an even DNA concentration. Lab technicians can select microtiter plates in the system and type in a dilution factor or a desired DNA concentration. The system then calculates how the samples should be diluted and outputs control files for the liquid handling robots that carry out the dilution. Samples can also be rearranged by taking aliquots from samples on separate plates and putting them onto a single plate, e.g. to facilitate rerun setups. The final microtiter plate layouts can be exported to Microsoft Excel or to the genotyping instrument software.
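The dilution calculation described above follows the standard relation C1 x V1 = C2 x V2. A minimal sketch is given below, assuming concentrations in ng/µl and volumes in µl; the function names and parameters are hypothetical and the actual robot control file format is not covered.

```python
# Illustrative sketch only; function names and units are assumptions.
def dilution_volumes(stock_conc: float, target_conc: float, final_volume: float):
    """Return (sample_volume, diluent_volume) needed to reach target_conc
    in final_volume, using C1 * V1 = C2 * V2."""
    if stock_conc <= 0 or target_conc <= 0 or final_volume <= 0:
        raise ValueError("concentrations and volume must be positive")
    if target_conc > stock_conc:
        raise ValueError("target concentration exceeds stock concentration")
    sample_volume = target_conc * final_volume / stock_conc
    return sample_volume, final_volume - sample_volume


def target_from_factor(stock_conc: float, dilution_factor: float) -> float:
    """Convert a dilution factor (e.g. 5 for a 1:5 dilution) into a target concentration."""
    if dilution_factor < 1:
        raise ValueError("dilution factor must be at least 1")
    return stock_conc / dilution_factor
```

For example, diluting a 50 ng/µl stock to 10 ng/µl in a final volume of 100 µl requires 20 µl of sample and 80 µl of diluent.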
All microtiter plates are labelled with both a bar code and a human readable name. The physical location of the plates is also stored in Chiasma so that the plates can easily be found when needed.

Figure 3. In order to keep track of all the samples in the lab, the physical location of each sample is stored in the Chiasma system. Both the sample's position in the microtiter plate and the location of the microtiter plate itself are stored.
Genotype data quality control
One of the greatest advantages of our database system is the possibility to implement homogeneous quality control procedures across all our platforms (except for the very high throughput genotyping methods, for which different approaches are required both technically and from a user perspective, see below). The quality control is carried out in a dedicated client program which tests for duplicate errors, inheritance errors, deviations from Hardy-Weinberg equilibrium etc. A wealth of statistics is available to the user, such as allele frequencies, success rates and the number of failures in control samples. All of these figures can be monitored not only at the end of a project, but at the user's request as soon as new genotypes are added to the project. Furthermore, the program can automatically produce a list of samples that must be genotyped again due to low success rate, duplicate errors or suspiciously low allele frequencies.
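Two of the statistics mentioned above can be sketched in a few lines. The example below computes a success (call) rate and a Hardy-Weinberg chi-square statistic from genotype counts for a biallelic SNP. The function names, and the convention of representing a failed call as None, are assumptions for this illustration; the thresholds used in Chiasma are not stated in this text.

```python
# Illustrative sketch only; names and the None-for-missing convention are assumptions.
def call_rate(calls) -> float:
    """Fraction of genotype calls that are not missing (None)."""
    if not calls:
        raise ValueError("no genotype calls given")
    return sum(c is not None for c in calls) / len(calls)


def hwe_chi2(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square statistic for deviation from Hardy-Weinberg equilibrium,
    computed from observed genotype counts of a biallelic SNP."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes given")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expected count (monomorphic SNPs).
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
```

The statistic has one degree of freedom, so values above roughly 3.84 correspond to p < 0.05. Counts of 25, 50 and 25 match the expected proportions exactly and give a statistic of 0, whereas 50, 0 and 50 (no heterozygotes at all) give 100.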
For genotyping methods which analyze more than a few thousand SNPs in a single run, it is neither technically feasible nor useful to interactively investigate the outcome of the analysis. Data from these methods are, however, stored in Chiasma, and there are functions for performing most of the quality control procedures even on these large data sets, although not in an interactive manner.

Figure 4. An open session in the quality control program with SNP statistics to the right, sample statistics to the left and control sample statistics in the top left corner.
Creation of project reports
At the end of a genotyping project, the lab technician responsible for genotyping in the project creates an internal report which contains information about the approved genotypes and some quality statistics. This internal report is loaded into the Chiasma system. From the internal report, the lab manager can then generate more visually appealing report files which are sent to the customer. These reports include quality statistics for individuals, quality statistics for SNP markers and a quality overview which is especially useful in very large projects. Perhaps even more importantly, a genotype file can be generated

Figure 5. Selected parts of the quality overview section of a customer report. The graphs illustrate the distribution over the SNP markers of call rate (upper picture) and of the Hardy-Weinberg chi-square test (lower picture).
Note that in a full customer report, many more quality figures are listed.
Security and data history logs
The database server is physically located in a locked server room to which only a few people have access. In terms of network topology, it is located behind the department firewall, on a subnet separated from the rest of the department, and its logs are under constant surveillance by IT staff.
Data access security inside Chiasma is controlled on two different levels: the project level and the sample level. The reason for this is that the same sample collection can be used in different genotyping projects. Therefore, when a project is started, the lab technician who will do the genotyping needs access to both the samples which will be analyzed and to the actual project which will contain the genotype data.
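The two-level check described above can be sketched as follows. The class, its fields and the function name are hypothetical and only illustrate the idea that access to a project does not imply access to the sample collection it uses, or the other way round.

```python
# Illustrative sketch only; the data model is hypothetical, not Chiasma's.
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    projects: set = field(default_factory=set)            # project-level access
    sample_collections: set = field(default_factory=set)  # sample-level access


def can_genotype(user: User, project: str, sample_collection: str) -> bool:
    """A user may work on a project only with access on both levels."""
    return project in user.projects and sample_collection in user.sample_collections
```

A technician with access to a sample collection through an earlier project would still need to be granted access to each new project that reuses those samples.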
All data changes which have relevance to the results or to other important information are saved in the database. It is always possible to see who made the change, when it was made and what the original value was.
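A change log of this kind can be sketched with a history table that is written in the same transaction as the update. The example uses Python's built-in sqlite3 module purely as a stand-in for the SQL Server database; the table names, column names and the function `update_call` are assumptions for this illustration.

```python
# Illustrative sketch only; SQLite stands in for SQL Server and the
# schema is hypothetical.
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE genotype (
    id INTEGER PRIMARY KEY, sample TEXT, snp TEXT, call_value TEXT);
CREATE TABLE genotype_history (
    genotype_id INTEGER, old_call TEXT, changed_by TEXT, changed_at TEXT);
""")


def update_call(conn, genotype_id: int, new_call: str, user: str) -> None:
    """Change a genotype call and record who changed it, when,
    and what the original value was."""
    with conn:  # commit on success, roll back on error
        row = conn.execute(
            "SELECT call_value FROM genotype WHERE id = ?", (genotype_id,)
        ).fetchone()
        if row is None:
            raise KeyError(genotype_id)
        conn.execute(
            "INSERT INTO genotype_history VALUES (?, ?, ?, ?)",
            (genotype_id, row[0], user, datetime.now(timezone.utc).isoformat()),
        )
        conn.execute(
            "UPDATE genotype SET call_value = ? WHERE id = ?",
            (new_call, genotype_id),
        )
```

Writing the history row and the update inside one transaction means that a change cannot be stored without its log entry.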
Genotyping overview
With over a hundred completed genotyping projects, remembering which samples have been genotyped for which SNPs and in which projects is of course impossible. The Chiasma system therefore lets management staff search the large amounts of genotype data for projects in which a certain sample has been analyzed, list the projects for which a certain genotyping method has been used, and much more. This can be helpful when planning follow-up or collaborative genotyping projects.