With next-generation DNA sequencing technologies now a regular part of biological research, vast amounts of DNA sequence data are being generated at an ever-increasing rate. But with this added capability and power come technical challenges around data management, processing and analysis. Genomics work carried out at Landcare Research is no exception. An example of these challenges is found in some recent work led by Landcare Research scientist Thomas Buckley and carried out by postdoctoral researcher Alice Dennis and PhD student Luke Dunning.
Alice and Luke are interested in the functional coding regions of the genomes of stick insect species native to New Zealand. Until recently, processing their insect DNA sequence data, collected from next-generation sequencing-by-synthesis platforms, took one whole week per individual, even on a fast multi-core desktop Linux machine with plenty of RAM.
Processing times of this length significantly slow down their research program and limit the number of such processing steps that can be undertaken within any given project. To address this problem, Dan White (Informatics team) worked with Alice and Luke to shift their memory-intensive processes to the high performance computing (HPC) resources of the National e-Science Infrastructure (NeSI), which Landcare Research has access to as a member of NeSI. In comparison with the resources available at Landcare Research, the NeSI HPC resources now available include an 80-processor computer, with each processor having 12 cores and 96 GB of RAM, and access to 200 TB of disk storage. For the stick insect case above, early tests have shown a reduction in processing time from one week to just over 3 hours, roughly a 50-fold speed-up. Further improvements are expected as we explore the options of processing multiple files simultaneously and of chaining multi-step sequence analyses together in an automated process.
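The two ideas mentioned above, processing several files at once and chaining analysis steps automatically, can be illustrated with a minimal Python sketch. It is not the team's actual pipeline: the step functions (`trim_reads`, `assemble`, `annotate`) and the sample names are hypothetical placeholders, and the sketch assumes that each real step would launch an external analysis program, which is why a thread pool is used to run samples side by side.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical placeholder steps. A real pipeline would call the actual
# sequence-analysis tools here (for example via subprocess); these
# functions only tag the sample name so the flow is visible.
def trim_reads(sample):
    return f"{sample}.trimmed"

def assemble(sample):
    return f"{sample}.assembled"

def annotate(sample):
    return f"{sample}.annotated"

# The chain of steps applied to every sample, in order.
PIPELINE = [trim_reads, assemble, annotate]

def run_pipeline(sample, steps=PIPELINE):
    """Chain the steps: each step's output becomes the next step's input."""
    result = sample
    for step in steps:
        result = step(result)
    return result

def run_all(samples, max_workers=4):
    """Process several samples at once; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_pipeline, samples))

if __name__ == "__main__":
    # Hypothetical sample names, one per sequenced individual.
    for output in run_all(["insect_01", "insect_02", "insect_03"]):
        print(output)
```

On a shared HPC system the same structure would more likely be expressed through the site's job scheduler than through a single Python process, so this should be read only as an outline of the approach.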



