[BBC] [Gtpb] ARANGS16 - ARANGS16 Automated and reproducible analysis of NGS data

Pedro Fernandes pfern at igc.gulbenkian.pt
Wed Apr 13 21:25:33 CEST 2016


    ANNOUNCEMENT / REMINDER

Applications for
ARANGS16 - Automated and reproducible analysis of NGS data
with Darin London and Rutger Vos are OPEN

    IMPORTANT DATES for this Course
    Deadline for applications: May 2nd 2016
    Course dates: May 9th - May 13th 2016

Candidates with an adequate profile will be accepted within 72 hours
of applying, until we reach 20 participants.

Description
Next generation sequencing (NGS) technologies for DNA have resulted in
an even bigger deluge of data. Researchers are learning that analysing
the data efficiently requires the creation of sophisticated pipelines,
typically using command line tools in a Linux or other open source
Unix variant compute environment. Many researchers have created these
pipelines to successfully analyse their data. Now they are faced with
the challenge of making these pipelines available to their colleagues.
Reproducibility has emerged as a major issue, as
researchers, peer reviewers, and even pharmaceutical companies  
discover that the software and data used to produce a particular  
research finding are either not available, poorly documented, or  
targeted to specific compute infrastructures that are not available to  
the wider research community. To remedy this, funding agencies and  
journals are creating policies to promote software reproducibility. In  
this brief workshop we will establish several best practices of  
reproducibility in the (comparative) analysis of data obtained by NGS.  
In doing so we will encounter the commonly used technologies that  
enable these best practices by working through use cases that  
illustrate the underlying principles. Building on the basis of an  
existing pipeline of command line utilities, we will illustrate how  
the entire compute environment used to run the pipeline can be  
packaged into a unit that can be shared with other researchers such  
that they can make full use of the environment on their own machines,  
or on standard cloud compute environments such as Amazon or Google.

Best practices
     Command line scripting of analysis steps
     Provisioning systems to standardise software environment requirements
     Packaging of compute environment into static, portable units
     Sharing of compute environment packages
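
As an illustration of the first practice above (scripting the analysis
steps), here is a minimal sketch in Python. It is not course material:
the step names and commands are hypothetical placeholders, and a real
pipeline would call actual NGS tools instead. Each step is run from a
script and a small provenance record (command, exit code, checksum of
the output) is kept, so that a colleague can rerun and compare.

```python
import hashlib
import json
import subprocess
import sys


def run_step(name, argv):
    """Run one command-line step and return a provenance record for it."""
    # check=True raises CalledProcessError if the command exits non-zero.
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return {
        "step": name,
        "argv": argv,
        "returncode": result.returncode,
        "stdout_sha256": hashlib.sha256(result.stdout.encode()).hexdigest(),
    }


def run_pipeline(steps):
    """Run the steps in order; the first failing step aborts the pipeline."""
    return [run_step(name, argv) for name, argv in steps]


# Hypothetical two-step pipeline (assumption for illustration only).
# The Python interpreter stands in for real command-line tools.
STEPS = [
    ("emit_reads", [sys.executable, "-c", "print('ACGT')"]),
    ("count_bases", [sys.executable, "-c", "print(len('ACGT'))"]),
]

if __name__ == "__main__":
    # Print the provenance log as JSON so it can be archived with the results.
    print(json.dumps(run_pipeline(STEPS), indent=2))
```

Keeping the commands in a script rather than typing them interactively
is what allows the same steps to be rerun later inside a provisioned
virtual machine or container.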

Technologies
     Next generation sequencing platforms
     Command line executables, command line scripting and batching
     Provisioning Systems: Puppet, Dockerfile
     Virtualization with VirtualBox and Vagrant
     Containerization with Docker

More details at the GTPB website
http://gtpb.igc.gulbenkian.pt/bicourses/ARANGS16/

Best wishes
Pedro Fernandes

-- 
Pedro Fernandes
GTPB Coordinator
Instituto Gulbenkian de Ciência
Apartado 14
2781-901 OEIRAS
PORTUGAL
Tel +351 21 4407912
http://gtpb.igc.gulbenkian.pt




