[BBC] [Gtpb] ARANGS16 - Automated and reproducible analysis of NGS data
Pedro Fernandes
pfern at igc.gulbenkian.pt
Wed Apr 13 21:25:33 CEST 2016
ANNOUNCEMENT / REMINDER
Applications for
ARANGS16 - Automated and reproducible analysis of NGS data
with Darin London and Rutger Vos are OPEN
IMPORTANT DATES for this Course
Deadline for applications: May 2nd 2016
Course dates: May 9th - May 13th 2016
Candidates with an adequate profile will be accepted within 72 hours
of applying, until we reach 20 participants.
Description
Next generation sequencing (NGS) technologies for DNA have resulted in
an ever bigger deluge of data. Researchers are learning that analysing
the data efficiently requires the creation of sophisticated pipelines,
typically using command line tools in Linux or another open source
Unix variant compute environment. Many researchers have created these
pipelines to successfully analyse their data. Now they are faced with
the challenge of making these pipelines available to their colleagues.
Reproducibility has emerged as a major issue, as
researchers, peer reviewers, and even pharmaceutical companies
discover that the software and data used to produce a particular
research finding are either not available, poorly documented, or
targeted to specific compute infrastructures that are not available to
the wider research community. To remedy this, funding agencies and
journals are creating policies to promote software reproducibility. In
this brief workshop we will establish several best practices of
reproducibility in the (comparative) analysis of data obtained by NGS.
In doing so we will encounter the commonly used technologies that
enable these best practices by working through use cases that
illustrate the underlying principles. Building on the basis of an
existing pipeline of command line utilities, we will illustrate how
the entire compute environment used to run the pipeline can be
packaged into a unit that can be shared with other researchers such
that they can make full use of the environment on their own machines,
or on standard cloud compute environments such as Amazon or Google.
Best practices
Command line scripting of analysis steps
Provisioning systems to standardise software environment requirements
Packaging of compute environment into static, portable units
Sharing of compute environment packages
Technologies
Next generation sequencing platforms
Command-line executables, command line scripting and batching
Provisioning systems: Puppet, Dockerfile
Virtualization with VirtualBox and Vagrant
Containerization with Docker
More details at the GTPB website
http://gtpb.igc.gulbenkian.pt/bicourses/ARANGS16/
Best wishes
Pedro Fernandes
--
Pedro Fernandes
GTPB Coordinator
Instituto Gulbenkian de Ciência
Apartado 14
2781-901 OEIRAS
PORTUGAL
Tel +351 21 4407912
http://gtpb.igc.gulbenkian.pt