Essential Python For Genome Science
Run Shell Command In Python - I

Last updated 5 years ago

This page is the first part of "Run Shell Command In Python". Here I introduce the subprocess package and the ways to run a command from Python. In the second part, I cover more advanced topics: stdin, stdout, and pipes with the subprocess package.

Different ways to run shell command

  1. Run a command in the shell

  2. Run a command in Jupyter Notebook with "!" or with the cell magic "%%bash". (See Jupyter Notebook)

  3. Run a command with the Python subprocess package

The subprocess package

When you execute a command, you start a process that uses specific system resources (CPU, memory, etc.) to run the job. In addition, any process can spawn child processes, which are called subprocesses. Python's built-in subprocess package lets you start subprocesses from Python. In other words, it lets you execute other commands within Python, just as you would in the shell.
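As a minimal sketch of the parent/child relationship described above, the snippet below starts a second Python interpreter as a subprocess and asks it to report its own process ID. Using `sys.executable` as the command is only an illustrative choice so the example does not depend on any other program being installed.

```python
import os
import subprocess
import sys

# The child command: a second Python interpreter that prints its own process ID.
child_code = "import os; print(os.getpid())"

result = subprocess.run(
    [sys.executable, "-c", child_code],  # command given as a list of arguments
    stdout=subprocess.PIPE,              # capture what the child prints
    encoding="utf8",                     # decode the captured bytes into a string
)

child_pid = int(result.stdout.strip())
print("this (parent) process ID:", os.getpid())
print("child process ID:", child_pid)
```

The two IDs differ because the command ran in its own process, with its own share of system resources, while the parent waited for it to finish.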

Why run shell command in python?

The main reason is to make running batch commands easier.

When I prepare the bulk RNA-seq data for this book, I use pandas to handle all the metadata tables and Python code to construct dozens of commands. Finally, I use the subprocess package to execute them (see this notebook). Because everything is done in Python, I don't need to switch environments, and all steps are documented in the notebook.
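The batch pattern can be sketched roughly as follows. This is not the notebook's actual code: the sample names and file names are hypothetical, a plain list of dictionaries stands in for the pandas metadata table, and a harmless Python one-liner stands in for the real mapping command.

```python
import subprocess
import sys

# Hypothetical metadata table: one row per sample.
# In practice this role would be played by a pandas DataFrame.
samples = [
    {"sample": "sample_A", "fastq": "sample_A.fastq.gz"},
    {"sample": "sample_B", "fastq": "sample_B.fastq.gz"},
]

# Construct one command per row. The placeholder command only prints a message;
# a real pipeline would build the argument list of the actual tool here.
commands = []
for row in samples:
    message = f"processing {row['sample']} from {row['fastq']}"
    commands.append([sys.executable, "-c", f"print({message!r})"])

# Execute the commands one by one and record each return code.
return_codes = {}
for row, command in zip(samples, commands):
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        encoding="utf8",
    )
    return_codes[row["sample"]] = result.returncode
    print(row["sample"], "->", result.stdout.strip())

print(return_codes)
```

Keeping the return codes next to the sample names makes it easy to see afterwards which samples need to be rerun.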

Stdin, stdout, stderr, and return code

You need to understand these terms before reading the subprocess.run() Jupyter notebook below.

A UNIX process has three standard streams: stdin, stdout, and stderr.

  • stdin is for taking input

  • stdout is for printing output information

  • stderr is for printing error information

These three streams are the foundation of information flow in UNIX processes. If you don't fully understand them, I believe you can find great explanations on Google.

In addition to the three streams, every command finishes with a return code. The UNIX convention is that a return code of 0 means the command succeeded and a non-zero return code means it failed. Different non-zero return codes indicate different kinds of failure, which are defined by the program.
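To make the three streams and the return code concrete, here is a small sketch that uses subprocess.run() (covered in the next section). The child program is a made-up Python one-off that reads stdin, writes to both stdout and stderr, and exits with return code 3; the message text and the value 3 are arbitrary choices for illustration.

```python
import subprocess
import sys

# A tiny child program: read stdin, write the upper-cased text to stdout,
# write a message to stderr, then exit with a non-zero return code.
child_code = (
    "import sys\n"
    "text = sys.stdin.read()\n"
    "sys.stdout.write(text.upper())\n"
    "sys.stderr.write('something went wrong\\n')\n"
    "sys.exit(3)\n"
)

result = subprocess.run(
    [sys.executable, "-c", child_code],
    input="acgt",            # sent to the child's stdin
    stdout=subprocess.PIPE,  # capture the child's stdout
    stderr=subprocess.PIPE,  # capture the child's stderr
    encoding="utf8",         # work with strings instead of bytes
)

print("stdout:", result.stdout)
print("stderr:", result.stderr.strip())
print("return code:", result.returncode)
```

The output and the error message arrive through separate streams, and the return code tells the parent that the command failed even though it produced output.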

subprocess.run()

Take-home messages of this notebook:

  • subprocess.run() is the most common API for running shell commands in Python.

  • Set stderr=PIPE and stdout=PIPE to capture the information printed to these two streams.

  • Set encoding='utf8' to get strings rather than bytes from stderr and stdout.

  • returncode == 0 means the job succeeded; anything else means failure. If check=True, a non-zero return code raises subprocess.CalledProcessError.

  • try ... except ... lets you catch an error and deal with it.

  • By default, shell=False and you need to provide the command as a list; redirection and pipes will not work.

  • When shell=True, you can provide the command as a string, and any command works just as in the shell.
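The points above can be sketched in one short script. It is an illustration rather than the notebook's own code, and it assumes a UNIX-like system where `echo` and `tr` are available; the commands themselves are arbitrary examples.

```python
import subprocess
import sys

# 1. Capture stdout and stderr as strings.
ok = subprocess.run(
    [sys.executable, "-c", "print('hello')"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    encoding="utf8",
)
print(repr(ok.stdout), ok.returncode)

# 2. With check=True, a non-zero return code raises CalledProcessError,
#    which try ... except lets us catch and handle.
failed_code = None
try:
    subprocess.run([sys.executable, "-c", "import sys; sys.exit(2)"], check=True)
except subprocess.CalledProcessError as error:
    failed_code = error.returncode
    print("command failed with return code", failed_code)

# 3. shell=False (the default): the command is a list, and "|" is passed to
#    echo as an ordinary argument instead of creating a pipe.
literal = subprocess.run(
    ["echo", "acgt", "|", "tr", "a-z", "A-Z"],
    stdout=subprocess.PIPE,
    encoding="utf8",
)
print(literal.stdout.strip())

# 4. shell=True: the command is a single string interpreted by the shell,
#    so the pipe works as it would in a terminal.
piped = subprocess.run(
    "echo acgt | tr a-z A-Z",
    shell=True,
    stdout=subprocess.PIPE,
    encoding="utf8",
)
print(piped.stdout.strip())
```

A common caution with shell=True is to avoid building the command string from untrusted input, because the shell will interpret whatever the string contains.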

subprocess.run() is a hard but important Python function for beginners. It is not easy to understand, but once you do, you gain a lot of knowledge not only about Python but also about the whole UNIX system!
