biom-format.org

Summarizing BIOM tables

«  Sparse matrix backends   ::   Contents   ::   BIOM License  »

Summarizing BIOM tables

If you have an existing BIOM file and want to compile a summary of the information in that table, you can use the print_biom_table_summary.py script.

To get help with print_biom_table_summary.py you can call:

print_biom_table_summary.py -h

This script takes a BIOM file or gzipped BIOM file as input, and will print a summary of the count information on a per-sample basis to the terminal (default) or to a new file (if provided via the -o parameter). The example file used in the commands below can be found in the biom-format/examples directory.

Summarizing sample data

To summarize the per-sample data in a BIOM file, you can run:

print_biom_table_summary.py -i rich_sparse_otu_table.biom

This will print the following output to the screen:

Num samples: 6
Num observations: 5
Total count: 27
Table density (fraction of non-zero values): 0.5000
Table md5 (unzipped): e0d1194d2d94ea81994c93b8f0db7f51

Counts/sample summary:
 Min: 3
 Max: 7
 Median: 4.0
 Mean: 4.5
 Std. dev.: 1.5
 Sample Metadata Categories: LinkerPrimerSequence; BarcodeSequence; Description; BODY_SITE
 Observation Metadata Categories: taxonomy

Counts/sample detail:
 Sample2: 3
 Sample5: 3
 Sample3: 4
 Sample6: 4
 Sample4: 6
 Sample1: 7

As you can see, general summary information about the table is provided, including the number of samples, the number of observations, the total count (i.e., the sum of all values in the table), and so on, followed by the per-sample counts.

Summarizing sample data qualitatively

To summarize the per-sample data in a BIOM file qualitatively, where the number of unique observations per sample (rather than the total count of observations per sample) are provided, you can run:

print_biom_table_summary.py -i rich_sparse_otu_table.biom --num_observations

This will print the following output to the screen:

Num samples: 6
Num observations: 5
Table md5 (unzipped): e0d1194d2d94ea81994c93b8f0db7f51

Observations/sample summary:
 Min: 1
 Max: 4
 Median: 2.5
 Mean: 2.5
 Std. dev.: 0.957427107756
 Sample Metadata Categories: LinkerPrimerSequence; BarcodeSequence; Description; BODY_SITE
 Observation Metadata Categories: taxonomy

Observations/sample detail:
 Sample5: 1
 Sample1: 2
 Sample4: 2
 Sample2: 3
 Sample6: 3
 Sample3: 4

«  Sparse matrix backends   ::   Contents   ::   BIOM License  »