csvstat¶
Description¶
Prints descriptive statistics for all columns in a CSV file. Will intelligently determine the type of each column and then print analysis relevant to that type (ranges for dates, mean and median for integers, etc.):
usage: csvstat [-h] [-d DELIMITER] [-t] [-q QUOTECHAR] [-u {0,1,2,3}] [-b]
[-p ESCAPECHAR] [-z MAXFIELDSIZE] [-e ENCODING] [-S] [-H] [-v]
[--zero] [-y SNIFFLIMIT] [-c COLUMNS] [--max] [--min] [--sum]
[--mean] [--median] [--stdev] [--nulls] [--unique] [--freq]
[--len] [--count]
[FILE]
Print descriptive statistics for each column in a CSV file.
positional arguments:
FILE The CSV file to operate on. If omitted, will accept
input on STDIN.
optional arguments:
-h, --help show this help message and exit
-y SNIFFLIMIT, --snifflimit SNIFFLIMIT
Limit CSV dialect sniffing to the specified number of
bytes. Specify "0" to disable sniffing entirely.
-n, --names Display column names and indices from the input CSV
and exit.
-c COLUMNS, --columns COLUMNS
A comma separated list of column indices or names to
be examined. Defaults to all columns.
--max Only output max.
--min Only output min.
--sum Only output sum.
--mean Only output mean.
--median Only output median.
--stdev Only output standard deviation.
--nulls Only output whether column contains nulls.
--unique Only output counts of unique values.
--freq Only output frequent values.
--len Only output max value length.
--count Only output row count
See also: Arguments common to all tools.
Examples¶
Basic use:
csvstat examples/realdata/FY09_EDU_Recipients_by_State.csv
When an statistic name is passed, only that stat will be printed:
csvstat --freq examples/realdata/FY09_EDU_Recipients_by_State.csv
1. State Name: None
2. State Abbreviate: None
3. Code: None
4. Montgomery GI Bill-Active Duty: 3548.0
5. Montgomery GI Bill- Selective Reserve: 1019.0
6. Dependents' Educational Assistance: 1261.0
7. Reserve Educational Assistance Program: 715.0
8. Post-Vietnam Era Veteran's Educational Assistance Program: 6.0
9. TOTAL: 6520.0
10. _unnamed: None
If a single stat and a single column are requested, only a value will be returned:
csvstat -c 4 --freq examples/realdata/FY09_EDU_Recipients_by_State.csv
3548.0