Parse zipped PostgreSQL logs and save them in a Parquet file

Josep

I administer a large number of PostgreSQL servers and I receive their logs zipped. To analyze them, I've written a Spark job that parses the zipped logs and saves them in a Parquet file.

In a follow-up post I will show how to query them to get useful information.
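
The zipped logs have to be unpacked before they can be parsed. As a rough illustration only, and not the Spark code from the notebook, here is a minimal plain-Python sketch that streams text lines out of a zip archive. It assumes the logs arrive as `.zip` archives containing UTF-8 text files; the function name `iter_zipped_log_lines` and the demo file name are hypothetical.

```python
import io
import zipfile


def iter_zipped_log_lines(archive):
    """Yield (member_name, line) for every text line in a zip archive.

    `archive` may be a filesystem path or a binary file-like object.
    Assumption: members are UTF-8 text; undecodable bytes are replaced.
    """
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            with zf.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", errors="replace")
                for line in text:
                    yield name, line.rstrip("\n")


def build_demo_archive():
    """Create a small in-memory zip archive (hypothetical demo data)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("server1/postgresql.log", "line one\nline two\n")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    for member, line in iter_zipped_log_lines(build_demo_archive()):
        print(member, line)
```

If the files are gzip-compressed rather than zip archives, the standard library's `gzip.open` would play the same role.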

PostgreSQL log format

The log format specified in the PostgreSQL config file is the following:

log_line_prefix = '%t %a %u %d %c '
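
For reference, the PostgreSQL documentation defines these escapes as: `%t` is the timestamp without milliseconds, `%a` the application name, `%u` the user name, `%d` the database name and `%c` the session ID (two hexadecimal numbers separated by a dot). A line with this prefix could be split with a regular expression along the following lines. This is only a sketch under some assumptions: user and database names are non-empty and contain no spaces, and the sample log lines are made up for illustration. The function name `parse_log_line` is hypothetical and this is not necessarily how the notebook does it.

```python
import re

# One named group per escape in log_line_prefix = '%t %a %u %d %c '.
# The application name may contain spaces, so it is matched lazily and the
# strictly shaped session ID (hex.hex) anchors the end of the prefix.
LOG_LINE_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} \S+) "  # %t
    r"(?P<application>.*?) "                                      # %a
    r"(?P<user>\S+) "                                             # %u
    r"(?P<database>\S+) "                                         # %d
    r"(?P<session_id>[0-9a-f]+\.[0-9a-f]+) "                      # %c
    r"(?P<message>.*)$"
)


def parse_log_line(line):
    """Return a dict of prefix fields plus the message, or None.

    None is returned for lines that do not start with the prefix, such as
    continuation lines of a multi-line statement.
    """
    match = LOG_LINE_RE.match(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    # Made-up sample line, for illustration only.
    sample = (
        "2018-10-25 10:15:02 UTC psql alice shop "
        "5bd1a2b3.1f4a LOG:  statement: SELECT 1"
    )
    print(parse_log_line(sample))
```

Lines where some of the prefix fields are empty, as can happen for background processes, will not match this pattern and would need separate handling.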

Special values:

The code can be found in a Jupyter Notebook on my GitHub.

October 25, 2018