Josep
Parse zipped PostgreSQL logs and save them in a Parquet file
I administer a large number of PostgreSQL servers, and I receive their logs zipped. To analyze them, I wrote a Spark job that does the following:
- Unzip the files
- Parse the PostgreSQL logs
- Save (append) the data into a Parquet file
In a following post I will show how to query them to get useful . . .
Posted in: jupyter notebook, python, spark
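The unzip and parse steps listed in the post above can be sketched without Spark, using only the Python standard library. This is a minimal sketch and not the author's Spark job: it assumes gzip-compressed logs and a hypothetical log_line_prefix of '%m [%p] ' (timestamp, then process ID). The names LOG_LINE and parse_gzipped_log are made up for illustration, and the Parquet append is left out because it needs Spark or a Parquet library.

```python
import gzip
import re

# Assumes log_line_prefix = '%m [%p] ' (hypothetical; adjust to your servers).
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+ \S+) "
    r"\[(?P<pid>\d+)\] "
    r"(?P<severity>[A-Z0-9]+):\s+(?P<message>.*)$"
)


def parse_gzipped_log(data: bytes) -> list:
    """Decompress a gzipped log and return one dict per log entry."""
    records = []
    for line in gzip.decompress(data).decode("utf-8").splitlines():
        match = LOG_LINE.match(line)
        if match:
            records.append(match.groupdict())
        elif records:
            # Continuation line (e.g. a multi-line statement): attach it
            # to the previous entry instead of dropping it.
            records[-1]["message"] += "\n" + line.strip()
    return records


# Made-up sample lines, compressed in memory to stand in for a zipped log file.
sample = (
    "2018-07-06 10:15:30.123 UTC [4242] LOG:  statement: SELECT 1\n"
    "2018-07-06 10:15:31.456 UTC [4242] ERROR:  relation \"foo\" does not exist\n"
)
rows = parse_gzipped_log(gzip.compress(sample.encode("utf-8")))
```

Each entry in rows is a flat dict of strings, which is a convenient shape to hand to whatever writes the Parquet file.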
Percentage of time series over its SMA (Simple Moving Average) compared against a weighted index
Problem with weighted indexes
One problem with weighted indexes is that a few components can move the whole index when their values are much bigger than the others. That can lead to misleading conclusions, for example when the small-weighted components do not follow the trend of the big ones. Some scenarios where . . .
Posted in: jupyter notebook, python
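The measure named in the title above can be sketched in plain Python: compute each component's simple moving average and count what share of the components currently sit above it. This is an illustrative sketch, not the notebook's code. The function names, the window length, and the sample prices are all assumptions.

```python
def sma(values, window):
    """Simple moving average of the last `window` values."""
    if len(values) < window:
        raise ValueError("not enough values for the requested window")
    return sum(values[-window:]) / window


def percent_above_sma(series_by_name, window):
    """Percentage of series whose latest value is above its own SMA."""
    above = sum(
        1 for values in series_by_name.values()
        if values[-1] > sma(values, window)
    )
    return 100.0 * above / len(series_by_name)


# Made-up prices: one large component rising, two small ones falling.
prices = {
    "big": [100.0, 110.0, 120.0, 130.0],
    "small_a": [10.0, 9.0, 8.0, 7.0],
    "small_b": [5.0, 5.0, 4.0, 3.0],
}
breadth = percent_above_sma(prices, window=3)
```

With these sample numbers, an index that simply sums the three prices rises at every step because of the large component, while only one of the three series is above its SMA.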
Print Markdown in the HTML widget using Markdown package
I've uploaded a Jupyter Notebook to GitHub explaining two ways to print Markdown in a Jupyter Notebook:
- Using the IPython Markdown() class.
- Using the HTML ipywidget and the markdown package.
The first option is straightforward, but the second one is much more powerful because it can be used with other widgets, . . .
Posted in: jupyter notebook, python
A Brief introduction to YAML in Powershell
Missing an official YAML powershell module
I wrote another post about PowerShell and YAML some months ago, with more code, where I explain the different ways to write YAML and how it behaves in PowerShell, comparing the PSYaml and powershell-yaml modules.
After working on a module that is compatible with both of them for reading YAML files, I decided to write this . . .
Change SERIAL to IDENTITY in PostgreSQL
PostgreSQL 10 implements SQL standard's IDENTITY
In SQL Server, it is quite common to use IDENTITY columns for non-natural primary keys. In PostgreSQL, until version 10, only SERIAL could be used for the same purpose. But that has changed.
Why IDENTITY and not SERIAL and SEQUENCES?
SERIAL is a friendly way to set up a SEQUENCE, but in the end, it's a SEQUENCE: an object that . . .
Posted in: postgresql
Export data from Oracle to MongoDB in Python
Introduction
I had to export some data from an Oracle database to MongoDB. For this reason I created a Python function called export_data_from_oracle_to_mongodb, which can be found on my GitHub.
To make the function more generic, there's an optional parameter called transform, where a function can be specified to . . .
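The role of an optional transform parameter can be shown with a database-free sketch. The function below is hypothetical and is not export_data_from_oracle_to_mongodb: export_rows, its sink argument, and the sample rows are assumptions made for illustration, with plain Python lists standing in for the Oracle cursor and the MongoDB collection. It also assumes the transform is applied to each row before it is written, which the truncated excerpt does not confirm.

```python
def export_rows(rows, sink, transform=None):
    """Copy each row (a dict) into `sink`, optionally transforming it first."""
    count = 0
    for row in rows:
        # Assumption: the transform, when given, maps one row to one document.
        document = transform(row) if transform is not None else row
        sink.append(document)
        count += 1
    return count


def lowercase_keys(row):
    """Example transform: unquoted Oracle column names come back upper case."""
    return {key.lower(): value for key, value in row.items()}


source_rows = [{"ID": 1, "NAME": "alpha"}, {"ID": 2, "NAME": "beta"}]
target = []  # stands in for a MongoDB collection
exported = export_rows(source_rows, target, transform=lowercase_keys)
```

Leaving transform as None copies the rows unchanged, so callers that need no reshaping pay nothing for the extra parameter.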
YAML in powershell
PowerShell doesn't have native support for YAML. Solution: PSYaml and powershell-yaml
UPDATE on 2018-07-06
I recommend reading my second post, A Brief introduction to YAML in Powershell: it's shorter and has less code. I wrote it after working on a module that's compatible with both the powershell-yaml and PSYaml modules for reading YAML files in PowerShell.
I only recommend reading this post if you're new to . . .
Posted in: powershell