Introduction

In this post I want to take a look at my Netflix viewing habits. The first step is getting the data and you can request your viewing data from the Accounts section in your Netflix account. Netflix will allow you to download a zip file with many different ways to slice this information.

# Import the libraries
import pandas as pd
import numpy as np
import seaborn as sns
from pandas.

Continue reading

I needed to parse server logs and create Spark DataFrames to query information from the query string parameters. My naive version kept throwing errors about a mismatch between the number of fields in the schema and the number in the row being queried. It turns out I was dealing with over 350 different query string params across the logs. This set could change over time, and there was no way I was going to add all of these to the schema by hand.
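To make the field-count problem concrete, here is a minimal sketch in plain Python, not the approach from the full post. It collects the union of parameter names seen across all requests and pads every row to that same set of columns. The helper names `parse_query_params` and `build_uniform_rows` and the sample URLs are hypothetical, made up for illustration.

```python
from urllib.parse import urlsplit, parse_qsl


def parse_query_params(url):
    """Return the query string parameters of a URL as a dict."""
    query = urlsplit(url).query
    # keep_blank_values keeps params such as "debug=" instead of dropping them
    return dict(parse_qsl(query, keep_blank_values=True))


def build_uniform_rows(urls):
    """Parse every URL and pad each row to the same set of columns."""
    parsed = [parse_query_params(u) for u in urls]
    # Union of every parameter name seen across the logs, in sorted order
    columns = sorted({key for row in parsed for key in row})
    # Missing parameters become None so every row has the same field count
    rows = [tuple(row.get(col) for col in columns) for row in parsed]
    return columns, rows


# Hypothetical request URLs standing in for real log lines
sample_urls = [
    "/track?user=42&page=home",
    "/track?user=7&ref=email&debug=",
]
columns, rows = build_uniform_rows(sample_urls)
```

Because the column list is derived from the data, a new parameter showing up in the logs adds a column without any manual schema change. With PySpark available, `columns` could then be turned into a schema of nullable string fields and passed along with `rows` to `spark.createDataFrame`.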

Continue reading

Nitin Ahuja

A programmer’s viewpoint

Forever learning

California