Analysing the most popular film locations in San Francisco. Part 1
Exploring the dataset:
In [1]:
import pandas
In [2]:
df = pandas.read_csv('./Film_Locations_in_San_Francisco.csv')
df.head()
Out[2]:
In [3]:
df.columns
Out[3]:
In [4]:
df.index
Out[4]:
NumPy representation of the dataset:
In [5]:
df.values
Out[5]:
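The `values` attribute still works, but the pandas documentation recommends `DataFrame.to_numpy()` for new code. Here is a minimal sketch on a small made-up frame; the name `toy` and its rows are illustrative placeholders, not taken from the CSV:

```python
import numpy as np
import pandas as pd

# Toy stand-in for the film dataset; titles and years are invented placeholders.
toy = pd.DataFrame({
    'Title': ['Film A', 'Film B', 'Film C'],
    'Release Year': [2015, 2011, 2015],
})

# Same data as toy.values, returned as a NumPy array.
arr = toy.to_numpy()

# A frame mixing text and integer columns falls back to the generic object dtype.
print(type(arr), arr.shape, arr.dtype)
```

Because the real dataset also mixes text and numeric columns, its array is likely to have the `object` dtype as well, which is slower than a purely numeric array.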
In [6]:
type(df)
Out[6]:
In [7]:
df.shape
Out[7]:
Getting a summary of the data frame (column types, non-null counts and memory usage):
In [8]:
df.info()
Filtering and slicing the data:
In [9]:
df['Actor 1']
Out[9]:
Subset one or more columns in the data frame:
In [10]:
subset = df[['Title', 'Release Year', 'Locations', 'Production Company', 'Director']]
In [11]:
subset.head()
Out[11]:
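The two selections above return different types: a single column label gives a Series, while a list of labels gives a DataFrame, even when the list holds only one label. A small sketch, assuming a made-up frame called `toy` with placeholder values:

```python
import pandas as pd

# Toy stand-in for the film dataset; all values are invented placeholders.
toy = pd.DataFrame({
    'Title': ['Film A', 'Film B'],
    'Director': ['Director X', 'Director Y'],
    'Release Year': [2015, 2011],
})

one_col = toy['Title']                      # single label -> Series
one_col_frame = toy[['Title']]              # list with one label -> DataFrame
two_cols = toy[['Title', 'Release Year']]   # list of labels -> DataFrame

print(type(one_col).__name__)
print(type(one_col_frame).__name__)
print(two_cols.shape)
```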
Subset rows:
In [12]:
# loc selects rows by index label; here, the row whose label is 2.
df.loc[2]
Out[12]:
In [13]:
df.loc[[2, 0]]
Out[13]:
In [14]:
df.loc[[4, 5]]
Out[14]:
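With the default integer index, a label and a position happen to coincide, so it is easy to miss that `loc` is label-based. The sketch below uses a made-up frame with a custom index (the name `toy` and its values are placeholders) to contrast `loc` with the position-based `iloc`:

```python
import pandas as pd

# Toy frame with a non-default index so that labels and positions differ.
toy = pd.DataFrame(
    {'Title': ['Film A', 'Film B', 'Film C']},
    index=[10, 20, 30],
)

by_label = toy.loc[20]          # row whose index label is 20
by_position = toy.iloc[2]       # third row, at position 2

# A list of labels returns the rows in the order requested.
reordered = toy.loc[[30, 10]]

print(by_label['Title'])
print(by_position['Title'])
print(list(reordered.index))
```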
Subsetting columns while keeping all rows:
In [15]:
subset = df.loc[:, ['Release Year', 'Locations', 'Director']]
In [16]:
subset.head()
Out[16]:
Getting only the rows with release year 2015, restricted to specific columns:
In [17]:
df.loc[df['Release Year'] == 2015, ['Release Year', 'Locations', 'Director']]
Out[17]:
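The comparison inside `loc` produces a boolean Series, and `loc` keeps only the rows where it is `True`. Splitting the expression into two steps makes this visible. The frame `toy` and its values are invented for illustration:

```python
import pandas as pd

# Toy stand-in for the film dataset; all values are invented placeholders.
toy = pd.DataFrame({
    'Title': ['Film A', 'Film B', 'Film C'],
    'Release Year': [2015, 2011, 2015],
    'Director': ['Director X', 'Director Y', 'Director Z'],
})

# Step 1: build the boolean mask, one True/False per row.
mask = toy['Release Year'] == 2015

# Step 2: keep the rows where the mask is True, and only the listed columns.
films_2015 = toy.loc[mask, ['Title', 'Director']]

print(mask.tolist())
print(films_2015)
```

Naming the mask also makes it easier to combine conditions later with `&` and `|`, each condition wrapped in parentheses.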