Guide to Pandas DataFrames

pandas.DataFrame(data, index, columns, dtype, copy)
import pandas as pd

# each key becomes a column label, each list becomes that column's values
wnba_stats = {
    'first_name': ['Candace', 'Sue', 'Diana', 'Brittney', 'Arike', 'Courtney', "A'ja"],
    'last_name': ['Parker', 'Bird', 'Taurasi', 'Griner', 'Ogunbowale', 'Vandersloot', "Wilson"],
    'pts': [14.7, 9.8, 18.7, 17.7, 22.8, 13.6, 20.5],
    'reb': [9.7, 1.7, 4.2, 7.5, 2.8, 3.5, 8.5],
    'ast': [4.6, 5.2, 4.5, 3.0, 3.4, 10.0, 2.0]
}
df = pd.DataFrame(wnba_stats)
2020 Player Stats per game from

Attributes and Methods

df.shape #df (2020 player stats) has 7 rows and 5 columns
df.dtypes #datatypes of each column in df
df.head(3) #first 3 rows of df
df['first_name'] #returns the first_name column as a Series
#locates rows with labels 0 through 3 (both ends included) in the 'first_name' column
df.loc[0:3, 'first_name']
df.iloc[4, :] #locates the row at integer position 4, with all columns
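The difference between label-based and position-based selection is easiest to see side by side. The following is a minimal sketch on a small stand-in frame (the name `players` is illustrative and only reuses a few values from the table above):

```python
import pandas as pd

# Small stand-in frame; values mirror the first rows of the article's df.
players = pd.DataFrame({
    'first_name': ['Candace', 'Sue', 'Diana', 'Brittney', 'Arike'],
    'pts': [14.7, 9.8, 18.7, 17.7, 22.8],
})

# .loc slices by label and includes both endpoints: labels 0, 1, 2 and 3.
by_label = players.loc[0:3, 'first_name']

# .iloc slices by integer position and excludes the stop: positions 0, 1 and 2.
by_position = players.iloc[0:3, 0]

# A single row selected with .iloc comes back as a Series keyed by column name.
row_four = players.iloc[4, :]

print(len(by_label), len(by_position))
print(row_four['first_name'])
```

With the default integer index the two look interchangeable, but they diverge as soon as the index is reordered or replaced with other labels.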
team_stats = [['Parker', 'LA'], ['Bird', 'SEA'], ['Taurasi', 'PHO'], ['Griner', 'PHO'], ['Ogunbowale', 'DAL'], ['Vandersloot', 'CHI'], ['Wilson', 'LV']]
team_df = pd.DataFrame(team_stats, columns=['last_name', 'team'])
team data
merged_df = df.merge(team_df, on='last_name')
merged data
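By default, merge performs an inner join, so rows whose key is missing from either frame are dropped. A small sketch of that behavior, assuming a trimmed-down pair of frames (`stats` and `teams` are illustrative names, and one player is deliberately left out of `teams`):

```python
import pandas as pd

stats = pd.DataFrame({
    'last_name': ['Parker', 'Bird', 'Taurasi'],
    'pts': [14.7, 9.8, 18.7],
})
# 'Taurasi' is deliberately missing here to show what happens to unmatched rows.
teams = pd.DataFrame([['Parker', 'LA'], ['Bird', 'SEA']],
                     columns=['last_name', 'team'])

# Default how='inner': only keys present in both frames survive.
inner = stats.merge(teams, on='last_name')

# how='left': every row of stats is kept; unmatched rows get NaN for 'team'.
left = stats.merge(teams, on='last_name', how='left')

print(inner)
print(left)
```

In the example above every last name appears in both frames, so the inner join keeps all seven players.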
df_one = pd.read_csv('WNBA-Stats.csv')
WNBA Player stats Season 2016–2017 from Kaggle
columns_we_want = ['Name', 'Team', 'Pos', 'Age', 'GP', 'PTS', 'REB', 'AST']
df_two = df_one[columns_we_want]
df_two.sample(5) #returns 5 random rows
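Indexing with a list of column names returns a new DataFrame, while indexing with a single name returns a Series. The sketch below uses a made-up frame (`raw` and its values are illustrative, not taken from the Kaggle file) and passes `random_state` to `sample` so the random rows are repeatable:

```python
import pandas as pd

# Made-up stand-in for the CSV data.
raw = pd.DataFrame({
    'Name': ['A', 'B', 'C', 'D'],
    'Team': ['SEA', 'LA', 'SEA', 'CHI'],
    'PTS': [120, 300, 210, 95],
    'College': ['w', 'x', 'y', 'z'],
})

# A list of labels selects several columns and returns a DataFrame.
subset = raw[['Name', 'Team', 'PTS']]

# A single label returns a Series.
single = raw['Name']

# random_state fixes the seed, so the same rows come back on every run.
picked = subset.sample(2, random_state=0)

print(type(subset).__name__, type(single).__name__)
print(picked)
```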
df_three = df_two.sort_values(by='PTS', ascending=False, ignore_index=True)
data sorted from most to least points scored
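# compute from df_three itself: its index was reset by ignore_index=True, so values from df_two would align to the wrong rows
df_three['PPG'] = (df_three['PTS'] / df_three['GP']).round(2)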
new ppg column rounded to 2 decimals
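Column arithmetic in pandas is element-wise and aligned on the index, which matters once a frame has been sorted with `ignore_index=True`. A minimal sketch with made-up totals (the names `totals` and `ranked` and all numbers are illustrative):

```python
import pandas as pd

totals = pd.DataFrame({
    'Name': ['A', 'B', 'C'],
    'PTS': [120, 300, 210],
    'GP': [30, 30, 28],
})

# Sorting with ignore_index=True relabels the rows 0, 1, 2 in the new order.
ranked = totals.sort_values(by='PTS', ascending=False, ignore_index=True)

# Computing from the same frame keeps every value on its own row.
ranked['PPG'] = (ranked['PTS'] / ranked['GP']).round(2)

# The same ratio computed from the unsorted frame carries the old labels,
# so assigning it to ranked would pair values with the wrong players.
from_unsorted = (totals['PTS'] / totals['GP']).round(2)

print(ranked)
print(from_unsorted.tolist())
```

Here `ranked['PPG']` follows the sorted order, while `from_unsorted` is still in the original order even though both use the labels 0, 1, 2.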
sea = (df_three['Team'] == 'SEA')
reb = (df_three['REB'] > 100)
df_three[sea & reb]
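Each comparison produces a boolean Series, and `&` (and) or `|` (or) combines them row by row. The sketch below uses a made-up frame (`stats` and its values are illustrative) and also shows that conditions written inline need their own parentheses:

```python
import pandas as pd

stats = pd.DataFrame({
    'Name': ['A', 'B', 'C', 'D'],
    'Team': ['SEA', 'SEA', 'LA', 'SEA'],
    'REB': [150, 80, 200, 101],
})

sea = stats['Team'] == 'SEA'   # True for Seattle players
reb = stats['REB'] > 100       # True for players with more than 100 rebounds

both = stats[sea & reb]        # rows where both conditions hold
either = stats[sea | reb]      # rows where at least one condition holds

# Written inline, each condition needs parentheses because & binds
# more tightly than == and >.
inline = stats[(stats['Team'] == 'SEA') & (stats['REB'] > 100)]

print(both)
```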
# pass groupby the name of the column you want to group on
grouped = df_three.groupby('Team')
# Age is the column we want to perform our mean() function on
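One common way to finish this step is to select the column from the grouped object and call `mean()` on it. The following is a sketch on a made-up roster (`roster` and its ages are illustrative), not necessarily the exact call the original article used:

```python
import pandas as pd

roster = pd.DataFrame({
    'Team': ['SEA', 'SEA', 'LA', 'LA', 'LA'],
    'Age': [30, 26, 24, 30, 27],
})

# Group the rows by team, then average the Age column within each group.
grouped = roster.groupby('Team')
mean_age = grouped['Age'].mean()

print(mean_age)
```

The result is a Series indexed by team name, with the groups sorted alphabetically by default.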


