Evaluate the FAA Dataset

253 阅读4分钟

Evaluate the FAA Dataset

Happy coding!


1: VIew and import the dataset

#Import necessary libraries
import pandas as pd
import numpy as np
#Import the FAA (Federal Aviation Authority) dataset
ds = pd.read_csv('C:/Users/kevin.zhang/Lesson 7 -1/faa_ai_prelim/faa_ai_prelim.csv')

2: View and understand the dataset

#View the dataset shape
ds.shape
(83, 42)
#View the first five observations
ds.head(5)
.dataframe tbody tr th:only-of-type { vertical-align: middle; } .dataframe tbody tr th { vertical-align: top; } .dataframe thead th { text-align: right; }
UPDATED ENTRY_DATE EVENT_LCL_DATE EVENT_LCL_TIME LOC_CITY_NAME LOC_STATE_NAME LOC_CNTRY_NAME RMK_TEXT EVENT_TYPE_DESC FSDO_DESC ... PAX_INJ_NONE PAX_INJ_MINOR PAX_INJ_SERIOUS PAX_INJ_FATAL PAX_INJ_UNK GRND_INJ_NONE GRND_INJ_MINOR GRND_INJ_SERIOUS GRND_INJ_FATAL GRND_INJ_UNK
0 No 19-FEB-16 19-FEB-16 00:45:00Z MARSHVILLE North Carolina NaN AIRCRAFT CRASHED INTO TREES, THE 1 PERSON ON B... Accident FAA Charlotte FSDO-68 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 No 19-FEB-16 18-FEB-16 23:55:00Z TAVERNIER Florida NaN AIRCRAFT ON LANDING WENT OFF THE END OF THE RU... Incident FAA Miami FSDO-19 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 No 19-FEB-16 18-FEB-16 22:14:00Z TRENTON New Jersey NaN AIRCRAFT ON FINAL SUSTAINED A BIRD STRIKE, LAN... Incident FAA Philadelphia FSDO-17 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 No 19-FEB-16 18-FEB-16 17:10:00Z ASHEVILLE North Carolina NaN AIRCRAFT ON LANDING, GEAR COLLAPSED, ASHEVILLE... Incident FAA Charlotte FSDO-68 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 No 19-FEB-16 18-FEB-16 00:26:00Z TALKEETNA Alaska NaN AIRCRAFT ON LANDING, NOSE GEAR COLLAPSED, TALK... Incident FAA Anchorage FSDO-03 ... NaN 1.0 NaN NaN NaN NaN NaN NaN NaN NaN

5 rows × 42 columns

#View all the columns present in the dataset
ds.columns
Index(['UPDATED', 'ENTRY_DATE', 'EVENT_LCL_DATE', 'EVENT_LCL_TIME',       'LOC_CITY_NAME', 'LOC_STATE_NAME', 'LOC_CNTRY_NAME', 'RMK_TEXT',       'EVENT_TYPE_DESC', 'FSDO_DESC', 'REGIST_NBR', 'FLT_NBR', 'ACFT_OPRTR',       'ACFT_MAKE_NAME', 'ACFT_MODEL_NAME', 'ACFT_MISSING_FLAG',       'ACFT_DMG_DESC', 'FLT_ACTIVITY', 'FLT_PHASE', 'FAR_PART', 'MAX_INJ_LVL',       'FATAL_FLAG', 'FLT_CRW_INJ_NONE', 'FLT_CRW_INJ_MINOR',       'FLT_CRW_INJ_SERIOUS', 'FLT_CRW_INJ_FATAL', 'FLT_CRW_INJ_UNK',       'CBN_CRW_INJ_NONE', 'CBN_CRW_INJ_MINOR', 'CBN_CRW_INJ_SERIOUS',       'CBN_CRW_INJ_FATAL', 'CBN_CRW_INJ_UNK', 'PAX_INJ_NONE', 'PAX_INJ_MINOR',       'PAX_INJ_SERIOUS', 'PAX_INJ_FATAL', 'PAX_INJ_UNK', 'GRND_INJ_NONE',       'GRND_INJ_MINOR', 'GRND_INJ_SERIOUS', 'GRND_INJ_FATAL', 'GRND_INJ_UNK'],
      dtype='object')

3: Extract the following attributes from the dataset:

  1. Aircraft make name
  2. State name
  3. Aircraft model name
  4. Text information
  5. Flight phase
  6. Event description type
  7. Fatal flag
#Create a new dataframe with only the required columns
df = pd.DataFrame(ds, columns=['ACFT_MAKE_NAME','LOC_STATE_NAME','ACFT_MODEL_NAME','RMK_TEXT','FLT_PHASE','EVENT_TYPE_DESC','FATAL_FLAG'])
#View the type of the object
type(df)
pandas.core.frame.DataFrame
#Check if the dataframe contains all the required attributes
df.head()
.dataframe tbody tr th:only-of-type { vertical-align: middle; } .dataframe tbody tr th { vertical-align: top; } .dataframe thead th { text-align: right; }
ACFT_MAKE_NAME LOC_STATE_NAME ACFT_MODEL_NAME RMK_TEXT FLT_PHASE EVENT_TYPE_DESC FATAL_FLAG
0 BEECH North Carolina 36 AIRCRAFT CRASHED INTO TREES, THE 1 PERSON ON B... UNKNOWN (UNK) Accident Yes
1 VANS Florida RV7 AIRCRAFT ON LANDING WENT OFF THE END OF THE RU... LANDING (LDG) Incident NaN
2 CESSNA New Jersey 172 AIRCRAFT ON FINAL SUSTAINED A BIRD STRIKE, LAN... APPROACH (APR) Incident NaN
3 LANCAIR North Carolina 235 AIRCRAFT ON LANDING, GEAR COLLAPSED, ASHEVILLE... LANDING (LDG) Incident NaN
4 CESSNA Alaska 172 AIRCRAFT ON LANDING, NOSE GEAR COLLAPSED, TALK... LANDING (LDG) Incident NaN

4. Clean the dataset and replace the fatal flag NaN with “No”

#Replace all Fatal Flag missing values with the required output
df['FATAL_FLAG'].fillna(value='No',inplace=True)
#Verify if the missing values are replaced
df.head()
.dataframe tbody tr th:only-of-type { vertical-align: middle; } .dataframe tbody tr th { vertical-align: top; } .dataframe thead th { text-align: right; }
ACFT_MAKE_NAME LOC_STATE_NAME ACFT_MODEL_NAME RMK_TEXT FLT_PHASE EVENT_TYPE_DESC FATAL_FLAG
0 BEECH North Carolina 36 AIRCRAFT CRASHED INTO TREES, THE 1 PERSON ON B... UNKNOWN (UNK) Accident Yes
1 VANS Florida RV7 AIRCRAFT ON LANDING WENT OFF THE END OF THE RU... LANDING (LDG) Incident No
2 CESSNA New Jersey 172 AIRCRAFT ON FINAL SUSTAINED A BIRD STRIKE, LAN... APPROACH (APR) Incident No
3 LANCAIR North Carolina 235 AIRCRAFT ON LANDING, GEAR COLLAPSED, ASHEVILLE... LANDING (LDG) Incident No
4 CESSNA Alaska 172 AIRCRAFT ON LANDING, NOSE GEAR COLLAPSED, TALK... LANDING (LDG) Incident No
#Check the number of observations
df.shape
(83, 7)

5. Remove all the observations where aircraft names are not available

#Drop the unwanted values/observations from the dataset
df = df.dropna(subset=['ACFT_MAKE_NAME'])

6. Find the aircraft types and their occurrences in the dataset

#Check the number of observations now to compare it with the original dataset and see how many values have been dropped
df.shape
(78, 7)
#Group the dataset by aircraft name
df_groupby_ACFT_MAKE_NAME = df.groupby( by='ACFT_MAKE_NAME')
#View the number of times each aircraft type appears in the dataset (Hint: use the size() method)
df_groupby_ACFT_MAKE_NAME.size()
ACFT_MAKE_NAME
AERO COMMANDER             1
AERONCA                    1
AEROSTAR INTERNATIONAL     1
AIRBUS                     1
BEECH                      9
BELL                       2
BOEING                     3
CESSNA                    23
CHAMPION                   2
CHRISTEN                   1
CONSOLIDATED VULTEE        1
EMBRAER                    1
ENSTROM                    1
FAIRCHILD                  1
FLIGHT DESIGN              1
GLOBE                      1
GREAT LAKES                1
GRUMMAN                    1
GULFSTREAM                 1
HUGHES                     1
LANCAIR                    2
MAULE                      1
MOONEY                     4
NORTH AMERICAN             1
PIPER                     10
PITTS                      1
SAAB                       1
SABRELINER                 1
SOCATA                     2
VANS                       1
dtype: int64

7: Display the observations where fatal flag is “Yes”

#Group the dataset by fatal flag
df_groupby_FATAL_FLAG = df.groupby( by='FATAL_FLAG')
#View the total number of fatal and non-fatal accidents
df_groupby_FATAL_FLAG.size()
FATAL_FLAG
No     71
Yes     7
dtype: int64
#Create a new dataframe to view only the fatal accidents (Fatal Flag values = Yes)
df1 = df[df['FATAL_FLAG']=='Yes']
df1
.dataframe tbody tr th:only-of-type { vertical-align: middle; } .dataframe tbody tr th { vertical-align: top; } .dataframe thead th { text-align: right; }
ACFT_MAKE_NAME LOC_STATE_NAME ACFT_MODEL_NAME RMK_TEXT FLT_PHASE EVENT_TYPE_DESC FATAL_FLAG
0 BEECH North Carolina 36 AIRCRAFT CRASHED INTO TREES, THE 1 PERSON ON B... UNKNOWN (UNK) Accident Yes
53 PIPER Florida PA28 AIRCRAFT CRASHED UNDER UNKNOWN CIRCUMSTANCES. ... UNKNOWN (UNK) Accident Yes
55 FLIGHT DESIGN California CTLS AIRCRAFT CRASHED UNDER UNKNOWN CIRCUMSTANCES A... UNKNOWN (UNK) Accident Yes
79 NORTH AMERICAN Arizona F51 AIRCRAFT CRASHED UNDER UNKNOWN CIRCUMSTANCES, ... UNKNOWN (UNK) Accident Yes
80 CHAMPION California 8KCAB N9872R, BEECH M35 AIRCRAFT, AND N5057G, BELLAN... UNKNOWN (UNK) Accident Yes
81 BEECH California 35 N9872R, BEECH M35 AIRCRAFT, AND N5057G, BELLAN... UNKNOWN (UNK) Accident Yes
82 CESSNA Alabama 182 N784CP AIRCRAFT CRASHED INTO A WOODED AREA NEA... UNKNOWN (UNK) Accident Yes
df_groupby_FATAL_FLAG.get_group('Yes')
.dataframe tbody tr th:only-of-type { vertical-align: middle; } .dataframe tbody tr th { vertical-align: top; } .dataframe thead th { text-align: right; }
ACFT_MAKE_NAME LOC_STATE_NAME ACFT_MODEL_NAME RMK_TEXT FLT_PHASE EVENT_TYPE_DESC FATAL_FLAG
0 BEECH North Carolina 36 AIRCRAFT CRASHED INTO TREES, THE 1 PERSON ON B... UNKNOWN (UNK) Accident Yes
53 PIPER Florida PA28 AIRCRAFT CRASHED UNDER UNKNOWN CIRCUMSTANCES. ... UNKNOWN (UNK) Accident Yes
55 FLIGHT DESIGN California CTLS AIRCRAFT CRASHED UNDER UNKNOWN CIRCUMSTANCES A... UNKNOWN (UNK) Accident Yes
79 NORTH AMERICAN Arizona F51 AIRCRAFT CRASHED UNDER UNKNOWN CIRCUMSTANCES, ... UNKNOWN (UNK) Accident Yes
80 CHAMPION California 8KCAB N9872R, BEECH M35 AIRCRAFT, AND N5057G, BELLAN... UNKNOWN (UNK) Accident Yes
81 BEECH California 35 N9872R, BEECH M35 AIRCRAFT, AND N5057G, BELLAN... UNKNOWN (UNK) Accident Yes
82 CESSNA Alabama 182 N784CP AIRCRAFT CRASHED INTO A WOODED AREA NEA... UNKNOWN (UNK) Accident Yes