Evaluate the FAA Dataset
Happy coding!
1: VIew and import the dataset
import pandas as pd
import numpy as np
ds = pd.read_csv('C:/Users/kevin.zhang/Lesson 7 -1/faa_ai_prelim/faa_ai_prelim.csv')
2: View and understand the dataset
ds.shape
(83, 42)
ds.head(5)
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
|
UPDATED |
ENTRY_DATE |
EVENT_LCL_DATE |
EVENT_LCL_TIME |
LOC_CITY_NAME |
LOC_STATE_NAME |
LOC_CNTRY_NAME |
RMK_TEXT |
EVENT_TYPE_DESC |
FSDO_DESC |
... |
PAX_INJ_NONE |
PAX_INJ_MINOR |
PAX_INJ_SERIOUS |
PAX_INJ_FATAL |
PAX_INJ_UNK |
GRND_INJ_NONE |
GRND_INJ_MINOR |
GRND_INJ_SERIOUS |
GRND_INJ_FATAL |
GRND_INJ_UNK |
| 0 |
No |
19-FEB-16 |
19-FEB-16 |
00:45:00Z |
MARSHVILLE |
North Carolina |
NaN |
AIRCRAFT CRASHED INTO TREES, THE 1 PERSON ON B... |
Accident |
FAA Charlotte FSDO-68 |
... |
NaN |
NaN |
NaN |
NaN |
NaN |
NaN |
NaN |
NaN |
NaN |
NaN |
| 1 |
No |
19-FEB-16 |
18-FEB-16 |
23:55:00Z |
TAVERNIER |
Florida |
NaN |
AIRCRAFT ON LANDING WENT OFF THE END OF THE RU... |
Incident |
FAA Miami FSDO-19 |
... |
NaN |
NaN |
NaN |
NaN |
NaN |
NaN |
NaN |
NaN |
NaN |
NaN |
| 2 |
No |
19-FEB-16 |
18-FEB-16 |
22:14:00Z |
TRENTON |
New Jersey |
NaN |
AIRCRAFT ON FINAL SUSTAINED A BIRD STRIKE, LAN... |
Incident |
FAA Philadelphia FSDO-17 |
... |
NaN |
NaN |
NaN |
NaN |
NaN |
NaN |
NaN |
NaN |
NaN |
NaN |
| 3 |
No |
19-FEB-16 |
18-FEB-16 |
17:10:00Z |
ASHEVILLE |
North Carolina |
NaN |
AIRCRAFT ON LANDING, GEAR COLLAPSED, ASHEVILLE... |
Incident |
FAA Charlotte FSDO-68 |
... |
NaN |
NaN |
NaN |
NaN |
NaN |
NaN |
NaN |
NaN |
NaN |
NaN |
| 4 |
No |
19-FEB-16 |
18-FEB-16 |
00:26:00Z |
TALKEETNA |
Alaska |
NaN |
AIRCRAFT ON LANDING, NOSE GEAR COLLAPSED, TALK... |
Incident |
FAA Anchorage FSDO-03 |
... |
NaN |
1.0 |
NaN |
NaN |
NaN |
NaN |
NaN |
NaN |
NaN |
NaN |
5 rows × 42 columns
ds.columns
Index(['UPDATED', 'ENTRY_DATE', 'EVENT_LCL_DATE', 'EVENT_LCL_TIME', 'LOC_CITY_NAME', 'LOC_STATE_NAME', 'LOC_CNTRY_NAME', 'RMK_TEXT', 'EVENT_TYPE_DESC', 'FSDO_DESC', 'REGIST_NBR', 'FLT_NBR', 'ACFT_OPRTR', 'ACFT_MAKE_NAME', 'ACFT_MODEL_NAME', 'ACFT_MISSING_FLAG', 'ACFT_DMG_DESC', 'FLT_ACTIVITY', 'FLT_PHASE', 'FAR_PART', 'MAX_INJ_LVL', 'FATAL_FLAG', 'FLT_CRW_INJ_NONE', 'FLT_CRW_INJ_MINOR', 'FLT_CRW_INJ_SERIOUS', 'FLT_CRW_INJ_FATAL', 'FLT_CRW_INJ_UNK', 'CBN_CRW_INJ_NONE', 'CBN_CRW_INJ_MINOR', 'CBN_CRW_INJ_SERIOUS', 'CBN_CRW_INJ_FATAL', 'CBN_CRW_INJ_UNK', 'PAX_INJ_NONE', 'PAX_INJ_MINOR', 'PAX_INJ_SERIOUS', 'PAX_INJ_FATAL', 'PAX_INJ_UNK', 'GRND_INJ_NONE', 'GRND_INJ_MINOR', 'GRND_INJ_SERIOUS', 'GRND_INJ_FATAL', 'GRND_INJ_UNK'],
dtype='object')
3: Extract the following attributes from the dataset:
- Aircraft make name
- State name
- Aircraft model name
- Text information
- Flight phase
- Event description type
- Fatal flag
df = pd.DataFrame(ds, columns=['ACFT_MAKE_NAME','LOC_STATE_NAME','ACFT_MODEL_NAME','RMK_TEXT','FLT_PHASE','EVENT_TYPE_DESC','FATAL_FLAG'])
type(df)
pandas.core.frame.DataFrame
df.head()
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
|
ACFT_MAKE_NAME |
LOC_STATE_NAME |
ACFT_MODEL_NAME |
RMK_TEXT |
FLT_PHASE |
EVENT_TYPE_DESC |
FATAL_FLAG |
| 0 |
BEECH |
North Carolina |
36 |
AIRCRAFT CRASHED INTO TREES, THE 1 PERSON ON B... |
UNKNOWN (UNK) |
Accident |
Yes |
| 1 |
VANS |
Florida |
RV7 |
AIRCRAFT ON LANDING WENT OFF THE END OF THE RU... |
LANDING (LDG) |
Incident |
NaN |
| 2 |
CESSNA |
New Jersey |
172 |
AIRCRAFT ON FINAL SUSTAINED A BIRD STRIKE, LAN... |
APPROACH (APR) |
Incident |
NaN |
| 3 |
LANCAIR |
North Carolina |
235 |
AIRCRAFT ON LANDING, GEAR COLLAPSED, ASHEVILLE... |
LANDING (LDG) |
Incident |
NaN |
| 4 |
CESSNA |
Alaska |
172 |
AIRCRAFT ON LANDING, NOSE GEAR COLLAPSED, TALK... |
LANDING (LDG) |
Incident |
NaN |
4. Clean the dataset and replace the fatal flag NaN with “No”
df['FATAL_FLAG'].fillna(value='No',inplace=True)
df.head()
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
|
ACFT_MAKE_NAME |
LOC_STATE_NAME |
ACFT_MODEL_NAME |
RMK_TEXT |
FLT_PHASE |
EVENT_TYPE_DESC |
FATAL_FLAG |
| 0 |
BEECH |
North Carolina |
36 |
AIRCRAFT CRASHED INTO TREES, THE 1 PERSON ON B... |
UNKNOWN (UNK) |
Accident |
Yes |
| 1 |
VANS |
Florida |
RV7 |
AIRCRAFT ON LANDING WENT OFF THE END OF THE RU... |
LANDING (LDG) |
Incident |
No |
| 2 |
CESSNA |
New Jersey |
172 |
AIRCRAFT ON FINAL SUSTAINED A BIRD STRIKE, LAN... |
APPROACH (APR) |
Incident |
No |
| 3 |
LANCAIR |
North Carolina |
235 |
AIRCRAFT ON LANDING, GEAR COLLAPSED, ASHEVILLE... |
LANDING (LDG) |
Incident |
No |
| 4 |
CESSNA |
Alaska |
172 |
AIRCRAFT ON LANDING, NOSE GEAR COLLAPSED, TALK... |
LANDING (LDG) |
Incident |
No |
df.shape
(83, 7)
5. Remove all the observations where aircraft names are not available
df = df.dropna(subset=['ACFT_MAKE_NAME'])
6. Find the aircraft types and their occurrences in the dataset
df.shape
(78, 7)
df_groupby_ACFT_MAKE_NAME = df.groupby( by='ACFT_MAKE_NAME')
df_groupby_ACFT_MAKE_NAME.size()
ACFT_MAKE_NAME
AERO COMMANDER 1
AERONCA 1
AEROSTAR INTERNATIONAL 1
AIRBUS 1
BEECH 9
BELL 2
BOEING 3
CESSNA 23
CHAMPION 2
CHRISTEN 1
CONSOLIDATED VULTEE 1
EMBRAER 1
ENSTROM 1
FAIRCHILD 1
FLIGHT DESIGN 1
GLOBE 1
GREAT LAKES 1
GRUMMAN 1
GULFSTREAM 1
HUGHES 1
LANCAIR 2
MAULE 1
MOONEY 4
NORTH AMERICAN 1
PIPER 10
PITTS 1
SAAB 1
SABRELINER 1
SOCATA 2
VANS 1
dtype: int64
7: Display the observations where fatal flag is “Yes”
df_groupby_FATAL_FLAG = df.groupby( by='FATAL_FLAG')
df_groupby_FATAL_FLAG.size()
FATAL_FLAG
No 71
Yes 7
dtype: int64
df1 = df[df['FATAL_FLAG']=='Yes']
df1
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
|
ACFT_MAKE_NAME |
LOC_STATE_NAME |
ACFT_MODEL_NAME |
RMK_TEXT |
FLT_PHASE |
EVENT_TYPE_DESC |
FATAL_FLAG |
| 0 |
BEECH |
North Carolina |
36 |
AIRCRAFT CRASHED INTO TREES, THE 1 PERSON ON B... |
UNKNOWN (UNK) |
Accident |
Yes |
| 53 |
PIPER |
Florida |
PA28 |
AIRCRAFT CRASHED UNDER UNKNOWN CIRCUMSTANCES. ... |
UNKNOWN (UNK) |
Accident |
Yes |
| 55 |
FLIGHT DESIGN |
California |
CTLS |
AIRCRAFT CRASHED UNDER UNKNOWN CIRCUMSTANCES A... |
UNKNOWN (UNK) |
Accident |
Yes |
| 79 |
NORTH AMERICAN |
Arizona |
F51 |
AIRCRAFT CRASHED UNDER UNKNOWN CIRCUMSTANCES, ... |
UNKNOWN (UNK) |
Accident |
Yes |
| 80 |
CHAMPION |
California |
8KCAB |
N9872R, BEECH M35 AIRCRAFT, AND N5057G, BELLAN... |
UNKNOWN (UNK) |
Accident |
Yes |
| 81 |
BEECH |
California |
35 |
N9872R, BEECH M35 AIRCRAFT, AND N5057G, BELLAN... |
UNKNOWN (UNK) |
Accident |
Yes |
| 82 |
CESSNA |
Alabama |
182 |
N784CP AIRCRAFT CRASHED INTO A WOODED AREA NEA... |
UNKNOWN (UNK) |
Accident |
Yes |
df_groupby_FATAL_FLAG.get_group('Yes')
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
|
ACFT_MAKE_NAME |
LOC_STATE_NAME |
ACFT_MODEL_NAME |
RMK_TEXT |
FLT_PHASE |
EVENT_TYPE_DESC |
FATAL_FLAG |
| 0 |
BEECH |
North Carolina |
36 |
AIRCRAFT CRASHED INTO TREES, THE 1 PERSON ON B... |
UNKNOWN (UNK) |
Accident |
Yes |
| 53 |
PIPER |
Florida |
PA28 |
AIRCRAFT CRASHED UNDER UNKNOWN CIRCUMSTANCES. ... |
UNKNOWN (UNK) |
Accident |
Yes |
| 55 |
FLIGHT DESIGN |
California |
CTLS |
AIRCRAFT CRASHED UNDER UNKNOWN CIRCUMSTANCES A... |
UNKNOWN (UNK) |
Accident |
Yes |
| 79 |
NORTH AMERICAN |
Arizona |
F51 |
AIRCRAFT CRASHED UNDER UNKNOWN CIRCUMSTANCES, ... |
UNKNOWN (UNK) |
Accident |
Yes |
| 80 |
CHAMPION |
California |
8KCAB |
N9872R, BEECH M35 AIRCRAFT, AND N5057G, BELLAN... |
UNKNOWN (UNK) |
Accident |
Yes |
| 81 |
BEECH |
California |
35 |
N9872R, BEECH M35 AIRCRAFT, AND N5057G, BELLAN... |
UNKNOWN (UNK) |
Accident |
Yes |
| 82 |
CESSNA |
Alabama |
182 |
N784CP AIRCRAFT CRASHED INTO A WOODED AREA NEA... |
UNKNOWN (UNK) |
Accident |
Yes |