Service Sector Summary
SCM330: SS24
2025-02-03
Assignment Introduction
You will develop a data-driven summary of a service sector. The deliverables are a written report and a presentation during the last week of class (or finals week). We may also have mini-updates throughout the course as the schedule allows.
The purpose of this assignment is twofold:
(1) To give you practice scraping internet data and expose you to the inherent challenges of analyzing raw data; most of the other data you are given for cases is already clean.
(2) To expose the class to the differences between service sectors.
To accomplish this, you will summarize a service sector of your choice using (primarily) data from the Bureau of Labor Statistics (BLS). I will provide a survey for you to indicate your preferences as to which service sector you’d like to study. I will give preference to those who have been, are, or will be employed in a specific service sector (below the subsector).
The BLS classifies services into the following supersectors and sectors as shown here and listed below (note that you should choose a subsector with a 3-digit NAICS code, e.g., Truck Transportation: NAICS 484):
• Trade, Transportation, and Utilities
– Wholesale Trade (NAICS 42)
– Retail Trade (NAICS 44-45)
– Transportation and Warehousing (NAICS 48-49)
– Utilities (NAICS 22)
• Information
– Information (NAICS 51)
• Financial Activities
– Finance and Insurance (NAICS 52)
– Real Estate and Rental and Leasing (NAICS 53)
• Professional and Business Services
– Professional, Scientific, and Technical Services (NAICS 54)
– Management of Companies and Enterprises (NAICS 55)
– Administrative and Support and Waste Management and Remediation Services (NAICS 56)
• Education and Health Services
– Educational Services (NAICS 61)
– Health Care and Social Assistance (NAICS 62)
• Leisure and Hospitality
– Arts, Entertainment, and Recreation (NAICS 71)
– Accommodation and Food Services (NAICS 72)
• Other Services (except Public Administration)
– Other Services (except Public Administration) (NAICS 81)
Report Instructions
Your analysis report and presentation must be reproducible. That means it must be completed in code (not Excel) and you must access the BLS data through the API (see instructions below). This can be done on DataCamp DataLab or through an interactive notebook (R Markdown, Quarto, Jupyter Notebooks, etc.). I will run your code and build your analysis from scratch.
The minimum requirements, completing all of which will earn an undergraduate a “B”, include:
• Identify the primary customer inputs for businesses in this sector and classify them according to the UST
• Download and analyse employment and wage data for your subsector
• Run and interpret at least 1 regression, including assessment of assumptions
• Make at least 1 custom visualization
• Summarize some key recent news or trends in your subsector.
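As a rough illustration of what "run and interpret a regression, including assessment of assumptions" might look like, here is a minimal sketch in base R. The data are simulated (made up) purely for illustration and are not BLS data; the variable names `toy`, `employment`, and `fit` are hypothetical. Your own analysis may call for a different model and different checks.

```r
# Simulated (made-up) data: a linear trend plus noise. NOT real BLS data.
set.seed(330)
toy <- data.frame(year = 2004:2023)
toy$employment <- 5000 + 40 * (toy$year - 2004) + rnorm(nrow(toy), sd = 25)

# Fit a simple linear regression of employment on year
fit <- lm(employment ~ year, data = toy)
print(summary(fit))

# Assess assumptions using the residuals
res <- residuals(fit)
plot(fit, which = 1)       # residuals vs. fitted: linearity and constant variance
print(shapiro.test(res))   # Shapiro-Wilk test for normality of residuals
acf(res)                   # autocorrelation: independence is often violated in time series
```

With real time series data, expect the autocorrelation check to matter most, since consecutive months or years are rarely independent.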
If you are a graduate student, you have an additional task to achieve a “B”.
• If you are an MSBA student, you must apply an analytical method not discussed in this class. You will explain this method to the class in your presentation.
• If you are an MBA student, you must profile at least 1 company in the sector including a summary of their competitive strategy.
All students desiring an “A” should bring in additional context using extra data. There are many different datasets on the BLS website alone, not to mention other government data. News articles often cite government data in the captions of figures if you need inspiration. You also have free access to Statistica and many other data sources because Lehigh pays for them (Note: you probably need to be on Lehigh WiFi or the VPN to gain full access). Other open-source data APIs can be found on CRAN/Task Views/OfficialStatistics. As an example, if studying the Accommodation and Food Services sector, you could use data from Statistica to analyse the change in customer satisfaction index scores of Starbucks.
TRAC Fellows
You are fortunate to have access to TRAC Fellows for this assignment. Since this is a semester-long project with some open-ended requirements, you can take it in many different directions. The TRAC Fellows are trained to help you with this precise kind of assignment. Please check the Syllabus for more details about the TRAC program, what the fellows will and will not do, and how to get the most out of your interactions.
Example BLS Data Pull
I’m going to look at the employment in the Other Services (except Public Administration) sector. To get this data, I first need to install the blsR package. Documentation about the package can be found here.
install.packages('blsR')
I then need to load the package (I’m also loading the tidyverse for analysis purposes).
library(blsR)
library(tidyverse)
Take a look at the instructions for the main function we’ll use to query the BLS API by running the following command:
?get_series_table
You’ll notice we need to provide a few pieces of information to the function: series id, api key, and, although optional, start and end years. Let’s start by setting our API key for the session. You will need to get your own API key by following this link.
# Manually set API key for session
bls_set_key("enter_your_key_as_a_character")
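Because your code will be re-run by someone else, you may prefer not to hard-code your key in the notebook. One possible approach, sketched below, is to read it from an environment variable. The helper name `read_bls_key` and the variable name `BLS_API_KEY` are assumptions made for this illustration, not something the assignment requires.

```r
# Hypothetical helper: read the API key from an environment variable
# so the key itself never appears in the notebook.
read_bls_key <- function(var = "BLS_API_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("Environment variable ", var, " is not set.", call. = FALSE)
  }
  key
}

# Usage (assumes blsR is loaded and the variable was set beforehand,
# e.g. in your .Renviron file):
# bls_set_key(read_bls_key())
```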
One thing to note is that the BLS API limits the number of years within a time series to 20 years. If you try more than 20 years, you’ll get an error.
So, if the time series we’re interested in is longer than 20 years, we can build our own function to iteratively query the API in 20-year increments by breaking our time series into chunks of start and end years. We can then compile a cumulative data frame using a function from the purrr package, which is part of the tidyverse.
# Create a function to iterate the API query.
# Note: `series` is read from the global environment (it is set below).
bls_query <- function(start, end) {
  df <- get_series_table(
    series_id = series,
    api_key = bls_get_key(),
    start_year = start,
    end_year = end) |>
    arrange(year, period)
  return(df)
}
Now that we’ve specified the query function, here’s how to get employment data from the Other Services (except Public Administration) sector for the years 1939 to 2023:
# Time series information
series <- 'CES8000000001'
time_series_start <- 1939
time_series_end <- 2023
# Total time series
timeseries <- seq(time_series_start, time_series_end, by = 1)
# Split the time series into increments of 20 years
increments <- split(timeseries, ceiling(seq_along(timeseries) / 20))
# Set start and end years for the increments
start_years <- sapply(increments, head, 1); print(start_years)
##    1    2    3    4    5
## 1939 1959 1979 1999 2019
end_years <- sapply(increments, tail, 1); print(end_years)
##    1    2    3    4    5
## 1958 1978 1998 2018 2023
# Compile a data frame for the time series
OtherServices_employment_seasonallyAdj <-
  map2_dfr(start_years, end_years, bls_query)
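If you plan to pull several series, the chunking steps above could be wrapped in one small helper. This is only a sketch in base R; the function name `year_chunks` and its arguments are hypothetical, and it makes no API calls itself.

```r
# Hypothetical helper: split a range of years into consecutive chunks
# of at most `size` years, returning one row per chunk.
year_chunks <- function(first_year, last_year, size = 20) {
  years <- seq(first_year, last_year, by = 1)
  groups <- split(years, ceiling(seq_along(years) / size))
  data.frame(
    start = unname(sapply(groups, head, 1)),
    end   = unname(sapply(groups, tail, 1))
  )
}

chunks <- year_chunks(1939, 2023)
print(chunks)

# Usage with the query function defined earlier (requires blsR and an API key):
# map2_dfr(chunks$start, chunks$end, bls_query)
```

The start and end columns match the `start_years` and `end_years` vectors printed earlier, so the helper can stand in for those intermediate steps.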
To find this employment data for the Other Services (except Public Administration) sector on the BLS website, I start at the BLS website for service sectors, then click on the Other Services (except Public Administration) sector link, and then click the “Back Data” button next to Employment, all employees (seasonally adjusted).
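When you look up the series for your own subsector, it can help to read the series ID piece by piece. The sketch below splits a national employment (CES) series ID into its parts, following the layout described in the BLS series ID format documentation as I understand it: a 2-character prefix, a 1-character seasonal adjustment code, a 2-digit supersector code, a 6-digit industry code, and a 2-digit data type code. The function name `parse_ces_id` is hypothetical, and you should confirm the layout against the BLS documentation for the survey you use.

```r
# Hypothetical helper: split a 13-character CES series ID into its fields.
# Field positions are an assumption based on the BLS series ID format page.
parse_ces_id <- function(id) {
  stopifnot(nchar(id) == 13, substr(id, 1, 2) == "CE")
  list(
    prefix      = substr(id, 1, 2),   # survey prefix
    seasonal    = substr(id, 3, 3),   # S = seasonally adjusted, U = unadjusted
    supersector = substr(id, 4, 5),   # supersector code
    industry    = substr(id, 6, 11),  # industry code within the supersector
    data_type   = substr(id, 12, 13)  # data type code, e.g. all employees
  )
}

parts <- parse_ces_id("CES8000000001")
str(parts)
```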