Design and Implementation of File System Project 3: Design

Project 3: Design and Implementation of File System
April 16, 2024
1 Objectives

Design and implement a basic disk-like secondary storage server.
Design and implement a basic ffle system to act as a client using the disk services provided by
the server designed above.
Study and learn to use the socket API. Sockets are used to provide the communication mechanism
between (i) the client processes and the ffle system, and (ii) the ffle system and the disk storage
server.
2 Problem Statement
This project will be developed in several steps to help you understand the concepts and encourage
modular project development. You must adhere to the speciffcations given below. To share your
experiences effectively, you are recommended to keep a record of your insights during the coding. This
includes writing down any challenges encountered or bugs costing you a long time for debugging. By
doing so, you’ll avoid forgetting the details of your experiences when writing the report.
Step 1: Design a basic disk-storage system
Description: Implement, as an Internet-domain socket server, a simulation of a physical disk. The
simulated disk is organized by cylinder and sector.
• You should include the seek time in your simulation to account for track-to-track time (using
usleep(3C), nanosleep(3R), etc). Make the seek time a value in microseconds, passed as a
command-line parameter to the disk server program.
• You should accept the number of cylinders and the number of sectors per cylinder from command
line par代写Design and Implementation of File System ameters. Assume the sector size is ffxed as 256 Bytes.
• Your simulation should store the actual data in a real disk ffle, so you’ll need to accept a fflename
for this ffle as another command line option.
Hints:
You may ffnd that the mmap(2) system call provides you with the easiest way of manipulating
the actual storage. However, you are allowed to use ffle and ffle API to simulate the storage
representing your disk.
Given the challenges of socket programming, we have provided some functions. Please refer to
the code template provided on Canvas for how to use these functions. If you believe there are
bugs in the provided code, please contact the TA on time.
The Disk Server: The server should be able to understand the following commands and give
corresponding responses:
• I: Information request. The disk returns two integers representing the disk geometry - the number
of cylinders, and the number of sectors per cylinder.
• R c s: Read request for the contents of cylinder c sector s.
– The disk returns Yes followed by a whitespace and those 256 bytes of information, or No
if no such block exists.
– Hint: This will return whatever data happens to be on the disk in a given sector, even if
nothing has ever been explicitly written before.
• W c s l data: Write a request for cylinder c sector s. l is the number of bytes being provided,
with a maximum of 256. The data is those l bytes of data.
– The disk returns Yes to the client if it is a valid write request (legal values of c, s, and l),
or returns a No otherwise.
– In cases where l < 256, the contents of those bytes of the sector between byte l and byte
256 are undeffned, use zero-padded contents.
Hints:
The data format that you must use for c, s, and l above is a regular ASCII string, followed by
a space character. So, for example, a read request for the contents of sector 17 of cylinder 130
would look like R 130 17.
Strings are not equivalent to byte arrays: in a string, 0 represents the end of the string, but
this is not the case in a byte array. Besides, many bytes cannot be printed as visible characters.
Therefore, DO NOT use functions like strlen() or printf() on the data. The disk server will
not interact with users directly, so writing raw bytes to the ffle descriptor is acceptable.
The Disk Client: The client that you write here is mostly for the unit testing of the disk module.
The actual front-end user of this “disk” will be the ffle system implemented in the later steps. The
client should work in a loop, having the user type commands in the format of the above protocol, send
the commands to the disk server, and display the results to the user. Besides, you need to deal with
ASCII invisible characters. For example, add a command that prints data in binary or hexadecimal.
Otherwise, it would be difffcult to debug your disk server without such a debugging tool.
Command line:
a) Server:
./BDS <#cylinders> <#sector per cylinder> <port=10356>
b) Client:
./BDC <port=10356>
2Figure 1: File System Architecture
Step 2: Design a basic ffle system
Implement an inode ffle system. The ffle system should provide operations including:
• Initialize the ffle system
• Create a ffle
• Read data from a ffle
• Write the given data to a ffle
• Append data to a ffle
• Remove a ffle
• Create directories
To provide the above ffle-like concepts, you need to operate on more than just the raw block
numbers that your disk server provides to you. You need to keep track of which blocks of storage are
allocated to which ffle, and the free blocks available on the disk.
Free space management involves maintaining a list of free blocks available on the disk. Two
alternative designs are suggested: a bit vector (1 bit per block) or a chain of free blocks. Associated
with each block is a cylinder# and sector#. “Writing to a ffle” is converted to “writing into a disk
location identiffed by (cylinder#, sector#)”. All this information should be stored on the disk, as the
ffle-system module in the memory is not persistent, and it could be shut down and restarted with data
lost.
The ffle system needs to support ffles of at least 16 KB in size, and fflenames must be at least 8
Bytes long.
Implement this ffle system server as another UNIX-domain socket server. On one hand, this
program will be a server for one UNIX-domain socket; on the other hand, it will be a client to the disk
server UNIX-domain socket from the previous parts, as shown below.
3The File-system Protocol: The ffle system server should be able to accept and respond to the
following commands.
• f: Format. This will format the ffle system on the disk, by initializing any/all of the tables that
the ffle system relies on.
• mk f : Create a ffle. This will create a ffle named f in the ffle system.
• mkdir d: Create a directory. This will create a subdirectory named d in the current directory.
• rm f : Delete ffle. This will delete the ffle named f from the current directory.
• cd path: Change the directory. This will change the current working directory to the path. The
path is in the format in Linux, which can be either a relative path or an absolute path. When the
ffle system starts, the initial working path is “/”. You need to handle “.” and “..” as special
cases.
• rmdir d: Delete a directory. This will delete the subdirectory named d within the current
directory.
• ls: Directory listing. This will return a listing of the ffles and directories in the current directory.
You are also required to return other meta information, such as ffle size, last update time, etc.
• cat f : Catch ffle. This will read the ffle named f, and return the data in it.
• w f l data: Write data into ffle. This will overwrite the contents of the ffle named f with the
l bytes of data. If the new data is longer than the data previously in the ffle, the ffle will be
extended longer. If the new data is shorter than the data previously in the ffle, the ffle will be
truncated to the new length.
• i f pos l data: Insert data to a ffle. This will insert l bytes of data into the ffle after the pos−1
th
character but before the pos
th
character (0-indexed). If the pos is larger than the size of the ffle,
append the l bytes of data to the end of the ffle.
• d f pos l: Delete data in the ffle. This will delete l bytes starting from the pos character
(0-indexed), or till the end of the ffle (if l is larger than the remaining length of the ffle).
• e: Exit the ffle system.
For testing/demonstration purposes, you need to implement a command-line client, similar to the
one that you wrote for the disk server in Step 1. The ffle-system client should work in a loop, allowing
the user to type commands in the format of the above protocol, send the commands to the ffle server,
and display the results to the user.
Hints: We suggest following the below steps to implement your ffle system:
• Implement an in-memory ffle system, reading commands via STDIN, where the data is stored in
memory and disappears after the system reboot.
• Write the data to a raw ffle to achieve the data persistence.
• Implement the ffle system as a server, accepting commands passed through a client.
4• Replace the raw ffle with the disk server implemented in Step 1. This should only require
modifying a few lines of code.
Command line:
a) Disk Server:
./BDS <#cylinders> <#sectors per cylinder> <port=10356>
b) File-system Server:
./FS <BDSPort=10356> <FSPort=12356>
c) File-system Client:
./FC <FSPort=12356>
Step 3: Support multiple users in the ffle system
A real-world ffle system is supposed to store data for more than one user. In this step, you need to
make some changes to your ffle system so that it becomes a multi-user system. The key problem to
solve here is to separate the stored ffles and data for different users. Instead, how to support the
parallel operations by multiple clients is considered in the “Extension 2 - Multi-Clients” in Step 4 as
an optional extension. In other words, in this step, you can safely assume different users only connect
to and access the ffle system in sequential order. One client will ffnish the actions and disconnect
before another client establishes her/his connection and performs actions.
To support the multi-user feature, we ffrst need to design a data structure to store the user information,
which should be stored in the ffle system, instead of memory. Then you need to provide
approaches and interfaces to log into your ffle system, create new users, delete users, etc.
Each user should have her/his home folder, own ffles, and so on. Regarding the ffle access control,
the users can share some ffles with other users, or make ffles only visible to themselves. You need to
design a ffle access protocol to control ffle read/write permissions.
Step 4: Optional extensions
In this step, we provide some optional extra functions to the basic ffle system. Implementing any of
the following functions will help you earn bonus points. Besides, you are encouraged to implement any
other additional functions/optimizations that are useful to your ffle system. Please clearly indicate
the optional functions you have implemented in your ffle system, and demonstrate your extensions
through empirical performance evaluation or qualitative analysis.
Extension 1 - Caching: Reading ffles can be expensive, incurring many I/Os to the (slow) disk.
Imagine an open example: without caching, every ffle open would require at least two reads for every
level in the directory hierarchy (one to read the inode of the directory in question, and at least one to
read its data). With a long pathname (e.g., /1/2/3/ ... /100/ffle.txt), the ffle system would literally
perform hundreds of reads to just open the ffle. To remedy what would be a huge performance issue,
you can use a region in the system memory to cache frequently accessed or important blocks. Similar
to the virtual memory mechanism, strategies such as LRU and different algorithms would decide which
blocks to keep in the cache. Implement a cache in the memory, select your page replacement algorithm
5for the cache, and share your design and performance evaluation (e.g., how many times you have
accelerated your file read operations?)
Extension 2 - Multi-clients: You have supported multi-users in step 3 which mainly focuses on
separating the user folders/files and user account management. Another thing you can do is to support
more than one user to use the file system at the same time. That means your file system server should
support parallel requests from multiple clients. Different from the shell problem in Project1, here we
have more challenges when more than one user logs in to your file system. For example, user A wants
to write a file, while user B wants to read the same file. There is a read/write conflict. You can
come up with a solution to solve this conflict. There are still some other problems like this. It’s really
difficult to handle all such synchronization situations, so we encourage you to try your best to consider
as many issues as possible and explain your solutions in the report.
Extension 3 - Journaling: Consider the crash-consistency problem, that is, how to update persistent
data structures despite the presence of a system crash. The system may crash or accidentally
power off between any two writes, and thus the on-disk state may only partially get updated. After the
crash, the system boots and wishes to mount the file system again. Given that crashes can occur at
arbitrary points through time, how do we ensure the file system keeps the on-disk image in a reasonable
state? Many modern file systems use the idea of journaling (also known as write-ahead logging), where
a sequence of updates is treated as a single indivisible unit (called transaction), ensuring that either
all of the updates are committed to the disk, or none of them are. Try to implement a journaling and
transactions mechanism in your file system.
3 Implementation Details
In general, the execution of the programs above is carried out by specifying the executable program
name followed by the command line arguments.
• Use Ubuntu Linux and GNU C Compiler to compile and debug your programs. GCC-inlineassembly
is allowed.
• See the man pages for more details about specific system or library calls and commands: sleep(3),
gcc(1), mmap(2), usleep(3), nanosleep(3), etc.
• When using system or library calls you have to make sure that your program will exit gracefully
when the requested call cannot be carried out.
• One of the risks of experimenting with forking processes is leaving unwanted processes hanging
which wastes system resources. Please make sure that each process/thread is terminated cleanly
when the program exits. A parent process should wait until all its child processes finish, print a
message, and then quit.
• Your program should be robust. If any of the calls fail, it should print an error message and exit
with an appropriate error code. Please always check for errors when invoking a system or library
call.
• Please ensure that your program can work correctly with the tests we provided and exit properly.
64 Material to be submitted
• Create a folder named Prj3_studentID that includes your source codes, Makefile, and report.
Use meaningful names for the files so their contents are recognizable. Then compress the folder
into a file named Prj3_studentID.tar, ensuring that only the folder itself is included in the tar
file.
• You are not required to submit the executables. Make sure your programs can be compiled with
the Makefile before the submission.
• Enclose a README file that lists the files you have submitted along with a one-sentence explanation.
Call it README.md.
• Please state clearly the purpose of each program at the start of the program file. Add comments
to explain your program.
• Test runs: You must show that your program works for all possible inputs. Submit online a
single typescript file clearly showing the working of all the programs for correct input as well as
a graceful exit on error input. Call it test.md or test.pdf.
• Submit a short report to present the analysis. You can also write down important methods and
observations you find useful when finishing your project. Call it report.pdf.
• We provide a code template for this project. Here’s how the file structure looks (you can feel
free to add other files you need):
Prj3 52202191xxxx
Makefile
README.md
report.pdf
test.md
step1
BDS.c
BDC.c
Makefile
step2
FS.c
FC.c
Makefile
step3
Makefile
...
step4
Makefile
...
• Submit your Prj3_StudentID.tar file on Canvas.
• Due date: 23:59 on May. 31, 2024.
• Demo slots: Around June 1, 2024. Demo slots will be posted later on a shared document.
Please sign up for one of the available slots.
7• You are encouraged to present your design of the project optionally. The presentation time is
16:00-17:40, June 4, 2024 (Course meet time of the 16th week). Please pay attention
to the course website.
WX：codinghelp