LEAN file system specification - Chewing the Fat
Last Update: 2022 Dec 08
This document is a list of items that I am thinking of adding to the LEAN file system specification. It is simply a list of items that I may or may not add, items suggested to me by readers like you, or ideas that I have come up with while driving back and forth to work. I welcome your comments at: fys at fysnet.net
Each item has the date it was last modified and a value from 0 to 9 indicating my interest, with a value of 9 indicating its addition is imminent. (The list is sorted by rank, from 9 to 0.)
Added in version 1.0.0-rc0
Add a 32-bit or 64-bit flags member to the Superblock for optional additions to the specification, for example the file tail option. A mounting driver that finds a bit set for a feature it doesn't support could refuse to mount the volume.
Advantage: This would give a wider range of abilities and function to the file system without requiring the complexity of supporting this range. A driver could still implement this file system with its simplicity, refusing to mount if a complex optional function was found.
Disadvantage: This takes away the absolute simplicity of the file system, giving an unlimited range of options.
Implementation: Would require an addition to the Superblock, though of itself, this is a simple task.
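As a rough sketch of the mount-time check described above: the bit names, their values, and the `lean_superblock_view` structure are all made up for illustration, since nothing above assigns them. The idea is simply that any set bit outside the driver's supported mask is a reason to refuse the mount.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit assignments; the text above does not define any. */
#define LEAN_FEAT_FILE_TAIL   (UINT64_C(1) << 0)  /* tail of file stored after the Inode */
#define LEAN_FEAT_EXTENT_SUM  (UINT64_C(1) << 1)  /* per-Extent data checksums */

/* Features this particular (simple) driver implements. */
#define DRIVER_SUPPORTED_FEATURES  (LEAN_FEAT_FILE_TAIL)

/* Hypothetical view of the one Superblock member this sketch cares about. */
struct lean_superblock_view {
    uint64_t feature_flags;
};

/* Returns the bits set on the volume that the driver does not support. */
uint64_t lean_unsupported_features(uint64_t volume_flags, uint64_t supported)
{
    return volume_flags & ~supported;
}

/* A driver may mount only when no unsupported bit is set. */
bool lean_can_mount(const struct lean_superblock_view *sb, uint64_t supported)
{
    assert(sb != NULL);
    return lean_unsupported_features(sb->feature_flags, supported) == 0;
}
```

A simple driver would pass a small supported mask and refuse anything else. A fuller driver would only need to widen that mask.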
Notes:
Added in version 1.0.0-rc0
Add a 64-bit member to the Superblock indicating the next available free block.
Advantage: With media such as thumb drives, where excessive writes to a specific location wear that location, a function to indicate where to look for the next free block would help with wear. For example, if a file system constantly allocated and then freed blocks from the start of the volume, never touching other parts of the volume, this would dramatically wear the first part of the volume. A driver could rotate this value through the whole of the volume.
Disadvantage: Most good-quality media devices already account for this and "allocate" blocks from all over the media. With this in mind, there is no need to implement this.
Implementation: This would be a very simple addition and a correctly written driver would easily honor this, updating to the next block, rotating through the whole volume.
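One way a driver might honor such a hint is sketched below. The `volume` structure, the one-bit-per-block bitmap, and the function names are assumptions for the example, not part of the specification. The search starts at the hint, wraps around the end of the volume, and moves the hint past whatever block was handed out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical in-memory view of a volume: one bit per block, 1 = in use. */
struct volume {
    uint8_t  *bitmap;       /* allocation bitmap */
    uint64_t  block_count;  /* total blocks on the volume */
    uint64_t  next_free;    /* the proposed Superblock hint */
};

static bool block_in_use(const struct volume *v, uint64_t blk)
{
    return (v->bitmap[blk / 8] >> (blk % 8)) & 1u;
}

/*
 * Allocate one block, starting the search at the hint and wrapping around
 * the end of the volume. Returns true and stores the block in *out on
 * success, false if the volume is full.
 */
bool alloc_block_rotating(struct volume *v, uint64_t *out)
{
    assert(v->block_count > 0 && v->next_free < v->block_count);

    for (uint64_t i = 0; i < v->block_count; i++) {
        uint64_t blk = (v->next_free + i) % v->block_count;
        if (!block_in_use(v, blk)) {
            v->bitmap[blk / 8] |= (uint8_t)(1u << (blk % 8));
            /* Advance the hint past the block just handed out. */
            v->next_free = (blk + 1) % v->block_count;
            *out = blk;
            return true;
        }
    }
    return false;
}

void free_block(struct volume *v, uint64_t blk)
{
    assert(blk < v->block_count);
    v->bitmap[blk / 8] &= (uint8_t)~(1u << (blk % 8));
    /* The hint is left alone, so a freed block is not reused right away. */
}
```

Because freeing a block does not move the hint backward, a workload that allocates and frees repeatedly walks through the whole volume instead of hammering the first few blocks.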
Notes:
Added in version 1.0.0-rc0
Give each Extent a 32-bit checksum on the file data stored within that extent.
Advantage: This would add robustness to the file system. Each Extent would have its own checksum to help verify that the data was correctly read from the media.
Disadvantage: This would require the whole extent to be read, even when a single block within that extent is requested, obviously adding time to each access. It would also add space to the Inode and Indirect blocks.
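To make the cost concrete, here is a minimal sketch of a read path with per-extent checksums. The rotate-and-add checksum is only a placeholder, since the idea above does not name an algorithm, and both function names are hypothetical. Notice that the whole extent has to be in memory and summed before even one block can be trusted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder checksum: rotate right by one bit, then add each byte. */
uint32_t extent_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = ((sum << 31) | (sum >> 1)) + data[i];
    return sum;
}

/*
 * Return a pointer to block `index` inside an extent already read into
 * memory, or NULL if the stored checksum does not match. The whole extent
 * is summed even though a single block is wanted.
 */
const uint8_t *extent_get_block(const uint8_t *extent, size_t block_size,
                                size_t block_count, size_t index,
                                uint32_t stored_sum)
{
    assert(index < block_count);
    if (extent_checksum(extent, block_size * block_count) != stored_sum)
        return NULL;
    return extent + index * block_size;
}
```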
Implementation: This would need extentsPerInode 32-bit members added to the Inode, taking away from the space after the Inode, as well as extentsPerIndirect 32-bit members added to the Indirect block. The former is not a big deal, though the latter would shrink the count of Extents allowed by about 25%. For example, a 512-byte block size currently allows 38 Extents in the Indirect block. Adding a 32-bit field for each would shrink this value down to 28. A 4096-byte block size would go from 336 to 252. A clarification would have to be given for any Extents used past end of file.
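The counts above can be reproduced with a little arithmetic. This sketch assumes a 56-byte Indirect block header and 12 bytes per Extent (a 64-bit start plus a 32-bit size). Neither figure is stated in this list. They are simply values consistent with the counts of 38 and 336 quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes, chosen to match the extent counts quoted in the text. */
enum {
    INDIRECT_HEADER_BYTES = 56,  /* assumption: fixed header of the Indirect block */
    EXTENT_BYTES          = 12,  /* assumption: 64-bit start + 32-bit size */
    EXTENT_CHECKSUM_BYTES = 4    /* the proposed 32-bit checksum */
};

/* Number of whole extents that fit in one Indirect block. */
uint32_t extents_per_indirect(uint32_t block_size, uint32_t bytes_per_extent)
{
    assert(block_size > INDIRECT_HEADER_BYTES && bytes_per_extent > 0);
    return (block_size - INDIRECT_HEADER_BYTES) / bytes_per_extent;
}
```

With these assumptions, a 512-byte block gives (512 - 56) / 12 = 38 Extents today and (512 - 56) / 16 = 28 with the checksum. A 4096-byte block gives 336 and 252.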
Notes: This option sounds quite appealing, but it adds time to the media access, both read and write, and adds 25% to the overhead of heavily fragmented files. Again, this could be an optional function of the specification defined by a (proposed) capabilities flag.
Create a field in the Superblock to indicate the specification's revision version. For example, if the specification version is 0.8.2, this byte field would have a value of 2.
Advantage: Not much except that the driver could indicate what revision version it supported, though it shouldn't make much difference.
Disadvantage: This really isn't needed, other than to simply keep a tight indicator of what version a driver was written for.
Implementation: Either create a new byte field in the Superblock for this value, or re-format the current version field to include this value. The current version field is wide enough to hold three values, knowing that one or more of those values will never be more than 15.
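As an illustration of the second option, here is one possible packing. The 16-bit field width and the 4/8/4 bit split are assumptions for the sketch, as are the function names. The text above only says the field is wide enough for three values.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical packing of major.minor.revision into a 16-bit version field:
 *   bits 15..12  major    (0-15)
 *   bits 11..4   minor    (0-255)
 *   bits  3..0   revision (0-15)
 */
uint16_t lean_pack_version(unsigned major, unsigned minor, unsigned revision)
{
    assert(major <= 15 && minor <= 255 && revision <= 15);
    return (uint16_t)((major << 12) | (minor << 4) | revision);
}

unsigned lean_version_major(uint16_t v)    { return (v >> 12) & 0xFu; }
unsigned lean_version_minor(uint16_t v)    { return (v >> 4) & 0xFFu; }
unsigned lean_version_revision(uint16_t v) { return v & 0xFu; }
```

Under this layout, version 0.8.2 would be stored as 0x0082. Any re-format like this changes how existing drivers read the field, which is one more reason a separate byte field may be the easier choice.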
Notes: Not really needed.
This idea was suggested to me by a fellow reader at the osdev forum.
Rather than potentially storing the head of the file in the first block of the file, just after the Inode, as is currently done, store the tail of the file there instead.
Advantage: This has a dramatic advantage for slow-growing files such as log files. For example, when appending to the file, only the first block needs to be read. It contains the Inode as well as the tail of the file. More importantly, when writing it back, this block is the only block that needs to be modified and written since it, again, contains the tail of the file and the Inode, the Inode needing an update due to the file size change as well as a time stamp change.
Disadvantage: Even though this is very appealing, this would only work for certain types of files. It also requires a few more calculations to be sure that the appended data will still fit within this area. If it won't, the existing data must be appended to the end of the file, possibly with a new allocation being made, the Inode updated, then the new tail written to the end of the Inode. This also makes for quite a bit more complexity when modifying random access files, where adding a few bytes to the middle of the file will dramatically change the location of the tail of the file.
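The "will it still fit" calculation mentioned above might look something like this. The function names are hypothetical, and the Inode size is passed in as a parameter because nothing above fixes it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bytes available for the file tail in the Inode's block. */
uint32_t tail_capacity(uint32_t block_size, uint32_t inode_size)
{
    assert(inode_size <= block_size);
    return block_size - inode_size;
}

/*
 * True when `append_len` more bytes still fit after the Inode, alongside
 * the `tail_len` bytes of tail already stored there.
 */
bool tail_append_fits(uint32_t block_size, uint32_t inode_size,
                      uint32_t tail_len, uint64_t append_len)
{
    uint32_t cap = tail_capacity(block_size, inode_size);
    assert(tail_len <= cap);
    return append_len <= (uint64_t)(cap - tail_len);
}
```

As an example with made-up numbers, a 512-byte block holding a 176-byte Inode leaves 336 bytes for the tail. If 300 bytes of tail are already stored, an append of up to 36 bytes touches only this one block, while anything larger forces the spill-over path described above.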
Implementation: This can actually be done on a per-Inode basis. If the user knows this will be a slow-growing file, a flag could be set within the Inode stating that this function is used. This would then need a new version of the specification that requires this function. However, this brings in another thought (see the Superblock flags item in this list) of placing an ext2/3/4 style capabilities bitmap in the Superblock indicating if this volume requires this function. If this function is used, the corresponding flag in the capabilities bitmap must be set, giving a driver the ability to refuse to mount the volume if it doesn't support this function.
Notes: As for the Inode, a simple addition to the InodeAttributes would be needed, along with a qualification that a few other flags were set or clear accordingly.
Compression of data
Advantage: This would allow large files to occupy less of the volume.
Disadvantage: This will give a whole lot more complexity to the driver. A driver would most likely have to read in the whole file and decompress it, just to modify a single block. There would be no way of knowing what block offset to traverse to when accessing random access compressed files.
Implementation: As for the on-media function, there is nothing really needed other than some indication on what compression technique was used. The complexity is added to the driver, not the format of the volume.
Notes: This will add complexity to a file system driver.
Encryption of data
Advantage: The usual advantages of encryption, such as keeping file contents private if the media is lost or stolen.
Disadvantage: This will give a whole lot more complexity to the driver. A driver would most likely have to read in the whole file and decrypt it, just to modify a single block. There would be little to no way of knowing what block offset to traverse to when accessing random access encrypted files.
Implementation: As for the on-media function, there is nothing really needed other than some indication on what encryption technique was used. The complexity is added to the driver, not the format of the volume.
Notes: This will add complexity to a file system driver, and it also raises the matter of national regulations on traveling with encryption.