Preliminary ideas for a replacement file system library

This thread is intended to be used to discuss preliminary ideas for what a replacement file system API might look like.

Hopefully something final will be created as a result of the discussion here, though this is intended to merely begin the conversation.

The end goal of this project will be to create a new file system API to replace the old SD File System and Petit Fat File System APIs.

Notably:

Note that you don’t necessarily need to be able to help build the library to contribute something of value - if you simply wish to chime in with what facilities you, personally, as a developer would like to see from a file system library then by all means please do so.

APIs are built to meet a demand, and it is important that developers’ wants and needs are taken into consideration as part of the design process.

The following are my (many) preliminary thoughts about the API.

Firstly, factors that need to be taken into consideration,
including many considerations that I don’t have an answer for:

Considerations
  • Are symbolic links supported?
  • Are hard links supported?
  • Are non-file streaming devices (a.k.a. file-like devices, e.g. console streams) supported?
  • Should asynchronous operations be supported?
    • If so, how?
  • Should a designated Path class be used?
    • Could potentially reduce the amount of path parsing needed?
    • Difficult to avoid dynamic allocation when manipulated at runtime
    • Can make use of constexpr user-defined literals to construct at compile time, thus helping to mitigate some overhead
  • Is returning std::variants cheaper than using out parameters?
    • For now I’m using std::variant in my examples because it’s slightly nicer to work with
    • std::variant does not use dynamic allocation, so that is not a concern here; the concern is how ARM handles out parameters vs structure returns
  • Is read/write/access time important?
    • My instinct is that this information is going to be irrelevant for the vast majority of games, and that games may record their own read/write/access times in the file itself if necessary, but I’m prepared to be surprised with a reason why this functionality might be important
  • Resizing files might be a bit of a pain from what I’ve heard…
  • Are file permissions at all relevant?
    • One would assume that as the Pokitto has no concept of ‘admins’ and ‘users’, and that files cannot be ‘executed’ as such, that this would not be relevant, but it’s possible that specifying certain files as ‘read only’ may be beneficial.
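
To illustrate the constexpr user-defined literal idea from the Path consideration above, here is a minimal sketch. Path, its member functions and the _path suffix are all hypothetical names invented for this example, and the type is deliberately non-owning (it wraps a std::string_view), so it sidesteps rather than solves the question of run-time manipulation.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical non-owning path type; not part of any existing Pokitto API.
class Path
{
public:
    constexpr explicit Path(std::string_view source) : text(source) {}

    constexpr std::string_view view() const { return text; }

    // Returns the part after the last '/', or the whole path if there is none.
    constexpr std::string_view filename() const
    {
        const std::size_t slash = text.rfind('/');
        return (slash == std::string_view::npos) ? text : text.substr(slash + 1);
    }

private:
    std::string_view text;
};

// Hypothetical literal suffix: "saves/slot1.dat"_path
constexpr Path operator""_path(const char * data, std::size_t size)
{
    return Path(std::string_view(data, size));
}

// Constructed and inspected entirely at compile time, with no allocation.
constexpr Path savePath = "saves/slot1.dat"_path;
static_assert(savePath.filename() == "slot1.dat");
```

Because the object only refers to a string literal, it costs no more than a pointer and a length at run time. Paths that are built up at run time would still need some form of storage, which is the harder part of the question.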

Secondly, some example data types and operations that I think the API might need to include.

Most of this is inspired by the C++ standard library’s std::filesystem API, added in C++17,
but the actual std::filesystem is more complete than what the Pokitto will require,
hence much of its functionality has been left out.

Types
  • File
    • Class representing a readable or writable file
    • Auto-closes upon destruction
    • Cannot be copied, but can be moved
    • Only reads raw bytes from the file, all data processing and formatting must be provided by external functions or objects
  • Error
    • Possibly a scoped enumeration (enum class)
    • Represents an error of some kind
    • Does not contain an error message
      • Leaving out an error message makes it cheaper to use
  • Status
    • Represents either an error or success
    • Might be implemented as std::optional<Error>?
  • Path?
    • See ‘Considerations’ above
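
As a rough sketch of how the types above might fit together: the error codes, the integer handle and the closeCount counter are placeholders invented for the example, standing in for whatever the real backend would provide.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <utility>

// Hypothetical error codes; the real set would depend on the backend.
enum class Error : std::uint8_t
{
    NotFound,
    AlreadyExists,
    DeviceNotReady,
};

// An empty optional means success, otherwise it holds the error.
using Status = std::optional<Error>;

// Move-only handle that closes itself on destruction.
class File
{
public:
    explicit File(int rawHandle) : handle(rawHandle) {}

    File(const File &) = delete;
    File & operator=(const File &) = delete;

    File(File && other) noexcept : handle(std::exchange(other.handle, invalid)) {}

    File & operator=(File && other) noexcept
    {
        if (this != &other)
        {
            close();
            handle = std::exchange(other.handle, invalid);
        }
        return *this;
    }

    ~File() { close(); }

    bool isOpen() const { return handle != invalid; }

    void close()
    {
        if (handle != invalid)
        {
            ++closeCount; // placeholder for the backend's close call
            handle = invalid;
        }
    }

    static inline int closeCount = 0; // demo only: counts backend closes

private:
    static constexpr int invalid = -1;
    int handle;
};

// Demonstrates that a moved-from File does not close the handle a second time.
inline int demoAutoClose()
{
    File::closeCount = 0;
    {
        File first(3);
        File second(std::move(first)); // first no longer owns the handle
        assert(!first.isOpen() && second.isOpen());
    } // only second closes the handle here
    return File::closeCount;
}
```

With this shape, a function returning Status can simply return std::nullopt on success or an Error value on failure, without carrying a message string around.
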
Filesystem Operations
  • fileExists(path) -> Status, directoryExists(path) -> Status, entryExists(path) -> Status
    • Check if file or directory exists
  • isFile(path) -> Status, isDirectory(path) -> Status
    • Check whether a given entry is a file or a directory
    • Assuming symlinks, hardlinks and other devices don’t exist, it should hold that (ignoring errors) !isFile(a) == isDirectory(a) and isFile(a) == !isDirectory(a), thus a getType(path) -> std::variant<EntryType, Error> function shouldn’t be required, though it may be worth considering that alternative
  • openReadableFile(path) -> std::variant<File, Error>
    • Open file for reading
  • openWritableFile(path) -> std::variant<File, Error>
    • Open file for writing
  • createFile(path) -> Status, createDirectory(path) -> Status
    • Create file or directory
    • Should recursive creations be allowed?
  • deleteFile(path) -> Status, deleteDirectory(path) -> Status, deleteEntry(path) -> Status
    • Delete file or directory
    • Should deleting a directory only work for empty directories?
  • copyFile(from, to) -> Status, copyDirectory(from, to) -> Status
    • Copy file or directory
    • Should directory copying be recursive?
    • Depends on: fileExists, directoryExists, entryExists, openReadableFile, openWritableFile
    • Implemented by creating a new file then copying the data across
  • moveFile(from, to) -> Status, moveDirectory(from, to) -> Status
    • Move file or directory
    • Should directory moving be recursive?
    • Depends on: fileExists, directoryExists, entryExists, openReadableFile, openWritableFile, deleteFile, deleteDirectory
    • Implemented by performing a copy and then a delete
  • getFileSize(path) -> std::variant<std::uintmax_t, Error>
    • Returns the size of the specified file
    • Hopefully can be implemented more cheaply than opening a file and seeking to the end
  • resizeFile(path, newSize) -> Status
    • Resizes a file
    • For efficiency, do not fill with zeroes on either expansion or truncation
  • renameFile(old, new) -> Status, renameDirectory(old, new) -> Status
    • Renames a file or directory
    • Should be cheap?
  • isDirectoryEmpty(path) -> std::variant<bool, Error>
    • Returns whether the specified directory contains no entries (neither files nor subdirectories)
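
To give a feel for what calling a std::variant-returning operation would look like, here is a small sketch of getFileSize. The fixed table of entries is a stand-in for a real file system, and the Error values and the sizeOrZero helper are invented for the example.

```cpp
#include <cstdint>
#include <string_view>
#include <variant>

enum class Error { NotFound, IsDirectory };

// Stand-in for the real file system: a fixed table of entries.
struct Entry
{
    std::string_view path;
    bool isDirectory;
    std::uintmax_t size;
};

constexpr Entry entries[] = {
    { "saves", true, 0 },
    { "saves/slot1.dat", false, 512 },
};

// Returns the file's size, or an Error if it is missing or not a file.
inline std::variant<std::uintmax_t, Error> getFileSize(std::string_view path)
{
    for (const Entry & entry : entries)
    {
        if (entry.path != path)
            continue;
        if (entry.isDirectory)
            return Error::IsDirectory;
        return entry.size;
    }
    return Error::NotFound;
}

// Example caller: falls back to zero when the size is unavailable.
inline std::uintmax_t sizeOrZero(std::string_view path)
{
    const auto result = getFileSize(path);
    if (const auto * size = std::get_if<std::uintmax_t>(&result))
        return *size;
    return 0; // a real caller would inspect std::get<Error>(result)
}
```

std::get_if keeps the success path to a single branch with no exceptions involved. Whether that is actually cheaper than an out parameter on ARM is still the open question raised under 'Considerations'.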

In addition to a file system API, I have been considering a ‘storage’ API used to provide information about different storage systems.
By default, the Pokitto only has the one SD card slot, and thus it’s the only storage device that exists,
but there exists the possibility to connect storage devices via the PEX or through some other means,
so it makes sense to at least consider that possibility, even if it is later judged unlikely and this API is replaced with one that only considers the SD card slot.

Thirdly, functions pertaining to the abstract idea of a ‘storage device’,
for the eventuality that a Pokitto might need to interface with more than just its SD card slot:

Storage Operations
  • storageDeviceExists(storageIdentifier) -> Status
    • Checks whether the specified device exists and is connected
  • storageDeviceType(storageIdentifier) -> std::variant<DeviceType, Error>
    • Identifies the type of the device
  • getStorageCapacity(storageIdentifier) -> std::variant<std::uintmax_t, Error>
    • Returns the capacity of the specified storage device
  • getUsedStorageCapacity(storageIdentifier) -> std::variant<std::uintmax_t, Error>
    • Returns how many bytes of storage are occupied by files, directories, metadata, et cetera
  • getUnusedStorageCapacity(storageIdentifier) -> std::variant<std::uintmax_t, Error>
    • Returns how many bytes of storage are free for use

Ignoring errors, getStorageCapacity(a) == getUsedStorageCapacity(a) + getUnusedStorageCapacity(a) should ideally always be true.

It may be worth following std::filesystem's example and simply having all three properties returned as a single std::filesystem::space_info-like data structure.
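
A single-structure version might look something like the sketch below. StorageInfo, StorageIdentifier, getStorageInfo and the figures it returns are all made up for illustration, and the fields follow the three functions listed above rather than the exact members of std::filesystem::space_info.

```cpp
#include <cstdint>
#include <variant>

enum class Error { DeviceNotFound };

// Loosely modelled on std::filesystem::space_info.
struct StorageInfo
{
    std::uintmax_t capacity; // total bytes on the device
    std::uintmax_t used;     // bytes occupied by files, directories and metadata
    std::uintmax_t unused;   // bytes free for use
};

// Hypothetical device identifier; only device 0 "exists" in this sketch.
using StorageIdentifier = std::uint8_t;

inline std::variant<StorageInfo, Error> getStorageInfo(StorageIdentifier device)
{
    if (device != 0)
        return Error::DeviceNotFound;

    // Made-up figures standing in for a real query of the device.
    constexpr std::uintmax_t capacity = 2048u * 1024u;
    constexpr std::uintmax_t used = 300u * 1024u;
    return StorageInfo{ capacity, used, capacity - used };
}
```

Returning all three values from one call also makes it easier to keep capacity == used + unused true, since they come from a single snapshot of the device.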

And finally, a few things I am aware of but haven’t contemplated in depth,
which will most likely need to be discussed:

Yet to be considered
  • The API for iterating through the contents of a directory
  • The member functions of a File object, other than the basic reading, writing, flushing and closing
  • Whether File should be separated into e.g. ReadableFile, WritableFile and BidirectionalFile

All comments and criticisms welcome.
I apologise for the number of proposals and details blocks.

One idea I just had is to have a #define in project settings to specify whether to optimize for speed, functionality, or reliability. If a game doesn’t ever need to create/resize/add/remove files but wants to stream images/video then it could define a setting to optimize for raw-reading only, which would save program space. If a library used requires the latter then a simple #ifdef can be used with a #error alerting the developer that the library requires a certain mode.

This would probably require 3 modes (possibly a 4th mode):

  • Mode1: Fast read/write speed, but can’t create/resize/add/remove files (much like using pure PFFS)
  • Mode2: Fast read speed, but create/resize/add/remove/write operations would use the heavier code (like using PFFS for streaming, and SDFileSystem for everything else).
  • Mode3: Best reliability, using the heavier code for all operations (much like using pure SDFileSystem) with CRC checks and everything else.
  • Mode4: Combination of Mode2 and Mode3 where the program can choose to open a file for fast read, or reliable read on a case-by-case basis.

Obviously the modes should have more appropriate names that better describe what they represent (looking at you PROJ_SCREENMODE).

Ideally I would like to start moving the library away from #defines and towards alternatives such as templates.

As is becoming increasingly apparent, #define-based settings result in very brittle, inflexible and somewhat confusing code.
I dare say that the very existence of a PokittoSettings.h is an awkward hack.

Furthermore #ifdef and #error can be replaced with things like if constexpr, template specialisation and static_assert, which are much more powerful tools.
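
As a small illustration of that replacement, here is a sketch in which the mode is a template parameter rather than a #define. FSMode, FileSystem and their members are hypothetical names, and write only pretends to write.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical modes, standing in for the #define-selected modes.
enum class FSMode { ReadOnly, Full };

template<FSMode mode>
struct FileSystem
{
    static constexpr bool canWrite = (mode == FSMode::Full);

    // Demo stand-in: returns the number of bytes 'written'.
    static std::size_t write(const char * /*data*/, std::size_t size)
    {
        // Plays the role of #ifdef + #error: calling this in ReadOnly mode
        // fails to compile with the message below.
        static_assert(canWrite, "This file system mode does not support writing");
        return size;
    }

    // if constexpr discards the branch that does not apply to this mode.
    static constexpr std::string_view describe()
    {
        if constexpr (canWrite)
            return "full";
        else
            return "read-only";
    }
};
```

Member functions of a class template are only instantiated when they are used, so FileSystem<FSMode::ReadOnly> compiles fine as long as nothing calls write. A library that needs writing gets a readable compile error at the point of use instead of relying on a project-wide macro.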

For the most part it should be possible to achieve this just by relying on C++'s principle of ‘you don’t pay for what you don’t use’.
I.e. unused functions are not included in the generated binary executable (.bin).

Otherwise, it could be achieved with templates or alternative functions in separate namespaces.
E.g. Pokitto::FileSystem::copy vs Pokitto::FileSystem::Lightweight::copy, or Pokitto::FileSystem::copy<FSMode::Lightweight> vs Pokitto::FileSystem::copy<FSMode::Full>.

Ultimately to decide that we’ll first need to identify why PFFS is smaller and faster than SDFS and use that information to make an informed decision about how to avoid paying for what isn’t needed.

File<FileOptions::NoCRC> vs File<FileOptions::CRC> as an example of what’s possible with templating.

openReadableFile<FileOpenOptions::Fast> vs openReadableFile<FileOpenOptions::Reliable> as another example.

Or, if you don’t like the template syntax:

openReadableFile(path, FileOpenTags::fast);
openReadableFile(path, FileOpenTags::reliable);

And trust the compiler to optimise the extra tag argument away.
(A variation of ‘tag dispatching’.)
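
For anyone unfamiliar with the technique, a minimal sketch of the tag-based version follows. The tag types, the OpenPath enumeration and the return values are invented for the demonstration; a real openReadableFile would return something like std::variant<File, Error> instead.

```cpp
#include <string_view>

namespace FileOpenTags
{
    // Empty types whose only job is to select an overload.
    struct Fast {};
    struct Reliable {};

    inline constexpr Fast fast {};
    inline constexpr Reliable reliable {};
}

// Demo stand-in: reports which code path was chosen rather than opening anything.
enum class OpenPath { Lightweight, Checked };

constexpr OpenPath openReadableFile(std::string_view /*path*/, FileOpenTags::Fast)
{
    return OpenPath::Lightweight;
}

constexpr OpenPath openReadableFile(std::string_view /*path*/, FileOpenTags::Reliable)
{
    return OpenPath::Checked;
}

// The overload is chosen at compile time from the type of the tag argument.
static_assert(openReadableFile("level1.map", FileOpenTags::fast) == OpenPath::Lightweight);
static_assert(openReadableFile("level1.map", FileOpenTags::reliable) == OpenPath::Checked);
```

Since the tags are empty types and the choice is made by overload resolution, only the overloads a program actually calls need to end up in the binary.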

Hopefully this demonstrates why other techniques are more flexible than macros.

Admittedly I haven’t really played around much with some of the newer features in C++11, 14, and 17 yet. Sounds like some of them would come in handy for cleaner, easier to read code, while still allowing for simple customization.

I keep forgetting about this step of the compiler, only really started learning about it when developing for more limited systems such as the Arduino (and by extension Arduboy) and now the Pokitto. All of which require special attention to program size.

The only thing I really know about templates is that the compiler generates specific functions for each type used in a template at compile time (though that might not be used for everything). With small-scale game development I haven’t had a huge need for templates and other layers of abstraction. Though I did once use templates for a resource management system that worked like a smart pointer but loaded the specific resource id when it first got referenced and unloaded it when all references were freed (with the smart pointer part it usually meant when the reference went out of scope). That system also made it so any time something modified a resource, everything using that resource could be notified that a change was made and refresh itself accordingly. In a way I always thought of templates as being just a simple, better, alternative to the dreaded #define code-expanding macros (i.e. the ones that spanned several lines and often became quite confusing to read).

It is nice that you were able to understand the key concept I was talking about and translate it to utilize the additional features of C++17. Hopefully I’ll get a chance to learn some of the newer features as I continue getting back into programming (it’s been far too long since I’ve really done anything worthwhile; the last time I did anything big, C++0x was still being standardized and only had experimental support, which later became C++11).

To be clear, I think that the cost of functions is certainly an important point and I wasn’t trying to dismiss it at all,
I just wanted to nip any suggestions of using macros in the bud at the first opportunity.
(The Pokitto library already skirts the border of macro hell,
the last thing we need is to push it over the edge. :P)

I’d like to add that most of my suggestions would actually have been viable even in C++98.
Templates and overloading have been there since first standardisation.
The only new part is the implication that FileOpenOptions/FileOptions would now be scoped enumerations (enum classes, a C++11 feature) rather than ‘plain’ enums (though plain enums would have worked).


I was going to address some of your other comments,
but I’ll do that in a PM lest we deviate too far off-topic.

(Not least because I could drone on about C++ standards and templates for hours if let loose. :P)