The FYSOS registry system specification 1.0.0-rc1
Version 1.0.0-rc1, last update: 2023 January 26
Copyright © 1984-2023 by Forever Young Software
Permission to make and distribute verbatim copies of this specification, translations of this specification into another language, and derivative works that comment on or otherwise explain it or assist in its implementation, for any purpose and without fee or royalty is hereby granted, provided that the present copyright notice, licensing terms and disclaimer are preserved on all copies.
In addition, the Author covenants not to assert any claims on implementations of this specification, unless such implementations contain derivative work of implementations by the Author, for which specific licensing terms may apply.
THIS SPECIFICATION IS PROVIDED "AS IS," AND THE AUTHOR MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, NON-INFRINGEMENT, OR TITLE; THAT THE CONTENTS OF THIS SPECIFICATION ARE SUITABLE FOR ANY PURPOSE; NOR THAT THE IMPLEMENTATION OF SUCH CONTENTS WILL NOT INFRINGE ANY THIRD PARTY PATENTS, COPYRIGHTS, TRADEMARKS OR OTHER RIGHTS. TO THE FULLEST EXTENT PERMITTED BY APPLICABLE LAW, THE AUTHOR WILL NOT BE LIABLE FOR ANY DIRECT, INDIRECT, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF ANY USE OF THIS SPECIFICATION OR THE PERFORMANCE OR IMPLEMENTATION OF THE CONTENTS THEREOF.
No other rights are granted by implication, estoppel or otherwise.
This document describes version 1.0.0-rc1 of the FYSOS registry system: a free, simple, portable, personal, fully featured registry system for embedded tasks and hobbyists alike. Minor changes made to this document (e.g. wording) that do not affect the registry system format are tracked by the third number in the document version number.
This registry system is in the release development stage: this document supersedes any previous version of the registry system specification with no care for backward compatibility.
Since this is a new registry system, and some aspects are still to be defined, suggestions or corrections are welcome, for either the registry system or this document. Please contact the author at: fys at fysnet.net. The author wishes to thank those who have submitted comments and criticism in order to improve this system.
Table of contents
- Differences from the previous versions
- Definitions
- Structure identification and checksum
- Layout of the FYSOS registry system
- The use of names for hives and cells
- The Base Structure
- Ending Marker
- The Hive Structure
- The Cell Structure
- An example registry
- Requirements
Differences from the previous versions
- version 1.0.0-rc1
- Added types: Added a few more data types. This modifies some existing types.
- version 1.0.0-rc0
- Official Release: Official release version 1.0.0-rc0.
Definitions
The following terms and conventions will be used throughout this specification:
- must, must not, should, should not, may: these words are to be interpreted as described in RFC 2119;
- reserved for future use: a data field that has not yet been defined. When creating a new entry, that field must be written as zeros. When reading an existing entry, a driver must not make assumptions about the content of that field. When writing that field to an existing entry, a driver must preserve the value in that field;
- byte: a group of eight adjacent bits, an octet;
- driver: a system driver, or any other part of the operating system, or any application implementing this specification to access the registry system;
- cell: a block of memory, with a name, used to store data;
- hive: a block of memory, with a name, used to store child hives and cells;
- The C++ syntax is used for code snippets and structure formats, with stdint.h integral types. The char type is assumed to be 8-bit. Hex numbers are prefixed with 0x;
- Unless otherwise stated, all numbers must be stored in little endian format (least significant byte first).
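The little endian rule can be illustrated with a minimal sketch. The helper names put_le32, get_le32, put_le64 and get_le64 are hypothetical and are not defined by this specification; they only show one portable way to store and load values byte by byte, regardless of the host's own byte order.

```cpp
#include <stdint.h>

// Hypothetical helpers, not part of the specification.

// Store a 32-bit value least significant byte first.
inline void put_le32(uint8_t *dst, uint32_t value) {
  for (int i = 0; i < 4; i++)
    dst[i] = (uint8_t)(value >> (8 * i));
}

// Load a 32-bit value that was stored least significant byte first.
inline uint32_t get_le32(const uint8_t *src) {
  uint32_t value = 0;
  for (int i = 0; i < 4; i++)
    value |= (uint32_t)src[i] << (8 * i);
  return value;
}

// Same idea for 64-bit fields: low byte first, high byte last.
inline void put_le64(uint8_t *dst, uint64_t value) {
  for (int i = 0; i < 8; i++)
    dst[i] = (uint8_t)(value >> (8 * i));
}

inline uint64_t get_le64(const uint8_t *src) {
  uint64_t value = 0;
  for (int i = 0; i < 8; i++)
    value |= (uint64_t)src[i] << (8 * i);
  return value;
}
```

With these helpers, the value 0x42415345 is written as the byte sequence 45 53 41 42, which is the order a hex dump of the registry would show.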
Structure identification and checksum
The structure of this registry system includes fields to make the system more robust.
Sensitive structures, such as the Base Structure, store a checksum field in the first few bytes of the structure. The checksum is computed on all data within the registry block, not counting the checksum field itself. The checksum is calculated as defined by the official CRC-32 standard. The checksum must be recomputed at least every time a sensitive structure or data area is modified and released to the system.
The following functions show how to calculate the checksum. The data parameter is a pointer to the data area to be checked. The len parameter is the size in bytes of the area pointed to by data. A driver must initialize the crc32_table once before calling any of the remaining routines. A call to crc32_initialize() may be used.
The checksum field itself must not be included in the checksum calculation. Initially setting this field to zero will allow it to be a part of the check.

    #include <stdint.h>

    /* Predefined polynomial */
    #define CRC32_POLYNOMIAL 0x04C11DB7

    /* Lookup table. Must be pre-initialized. */
    uint32_t crc32_table[256];

    /* Forward declarations, so the routines may appear in any order. */
    uint32_t crc32_reflect(uint32_t reflect, char ch);
    void crc32_partial(uint32_t *crc, void *ptr, uint32_t len);

    /* Initialize table.
     *  no parameters
     */
    void crc32_initialize(void) {
      // One table entry for each of the 256 possible byte values.
      for (int i=0; i<256; i++) {
        crc32_table[i] = crc32_reflect(i, 8) << 24;
        for (int j=0; j<8; j++)
          crc32_table[i] = (crc32_table[i] << 1) ^ ((crc32_table[i] & 0x80000000u) ? CRC32_POLYNOMIAL : 0);
        crc32_table[i] = crc32_reflect(crc32_table[i], 32);
      }
    }

    /* Reflection:
     *  reflect = current value to process
     *  ch = size in bits of value
     * (Reflection is a requirement for the official CRC-32 standard.
     *  You can create CRCs without it, but they won't conform to the standard.)
     */
    uint32_t crc32_reflect(uint32_t reflect, char ch) {
      uint32_t ret = 0;

      // Swap bit 0 for bit ch-1, bit 1 for bit ch-2, etc.
      for (int i=1; i<(ch + 1); i++) {
        if (reflect & 1)
          ret |= (uint32_t) 1 << (ch - i);  // unsigned shift: bit 31 must not overflow an int
        reflect >>= 1;
      }
      return ret;
    }

    /* Compute the checksum of an area.
     *  data -> data area to be checked.
     *  len = count in bytes of area to check.
     */
    uint32_t crc32(void *data, uint32_t len) {
      uint32_t crc = 0xFFFFFFFF;
      crc32_partial(&crc, data, len);
      return (crc ^ 0xFFFFFFFF);
    }

    /* Compute the checksum of a partial area.
     *  crc -> running checksum value.
     *  ptr -> data area to be checked.
     *  len = count in bytes of area to check.
     */
    void crc32_partial(uint32_t *crc, void *ptr, uint32_t len) {
      uint8_t *data = (uint8_t *) ptr;
      while (len--)
        *crc = (*crc >> 8) ^ crc32_table[(*crc & 0xFF) ^ *data++];
    }
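As an illustration of how the checksum field might be handled, the sketch below computes the CRC over a whole registry while treating the four checksum bytes as zeros, which is one reading of the rule above. It assumes the checksum field sits at byte offset 4 of the Base Structure (directly after magic). The names crc32_bitwise, registry_checksum, registry_store_checksum and registry_verify_checksum are hypothetical. To stay self-contained, it uses a bit-by-bit CRC-32 in place of the table-driven routine; both should yield the same standard CRC-32 value.

```cpp
#include <stdint.h>
#include <stddef.h>

// Bit-by-bit CRC-32 update (reflected polynomial 0xEDB88320).
// `crc` is the running value; the caller applies the initial and final XOR.
uint32_t crc32_bitwise(uint32_t crc, const uint8_t *data, size_t len) {
  while (len--) {
    crc ^= *data++;
    for (int bit = 0; bit < 8; bit++)
      crc = (crc >> 1) ^ ((crc & 1) ? 0xEDB88320u : 0u);
  }
  return crc;
}

// Assumption: the checksum field is the second dword of the Base Structure.
const size_t CHECKSUM_OFFSET = 4;

// CRC over `length` bytes with the checksum field counted as four zero bytes.
// Precondition: length >= 8. The buffer itself is not modified.
uint32_t registry_checksum(const uint8_t *registry, size_t length) {
  static const uint8_t zeros[4] = {0, 0, 0, 0};
  uint32_t crc = 0xFFFFFFFF;
  crc = crc32_bitwise(crc, registry, CHECKSUM_OFFSET);
  crc = crc32_bitwise(crc, zeros, sizeof zeros);
  crc = crc32_bitwise(crc, registry + CHECKSUM_OFFSET + 4,
                      length - CHECKSUM_OFFSET - 4);
  return crc ^ 0xFFFFFFFF;
}

// Recompute the checksum and write it back in little endian order.
void registry_store_checksum(uint8_t *registry, size_t length) {
  uint32_t crc = registry_checksum(registry, length);
  for (int i = 0; i < 4; i++)
    registry[CHECKSUM_OFFSET + i] = (uint8_t)(crc >> (8 * i));
}

// Compare the stored checksum against a freshly computed one.
bool registry_verify_checksum(const uint8_t *registry, size_t length) {
  uint32_t stored = 0;
  for (int i = 0; i < 4; i++)
    stored |= (uint32_t)registry[CHECKSUM_OFFSET + i] << (8 * i);
  return stored == registry_checksum(registry, length);
}
```

A driver would typically call registry_store_checksum() as the last step of a write, and registry_verify_checksum() right after loading a registry from media.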
Some structures also contain one or more magic fields storing a 32-bit constant signature identifying the structure. This can be used as a first test to validate a sensitive structure.
Figure 1: Layout of a FYSOS registry system.
Layout of the FYSOS registry system
The layout of a registry is shown in Figure 1, with the three minimally required structures shown in Generation 0: the Base Structure, the base hive (with a name of System), and the Base End Structure.
A registry starts with a Base Structure, which stores information about the registry so that it may be written to a media device for storage, as well as other information needed. It then contains a single hive capable of containing many child hives and cells. This main hive must have the case-sensitive name of System. Following this hive, a single Base End Structure ends the registry and indicates the end of the data.
To allow a hierarchy of hives to be stored within the registry, any hive may contain child hives, each in turn containing children themselves, up to a depth of 256 generations.
Each hive may also contain an arbitrary number of cells, these cells containing the desired data to store within the registry. A cell must not contain child hives or cells.
A delimited character string is used to traverse the hive generations, ultimately pointing to a single cell.
For example, if an application wants to save a flag indicating if it has been initialized, it could use the following path: /System/Kernel/ApplicationName/Setup/Initialized
When sent to the registry driver, this path would be used to retrieve the TRUE value from the example shown in Figure 1.
Each name within the delimited path corresponds to one generation, and each generation may hold an arbitrary number of hives and cells.
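The way a driver might break such a path into its names can be sketched as follows. The function split_path is hypothetical. The sketch assumes that a path must begin with '/' and that empty names (a doubled or trailing '/') make the path invalid; the specification does not state either rule explicitly.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: split "/System/Kernel/Name" into
// {"System", "Kernel", "Name"}. Returns an empty vector for a malformed path.
std::vector<std::string> split_path(const std::string &path) {
  std::vector<std::string> parts;
  if (path.empty() || path[0] != '/')
    return parts;                          // assumption: paths are absolute

  size_t start = 1;                        // first character after the leading '/'
  while (start <= path.size()) {
    size_t end = path.find('/', start);
    if (end == std::string::npos)
      end = path.size();                   // last name runs to the end of the path
    if (end == start)
      return std::vector<std::string>();   // empty name: reject the whole path
    parts.push_back(path.substr(start, end - start));
    start = end + 1;
  }
  return parts;
}
```

For the example path above, the result holds five names: four hive names (System, Kernel, ApplicationName, Setup) followed by the cell name Initialized.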
The use of names for hives and cells
Throughout the registry, a name is used to indicate a hive or a cell. For example, a limb on the tree (a generation) will need a name, used as a parent. Each cell (and optionally any hives) within this child generation will need a name as well. A name is stored within the hive and cell structures. To keep it simple, these structures contain a fixed number of bytes used to store this name. This name is stored using the UTF-8 format and must be null terminated.
Since a path uses the '/' character as a delimiter, all characters except for this forward slash are allowed within a name. Names are case-sensitive. For example, the names "ApplicationName" and "applicationname" are two different names and both may appear in a hive.
It is up to the registry driver to make sure that no two identically named hives and/or cells are included within the same generation.
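The naming rules can be sketched as a few small helpers. The names name_is_valid, name_store and name_equals are hypothetical. The 32-byte field size comes from the hive and cell structures, rejecting an empty name is an assumption of this sketch, and no attempt is made to check that the bytes are well-formed UTF-8.

```cpp
#include <stdint.h>
#include <string.h>

const size_t NAME_FIELD_SIZE = 32;  // size of name[] in the hive and cell structures

// Returns true if `name` can be stored in a name field.
// Assumption (not stated by the specification): an empty name is rejected.
bool name_is_valid(const char *name) {
  size_t len = strlen(name);                // length in bytes, not in UTF-8 characters
  if (len == 0 || len >= NAME_FIELD_SIZE)   // one byte is needed for the null terminator
    return false;
  return strchr(name, '/') == NULL;         // '/' is reserved as the path delimiter
}

// Copies a valid name into a 32-byte field, zero-filling all unused bytes.
bool name_store(uint8_t field[NAME_FIELD_SIZE], const char *name) {
  if (!name_is_valid(name))
    return false;
  memset(field, 0, NAME_FIELD_SIZE);
  memcpy(field, name, strlen(name));
  return true;
}

// Names are case-sensitive, so a plain byte comparison is sufficient.
bool name_equals(const uint8_t field[NAME_FIELD_SIZE], const char *name) {
  return strlen(name) < NAME_FIELD_SIZE &&
         strncmp((const char *)field, name, NAME_FIELD_SIZE) == 0;
}
```

A driver checking for duplicates could call name_equals() on every hive and cell of the target generation before adding a new entry.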
The Base Structure
The format of the Base Structure is the following:
Field | Description |
---|---|
uint32_t magic | This must be equal to 0x42415345 (the 'BASE' characters in ASCII), and it must be used to identify a valid registry system. |
uint32_t checksum | The checksum value for the whole registry. All bytes from the start of this structure up to and including the Registry Base End Structure are included in the calculation. |
uint32_t version | This field identifies the version of the registry system, and it is provided for future development. The high word identifies the major version number and the low word the minor version number (for example 0x0120 would mean version 1.32). At present, it must be set to 0x0100 (that is version 1.0) and drivers must not try to access an unknown system version, backward compatibility making no sense. |
uint32_t padding | This field is reserved and must be preserved. |
uint64_t size | This field is the size of the allocated memory used to hold this registry. It is only valid while the registry is loaded into memory. This field is considered reserved and preserved when written to a media device. |
uint64_t length | This is the count of 8-bit bytes used to hold the registry, i.e. the current size of the registry from the start of this structure, through and including the Registry Base End Structure. This field must remain valid both in memory and on media. |
uint64_t lastModified | This is the timestamp of the last time this registry was modified. This field must hold the number of microseconds since an epoch of 1 Jan 2000, 00:00:00 (not including leap seconds). |
uint64_t reserved | This field is reserved and must be preserved. |
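The table above can be mirrored by a C++ structure, as in the following sketch. The struct name RegistryBase follows the name used later in this document; the constants and the two helper functions are hypothetical. The sketch assumes a little endian host with no compiler-inserted padding, and it assumes the epoch of 1 Jan 2000 is expressed in UTC, which the specification does not state.

```cpp
#include <stdint.h>
#include <stddef.h>

const uint32_t REGISTRY_BASE_MAGIC = 0x42415345;  // 'BASE'
const uint32_t REGISTRY_VERSION    = 0x0100;      // version 1.0

// In-memory mirror of the Base Structure (field order as in the table).
struct RegistryBase {
  uint32_t magic;
  uint32_t checksum;
  uint32_t version;
  uint32_t padding;
  uint64_t size;
  uint64_t length;
  uint64_t lastModified;
  uint64_t reserved;
};
static_assert(sizeof(RegistryBase) == 48, "Base Structure is expected to be 48 bytes");
static_assert(offsetof(RegistryBase, length) == 24, "unexpected padding before length");

// Seconds between the Unix epoch (1 Jan 1970) and 1 Jan 2000, 00:00:00 UTC.
const uint64_t UNIX_TO_2000_SECONDS = 946684800ULL;

// Converts a Unix time in microseconds to a lastModified value.
// Precondition: the time is not earlier than 1 Jan 2000.
uint64_t registry_timestamp_from_unix_us(uint64_t unix_us) {
  return unix_us - UNIX_TO_2000_SECONDS * 1000000ULL;
}

// First test of a Base Structure: magic, version, and a sane length.
bool registry_base_is_plausible(const RegistryBase *base) {
  if (base->magic != REGISTRY_BASE_MAGIC)
    return false;
  if (base->version != REGISTRY_VERSION)
    return false;
  // Smallest registry: Base (48) + empty base hive (48) + end marker (4).
  return base->length >= 48 + 48 + 4;
}
```

A passing result here is only a first filter; the checksum and the end marker still need to be verified before the registry is trusted.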
There is a marker at the end of the registry to simply ensure the integrity of the registry. Its format is shown below.
The Registry Base End Structure
The format of the Registry Base End Structure is the following:
Field | Description |
---|---|
uint32_t magic | This must be equal to 0x45534142 (the 'ESAB' characters in ASCII), and it must be used to identify the end of a valid registry system. |
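Since the length field of the Base Structure counts every byte through the end structure, the end marker should occupy the last four bytes of that range. A minimal sketch of the check follows; the names read_le32 and registry_has_end_marker are hypothetical.

```cpp
#include <stdint.h>
#include <stddef.h>

const uint32_t REGISTRY_END_MAGIC = 0x45534142;  // 'ESAB'

// Reads a little endian 32-bit value from a byte buffer.
static uint32_t read_le32(const uint8_t *p) {
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
         ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

// `registry` points at the Base Structure; `length` is the value of its
// length field. Returns true if the final dword is the end marker.
bool registry_has_end_marker(const uint8_t *registry, uint64_t length) {
  if (length < 48 + 4)  // too short for a Base Structure plus the marker
    return false;
  return read_le32(registry + (length - 4)) == REGISTRY_END_MAGIC;
}
```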
The Hive Structure
Between the RegistryBase and RegistryBaseEnd structures, there is a single hive. This hive is the base hive, must be the only hive in this first generation, and must have the name of "System". This hive may and usually does contain child hives and cells.
A hive contains a starting tag, a name, a depth value, enough room to store any child hives and/or cells, and an ending tag.
The format of the Hive Structure is the following:
Field | Description |
---|---|
uint32_t startingTag | This must be equal to 0x48495645 (the 'HIVE' characters in ASCII), and it must be used to identify the start of a hive. |
uint8_t name[32] | This is the name of the hive. It is stored in UTF-8 format and must be null terminated. |
uint32_t depth | This is the generational depth of the hive. For example, if this is the base hive, it will have a value of zero. A child hive will have a value of 1. A grandchild hive will have a value of 2. This is to help keep the registry intact and used for robustness. A maximum hive depth number of 255 must be observed (256 max generations). |
uint32_t reserved | Reserved and preserved. (Future plans: May be used for permissions and other flags.) |
child hives and/or cells | |
uint32_t endingTag | This must be equal to 0x45564948 (the 'EVIH' characters in ASCII), and it must be used to identify the end of a hive. |
A hive must occupy 12 dwords (48 bytes), not counting the dwords used to store the child hives and cells.
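The 48-byte figure can be checked with a short sketch that writes an empty hive. HiveHeader and hive_write_empty are hypothetical names. The sketch assumes a little endian host, so copying the structure byte for byte yields the on-media layout, and it rejects empty names as an additional assumption.

```cpp
#include <stdint.h>
#include <string.h>

const uint32_t HIVE_START_TAG = 0x48495645;  // 'HIVE'
const uint32_t HIVE_END_TAG   = 0x45564948;  // 'EVIH'
const uint32_t HIVE_MAX_DEPTH = 255;

// The fixed fields that precede a hive's children (11 dwords).
struct HiveHeader {
  uint32_t startingTag;
  uint8_t  name[32];
  uint32_t depth;
  uint32_t reserved;
};
static_assert(sizeof(HiveHeader) == 44, "hive header is expected to be 44 bytes");

// Writes an empty hive (header followed directly by the ending tag) to `out`,
// which must have room for 48 bytes. Returns the number of bytes written,
// or 0 if the name or depth is not acceptable.
size_t hive_write_empty(uint8_t *out, const char *name, uint32_t depth) {
  size_t name_len = strlen(name);
  if (name_len == 0 || name_len > 31 || strchr(name, '/') != NULL)
    return 0;
  if (depth > HIVE_MAX_DEPTH)
    return 0;

  HiveHeader header;
  memset(&header, 0, sizeof header);   // zero-fills the name padding and reserved field
  header.startingTag = HIVE_START_TAG;
  memcpy(header.name, name, name_len);
  header.depth = depth;
  memcpy(out, &header, sizeof header);

  uint32_t end = HIVE_END_TAG;         // the 12th dword
  memcpy(out + sizeof header, &end, sizeof end);
  return sizeof header + sizeof end;
}
```

To add children, a driver would insert their bytes between the header and the ending tag, moving the ending tag and everything after it toward the end of the registry.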
The Cell Structure
A cell is used to store the desired information. A cell must not contain any children.
A cell contains a starting tag, a name, a data type, a data length, enough room to store the data, and an ending tag.
The format of the Cell Structure is the following:
Field | Description |
---|---|
uint32_t startingTag | This must be equal to 0x43454C4C (the 'CELL' characters in ASCII), and it must be used to identify the start of a cell. |
uint8_t name[32] | This is the name of the cell. It is stored in UTF-8 format and must be null terminated. |
uint32_t type | This is the type of data stored. See enum dataType below. |
uint32_t length | This is the length, in 32-bit dwords, of the data stored. Must be a value of 0 to 16384 inclusively. |
uint32_t data[length] | This is the data stored. |
uint32_t endingTag | This must be equal to 0x4C4C4543 (the 'LLEC' characters in ASCII), and it must be used to identify the end of a cell. |
So that the length of a cell is always a multiple of a 32-bit dword, the data member is stored as whole 32-bit dwords and the length member counts dwords, not bytes. If the length of the data stored is not a multiple of sizeof(dword), the unused trailing bytes must be zeros.
A cell must occupy 12 dwords (48 bytes), not counting the dwords used to store the data.
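The size arithmetic for a cell can be sketched as follows. CellHeader, cell_total_size and cell_write are hypothetical names, and the sketch assumes a little endian host. A cell with length dwords of data occupies 48 + 4 * length bytes, and any bytes between the end of the payload and the next dword boundary are written as zeros.

```cpp
#include <stdint.h>
#include <string.h>

const uint32_t CELL_START_TAG  = 0x43454C4C;  // 'CELL'
const uint32_t CELL_END_TAG    = 0x4C4C4543;  // 'LLEC'
const uint32_t CELL_MAX_DWORDS = 16384;

// The fixed fields that precede a cell's data (11 dwords).
struct CellHeader {
  uint32_t startingTag;
  uint8_t  name[32];
  uint32_t type;
  uint32_t length;  // count of 32-bit dwords in data[]
};
static_assert(sizeof(CellHeader) == 44, "cell header is expected to be 44 bytes");

// Total size in bytes of a cell holding `length_dwords` dwords of data.
size_t cell_total_size(uint32_t length_dwords) {
  return sizeof(CellHeader) + (size_t)length_dwords * 4 + 4;
}

// Serializes a cell with `byte_count` bytes of payload into `out`.
// Returns the number of bytes written, or 0 on invalid arguments.
size_t cell_write(uint8_t *out, const char *name, uint32_t type,
                  const void *data, size_t byte_count) {
  size_t name_len = strlen(name);
  if (name_len == 0 || name_len > 31 || strchr(name, '/') != NULL)
    return 0;
  if (byte_count > (size_t)CELL_MAX_DWORDS * 4)
    return 0;
  uint32_t dwords = (uint32_t)((byte_count + 3) / 4);  // round up to whole dwords

  CellHeader header;
  memset(&header, 0, sizeof header);
  header.startingTag = CELL_START_TAG;
  memcpy(header.name, name, name_len);
  header.type = type;
  header.length = dwords;
  memcpy(out, &header, sizeof header);

  uint8_t *payload = out + sizeof header;
  memset(payload, 0, (size_t)dwords * 4);              // trailing pad bytes stay zero
  if (byte_count > 0)
    memcpy(payload, data, byte_count);

  uint32_t end = CELL_END_TAG;
  memcpy(payload + (size_t)dwords * 4, &end, sizeof end);
  return cell_total_size(dwords);
}
```

For example, a 5-byte payload needs 2 dwords, so the cell occupies 56 bytes and the last 3 payload bytes are zeros.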
The eight allowed data types are listed below.
Symbolic name | Value | Description |
---|---|---|
dtExist | 0 | No data. Simply an existing empty cell used as a marker. Drivers must return a boolean value showing the existence of this cell. The length field must be zero. No data is stored. The cell's data[] field is non-existent. |
dtBoolean | 1 | Data type of Boolean. A zero value indicates FALSE. Any non-zero value indicates TRUE. It is recommended, but not required, that all non-zero values be the value of 0x00000001. The length field must be 1. Data stored in little-endian format. Example: 00 00 00 00 or 01 00 00 00 |
dtInteger | 2 | Data type of integer. A 32-bit signed integer in the range of -2,147,483,648 to 2,147,483,647 inclusively. The length field must be 1. Data stored in little-endian format. |
dtUnsigned | 3 | Data type of unsigned integer. A 32-bit unsigned integer in the range of 0 to 0xFFFFFFFF inclusively. The length field must be 1. Data stored in little-endian format. |
dtIntegerLong | 4 | Data type of long integer. A 64-bit signed long integer in the range of -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 inclusively. The length field must be 2. Data stored in little-endian format: low dword first, high dword last. |
dtUnsignedLong | 5 | Data type of unsigned long integer. A 64-bit unsigned long integer in the range of 0 to 0xFFFFFFFFFFFFFFFF inclusively. The length field must be 2. Data stored in little-endian format: low dword first, high dword last. |
dtString | 6 | Data type of a character string. A string of UTF-8 characters that must be null terminated. The length field must be (utf8_strlen(string) + utf8_strlen('\0') + sizeof(dword) - 1) / sizeof(dword). Data stored as consecutive bytes. |
dtBinary | 7 | Data type of binary. A string of 8-bit bytes. The length field must be (length_of_data + sizeof(dword) - 1) / sizeof(dword). Data stored as consecutive bytes. |
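The values in the table map naturally onto the dataType enumeration mentioned in the Cell Structure. The sketch below also shows the rounding used for the length field. The helper names dwords_for_bytes, string_length_field and length_matches_type are hypothetical, and string lengths are measured in bytes, which is how this sketch interprets utf8_strlen.

```cpp
#include <stdint.h>
#include <string.h>

enum dataType {
  dtExist        = 0,
  dtBoolean      = 1,
  dtInteger      = 2,
  dtUnsigned     = 3,
  dtIntegerLong  = 4,
  dtUnsignedLong = 5,
  dtString       = 6,
  dtBinary       = 7
};

// Number of 32-bit dwords needed to hold `bytes` bytes, rounded up.
uint32_t dwords_for_bytes(size_t bytes) {
  return (uint32_t)((bytes + sizeof(uint32_t) - 1) / sizeof(uint32_t));
}

// Length field for a dtString cell: the string bytes plus the null terminator.
uint32_t string_length_field(const char *s) {
  return dwords_for_bytes(strlen(s) + 1);
}

// Returns true if `length` (in dwords) is acceptable for the given type.
bool length_matches_type(uint32_t type, uint32_t length) {
  switch (type) {
    case dtExist:
      return length == 0;
    case dtBoolean:
    case dtInteger:
    case dtUnsigned:
      return length == 1;
    case dtIntegerLong:
    case dtUnsignedLong:
      return length == 2;
    case dtString:
      return length >= 1 && length <= 16384;  // at least the terminator
    case dtBinary:
      return length <= 16384;
    default:
      return false;                           // unknown type value
  }
}
```

For example, the string "abc" needs 4 bytes with its terminator and therefore a length of 1, while "abcd" needs 5 bytes and a length of 2.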
An example registry
Here is an example registry, complete with the Base, base hive (with children), and the Ending Base tag.
Figure 2: Example of a complete registry.
Requirements
The following is a list of notes and/or requirements.
- A max generational depth of 256 (last hive having a depth of 255) must be observed.
- Hives and cells may reside beside each other in any order in all generations except for the first. The base hive must be the only hive in the first generation, and no cells are allowed in that generation.
- Hives may be empty, having no cells.
- Cells may be empty, using their existence as a Boolean return value.
- Strings and Binary Data (types dtString and dtBinary respectively) must not exceed 65536 bytes in length. i.e.: The length field must be no more than 16384 dwords. Any remaining bytes after the actual data stored, up to 3, must be stored as zeros.
- A driver must not retrieve the data from a cell that doesn't match the type requested. i.e.: If the driver is requesting a Boolean value from a cell and that cell's type field is not dtBoolean, the driver should return an error. When checking for the existence of a cell, any valid cell type may exist.
- It is recommended that a driver return a negative value for all errors, a zero value for non-error returns without returning data, and a positive value indicating how many bytes were successfully written to/read from a cell.
- A driver must ensure no duplicate names are used within the same generation. If a duplicate name is given on a write, the first cell with that name must be overwritten. Optionally, a driver may return a negative value, ask before overwriting, or simply abort the write (returning zero).
- For simplicity, there is no "double linked list" type of storage within this registry. If there were, every parent hive would have to be modified every time a cell was added/modified/deleted from a child hive. However, this means that to traverse the registry, every generation must be parsed until the correct hive is found. A proper parsing algorithm and a fast machine will make this a simple and fast process.
- Any unused bytes after the null terminated string in the name field of a hive or cell must also be zeros.
- For multi-tasking environments, a spinlock of sorts is recommended so that any part of the registry can only be accessed by one task at a time. The checksum and timestamp (on writes) must be updated before this spinlock is released.
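The linear parsing described in the notes above can be sketched as follows. This is only one possible approach, not a reference implementation. The names rd32, name_is, skip_item and find_cell are hypothetical, the path is assumed to be already split into its names, and the Base Structure, checksum and end marker are assumed to have been validated beforehand.

```cpp
#include <stdint.h>
#include <string.h>
#include <string>
#include <vector>

const uint32_t TAG_HIVE = 0x48495645, TAG_HIVE_END = 0x45564948;
const uint32_t TAG_CELL = 0x43454C4C, TAG_CELL_END = 0x4C4C4543;
const size_t BASE_SIZE = 48;    // size of the Base Structure
const size_t HEADER_SIZE = 44;  // tag + name + two dwords, for hives and cells alike

typedef std::vector<uint8_t> Bytes;

// Reads a little endian dword at byte offset `pos`.
static uint32_t rd32(const Bytes &r, size_t pos) {
  return (uint32_t)r[pos] | ((uint32_t)r[pos + 1] << 8) |
         ((uint32_t)r[pos + 2] << 16) | ((uint32_t)r[pos + 3] << 24);
}

// Compares the name field of the hive or cell at `pos` with `name`.
static bool name_is(const Bytes &r, size_t pos, const std::string &name) {
  return name.size() < 32 &&
         strncmp((const char *)&r[pos + 4], name.c_str(), 32) == 0;
}

// Advances `pos` past one whole hive (with everything nested in it) or one
// cell. Returns false if the data is malformed.
static bool skip_item(const Bytes &r, size_t &pos) {
  int open = 0;  // hives entered but not yet closed
  do {
    if (pos + 4 > r.size()) return false;
    uint32_t tag = rd32(r, pos);
    if (tag == TAG_HIVE_END) {
      pos += 4;
      open--;
    } else if (pos + HEADER_SIZE + 4 > r.size()) {
      return false;
    } else if (tag == TAG_HIVE) {
      pos += HEADER_SIZE;
      open++;
    } else if (tag == TAG_CELL) {
      uint32_t dwords = rd32(r, pos + 40);
      if (dwords > 16384) return false;
      size_t total = HEADER_SIZE + (size_t)dwords * 4 + 4;
      if (pos + total > r.size()) return false;
      if (rd32(r, pos + total - 4) != TAG_CELL_END) return false;
      pos += total;
    } else {
      return false;
    }
  } while (open > 0);
  return open == 0;
}

// Returns the byte offset of the cell named by `path`, or -1 if it is absent
// or the registry is malformed. path[0] must name the base hive.
long find_cell(const Bytes &r, const std::vector<std::string> &path) {
  if (path.size() < 2) return -1;  // at least the base hive and a cell name
  size_t pos = BASE_SIZE;
  size_t level = 0;  // hives entered so far, also the depth expected next
  for (;;) {
    if (pos + HEADER_SIZE + 4 > r.size()) return -1;
    uint32_t tag = rd32(r, pos);
    bool last = (level == path.size() - 1);
    bool match = (tag == TAG_HIVE || tag == TAG_CELL) &&
                 name_is(r, pos, path[level]);
    if (match && tag == TAG_CELL && last) return (long)pos;
    if (match && tag == TAG_HIVE && !last && rd32(r, pos + 36) == level) {
      pos += HEADER_SIZE;  // step into the matching hive
      level++;
    } else if (tag != TAG_HIVE && tag != TAG_CELL) {
      return -1;           // end of the current hive: not found
    } else if (!skip_item(r, pos)) {
      return -1;
    }
  }
}
```

Because names are unique within a generation, the search never needs to back out of a hive once it has stepped into it; a miss inside the matching hive means the cell does not exist.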