File duplication occurs because of the use of a permanent hierarchical directory structure, full stop. The solution is to uniquely identify each file in a flat, directory-less file system. In place of a permanent hierarchy, a "virtual folder" hierarchy can be used purely as a convenience, to help visualize relationships between files. Besides the traditional folder-style view, tagging can then be exploited much more readily. Essentially, what I'm describing IS tagging (a semantic file system) with an added layer that creates hierarchical relationships between the tags, purely for visualization. At a low level, the disk is already managed this way. We just ruined it by adding an archaic form of file management on top (in the case of Windows and Linux). This would eliminate unintentional duplication. It would NOT, however, deal with files that each have a unique name or ID but identical content. Another layer would need to be added, using checksums, to flag potential duplicates.
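To make the checksum layer concrete, here is a rough sketch of how potential duplicates could be flagged. Everything in it is an assumption for illustration only: the function names `content_digest` and `find_potential_duplicates` are made up, SHA-256 is just one possible checksum, and it walks an ordinary folder tree because that is what we have today.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def content_digest(path, chunk_size=1 << 20):
    """Return a SHA-256 hex digest of the file's bytes, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_potential_duplicates(root):
    """Group files under root by content digest; keep groups with 2+ files."""
    by_digest = defaultdict(list)
    for p in Path(root).rglob("*"):
        if p.is_file():
            by_digest[content_digest(p)].append(p)
    # Hypothetical policy: only flag the groups, never delete anything.
    return {d: sorted(ps) for d, ps in by_digest.items() if len(ps) > 1}
```

Calling `find_potential_duplicates("some/folder")` returns a dict mapping each shared digest to the list of files that have identical content, regardless of their names. It only flags candidates; deciding which copy to keep is a separate step.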
VaM file bloat....
The real question is: why hasn't this type of file management been implemented, considering the benefits for efficient disk usage?
Flat and unique is more efficient... Anything else is masochistic...
The hierarchical approach also forces programmers to create unnecessarily complex scripts to manage file duplication. Dysfunction begets more dysfunction...
Unique identifiers can be created in much the same way we create compound words (something we already do naturally): by concatenating unique attributes of the file (or whatever) and running them through a hashing algorithm (like a checksum), used here as a kind of compression, one could crunch the essence of a "file" down into a concise descriptor. The file name can consist of a concatenation of [creator name | file purpose | creation date/time | and of course ext. (file type)]. Not unlike VAR naming, except that VAR internally still uses a rigid hierarchical system...
A standard would need to be created for the type of algorithm used for this compression (otherwise you would get inconsistent naming), and the attributes to concatenate would have to be standardized as well (agreed upon in advance) for consistency. Each attribute is then run through the algorithm to create a concise (size-limited) descriptor in each attribute category. The trickier attribute would be "File Purpose." This would be read directly from the file contents, presumably the file header, body, etc... Regardless of the fine details, it would make names effectively unique, could easily be automated, and would be far less frustrating...
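Here is a minimal sketch of what such a naming scheme could look like. All of it is assumed for illustration: the helper names `short_digest` and `make_file_id`, the choice of SHA-256 truncated to 8 hex characters, the "-" separator, and the decision to leave the timestamp readable instead of hashing it. A real standard would have to pin each of those down, and how "file purpose" gets extracted from the contents is left out entirely (here it is just passed in as a string).

```python
import hashlib
from datetime import datetime, timezone


def short_digest(value: str, length: int = 8) -> str:
    """Crunch one attribute down to a fixed-size hex descriptor."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:length]


def make_file_id(creator: str, purpose: str, created: datetime, ext: str) -> str:
    """Concatenate [creator | purpose | creation date/time] plus extension.

    Assumes `created` is timezone-aware; it is normalized to UTC so the
    same instant always yields the same name.
    """
    stamp = created.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    parts = [short_digest(creator), short_digest(purpose), stamp]
    return "-".join(parts) + "." + ext.lstrip(".").lower()
```

The same inputs always produce the same name, which is the consistency the standard is meant to guarantee. Note that truncated digests can collide in principle, so a longer `length` trades name size for a lower chance of two different attributes mapping to the same descriptor.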
Some people love to live in chaos... I do not....
Semantic file system - Wikipedia
en.wikipedia.org