Understanding File Systems

A file system is a method for storing and organizing files on a storage medium such as a hard disk, memory card, CD or DVD.

File systems are very complex, and this article is not going to go into any specific detail. However, every computer user can benefit from understanding in very general terms how data is stored on their computer. This knowledge will help you understand how lost data can be recovered, how the chances of successful recovery depend on what has happened to the disk since the files were lost, and why you need to use data privacy software to ensure that deleted files containing sensitive data are actually deleted.

To keep things simple, this article is written mainly from a Windows standpoint. But it is concerned only with the basic principles of file storage, not the actual methods used, and those principles are the same even if you're using a Mac or Linux.

What is a file?

Although a file may seem like a single entity to you, to a computer it is just a collection of blocks of data. The smallest possible block is called a sector, which is the smallest unit the disk drive itself will read or write, and is usually 512 bytes. For convenience, Windows works with larger blocks called clusters, each made up of one or more sectors. Clusters commonly range from 512 bytes to 64KB in size, depending on the file system and the size of the drive. Large clusters can waste disk space, since there is usually unused space at the end of the last (or only) cluster in a file.
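To make the wasted space concrete, here is a small Python sketch that works out how many clusters a file occupies and how many bytes are left unused in its last cluster. The function name and the 10,000-byte file are made up for illustration, and the 512-byte sector is simply the usual figure quoted above.

```python
SECTOR_SIZE = 512  # bytes; the usual sector size mentioned above


def cluster_usage(file_size, sectors_per_cluster=8):
    """Return (clusters_needed, unused_bytes) for a file of file_size bytes."""
    cluster_size = SECTOR_SIZE * sectors_per_cluster
    # A file always occupies a whole number of clusters, so round up.
    clusters = (file_size + cluster_size - 1) // cluster_size
    unused = clusters * cluster_size - file_size
    return clusters, unused


# The same 10,000-byte file stored with 4KB and with 32KB clusters.
print(cluster_usage(10_000, sectors_per_cluster=8))   # (3, 2288)
print(cluster_usage(10_000, sectors_per_cluster=64))  # (1, 22768)
```

With 4KB clusters the file wastes a little over 2KB. With 32KB clusters it wastes more than twice its own size.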

File system

A file is a collection of clusters in which the data that makes up the file is stored. The clusters in a file do not need to be stored next to each other on the disk, or even in the same order in which they appear in the file. Windows maintains a list that shows which clusters belong to which file, and in which order, so it doesn't matter whether they are next to each other on the disk.
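The idea of a cluster list can be shown with a toy model in Python. This is only a sketch: the four-byte clusters, the eight-cluster "disk" and the names used here are invented for the example and do not reflect how any real file system lays out its data.

```python
# A pretend disk with eight tiny clusters. Unused clusters hold filler bytes.
disk = [b"...."] * 8

# The file's data is scattered around the disk, and not even in file order.
disk[7] = b"hell"
disk[2] = b"o wo"
disk[5] = b"rld!"

# The list that records which clusters belong to which file, and in which order.
cluster_list = {"greeting.txt": [7, 2, 5]}


def read_file(name):
    """Rebuild a file by reading its clusters in the order the list gives."""
    return b"".join(disk[number] for number in cluster_list[name])


print(read_file("greeting.txt"))  # b'hello world!'
```

Because the list records the order, the file reads back correctly even though its clusters are stored out of sequence.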

A folder is another term for a directory, which is a special type of file containing a list of files. In fact, a directory is more than just a list of filenames. It contains information about each file, such as its length, the date it was created, the date it was last accessed, and which users have permission to access it. It also contains pointers to the clusters containing the actual data. Exactly how the directory is structured, and what information it contains, is determined by the file system.
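As a rough illustration, a directory entry could be modelled in Python like this. The field names are assumptions chosen to match the description above. Real file systems such as FAT and NTFS use their own, quite different, on-disk layouts.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class DirectoryEntry:
    name: str
    length: int              # file length in bytes
    created: datetime
    last_accessed: datetime
    allowed_users: set       # users with permission to access the file
    clusters: list           # pointers to the data clusters, in file order


def find_entry(directory, name):
    """Look a file up by name; return its entry, or None if it is not listed."""
    for entry in directory:
        if entry.name == name:
            return entry
    return None


# A directory is then simply a list of such entries.
directory = [
    DirectoryEntry("notes.txt", 10, datetime(2008, 1, 5), datetime(2008, 2, 1),
                   {"alice"}, [7, 2, 5]),
    DirectoryEntry("photo.jpg", 9000, datetime(2008, 1, 9), datetime(2008, 1, 9),
                   {"alice", "bob"}, [3, 4, 6]),
]

entry = find_entry(directory, "notes.txt")
print(entry.length, entry.clusters)
```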

There are many different file systems. On Windows-based computers the most common are the New Technology File System (NTFS) and the much older File Allocation Table (FAT) system. But since they all do essentially the same job, their principles of operation are the same.

When choosing data recovery software it's important to know the file system used on the disk containing the lost data, because data recovery tools are written to work with specific file systems. As a general rule, floppy disks, pen drives, flash drives, camera memory cards and many removable drives use the FAT or FAT32 file systems. Most internal hard drives on computers running Windows XP and Windows Vista (and many of those running Windows 2000 or NT) use NTFS.

CDs and DVDs use ISO 9660 (often with the Joliet extension) or UDF, so data recovery software for magnetic and memory card media cannot be used to recover files from optical media, and vice versa.

Fragmentation

When files are written to a new, blank disk, their clusters are written contiguously, starting at the beginning of the file and finishing at the end. But as files are deleted, gaps of reusable storage space appear between the remaining files, and a gap may not be large enough to hold the next file being written. That file may then be written using several groups of clusters occupying different locations on the disk. Such a file is said to be fragmented.
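The following Python sketch shows how fragmentation can arise. It assumes a deliberately naive rule, "use the first free clusters you find", which is simpler than the allocation strategies real file systems use. The function names and the ten-cluster disk are invented for the example.

```python
def allocate(disk, name, count):
    """Claim the first `count` free clusters for file `name`; return their numbers."""
    free = [i for i, owner in enumerate(disk) if owner is None]
    if len(free) < count:
        raise ValueError("not enough free clusters")
    chosen = free[:count]
    for i in chosen:
        disk[i] = name
    return chosen


def free_file(disk, name):
    """Mark every cluster owned by `name` as free again."""
    for i, owner in enumerate(disk):
        if owner == name:
            disk[i] = None


def fragments(clusters):
    """Count the separate contiguous groups in a list of cluster numbers."""
    return sum(1 for k, c in enumerate(clusters)
               if k == 0 or c != clusters[k - 1] + 1)


disk = [None] * 10           # each slot records which file owns that cluster
a = allocate(disk, "A", 3)   # clusters 0, 1, 2
b = allocate(disk, "B", 2)   # clusters 3, 4
c = allocate(disk, "C", 3)   # clusters 5, 6, 7
free_file(disk, "B")         # leaves a two-cluster gap between A and C
d = allocate(disk, "D", 4)   # too big for the gap, so it is split in two

print(d)             # [3, 4, 8, 9]
print(fragments(d))  # 2
```

File D ends up in two pieces, one filling the gap left by B and one after C.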

When a file is fragmented, it takes longer to read the file from disk, though the difference is hardly noticeable with modern computers. However, it's harder to recover lost files when they are fragmented. If the directory entry for the file has also been lost, there is no information available to link the different groups of clusters together.

When you run a defragmentation utility, it rearranges the clusters of data so that all the clusters belonging to a file are next to each other, in the right order. This speeds up disk access, because the disk drive only has to position the read/write head once to read all the data of the file.

Defragmenting a drive can improve the chances of successfully recovering files that are subsequently deleted or lost. On the other hand, defragmenting the disk will more or less destroy the chances of recovering files that were lost or deleted before it was defragmented. So doing a disk cleanup followed by a defragment, as many people do, is not a good idea: if you then discover that you deleted something you shouldn't have, the file may not be recoverable.
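Both effects can be seen in a simplified Python model of defragmentation. This is a sketch under stated assumptions: a real defragmenter moves data around in place, whereas this one just builds a fresh layout, and the function name and sample data are invented.

```python
def defragment(disk, files):
    """Return (new_disk, new_files) with every file stored contiguously, in order.

    disk  : list of cluster contents (free clusters may still hold old data)
    files : dict mapping each live file name to its cluster numbers, in file order
    """
    new_disk = [None] * len(disk)
    new_files = {}
    next_free = 0
    for name, clusters in files.items():
        new_files[name] = list(range(next_free, next_free + len(clusters)))
        for old in clusters:
            new_disk[next_free] = disk[old]   # copy only data owned by live files
            next_free += 1
    return new_disk, new_files


# Cluster 1 is free, but still holds the remains of a deleted file.
disk = [b"A1", b"old", b"B1", b"A2", None, b"B2"]
files = {"A": [0, 3], "B": [2, 5]}

new_disk, new_files = defragment(disk, files)
print(new_files)  # {'A': [0, 1], 'B': [2, 3]}
print(new_disk)   # [b'A1', b'A2', b'B1', b'B2', None, None]
```

The live files are now contiguous, but the leftover data that was sitting in a free cluster has not survived the move.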

Deleting

When a file is deleted from your computer, it is not really deleted. All that happens is that the directory entry for the file, which points to the list of clusters on the disk containing the file data, is marked to show it is no longer valid. Right after a file is deleted, nearly all of the information about it, including most of the filename, and the list of clusters that contain the data, still remains on the drive, although the operating system no longer shows it.

If you accidentally erase a file, there's a good chance you can get it back. If you realize straight away that you deleted an important file, and immediately use an undelete tool such as Uneraser, you will almost certainly get the file back. However, the longer you leave it, the greater the chances that the directory entry or the clusters holding the data will be reused by some other file, and the lower your chances of recovering the data intact.
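Here is a toy Python model of why timing matters. Everything in it (the ToyVolume class, its methods and the file names) is hypothetical. It is not how Uneraser or any real undelete tool works, only an illustration of the principle that deleting marks the entry and leaves the data alone.

```python
class ToyVolume:
    def __init__(self, cluster_count):
        self.clusters = [None] * cluster_count  # raw contents of each cluster
        self.entries = {}                       # name -> {"clusters": [...], "deleted": bool}

    def _live_clusters(self, skip=None):
        """Cluster numbers owned by files that are not deleted."""
        return {c for n, e in self.entries.items()
                if n != skip and not e["deleted"] for c in e["clusters"]}

    def write(self, name, chunks):
        live = self._live_clusters()
        free = [i for i in range(len(self.clusters)) if i not in live]
        if len(free) < len(chunks):
            raise ValueError("not enough free clusters")
        chosen = free[:len(chunks)]
        for i, chunk in zip(chosen, chunks):
            self.clusters[i] = chunk            # may overwrite a deleted file's data
        self.entries[name] = {"clusters": chosen, "deleted": False}

    def delete(self, name):
        self.entries[name]["deleted"] = True    # the data itself is left untouched

    def undelete(self, name):
        entry = self.entries[name]
        # Recovery fails if another file has since taken over any of the clusters.
        if self._live_clusters(skip=name) & set(entry["clusters"]):
            return None
        entry["deleted"] = False
        return b"".join(self.clusters[c] for c in entry["clusters"])


vol = ToyVolume(4)
vol.write("report.doc", [b"AB", b"CD"])

vol.delete("report.doc")
first_try = vol.undelete("report.doc")    # undeleted straight away

vol.delete("report.doc")
vol.write("photo.jpg", [b"XX"])           # reuses one of the freed clusters
second_try = vol.undelete("report.doc")   # too late

print(first_try)   # b'ABCD'
print(second_try)  # None
```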

If you don't have an undelete tool to hand, stop using the drive the lost data was on until you have got one. If the lost files were on the same drive as Windows, avoid using the computer altogether, because Windows writes to the drive all the time. You should try not to install data recovery software to the same drive as your lost files, for the same reason. Either put the hard drive into another computer on which the recovery software has been installed, or use a bootable data recovery CD.

Recycle Bin

Most operating systems provide a measure of protection against accidental deletion, called the Recycle Bin, Wastebin or Trash. When you delete a file, the operating system just moves it to the Recycle Bin folder. As long as the file remains in the Recycle Bin, it can be recovered, or undeleted, exactly as it was, without the need for any data recovery tools whatsoever.

However, not all files are deleted to the Recycle Bin. In Windows this applies only to files that are deleted from Explorer. Files that are deleted from within another application are not deleted to the Recycle Bin. If you press down the Shift key while deleting a file (Shift+Delete), it doesn't go to the Recycle Bin. If a file is overwritten, by saving new data to it, then the old file is not moved to the Recycle Bin first.

Nevertheless, when you think you must have deleted a file that you desperately need, the Recycle Bin should be the first place you look.

Emptying the Recycle Bin deletes the files it contained. The disk space that was occupied by the files in the Recycle Bin folder is marked as available for reuse. However, as mentioned earlier, the contents of this disk space are not immediately overwritten. The files are still recoverable at this point.

When a file is deleted from the Windows Recycle Bin, the link with its original filename is lost. This makes it harder to identify what the files are when doing a data recovery. The deleted files recovery tool Uneraser makes the task easier by displaying previews of all recoverable files so you can see what they are and work out what names to give them if you recover them.

Privacy

What is good news when you delete a file by accident is bad news for privacy. Since data is not physically erased, even if you Shift+Delete to bypass the Recycle Bin, any sensitive or personal information that you want to remove from the computer is still accessible and easily recoverable using data recovery software. To permanently erase data so that it is not recoverable you must use a tool like Privacy Guardian which will physically overwrite all traces of your data.
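The basic idea of overwriting before deleting can be sketched in a few lines of Python. This is an illustration only, not how Privacy Guardian or any other product works, and the function name is made up. It is also not a guarantee of secure erasure: on some storage (solid-state drives, for example) and some file systems, the new data may be written to a different physical location, leaving the old data behind.

```python
import os
import tempfile


def overwrite_in_place(path, chunk_size=64 * 1024):
    """Replace every byte of the file at `path` with zeros, keeping its length."""
    remaining = os.path.getsize(path)
    with open(path, "r+b") as f:       # open for update without truncating
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(b"\x00" * n)
            remaining -= n
        f.flush()
        os.fsync(f.fileno())           # ask the OS to push the zeros to the drive


# Demonstration on a throwaway temporary file.
fd, path = tempfile.mkstemp()
os.close(fd)
secret = b"account number 12345"
with open(path, "wb") as f:
    f.write(secret)

overwrite_in_place(path)
with open(path, "rb") as f:
    wiped = f.read()                   # the file now contains only zero bytes

os.remove(path)                        # only now remove the directory entry
```

The order matters: the data is overwritten first, and the file is deleted afterwards.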

Another thing you need to bear in mind is that access rights which prevent one user from accessing another's files on the same computer only take effect when files are accessed through the operating system. Data recovery software bypasses the operating system to read the data directly from the disk, ignoring any access rights that may have been set. To properly protect your data files from unauthorized access you need to encrypt them, or store them in an encrypted container. One privacy product that includes this feature is CyberScrub Privacy Suite.