How signature search algorithms work in data recovery software

Signature search is one of the most important algorithms that make modern data recovery programs what they are: versatile tools able to pull files from formatted, corrupted and inaccessible disks.

First, let's look at how Windows stores and deletes files


Files are stored as blocks of information written to hard disk sectors. The sectors can be placed sequentially, one after another, or scattered randomly across the disk surface. Their location depends on which blocks were free at the time the file was saved. If the system cannot find a contiguous free run of sectors large enough to hold the file as one continuous sequence, it fragments the file, writing its parts into whatever free blocks are available.
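The allocation behaviour described above can be illustrated with a toy model. This is only a sketch: the `allocate` function and the boolean `free_map` are hypothetical names invented for the example, and real file systems use far more elaborate allocation strategies.

```python
def allocate(free_map, sectors_needed):
    """Choose sectors for a new file and mark them as used.

    free_map is a list of booleans (True = sector is free).
    The first contiguous free run that is long enough is preferred;
    if there is none, the file is fragmented across free sectors.
    """
    run_start, run_len = None, 0
    for i, free in enumerate(free_map):
        if free:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == sectors_needed:
                chosen = list(range(run_start, run_start + sectors_needed))
                break
        else:
            run_len = 0
    else:
        # No contiguous run is large enough: take free sectors in disk order.
        chosen = [i for i, free in enumerate(free_map) if free][:sectors_needed]
        if len(chosen) < sectors_needed:
            raise OSError("not enough free sectors")
    for i in chosen:
        free_map[i] = False
    return chosen
```

With sectors 0, 2, 3 and 5 free, a two-sector file goes into the run 2-3; a second two-sector file then has to be split between sectors 0 and 5.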


To keep track of the recorded information, Windows creates an entry in the file system that indicates which sectors on the disk belong to a specific file.

When the user deletes a file, Windows does not erase or overwrite the contents of its sectors on the disk. The file's record in the file system is not removed either; it is only modified: the system marks the record as belonging to a deleted file. As a result, all sectors that belonged to the file are now considered free, and Windows can use that space for some other file. Until that happens, you can try to recover the contents of the deleted file. This requires special data recovery software.

Programs that restore deleted files scan the file system in search of records marked as deleted. Analyzing these records reveals the exact addresses of the disk sectors where the contents of the original file were written. After a quick check that these sectors do not now belong to some other file, the program reads the sectors and saves them to a new file. Problem solved!
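The record-based approach can be sketched with a toy model. Everything here is an assumption made for illustration: the `Record` class, the `deleted` flag and the list-of-sectors `disk` are hypothetical and do not reproduce the on-disk format of NTFS or FAT.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Toy file system record: name, sector numbers, size in bytes."""
    name: str
    sectors: list
    size: int
    deleted: bool = False


def delete(record):
    # Deleting only flags the record; sector contents stay untouched.
    record.deleted = True


def recover_deleted(records, disk):
    """Return {name: data} for deleted files whose sectors were not reused.

    disk is a list of bytes objects, one per sector.
    """
    in_use = {s for r in records if not r.deleted for s in r.sectors}
    recovered = {}
    for r in records:
        # Quick check: skip files whose sectors now belong to a live file.
        if r.deleted and not in_use.intersection(r.sectors):
            data = b"".join(disk[s] for s in r.sectors)
            recovered[r.name] = data[:r.size]  # drop padding in the last sector
    return recovered
```

In this model a deleted file spread over sectors 0 and 2 is recovered intact as long as no live record claims either sector; once a new file takes sector 2, the file is skipped.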

What happens if the file system no longer contains a record pointing to the deleted file? In this case the simplest tools do not work. A different approach is required: signature-based search.

Signature search


Signature search allows recovery programs to work with damaged and formatted partitions, and with drives that have been repartitioned. The technology goes by many commercial names: “Power Search”, “Content-Aware Analysis”, “Smart Scan”. All of these technologies from different manufacturers work on the same principle.

The basic principle of signature search algorithms is the same as that of the first antivirus programs. Just as an antivirus scans files looking for data sequences that match known fragments of virus code, the signature search algorithms used in data recovery programs read information from the disk surface, hoping to find familiar chunks of data. The headers of many file types contain characteristic sequences. For example, JPEG files contain the character sequence “JFIF”, ZIP archives begin with “PK”, and PDF documents begin with “%PDF-”.

Some files (for example, text and HTML files) have no characteristic signature, but they can be identified by indirect evidence, since they contain only ASCII characters.
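A minimal sketch of such an indirect check might look like this. The function name `looks_like_text` is hypothetical, and restricting the test to printable ASCII plus tab, line feed and carriage return is an assumption of this example (otherwise an empty, zero-filled sector would also count as "only ASCII").

```python
def looks_like_text(sector):
    """True if the sector is non-empty and holds only printable ASCII
    characters or common whitespace (tab, LF, CR)."""
    allowed = set(range(0x20, 0x7F)) | {0x09, 0x0A, 0x0D}
    return len(sector) > 0 and all(b in allowed for b in sector)
```

A sector holding HTML markup passes the check, while a sector starting with the JPEG bytes FF D8 does not.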

More examples:
File type   Starts with signature (hex)
avi         5249
bmp         424D
tif         4949
doc         D0CF
docx        504B
jpeg        FFD8
png         8950
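As a rough illustration, the signatures from the table can be turned into a simple scanner. This is a sketch under stated assumptions, not how any particular product works: the names `SIGNATURES`, `SECTOR_SIZE` and `scan_image` are invented here, the sector size is assumed to be 512 bytes, and files are assumed to start on a sector boundary. Two-byte prefixes produce many false positives (504B, for instance, matches any ZIP archive, not only docx), so a practical scanner would have to check longer signatures and validate the header.

```python
SECTOR_SIZE = 512  # assumed sector size

# Two-byte prefixes taken from the table above.
SIGNATURES = {
    bytes.fromhex("5249"): "avi",
    bytes.fromhex("424D"): "bmp",
    bytes.fromhex("4949"): "tif",
    bytes.fromhex("D0CF"): "doc",
    bytes.fromhex("504B"): "docx",
    bytes.fromhex("FFD8"): "jpeg",
    bytes.fromhex("8950"): "png",
}


def scan_image(image, sector_size=SECTOR_SIZE):
    """Return (offset, file_type) for every sector that starts
    with one of the known two-byte signatures."""
    hits = []
    for offset in range(0, len(image), sector_size):
        prefix = bytes(image[offset:offset + 2])
        kind = SIGNATURES.get(prefix)
        if kind is not None:
            hits.append((offset, kind))
    return hits
```

The scanner reads raw bytes only and never consults the file system, which is why this kind of search still works after formatting or repartitioning.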

Finding the start of a file is not enough to restore it; you also need to determine where it ends. The end of the file can be found from its size and the address of its beginning. The file size is determined either by analyzing the header (ZIP, JPEG, AVI etc.) or by reading and analyzing the disk sectors that come immediately after the header. For example, the algorithm takes as the end of a text or HTML file the first sector that contains characters outside the ASCII table.
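Both ways of finding the end of a file can be sketched briefly. For the header-based case the example uses the RIFF container that AVI files are built on, where bytes 4 to 7 hold a little-endian size that excludes the first 8 bytes. The function names are hypothetical, and the text case simply treats any byte above 0x7F as non-ASCII.

```python
import struct


def riff_total_size(header):
    """Total size in bytes of a RIFF file (such as AVI), from its first 8 bytes."""
    if len(header) < 8 or header[:4] != b"RIFF":
        raise ValueError("not a RIFF header")
    (chunk_size,) = struct.unpack_from("<I", header, 4)
    # The size field does not count the 4-byte tag and the 4-byte size itself.
    return chunk_size + 8


def text_end_sector(sectors, start):
    """Index of the first sector at or after start that contains a
    non-ASCII byte; len(sectors) if no such sector exists."""
    for i in range(start, len(sectors)):
        if any(b > 0x7F for b in sectors[i]):
            return i
    return len(sectors)
```

Given the start address and either of these results, a recovery tool can read a continuous range of sectors, which only yields a correct file when the file was not fragmented.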

Signature-based search is not a panacea. Overwritten disk contents and file fragmentation (especially for larger files) reduce the chances of recovering data.

A good example of a software implementation of signature search is Starus Partition Recovery. The trial version of the program lets you scan the media and preview recoverable files.

How to delete a file so that it cannot be restored


There is a whole class of programs designed for reliable and secure data destruction. One of the best programs for deleting files and overwriting free disk space with random data is Eraser.



Such programs use arrays of random numbers to overwrite the physical disk space occupied by the file being destroyed. Some security standards (e.g. the standard used in the U.S. Army) require multiple overwrite passes and insist on cryptographically strong random number generators. In practice, a single overwrite pass is enough for private users and most commercial organizations.
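The overwrite idea can be sketched in a few lines. This is not how Eraser is implemented; `overwrite_file` and `secure_delete` are hypothetical names, and the sketch assumes that writing to the file in place actually reaches the same physical sectors, which is not guaranteed on SSDs or on journaling and copy-on-write file systems.

```python
import os


def overwrite_file(path, passes=1, chunk_size=1024 * 1024):
    """Overwrite the file's contents in place with random bytes."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                # os.urandom draws from the operating system's
                # cryptographically strong random source.
                f.write(os.urandom(n))
                remaining -= n
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data to the device


def secure_delete(path, passes=1):
    """Overwrite the file, then remove it from the file system."""
    overwrite_file(path, passes)
    os.remove(path)
```

The default of one pass follows the observation above that a single overwrite is enough for most users; a stricter policy can be modelled by raising `passes`.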

Article based on information from habrahabr.ru
