Compare Directories

There are situations when you may want to check whether two directories are the same, i.e. whether they contain the same file structure and the same files. Although several utilities already perform such comparisons, you, as a programmer on the BDUX project (a Bangla Linux platform), are required to develop another one that checks whether two directories have the same content. This may enable the recovery of broken file transactions over the network.

Input

The input format is quite complex. The input contains several transactions of directories. Let us first define a “File Object”: a file object is either a directory or a general file.

A file object of type directory occurs in the format ‘DIRECTORY_NAME FN’, where DIRECTORY_NAME is any valid directory name (at most 255 characters) and indicating that this is a directory. The integer FN (at most 65 thousand) is the number of files and folders contained in that directory. The next FN lines contain the descriptions of those FN file objects.

A file object of type general file occurs in the format ‘FILE_NAME NByte’. File names are all valid names, and file sizes are given in bytes. No file is larger than 2 gigabytes.

Each transaction has the following format:

• The first line contains the date of the transaction.
• Root directory of the first directory to compare.
• Description of the first directory.
• Root directory of the second directory to compare.
• Description of the second directory.

Output

The output must be properly formatted using “tabs”. For this problem, a “tab” is defined as four (4) spaces (‘    ’).

A file object, when the situations described later require it, is printed as follows:

• If it is a directory, print ‘DIRECTORY_NAME N object(s)’, where DIRECTORY_NAME is the name of the directory and N is the number of objects present in that directory.
• If it is a general file, print ‘FILE_NAME N byte(s)’, where FILE_NAME is the name of the file and N is the size of the file in bytes.

First print ‘==== Begin of Comparison ====’ on a single line. For each transaction print ‘Transaction #’ followed by the transaction number, then the date. Then generate the report for each pair of directories, starting with a tab value of 0:

• Print the tabs, i.e. 4*tab spaces.
• Print ‘Comparing "path1" with "path2".’, replacing path1 with the first directory and path2 with the second directory.

• For each file object:
  – If they have totally different content, print tab number of tabs and report ‘Totally different.’, then return.
  – If both have a common directory, descend into that directory with tab + 1 and start comparing.
  – If both have a file name in common but different file sizes, print tab + 1 tabs and report it immediately, in the format: ‘File size mismatch : "PATH_1/FILE_NAME_1 (FILE_SIZE_1)" and "PATH_2/FILE_NAME_2 (FILE_SIZE_2)".’
  – Report the files that are in path2 but not in path1:
    • Print tab + 1 tabs, followed by the header line that appears in the sample output, ‘"path1" lacks of following file(s)’.
    • For each object print tab + 2 tabs and the file object according to the rule described previously.
  – Report the files that are in path1 but not in path2:
    • Print tab + 1 tabs, followed by the header line that appears in the sample output, ‘"path2" lacks of following file(s)’.

    • For each object print tab + 2 tabs and the file object according to the rule described previously.
• If each file object in path1 equals each file object in path2, print the tabs, then print ‘No difference.’
• Otherwise print the tabs, then print ‘Difference(s) encountered.’

Print a blank line after each transaction except the last one. Output should be sorted according to the input. The common file objects of any two directories to compare are in the same order. You must use the longest common subsequence of the file names in the two directories' contents to obtain the common file names. At the end print ‘==== End of Comparison ====’ on a single line. See the sample output below.

Sample Input

12/23/2001
/usr/bin
suman 7
BigBro 2
1.exe 987
2.exe 987
NewFile 109
directory 3
hi 108
thisFile 203
xlog 1
xlog.log 111
extra 1029
underConstruction 3
index.html 12395
p0.html 1333
p1.html 2287
wrt.doc 1987
zlib 0
/home
suman 8
AAA.dat 60000
BigBro 2
3.exe 387
4.exe 223
NewFile 109
directory 3
hi 108
thisFile 203
xlog 1
xlog.log 111
extra 1029
underConstruction 3
index.html 11005
p0.htm 1333
p1.htm 2287
wrt.doc 1987
zLib 2
bin 2
gzip 299
gzip.log 300
zlib.so 23098
10/3/2002
/CDrive
ACMHelper 1
acmhelper.exe 100
/tmp
Helper 1
acmhelper.exe 100
11/2/2000
/CDrive
Prog 2
ACMHelper 1
acmhelper.exe 100
newDoc.rtf 2024
/tmp
CopyProg 2
Helper 1
acmhelper.exe 100
noname.c 1002

Sample Output

==== Begin of Comparison ====
Transaction #1 : Date 12/23/2001
Comparing "/usr/bin/suman" with "/home/suman".
    Comparing "/usr/bin/suman/BigBro" with "/home/suman/BigBro".
    Totally different.
    Comparing "/usr/bin/suman/directory" with "/home/suman/directory".
        Comparing "/usr/bin/suman/directory/xlog" with "/home/suman/directory/xlog".
        No difference.
    No difference.
    Comparing "/usr/bin/suman/underConstruction" with "/home/suman/underConstruction".
        File size mismatch : "/usr/bin/suman/underConstruction/index.html (12395)" ... ... and "/home/suman/underConstruction/index.html (11005)".
        "/usr/bin/suman/underConstruction" lacks of following file(s)
            p0.htm 1333 byte(s)
            p1.htm 2287 byte(s)
        "/home/suman/underConstruction" lacks of following file(s)
            p0.html 1333 byte(s)
            p1.html 2287 byte(s)
    Difference(s) encountered.
    "/usr/bin/suman" lacks of following file(s)
        AAA.dat 60000 byte(s)
        zLib 2 object(s)
    "/home/suman" lacks of following file(s)
        zlib 0 object(s)
Difference(s) encountered.

Transaction #2 : Date 10/3/2002
Comparing "/CDrive/ACMHelper" with "/tmp/Helper".
No difference.

Transaction #3 : Date 11/2/2000
Comparing "/CDrive/Prog" with "/tmp/CopyProg".
Totally different.
==== End of Comparison ====
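
The requirement to use a longest common subsequence of the file names can be sketched as follows. This is a minimal Python illustration, not a full solution: the function names `lcs_pairs`, `split_names`, and `describe` are hypothetical, names are assumed to match by exact, case-sensitive string equality (so `zlib` and `zLib` differ, as in the sample), and a single space between the name and the number in a printed file object is also an assumption. Parsing the input and the recursive descent into common directories are left out.

```python
def lcs_pairs(a, b):
    """Return index pairs (i, j) of one longest common subsequence of a and b."""
    n, m = len(a), len(b)
    # table[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    # Walk the table from the front to recover the matched positions.
    pairs, i, j = [], 0, 0
    while i < n and j < m:
        if a[i] == b[j]:
            pairs.append((i, j))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return pairs


def split_names(first, second):
    """Split two name lists into (common, only in first, only in second)."""
    pairs = lcs_pairs(first, second)
    matched_first = {i for i, _ in pairs}
    matched_second = {j for _, j in pairs}
    common = [first[i] for i, _ in pairs]
    only_first = [x for k, x in enumerate(first) if k not in matched_first]
    only_second = [x for k, x in enumerate(second) if k not in matched_second]
    return common, only_first, only_second


def describe(name, count, is_dir, tab):
    """Format one file object, indented by `tab` tabs of four spaces each."""
    unit = "object(s)" if is_dir else "byte(s)"
    return " " * (4 * tab) + f"{name} {count} {unit}"


if __name__ == "__main__":
    # Names taken from the two underConstruction directories in the sample.
    first = ["index.html", "p0.html", "p1.html"]
    second = ["index.html", "p0.htm", "p1.htm"]
    common, only_first, only_second = split_names(first, second)
    print(common, only_first, only_second)
    print(describe("zLib", 2, True, 2))
```

Under this reading, an empty `common` list corresponds to the ‘Totally different.’ case, `only_second` feeds the list printed for path1, and `only_first` feeds the list printed for path2.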