Sujao l33t
Joined: 25 Sep 2004 Posts: 677 Location: Germany
|
Posted: Wed Feb 09, 2005 11:00 pm Post subject: two identical files with different size |
|
|
Hi,
I have two ASCII files (according to "file") that contain EXACTLY the same text; not a single char is different. Still, they have different sizes!
fileA is 808 bytes
fileB is 844 bytes
fileA was created by a program; fileB was created with touch, and then every line of fileA was appended to it with "echo $x >>".
I guess it might be the charset. The one created by the program is probably plain ASCII and the touched one is UTF. Would "file" recognize that? Because it says ASCII text. If that is so, how can I force the creation of a plain ASCII file? Oh, and they are both on the same filesystem (xfs).
Actually I wouldn't care about 36 bytes, but the second file contains a structure that another program should be able to read. It can read fileA but not fileB, and as I don't see any textual difference it has to be something internal. |
|
angoraspruce Apprentice
Joined: 08 Jan 2005 Posts: 193 Location: Minnesota, USA
|
Posted: Wed Feb 09, 2005 11:09 pm Post subject: Re: two identical files with different size |
|
|
Hello,
Open the files in a binary/hex editor, and you'll see all the hidden characters. Myself, I'd use 'vim -b <file>', which usually shows me hidden characters that have crept in when I switch files from the Mac to Windows to Linux.
Best regards |
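For anyone who would rather stay on the command line than open an editor, the same check can be sketched with wc, cmp and od. The two sample files below are made up for illustration (one ends its line with a Windows-style carriage return); they are not the files from this thread, and the file names are placeholders.

```shell
# Hypothetical stand-ins for two files that look identical in a pager.
tmpdir=$(mktemp -d)
printf 'hello\n'   > "$tmpdir/unix.txt"   # LF line ending
printf 'hello\r\n' > "$tmpdir/dos.txt"    # CR+LF line ending

# Byte counts; the arithmetic expansion strips any padding wc may add.
size_unix=$(( $(wc -c < "$tmpdir/unix.txt") ))
size_dos=$(( $(wc -c < "$tmpdir/dos.txt") ))
echo "unix.txt: $size_unix bytes, dos.txt: $size_dos bytes"

# cmp -s prints nothing and only sets the exit status:
# 0 = identical, 1 = different.
if cmp -s "$tmpdir/unix.txt" "$tmpdir/dos.txt"; then
    verdict=identical
else
    verdict=different
fi
echo "cmp says: $verdict"

# od -c dumps every byte, so the stray carriage return shows up as \r.
dump=$(od -c "$tmpdir/dos.txt")
echo "$dump"

rm -rf "$tmpdir"
```

If cmp reports a difference but od shows no odd bytes, the difference is probably in the visible text itself, and a plain diff is the next thing to try.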
|
adaptr Watchman
Joined: 06 Oct 2002 Posts: 6730 Location: Rotterdam, Netherlands
|
Posted: Wed Feb 09, 2005 11:10 pm Post subject: |
|
|
Well, UTF-8 is Unicode, so logically the second file should be twice the size of the first.
But there are easier ways
Code: | diff -a --side-by-side file1 file2 |
_________________ >>> emerge (3 of 7) mcse/70-293 to /
Essential tools: gentoolkit eix profuse screen |
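A side note on the encoding theory: UTF-8 stores each ASCII character as a single byte, so text that contains only ASCII characters has the same size whether it is labelled ASCII or UTF-8. Only non-ASCII characters take two or more bytes (a doubling would be expected from UTF-16, not UTF-8). As far as I know, "file" reporting "ASCII text" also means it found no bytes outside the ASCII range. A minimal sketch, using made-up strings rather than the files from this thread:

```shell
# A pure-ASCII word: one byte per character.
ascii_bytes=$(( $(printf 'timestamp' | wc -c) ))

# The letter a-umlaut, written out as its two UTF-8 bytes in octal
# so the example does not depend on the terminal's encoding.
umlaut_bytes=$(( $(printf '\303\244' | wc -c) ))

echo "'timestamp' (9 ASCII characters): $ascii_bytes bytes"
echo "a-umlaut (1 character in UTF-8):  $umlaut_bytes bytes"
```

So a 36-byte gap between two files that "file" both calls ASCII text is unlikely to come from the charset alone.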
|
Sujao l33t
Joined: 25 Sep 2004 Posts: 677 Location: Germany
|
Posted: Wed Feb 09, 2005 11:18 pm Post subject: |
|
|
I tried both methods. Absolutely no difference.
I uploaded the files. Maybe you can find out more.
fileA
fileB
fileB (subtitles_clean.idx), the bigger one, is the one created with touch and ">>". |
|
adaptr Watchman
Joined: 06 Oct 2002 Posts: 6730 Location: Rotterdam, Netherlands
|
Posted: Wed Feb 09, 2005 11:35 pm Post subject: |
|
|
Alas, uploading will most certainly have reverted them to plain ASCII - or else the webserver will. |
|
Sujao l33t
Joined: 25 Sep 2004 Posts: 677 Location: Germany
|
Posted: Wed Feb 09, 2005 11:41 pm Post subject: |
|
|
FTP still shows the different sizes, but here is the archive anyway. |
|
angoraspruce Apprentice
Joined: 08 Jan 2005 Posts: 193 Location: Minnesota, USA
|
Posted: Thu Feb 10, 2005 1:46 am Post subject: |
|
|
Did you try adaptr's diff command?
Code: | id: de, index: 0 id: de, index: 0
timestamp: 00:00:13:920, filepos: 000000000 | time: timestamp: 00:00:13:920, filepos: 00000 |
'timestamp:' does not equal 'time: timestamp:'
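To reproduce what diff caught, here is a small sketch with two made-up files modelled on the two lines quoted above (the real files have more lines, and the file names here are placeholders). It counts the lines diff flags and how many bytes the extra "time: " prefix adds per line.

```shell
# Hypothetical two-line samples based on the lines quoted in this post.
tmpdir=$(mktemp -d)
printf 'id: de, index: 0\ntimestamp: 00:00:13:920, filepos: 000000000\n' \
    > "$tmpdir/fileA"
printf 'id: de, index: 0\ntime: timestamp: 00:00:13:920, filepos: 000000000\n' \
    > "$tmpdir/fileB"

# diff exits with status 1 when the files differ, which is expected here.
diff "$tmpdir/fileA" "$tmpdir/fileB" || true

# Count the lines diff marks with "<" (only in fileA) or ">" (only in fileB).
changed=$(diff "$tmpdir/fileA" "$tmpdir/fileB" | grep -c '^[<>]' || true)

# Size difference caused by one extra "time: " prefix.
extra=$(( $(wc -c < "$tmpdir/fileB") - $(wc -c < "$tmpdir/fileA") ))

echo "lines flagged by diff: $changed"
echo "extra bytes in fileB:  $extra"

rm -rf "$tmpdir"
```

Each "time: " prefix is 6 bytes, so the reported gap of 36 bytes (844 - 808) would fit six affected lines, assuming nothing else differs; the thread does not say how many lines carry the prefix.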
|
|
Sujao l33t
Joined: 25 Sep 2004 Posts: 677 Location: Germany
|
Posted: Thu Feb 10, 2005 2:02 am Post subject: |
|
|
Superdoh! That's like one of those psycho tricks where you can't see the obvious.
EDIT: Damn, I still can't believe it. I have been looking at the text for several minutes and I didn't see the difference. Maybe I should sleep more. |
|