The UNIX and Linux Forums  

#5, posted 05-12-2008 by era
In fact, transferring the file in ASCII mode should handle the line-ending conversion for you during the transfer. It might (but most likely won't) make other changes too if the file contains special characters, but if, as you say, it's basically ASCII text, then transferring it in ASCII mode is really all it takes.
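If the file has already been uploaded in binary mode, the same conversion can be done after the fact. Here is a minimal sketch in Python, assuming the file is plain text and small enough to read into memory; the names `crlf_to_lf` and `convert_file` are made up for this example.

```python
def crlf_to_lf(data: bytes) -> bytes:
    """Replace each DOS line ending (CR LF) with a Unix one (LF)."""
    return data.replace(b"\r\n", b"\n")


def convert_file(src: str, dst: str) -> None:
    """Write a copy of src to dst with DOS line endings converted."""
    # Open in binary mode so Python does no line-ending translation itself.
    with open(src, "rb") as f:
        data = f.read()
    with open(dst, "wb") as f:
        f.write(crlf_to_lf(data))
```

Lines that already end in a bare line feed are left alone, so running it on a file that is already in Unix format changes nothing.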

There are various tools for looking at the raw bytes in a file; one purpose of a hex editor is to let you inspect the precise bytes so you can spot e.g. line-ending anomalies. The control character ctrl-J is called a "line feed" and ends a line on Unix systems (and thus on the hosting account you are using), whereas DOS-based systems, Windows included, use a sequence of two characters: ctrl-M (carriage return) followed by line feed. In a hex editor, these show up as 0D and 0A, respectively.
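If you don't have a hex editor at hand, a few lines of script can tell you which line endings a file contains. This is only a sketch; the function name `count_line_endings` is my own, and it simply counts the byte sequences described above.

```python
def count_line_endings(data: bytes) -> dict:
    """Count DOS (CR LF), Unix (bare LF) and stray CR line endings."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf   # line feeds not preceded by a CR
    cr = data.count(b"\r") - crlf   # carriage returns not followed by an LF
    return {"CRLF": crlf, "LF": lf, "CR": cr}


# Read the file in binary mode so the bytes arrive untouched, e.g.:
# with open("somefile.txt", "rb") as f:
#     print(count_line_endings(f.read()))
```

A file that reports a mix of CRLF and LF was probably edited on both kinds of system, or transferred in the wrong mode at some point.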

Here's a hex dump of a fragment of text, just to show you an example. You can see how each pair of hexadecimal (base-16) digits on the left corresponds to one ASCII character on the right; for example, hex 65 is lower case "e".

Code:
54 68 65 20 63 6f 6e 74 72 6f 6c 20 63 68 61 72  The control char
61 63 74 65 72 20 63 74 72 6c 2d 4a 20 69 73 20  acter ctrl-J is 
63 61 6c 6c 65 64 20 61 20 22 6c 69 6e 65 20 66  called a "line f
65 65 64 22 20 61 6e 64 20 69 73 20 75 73 65 64  eed" and is used
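A dump in this layout is easy to produce yourself. The following is a rough sketch rather than a replacement for a real hex editor; the `hexdump` function and its `width` parameter are my own invention, and showing non-printable bytes as a dot is just a common convention.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Format bytes as rows of hex pairs followed by their ASCII text."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Show printable ASCII as-is and everything else as a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        # Pad a short final row so the text column stays aligned.
        lines.append(hex_part.ljust(width * 3 - 1) + "  " + text_part)
    return "\n".join(lines)


print(hexdump(b"The control char"))
print(hexdump(b"one\r\ntwo\n"))
```

In the second dump, the DOS line ending shows up as the pair 0d 0a and the Unix one as a lone 0a.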
The convention of using hexadecimal (base 16) instead of the familiar decimal (base 10) is a convenience; it means that every possible byte value can be represented with exactly two digits, and important "computer" numbers (powers of two) are easy to spot. Character codes below 32 (hex 20 is the space character) are conventionally called "control characters"; this goes way back to the early formation of character sets in the 1950s and ASCII in the 1960s.
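You can check these properties for yourself with a short script. The `ctrl` helper below is hypothetical, and it relies on something not spelled out above: in ASCII, the code of a control character is the code of the corresponding upper-case letter minus 64 (hex 40).

```python
# Every byte value from 0 to 255 fits in exactly two hex digits.
assert all(len(f"{b:02x}") == 2 for b in range(256))

# Powers of two are round numbers in hex but not in decimal.
for n in (16, 32, 64, 128, 256):
    print(f"{n:>3} decimal = {n:>3x} hex")


def ctrl(letter: str) -> int:
    """Return the ASCII code of ctrl-<letter>."""
    return ord(letter.upper()) - 0x40


print(f"ctrl-J = {ctrl('J'):02x}, ctrl-M = {ctrl('M'):02x}")
```

The last line prints 0a for ctrl-J and 0d for ctrl-M, the same two values that mark line endings in a hex dump.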