Should I say "field 8" or "column 8" in this case?


 
# 8  
Old 03-23-2013
I got your point before, and was agreeing. But I don't get your point this time.

I never suggested or thought cut only works with printing characters.

I don't agree that tab and backspace "seldom take up one column". The discussion is about fields and columns in files. I think tab and BS always take up one column (single byte characters). If you are referring to how the file displays on a monitor or printer, I think that's just an artifact. I don't think it's relevant to the issue that tab might be displayed ^I or eight spaces, or 'A' displayed in binary or hex by od, etc.
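To make the file-versus-display distinction concrete, here is a minimal Python sketch. The sample string and the tab stop of 8 are assumptions chosen for illustration; nothing here comes from cut itself.

```python
# A made-up line containing a tab and a backspace (illustrative only).
line = "a\tb\bc"

# In the file, tab and backspace are each one character and, in UTF-8
# (as in ASCII), one byte.
n_chars = len(line)
n_bytes = len(line.encode("utf-8"))

# How a terminal or printer renders the tab is a separate matter.
# str.expandtabs() mimics a display with tab stops every 8 columns.
rendered = line.expandtabs(8)

print(n_chars, n_bytes, len(rendered))
```

The stored line has the same number of characters and bytes, while the expanded form is longer only because of how the tab is rendered.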
# 9  
Old 03-23-2013
Quote:
Originally Posted by hanson44
I got your point before, and was agreeing. But I don't get your point this time.

I never suggested or thought cut only works with printing characters.

I don't agree that tab and backspace "seldom take up one column". The discussion is about fields and columns in files. I think tab and BS always take up one column (single byte characters). If you are referring to how the file displays on a monitor or printer, I think that's just an artifact. I don't think it's relevant to the issue that tab might be displayed ^I or eight spaces, or 'A' displayed in binary or hex by od, etc.
The POSIX and UNIX Standards have definitions for the terms byte, character, and column position that are very different from what you described above. The standard definitions are:
  1. Byte:
    An individually addressable unit of data storage that is exactly an octet, used to store a character or a portion of a character; see also Section 3.87 (on page 47). A byte is composed of a contiguous sequence of 8 bits. The least significant bit is called the "low-order" bit; the most significant is called the "high-order" bit.

    Note: The definition of byte from the ISO C standard is broader than the above and might accommodate hardware architectures with different sized addressable units than octets.
  2. Character:
    A sequence of one or more bytes representing a single graphic symbol or control code.

    Note: This term corresponds to the ISO C standard term multi-byte character, where a single-byte character is a special case of a multi-byte character. Unlike the usage in the ISO C standard, character here has no necessary relationship with storage space, and byte is used when storage space is discussed.

    See the definition of the portable character set in Section 6.1 (on page 125) for a further explanation of the graphical representations of (abstract) characters, as opposed to character encodings.
  3. Column Position:
    A unit of horizontal measure related to characters in a line.

    It is assumed that each character in a character set has an intrinsic column width independent of any output device. Each printable character in the portable character set has a column width of one. The standard utilities, when used as described in POSIX.1-2008, assume that all characters have integral column widths. The column width of a character is not necessarily related to the internal representation of the character (numbers of bits or bytes).

    The column position of a character in a line is defined as one plus the sum of the column widths of the preceding characters in the line. Column positions are numbered starting from 1.
So your description of a column being equivalent to a byte just doesn't compute with what I believe that term means.
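As a rough illustration of the column position definition quoted above, here is a minimal Python sketch. The helper names `column_width` and `column_positions` are hypothetical, and the width rule (2 for East Asian wide or fullwidth characters, 0 for combining marks, 1 otherwise) is an assumption standing in for a real `wcwidth()`; it is not taken from the standard and it does not handle control characters.

```python
import unicodedata


def column_width(ch: str) -> int:
    """Approximate column width of a printable character (assumed rule, not wcwidth())."""
    if unicodedata.combining(ch):
        return 0  # combining marks occupy no column of their own
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2  # wide and fullwidth characters occupy two columns
    return 1


def column_positions(line: str) -> list[int]:
    """Column position of each character: one plus the widths of all preceding characters."""
    positions = []
    column = 1  # column positions are numbered starting from 1
    for ch in line:
        positions.append(column)
        column += column_width(ch)
    return positions


sample = "a\u65e5b"  # 'a', the CJK character U+65E5, 'b'
print(column_positions(sample))     # one column position per character
print(len(sample))                  # number of characters
print(len(sample.encode("utf-8")))  # number of bytes in UTF-8
```

Under these assumptions the sample has three characters, five bytes in UTF-8, and column positions 1, 2 and 4, so the three counts need not agree.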
# 10  
Old 03-24-2013
To correct the record, I didn't say "a column is equivalent to a byte". And that's quite an assertion you so boldly made, that the standard definitions of byte and character are "very different" from how I used those terms. I've been using gettext, i18n, Unicode and UTF-8 for many years, and understand bytes and multi-byte chars very well. For the record, so I'm not deemed some kind of radical, I 100% agree with the standard definitions of byte and character you included.

My original post was about the word "column", in the context of a delimited file. We've gotten far off that topic. I previously agreed with you that I failed to think about locales when discussing why the cut option says "--characters" and not "--columns". I was trying to be nice. Could we move on?
# 11  
Old 03-24-2013
I apologize. I misinterpreted your statement:
Quote:
I think tab and BS always take up one column (single byte characters).
to mean that you were equating bytes to columns.

Sometimes e-mail and forum discussions lead to confusion that would never occur in a face-to-face discussion, where a clarification would happen immediately instead of being exacerbated by the delays between posts in a forum like this.