Post 302892928 by user999991 on Sunday 16th of March 2014 04:50:13 AM
Want to remove a line feed depending on number of tabs in a line

Hi! I have been struggling with a large file that has stray end of line characters.

I am working on a Mac (Lion). I mention this only because I have been mucking around with fixing my problem using sed, and I have learned far more than I wanted to know about Unix and Mac eol characters.

I can easily count the number of tabs in each line:

awk '{print gsub(/\t/,"")}' infile > output.txt

BUT I want to process the file selectively. If a line has exactly 69 tabs, it is a legal line.

If it has fewer than 69, I want to remove the end of line character on that line and append the next line to the end of it.
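
One way this could be done directly in awk is to keep appending physical lines to a buffer until the buffer holds the expected number of tabs. The sketch below is only an illustration under assumptions: the function name `join_short_lines` and its argument are made up here, the stray line feed is dropped with nothing put in its place, and the input is assumed to use plain LF line endings. It cannot catch a stray line feed inside the final field, because by then the record already has all of its tabs.

```shell
# Hypothetical helper: join physical lines until a record has enough tabs.
# $1 = expected number of tabs per complete record (69 in this post).
join_short_lines() {
  awk -v want="$1" '
    {
      # Append this physical line to the pending record (no separator added).
      buf = buf $0
      # Count tabs on a copy so that buf itself is left untouched.
      tmp = buf
      if (gsub(/\t/, "", tmp) >= want) {
        print buf
        buf = ""
      }
    }
    END {
      # Emit an incomplete trailing record rather than silently dropping it.
      if (buf != "") print buf
    }
  '
}
```

Usage would be something like `join_short_lines 69 < infile > fixed.txt`, followed by the tab-count check above to confirm that every output line has 69 tabs.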

===
At the risk of looking stupid, but perhaps explaining the problem a bit more, I was reading this forum and was able to almost fix the file. In almost every record, a "good" line has a tab preceding the eol character. By brute force, the script below almost solves my problem:

1) changes all Unix eol to Mac eol
2) uses sed to change "tab+eol" to a string
3) uses sed to change remaining "eol" to a different string
4) reverses step 2
5) reverses step 1

I was pleased that I figured this out, but the awk command at the end made me realize that there were in fact a very small number of lines (a few hundred in a million-line file) that did not fit the pattern; they were "good" lines with data in every field through the final one, so they did not end in a tab. This means I am back to square one, sort of. If I could figure out the question I posed at the top, I could skip this brute force method. If I am stuck with the script below, I can still fix the remaining stray lines manually.

Code:
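# 1) delete every LF; this leaves CR as the only eol if the file has CR+LF endings
LC_CTYPE=C tr -d "\n" < test.txt > test2.txt
# 2) protect "tab+eol" (the end of a good line) with a placeholder string
gsed -e 's/\t\r/#####ABCDE/g' test2.txt > test3.txt
# 3) mark every remaining (stray) eol with a different string
gsed -e 's/\r/ ABCDE##### /g' test3.txt > test4.txt
# 4) reverse step 2
gsed -e 's/#####ABCDE/\t\r/g' test4.txt > test5.txt
# 5) turn the Mac eol back into Unix eol
LC_CTYPE=C tr "\r" "\n" <test5.txt > test6.txt
# count the tabs on each line to check the result
awk '{print gsub(/\t/,"")}' test6.txt > test6tabs.txt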
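
To see only the lines that still need attention, instead of a tab count for every line, the final awk step could be narrowed down. A small sketch, where `report_bad_lines` is a made-up name and the expected tab count is passed as an argument:

```shell
# Hypothetical helper: print "line number: tab count" for every line
# whose tab count differs from the expected value in $1.
report_bad_lines() {
  awk -v want="$1" '
    {
      # Replacing each tab with a tab leaves the line unchanged;
      # gsub returns how many replacements were made.
      n = gsub(/\t/, "\t")
      if (n != want) print NR ": " n
    }
  '
}
```

For example, `report_bad_lines 69 < test6.txt` would list just the few hundred lines that still do not have 69 tabs.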

 
