UNIX for Dummies Questions & Answers
#1
How to identify broken lines in a file?
Hi,
I have a fixed-width file in which every record is 100 bytes long. Three of the rows are broken and spill over onto the next line. How can I identify the broken lines? E.g.
ABCD1234MNRD4321
abcd1234mnrd
4321
As you can see in my example, the second row (the one in lowercase letters) is broken and continues on the next line. How can I identify such broken lines?
#2
If it is a "fixed width file", as you said, then you can identify the broken lines by their deviating number of characters. The regexp
Code:
.
matches exactly one character. So you just have to repeat it as many times as the line is expected to be wide to find all non-broken lines:
Code:
.\{n\}
where "n" is the number of characters (100 in your case). Now you just reverse the search, e.g. by using "grep -v":
Code:
grep -v '^.\{n\}$' /path/to/inputfile
But you probably want to correct this circumstance, so "grep" is not the right tool - but "sed" is, and it works with the same syntax:
Code:
sed -n ': start
s/\n//
/^.\{n\}$/! {
$! {
N
b start
}
}
p' /path/to/inputfile
First, any line feed in the pattern space is deleted. Then every line NOT consisting of n characters (the "!") - that is: the broken ones - will cause the next line to be read in and appended to the line before, unless it is already the last line of the file (the "$!"). Then control branches to the beginning of the script again. If the line is still too short, the next line will be read in as well, etc., until the correct line length is reached. Then the line is printed in the last statement; the "-n" option switches off sed's automatic printing, so every line is printed only once.
I hope this helps.
bakunin
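To make the two steps above concrete, here is a minimal sketch run against the example from the question. The sample file, the variable names and the width of 16 are assumptions chosen for illustration (the real file uses 100), and the sed script is written with double quotes only so that the width can be passed in as a shell variable.

```shell
# Hypothetical sample data: 16-character records, the second one broken in two.
tmpfile=$(mktemp)
printf '%s\n' 'ABCD1234MNRD4321' 'abcd1234mnrd' '4321' 'WXYZ5678QRST8765' > "$tmpfile"

width=16   # assumed record width for this sample; the file in the question uses 100

# Step 1: list the broken fragments, prefixed with their line numbers (-n).
broken=$(grep -n -v "^.\{$width\}\$" "$tmpfile")
printf '%s\n' "$broken"

# Step 2: keep appending the next line until the record is $width characters long.
fixed=$(sed -n ":start
s/\n//
/^.\{$width\}\$/!{
\$!{
N
b start
}
}
p" "$tmpfile")
printf '%s\n' "$fixed"

rm -f "$tmpfile"
```

Note that this approach assumes a broken record never ends up longer than the expected width once its pieces are joined; if it does, sed keeps appending lines until the end of the file.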
#3
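This should list the lines that are shorter than 100 characters:
Code:
# IFS= and -r keep leading blanks and backslashes in each line intact
while IFS= read -r line
do
    # ${#line} expands to the number of characters in $line
    if [ "${#line}" -lt 100 ]; then
        printf '%s\n' "$line"
    fi
done < test.txt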
#4
Display all lines with fewer than 100 characters, prefixed with the line number.
Code:
awk 'length($0)<100{print NR,$0}' foo.bar
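A small variation on the awk one-liner, sketched here under assumptions: the width is passed in with -v, the comparison is "not equal" so that over-long lines are reported too, and the actual length is printed next to the line number. The sample file and the width of 16 are hypothetical stand-ins for the real 100-byte file.

```shell
# Hypothetical sample: 16-character records, the second one broken in two.
sample=$(mktemp)
printf '%s\n' 'ABCD1234MNRD4321' 'abcd1234mnrd' '4321' > "$sample"

# Report line number, actual length and content of every line whose
# length differs from the expected width (assumed to be 16 here).
report=$(awk -v width=16 'length($0) != width { print NR ": " length($0) " chars: " $0 }' "$sample")
printf '%s\n' "$report"

rm -f "$sample"
```

Seeing the lengths side by side makes it easier to spot which fragments belong together: here 12 + 4 adds up to one full record of 16.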
#5
One more:
Code:
$ sed '/^.\{100\}/d' infile
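Because the pattern in this sed command is not anchored with "$", it deletes every line that has at least 100 characters, so only the shorter lines remain. The sketch below contrasts that with an anchored variant that also prints over-long lines; the three-line sample and the width of 4 are assumptions made up for illustration.

```shell
# Hypothetical sample with an expected width of 4:
# one good line, one too short, one too long.
demo=$(mktemp)
printf '%s\n' 'abcd' 'ab' 'abcdef' > "$demo"

# Unanchored: deletes every line of 4 or more characters, leaving only short ones.
short_only=$(sed '/^.\{4\}/d' "$demo")
printf '%s\n' "$short_only"

# Anchored with $: prints every line that is not exactly 4 characters long.
not_exact=$(sed -n '/^.\{4\}$/!p' "$demo")
printf '%s\n' "$not_exact"

rm -f "$demo"
```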
#6
Thank you.