Top Forums > Shell Programming and Scripting > sed and awk not working on a large record file
Post 302762961 by Gurkamal83 on Tuesday, 29 January 2013, 09:54:42 AM
This command worked, but with one issue:

Code:
||13455;2013-01-04 15:06:49|abc@ser01:/tmp/WrkSpc/Archive

||13455;2013-01-04 15:06:49
abc@ser01:/tmp/WrkSpc/Archive

wc -l on the original file returns
Code:
0 origfile

wc -l on the output file returns
Code:
1 newfile

My job won't pick up this new file if it ends with a line feed. How can I remove the trailing line feed after running this command?
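
For reference, here is a sketch of two ways to strip that single trailing line feed. Neither command comes from this thread: the Perl idiom only chomps the last line, and truncate is GNU-specific and simply drops the final byte, so it assumes the file really ends in exactly one newline.

Code:
# Option 1 (Perl, in place): remove the newline from the last line only
perl -i -pe 'chomp if eof' newfile

# Option 2 (GNU coreutils): shrink the file by one byte -- the trailing \n
truncate -s -1 newfile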

 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Sed working on lines of small length and not large length

Hi, I have a peculiar case where my sed command works on a file which contains short lines. sed "s/XYZ:1/XYZ:3/g" abc.txt > xyz.txt When abc.txt contains short lines (currently around 80 chars), this sed command works fine. When abc.txt contains lines of... (3 Replies)
Discussion started by: thanuman
3 Replies
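
A note on that first thread: some older sed implementations impose a hard limit on line length, so a substitution that works on 80-character lines can fail on very long ones. Here is a sketch of an awk equivalent of the same substitution (using the filenames from the excerpt), which avoids that limit:

Code:
# gsub() performs the same global replacement as sed 's/XYZ:1/XYZ:3/g'
awk '{gsub(/XYZ:1/, "XYZ:3")} 1' abc.txt > xyz.txt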

2. UNIX for Advanced & Expert Users

fatal: record too large

While running a script to process a file, it terminated. I was able to find from the logs that it ended due to the error: fatal: record too large 20480. I have tried a lot to fix the issue, but in vain. Can anyone help me find the reason for it? If you provide me a solution, I... (1 Reply)
Discussion started by: sakthifire
1 Replies
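
An error of that form usually means a single input record exceeds a fixed buffer size in the tool reading it. Purely as a diagnostic sketch (the filename is a placeholder), an awk without that limit, such as gawk, can locate the offending records:

Code:
# list records longer than the 20480-byte limit mentioned in the error
gawk 'length($0) > 20480 {print FILENAME ": record " NR ": " length($0) " bytes"}' datafile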

3. Shell Programming and Scripting

Sed or awk script to remove text / or perform calculations from large CSV files

I have large CSV files (e.g. 2 million records) and am hoping to do one of two things. I have been trying to use awk and sed but am a newbie and can't figure out how to get them to work. Any help you could offer would be greatly appreciated - I'm stuck trying to remove the colon and wildcards in... (6 Replies)
Discussion started by: metronomadic
6 Replies
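
That excerpt cuts off before saying where the colons and wildcards sit, so purely as an illustration (the character set and filenames are assumptions), gsub() can strip them from every line of a large CSV in one pass:

Code:
# delete every ':' and '*' character from each line
awk '{gsub(/[:*]/, "")} 1' input.csv > output.csv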

4. Shell Programming and Scripting

Updating a line in a large csv file, with sed/awk?

I have an extremely large CSV file in which I need to search the second field and, upon a match, update the last field... I can pull the line with awk, but apparently you can't use awk to directly update the file? So I'm curious if I can use sed to do this... The good news is the field I want to... (5 Replies)
Discussion started by: trey85stang
5 Replies
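
For the CSV-update question above: awk cannot edit a file in place, but the usual pattern is to write to a temporary file and move it back over the original. A sketch (the separator, the matched value KEY and the replacement NEWVAL are assumptions, not from the thread):

Code:
# update the last field of every line whose second field equals KEY
awk 'BEGIN{FS=OFS=","} $2=="KEY"{$NF="NEWVAL"} 1' big.csv > big.csv.tmp && mv big.csv.tmp big.csv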

5. Shell Programming and Scripting

how to delete \n in a large file with sed

Hello, I have a big file with a specific format whose delimiter is "§": §field1§$field2§$field3§$field4§$field5§$field6§$field§ In this file we have fields which are very long (more than 20000 chars!), so I can't manipulate them in vi. Despite this I managed to suppress lines that... (11 Replies)
Discussion started by: ade05fr
11 Replies
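
If the goal in that thread is literally to delete every newline, tr is a simple alternative that copes with arbitrarily long lines where some sed builds will not (a sketch; the filenames are placeholders):

Code:
# remove all line feeds, producing one long line
tr -d '\n' < bigfile > bigfile.oneline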

6. Shell Programming and Scripting

How to delete 1 record in large file!

Hi All, I'm a newbie here, and I'm just wondering how to delete a single record in a large file in UNIX. For example, file1.txt has 1000 records: nikki1 nikki2 nikki3 What I want to do is delete the nikki2 record in file1.txt. Is it possible? Please advise. Thanks, (3 Replies)
Discussion started by: nikki1200
3 Replies
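
For deleting a single record such as nikki2 from file1.txt, a sketch of the usual approach: match the whole line (so nikki2 does not also match, say, nikki20) and write to a temporary file before replacing the original.

Code:
# keep every line except the exact record "nikki2"
grep -v -x 'nikki2' file1.txt > file1.txt.tmp && mv file1.txt.tmp file1.txt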

7. Shell Programming and Scripting

SED/AWK to edit/add field values in a record

Hi Experts, I am new to shell scripting and need some help with a task given by the customer. The sample record in a file is as follows: 3538,,,,,,ID,ID1,,,,,,,,,,, It needs to become the following: 3538,,353800,353800,,,ID,ID1,,,,,COLX,,,,,COLY, And I want to modify this record in... (3 Replies)
Discussion started by: sugarcane
3 Replies
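
For the record-editing request just above, a sketch in awk based on the sample record shown. It assumes fields 3 and 4 become the first field with 00 appended, and that COLX and COLY go into fields 13 and 18, which is what the before/after example implies:

Code:
# 3538,,,,,,ID,ID1,,,,,,,,,,,  ->  3538,,353800,353800,,,ID,ID1,,,,,COLX,,,,,COLY,
awk 'BEGIN{FS=OFS=","} {$3=$1"00"; $4=$1"00"; $13="COLX"; $18="COLY"} 1' infile > outfile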

8. Shell Programming and Scripting

Split a large file in n records and skip a particular record

Hello All, I have a large file, more than 50,000 lines, and I want to split it into even 5,000-record chunks, which I can do using sed '1d;$d;' <filename> | awk 'NR%5000==1{x="F"++i;}{print > x}' Now I need to add one more condition: do not break the file at the 5000th record if the 5000th record... (20 Replies)
Discussion started by: ibmtech
20 Replies
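
The extra condition in that excerpt is cut off, so purely as an illustration, here is a sketch that starts a new chunk only once 5,000 lines have been written and the first field changes; the $1 != prev test merely stands in for whatever the real "don't break here" rule was:

Code:
# write chunks F1, F2, ...; open a new one only past 5000 lines AND at a key change
awk 'NR==1 || (count>=5000 && $1!=prev) { if (out!="") close(out); out="F"++i; count=0 }
     { print > out; count++; prev=$1 }' infile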

9. UNIX for Beginners Questions & Answers

sed awk: split a large file to unique file names

Dear Users, I would appreciate your help with splitting a large file (> 1 million lines) with sed or awk. Below is the text in the input file file.txt: scaffold1 928 929 C/T + scaffold1 942 943 G/C + scaffold1 959 960 C/T +... (6 Replies)
Discussion started by: kapr0001
6 Replies
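
Assuming the split in that question is by the scaffold name in the first column (an assumption; the excerpt is truncated), awk can route each line to a file named after that column:

Code:
# write each line to <scaffold>.txt, e.g. scaffold1.txt
# (with many distinct scaffolds you may need gawk, or close() files as you go)
awk '{print > ($1 ".txt")}' file.txt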

10. UNIX for Advanced & Expert Users

How to split large file with different record delimiter?

Hi, I have received a file which is 20 GB. We would like to split the file into 4 equal parts and process them to avoid memory issues. If the record delimiter were a Unix newline, I could use the split command with either the -l or -b option. The problem is that the line terminator is |##|. How to use... (5 Replies)
Discussion started by: Ravi.K
5 Replies
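
For the |##| record terminator above, GNU awk accepts a regular expression as the record separator, so the file can be read record by record rather than loaded whole. A sketch (the records-per-part figure is an assumption, and a possible empty final record after the last |##| is not handled):

Code:
# gawk only: RS may be a regex; keep the |##| terminator on output
gawk 'BEGIN{RS="\\|##\\|"; ORS="|##|"; per=5000000}
      {part = int((NR-1)/per) + 1; print > ("part" part)}' bigfile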