Sponsored Content
Top Forums Shell Programming and Scripting Help with data processing, maybe awk Post 302443419 by kurumi on Sunday 8th of August 2010 10:38:42 PM
Old 08-08-2010
Code:
sed 's/,/|/5' file|sed 's/.*|//;s/,[ \t]*"Check with admin"//;s/,[ \t]*"".*//'

 

10 More Discussions You Might Find Interesting

1. UNIX for Advanced & Expert Users

data processing

hi i am having a file of following kind: 20015#67143645#143123#4214 62014#67143148#67143159#456 15432#67143568#00143862#4632 54112#67143752#0067143657#143 54623#67143357#167215#34531 65446#67143785#143598#7456 75642#67143546#156146#845 24464#67143465#172532#6544... (5 Replies)
Discussion started by: rochitsharma
5 Replies

2. UNIX for Dummies Questions & Answers

Data File Processing Help

I need to read contents of directory and create a list of data files that match a certain pattern and process by renaming it and calling a existing .ksh script then archiving off to file another directory. Any suggestions or samples u could point me to on using .ksh perl or other to process... (5 Replies)
Discussion started by: mavsman
5 Replies

3. Shell Programming and Scripting

How should i know that the process is still processing data

I have some process . How should i know that the process is still processing data or got hanged even though it is showing that it is running in background I know of a command called truss. how should i use this command and determine 1) process is still processing data 2) process got hanged... (7 Replies)
Discussion started by: ali560045
7 Replies

4. Shell Programming and Scripting

awk script processing data from 2 files

Hi! I have 2 files containing data that I need to process at the same time, I have problems in reading a different number of lines from the different files. Here is an explanation of what I need to do (possibly with an awk script). File "samples.txt" contains data in the format: time_instant... (6 Replies)
Discussion started by: Alice236
6 Replies

5. Programming

Data processing

Hello guys! I have some issue in how to processing some data. I have some files with 3 columns. The 1st column is a name of my sample. The 2nd column is a numerical sequence (very big sequence) starting from "1". And the 3rd column is a feature of each line, represented for a number (completely... (2 Replies)
Discussion started by: bfantinatti
2 Replies

6. UNIX for Dummies Questions & Answers

Genomic data processing

Dear fellow members, I've just joined the forum and am a newbie to shell scripting and programming. I'm stuck on the following problem. I'm working with large scale genomic data and need to do some analyses on it. Essentially it is text processing problem, so please don't mind the scientific... (0 Replies)
Discussion started by: mvaishnav
0 Replies

7. Shell Programming and Scripting

Data processing using awk

Hello, I have some bitrate data in a csv which is in an odd format and is difficult to process in Excel when I have thousands of rows. Therefore, I was thinking of doing this in bash and using awk as the primary application except that due to its complication, I'm a little stuck. ... (24 Replies)
Discussion started by: shadyuk
24 Replies

8. Programming

awk processing / Shell Script Processing to remove columns text file

Hello, I extracted a list of files in a directory with the command ls . However this is not my computer, so the ls functionality has been revamped so that it gives the filesizes in front like this : This is the output of ls command : I stored the output in a file filelist 1.1M... (5 Replies)
Discussion started by: ajayram
5 Replies

9. Shell Programming and Scripting

awk processing of variable number of fields data file

Hy! I need to post-process some data files which have variable (and periodic) number of fields. For example, I need to square (data -> data*data) the folowing data file: -5.34281E-28 -3.69822E-29 8.19128E-29 9.55444E-29 8.16494E-29 6.23125E-29 4.42106E-29 2.94592E-29 1.84841E-29 ... (5 Replies)
Discussion started by: radudownload
5 Replies

10. Shell Programming and Scripting

Data Processing

I have below Data *************************************************** ********************BEGINNING-1******************** directive url is : https://coursera-eu.mokar.com/directives/96df29ff-176a-35f7-8b1b-4ce483d15762 Src urls are :... (8 Replies)
Discussion started by: nikhil jain
8 Replies
bup-margin(1)						      General Commands Manual						     bup-margin(1)

NAME
bup-margin - figure out your deduplication safety margin SYNOPSIS
bup margin [options...] DESCRIPTION
bup margin iterates through all objects in your bup repository, calculating the largest number of prefix bits shared between any two entries. This number, n, identifies the longest subset of SHA-1 you could use and still encounter a collision between your object ids. For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45. That means a 46-bit hash would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by its first 46 bits. The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects. Since SHA-1 hashes have 160 bits, that leaves 115 bits of margin. Of course, because SHA-1 hashes are essentially random, it's theoretically possible to use many more bits with far fewer objects. If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see if you're getting dangerously close to 160 bits. OPTIONS
--predict Guess the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer from the guess. This is potentially useful for tuning an interpolation search algorithm. --ignore-midx don't use .midx files, use only .idx files. This is only really useful when used with --predict. EXAMPLE
$ bup margin Reading indexes: 100.00% (1612581/1612581), done. 40 40 matching prefix bits 1.94 bits per doubling 120 bits (61.86 doublings) remaining 4.19338e+18 times larger is possible Everyone on earth could have 625878182 data sets like yours, all in one repository, and we would expect 1 object collision. $ bup margin --predict PackIdxList: using 1 index. Reading indexes: 100.00% (1612581/1612581), done. 915 of 1612581 (0.057%) SEE ALSO
bup-midx(1), bup-save(1) BUP
Part of the bup(1) suite. AUTHORS
Avery Pennarun <apenwarr@gmail.com>. Bup unknown- bup-margin(1)
All times are GMT -4. The time now is 02:26 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy