Wouldn't 8.000.000 kB (8 * 10^6 * 10^3) be 8 GB? And thus (sort of) manageable? How come we're talking terabytes?
Now that you mention it, I think you are right. I just read Don's "8TB" and didn't recalculate it myself. My bad.
I just counted one record of the posted sample to have 260 characters. Since 15-25 million records were mentioned: 15 * 10^6 * 260 ~ 3.9 GB, 25 * 10^6 * 260 ~ 6.5 GB. This should indeed be feasible to sort in memory.
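As a rough check of the estimate above, the arithmetic can be sketched in shell. The 260-byte record length is the one counted from the posted sample; treating 1 GB as 10^9 bytes and the helper name estimate_gb are assumptions made only for this illustration.

```shell
#!/bin/sh
# Rough size estimate: number of records * bytes per record, in GB (10^9 bytes).
# estimate_gb is a hypothetical helper name used only for this sketch.
estimate_gb() {
    # $1 = number of records, $2 = average record length in bytes
    LC_ALL=C awk -v n="$1" -v len="$2" 'BEGIN { printf "%.1f\n", n * len / 1e9 }'
}

low=$(estimate_gb 15000000 260)
high=$(estimate_gb 25000000 260)

echo "15 million records: about $low GB"
echo "25 million records: about $high GB"
```

This prints 3.9 and 6.5 for the two bounds. Per-record overhead inside awk or sort is not counted here, so the real memory footprint would be somewhat larger.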
I'm sorry for all of the confusion. I had originally intended to type 8GB, but hit the T instead of the G key. Then, while I was reviewing it, I decided to spell it out and converted the 8TB to 8 terabytes, compounding the error instead of correcting it.
With the BSD-based awk on macOS, I don't have the asorti() function, and only the 1st character of values assigned to RS matters. So, the following is completely untested, but if I understand the GNU awk page correctly, I think the pipeline:
should be replaceable by the following single invocation of awk:
as long as there are no duplicates in the 7th colon-separated field in any of the records in your input file. (If there are duplicates, I think all but the last record in each set of duplicates will be missing from the output produced by the above script.)
I would appreciate it if someone with access to GNU awk could try this out with the sample data in post #1 in this thread and let me know if I came close to getting it right.
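The script itself is not reproduced above, so the following is only a sketch of the duplicate-key caveat, assuming the approach stores each record in an array indexed by the 7th colon-separated field. The three sample records are made up for illustration, and the code uses only POSIX awk features (no asorti()), so it should also run with a BSD awk.

```shell
#!/bin/sh
# Made-up records: the 1st and 3rd share the same 7th field (k1).
# Iteration order of "for (k in rec)" is unspecified, so the output is piped through sort.
result=$(
    printf '%s\n' \
        'a:b:c:d:e:f:k1:first' \
        'a:b:c:d:e:f:k2:second' \
        'a:b:c:d:e:f:k1:third' |
    awk -F: '
        # A later record with the same key overwrites the earlier one.
        { rec[$7] = $0 }
        END { for (k in rec) print rec[k] }
    ' | sort
)
printf '%s\n' "$result"
```

Only two lines come out: the k1 record ending in "third" and the k2 record ending in "second". The record ending in "first" is lost, which is the behaviour described for duplicated keys.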
Hello,
I searched for a while and found some "line-to-column" scripts. My case is similar, but with multiple fields in each row:
S02 Length Per
S02 7043 3.864
S02 54477 29.89
S02 104841 57.52
S03 Length Per
S03 1150 0.835
S03 1321 0.96
S03 ... (9 Replies)
input:
ref001, Europe, Belgium, 1001
ref001, Europe, Spain, 203
ref001, Europe, Germany, 457
ref002, America, Canada, 234
ref002, America, US, 87
ref002, America, Alaska, 652
Without using an END section, I need to write all the info related to the same ref number ($1) and continent ($2) on... (9 Replies)
I have searched in a variety of ways in a variety of places but have come up empty.
I would like to prepend a portion of a section header to each following line until the next section header. I have been using sed for most things up until now but I'd go for a solution in just about anything--... (7 Replies)
Hello,
I have a file like this:
FILE.TXT:
(define argc :: int)
(assert ( > argc 1))
(assert ( = argc 1))
<check>
#
(define c :: float)
(assert ( > c 0))
(assert ( = c 0))
<check>
#
Now, I want to separate each block ('#' is the delimiter), make them separate files, and then send them as... (5 Replies)
Hi Guys...
I am using the following code in my script:
SID_L=`cat /var/opt/oracle/oratab|grep -v "^#"|cut -f1 -d: -s`
SID_VAR=$SID_L
for SID_RUN in $SID_VAR
do
ORACLE_HOME=`grep ^$SID_RUN /var/opt/oracle/oratab | \
awk -F: '{print $2}'` ;export ORACLE_HOME
export... (2 Replies)
I have a list of Servers in no particular order as follows:
virtualMachines="IIBSBS IIBVICDMS01 IIBVICMA01"
And I am generating some output from a pre-existing script that gives me the following (this is a sample output selection).
9/17/2010 8:00:05 PM: Normal backup using VDRBACKUPS... (2 Replies)
I'm Unix. I'm looking at "df" on Unix now and below is an example. It lists the filesystems in 512-byte blocks, but I need this in 4k blocks. Is there a way to do this in Unix, or do I have to convert manually, and how?
So for container 1 there is 7,340,032 in size in 512-blocks. What would the 4k block be... (2 Replies)
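For the arithmetic in the question above: a 4k block is 4096 bytes, i.e. eight 512-byte blocks, so dividing the 512-block count by 8 gives the 4k-block count. A minimal sketch, using the 7,340,032 figure quoted for container 1 (the variable names are made up for illustration):

```shell
#!/bin/sh
# Convert a size given in 512-byte blocks to 4k (4096-byte) blocks.
blocks_512=7340032                    # size of container 1 in 512-byte blocks
per_4k=$(( 4096 / 512 ))              # eight 512-byte blocks per 4k block
blocks_4k=$(( blocks_512 / per_4k ))  # integer division; rounds down if not a multiple of 8

echo "$blocks_4k"
```

For this figure the result is 917504, with no remainder.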
Hi
I have already gone through this topic on this forum, but I am still getting the same problem.
I am using Solaris 10. My login shell is /usr/bash
I have a script as below
/home/gyan> cat 3.cm
#!/usr/bin/ksh
export PROG_NAME=rpaa001
If I run this script as below, it works fine... (3 Replies)
Hello all,
Below is what I am trying to accomplish:
I have a file that looks like this
/* ----------------- xxxx.y_abcd_00000050 ----------------- */
jdghjghkla
sadgsdags
asdgsdgasd
asdgsagasdg
/* ----------------- xxxx.y_abcd_00000055 ----------------- */
sdgsdg
sdgxcvzxcbv... (8 Replies)
Hi all
My text file looks like this:
start doc
... (certain number of records)
REC3|Emma|info|
REC3|Lukas|info|
REC3|Arthur|info|
... (certain number of records)
end doc
start doc
... (certain number of records)... (4 Replies)