Full Discussion: Huge files manipulation
Post 302255554 by chatwizrd on Thursday 6th of November 2008 04:41:33 PM (UNIX for Advanced & Expert Users)
Hmm, can't you do:

Code:
# Append each distinct line of file1 to file2, keeping only its first occurrence
awk '!L[$0]++' file1 >> file2
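
For anyone unsure why that one-liner removes duplicates, here is a minimal sketch on a tiny input. The function name dedupe_keep_order and the array name seen are made up for illustration; they are not from the original post. The expression seen[$0]++ evaluates to 0 the first time a line shows up, so !seen[$0]++ is true only for the first occurrence and awk's default action prints the line.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption, not from the thread):
# print each distinct input line once, in first-seen order.
dedupe_keep_order() {
    # seen[$0] starts out empty (numerically 0); the post-increment returns
    # the old value, so the negation is true only on the first occurrence.
    # With no explicit action, awk prints the current line when it is true.
    awk '!seen[$0]++' "$@"
}

# Small demonstration: duplicates of "b" and "a" are dropped,
# and the surviving lines keep their original order.
printf 'b\na\nb\nc\na\n' | dedupe_keep_order
```

One caveat for really huge files: awk keeps every distinct line in the array, so memory use grows with the number of unique lines. If the order of the output does not matter, sort -u is worth trying instead, since it can spill to temporary files on disk.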

 
