Shell Programming and Scripting: Split a file using 2-D indexing system
Post 302775461 by Don Cragun on Tuesday 5th of March 2013 12:38:40 AM
If the awk on your system only supports single-character values for RS, or if you'd like to base the output filenames on the input filenames, process more than one input file, and choose how many files are produced before the first numeric value in the output filename is incremented, you could try the following script:
Code:
#!/bin/ksh
cnt=3
Usage="Usage: $(basename "$0") [-n cnt] file..."
# Split input file(s) into files named file.X.Y where X and Y are both
# reset to 1 for each file operand.  A new file is created
# when a line in an input file starts with a <greater-than> character
# (">").  Lines starting with a <greater-than> character are not
# included in any of the output files, but all other lines are copied
# unchanged into the corresponding output file.  When a new file is
# created, Y is incremented until it exceeds cnt (which defaults to 3 if
# the -n option is not given on the command line).  When Y exceeds cnt, X
# is incremented and Y is reset to 1.
while getopts n: opt
do      case $opt in
        (n)     cnt="$OPTARG";;
        (?)     echo "$Usage" >&2
                exit 1
        esac
done
shift $(($OPTIND - 1))
if [ $# -lt 1 ]
then    echo "$(basename "$0"): At least one file operand is required." >&2
        echo "$Usage" >&2
        exit 2
fi
awk -v cnt="$cnt" '
FNR == 1 {
        # This is the first record of a new input file.
        # If this is not the first input file, close the last output file for
        # the previous input file.
        if(NR != FNR) close(fn)
        # Create output filename based on input filename.
        x = y = 1
        fn = FILENAME "." x "." y
}
/^>/ {  # Close current output file
        close(fn)
        if(y == cnt) {
                y = 1
                x++
        } else  y++
        fn = FILENAME "." x "." y
        next
}
{       print > fn
}' "$@"

It uses the Korn shell, but should also work with any other shell that supports the command substitution and arithmetic expansion forms specified by the POSIX standards (including bash).
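
To see the X.Y numbering in action, here is a minimal sketch that runs the same awk logic as the script above on a small sample. The file name demo.txt, the temporary directory, and the choice of cnt=2 are assumptions made only for this illustration.

```shell
#!/bin/sh
# Hypothetical demo: demo.txt and the temporary directory are not part of
# the original script; they exist only to show the output file names.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1

# Five one-line records separated by lines starting with ">".
printf '%s\n' 'a1' '>' 'b1' '>' 'c1' '>' 'd1' '>' 'e1' > demo.txt

# Same awk logic as the script above, with cnt fixed at 2.
awk -v cnt=2 '
FNR == 1 {
        if(NR != FNR) close(fn)
        x = y = 1
        fn = FILENAME "." x "." y
}
/^>/ {  close(fn)
        if(y == cnt) {
                y = 1
                x++
        } else  y++
        fn = FILENAME "." x "." y
        next
}
{       print > fn
}' demo.txt

# List the output files that were created.
ls demo.txt.*
```

With cnt set to 2, this should list demo.txt.1.1, demo.txt.1.2, demo.txt.2.1, demo.txt.2.2 and demo.txt.3.1: Y runs from 1 to 2, then X is incremented and Y starts over at 1.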

Note that if the first line of an input file starts with a >, or if two or more adjacent lines in an input file start with a >, no empty files are created; the corresponding filenames are simply skipped.
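
The skipped filenames can be seen with a small test. This is only a sketch under assumed names (skip.txt and a temporary directory are made up for the demonstration); the awk program is the same logic as in the script above with the default cnt of 3.

```shell
#!/bin/sh
# Hypothetical demo: skip.txt and the temporary directory are assumptions.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1

# The first line starts with ">", and lines 3 and 4 are adjacent ">" lines.
printf '%s\n' '>' 'a1' '>' '>' 'b1' > skip.txt

# Same awk logic as the script above, with cnt at its default of 3.
awk -v cnt=3 '
FNR == 1 {
        if(NR != FNR) close(fn)
        x = y = 1
        fn = FILENAME "." x "." y
}
/^>/ {  close(fn)
        if(y == cnt) {
                y = 1
                x++
        } else  y++
        fn = FILENAME "." x "." y
        next
}
{       print > fn
}' skip.txt

# Only names that actually received a line exist.
ls skip.txt.*
```

Here only skip.txt.1.2 and skip.txt.2.1 should be listed. The names skip.txt.1.1 and skip.txt.1.3 are skipped because no line was ever printed to them, and awk only creates an output file when the first print to it happens.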
 

Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.