I have a list, file1, like this:
I want to classify the elements of file1 based on constraints applied to file2. Additionally, the number of elements (words) in each line of file2 is not fixed. This is my file2:
Explanation of the position constraints:
- To classify the first line of file1, "dog", it has to be checked against every line of file2 in turn: is it the first or last element of line 1, then of line 2, and so on to the end of the file. If "dog" is in the first or last position every time it occurs, it goes to a new file called file_external. If not, it goes to a new file called file_internal. In this case dog also occurs at line 3, position 4, so it goes to file_internal.
- The second element in file1 is cow. In file2, cow is in the first or last position every time it occurs, so cow is appended to file_external.
- fox is always in a middle position, so it goes to file_internal.
- cat occurs in middle, last and first positions, so it is appended to file_internal.
- fish and duck go to file_external, as they occur strictly as the first or last element.
- crow will fly into file_internal.
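The rule in the list above can be sketched with a single awk pass over both files. This is only a sketch under stated assumptions: the sample file1/file2 below are invented to match the description (they are not the real data), the variable names extfile and intfile are made up for the example, and words from file1 that never occur in file2 are simply skipped.

```shell
#!/bin/sh
# Sketch only: the sample data below is invented to match the description,
# it is NOT the real file1/file2 from the post.
workdir=$(mktemp -d)

printf '%s\n' dog cow fox cat fish duck crow > "$workdir/file1"

cat > "$workdir/file2" <<'EOF'
dog fox cat cow
cow cat fox dog
fish fox crow dog cat duck
cat fox crow fish
duck crow fox cow
duck fox cat
EOF

awk -v extfile="$workdir/file_external" -v intfile="$workdir/file_internal" '
# First file (file1): remember the words and their original order.
NR == FNR { if (NF) { order[++n] = $1; wanted[$1] = 1 }; next }

# Second file (file2): note every listed word, and flag it as "middle"
# if it ever appears anywhere other than field 1 or field NF.
{
    for (i = 1; i <= NF; i++) {
        if (!($i in wanted)) continue
        seen[$i] = 1
        if (i != 1 && i != NF) middle[$i] = 1
    }
}

END {
    for (k = 1; k <= n; k++) {
        w = order[k]
        if (!(w in seen)) continue   # assumption: words absent from file2 are skipped
        if (w in middle) print w > intfile
        else             print w > extfile
    }
}' "$workdir/file1" "$workdir/file2"

echo "external:"; cat "$workdir/file_external"
echo "internal:"; cat "$workdir/file_internal"
```

Because file2 is read only once regardless of how many words are in file1, this shape should scale reasonably to a big dataset, at the cost of keeping the file1 words in memory.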
I am looking for help to do this with awk/perl.
I am using Ubuntu and Windows. I am a biologist and do experiments most of the time, but sometimes I write programs to analyze my data. I am not efficient at programming, though, and this is difficult for me to write. I tried but could not produce correct output. This is a sample file I prepared for testing; I need to run this on a very big dataset.
Quote:
If the position of "dog" occurs every time (whenever it occurs) as a first or last position then it will go to a new file called file_external. If not then it will go to new file file_internal.
Quote:
cat comes in middle, end and first positions, so will append to file_internal.
To me, this is not clear... If dog occurs in a first and/or last position it goes to file_ext; if not, it goes to file_int. Question: why does the cat go to file_int regardless of its position?
Anyhow, you might want to try this code:
Demo:
If the output is correct, in the awk END section you can simply change print e to print e >> "file.ext" and print i to print i >> "file.int" to redirect the output to appropriate files.
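One thing worth knowing before switching to `>>`: in awk, `>` truncates the file the first time it is opened in a run and then keeps appending for the rest of that run, while `>>` keeps whatever was already in the file, so rerunning the script adds duplicates. A small demonstration, with made-up file names and words (not the actual variables from the code above):

```shell
#!/bin/sh
# Demonstration of awk output redirection; file name and words are invented.
tmp=$(mktemp -d)
out="$tmp/file.ext"

printf 'old\n' > "$out"                       # pretend a previous run left this behind

# ">>" appends to the existing content.
printf 'cow\nfish\n' | awk -v out="$out" '{ print $1 >> out }'
after_append=$(tr '\n' ' ' < "$out")
echo "after >>: $after_append"

# ">" truncates once when the file is first opened, then appends within the run.
printf 'duck\ncrow\n' | awk -v out="$out" '{ print $1 > out }'
after_truncate=$(tr '\n' ' ' < "$out")
echo "after >:  $after_truncate"
```

So if the script may be run more than once, either use `>` or delete the output files before each run.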
The logic I used was to test for the target (from file1) in each line of file2; if the target exists and is not in the first or last position, then it has to be internal. Print it and get the next target.
This assumes that all items in file1 always exist in file2.
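For readers without the original code, the logic described above might look roughly like the following. This is a reconstruction from the description, not the code that was posted; the sample data is invented, and the same assumption applies (every item in file1 exists somewhere in file2, otherwise it would land in file_external).

```shell
#!/bin/sh
# Reconstruction of the described logic, not the code from the reply.
# Sample data is invented for illustration.
demo=$(mktemp -d)

printf '%s\n' dog cow fox cat fish duck crow > "$demo/file1"

cat > "$demo/file2" <<'EOF'
dog fox cat cow
cow cat fox dog
fish fox crow dog cat duck
cat fox crow fish
duck crow fox cow
duck fox cat
EOF

while read -r target; do
    [ -n "$target" ] || continue          # skip blank lines in file1
    # awk exits 0 as soon as the target is seen in a middle field (2..NF-1),
    # and 1 if it reaches the end of file2 without finding one.
    if awk -v t="$target" '
        { for (i = 2; i < NF; i++) if ($i == t) { found = 1; exit } }
        END { exit !found }
    ' "$demo/file2"
    then
        printf '%s\n' "$target" >> "$demo/file_internal"
    else
        printf '%s\n' "$target" >> "$demo/file_external"
    fi
done < "$demo/file1"

echo "external:"; cat "$demo/file_external"
echo "internal:"; cat "$demo/file_internal"
```

Note that this rereads file2 once per target, so on a very big dataset it will probably be much slower than an approach that reads file2 only once.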
Last edited by ongoto; 12-11-2014 at 05:20 PM..
Reason: opted for 'while read X' over 'for X'