Add unique identifier from file to filetype in directory
Post 302986517 by cmccabe on Saturday 26th of November 2016 11:17:40 AM
The full code is below. Since I want to process only one directory at a time, as you know, I use the first portion to select the oldest folder and work on that one.

Code:
#!/bin/bash

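# get oldest folder (-type d keeps plain files such as the log out of the list)
dir=/home/cmccabe/Desktop/index
{
  read -r -d $'\t' time && read -r -d '' filename
} < <(find "$dir" -maxdepth 1 -mindepth 1 -type d -printf '%T+\t%P\0' | sort -z )
# note: %T+ is the last modification time; values are passed as arguments so a % in a name cannot break the format
printf 'The oldest folder is %s and was created on %s, analysis was performed using v1.3 of the medex pipeline by %s at %s\n' "$filename" "$time" "$USER" "$(date "+%D %r")" >> "$dir/log"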

# rename bam, vcf and .bam.bai files (one cd is enough; stop if it fails)
cd "$dir/$filename" || exit 1
rename 's/^([^_]+_[^_]+)_.+$/$1.bam/' *.bam
rename 's/^([^_]+_[^_]+)_.+$/$1.vcf/' *.vcf
rename 's/^([^_]+_[^_]+)_.+$/$1.bam.bai/' *.bam.bai

# add identifier to bam and vcf
BaseDir=/home/cmccabe/Desktop/index  # search dir
TranslationFile=/home/cmccabe/s5_files/identifier/input #input
SubDir=${1:-$filename} # specific subdir

cd "$BaseDir/$SubDir" # look in this folder
printf '%s\n' *.bam *.vcf *.bam.bai | awk '
FNR == NR  {  # process all rows and columns
    if(NF == 2) # 2 columns in input
        old[$2] = $1  # old identifier
    next  # next line
}
{    prefix = substr($0, 1, length($0) - 4)
    if(prefix in old)
        printf("mv \"%s\" \"%s%s\"\n", $0, old[prefix])
    else    printf("# No translation found for \"%s\"\n", $0) #not found
}' "$TranslationFile" - # update from

Since there is no suffix in $1 of input, I removed the suffix handling from the code and added a third file type to search, .bam.bai. The input format and the output I get are shown below.

Input format:
Code:
IonXpress_001 MEC2
IonXpress_002 MEC3
IonXpress_003 MEV48
R_2016_10_21_09_52_37_user_S5-00580-10-Medexome

IonXpress_007 MEV21
IonXpress_008 MEV22
IonXpress_009 MEV23
R_2016_09_21_14_01_15_user_S5-00580-9-Medexome

Current output:
Code:
# No translation found for "IonXpress_007.bam"
# No translation found for "IonXpress_007.bam"
# No translation found for "IonXpress_007.bam"
# No translation found for "IonXpress_008.vcf"
# No translation found for "IonXpress_008.vcf"
# No translation found for "IonXpress_008.vcf"
# No translation found for "IonXpress_009.bam.bai"
# No translation found for "IonXpress_009.bam.bai"
# No translation found for "IonXpress_009.bam.bai"
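
One possible reading of this output, offered as a sketch and not a confirmed diagnosis: the lookup table is keyed on $2 (for example MEV21) while the file names start with $1 (IonXpress_007), and `length($0) - 4` only removes a four-character suffix, so `.bam.bai` leaves `IonXpress_009.bam` as the prefix. The snippet below keys the table on $1 and splits each name at its first dot. The temporary table and the three file names are made up for the demo, and the array name `newid` is my own.

```shell
#!/bin/bash
# Hypothetical demo data mirroring the input format above.
map=$(mktemp)
printf '%s\n' 'IonXpress_007 MEV21' 'IonXpress_008 MEV22' 'IonXpress_009 MEV23' \
    'R_2016_09_21_14_01_15_user_S5-00580-9-Medexome' > "$map"

result=$(printf '%s\n' IonXpress_007.bam IonXpress_008.vcf IonXpress_009.bam.bai |
awk '
FNR == NR {                      # first file: the translation table
    if (NF == 2)                 # skip blank lines and the one-column run name
        newid[$1] = $2           # key on the IonXpress name, store the new identifier
    next
}
{                                # second input (stdin): the file names
    dot = index($0, ".")         # position of the first dot
    if (dot == 0) { print "# skipped " $0; next }
    prefix = substr($0, 1, dot - 1)
    suffix = substr($0, dot)     # .bam, .vcf or .bam.bai
    if (prefix in newid)
        printf("mv \"%s\" \"%s%s\"\n", $0, newid[prefix], suffix)
    else
        printf("# No translation found for \"%s\"\n", $0)
}' "$map" -)

printf '%s\n' "$result"
rm -f "$map"
```

With the demo data this prints three `mv` lines, one per file name, and nothing is renamed because the commands are only text at this point.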

Also, I am not sure what you mean by output to a shell, since the files in the subdirectory should be renamed using the name from input. I tried to follow your code and added comments that I hope are correct, but I do not quite understand the portion in bold. I think that is what updates the identifiers, but I am not sure.
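
On "output to a shell", my understanding (an assumption on my part, not something confirmed in this thread) is that the awk script only prints `mv` commands as text, and nothing is renamed until that text is handed to a shell, for example by piping it to `sh`. A minimal sketch in a throwaway directory, with made-up file names:

```shell
#!/bin/bash
work=$(mktemp -d)
cd "$work" || exit 1
touch IonXpress_007.bam

# Step 1: a command generator only prints text; the file keeps its old name.
cmds=$(printf 'mv "%s" "%s"\n' IonXpress_007.bam MEV21.bam)
printf '%s\n' "$cmds"
before=$(ls)

# Step 2: feeding the same text to a shell actually executes the mv.
printf '%s\n' "$cmds" | sh
after=$(ls)

printf 'before: %s\nafter: %s\n' "$before" "$after"
cd / && rm -rf "$work"
```

Printing first and piping to `sh` only after checking the commands gives a dry run for free, which is probably why the script was written that way.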

Example
Code:
IonXpress_007.bam >>> MEV21.bam ---- since the IonXpress_007 in the .bam located in the subdir matches $1 of input, that .bam file is updated with $2 of input
IonXpress_007.vcf >>> MEV21.vcf ---- since the IonXpress_007 in the .vcf located in the subdir matches $1 of input, that .vcf file is updated with $2 of input
IonXpress_007.bam.bai >>> MEV21.bam.bai ---- since the IonXpress_007 in the .bam.bai located in the subdir matches $1 of input, that .bam.bai file is updated with $2 of input
IonXpress_008.bam >>> MEV22.bam ---- since the IonXpress_008 in the .bam located in the subdir matches $1 of input, that .bam file is updated with $2 of input
IonXpress_008.vcf >>> MEV22.vcf ---- since the IonXpress_008 in the .vcf located in the subdir matches $1 of input, that .vcf file is updated with $2 of input
IonXpress_008.bam.bai >>> MEV22.bam.bai ---- since the IonXpress_008 in the .bam.bai located in the subdir matches $1 of input, that .bam.bai file is updated with $2 of input
IonXpress_009.bam >>> MEV23.bam ---- since the IonXpress_009 in the .bam located in the subdir matches $1 of input, that .bam file is updated with $2 of input
IonXpress_009.vcf >>> MEV23.vcf ---- since the IonXpress_009 in the .vcf located in the subdir matches $1 of input, that .vcf file is updated with $2 of input
IonXpress_009.bam.bai >>> MEV23.bam.bai ---- since the IonXpress_009 in the .bam.bai located in the subdir matches $1 of input, that .bam.bai file is updated with $2 of input
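
The mapping in this example could also be applied directly, without generating commands first. The sketch below is only one way to do it: `lookup_id` and `rename_by_table` are hypothetical helper names of mine, the demo directory and files are invented, and it assumes every file name starts with the $1 identifier followed by a dot. It prints what it would do unless the third argument is `run`.

```shell
#!/bin/bash
# lookup_id NAME TABLE: print column 2 of the two-column row whose column 1 is NAME.
# (hypothetical helper, not from the thread)
lookup_id() {
    awk -v key="$1" 'NF == 2 && $1 == key { print $2; exit }' "$2"
}

# rename_by_table DIR TABLE [run]: dry run unless the third argument is "run".
# Variables are global to keep the sketch short.
rename_by_table() {
    for f in "$1"/*.bam "$1"/*.vcf "$1"/*.bam.bai; do
        [ -e "$f" ] || continue            # an unmatched glob stays literal
        base=${f##*/}
        prefix=${base%%.*}                 # text before the first dot
        suffix=${base#"$prefix"}           # .bam, .vcf or .bam.bai
        id=$(lookup_id "$prefix" "$2")
        if [ -z "$id" ]; then
            printf '# No translation found for "%s"\n' "$base"
        elif [ "${3:-dry}" = run ]; then
            mv -- "$f" "$1/$id$suffix"
        else
            printf 'mv "%s" "%s"\n' "$base" "$id$suffix"
        fi
    done
}

# Demo on invented data: one table row, two matching files, one unmatched file.
demo=$(mktemp -d)
printf '%s\n' 'IonXpress_007 MEV21' 'R_2016_09_21_14_01_15_user_S5-00580-9-Medexome' > "$demo/input"
mkdir "$demo/run"
touch "$demo/run/IonXpress_007.bam" "$demo/run/IonXpress_007.bam.bai" "$demo/run/IonXpress_010.vcf"
rename_by_table "$demo/run" "$demo/input"        # dry run: print only
rename_by_table "$demo/run" "$demo/input" run    # perform the renames
listing=$(ls "$demo/run")
printf '%s\n' "$listing"
rm -rf "$demo"
```

Calling awk once per file is slower than loading the table once, but for a handful of files per run it keeps the lookup easy to read.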

Thank you for your help :).

Last edited by cmccabe; 11-26-2016 at 12:30 PM. Reason: added details
 
