Shell Scripting | Return list of unique characters in files


 
# 1  
Old 11-28-2016

Hi,

I am trying to script the below, but I am not very good at it.

Your help would be greatly appreciated.

1. Read all files in the directory with strings:
Code:
strings *.*

2. In each file, for each line that contains "ABCD", store the characters located at positions 521 and 522 of that line (this is where I am stuck).

3. Once all files have been read, print a list of the unique values (I guess I would have to use uniq).
# 2  
Old 11-28-2016
This specification is far from clear or complete. Please explain again in detail, supported by input and output samples, where you come from and what you want to achieve.
# 3  
Old 11-28-2016
Dear clippertm,

I have a few questions to pose in response first:
  • Is this homework/assignment? There are specific forums for these.
  • What have you tried so far?
  • What output/errors do you get?
  • What OS and version are you using?
  • What are your preferred tools? (C, shell, perl, awk, etc. & strings, of course!)
  • What logical process have you considered? (to help steer us to follow what you are trying to achieve)
Most importantly, what have you tried so far?

There are probably many ways to achieve most tasks, so giving us an idea of your style and thoughts will help us guide you to an answer most suitable to you so you can adjust it to suit your needs in future.


We're all here to learn and getting the relevant information will help us all.


Kind regards,
Robin
# 4  
Old 11-28-2016
It's difficult to tell you how to accomplish a task if you don't even specify what programming language you are going to use.

As to your question regarding extracting a single character:

Assuming that you have read the line in question into a shell variable, extracting a character at a certain position from a variable can be done in bash or zsh with the following (the offset is zero-based, so 520 selects the character at position 521)
Code:
    ${line:520:1}

In Zsh, you also have the option to write
Code:
    ${line[521,521]}

In your case, you might also consider not solving this within the respective shell language, but piping the selected lines into 'cut' instead. Given that your problem description is somewhat fuzzy, I can't recommend which solution is the better one. In any case, have a look at the man-page for 'cut'.
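
To make the difference in counting between the two approaches concrete, here is a minimal sketch. The sample line is made up (ABCD, filler, then 42 at positions 521 and 522), and the variable names are only for illustration. The substring expansion is run through bash explicitly, because plain sh does not support that syntax.

```shell
# Hypothetical sample line: "ABCD", 516 filler characters, then "42" at
# positions 521-522 (counting from 1), then a little more filler.
pad=$(printf '%516s' '' | tr ' ' 'x')
line="ABCD${pad}42yyyy"

# Substring expansion in bash: the offset is zero-based, so 520 means position 521.
from_bash=$(bash -c 'printf "%s" "${1:520:2}"' _ "$line")

# cut counts from 1, so the positions are used as stated in the question.
from_cut=$(printf '%s\n' "$line" | cut -c 521-522)

echo "bash: $from_bash"
echo "cut:  $from_cut"
```

Both lines of output should show the same two characters. If they differ on your data, the offset is probably off by one somewhere.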

Last edited by rbatte1; 11-29-2016 at 05:33 AM.. Reason: Added CODE tags
# 5  
Old 11-28-2016
Hi,

Thank you all for your replies. My apologies if I was not specific enough, I will do my best. No, this is not homework, just a bunch of files I need to analyse for one of my hobbies. I usually use a combination of strings and grep, but it would just take too long this time. Here is what I typically use:
Code:
strings *.* | egrep --color 'ABC.{521}10'

I guess you call it shell, correct? My environment is Cygwin. It shows me all lines (string formatted) starting with 'ABC' with value 10 at the 134 position. It allows me to see if value 10 occurred in this bunch of files. I usually only look for a few values, like 10, 11 and 20, so it does not take long.

The issue is that there are now too many files and too many values, which range from 00 to 99. The output I am looking for would simply be:
Code:
10
11
20
25
26

which is a list of values that occurred at the 134 position (without duplicates). I do not know if this can be achieved with shell. I do not mind at all trying new things like awk and perl.

Last edited by clippertm; 11-28-2016 at 09:33 PM..
# 6  
Old 11-28-2016
You could try this:

Code:
strings *.* | grep "ABCD" | cut -c 521-522 | egrep '(10|11|20)' | sort | uniq

If your list of values is quite long you could put them in a file with 1 line per value and use -f option of grep like this:

Code:
strings *.* | grep "ABCD" | cut -c 521-522 | grep -F -f want.txt | sort | uniq

-F selects fixed-string matching (faster than using a regex), and -f <filename> reads the list of strings to match from the file filename, one per line.
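
As a rough end-to-end check of the pipeline above, here is a sketch that runs it on made-up data. The sample files, their contents and the temporary directory are all assumptions for illustration. cat stands in for strings because the samples are plain text, and -x is added so that each entry in want.txt has to match the whole two-character field.

```shell
# Work in a throwaway directory with two hypothetical sample files.
dir=$(mktemp -d)
pad=$(printf '%516s' '' | tr ' ' 'x')     # filler so the value lands at 521-522
printf 'ABCD%s20\nABCD%s10\nZZZZ%s99\n' "$pad" "$pad" "$pad" > "$dir/a.txt"
printf 'ABCD%s10\nABCD%s77\n' "$pad" "$pad" > "$dir/b.txt"

# Values of interest, one per line, as grep -f expects.
printf '10\n11\n20\n' > "$dir/want.txt"

# cat replaces strings here only because the sample files are plain text.
# sort -u does the same job as sort | uniq.
result=$(cat "$dir/a.txt" "$dir/b.txt" | grep "ABCD" | cut -c 521-522 |
         grep -F -x -f "$dir/want.txt" | sort -u)

printf '%s\n' "$result"
rm -rf "$dir"
```

On this sample the output is 10 and 20: the 99 sits on a line without ABCD, and 77 is not listed in want.txt. If you simply want every value that occurs, the grep -F step can be dropped altogether.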

Last edited by Chubler_XL; 11-28-2016 at 09:31 PM..
# 7  
Old 11-28-2016
Code:
'ABC.{134}10'
ABC = 3
.{134} = 134
10 = 2

ABC accounts for 3 characters and .{134} for 134 more, so the 1 of 10 is the 138th character of the matched string and the 0 is the 139th.
Also, there is no anchor tying ABC to the start of the line; it only marks the start of the matched string, which can begin anywhere in the line.

I do not see how it can work with what you said:
Quote:
'ABC' with value 10 at the 134 position
In that case, it would be something like grep -E '^ABC.{130}(1[01]|2[056])' *
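
A small sketch of what the anchor changes, using made-up sample lines (the file name, filler and values are assumptions for illustration). It counts the lines selected with and without ^, then lists the unique values at positions 134-135 on lines that really start with ABC.

```shell
dir=$(mktemp -d)
pad=$(printf '%130s' '' | tr ' ' '.')     # 130 filler characters after "ABC"

# Hypothetical sample: the value sits at positions 134-135 when ABC starts the line.
{
  printf 'ABC%s10 tail\n' "$pad"
  printf 'ABC%s26 tail\n' "$pad"
  printf 'ABC%s10 tail\n' "$pad"          # duplicate value
  printf 'xxABC%s25 tail\n' "$pad"        # ABC not at line start: value is at 136-137
} > "$dir/sample.txt"

# Count the lines each pattern selects.
unanchored=$(grep -cE 'ABC.{130}[0-9]{2}' "$dir/sample.txt")
anchored=$(grep -cE '^ABC.{130}[0-9]{2}' "$dir/sample.txt")

# Unique values at positions 134-135 on lines that start with ABC.
values=$(grep -E '^ABC' "$dir/sample.txt" | cut -c 134-135 | sort -u)

echo "unanchored: $unanchored, anchored: $anchored"
printf '%s\n' "$values"
rm -rf "$dir"
```

The unanchored pattern also picks up the line where ABC appears two characters in, so it selects 4 lines against 3 for the anchored one, and the list of values comes out as 10 and 26.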

Last edited by rbatte1; 11-29-2016 at 05:35 AM.. Reason: Corrected CODE tags
This User Gave Thanks to Aia For This Post: