Replacement to cut command


# 1  

Experts,

It's been a long time since I last programmed in shell, so I thought this would be a good opportunity to ask for your suggestions on a problem I've run into: parsing a string into a variable using "cut".



Code:
#Azure DataLake Path Of the File

DATASET_PATH="adl://xyz123.azuredatalakestore.net/devhdfs/DataWareHouse/sf_inbound/serviceappointment"

#Shell function that lists the dataset directory

hiveClass () {
hadoop fs -ls ${DATASET_PATH}
}

#Variable that Stores the Complete Path of the File

var=`hiveClass | grep -i "parquet" | cut -d' ' -f15`

Example of the hiveClass function when run interactively:

Code:
spark@hn0-emrazs:~$ DATASET_PATH="adl://xyz123.azuredatalakestore.net/devhdfs/DataWareHouse/sf_inbound/serviceappointment"

spark@hn0-emrazs:~$ hiveClass () {
> hadoop fs -ls ${DATASET_PATH}
> }

spark@hn0-xyz1:~$ var=`hiveClass | grep -i "parquet" | cut -d' ' -f15`

spark@hn0-xyz1:~$ echo "$var"

spark@hn0-xyz1:~$ var=`hiveClass | grep -i "parquet" | cut -d' ' -f14`

spark@hn0-xyz1:~$ echo "$var"

spark@hn0-xyz1:~$ var=`hiveClass | grep -i "parquet" | cut -d' ' -f16`

spark@hn0-xyz1:~$ echo "$var"

spark@hn0-xyz1:~$ var=`hiveClass | grep -i "parquet" | cut -d' ' -f13`

spark@hn0-xyz1:~$ echo "$var"
adl://xyz123.azuredatalakestore.net/devhdfs/DataWareHouse/sf_inbound/appointment/part-00000-aa60bb3c-6780-44fa-b93d-4232df81faa1-c000.snappy.parquet

Running the command directly gives this result:

Code:
sparksshuser@hn0-xyz1:~$ hadoop fs -ls adl://xyz123.azuredatalakestore.net/devhdfs/DataWareHouse/sf_inbound/appointment

Found 2 items
-rw-r-----+  1 sparksshuser sparksshuser          0 2018-10-25 02:08 adl://xyz123.azuredatalakestore.net/devhdfs/DataWareHouse/sf_inbound/appointment/_SUCCESS

-rw-r-----+  1 sparksshuser sparksshuser     594663 2018-10-25 02:07 adl://xyz123.azuredatalakestore.net/devhdfs/DataWareHouse/sf_inbound/appointment/part-00000-aa60bb3c-6780-44fa-b93d-4232df81faa1-c000.snappy.parquet

Now the real challenge is the field number passed to cut. It is not constant, so I cannot rely on it in a scheduled script: it changes whenever the number of digits in the file size changes.

Code:
var=`hiveClass | grep -i "parquet" | cut -d' ' -f13`

My question: I want to store the complete file URL in a variable, so that I can feed it into a Hive table, without hard-coding "cut -d' ' -f???".

Code:
adl://xyz123.azuredatalakestore.net/devhdfs/DataWareHouse/sf_inbound/appointment/part-00000-aa60bb3c-6780-44fa-b93d-4232df81faa1-c000.snappy.parquet
# 2  
Code:
awk '{print $13}'


Last edited by Scrutinizer; 10-25-2018 at 09:43 AM..
# 3  
$13 is based on counting the empty fields, if I'm not wrong?

But the number of spaces may increase or decrease; it is dynamic. This is due to the file size field (594663 here). If the file size grows by another six digits, to something like 111222594663, I may have to go down from $13 to $10, $11 or $12, and conversely up to $14, $15 or $16 if the size shrinks. That is not practical when the jobs are scheduled. Kindly advise what the best approach would be.

Code:
-rw-r-----+  1 sparksshuser sparksshuser     594663 2018-10-25 02:07 adl://xyz123.azuredatalakestore.net/devhdfs/DataWareHouse/sf_inbound/appointment/part-00000-aa60bb3c-6780-44fa-b93d-4232df81faa1-c000.snappy.parquet

# 4  
Hi, try:
Code:
... | grep -i "parquet" | awk '{print $5}'

Instead.

awk, with its default field separator, treats a run of whitespace as a single separator, whereas cut counts each space character individually.
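
To make the difference concrete, here is a minimal sketch that runs both tools on the listing line quoted in post #1. The second line, "wide", is a made-up variant with a 12-digit size and only a single space of padding; it is an assumption for illustration, not real hadoop output.

```shell
# Sample line copied from the listing in post #1 (five spaces before the size).
line='-rw-r-----+  1 sparksshuser sparksshuser     594663 2018-10-25 02:07 adl://xyz123.azuredatalakestore.net/devhdfs/DataWareHouse/sf_inbound/appointment/part-00000-aa60bb3c-6780-44fa-b93d-4232df81faa1-c000.snappy.parquet'

# Hypothetical variant: larger size, so only one space of padding remains.
wide=$(printf '%s\n' "$line" | sed 's/     594663/ 111222594663/')

# cut counts every single space, so the path position depends on the padding.
cut_line=$(printf '%s\n' "$line" | cut -d' ' -f13)
cut_wide=$(printf '%s\n' "$wide" | cut -d' ' -f13)

# awk collapses runs of whitespace, so the last field is the path in both cases.
awk_line=$(printf '%s\n' "$line" | awk '{print $NF}')
awk_wide=$(printf '%s\n' "$wide" | awk '{print $NF}')

echo "cut  -f13, original line: $cut_line"
echo "cut  -f13, wide line:     $cut_wide"
echo "awk  \$NF,  original line: $awk_line"
echo "awk  \$NF,  wide line:     $awk_wide"
```

With the wide line, cut -f13 comes back empty because there are fewer than 13 space-separated fields, while awk still returns the path.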

Last edited by Scrutinizer; 10-25-2018 at 10:46 AM..
# 5  
Funny, I count field #8:
Code:
hadoop fs -ls | awk '/parquet/ {print $8}'

Or take the last field:
Code:
hadoop fs -ls | awk '/parquet/ {print $NF}'

If you need case-insensitive matching, you can keep the grep -i:
Code:
hadoop fs -ls | grep -iw "parquet" | awk '{print $NF}'
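
Putting it together for the variable assignment asked about in post #1, something like the following might do. It is only a sketch: the function name parquet_paths is made up for illustration, and the demonstration feeds it the listing text quoted in post #1 rather than calling hadoop, so it can be tried anywhere.

```shell
# Read an "hadoop fs -ls" style listing on stdin and print the last field of
# every line whose last field ends in ".parquet" (matched case-insensitively).
# parquet_paths is a hypothetical helper name, not part of hadoop.
parquet_paths() {
    awk 'tolower($NF) ~ /\.parquet$/ { print $NF }'
}

# Intended use on the cluster (not run here; DATASET_PATH as set in post #1):
#   var=$(hadoop fs -ls "$DATASET_PATH" | parquet_paths | head -n 1)

# Self-contained demonstration with the listing quoted in post #1.
listing='Found 2 items
-rw-r-----+  1 sparksshuser sparksshuser          0 2018-10-25 02:08 adl://xyz123.azuredatalakestore.net/devhdfs/DataWareHouse/sf_inbound/appointment/_SUCCESS
-rw-r-----+  1 sparksshuser sparksshuser     594663 2018-10-25 02:07 adl://xyz123.azuredatalakestore.net/devhdfs/DataWareHouse/sf_inbound/appointment/part-00000-aa60bb3c-6780-44fa-b93d-4232df81faa1-c000.snappy.parquet'

var=$(printf '%s\n' "$listing" | parquet_paths | head -n 1)
echo "$var"
```

The head -n 1 keeps only the first match in case the directory holds several part files; drop it if you want all of them. Note that $NF assumes the path itself contains no whitespace.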
