Replacement to cut command

# 1  
Old 10-25-2018
Replacement to cut command

Experts,

It has been a long time since I last programmed in shell, so I thought this would be a good opportunity to ask for your suggestions on a problem I'm facing: parsing a string into a variable using "cut".



Code:
#Azure DataLake Path Of the File

DATASET_PATH="adl://xyz123.azuredatalakestore.net/devhdfs/DataWareHouse/sf_inbound/serviceappointment"

#Shell function that lists the dataset directory

hiveClass () {
hadoop fs -ls ${DATASET_PATH}
}

#Variable that Stores the Complete Path of the File

var=`hiveClass | grep -i "parquet" | cut -d' ' -f15`

Example of hiveClass when executed interactively:

Code:
spark@hn0-emrazs:~$ DATASET_PATH="adl://xyz123.azuredatalakestore.net/devhdfs/DataWareHouse/sf_inbound/serviceappointment"

spark@hn0-emrazs:~$ hiveClass () {
> hadoop fs -ls ${DATASET_PATH}
> }

spark@hn0-xyz1:~$ var=`hiveClass | grep -i "parquet" | cut -d' ' -f15`

spark@hn0-xyz1:~$ echo "$var"

spark@hn0-xyz1:~$ var=`hiveClass | grep -i "parquet" | cut -d' ' -f14`

spark@hn0-xyz1:~$ echo "$var"

spark@hn0-xyz1:~$ var=`hiveClass | grep -i "parquet" | cut -d' ' -f16`

spark@hn0-xyz1:~$ echo "$var"

spark@hn0-xyz1:~$ var=`hiveClass | grep -i "parquet" | cut -d' ' -f13`

spark@hn0-xyz1:~$ echo "$var"
adl://xyz123.azuredatalakestore.net/devhdfs/DataWareHouse/sf_inbound/appointment/part-00000-aa60bb3c-6780-44fa-b93d-4232df81faa1-c000.snappy.parquet

Running the command directly gives this result:

Code:
sparksshuser@hn0-xyz1:~$ hadoop fs -ls adl://xyz123.azuredatalakestore.net/devhdfs/DataWareHouse/sf_inbound/appointment

Found 2 items
-rw-r-----+  1 sparksshuser sparksshuser          0 2018-10-25 02:08 adl://xyz123.azuredatalakestore.net/devhdfs/DataWareHouse/sf_inbound/appointment/_SUCCESS

-rw-r-----+  1 sparksshuser sparksshuser     594663 2018-10-25 02:07 adl://xyz123.azuredatalakestore.net/devhdfs/DataWareHouse/sf_inbound/appointment/part-00000-aa60bb3c-6780-44fa-b93d-4232df81faa1-c000.snappy.parquet

The real challenge is the field number passed to cut. The path does not stay at a constant field number, so I cannot schedule my script: the number changes whenever the file size in bytes gains or loses digits.

Code:
var=`hiveClass | grep -i "parquet" | cut -d' ' -f13`

My question: how can I capture the complete file URL in a variable, so that I can feed it into a Hive table, without using "cut -d' ' -f???"?

Code:
adl://xyz123.azuredatalakestore.net/devhdfs/DataWareHouse/sf_inbound/appointment/part-00000-aa60bb3c-6780-44fa-b93d-4232df81faa1-c000.snappy.parquet
# 2  
Old 10-25-2018
Code:
awk '{print $13}'


Last edited by Scrutinizer; 10-25-2018 at 10:43 AM..
# 3  
Old 10-25-2018
$13 counts the empty fields too, if I'm not wrong?

But the number of spaces can increase or decrease, because it depends on the file-size field (here "594663"). If the size grows by another 6 digits, to something like "111222594663", I would have to go down from $13 to $10/11/12; if the size shrinks, I would have to go up from $13 to $14/15/16. That is not practical for scheduled jobs. Kindly advise what the best approach would be.

Code:
-rw-r-----+  1 sparksshuser sparksshuser     594663 2018-10-25 02:07 adl://xyz123.azuredatalakestore.net/devhdfs/DataWareHouse/sf_inbound/appointment/part-00000-aa60bb3c-6780-44fa-b93d-4232df81faa1-c000.snappy.parquet

# 4  
Old 10-25-2018
Hi, try:
Code:
... | grep -i "parquet" | awk '{print $5}'

Instead.

awk - with the default field separator values - clusters whitespace together, whereas cut counts each space character individually.
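
To make the difference concrete, here is a small sketch that can be run without hadoop. The helper make_line is hypothetical and only imitates the shape of the listing shown earlier in the thread; the 10-character right-aligned size column is an assumption for illustration, not something taken from the hadoop documentation.

Code:
```shell
#!/usr/bin/env bash
# Hypothetical helper: builds one listing line shaped like the output above.
# Assumption: the size column is right-aligned in a 10-character column.
make_line() {
    printf '%s  1 %s %s %10s %s %s %s\n' \
        '-rw-r-----+' sparksshuser sparksshuser "$1" 2018-10-25 02:07 \
        'adl://xyz123.azuredatalakestore.net/devhdfs/DataWareHouse/sf_inbound/appointment/part-00000-aa60bb3c-6780-44fa-b93d-4232df81faa1-c000.snappy.parquet'
}

small=$(make_line 594663)          # size is padded with leading spaces
large=$(make_line 111222594663)    # size is wider than the column, no padding

# cut counts every single space, so one field number gives different results.
cut_small=$(printf '%s\n' "$small" | cut -d' ' -f13)
cut_large=$(printf '%s\n' "$large" | cut -d' ' -f13)

# awk splits on runs of whitespace, so the path stays in the same field.
awk_small=$(printf '%s\n' "$small" | awk '{print $8}')
awk_large=$(printf '%s\n' "$large" | awk '{print $8}')

printf 'cut small: %s\ncut large: %s\nawk small: %s\nawk large: %s\n' \
    "$cut_small" "$cut_large" "$awk_small" "$awk_large"
```

With the padded line, cut finds the path in field 13; with the longer size, the padding is gone and field 13 no longer exists, so cut returns nothing. awk returns the path both times.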

Last edited by Scrutinizer; 10-25-2018 at 11:46 AM..
# 5  
Old 10-25-2018
Funny, I count: field#8:
Code:
hadoop fs -ls | awk '/parquet/ {print $8}'

Or take the last field:
Code:
hadoop fs -ls | awk '/parquet/ {print $NF}'

If you need a case-insensitive match, you can keep the grep -i:
Code:
hadoop fs -ls | grep -iw "parquet" | awk '{print $NF}'
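
Putting the pieces together, one possible way to capture the path in a variable is sketched below. The function name parquet_path is made up for this example, and the sample text is copied from the listing in the first post so the sketch runs without hadoop. It assumes the path itself contains no spaces, since $NF would then return only the part after the last space.

Code:
```shell
#!/usr/bin/env bash
# Hypothetical helper: reads an ls-style listing on stdin and prints the last
# field of every line whose last field ends in ".parquet" (case-insensitive).
parquet_path() {
    awk 'tolower($NF) ~ /\.parquet$/ { print $NF }'
}

# Sample listing copied from the first post, standing in for the real command.
sample='Found 2 items
-rw-r-----+  1 sparksshuser sparksshuser          0 2018-10-25 02:08 adl://xyz123.azuredatalakestore.net/devhdfs/DataWareHouse/sf_inbound/appointment/_SUCCESS
-rw-r-----+  1 sparksshuser sparksshuser     594663 2018-10-25 02:07 adl://xyz123.azuredatalakestore.net/devhdfs/DataWareHouse/sf_inbound/appointment/part-00000-aa60bb3c-6780-44fa-b93d-4232df81faa1-c000.snappy.parquet'

var=$(printf '%s\n' "$sample" | parquet_path)
echo "$var"

# In the scheduled script the sample would be replaced by the real listing:
#   var=$(hiveClass | parquet_path)
```

The tolower() call does the job of grep -i inside awk, so a single process handles both the filtering and the field extraction. If the directory can hold more than one parquet file, var will contain one path per line.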

