Shell Script for HDFS Ingestion Using JDBC


 
# 1  
Old 10-23-2018
Shell Script for HDFS Ingestion Using JDBC

Peers,

I am building a script that connects to Salesforce over JDBC, pulls the data with Spark, and processes it into a Hive table. During this I hit a problem: a variable assigned the output of a hadoop command that lists files in Azure Data Lake does not receive the value; it comes back null, and the hadoop command ends up listing local files instead. Has anyone faced a similar situation and found a workaround?

Code:
spark@hn0-xyz:~$ hadoop fs -ls adl://ayz.xyz12345.net/hdfs/DataWareHouse/salesforce_jars/DataWareHouse.jar

-rwxrwx---+  1 spark  spark      98011 2018-10-23 12:07 adl://ayz.xyz12345.net/hdfs/DataWareHouse/salesforce_jars/DataWareHouse.jar

DataWareHouse.jar is the file I am looking for. The script further down picks up a few similar jars and passes them to a spark-submit job.
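In essence, this is how I am trying to capture that listing into a shell variable (a sketch of the intent only; the variable names here are illustrative, and on my cluster the variable comes back empty):

Code:
#!/bin/bash

# Capture the ADL listing via command substitution, then test whether
# anything actually came back before going any further.
MVN_LIB_PATH="adl://ayz.xyz12345.net/hdfs/DataWareHouse/salesforce_jars/DataWareHouse.jar"

jar_listing=$(hadoop fs -ls "$MVN_LIB_PATH")

if [ -n "$jar_listing" ]
then
    echo "Found: $jar_listing"
else
    echo "Listing came back empty"
fi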


Code:
#!/bin/bash

#Identification for Salesforce Class and Object
now () {
 date -d "4 hours" "+Date: %Y-%m-%d Time: %H:%M:%S"
}
echo " -----------------Job Run Time------------------------------"
echo " `now` "
echo " Spark Job For Salesforce Account Object %n Class for Account ${SALESFORCE_ACCOUNT_CLASS} "
echo " -----------------------------------------------------------"

#Check All Maven Build Configs

set MVN_LIB_PATH = "adl://ayz.xyz123.net/hdfs/DataWareHouse/salesforce_jars/DataWareHouse.jar"
mvnClass () {
hadoop fs -ls $MVN_LIB_PATH
}
mvnClass 
if [ echo $? == 0]
  then
    echo "Finding Maven Build Jar is Successful"
elif
    echo " Aborting the Job Process"
fi

#Checking for Driver and Executor Class

set JDBC_LIB_PATH = "adl://ayz.xyz123.net/hdfs/DataWareHouse/salesforce_jars/sforce.jar"
jdbcClass () {
hadoop fs -ls $JDBC_LIB_PATH 
}
jdbcClass
if [ echo $? == 0]
then
    echo " Finding JDBC Driver and Executor Jar is Successfull"
elif
    echo " Aborting the Job Process"
fi

#Compiling Spark Submit for Spark API


set SALESFORCE_ACCOUNT_CLASS = "--class com.yxzar.property.SalesForceAccount"
set ENVIRONMENT = "--master yarn"
set DEPLOY_MODE = "--deploy-mode client"
set EXECUTOR_CLASS = "--conf "spark.executor.extraClassPath=adl://ayz.xyz123.net/hdfs/DataWareHouse/salesforce_jars/sforce.jar""
set DRIVER_CLASS = "--conf "spark.driver.extraClassPat=adl://ayz.xyz123.net/emaardevhdfs/DataWareHouse/salesforce_jars/sforce.jar""
set CONNECTOR_JAR = "--jars adl://ayz.xyz123.net/hdfs/DataWareHouse/salesforce_jars/sforce.jar"
set MVN_JAR = "--verbose adl://ayz.xyz123.net/hdfs/DataWareHouse/salesforce_jars/DataWareHouse.jar"

sparkSubmit (){
 spark-submit ${SALESFORCE_ACCOUNT_CLASS} ${ENVIRONMENT} ${DEPLOY_MODE} ${EXCUTOR_CLASS} ${DRIVER_CLASS} ${MVN_JAR}
}

sparkSubmit 2>~/stdout

From the above script, the two function calls

Code:
mvnClass 

jdbcClass

return invalid output - the hadoop command ends up listing local files rather than the ADL path. Any help would be appreciated.




# 2  
Old 10-23-2018
Welcome to the forum.


I can't speak to hadoop or jdbc as such, but since you appear to be using bash, I can comment on a few syntax errors in your script:
- set MVN_LIB_PATH = "adl://ayz...jar" : that's not bash; make it VAR="value" - no set, no spaces around the =.
- if [ echo $? == 0] : no echo needed; make it [ $? == 0 ] (with all the spaces!), or use "command substitution" like [ $(echo $?) == 0 ] (less effective).
- elif needs its own condition and then; methinks a plain else would do in this case.
A sketch applying these fixes follows below.
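Untested from here (no hadoop cluster at hand), but applying the points above, the check part might look roughly like this. Jar paths and the class name are taken from your post; I've folded the two single-call functions into one reusable one, and put the spark-submit options into an array so the nested quotes survive. Testing the command directly in the if is equivalent to checking $? right afterwards:

Code:
#!/bin/bash

# plain bash assignments: no "set", no spaces around "="
MVN_LIB_PATH="adl://ayz.xyz123.net/hdfs/DataWareHouse/salesforce_jars/DataWareHouse.jar"
JDBC_LIB_PATH="adl://ayz.xyz123.net/hdfs/DataWareHouse/salesforce_jars/sforce.jar"

# one reusable function instead of one single-call function per jar
check_jar () {
    if hadoop fs -ls "$1"
    then
        echo "Finding $1 succeeded"
    else
        echo "Aborting the job process: $1 not found"
        exit 1
    fi
}

check_jar "$MVN_LIB_PATH"
check_jar "$JDBC_LIB_PATH"

# spark-submit options in an array; quoting stays intact when expanded
SPARK_ARGS=(
    --class com.yxzar.property.SalesForceAccount
    --master yarn
    --deploy-mode client
    --conf "spark.executor.extraClassPath=$JDBC_LIB_PATH"
    --conf "spark.driver.extraClassPath=$JDBC_LIB_PATH"
    --jars "$JDBC_LIB_PATH"
    --verbose "$MVN_LIB_PATH"
)

spark-submit "${SPARK_ARGS[@]}" 2>~/stdout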


Aside: while functions are a very valuable tool for scripting / programming, I can't see the benefit in the above, as they are all single-line and called only once.


Correct your errors, run the script and report back.
# 3  
Old 10-25-2018
Thanks, RudiC.

Your suggestion worked; I was able to get this running.