amazon

Unix and Linux Discussions Tagged with amazon
	Thread / Thread Starter	Last Post	Replies	Views	Forum
	Amazon Censorship on Book Reviews Neo	05-25-2018 by retattr	1	869	What is on Your Mind?
	AWS Cloud EBS limit script! amar0000	01-01-2014 by amar0000	2	1,321	Shell Programming and Scripting
	Org Charts Neo	11-16-2013 by Neo	0	4,737	What is on Your Mind?
	ulimit values not being remembered (Ubuntu 11.10) primal	02-15-2012 by fpmurphy	1	7,649	Ubuntu
	anyone running SELinux on amazon EC2? fun_indra	08-21-2009 by fpmurphy	5	13,221	Virtualization and Cloud Computing
	CEP as a Service (CEPaaS) with MapReduce on Amazon EC2 and Amazon S3 Linux Bot	11-25-2008 by Linux Bot	0	6,551	Virtualization and Cloud Computing
	Link to Amazon wishlist or similar, acceptable? era	04-26-2008 by Neo	2	2,555	Post Here to Contact Site Administrators and Moderators
	TCP/IP Black Book happyfish	10-28-2001 by Perderabo	3	5,873	IP Networking
	DHCP, DNS and LDAP ollyparkhouse	08-20-2001 by Neo	2	5,252	IP Networking
	Help!! jainnik	08-19-2001 by Neo	1	2,320	Programming
	about wave file integrating. livic	08-19-2001 by Neo	1	3,134	Programming
	X-programming for beginners jfsuminist	08-19-2001 by Neo	1	5,897	Programming
	Regarding Make Scripts S.Vishwanath	08-15-2001 by PxT	5	2,955	UNIX for Dummies Questions & Answers
	Unix basic commands Peter Spellman	08-10-2001 by Peter Spellman	3	8,329	UNIX for Dummies Questions & Answers
	oracle 8i for linux gan	08-09-2001 by LivinFree	1	2,218	UNIX for Dummies Questions & Answers
	thread asutoshch	03-29-2001 by mib	1	2,181	UNIX for Dummies Questions & Answers
	IRIX books, manuals, references... lozinit	11-29-2000 by lozinit	6	6,887	UNIX for Dummies Questions & Answers
LEARN ABOUT DEBIAN

pegasus-s3

PEGASUS-S3(1)															     PEGASUS-S3(1)

NAME

       pegasus-s3 - Upload, download, delete objects in Amazon S3

SYNOPSIS

       pegasus-s3 help
       pegasus-s3 ls [options] URL
       pegasus-s3 mkdir [options] URL...
       pegasus-s3 rmdir [options] URL...
       pegasus-s3 rm [options] [URL...]
       pegasus-s3 put [options] FILE URL
       pegasus-s3 get [options] URL [FILE]
       pegasus-s3 lsup [options] URL
       pegasus-s3 rmup [options] URL [UPLOAD]

DESCRIPTION

       pegasus-s3 is a client for the Amazon S3 object storage service and any other storage services that conform to the Amazon S3 API, such as
       Eucalyptus Walrus.

OPTIONS

   Global Options
       -h, --help
	   Show help message for subcommand and exit

       -d, --debug
	   Turn on debugging

       -v, --verbose
	   Show progress messages

       -C FILE, --conf=FILE
	   Path to configuration file

   rm Options
       -f, --force
	   If the URL does not exist, then ignore the error.

       -F FILE, --file=FILE
	   File containing a list of URLs to delete

   put Options
       -c X, --chunksize=X
	   Set the chunk size for multipart uploads to X MB. A value of 0 disables multipart uploads. The default is 10MB, the min is 5MB and the
	   max is 1024MB. This parameter only applies for sites that support multipart uploads (see multipart_uploads configuration parameter in
	   the CONFIGURATION section). The maximum number of chunks is 10,000, so if you are uploading a large file, then the chunk size is
	   automatically increased to enable the upload. Choose smaller values to reduce the impact of transient failures.

       -p N, --parallel=N
	   Use N threads to upload FILE in parallel. The default value is 0, which disables parallel uploads. This parameter is only valid if the
	   site supports mulipart uploads and the --chunksize parameter is not 0.

       -b, --create-bucket
	   Create the destination bucket if it does not already exist

   get Options
       -c X, --chunksize=X
	   Set the chunk size for parallel downloads to X megabytes. A value of 0 will avoid chunked reads. This option only applies for sites
	   that support ranged downloads (see ranged_downloads configuration parameter). The default chunk size is 10MB, the min is 1MB and the
	   max is 1024MB. Choose smaller values to reduce the impact of transient failures.

       -p N, --parallel=N
	   Use N threads to upload FILE in parallel. The default value is 0, which disables parallel downloads. This parameter is only valid if
	   the site supports ranged downloads and the --chunksize parameter is not 0.

   rmup Options
       -a, --all
	   Cancel all uploads for the specified bucket

SUBCOMMANDS

       pegasus-s3 has several subcommands for different storage service operations.

       help
	   The help subcommand lists all available subcommands.

       ls
	   The ls subcommand lists the contents of a URL. If the URL does not contain a bucket, then all the buckets owned by the user are listed.
	   If the URL contains a bucket, but no key, then all the keys in the bucket are listed. If the URL contains a bucket and a key, then all
	   keys in the bucket that begin with the specified key are listed.

       mkdir
	   The mkdir subcommand creates one or more buckets.

       rmdir
	   The rmdir subcommand deletes one or more buckets from the storage service. In order to delete a bucket, the bucket must be empty.

       rm
	   The rm subcommand deletes one or more keys from the storage service.

       put
	   The put subcommand stores the file specified by FILE in the storage service under the bucket and key specified by URL. If the URL
	   contains a bucket, but not a key, then the file name is used as the key.

	   If a transient failure occurs, then the upload will be retried several times before pegasus-s3 gives up and fails.

	   The put subcommand can do both chunked and parallel uploads if the service supports multipart uploads (see multipart_uploads in the
	   CONFIGURATION section). Currently only Amazon S3 supports multipart uploads.

	   This subcommand will check the size of the file to make sure it can be stored before attempting to store it.

	   Chunked uploads are useful to reduce the probability of an upload failing. If an upload is chunked, then pegasus-s3 issues separate PUT
	   requests for each chunk of the file. Specifying smaller chunks (using --chunksize) will reduce the chances of an upload failing due to
	   a transient error. Chunksizes can range from 5 MB to 1GB (chunk sizes smaller than 5 MB produced incomplete uploads on Amazon S3). The
	   maximum number of chunks for any single file is 10,000, so if a large file is being uploaded with a small chunksize, then the chunksize
	   will be increased to fit within the 10,000 chunk limit. By default, the file will be split into 10 MB chunks if the storage service
	   supports multipart uploads. Chunked uploads can be disabled by specifying a chunksize of 0. If the upload is chunked, then each chunk
	   is retried independently under transient failures. If any chunk fails permanently, then the upload is aborted.

	   Parallel uploads can increase performance for services that support multipart uploads. In a parallel upload the file is split into N
	   chunks and each chunk is uploaded concurrently by one of M threads in first-come, first-served fashion. If the chunksize is set to 0,
	   then parallel uploads are disabled. If M > N, then the actual number of threads used will be reduced to N. The number of threads can be
	   specified using the --parallel argument. If --parallel is 0 or 1, then only a single thread is used. The default value is 0. There is
	   no maximum number of threads, but it is likely that the link will be saturated by 4 threads. Very high-bandwidth, long-delay links may
	   get better results with up to8 threads.

	   Under certain circumstances, when a multipart upload fails it could leave behind data on the server. When a failure occurs the put
	   subcommand will attempt to abort the upload. If the upload cannot be aborted, then a partial upload may remain on the server. To check
	   for partial uploads run the lsup subcommand. If you see an upload that failed in the output of lsup, then run the rmup subcommand to
	   remove it.

       get
	   The get subcommand retrieves an object from the storage service identified by URL and stores it in the file specified by FILE. If FILE
	   is not specified, then the key is used as the file name (Note: if the key has slashes, then the file name will be a relative
	   subdirectory, but pegasus-s3 will not create the subdirectory if it does not exist).

	   If a transient failure occurs, then the download will be retried several times before pegasus-s3 gives up and fails.

	   The get subcommand can do both chunked and parallel downloads if the service supports ranged downloads (see ranged_downloads in the
	   CONFIGURATION section). Currently only Amazon S3 has good support for ranged downloads. Eucalyptus Walrus supports ranged downloads,
	   but the current release, 1.6, is inconsistent with the Amazon interface and has a bug that causes ranged downloads to hang in some
	   cases. It is recommended that ranged downloads not be used with Eucalyptus until these issues are resolved.

	   Chunked downloads can be used to reduce the probability of a download failing. When a download is chunked, pegasus-s3 issues separate
	   GET requests for each chunk of the file. Specifying smaller chunks (using --chunksize) will reduce the chances that a download will
	   fail to do a transient error. Chunk sizes can range from 1 MB to 1 GB. By default, a download will be split into 10 MB chunks if the
	   site supports ranged downloads. Chunked downloads can be disabled by specifying a --chunksize of 0. If a download is chunked, then each
	   chunk is retried independently under transient failures. If any chunk fails permanently, then the download is aborted.

	   Parallel downloads can increase performance for services that support ranged downloads. In a parallel download, the file to be
	   retrieved is split into N chunks and each chunk is downloaded concurrently by one of M threads in a first-come, first-served fashion.
	   If the chunksize is 0, then parallel downloads are disabled. If M > N, then the actual number of threads used will be reduced to N. The
	   number of threads can be specified using the --parallel argument. If --parallel is 0 or 1, then only a single thread is used. The
	   default value is 0. There is no maximum number of threads, but it is likely that the link will be saturated by 4 threads. Very
	   high-bandwidth, long-delay links may get better results with up to8 threads.

       lsup
	   The lsup subcommand lists active multipart uploads. The URL specified should point to a bucket. This command is only valid if the site
	   supports multipart uploads. The output of this command is a list of keys and upload IDs.

	   This subcommand is used with rmup to help recover from failures of multipart uploads.

       rmup
	   The rmup subcommand cancels and active upload. The URL specified should point to a bucket, and UPLOAD is the long, complicated upload
	   ID shown by the lsup subcommand.

	   This subcommand is used with lsup to recover from failures of multipart uploads.

URL FORMAT

       All URLs for objects stored in S3 should be specified in the following format:

	   s3[s]://USER@SITE[/BUCKET[/KEY]]

       The protocol part can be s3:// or s3s://. If s3s:// is used, then pegasus-s3 will force the connection to use SSL and override the setting
       in the configuration file. If s3:// is used, then whether the connection uses SSL or not is determined by the value of the endpoint
       variable in the configuration for the site.

       The USER@SITE part is required, but the BUCKET and KEY parts may be optional depending on the context.

       The USER@SITE portion is referred to as the "identity", and the SITE portion is referred to as the "site". Both the identity and the site
       are looked up in the configuration file (see CONFIGURATION) to determine the parameters to use when establishing a connection to the
       service. The site portion is used to find the host and port, whether to use SSL, and other things. The identity portion is used to
       determine which authentication tokens to use. This format is designed to enable users to easily use multiple services with multiple
       authentication tokens. Note that neither the USER nor the SITE portion of the URL have any meaning outside of pegasus-s3. They do not refer
       to real usernames or hostnames, but are rather handles used to look up configuration values in the configuration file.

       The BUCKET portion of the URL is the part between the 3rd and 4th slashes. Buckets are part of a global namespace that is shared with other
       users of the storage service. As such, they should be unique.

       The KEY portion of the URL is anything after the 4th slash. Keys can include slashes, but S3-like storage services do not have the concept
       of a directory like regular file systems. Instead, keys are treated like opaque identifiers for individual objects. So, for example, the
       keys a/b and a/c have a common prefix, but cannot be said to be in the same directory.

       Some example URLs are:

	   s3://ewa@amazon
	   s3://juve@skynet/gideon.isi.edu
	   s3://juve@magellan/pegasus-images/centos-5.5-x86_64-20101101.part.1
	   s3s://ewa@amazon/pegasus-images/data.tar.gz

CONFIGURATION

       Each user should specify a configuration file that pegasus-s3 will use to look up connection parameters and authentication tokens.

   Search Path
       This client will look in the following locations, in order, to locate the user's configuration file:

	1. The -C/--conf argument

	2. The S3CFG environment variable

	3. $HOME/.s3cfg

       If it does not find the configuration file in one of these locations it will fail with an error.

   Configuration File Format
       The configuration file is in INI format and contains two types of entries.

       The first type of entry is a site entry, which specifies the configuration for a storage service. This entry specifies the service endpoint
       that pegasus-s3 should connect to for the site, and some optional features that the site may support. Here is an example of a site entry
       for Amazon S3:

	   [amazon]
	   endpoint = http://s3.amazonaws.com/

       The other type of entry is an identity entry, which specifies the authentication information for a user at a particular site. Here is an
       example of an identity entry:

	   [pegasus@amazon]
	   access_key = 90c4143642cb097c88fe2ec66ce4ad4e
	   secret_key = a0e3840e5baee6abb08be68e81674dca

       It is important to note that user names and site names used are only logical--they do not correspond to actual hostnames or usernames, but
       are simply used as a convenient way to refer to the services and identities used by the client.

       The configuration file should be saved with limited permissions. Only the owner of the file should be able to read from it and write to it
       (i.e. it should have permissions of 0600 or 0400). If the file has more liberal permissions, then pegasus-s3 will fail with an error
       message. The purpose of this is to prevent the authentication tokens stored in the configuration file from being accessed by other users.

   Configuration Variables
       endpoint (site)
	   The URL of the web service endpoint. If the URL begins with https, then SSL will be used.

       max_object_size (site)
	   The maximum size of an object in GB (default: 5GB)

       multipart_uploads (site)
	   Does the service support multipart uploads (True/False, default: False)

       ranged_downloads (site)
	   Does the service support ranged downloads? (True/False, default: False)

       access_key (identity)
	   The access key for the identity

       secret_key (identity)
	   The secret key for the identity

   Example Configuration
       This is an example configuration that specifies a two sites (amazon and magellan) and three identities (pegasus@amazon,juve@magellan, and
       voeckler@magellan). For the amazon site the maximum object size is 5TB, and the site supports both multipart uploads and ranged downloads,
       so both uploads and downloads can be done in parallel.

	   [amazon]
	   endpoint = https://s3.amazonaws.com/
	   max_object_size = 5120
	   multipart_uploads = True
	   ranged_downloads = True

	   [pegasus@amazon]
	   access_key = 90c4143642cb097c88fe2ec66ce4ad4e
	   secret_key = a0e3840e5baee6abb08be68e81674dca

	   [magellan]
	   # NERSC Magellan is a Eucalyptus site. It doesn't support multipart uploads,
	   # or ranged downloads (the defaults), and the maximum object size is 5GB
	   # (also the default)
	   endpoint = https://128.55.69.235:8773/services/Walrus

	   [juve@magellan]
	   access_key = quwefahsdpfwlkewqjsdoijldsdf
	   secret_key = asdfa9wejalsdjfljasldjfasdfa

	   [voeckler@magellan]
	   # Each site can have multiple associated identities
	   access_key = asdkfaweasdfbaeiwhkjfbaqwhei
	   secret_key = asdhfuinakwjelfuhalsdflahsdl

EXAMPLE

       List all buckets owned by identity user@amazon:

	   $ pegasus-s3 ls s3://user@amazon

       List the contents of bucket bar for identity user@amazon:

	   $ pegasus-s3 ls s3://user@amazon/bar

       List all objects in bucket bar that start with hello:

	   $ pegasus-s3 ls s3://user@amazon/bar/hello

       Create a bucket called mybucket for identity user@amazon:

	   $ pegasus-s3 mkdir s3://user@amazon/mybucket

       Delete a bucket called mybucket:

	   $ pegasus-s3 rmdir s3://user@amazon/mybucket

       Upload a file foo to bucket bar:

	   $ pegasus-s3 putfoo s3://user@amazon/bar/foo

       Download an object foo in bucket bar:

	   $ pegasus-s3 get s3://user@amazon/bar/foo foo

       Upload a file in parallel with 4 threads and 100MB chunks:

	   $ pegasus-s3 put --parallel 4 --chunksize 100 foo s3://user@amazon/bar/foo

       Download an object in parallel with 4 threads and 100MB chunks:

	   $ pegasus-s3 get --parallel 4 --chunksize 100 s3://user@amazon/bar/foo foo

       List all partial uploads for bucket bar:

	   $ pegasus-s3 lsup s3://user@amazon/bar

       Remove all partial uploads for bucket bar:

	   $ pegasus-s3 rmup --all s3://user@amazon/bar

RETURN VALUE

       pegasus-s3 returns a zero exist status if the operation is successful. A non-zero exit status is returned in case of failure.

AUTHOR

       Gideon Juve <juve@usc.edu>

       Pegasus Team http://pegasus.isi.edu

								    05/24/2012							     PEGASUS-S3(1)