Showing posts with label time. Show all posts

Thursday, June 26, 2014

Amazon EMR termination using Data Pipeline

Data Pipeline provides a “terminateAfter” field for all activities, including EmrActivity. It is possible to set terminateAfter to be relative to the start time. It is also possible to wrap your existing EMR job flow in a Data Pipeline EmrActivity and then set terminateAfter on the EmrCluster object.
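
As a rough sketch of what that could look like, here is a fragment of a pipeline definition built as a Python dict and dumped to JSON. The object ids, the "2 Hours" value, and the step string are made up for illustration, and a real definition needs more than this (a schedule, roles, instance settings), so treat it as a starting point rather than a complete pipeline.

```python
import json

# Hypothetical ids, duration, and step string, for illustration only.
pipeline_objects = [
    {
        "id": "MyEmrCluster",
        "type": "EmrCluster",
        # Terminate the cluster this long after it starts.
        "terminateAfter": "2 Hours",
    },
    {
        "id": "MyEmrActivity",
        "type": "EmrActivity",
        # Run the wrapped job flow on the cluster defined above.
        "runsOn": {"ref": "MyEmrCluster"},
        "step": "s3://example-bucket/example.jar,arg1",
    },
]

definition = json.dumps({"objects": pipeline_objects}, indent=2)
print(definition)
```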

Monday, June 23, 2014

EC2 Instance create date and time

Sometimes you may want to retrieve the creation date and time of an EC2 instance.
From the docs, an EC2 instance has the property launchTime (http://docs.aws.amazon.com/AWSEC2/latest/APIReference/ApiReference-ItemType-RunningInstancesItemType.html). You can easily build a boto script that queries for all instances and reports the launchTime (http://boto.readthedocs.org/en/latest/ref/ec2.html#module-boto.ec2.instance).
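
A minimal sketch of such a script follows. It assumes boto3 (the newer SDK) rather than the original boto library linked above, and the function names, sample instance id, and region argument are my own. The parsing helper works on plain DescribeInstances response pages, so it can be tried without AWS credentials.

```python
from datetime import datetime, timezone


def launch_times(pages):
    """Yield (instance_id, launch_time) from DescribeInstances response pages."""
    for page in pages:
        for reservation in page.get("Reservations", []):
            for instance in reservation.get("Instances", []):
                yield instance["InstanceId"], instance["LaunchTime"]


def report_all(region_name):
    """Print the launch time of every instance in one region (needs boto3)."""
    # Imported here so the helper above can be used without boto3 installed.
    import boto3

    ec2 = boto3.client("ec2", region_name=region_name)
    pages = ec2.get_paginator("describe_instances").paginate()
    for instance_id, launched in launch_times(pages):
        print(instance_id, launched.isoformat())


# Hand-made page in the shape of a DescribeInstances response (made-up id).
sample_page = {
    "Reservations": [
        {
            "Instances": [
                {
                    "InstanceId": "i-0123456789abcdef0",
                    "LaunchTime": datetime(2014, 6, 23, 12, 0, tzinfo=timezone.utc),
                }
            ]
        }
    ]
}
sample_result = list(launch_times([sample_page]))
```

Calling report_all("us-east-1") would print one line per instance in that region, assuming credentials are configured.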

Thursday, July 18, 2013

SQS message retention and visibility

There are two important attributes (http://docs.aws.amazon.com/AWSSimpleQueueService/latest/APIReference/Query_QuerySetQueueAttributes.html) to set when working with AWS SQS. The first is MessageRetentionPeriod and the other is VisibilityTimeout. These are two very different and distinct attributes. MessageRetentionPeriod is the length of time (in seconds) a message will stay in the queue (unless it is deleted). The value can be 60 (1 minute) to 1209600 (14 days). VisibilityTimeout is the length of time (in seconds) during which other applications can NOT see the message while one application is processing it. This can be set from 0 to 43200 (12 hours). The longer you expect one process to be working on a message, the longer this should be set.

SQS automatically deletes messages that have been in a queue for longer than the message retention period. The default message retention period is 4 days. However, you can set the message retention period to a value from 60 seconds to 1209600 seconds (14 days) with SetQueueAttributes.
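
Here is a small sketch of setting both attributes from Python. It assumes boto3, and the helper names and queue URL argument are hypothetical. The range checks simply mirror the limits quoted above, and SQS expects attribute values as strings.

```python
RETENTION_RANGE = (60, 1209600)  # 1 minute to 14 days, in seconds
VISIBILITY_RANGE = (0, 43200)    # 0 seconds to 12 hours, in seconds


def queue_attributes(retention_seconds, visibility_seconds):
    """Build the Attributes map for SetQueueAttributes, checking the ranges."""
    low, high = RETENTION_RANGE
    if not low <= retention_seconds <= high:
        raise ValueError("MessageRetentionPeriod must be 60 to 1209600 seconds")
    low, high = VISIBILITY_RANGE
    if not low <= visibility_seconds <= high:
        raise ValueError("VisibilityTimeout must be 0 to 43200 seconds")
    return {
        "MessageRetentionPeriod": str(retention_seconds),
        "VisibilityTimeout": str(visibility_seconds),
    }


def apply_attributes(queue_url, retention_seconds, visibility_seconds):
    """Send the attributes to an existing queue (needs boto3 and credentials)."""
    # Imported here so queue_attributes can be used without boto3 installed.
    import boto3

    sqs = boto3.client("sqs")
    sqs.set_queue_attributes(
        QueueUrl=queue_url,
        Attributes=queue_attributes(retention_seconds, visibility_seconds),
    )
```

For example, queue_attributes(345600, 30) keeps the 4 day default retention and hides a message from other consumers for 30 seconds while one consumer works on it.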

More on why messages are not deleted once read and the visibility of a message while it is being processed by an application:
http://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/AboutVT.html

Tuesday, May 28, 2013

Auto Scaling based upon schedule

Sometimes you may have nightly, monthly, year-end, or other periodic (calendar-based) schedules on which you would like to scale out your AWS infrastructure. Here is more information on schedule-based auto scaling:
http://docs.aws.amazon.com/AutoScaling/latest/DeveloperGuide/schedule_time.html
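
As a rough illustration, a nightly scale-out could be registered from Python along these lines. This assumes boto3 and its put_scheduled_update_group_action call. The group name, action name suffix, and sizes are placeholders, and the recurrence is a cron expression that, as far as I know, is evaluated in UTC unless a time zone is given.

```python
def nightly_scale_out(group_name, hour_utc, desired, max_size):
    """Build the arguments for a daily scheduled scale-out action."""
    if not 0 <= hour_utc <= 23:
        raise ValueError("hour_utc must be between 0 and 23")
    if desired > max_size:
        raise ValueError("desired capacity cannot exceed max_size")
    return {
        "AutoScalingGroupName": group_name,
        "ScheduledActionName": f"{group_name}-nightly-scale-out",
        # Cron fields: minute hour day-of-month month day-of-week.
        "Recurrence": f"0 {hour_utc} * * *",
        "DesiredCapacity": desired,
        "MaxSize": max_size,
    }


def schedule(group_name, hour_utc, desired, max_size):
    """Register the action with Auto Scaling (needs boto3 and credentials)."""
    # Imported here so nightly_scale_out can be used without boto3 installed.
    import boto3

    autoscaling = boto3.client("autoscaling")
    autoscaling.put_scheduled_update_group_action(
        **nightly_scale_out(group_name, hour_utc, desired, max_size)
    )
```

A second scheduled action with a smaller DesiredCapacity would be needed to scale back in after the nightly work finishes.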

Wednesday, April 24, 2013

Oracle RDS timezone


The default time zone for your RDS instance is UTC and cannot be changed on the DB. You can set the desired time zone (for example, EST) per database connection. The time zone will only be valid for that connection, so it needs to be set for each connection (and each time you connect).
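
One way to do this from application code is to run an ALTER SESSION statement right after connecting. The sketch below assumes a Python DB-API style connection (from whichever Oracle driver you use). The function name and the default zone are my own choices for illustration.

```python
def set_session_time_zone(connection, zone="America/New_York"):
    """Set the Oracle session time zone on a freshly opened connection."""
    # ALTER SESSION does not take bind variables, so the zone name is
    # checked against a small set of allowed characters before it is
    # placed into the statement text.
    if not zone or not all(ch.isalnum() or ch in "/_+-:" for ch in zone):
        raise ValueError("unexpected characters in time zone name")
    cursor = connection.cursor()
    try:
        cursor.execute(f"ALTER SESSION SET TIME_ZONE = '{zone}'")
    finally:
        cursor.close()
```

Calling this from a connection factory or pool callback keeps the setting from being forgotten on new connections.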



Wednesday, December 5, 2012

Copying 1 TB of data from on premise RDBMS to AWS/EC2


I often get asked how long it will take to transmit a 1 TB database (assuming the database is actually 1 TB in size and not just allocated at 1 TB). Elapsed time is going to depend on many factors: speed of the connection, compression used (if any), whether the load can be done in parallel, etc. The fastest load times will happen using AWS Import/Export (http://aws.amazon.com/importexport/). The next fastest will be using AWS Direct Connect (http://aws.amazon.com/directconnect). Direct Connect provides speeds of 1 Gbps to 10 Gbps. Using a 1 Gbps line, it would take about 3-4 hours (here is a good site: http://fasterdata.es.net/fasterdata-home/requirements-and-expectations/). Assuming an internet connection speed of 100 Mbps (100Base-T connection), it would take about a day to transfer using a traditional connection to AWS/EC2. Both Direct Connect and Import/Export provide a secure method of transfer.
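
The arithmetic behind those estimates is easy to check. The sketch below assumes 1 TB means 10^12 bytes and adds an efficiency factor, which is my own knob to account for protocol overhead and links that never quite reach line rate. At full line rate, 1 TB works out to about 2.2 hours at 1 Gbps and about 22.2 hours at 100 Mbps, so the 3-4 hour figure corresponds to a link running somewhat below its rated speed.

```python
def transfer_hours(size_bytes, link_bits_per_second, efficiency=1.0):
    """Hours needed to move size_bytes over a link at the given efficiency."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    seconds = size_bytes * 8 / (link_bits_per_second * efficiency)
    return seconds / 3600


ONE_TB = 10**12  # decimal terabyte, in bytes

for label, bits_per_second in (("1 Gbps", 10**9), ("100 Mbps", 10**8)):
    hours = transfer_hours(ONE_TB, bits_per_second)
    print(f"{label}: {hours:.1f} hours at full line rate")
```

Passing a lower efficiency, such as transfer_hours(ONE_TB, 10**9, 0.6), gives a more conservative estimate.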