
Sunday, October 6, 2013

AWS Cost Allocation Reporting


https://event.on24.com/eventRegistration/EventLobbyServlet?target=registration.jsp&eventid=681641&sessionid=1&key=6926138D02D575421B054D213DEE4E7C&partnerref=AWSevents_v1&sourcepage=register

Simplify Cost Allocation Reporting with CloudCheckr on Amazon Web Services

As deployments scale, utilization and purchasing strategies become increasingly important. Amazon Web Services (AWS) and CloudCheckr, an AWS Technology Partner, invite you to join this presentation to learn strategies that can enable finer control and cost-management structures to improve your price-to-performance metrics.
We'll provide an overview of how customers today evaluate resource distribution and configurations to efficiently scale deployments using CloudCheckr. In particular, we'll address the differences between on-demand, reserved, and spot instances and the most appropriate use for each.

Friday, June 7, 2013

AWS EMR : Getting started for Oracle DBAs


Newer technologies such as MapReduce (AWS EMR, Hadoop) and NoSQL (MongoDB, AWS DynamoDB...) can be confusing to Oracle DBAs.  This blog post takes a quick look at AWS Elastic MapReduce (EMR) and attempts to demystify it for Oracle DBAs.

Going back before RDBMS products, MapReduce is like a mainframe batch job with no built-in restart capability.  MapReduce facilitates the processing of large volumes of data in one large batch.  This one large batch, however, is broken into tens or hundreds of smaller pieces of work that are processed by MapReduce worker nodes.  This makes MapReduce a great solution for processing web logs, sensor data, genome data, large volumes of transactions, telephone call detail records, vote ballots, and other cases where large volumes of data need to be processed once and the results stored.

MapReduce is a framework, so you have to write to an API in your application in order to take advantage of it.  There are a number of implementations of this framework, including Apache Hadoop and AWS Elastic MapReduce (EMR).  Apache Hadoop has no native data store associated with it (although the Hadoop Distributed File System - HDFS - can be used natively).

As mentioned, you need to code your own application using the MapReduce framework.  AWS makes getting started with MapReduce easier by providing sample applications for EMR.  One of the five sample EMR applications is a Java application for processing AWS CloudFront logs.  The CloudFront HTTP LogAnalyzer is a Java application that uses Cascading to analyze and generate usage reports from Amazon CloudFront HTTP access logs.  You specify the EMR input source (the CloudFront log location in S3) in the JAR arguments, and you also specify the S3 bucket that will hold the results (output).
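The map-then-reduce pattern described above can be sketched in plain Python. This is only a conceptual illustration, not the EMR API: the hypothetical log lines below stand in for CloudFront access logs, and the two functions show what the worker nodes (mappers) and the aggregation step (reducers) each do with their small piece of the batch.

```python
from collections import defaultdict

# Hypothetical stand-in for CloudFront log lines; a real EMR job
# would read these from S3 and split them across many worker nodes.
log_lines = [
    "GET /index.html 200",
    "GET /image.png 404",
    "GET /index.html 200",
]

def map_phase(line):
    # Each mapper emits a (key, 1) pair -- here, one per requested path.
    path = line.split()[1]
    return (path, 1)

def reduce_phase(pairs):
    # The reducers sum the counts for each key to produce the report.
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)

mapped = [map_phase(line) for line in log_lines]
report = reduce_phase(mapped)
print(report)  # {'/index.html': 2, '/image.png': 1}
```

In EMR, the mappers run in parallel on separate nodes and the framework shuffles the emitted pairs to the reducers; the sample LogAnalyzer does the same thing at scale, with S3 as the input and output.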


For the CloudFront HTTP LogAnalyzer, the input and output files use S3.  However, HDFS or AWS DynamoDB are commonly used as input sources and sometimes used as output sources.  You may want to use DynamoDB as an output source if you wish to load the results into Redshift or do future BI analysis on the results.  You could also send the results to an AWS SQS queue to be processed later into S3, DynamoDB, RDS, or some other persistent data store.

Monday, May 13, 2013

AWS Auto Scaling for batch processing

Can I use Auto Scaling to scale out a set of servers that run only periodically (scaling from zero)?  This is not the typical Auto Scaling use case, but it is possible.  It would most likely apply to use cases such as batch processing, where instances are spun up to support processing that happens periodically.  Here are some ways to solve this use case: